NEURAL NETWORKS: HOW DO WE (AND OTHER ANIMALS) DO CLASSIFICATION?
I've been discussing sniffer dogs as an analogy for a machine classification system, but we haven't thought about how animals actually manage to solve very complicated classification problems. Certainly, sniffer dogs in the airport are not doing LDA or any similar model-based calculation. As I write this, human nervous systems are still the most accurate and flexible classification methods known (expect this to change in the next few years). In the training or learning phase, it is believed that connections between neurons are modulated based on simple rules (such as Hebb's rule). Once the connections have been formed, sensory inputs are propagated through these connections, ultimately leading to activation of a neuron (or group of neurons) whose specific activity represents the class of the observation.
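To make Hebb's rule concrete, here is a minimal sketch of Hebbian plasticity. All details are illustrative assumptions, not a biological model: four presynaptic inputs, a learning rate of 0.1, and a postsynaptic neuron assumed to fire whenever the training pattern is presented.

```python
import numpy as np

w = np.zeros(4)                          # synaptic weights, initially zero
eta = 0.1                                # learning rate (assumed value)
pattern = np.array([1.0, 0.0, 1.0, 0.0])  # a recurring sensory input pattern

for _ in range(20):                      # repeated presentations of the pattern
    pre = pattern                        # presynaptic activity
    post = 1.0                           # assume the postsynaptic neuron fires
    w += eta * pre * post                # Hebb: co-active connections strengthen

print(w)  # connections from the active inputs (0 and 2) have grown; others are unchanged
```

After training, the neuron's strongest connections come from exactly those inputs that were repeatedly active when it fired, which is the intuition behind "cells that fire together wire together."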
Although understanding learned animal behavior is a complicated neurobiology problem, and outside of the scope of this book, animal nervous systems were the inspiration for a class of machine learning methods known as neural networks (or more accurately, artificial neural networks). Although we won't consider these further here, these methods fall into the category of nonlinear classification methods, building complex classifiers out of many simpler units. In the case of neural networks, the simple units are referred to as "neurons," and they are represented by simple mathematical functions. Because no probabilistic models underlie such networks, computer science researchers have developed objective functions for neural networks inspired by physical intuition (e.g., energy minimization, Poultney et al. 2006) and training methods based on biological analogy (e.g., wake-sleep algorithm, Hinton et al. 1995). An added level of complexity in these models is the choice of the structure of the neural network. Among the most practically successful artificial neural networks are those designed to recognize patterns in images, with structures inspired by the animal visual cortex (Le Cun et al. 1990). Finally, because large neural networks may have millions of parameters, learning the parameters requires specialized optimization techniques, and regularizing to prevent overfitting is a complex issue.
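The "simple mathematical function" representing an artificial neuron is typically just a weighted sum of the inputs passed through a nonlinearity. A minimal sketch, using the common logistic (sigmoid) nonlinearity and purely illustrative weights (not from any trained model):

```python
import numpy as np

def neuron(x, w, b):
    """An artificial neuron: weighted sum of inputs, then a nonlinearity."""
    return 1.0 / (1.0 + np.exp(-(w @ x + b)))   # logistic (sigmoid) activation

x = np.array([0.5, -1.2, 3.0])   # one observation with three features
w = np.array([0.8, 0.1, -0.4])   # connection weights (illustrative)
b = 0.2                          # bias term (illustrative)

activation = neuron(x, w, b)
print(activation)                # a value in (0, 1): the neuron's activity level
```

The output can be read loosely as how strongly the "neuron" responds to this input; a network wires many such units together, feeding the outputs of some neurons in as the inputs of others.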
Where good solutions to these challenges have been identified, there has been considerable success using so-called "deep" neural networks to perform practical classification tasks (e.g., Krizhevsky et al. 2012). These models are referred to as "deep" because they have many layers (say, more than three) of simple units ("neurons"). In deep neural networks (and deep learning), a first layer performs computations directly on the data, the results of those computations are sent to a second layer that does a further computation, the results of which are sent to a third layer, and so on, such that information is propagated "deep" into the model. These models are designed to mimic the way sensory information is processed by layers of biological neurons in the retina and visual cortex. Because of their recent successes in a wide range of classification tasks, we should expect to see these methods increasingly applied to genome-scale biological data in the coming years.
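The layer-by-layer propagation described above can be sketched in a few lines. Everything here is an illustrative assumption (the layer sizes, the random untrained weights, and the choice of ReLU hidden units with a softmax output); the point is only that each layer's computation becomes the next layer's input.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative architecture: 10 input features -> three hidden layers -> 3 classes.
sizes = [10, 8, 8, 8, 3]
weights = [rng.normal(scale=0.5, size=(m, n))
           for n, m in zip(sizes[:-1], sizes[1:])]   # untrained random weights

def forward(x):
    """Propagate an observation 'deep' through successive layers."""
    for W in weights[:-1]:
        x = np.maximum(0.0, W @ x)        # hidden layers: ReLU units
    z = weights[-1] @ x                   # final layer: class scores
    return np.exp(z) / np.exp(z).sum()    # softmax: scores -> class probabilities

p = forward(rng.normal(size=10))
print(p, p.sum())   # three nonnegative class probabilities summing to 1
```

With random weights the output probabilities are meaningless; training consists of adjusting the weight matrices so that the correct class receives high probability, which is where the specialized optimization and regularization techniques mentioned above come in.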