How do neurons in your brain encode the diversity of stimuli present in the world? This is one of the central questions neuroscientists have to answer about how the brain works. The world holds an infinite array of things to see, hear, touch, etc., yet your brain has only a finite number of neurons to encode them. How is this infinite diversity assimilated by a machine with finite components?
To address this issue, I want to talk about a paper by Hromadka et al. published in the journal PLoS Biology. Hromadka et al. performed electrical recordings in the auditory cortex of unanesthetized rats. (We will get to the significance of the rats being unanesthetized in a second.) They played different sounds for the rats and recorded the responses of many different neurons to those sounds.
They found that only a small number of neurons from their sample responded to any particular sound. This means that the representation of sounds is “sparse” as opposed to “dense.” I will define these terms for you.
One way to think about how the brain encodes the bewildering variety of things out in the world is to consider the following puzzle: the grandmother neuron. The grandmother neuron is a concept originated by Jerry Lettvin. Basically it works like this. Your sensory systems break stimuli down into specific features. Your visual system, for instance, breaks things down into lines, colors, and movement. But in order for your brain to perceive these features as unitary objects, the activity originating in the feature neurons must eventually converge in the same place. (This is called the binding problem: how are features that are separated by the sensory systems bound together to form a single representation of an object?)
One way to solve the binding problem is by bringing all of the feature neurons together to activate one neuron. However, if all the features from one stimulus feed back to a single neuron, it would stand to reason that somewhere in your brain you have a grandmother neuron — a neuron that is only active when you perceive your grandma.
Of course, organizing the system this way would lead to some farcically odd consequences. First, you have a limited number of neurons in your brain, but there are an infinite number of stimuli to encode. If a single neuron is used for each stimulus, you would run out of neurons. Second, what if you lost that neuron? Would you no longer be able to remember your grandmother? Clearly, the idea that a single neuron encodes each object isn’t going to work.
The solution to the grandmother neuron problem is an idea called ensemble encoding. Ensemble encoding means that you encode different objects in a set of neurons rather than in a single neuron. Say you have a billion neurons in your visual cortex. Maybe 1,000 or a million or whatever would be activated whenever you see your grandmother. But each member of that set could also participate in encoding other things. Ensemble encoding codes different stimuli combinatorially, allowing a significantly larger number of things to be encoded. (The number of combinations of 1,000,000 units taken 1,000 at a time is a very large number.) Also, because a whole set of neurons is active, the loss of any particular neuron is not a big deal. Say you lost 1 of the 1,000 neurons in the grandmother representation. That wouldn’t be such a big deal because the remainder of the representation is still there.
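To get a feel for just how large that number is, here is a back-of-the-envelope Python sketch (my own toy calculation, not from the paper) counting the distinct 1,000-neuron ensembles that can be drawn from a pool of 1,000,000 neurons:

```python
import math

# Toy numbers: how many distinct ensembles of 1,000 neurons can be
# drawn from a pool of 1,000,000?
pool_size = 1_000_000
ensemble_size = 1_000

n_ensembles = math.comb(pool_size, ensemble_size)

# The count itself is astronomically large; the number of decimal
# digits it has is easier to grasp than the number itself.
print(len(str(n_ensembles)))  # a number with several thousand digits
```

Compare that with one-neuron-per-stimulus, which under the same toy numbers tops out at exactly 1,000,000 distinct objects.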
As an aside, periodically you hear these totally misinterpreted stories in the news about a Halle Berry neuron found in X brain area. The misinterpretation here is not that the neuron doesn’t activate selectively in response to Halle Berry. That may be true. The misinterpretation is the assumption that it is the only neuron activated by Halle Berry, or that only Halle Berry activates the neuron. The purpose of the neuron is not to encode Halle Berry. The purpose of the neuron is to encode a class of stimuli as part of an ensemble.
Ensemble encoding brings up another interesting question, though: how many neurons are involved in representing each stimulus? I am going to call this the encoding density. Neuroscientists use two terms to describe encoding density. Sparse encoding is when the stimulus is encoded by a large change in activity in a small number of neurons. If you have a billion neurons and only 5 are activated by a stimulus, then we would say that stimulus is sparsely encoded. On the other hand, when the stimulus is encoded by a small change in activity in a large number of neurons, we say it is densely encoded. These terms — sparse vs. dense — speak (generally) to how tightly tuned a neuron is for particular stimuli. If a particular neuron activates to a wide range of stimuli, it isn’t tightly tuned. This is the case in dense encoding. On the other hand, tuning could be tight: the neuron might respond to only two or three different stimuli. Generally, the degree of tuning is inversely correlated with encoding density. We will see that this isn’t always true, and a counterexample is posed in this paper.
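One standard way to put a number on encoding density is the Treves–Rolls sparseness index from the population-coding literature. The toy firing rates below are my own invention, purely to illustrate how the index behaves at the two extremes:

```python
def treves_rolls(rates):
    """Treves-Rolls sparseness index: (mean r)^2 / mean(r^2).
    Values near 1 indicate dense coding; values near 0, sparse coding."""
    n = len(rates)
    mean = sum(rates) / n
    mean_sq = sum(r * r for r in rates) / n
    return mean ** 2 / mean_sq

# Sparse: 5 of 1,000 neurons fire strongly, the rest are silent.
sparse = [50.0] * 5 + [0.0] * 995
# Dense: every neuron responds, but weakly.
dense = [5.0] * 1000

print(treves_rolls(sparse))  # 0.005
print(treves_rolls(dense))   # 1.0
```

The index is just a summary statistic over firing rates, so it can be computed from exactly the kind of population survey this paper performs.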
Another aside: If a neuroscientist were to ask, “what does a brain region encode?”, sparse encoding is generally a bit easier to deal with than dense encoding. Remember that when you are doing experiments of this nature you can only record from a subset of neurons in a particular brain region. There are far too many to record them all. When you are dealing with a region that encodes things sparsely, the presence or absence of a stimulus causes a large change in neuronal activity. Thus, any large change in activity tends to suggest that whatever stimulus you were using is one the brain region encodes. Dense encoding can be troublesome. Because dense encoding involves neurons that are promiscuous in what they respond to — they are broadly tuned — you are never quite sure whether the stimulus you are using is really what is encoded in that brain region.
An example of this problem is research that attributes the processing of faces to prefrontal neurons. (Here is an example.) It is true that neurons in the prefrontal cortex respond to faces, and it is true that these face-selective neurons are anatomically segregated from other neurons in the prefrontal cortex. However, prefrontal cortical neurons are broadly tuned; they respond to a lot of stuff. Furthermore, lesions to the prefrontal cortex do not typically produce deficits in face perception. My point here is to illustrate the problem of over-interpretation when you are dealing with dense encoding. Brain regions that show dense encoding respond to a wide variety of stimuli, but a response does not necessarily mean that the region exists to encode the particular stimulus you are using.
Hromadka et al.
Now that we have our terminology and background, let’s look at Hromadka et al. The authors wanted to find out whether sounds are encoded sparsely or densely in the auditory cortex of unanesthetized rats. Whether or not the rats are anesthetized does matter. The drugs used in general anesthesia can sometimes change the network properties of a particular brain region. This means that experiments done with or without anesthesia can sometimes yield divergent results.
To do this, they mounted the heads of rats in a stereotactic frame and recorded from neurons in the auditory cortex. They surveyed the responses of each recorded neuron to a set of sounds including pure tones, complex noises, and natural sounds. As opposed to the more standard way of recording neuronal activity using a metal electrode, the researchers in this paper used a glass electrode. This is important because when you use a metal electrode, you identify neurons by measuring activity: you have to see that a neuron is spiking to know that it is there. Using a glass electrode, on the other hand, you can tell that you have found a neuron by measuring the membrane potential (more negative inside the cell), whether or not it is spiking. This frees the paper from a bias that studies using metal electrodes might have: metal electrodes tend to underestimate the number of inactive neurons, while glass electrodes do not.
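A tiny numerical sketch shows how spike-based detection can inflate the apparent fraction of responsive neurons. The population composition below is entirely made up by me for illustration; only the logic of the bias comes from the text above:

```python
# Hypothetical population: 50 sound-responsive neurons, 450 spontaneously
# active but unresponsive neurons, and 500 nearly silent neurons.
responsive = 50
active_unresponsive = 450
silent = 500
total = responsive + active_unresponsive + silent

# A metal electrode finds neurons by their spikes, so the silent cells
# are invisible to it; a glass electrode detects any cell it lands on.
metal_estimate = responsive / (responsive + active_unresponsive)
glass_estimate = responsive / total

print(f"metal: {metal_estimate:.2f}, glass: {glass_estimate:.2f}")
# metal: 0.10, glass: 0.05
```

In this toy scenario the spike-biased survey doubles the apparent fraction of sound-responsive neurons, making the cortex look less sparse than it is.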
After surveying the responses of many neurons to different sounds, the authors conclude that activity in the auditory cortex is sparsely coded. This is illustrated in the figure below (Figure 6 from the paper).
The top row indicates the fraction of neurons that responded to particular types of stimuli. Note that increasing the volume of the stimulus (increased dB) does not increase this proportion greatly. The bottom row is a histogram of the probability that a particular neuron will change its firing rate (in spikes per second) by a given degree. See how the most probable outcome is that a neuron does not change its firing rate at all.
These results strongly suggest that the auditory cortex is sparsely coded. Neurons generally only change their firing rates in response to a narrow range of stimuli with the majority of neurons remaining unchanged.
There are some interesting complexities in the data, however. The first is that some of the neurons in their sample are broadly tuned: they respond to a broad range of stimuli. These neurons would seem to violate the principle that I described earlier — that sharp tuning is associated with sparse representations. The authors explain this anomaly as follows:
Half of the cells (50%) did not show any significant change (increase or decrease) in firing rate during any response epoch, to any stimulus; an example of such an unresponsive neuron was shown in Figure 2H. At the other extreme, a few broadly tuned cells showed significant changes in firing rate in all (four or five) octave bins (i.e., across the whole frequency space tested) for at least one of the response periods.
It might appear that the sparseness we report is incompatible with the broad frequency tuning of rat auditory cortical neurons. However, we found that sparseness was not achieved through narrow frequency tuning. Instead, it arose through a combination of factors. First, 50% of the neural population failed to respond to any of the simple stimuli we presented. Second, responses were often brief; in many neurons, the change in firing rate was limited to just one of the three response epochs. Thus, sparseness of the response in time contributed to the overall sparseness of the population response. Finally, even when changes occurred they were typically small; the increase in firing rate exceeded 20 sp/s in only about a quarter of the statistically significant responses. As a result, only a small fraction of neurons responded vigorously to any tone even though frequency tuning was broad.
What this means is that while most neurons do not respond to a particular stimulus (suggesting sparse encoding), a fraction are promiscuous responders: broadly tuned neurons. The authors speculate that the promiscuous responders might actually be a different type of neuron. In contrast to pyramidal cells — which are mostly excitatory — the promiscuous cells might be inhibitory interneurons. This subset seems to violate the tight tuning–sparse coding rule I discussed earlier:
Although definitive identification of interneurons requires other techniques such as morphological reconstruction, it is likely that majority of highly responsive cells in our sample were not excitatory pyramidal neurons. We speculate that the high responsiveness of inhibitory interneurons might contribute to population sparseness of stimulus-evoked responses by simply inhibiting responses of pyramidal neurons in the auditory cortex. Such inhibition could then lead to sparse communication between the primary auditory cortex and higher sensory cortical areas in awake animals.
Complexities aside, revealing that the auditory cortex encodes stimuli sparsely has important implications. According to some theories, sparse coding is more energy efficient than dense coding. This would explain why we see sparse coding in other areas of the brain besides the auditory cortex. The authors talk about the significance of their work:
The population sparseness in the awake auditory cortex we described arose through a combination of three factors. First, half of neurons failed to respond to any tone we presented. Second, responses were often brief. Third, the amplitude of responses was usually low. Thus, even though the frequency tuning of single neurons is usually broad, only a small fraction of neurons responded vigorously and most neurons were silent.
Experimental evidence for sparse coding has been found in a range of experimental preparations, including the visual, motor, barrel, and olfactory systems, the zebra finch auditory system, and cat lateral geniculate nucleus. However, the sparseness of representations in the auditory cortex has not been explicitly addressed in previous work. Our results constitute the first direct evidence that the representation of sounds in the auditory cortex of unanesthetized animals is sparse.
Our data support the “efficient coding hypothesis,” according to which the goal of sensory processing is to construct an efficient representation of the sensory environment. Sparse codes can provide efficient representations for natural scenes. Sparse representations may also offer energy efficient coding, where fewer spikes are required compared to dense representations. (Emphasis mine. Citations removed.)
For the reasons that I discussed above related to the grandmother neuron, there appears to be a certain trade-off at work in encoding density. On the one hand, you want things to be as sparse as possible because activating fewer neurons is energy efficient. On the other hand, if you encode using too few neurons, you lose out on coding capacity and the grandmother neuron paradox starts to come into play. It has also been argued that sparse representations are easier to identify, and hence to learn, than dense representations. If you are a higher-order brain region attempting to decipher a stimulus represented in a lower-order brain region, big changes in activity in a small number of neurons are easier to recognize than small changes in activity in a large number of neurons. With dense coding, you get the added issue that changes in activity may be difficult to resolve from baseline activity.
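Some toy signal-to-noise arithmetic makes the readability argument concrete. All of the numbers here are mine and purely illustrative; the only assumption from the source is that sparse codes mean big changes in few neurons and dense codes mean small changes in many:

```python
import math

# Assume each neuron's firing rate fluctuates trial to trial with a
# standard deviation of 5 sp/s (an invented figure).
noise_sd = 5.0

# Sparse code: a few neurons change their rate by 20 sp/s.
# Dense code: many neurons change their rate by 0.5 sp/s.
snr_sparse = 20.0 / noise_sd  # 4 standard deviations: easy to spot
snr_dense = 0.5 / noise_sd    # 0.1 SD: buried in baseline noise

# Averaging n independent neurons improves SNR by sqrt(n), so a reader
# of the dense code needs many neurons to match one sparse responder.
n_needed = math.ceil((snr_sparse / snr_dense) ** 2)
print(n_needed)  # 1600
```

Under these made-up numbers, a downstream region could decode the sparse code from a single responding neuron, but would have to pool on the order of a thousand neurons to pull the dense code out of the noise.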
Hat-tip: Faculty of 1000
Hromadka, T., DeWeese, M.R., Zador, A.M. (2008). Sparse Representation of Sounds in the Unanesthetized Auditory Cortex. PLoS Biology, 6(1), e16. DOI: 10.1371/journal.pbio.0060016