Simulated Simulation: Mirror Neurons Emerge in a Speech Recognition Model

Speech recognition remains a daunting challenge for computer programmers, partly because the continuous speech stream is highly under-determined. Take coarticulation, for example: the auditory frequencies corresponding to a given phoneme are strongly influenced by the phonemes both preceding and following it, a fact sometimes interpreted to mean that there is no invariant set of purely auditory characteristics defining any given phoneme. Thus it's difficult to recover the words a person is saying, since each part of a word is influenced by the sounds surrounding it (and so on, ad infinitum).

One class of speech perception theories (the "motor theories") proposes that the brain circumvents this computational problem in a very interesting way: the incoming speech stream is actually simulated by the motor system, so that the perceiver can, through some reverse translation process, recover the intended articulatory components of the speech stream.

There's at least some interesting evidence to support these theories, as reviewed by Westermann and Miranda in their recent Brain and Language paper, including the fact that deaf or tracheotomized infants don't show normal babbling. This indicates a tight coupling between speech gestures and speech perception.

Westermann & Miranda report their efforts to simulate the motor theory of speech perception in a computational neural network model of development, which consists of two processing layers: a motor layer and a perception layer.

The model can operate in two modes: listening, or listening + babbling. The perception layer consists of neurons that receive input from the first two peak frequencies of an incoming sound; each unit is randomly centered in the potential input space, and its activation is calculated according to a Gaussian function of the distance between the input and that centroid. Similarly, each unit in the motor layer is tuned to a particular random combination of parameters in a realistic speech synthesizer, with activation again falling off according to a Gaussian function of distance. Critically, these layers are bidirectionally connected to one another.
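To make this concrete, here is a minimal sketch in Python of what such an architecture might look like. The layer sizes, the Gaussian tuning width, and the choice of three articulatory parameters are my own illustrative assumptions, not the values used in the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

N_PERC, N_MOT = 100, 100   # number of units per layer (assumed)
SIGMA = 0.15               # width of each unit's Gaussian tuning (assumed)

# Each perceptual unit is centered on a random point in normalized
# (F1, F2) formant space; each motor unit on a random point in a
# 3-parameter articulatory space (dimensions chosen for illustration).
perc_centroids = rng.uniform(0, 1, size=(N_PERC, 2))
mot_centroids  = rng.uniform(0, 1, size=(N_MOT, 3))

def gaussian_activation(centroids, x, sigma=SIGMA):
    """Each unit's activation falls off as a Gaussian of the distance
    between its centroid and the current input vector."""
    d2 = np.sum((centroids - x) ** 2, axis=1)
    return np.exp(-d2 / (2 * sigma ** 2))

# Bidirectional connections between the two layers, to be learned.
w_perc_to_mot = np.zeros((N_MOT, N_PERC))
w_mot_to_perc = np.zeros((N_PERC, N_MOT))
```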

It is easy to see why babbling would be important for development in a model like this: the network is effectively training itself on the correspondence between motor parameters and the resulting sounds. In other words, it produces a sound (thus providing itself with motor input) and then hears that sound (thus providing itself with auditory input). Westermann & Miranda use a Hebbian algorithm (essentially, "fire together, wire together") to allow these representations to become associated with one another, and thus for the network to learn.
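Continuing the sketch above, a single babbling step might look something like the following. The outer-product Hebbian update is a generic "fire together, wire together" rule rather than the paper's exact learning equation, and the `synthesize` function is a purely hypothetical stand-in for the articulatory speech synthesizer:

```python
def synthesize(motor_params):
    """Hypothetical stand-in for the articulatory synthesizer: maps a
    motor parameter vector to the (F1, F2) formants of the resulting
    sound. Here it is just a fixed linear mapping for illustration."""
    M = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.3, 0.6]])
    return M @ motor_params

LEARNING_RATE = 0.05   # assumed

def babble_step():
    """One babbling cycle: produce a random articulation, hear the
    resulting sound, and Hebbian-associate the two activity patterns."""
    global w_perc_to_mot, w_mot_to_perc
    motor_params = rng.uniform(0, 1, size=3)   # self-generated gesture
    formants = synthesize(motor_params)        # the sound it produces

    a_mot  = gaussian_activation(mot_centroids, motor_params)
    a_perc = gaussian_activation(perc_centroids, formants)

    # Strengthen connections between co-active perceptual and motor
    # units (outer product of the two activation vectors).
    w_perc_to_mot += LEARNING_RATE * np.outer(a_mot, a_perc)
    w_mot_to_perc += LEARNING_RATE * np.outer(a_perc, a_mot)

for _ in range(1000):
    babble_step()
```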

The authors demonstrate how "preferred response" regions develop within the network: units migrate towards those areas of the parameter space where the perceptual-motor mapping is most consistent, so that roughly linear relationships between motor and auditory changes come to dominate. There are two potential reasons for this emphasis on linear relationships: first, Westermann & Miranda appear to use a linear activation function (in contrast to the sigmoidal functions used in other network formalisms); second, they have elected not to include a hidden layer between the perceptual and motor layers, limiting the computational power of the network to linear relationships.
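One way to picture this migration (a schematic reading on my part, not the paper's exact equation) is a self-organizing-map-style update that pulls each unit's centroid toward the inputs it responds to most strongly, so units gradually drift into the regions of input space that activate them most often:

```python
MIGRATION_RATE = 0.01   # assumed

def migrate(centroids, x, sigma=SIGMA, eta=MIGRATION_RATE):
    """Pull each unit's centroid toward the current input, weighted by
    how strongly that unit responded; frequently activated units drift
    toward the region of input space that drives them."""
    a = gaussian_activation(centroids, x, sigma)
    centroids += eta * a[:, None] * (x - centroids)
    return centroids
```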

The authors also demonstrate how exposure to a language environment such as French or German can skew the kinds of perceptual sensitivities that self-organize in the network, by causing migration of the perceptual units towards those frequency distributions most prevalent in the ambient language environment.
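As a toy illustration of that effect, continuing the sketch, one could repeatedly present formant samples drawn from an invented "ambient language" distribution and let the perceptual centroids drift toward it (the vowel prototypes below are made up for illustration, not actual French or German measurements):

```python
# Toy "ambient language": formant pairs clustered around two invented
# vowel prototypes.
vowel_prototypes = np.array([[0.3, 0.7],
                             [0.6, 0.4]])

for _ in range(5000):
    proto = vowel_prototypes[rng.integers(len(vowel_prototypes))]
    formants = proto + rng.normal(0, 0.05, size=2)   # noisy vowel token
    perc_centroids = migrate(perc_centroids, formants)

# After exposure, perceptual units cluster near the ambient vowels.
```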

The end result of this architecture and learning algorithm is a set of "mirrored" perceptual/motor units which may respond regardless of whether speech is self-produced or merely heard (or presumably, merely seen).
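In terms of the sketch above, the "mirror" property is simply that the same motor units can be driven either directly during production or indirectly, through the learned perception-to-motor connections, during listening:

```python
def motor_response_to_heard_sound(formants):
    """Activation of the motor layer driven purely by auditory input,
    via the learned perception-to-motor connections."""
    a_perc = gaussian_activation(perc_centroids, formants)
    return w_perc_to_mot @ a_perc

# A sound the network merely hears now also activates the motor units
# that would have been involved in producing it -- a "mirrored" response.
heard = synthesize(rng.uniform(0, 1, size=3))
print(motor_response_to_heard_sound(heard).argmax())
```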

The larger point:

Of course there are numerous shortcomings to the model (many of which Westermann & Miranda admit), but the model's success in the face of these shortcomings illustrates the simplicity of the assumptions required for mirror neurons to emerge in a computational architecture.

According to this view, mirror neurons are essentially a "convergence zone" for sensory and motor input. The apparent location of mirror neurons in the human (as extrapolated from their location in monkeys) seems to support this idea: they tend to be located in premotor and planning-related regions of the cortex, areas which require a tight relationship between sensory and motor information. This also hints towards one explanation for the most fascinating characteristic of mirror neurons: they appear to be "goal" (or at least "object") directed, in that mirror neurons in monkeys will not fire to the mere observation of mimed behaviors when they are not plausibly goal- or object-directed. One might speculate that this apparent goal-sensitivity is related to the benefits that "convergence zones" for sensori-motor input have in object-directed action.
