Neural Substrates of Symbol Use

The ability to use and manipulate symbols has been heralded as uniquely human (although we know of at least a few cases where that seems untrue). The cognitive processes involved in symbol use have proven difficult to understand, perhaps because reductionist scientific methods decompose this rich domain into a variety of smaller components, none of which seems to capture the most important or abstract characteristics of symbol use (as discussed previously).

So, it's important to specify how the simpler and better-understood aspects of symbol processing may interact and give rise to higher-level aspects of symbol processing, such as those involved in algebra. A productive approach to solving this problem of "reconstruction" is to consider the computational tradeoffs that may confront a flexible symbol processing system.

The use of symbols seems to require a delicate balance between many such tradeoffs, or mutually exclusive computational demands:

1) symbols must be simultaneously distinct from their referents but must also be "grounded" and tightly-coupled with them (e.g., "x" in an equation);

2) symbols must be able to flexibly represent a variety of referents, yet should have extreme stability in their abstract meaning over time (e.g., no matter how many problems are encountered where "x = 5", x as an abstract symbol no more represents 5 than any other number); and

3) symbols must generalize across only some characteristics of their referents (e.g., "↑" can refer to both "north" and "up," but does not also refer to associates of both, such as "the northern lights," which are both "north" and "up").

These conflicting demands of symbolic processing pose a formidable challenge to any single-mechanism account of symbol processing. Perhaps in order to circumvent these inherent tradeoffs, the brain apparently recruits a wide variety of regions in symbol processing tasks. However, these widespread patterns of activity can be functionally partitioned by considering how each region may balance the conflicting demands described above.

For example, prefrontal regions may be particularly suited to balancing the tradeoffs between symbol abstraction and symbol grounding due to their proximity to motor cortex: pre-motor regions may provide more concrete links between a symbol and its referent's sensorimotor representation. In contrast, more anterior regions of prefrontal cortex may represent increasingly abstract, invariant, or temporally protracted aspects of the symbol, including those aspects which are not tied to any of the symbol's more transient and concrete referents.

A variety of evidence supports this view, including the widely noted fact that representations tend to become more abstract and may span longer temporal episodes in more anterior regions of PFC. Consistent with these claims, frontopolar PFC is activated by "relational integration" tasks, which likely involve highly abstract symbolic processing and certainly have many of the same conflicting demands. In contrast, physically attaching a symbol to its referent can enhance the performance of both children and monkeys in symbolic tasks, as though abstract symbols initially develop out of these more concrete sensorimotor representations.

Thus PFC seems ideally suited to representing the entire continuum of symbolic reference, from the most abstract to the very concrete. However, it lacks a mechanism for selective updating, a particularly important feature in symbol manipulation. For example, when solving "2x = 6," the immediate goal requires that x is updated to equal "3", but long-term success with algebra requires that x remains equal to "unknown variable". As with planning, this flexibility may be provided by basal ganglia-mediated selective updating. Consistent with this claim, computational models implicate the basal ganglia in the updating of symbolic representations, as does some neuroimaging evidence.
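The "2x = 6" example can be made concrete with a toy gating sketch. This is only an illustration of the general idea of selective updating, not a model of the basal ganglia: the class name, slot names, and gate flags are all hypothetical and invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class GatedMemory:
    """Toy working memory with one gate per slot (all names hypothetical)."""
    slots: dict = field(default_factory=dict)

    def step(self, inputs: dict, gates: dict) -> None:
        # A slot takes a new value only when its gate is open;
        # otherwise it keeps maintaining whatever it already holds.
        for name, value in inputs.items():
            if gates.get(name, False):
                self.slots[name] = value


# Two aspects of the symbol "x": its stable abstract role and its
# transient binding within the current problem.
memory = GatedMemory({"x_role": "unknown variable", "x_binding": None})

# Solving 2x = 6: the same input (3) reaches both slots, but only the
# gate for the transient binding is open.
memory.step(
    inputs={"x_role": 3, "x_binding": 3},
    gates={"x_role": False, "x_binding": True},
)

print(memory.slots)  # {'x_role': 'unknown variable', 'x_binding': 3}
```

The point of the sketch is that the immediate goal and the long-term meaning can coexist as long as updating is selective: the binding changes while the role is left alone.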

But while the PFC-BG circuit is capable of maintaining both the concrete and abstract aspects of symbols, as well as selectively updating those representations, it does so only by actively maintaining these representations in posterior cortex. In other words, prefrontal and basal ganglia representations may act only as "pointers" to the more permanent home of those representations in posterior cortex. Yet this raises an etiological problem: how do these symbols (wherever they reside) ever capture meaningful but abstract relationships across many representations?

The extraction of these meaningful but general relationships may lie with anterior temporal regions, such as inferotemporal cortex, which seems to contain highly distributed and overlapping representations encoding the compositional features of the environment. In other words, symbols may be the nexus of a large number of overlapping representations. For example, the representation of the symbol "truth" may be the overlap between the representations for "reality", "p → p", and "¬ lie." Another example: the symbolic meaning of "↑" includes both "north" and "up" because those representations overlap significantly, but does not include "the northern lights," which may consist not only of "up" and "north" but also a variety of other representational components (like "bright"). The point is that symbols require the extraction of meaningful relationships among a number of more variable representations, a role for which posterior cortex is well suited.
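The overlap idea in the arrow example can be sketched with plain feature sets. The feature names below are invented purely for illustration (only "bright" comes from the text), and real cortical representations are of course graded and distributed rather than discrete sets.

```python
# Hypothetical feature sets, invented for illustration only.
representations = {
    "north": {"vertical_axis", "positive_direction", "map_convention"},
    "up": {"vertical_axis", "positive_direction", "against_gravity"},
    "northern lights": {
        "vertical_axis", "positive_direction", "bright", "sky", "night",
    },
}


def shared_core(names):
    """Features common to every named representation."""
    return set.intersection(*(representations[n] for n in names))


def overlap(name, core):
    """Jaccard similarity between one representation and a symbol's core."""
    feats = representations[name]
    return len(feats & core) / len(feats | core)


# The arrow symbol is taken to be whatever "north" and "up" have in common.
arrow = shared_core(["north", "up"])

for name in representations:
    print(name, round(overlap(name, arrow), 2))
```

Under these assumed features, "the northern lights" contains the arrow's core but is diluted by its additional components, so it matches the symbol less well than "north" or "up" do.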

Evidence for the capacity of posterior cortex to accomplish these forms of generalization comes from computational models demonstrating that Hebbian learning is sufficient for extracting the principal components of the environment, from low-level sensory perception (e.g., in the formation of V1 receptive fields) all the way up to the recognition of individual faces in the anterior temporal lobe. Admittedly, neuroimaging evidence for this claim is scarce, but traditional analysis methods may not be sensitive enough to pick up the extremely sparse activations that this account predicts underlie abstract symbols. In fact, this sparseness may itself be the reason that ventrolateral prefrontal cortex is often activated in symbolic tasks: active maintenance by PFC may be especially necessary to keep such sparse representations active in posterior cortex, a region usually characterized by more distributed patterns of activity.
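To show how a Hebbian rule can pull out a principal component, here is a minimal sketch using Oja's rule, a standard normalized form of Hebbian learning. It is not necessarily the rule used in the models mentioned above, and the two-input data, learning rate, and step count are arbitrary assumptions chosen for illustration.

```python
import math
import random

random.seed(0)


def sample():
    # Two correlated inputs built from independent unit Gaussians.
    # Their covariance matrix is [[4, 3], [3, 2.5]].
    z1, z2 = random.gauss(0, 1), random.gauss(0, 1)
    return (2.0 * z1, 1.5 * z1 + 0.5 * z2)


def oja(n_steps=5000, lr=0.005):
    w = [1.0, 0.0]  # arbitrary starting weights
    for _ in range(n_steps):
        x = sample()
        y = w[0] * x[0] + w[1] * x[1]  # activity of the single output unit
        # Hebbian term (y * x) plus Oja's decay term (-y^2 * w),
        # which keeps the weight vector near unit length.
        w = [w[i] + lr * y * (x[i] - y * w[i]) for i in range(2)]
    return w


w = oja()

# Analytic leading eigenvector of the 2x2 covariance, for comparison.
theta = 0.5 * math.atan2(2 * 3.0, 4.0 - 2.5)
pc1 = (math.cos(theta), math.sin(theta))

# Absolute cosine between learned weights and the first principal component.
alignment = abs(w[0] * pc1[0] + w[1] * pc1[1]) / math.hypot(*w)
print(round(alignment, 3))
```

An alignment close to 1 means the weights have rotated onto the direction of greatest variance in the input, using only locally available pre- and postsynaptic activity.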

In summary, these conflicting computational demands may, like those of planning, be balanced by the recruitment of multiple brain regions. First, the graded abstractness of frontal representations allows PFC to represent both the most abstract and the more grounded aspects of symbols, resolving the conflicting demand that symbols be abstract and yet tightly coupled with their referents. Second, the selective updating required by symbol processing can be accomplished by the basal ganglia (BG). Finally, the particular type of generalization manifested by symbols may rely on the slow extraction of systematic correlations in the environment, as accomplished by the temporal lobe and sensory cortex over the course of experience.

If neuroscientists can verify the discussed aspects of symbol handling, it widely exceeds my expectations of the scope of the explanations.

Specifically, since the more abstract uses of symbols in math have a neverending complexity, it seems very neat if computational trade offs can describe some properties. That binding or resolution of symbols ("x = 5") can be interpreted as selective updating is a revelation for this layman.

And I like the connection back to… , since its picture of spontaneous symbol-like processing and consequent ability to handle non-training data is impressive.

On another note, first reading back on the posts on symbol use was painful because of the continued critique of reductionism and use of parsimony as if it applies to models which cover different data sets and predictions. But one quote shows that it is not quite so:

And here is the crucial problem with hyper-reductionism (what I previously and incorrectly referred to as Occam's Razor). The preference for theories with relatively fewer assumptions (among all those other theories which are identical in explanatory power) ultimately leads us towards highly simplistic models of nature which apply only to one level of analysis, if they apply at all.

Exactly! Naive reductionism which thinks a convenient model may be automatically applicable as a basis for a more general one is ... naive.

Elsewhere there is a comment on the difference between pure and applied sciences here. IMHO naive reductionism does not really seem to be a problem in general. But it is perhaps true that applied sciences are more used to model-building for complex mixed systems even if we see those elsewhere too.

Elevating 'top-down' reductionism to the only approach instead of permitting 'bottom-up' modeling would probably be a problem for any area.

For example, doing a quasistatic low-frequency analysis of electronic circuits usually permits lumping some loads and omitting some internal nodes, while a full blown high-frequency analysis needs more elements.

Naive reductionism could get stuck on the first model, perhaps trying to extrapolate it into an incorrect description for higher frequencies. Unless one can probe the internal nodes and recover the full description from the partial description + the probe results, which perhaps counts as part of a "reconstructionist" effort.

By Torbjörn Larsson, OM (not verified) on 30 May 2007

I think your points are all valid - you might be surprised to see how many people question the utility of modeling in neuroscience.

I should mention the selective updating of symbols is just my own idea, and there is no evidence specifically in favor of that idea, as opposed to symbols & their referents merely being bound through the hippocampus for that particular episode. But it's an empirical question I guess.

Thanks for the insightful comments as always.