Does it matter that cortical thickness correlates with intelligence?

Numerous studies have attempted to correlate general intelligence with different anatomical measures. (You might even argue that the phrenologists were working in this vein.) Likewise, many studies have attempted to relate intelligence to the function of different brain regions -- using techniques like fMRI or PET scanning. However, relatively few studies have attempted to correlate general intelligence with anatomical features of particular brain regions.

This is important because we know that the brain works not as regions operating in isolation, but as a set of neural systems comprised by many different brain regions each performing relatively simple tasks. In order to answer the question, "how does intelligence work?", we first have to ask the question, "what brain regions comprise the neural system underlying intelligence?"

Adding to this body of work, Karama et al. published a study that looked at a large cohort of children aged 6 to 18 using MRI measures and psychological batteries. They attempted to correlate the cortical thickness of different brain regions with g-factor -- a measure of general intelligence (more on this in a second). The authors identified several brain regions whose thickness correlated with intelligence in this cohort.

I want to talk about both the significance of this paper in our understanding of intelligence and why we might be skeptical that this is the whole story.

g-factor is a measure of general intelligence developed by psychologist Charles Spearman. It is predicated on the notion that performance across different cognitive batteries tends to be positively correlated -- suggesting that there may be some underlying factor "g" that supports the diverse processes measured in these tests. Psychological tests for g-factor use principal component analysis -- a way of identifying different factors in data sets that involve mixtures of effects.

A couple of things about g. One, g-factor is different from IQ, although the two are highly correlated. IQ is more of an additive measure of performance on cognitive tests, whereas g-factor looks for the factor underlying performance on those tests. Two, and more importantly, g-factor is very controversial. I want to talk about this more later on, but there is a lot of debate about whether g-factor measures anything important.
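To make that distinction concrete, here is a minimal sketch in Python -- all data are simulated, not from the paper, and the loadings and noise levels are arbitrary choices of mine -- contrasting an additive, IQ-like composite with a g-like score extracted from the first principal component of the tests' correlation matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects, n_tests = 500, 6

# Simulate a latent general ability plus test-specific noise
# (all parameters here are illustrative, not from the paper).
g_true = rng.standard_normal(n_subjects)
loadings = rng.uniform(0.5, 0.9, n_tests)  # each test partly reflects g
scores = np.outer(g_true, loadings) + 0.6 * rng.standard_normal((n_subjects, n_tests))

# The "positive manifold": all pairwise test correlations come out positive
R = np.corrcoef(scores, rowvar=False)
assert (R[np.triu_indices(n_tests, k=1)] > 0).all()

# First principal component of the correlation matrix ~ a g-like factor score
eigvals, eigvecs = np.linalg.eigh(R)
pc1 = eigvecs[:, -1]                       # eigenvector of the largest eigenvalue
z = (scores - scores.mean(0)) / scores.std(0)
g_est = z @ pc1

# An IQ-like additive composite simply sums the standardized test scores
iq_like = z.sum(axis=1)

# Both correlations are high in this simulation: PC1 recovers the latent
# factor, and the additive composite tracks it closely, as the post notes.
print(abs(np.corrcoef(g_est, g_true)[0, 1]))
print(abs(np.corrcoef(g_est, iq_like)[0, 1]))
```

The near-equivalence of the two scores in this toy example is exactly why the "g vs. IQ" distinction matters conceptually more than numerically: they rank people almost identically, but only the factor-based score carries the theoretical claim of a single underlying ability.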

Karama et al. used data from a total of 252 subjects (filtered down from many more) who had complete MRI and cognitive battery results. They then used a computer algorithm -- called Constrained Laplacian Anatomic Segmentation using Proximity (CLASP) -- to measure the gray-matter cortical thickness of different brain regions. The cognitive battery used was the Wechsler Abbreviated Scale of Intelligence (WASI). The authors then used a multiple linear regression model to estimate which brain regions' cortical thickness correlated with g-factor.

Results:

  • No correlation between demographic characteristics and cortical thickness besides mean age.
  • Several regions showed a positive correlation with g-factor -- controlling for age and gender. These regions included much of the dorsolateral prefrontal cortex, the parietal cortex above and below the angular gyrus, the inferotemporal cortex and parahippocampal gyrus, and the cingulate gyrus. (If none of these terms make sense, hold on for a second.)
  • "Correlations in statistically significant foci were in the modest to moderate range (0.15 to 0.32)."
  • Further, when they stratified their sample into young children and adolescents, they found that most of the positive correlation disappeared in the young children at the previous threshold. However, as they relaxed their criterion, they found that the correlations appeared in the same regions as in the older children. "In statistically significant foci, correlations ranged between .25 and .44 for adolescents and between .2 and .33 for young children."

Below is an example of what their data looks like (from the whole data set, not stratified). Figure 1 from the paper:

[Figure 1: cortical maps of the thickness-g correlation]

The authors' data are significant for several reasons, but also merit some skepticism.

It is significant because very few studies to date have related cortical thickness in particular brain regions to intelligence. Having found correlations in some brain regions but not others, this study can support or challenge particular theories of how intelligence is supported in the brain.

One popular theory to which this paper adds support is the Parieto-Frontal Integration Theory (P-FIT). I am not a specialist, but as I understand it P-FIT works something like this: the parietal cortex contains a "map" of attention that is the apex of activity from different sensory regions. Basically, as sensation comes into the brain it is refined into greater and greater levels of abstraction. Points of light become lines; lines become shapes; shapes become abstract objects. The end product of this refinement towards abstraction is a multi-modal map of different stimuli in the parietal cortex. In contrast, the frontal cortex is involved in adding significance to those objects and making decisions about them. The frontal cortex and the parietal cortex are reciprocally connected in what is called the frontoparietal loop. The argument is that circulating activity in this loop generates intelligent responses. This paper adds evidence to that hypothesis on the grounds that many of the regions that are key to the frontoparietal loop were also identified as having cortical thickness that correlated with g-factor.

Interestingly, these were not the only regions. Places like the cingulate cortex and the parahippocampal gyrus are not associated with the frontoparietal loop. These regions may underlie other aspects of intelligence not fully accounted for in that theory -- such as memory, etc.

So I think this paper is important in its relation to some theories of intelligence.

But there are a couple reasons for skepticism.

The first is that the "cortical thickness = high function" assumption present in this line of research simplifies something complicated. It is true that cortical thickness generally correlates with function. For example, there is a general decline in cortical thickness with normal aging, and higher cortical thickness is correlated with higher performance on cognitive tests in this age group. Further, there is a much more pronounced, pathological decline in cortical thickness associated with neurodegenerative diseases such as Alzheimer's.

However, it is probably more complicated than that. Some studies have shown that it is the rate of increase in cortical thickness -- rather than the thickness itself -- that correlates best with intelligence during child development. Further, there are certainly examples of diseases where there is relatively high cortical thickness but relatively low function. For example, individuals with congenital amusia show higher than normal cortical thickness in parts of the auditory cortex. (They also show lower than normal white matter volume.) We could certainly come up with examples where the cortex is thicker, but less functional due to developmental abnormality.

The second and broader critique of this work is whether the tests that we have for "intelligence" measure something useful in the brain. The concepts of IQ and g-factor have been questioned by several authors. Stephen Jay Gould actually wrote a whole book -- The Mismeasure of Man -- trying to debunk the assumption that intelligence can be measured in a single number. (For a more recent and excellent critique, I recommend this article by Cosma Shalizi.) The common theme among many of these critiques is that the tests for intelligence conflate numerous separable brain processes into a single number. As a consequence, 1) you aren't sure what you are measuring, 2) you can't associate what you are measuring with a particular region (the output may be the result of an emergent process of several regions), and 3) you may be eliding significant differences in performance across individuals that you would recognize with a better test.

From the point of view of a behavioral neuroscientist, I must say that I agree with these critiques. Behavioral neuroscientists try to understand how different parts of the brain encode and transform information to initiate the behavioral result. Our operating theory is that the brain is composed of many regions. Each region performs a relatively simple computation. These regions interact in neural systems to support complex behaviors. From this point of view, what does intelligence even mean? We would be much more likely to try and break it down into sub-processes before attempting to assign it to brain regions, but they have done the opposite in this paper.

This is why, ultimately, though I think studies like this are interesting as they relate to theories of intelligence -- like P-FIT -- I would much rather they abandon g-factor or IQ as their behavioral measures. I am willing to accept the utility of IQ as, say, a diagnostic test for Alzheimer's disease. If you could show me that IQ declines in early Alzheimer's, that IQ is related to cortical thickness, and that cortical thickness declines in early Alzheimer's, that would be great. We could use it as a diagnostic tool that doesn't require imaging. But past that, I fail to see the utility in lumping together such diversity.

But that is just my two cents. I must be interested, or I wouldn't have written such a long article about it.

Hat-tip: Science Daily

Karama, S., Ad-Dab'bagh, Y., Haier, R., Deary, I., Lyttelton, O., Lepage, C., & Evans, A. (2009). Positive association between cognitive ability and cortical thickness in a representative US sample of healthy 6 to 18 year-olds. Intelligence, 37(2), 145-155. DOI: 10.1016/j.intell.2008.09.006


You know, for a lot of these correlational studies it would be interesting if the teams did something like what the big physics accelerator teams do. That is, create a model, to whatever level they like, then run Monte Carlos with varying inputs. I know this is investigative work, but it always seems like there are hundreds of these kinds of studies. Which means tens of them are going to give statistically valid evidence of things that are junk.

In some sense there are things to learn. Obviously cortical thickness in the limit is understandably correlated with g or any kind of similar score: 0 to 0 at the bottom. I never see graphs in reporting of health studies that do runs over ranges or similar things which show confidence levels. They may be in the original papers, but not in the reporting.

I would think cortical thickness would be correlated with increased level of interconnection of inputs with outputs, representing some sort of increased complexity of calculation in the region.

This is important because we know that the brain works not as regions operating in isolation, but as a set of neural systems comprised by many different brain regions each performing relatively simple tasks. In order to answer the question, "how does intelligence work?", we first have to ask the question, "what brain regions comprise the neural system underlying intelligence?"

In this regard I'd like to call your attention to A Proposal for a Coordinated Effort for the Determination of Brainwide Neuroanatomical Connectivity in Model Organisms at a Mesoscopic Scale by Bohland et al, referred to by Coturnix this morning. In addition to knowing what regions are involved in "intelligence", it would be nice to have fully organized information regarding their interconnections.

"The second and broader critique of this work is whether the tests that we have for "intelligence" measures something useful in the brain."

The discovery of 'g' seems to suggest that it does? And given how effective it is in predicting social/economic outcomes, it's worth looking at?

Gottfredson, L. S. (2009). Logical fallacies used to dismiss the evidence on intelligence testing. In R. Phelps (Ed.), Correcting fallacies about educational and psychological testing (pp. 11-65). Washington, DC: American Psychological Association.

http://www.udel.edu/educ/gottfredson/reprints/2009fallacies.pdf

Jake, this is a good review and I agree with many of your major conclusions. However, your summary of the literature on g has several problems.

[g-factor] is predicated on the notion that performance across different cognitive batteries tends to be positively correlated

A quibble -- the positive correlation between performance on different test items is not just a notion but an empirical observation that has been supported by millions of data points over the last century. More on this below.

Psychological tests for g-factor use principal component analysis -- a way of identifying different factors in data sets that involve mixtures of effects.

Factor analysis, not PCA, is the method used by psychometricians. They are similar in principle but not in application.

g-factor is very controversial.

Not among intelligence researchers.

In this review, we emphasize intelligence in the sense of reasoning and novel problem-solving ability (BOX 1). Also called fluid intelligence (Gf), it is related to analytical intelligence [1]. Intelligence in this sense is not at all controversial...

ref.1

[These authors go on to explain that in their view Gf and g are one and the same.]

From another review:

Here (as in later sections) much of our discussion is devoted to the dominant psychometric approach, which has not only inspired the most research and attracted the most attention (up to this time) but is by far the most widely used in practical settings.

ref.2

This was published over a decade ago. The psychometric approach has continued to attract the most research and attention and is still by far the most widely used.

The second and broader critique of this work is whether the tests that we have for "intelligence" measures something useful in the brain.

There's wide agreement that the tests measure something useful about human behavior:

In summary, intelligence test scores predict a wide range of social outcomes with varying degrees of success. Correlations are highest for school achievement, where they account for about a quarter of the variance. They are somewhat lower for job performance, and very low for negatively valued outcomes such as criminality. In general, intelligence tests measure only some of the many personal characteristics that are relevant to life in contemporary America. Those characteristics are never the only influence on outcomes, though in the case of school performance they may well be the strongest.

ref.2

A more standard criticism of g:

while the g-based factor hierarchy is the most widely accepted current view of the structure of abilities, some theorists regard it as misleading (Ceci, 1990).

ref.2

that is:

One view is that the general factor (g) is largely responsible for better performance on various measures [40,85]. A contrary view accepts the empirical, factor-analytic result, but interprets it as reflecting multiple abilities, each with corresponding mechanisms [141]. In principle, factor analysis cannot distinguish between these two theories, whereas biological methods potentially could [10,22,36]. Other perspectives recognize the voluminous evidence for positive correlations between tasks and subfactors, but hold that practical, creative [142] and social or emotion-related [73] abilities are also essential ingredients in successful adaptation that are not assessed in typical intelligence tests. Further, estimates of individual competence, as inferred from test performance, can be influenced by remarkably subtle situational factors, the power and pervasiveness of which are typically underestimated [2,136,137,143].

ref.1

The concepts of IQ and g-factor have been questioned by several authors. Stephen Jay Gould actually wrote a whole book -- The Mismeasure of Man -- trying to debunk the assumption that intelligence can be measured in a single number. (For a more recent and excellent critique, I recommend this article by Cosma Shalizi.) The common theme among many of these critiques is that the tests for intelligence conflate numerous separable brain processes into a single number. As a consequence, 1) you aren't sure what you are measuring, 2) you can't associate what you are measuring with a particular region (the output may be the result of an emergent process of several regions), and 3) you may be eliding significant differences in performance across individuals that you would recognize with a better test.

You give too much credit to Gould and Shalizi. Their primary criticisms are far less reasonable than the points you make.

The main thrusts of their arguments are that test data do not statistically support a g-factor. Gould's argument is statistically incompetent (for a statistician's critique see Measuring intelligence: facts and fallacies by David J. Bartholomew, 2004). Shalizi's criticism is incredibly sophisticated, but likewise incorrect. In a nutshell, Shalizi is trying to argue around the positive correlations between test batteries. If those correlations didn't exist, his argument would be meaningful. However, as I noted above, these intercorrelations are one of the best documented patterns in the social sciences.

significant differences in performance across individuals that you would recognize with a better test.

It's possibly not well known that enormous efforts have gone into trying to make tests that have practical validity for life outcomes yet do not mostly measure g. See for example the works of Gardner and Sternberg. The current consensus is that their efforts have failed. A notable exception might be measures of personality.

Conclusion:

Ultimately, we need to use biological measures such as cortical volume to determine what g really is. One possible approach is to combine chronometric measurements (e.g. reaction time) with brain imaging studies. Genetically informed study designs have a role to play here too.

references:
[1] www.loni.ucla.edu/~thompson/PDF/nrn0604-GrayThompson.pdf
[2] www.gifted.uconn.edu/siegle/research/Correlation/Intelligence.pdf

This kind of criticism seems driven by the guts, not reason. Sure, there are problems with every study. But the only real problem with a study is if it could lead you to an alternate hypothesis. None of these criticisms seems to do that.

Of course it's not a perfect measure of intelligence. Who cares? We all know that some are smarter than others, and people who score highly on these tests are those we would characterize as smart, and those who do poorly we'd categorize as dumb. Nothing is perfect. The correlation is pretty strong compared to other published work. There's no reason to think that if we had a better measure of intelligence it would go away; it seems just as probable (if not more so) that a better measure of intelligence might increase the correlation. After all, a better test would produce more consistent results.

Do I think that we've found the "intelligence" areas of the brain? Not really. We all know that loss of brain mass causes problems. It's probable that a decrease of mass in certain areas causes a decrease in the ability to excel in the areas tested by these tests. In fact, it would be very interesting if different types of questions on the exam were correlated with different areas. I think that this kind of study could be extremely enlightening. That way, we might be able to correlate particular areas with particular types of performance ability. Whether these abilities map onto our idea of "intelligences" is irrelevant.

Thanks for mentioning this study. Most politically correct grad students, post-docs, and profs avoid the topic of IQ like the plague. You are more courageous, although just as PC, as most of today's crop of behaviouralists.

Unfortunately, one cannot be both PC and profound. It is a dilemma we all must struggle with these days.

By Bruno Hauschung (not verified) on 02 Apr 2009 #permalink