fMRI of a dead salmon: Why dead fish have almost nothing to do with "voodoo correlations" in neuroimaging

By developinginte… on September 25, 2009.

A number of very smart people (and smart communities) seem to be under the impression that the "voodoo correlations" scandal in the neuroimaging community is somehow related to recent work by Bennett et al., who used fMRI to show task-related neural activity in a dead fish.

These two things have almost nothing to do with one another.

1) The Bennett work is, in the words of a friend, "a cute way to make a point" that every fMRI paper I've ever read has failed to explicitly acknowledge. They fail to acknowledge it because it's standard to run statistical tests equivalent to the ones Bennett et al. recommend. (Of course, that doesn't keep a "sizable minority" of studies from failing to do so, between 25% and 35% in Bennett's estimation; I suppose I'm not reading those studies.)

The Bennett point is simple: when you run a large number of statistical tests simultaneously, even on a random dataset, some percentage of them are bound to turn up "significant" purely by chance, and with some probability those significant results will randomly cluster together in 3D space. If you fail to correct the significance threshold for the number of tests performed, you get unreliable results, even if you consider only those significant results that cluster in 3D space. (It's this latter point that, in my opinion, makes the study interesting, worthwhile, and worthy of publication in a high-profile journal.)

Regardless, the potential issue was already well known, which perhaps explains the difficulty the authors have reportedly had in publishing their work. The problem they identified is why virtually everyone uses, and has long used, both multiple comparisons correction and cluster-based correction when reporting fMRI results. As Bennett et al. noted in their poster, such corrections are available in all the major neuroimaging analysis packages and are the default in one major package, FSL.

2) The "Voodoo correlations" work, on the other hand, is principally about the non-independence of multiple tests. Simply put, even when you apply both types of correction discussed in point #1, it's not OK to take the results of that analysis (clusters in 3D space) and then run additional analyses of the same clusters in the same dataset, because the first analysis has already selected those clusters for showing an effect, and that selection biases anything measured in them afterward.
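To make point #1 concrete, here is a minimal sketch in Python. It is my own illustration, not anything from the Bennett poster: the number of "voxels", the number of scans, the z-test, and the choice of Bonferroni as the correction are all assumptions picked for simplicity, and it ignores spatial clustering entirely. Every simulated voxel is pure noise, so any "significant" result is a false positive.

```python
import math
import random


def null_p_values(n_voxels, n_scans, rng):
    """Two-sided p-values from a z-test on pure noise, one per simulated voxel."""
    p_values = []
    for _ in range(n_voxels):
        # Each "voxel" is standard normal noise: there is no real signal anywhere.
        mean = sum(rng.gauss(0.0, 1.0) for _ in range(n_scans)) / n_scans
        z = mean * math.sqrt(n_scans)  # sigma is known to be 1, so this is a z-score
        p_values.append(math.erfc(abs(z) / math.sqrt(2.0)))
    return p_values


def count_significant(p_values, threshold):
    """Number of tests whose p-value falls below the threshold."""
    return sum(1 for p in p_values if p < threshold)


rng = random.Random(0)   # fixed seed so the run is repeatable
N_VOXELS = 10_000        # hypothetical number of voxels
ALPHA = 0.05             # nominal significance level

p_values = null_p_values(N_VOXELS, n_scans=20, rng=rng)
uncorrected = count_significant(p_values, ALPHA)
bonferroni = count_significant(p_values, ALPHA / N_VOXELS)

print("false positives, uncorrected:", uncorrected)
print("false positives, Bonferroni: ", bonferroni)
```

With no correction, roughly 5% of the noise voxels should cross the threshold; dividing the threshold by the number of tests should remove nearly all of them. Point #2 is a different issue, because it arises even after corrections like this have been applied.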

An example from the original Vul paper should make this problem clear:

We (the authors of this paper) have identified a weather station whose temperature readings predict daily changes in the value of a specific set of stocks with a correlation of r=-0.87. For $50.00, we will provide the list of stocks to any interested reader. That way, you can buy the stocks every morning when the weather station posts a drop in temperature, and sell when the temperature goes up. Obviously, your potential profits here are enormous. But you may wonder: how did we find this correlation? The figure of -.87 was arrived at by separately computing the correlation between the readings of the weather station in Adak Island, Alaska, with each of the 3315 financial instruments available for the New York Stock Exchange (through the Mathematica function FinancialData) over the 10 days that the market was open between November 18th and December 3rd, 2008. We then averaged the correlation values of the stocks whose correlation exceeded a high threshold of our choosing, thus yielding the figure of -.87. Should you pay us for this investment strategy? Probably not: Of the 3,315 stocks assessed, some were sure to be correlated with the Adak Island temperature measurements simply by chance - and if we select just those (as our selection process would do), there was no doubt we would find a high average correlation. Thus, the final measure (the average correlation of a subset of stocks) was not independent of the selection criteria (how stocks were chosen): this, in essence, is the non-independence error.
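The same logic can be sketched in a few lines of Python. This is a toy version under loud assumptions, not the analysis from the Vul paper: the "temperatures" and "stocks" are all random noise rather than real data, the selection threshold of -0.7 is my own choice (the excerpt does not state theirs), and the hold-out period is something I added to show what an independent estimate looks like.

```python
import math
import random
from statistics import fmean


def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    mean_x, mean_y = fmean(xs), fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def noise(n, rng):
    """n independent draws of standard normal noise."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


rng = random.Random(1)   # fixed seed so the run is repeatable
N_STOCKS = 3315          # as in the excerpt
N_DAYS = 10              # as in the excerpt
THRESHOLD = -0.7         # assumed; the excerpt only says "a high threshold"

# Two independent periods of noise: one used for selection, one held out.
temp_select, temp_holdout = noise(N_DAYS, rng), noise(N_DAYS, rng)
stocks = [(noise(N_DAYS, rng), noise(N_DAYS, rng)) for _ in range(N_STOCKS)]

# Step 1: select the stocks that correlate strongly with temperature.
selected = [(first, second) for first, second in stocks
            if pearson_r(temp_select, first) < THRESHOLD]

# Step 2 (the non-independence error): average r on the SAME data used to select.
biased = fmean(pearson_r(temp_select, first) for first, _ in selected)

# Step 2 done independently: average r for the same stocks on held-out data.
independent = fmean(pearson_r(temp_holdout, second) for _, second in selected)

print("stocks selected:          ", len(selected))
print("mean r on selection data: ", round(biased, 2))
print("mean r on held-out data:  ", round(independent, 2))
```

The average on the selection data is guaranteed to fall below the threshold, because that is how the stocks were chosen. On the held-out data, where nothing was selected, it should sit near zero.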

To summarize, the dead fish study makes a point about first-pass analysis, which almost every paper I've ever seen does correctly. The papers that don't do it correctly always note that the result failed to pass multiple comparisons or cluster correction, and typically discuss those results with caution. "Voodoo correlations," on the other hand, makes a point about non-independence in statistical tests. Independence has not always been handled correctly, and has not always been reported clearly. Moreover, the problem primarily affects a subset of correlations between brain and behavior, not the vast majority of work in fMRI, which concerns task-brain relationships.

Hey Chris. I'd go even farther on the salmon work: it's hilarious. How else could you get the science-loving public interested in fMRI statistical methods? Vul's work is obviously different and clearly more controversial. I initially wanted to get deeper into that in my article, but I decided to save it for a future story so that people didn't tie the salmon stuff in too directly with the voodoo stuff.

Chris - Great writeup of the differences between the multiple comparisons problem and the non-independence error. There has been a lot of confusion between the two over the last week and it is important to understand that each is a separate statistical problem in fMRI.

You mention several times in your post that almost every paper you have read does multiple comparisons correction correctly. This is great, but it has not always been the case. When I was completing my training, only a handful of papers properly corrected for multiple comparisons. The numbers are far better today, as 75% or more of published papers in good journals are corrected. Still, just because the majority of papers are doing it does not mean it is 'standard' yet. It is still quite possible to get a result published with a p-value cutoff of 0.001 and an 8 voxel extent threshold, doubly so if you are flexible about which journal you send it to. The main argument of the salmon poster/paper is that everyone should be using multiple comparisons correction as part of their research.

Thanks to both of you for your comments. I agree with everything you've said. I finally got pushed over the edge to write this post when I started getting emails from trained experimental psychologists saying that Craig's work indicates the "voodoo correlations" problem goes deeper than we'd thought. We're going to be catching flak for that voodoo stuff for years, and I just didn't want this to get lumped in with it :)

You're quite right that these are two entirely separate issues, although I disagree that almost all papers correctly use multiple comparisons correction: a small but significant minority of the papers I read don't. Maybe it depends on the field of neuroscience in question.

But the Voodoo Correlations issue and the multiple comparisons issue are united by one thing. They are both about statistical mistakes which are easy to understand, but still, a sizable number of fMRI papers have been falling foul of them.

This is not because fMRI researchers are stupid, but simply because we don't tend to think about statistical issues enough. The early fMRI pioneers were well versed in the physics and mathematics of what they were doing, but the huge number of neuroscientists who started using fMRI in the past few years tend to be much less knowledgeable about that side of it.

The Vul excerpt sounds akin to Taleb's "Fooled by Randomness", an excellent read for anyone who prefers scientific thinking over anecdotal--that is, anyone who prefers listening to non-random signals over random statistical noise.

http://www.amazon.com/Fooled-Randomness-Hidden-Chance-Markets/dp/140006…

Wow, I read both these papers without really appreciating the difference. The Vul work takes a bit of chewing (for a non-imager, anyway). Thanks for the summary!
