When analyzing data, understanding the limitations of your data is critical. One of the things we need to understand is significance: how strong does an effect have to be to be considered not a result of random chance? Typically, we assume that if an effect has a five percent or less probability of occurring due to random chance, then it is “significant.” But significance becomes very problematic when making many simultaneous assessments. If we make one hundred assessments (e.g., comparisons) and not a single one is actually different (assume that the Omniscience of the Mad Biologist is operating here), then, on average, five of the 100 assessments will be “significant” even though there is no real phenomenon underlying the supposed significance.
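If you'd rather see the "five out of 100" arithmetic than take my word for it, here's a minimal sketch in Python. It assumes the tests are independent and that, when nothing real is going on, each p-value is uniformly distributed between 0 and 1 (the textbook idealization). The function names are made up for illustration.

```python
import random

ALPHA = 0.05   # the conventional five percent threshold
N_TESTS = 100  # number of simultaneous assessments


def count_false_positives(n_tests, alpha, rng):
    """Count 'significant' results when no real effect exists.

    Assumption: under a true null, each p-value is uniform on [0, 1],
    so we simply draw a uniform random number per test.
    """
    return sum(rng.random() < alpha for _ in range(n_tests))


def average_false_positives(n_experiments, n_tests=N_TESTS, alpha=ALPHA, seed=0):
    """Average number of spurious hits over many repeated experiments."""
    rng = random.Random(seed)
    total = sum(
        count_false_positives(n_tests, alpha, rng) for _ in range(n_experiments)
    )
    return total / n_experiments


def familywise_error_rate(n_tests=N_TESTS, alpha=ALPHA):
    """Chance of at least one false positive among independent null tests."""
    return 1 - (1 - alpha) ** n_tests


if __name__ == "__main__":
    print("average false positives per 100 tests:", average_false_positives(10_000))
    print("chance of at least one false positive:", familywise_error_rate())
```

The average comes out close to five, and the chance of getting at least one spurious "significant" result in 100 independent tests is better than 99 percent.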
I recently finished Cordelia Fine’s Delusions of Gender: How Our Minds, Society, and Neurosexism Create Difference. An area of research that’s in vogue (perhaps unjustifiably so) is seeing if male and female brain activation responds differently to certain stimuli, such as an emotionally disturbing picture. Then the results are ridiculously overinterpreted, and we conclude that boobies mean girls can’t do math. Or something. So someone wanted to test just how reliable assessing the significance of these scans is (from Fine):
Could the sex differences in brain activation be spurious? When looking for changes in blood flow between two conditions, researchers search in thousands of tiny sections of the brain (called voxels), and many researchers are now arguing that the threshold commonly set for declaring that a difference is “significant” just isn’t high enough.
And here’s the Bestest Control Experiment EVAH! (italics mine):
To illustrate this point, some researchers recently scanned an Atlantic salmon while showing it emotionally charged photographs. The salmon-which, by the way, “was not alive at the time of scanning”-was “asked to determine what emotion the individual in the photo must have been experiencing.” Using standard statistical procedures, they found significant brain activity in one small region of the dead fish’s brain while it performed the empathizing task, compared with brain activity during “rest.” The researchers conclude not that this particular region of the brain is involved in postmortem piscine empathizing, but that the kind of statistical thresholds commonly used in neuroimaging studies (including Witelson’s emotion-matching study) are inadequate because they allow too many spurious results through the net.
But did the salmon speak English? DUH! BAD EXPERIMENT.
This is awesome.
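For a sense of scale, here's a rough sketch of what happens when you test thousands of voxels of pure noise. To be clear about the assumptions: the voxel count below is a made-up round number, not the figure from the salmon study; the p-values are idealized as independent and uniform; and the Bonferroni correction (divide the threshold by the number of tests) is just one standard way to tighten the threshold, not necessarily what those researchers recommend.

```python
import random

ALPHA = 0.05
N_VOXELS = 10_000  # hypothetical voxel count, chosen only for illustration


def noise_scan(n_voxels=N_VOXELS, alpha=ALPHA, seed=1):
    """Test n_voxels of pure noise at an uncorrected and a corrected threshold.

    Returns (uncorrected_hits, bonferroni_hits). Each voxel's p-value is
    drawn uniformly on [0, 1], i.e. there is no real activation anywhere.
    """
    rng = random.Random(seed)
    p_values = [rng.random() for _ in range(n_voxels)]
    uncorrected_hits = sum(p < alpha for p in p_values)
    # Bonferroni: divide the threshold by the number of tests performed.
    bonferroni_hits = sum(p < alpha / n_voxels for p in p_values)
    return uncorrected_hits, bonferroni_hits


if __name__ == "__main__":
    uncorrected, corrected = noise_scan()
    print("'active' voxels, uncorrected:", uncorrected)
    print("'active' voxels, Bonferroni:", corrected)
```

With no correction, roughly five percent of the noise voxels (around 500 here) light up as "active." With the corrected threshold, you would expect essentially none.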