Crazy Useful Paper on Statistics

Want to know when to use Standard Deviation (SD) as opposed to Standard Error (SE) or a Confidence Interval (CI)? Then you should read this really useful paper in the Journal of Cell Biology (JCB) about error bars in scientific papers. Here is just a sampling of their rules:

Rule 3: error bars and statistics should only be shown for independently repeated experiments, and never for replicates. If a "representative" experiment is shown, it should not have error bars or P values, because in such an experiment, n = 1...

Rule 4: because experimental biologists are usually trying to compare experimental results with controls, it is usually appropriate to show inferential error bars, such as SE or CI, rather than SD. However, if n is very small (for example n = 3), rather than showing error bars and statistics, it is better to simply plot the individual data points.
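To make the SD/SE/CI distinction concrete, here is a minimal Python sketch (the sample values are invented, not from the paper) that computes all three from the same small set of independent measurements:

```python
import numpy as np
from scipy import stats

# Three hypothetical independent measurements (n = 3); values are made up.
x = np.array([4.2, 5.1, 4.7])
n = len(x)

sd = x.std(ddof=1)                     # SD: spread of the data themselves
se = sd / np.sqrt(n)                   # SE: uncertainty of the mean; shrinks as n grows
t_crit = stats.t.ppf(0.975, df=n - 1)  # t multiplier for a 95% CI (~4.30 when n = 3)
ci = t_crit * se                       # 95% CI is mean +/- ci

print(f"mean = {x.mean():.2f}, SD = {sd:.2f}, SE = {se:.2f}, 95% CI = +/- {ci:.2f}")
```

Note that with n = 3 the 95% CI is roughly 4.3 times the SE, which is one reason the authors suggest simply plotting the individual points at such small n.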

They also have some handy tips on interpreting figures in papers -- such as this one about SEs:


Figure 5. Estimating statistical significance using the overlap rule for SE bars. Here, SE bars are shown on two separate means, for control results C and experimental results E, when n is 3 (left) or n is 10 or more (right). "Gap" refers to the number of error bar arms that would fit between the bottom of the error bars on the controls and the top of the bars on the experimental results; i.e., a gap of 2 means the distance between the C and E error bars is equal to twice the average of the SEs for the two samples. When n = 3, and double the length of the SE error bars just touch (i.e., the gap is 2 SEs), P is ~0.05 (we don't recommend using error bars where n = 3 or some other very small value, but we include rules to help the reader interpret such figures, which are common in experimental biology).
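As a rough numerical check of that overlap rule (a sketch with made-up numbers, not taken from the paper), you can construct two n = 3 samples whose SE bars are separated by a gap of about 2 SEs and confirm that an ordinary two-sample t-test gives P close to 0.05:

```python
import numpy as np
from scipy import stats

# Two hypothetical n = 3 samples, constructed so that the gap between the
# tips of their SE bars is roughly 2 SEs.
control      = np.array([10.0, 11.0, 12.0])   # mean 11.0, SE ~0.58
experimental = np.array([12.3, 13.3, 14.3])   # mean 13.3, SE ~0.58

def se(a):
    """Standard error of the mean."""
    return a.std(ddof=1) / np.sqrt(len(a))

# Gap between the top of the control SE bar and the bottom of the experimental
# SE bar, expressed in units of the average SE (the paper's "gap").
gap = (experimental.mean() - se(experimental)) - (control.mean() + se(control))
gap_in_ses = gap / ((se(control) + se(experimental)) / 2)

t_stat, p = stats.ttest_ind(control, experimental)
print(f"gap ~ {gap_in_ses:.1f} SEs, two-sample t-test P = {p:.3f}")   # ~2 SEs, P ~ 0.05
```

Pushing the experimental group further away drops P well below 0.05; shrinking the gap to under 2 SEs pushes it above.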

Definitely read the whole thing.

The craziest part is how many scientific papers still screw this up. My personal favorite is the use of SE with tiny n's, hiding the fact that you really can't be that confident about the mean.

Hat-tip: Faculty of 1000.


The problem with "many scientific papers", particularly those in high-citation-index journals, goes way beyond this. Take a scan through a few issues of Science and Nature and see how many times 1) no error bars are provided when they should be, 2) one cannot tell what statistic (SEM, SD, etc.) the bars convey when they are provided, and 3) no proper inferential analysis is provided at all. "Real" biologists don't believe in statistics, so of course they don't use them properly when forced to... The development of gene-array techniques, and the requisite need for biologists to "speak" inferential analysis, has been entertaining to watch, frankly.

On the other hand, those of us in the "sainted p value" type fields can err on the other side, namely the misconception that statistical analysis "reveals" or "demonstrates" an effect, as if something is only true (in the state-of-nature sense) if the experiment reaches P<0.05. This leads to all kinds of erroneous thinking and interpretation of real findings.

hmm. apparently one should avoid "less than" symbols.

...reaches P less than 0.05. full stop.

By Drugmonkey (not verified) on 30 Apr 2007 #permalink

this really useful paper

It was straightforward and useful, but I have some quibbles with it.

As I'm a physicist and not a biologist, I usually don't have as much use for more formal statistical methods such as significance tests. Consequently, I had never heard of SE before reading blogs. I wish the paper had dropped the SE rules to press harder on the point that "CIs make things easier to understand".

But I would especially quibble with the discussion of replicates and the implicit assumption that experimental results are normally distributed.

First, I don't think that "the errors would reflect the accuracy of pipetting" is correct; they would reflect the precision of the measurement. And usually you need to assess that precision and its distribution to be able to do proper statistical testing.

So as a physicist I would probably break their suggested rules and make figures with error bars on replicates specifically, perhaps being forced to use ranges for n = 2 or 3, but preferably SD throughout.

Second, I would use these graphs or real tests to establish the distributions, and then a suitable test to compare groups. The rules seem fine when it is known that independent normal distributions are observed.
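For what it's worth, a minimal sketch of that workflow (function name, data, and thresholds are my own assumptions, not anything from the paper): check whether each group looks roughly normal, then pick a parametric or non-parametric comparison accordingly.

```python
import numpy as np
from scipy import stats

def compare_groups(a, b, alpha=0.05):
    """Rough two-group comparison: test normality first, then choose a test."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    # Shapiro-Wilk normality check on each group (needs n >= 3 per group).
    normal = all(stats.shapiro(g).pvalue > alpha for g in (a, b))
    if normal:
        name, res = "Welch t-test", stats.ttest_ind(a, b, equal_var=False)
    else:
        name, res = "Mann-Whitney U", stats.mannwhitneyu(a, b, alternative="two-sided")
    return name, res.pvalue

test, p = compare_groups([10.0, 11.0, 12.0, 10.5], [12.3, 13.3, 14.3, 13.0])
print(test, p)
```

Of course, with very small n a normality test has almost no power, so in practice the distribution usually has to be established from earlier or pooled data.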

But I like the later discussion of within-group correlation. Often in experimental physics you find that you deliberately or inadvertently increase precision by exploiting short-term stability in time series. But it then becomes difficult to make later inferences about direct comparisons, real accuracy, or repeatable precision.

By Torbjörn Larsson (not verified) on 30 Apr 2007 #permalink

"when it is known"

Actually, in an experimental situation I would prefer to be precise: "when it is established earlier".

By Torbjörn Larsson (not verified) on 30 Apr 2007 #permalink

Torbjörn, it's usually a good idea to give some measure of precision, even in physics. If, for example, you showed two curves in a graph, the reader should be given some idea of what uncertainty there is in those estimates. In some cases there will be virtually none, compared to the range of Y-values spanned by the curves; but in other cases there will be a lot.

This could be due to measurement uncertainty, or the fact that the curve presents a Y vs X model, and there are several more independent variables (or input factors) which varied during the measurements.