Love p-values for what they are, don't try to make them what they're not

Jeremy Miles pointed me to this article by Leonhard Held with what might seem like an appealing brew of classical, Bayesian, and graphical statistics:

P values are the most commonly used tool to measure evidence against a hypothesis. Several attempts have been made to transform P values to minimum Bayes factors and minimum posterior probabilities of the hypothesis under consideration. . . . I [Held] propose a graphical approach which easily translates any prior probability and P value to minimum posterior probabilities. The approach allows to visually inspect the dependence of the minimum posterior probability on the prior probability of the null hypothesis.
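
To be clear about what's being computed, here's a minimal sketch of that translation, using the familiar -e p log(p) lower bound on the Bayes factor. Whether this is exactly the bound Held works with is my assumption, not something I'm taking from the paper:

```python
# Sketch (not Held's code): translate a p-value and a prior probability of the
# null into a minimum posterior probability, using the -e * p * log(p) lower
# bound on the Bayes factor (valid for p < 1/e). Assumed, not from the paper.
import math

def min_bayes_factor(p):
    """Lower bound on the Bayes factor in favor of the null, given a p-value."""
    if not 0 < p < 1 / math.e:
        raise ValueError("bound applies for 0 < p < 1/e")
    return -math.e * p * math.log(p)

def min_posterior_prob(p, prior_null):
    """Minimum posterior probability of the null, given prior P(H0)."""
    bf = min_bayes_factor(p)
    prior_odds = prior_null / (1 - prior_null)
    post_odds = bf * prior_odds
    return post_odds / (1 + post_odds)

print(min_posterior_prob(p=0.05, prior_null=0.5))  # about 0.29
```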

I think the author means well, and I believe that this tool might well be useful in his statistical practice (following the doctrine that it's just about always a good idea to formalize what you're already doing).

That said, I really don't like this sort of thing. My problem with this approach, as indicated by my title above, is that it's trying to make p-values do something they're not good at. What a p-value is good at is summarizing the evidence regarding a particular misfit of model to data.

Rather than go on and on about the general point, I'll focus on the example (which starts on page 6 of the paper). Here's the punchline:

At the end of the trial a clinically important and statistically significant difference in survival was found (9% improvement in 2 year survival, 95% CI: 3-15%).

Game, set, and match. If you want, feel free to combine this with prior information and get a posterior distribution. But please, please, parameterize this in terms of the treatment effect: put a prior on it, do what you want. Adding prior information can change your confidence interval, possibly shrink it toward zero--that's fine. And if you want to do a decision analysis, you'll want to summarize your inference not merely by an interval estimate but by a full probability distribution--that's cool too. You might even be able to use hierarchical Bayes methods to embed this study into a larger analysis including other experimental data. Go for it.
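
To make that concrete, here is a minimal sketch of the sort of thing I mean: a normal likelihood recovered from the reported estimate and interval, combined with an assumed weakly informative prior on the treatment effect. The prior scale here is purely illustrative, not anything from the paper:

```python
# Sketch: normal-normal conjugate update of the treatment effect.
# The prior is my illustrative choice; the data summary is from the quote above.
import math

est, lo, hi = 0.09, 0.03, 0.15           # 9% improvement, 95% CI 3-15%
se = (hi - lo) / (2 * 1.96)              # implied standard error, ~0.031

prior_mean, prior_sd = 0.0, 0.10         # weakly informative, skeptical of huge effects

# Precision-weighted average of prior and data.
prec_data, prec_prior = 1 / se**2, 1 / prior_sd**2
post_var = 1 / (prec_data + prec_prior)
post_mean = post_var * (prec_data * est + prec_prior * prior_mean)
post_sd = math.sqrt(post_var)

print(f"posterior: {post_mean:.3f} +/- {1.96 * post_sd:.3f} (95% interval)")
# The estimate shrinks slightly toward zero and the interval tightens a bit.
```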

But to summarize the current experiment, I'd say the classical confidence interval (or its Bayesian equivalent, the posterior interval based on a weakly informative prior) wins hands down. And, yes, the classical p-value is fine too. It is what it is, and its low value correctly conveys that a difference as large as observed in the data is highly unlikely to have occurred by chance.
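
Just to spell out that last point, the reported interval already tells you roughly how small the p-value is. A quick back-calculation (my own arithmetic, assuming approximate normality):

```python
# Back-of-the-envelope check: the z statistic and two-sided p-value implied
# by the reported 9% estimate and 3-15% CI, assuming approximate normality.
import math

est, lo, hi = 0.09, 0.03, 0.15
se = (hi - lo) / (2 * 1.96)          # ~0.031
z = est / se                         # ~2.9
p = math.erfc(z / math.sqrt(2))      # two-sided p-value, ~0.003
print(f"z = {z:.2f}, p = {p:.4f}")
```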
