There’s been some hubbub recently over a study by Gerber and Malhotra (you can get a copy in pdf here), which shows a couple of things. First, political science journals don’t publish many articles that report negative (null) results, but instead tend to publish those that report statistically significant results. Second, a large portion of those statistically significant results involve probabilities that are pretty damn close to .05 (the generally accepted cutoff for statistical significance). My first reactions were duh, and who cares?
Of course, I’m not a political scientist, so I can’t speak for them, but in psychology, everyone has always known that it’s damn hard to get null results published. And there are good reasons why that’s so. For one, null results are less informative. It’s difficult to tell whether they reflect a genuine lack of the hypothesized relationship between your variables, or instead, chance or methodological problems (especially a lack of statistical power). So, if you want to publish a null result, you’ve got a bunch of extra work to do. In addition to calculating power (which people in some disciplines do automatically, anyway), you’re almost certainly going to have to run extra variations of the study (even more than you would with a statistically significant result) to show that your null result wasn’t the result of methodological problems.
There’s another good reason why they aren’t published: they’re not expected! I know, on the surface, it looks like they should be expected, because if the null hypothesis were true, there’d be a 95% chance that you wouldn’t get statistically significant results. But in reality, getting statistically significant results is actually pretty likely, because researchers generally don’t conduct studies unless they’re pretty confident, for theoretical reasons or whatever, that the hypothesized relationship between their variables exists. So if you get null results, it’s actually pretty surprising.
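You can see this asymmetry in a quick simulation. Here’s a minimal, stdlib-only sketch (the effect size, sample size, and ~2.0 critical value are my own illustrative choices, not anything from the Gerber and Malhotra study): under a true null, significance shows up about 5% of the time, but when even a moderate real effect exists, it shows up most of the time.

```python
import math
import random

def t_significant(xs, ys, crit=2.0):
    """Two-sample t-test with equal group sizes; True if |t| exceeds
    roughly the two-tailed .05 critical value (~2.0 for df near 60)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    vx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    vy = sum((y - my) ** 2 for y in ys) / (n - 1)
    se = math.sqrt((vx + vy) / n)  # pooled standard error of the difference
    return abs(mx - my) / se > crit

def significance_rate(effect, n=30, sims=2000):
    """Fraction of simulated experiments reaching (approximately) p < .05,
    for a standardized true effect of the given size."""
    hits = 0
    for _ in range(sims):
        xs = [random.gauss(0.0, 1.0) for _ in range(n)]
        ys = [random.gauss(effect, 1.0) for _ in range(n)]
        hits += t_significant(xs, ys)
    return hits / sims

random.seed(1)
print(significance_rate(0.0))  # false-positive rate under the null: near .05
print(significance_rate(0.7))  # power for a sizable real effect: well over half
```

The first number is the familiar 5% story; the second is why a researcher who has good theoretical reasons to expect an effect is genuinely surprised by a null result.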
When null results do get published, it’s usually because the researchers had good reason to expect them, so they undertook the extra steps required to make null results publishable. In general, that’s only interesting when there’s a heated debate about the relationship between two variables, and one theory predicts that there is none. Even then, there are often better ways of demonstrating that than producing null results. For example, you can run other studies that test (non-null) hypotheses that distinguish between the competing theories.
As for why so many of the results cluster around the .05 level, well, that’s probably a cultural thing. Researchers tend to be overly obsessed with statistical significance (as this study ironically shows), and that means that when you’ve got results that are approaching significance, you’re going to employ a few tricks to get closer to it. In psychology, for example, you might run a few more participants than you’d planned, thereby increasing your statistical power, or you might tweak your methodology and rerun the experiment. In most cases, I think these solutions are harmless, particularly since there’s no real a priori reason to be obsessed with the .05 level in the first place. If you’re close, but not below it, chances are you’re onto something, but you need the extra little push to get people to pay attention. If you’re close, but actually committing a Type I error, chances are subsequent research will discover that. Sure, it might cause people to spend time and resources chasing theoretical dead ends, but that’s just the way science works, and making a big deal out of it is kind of silly. So again, I say, duh, and who cares?
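The “run a few more participants” trick works because power climbs steadily with sample size when a real effect is there. A minimal stdlib-only sketch (the effect size and sample sizes are made up for illustration, and the ~2.0 critical value is a rough stand-in for the two-tailed .05 cutoff):

```python
import math
import random

def power(effect, n, sims=2000, crit=2.0):
    """Monte-Carlo estimate of the power of a two-sample comparison at
    roughly p < .05, with n participants per group and a standardized
    true effect of the given size."""
    hits = 0
    for _ in range(sims):
        xs = [random.gauss(0.0, 1.0) for _ in range(n)]
        ys = [random.gauss(effect, 1.0) for _ in range(n)]
        mx, my = sum(xs) / n, sum(ys) / n
        vx = sum((x - mx) ** 2 for x in xs) / (n - 1)
        vy = sum((y - my) ** 2 for y in ys) / (n - 1)
        t = abs(mx - my) / math.sqrt((vx + vy) / n)
        hits += t > crit
    return hits / sims

random.seed(1)
for n in (20, 40, 80):
    # power roughly doubles between n=20 and n=80 for a medium effect
    print(n, power(0.5, n))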