What is peer review, anyway?

Over at BPR3, a reader brought up an interesting question about the nature of peer-reviewed research, which I thought was relevant to our readers here as well. I'm reposting my entire response below.

The system of peer review, the bulwark of academic publishing, has served scholars for centuries. The principle behind the system is simple: If experts in a field find a research report noteworthy, then that report deserves to be published.

But who is an "expert"? And who decides who the experts are? Couldn't a group of individuals committed to promoting their own research -- which may or may not be well-founded -- get together to form their own "journal," which they could legitimately claim publishes "peer-reviewed research"?

They can, and they do.

BPR3 reader Danny Chrastina asks:

What if it's a pseudoscience paper which has been peer-reviewed by other pseudoscientists? I'm thinking of the reviews I've written of papers published in Homeopathy or J. Alt. Complement. Med. which point out their glaring misuse of quantum mechanical ideas.

Would a blog post citing such "research" qualify to use the BPR3 icon? There are a couple of possibilities to consider:

1. The article is blatantly false and misrepresents its conclusions as "scientific." Informed scholars agree that the work is simply wrong. My take: Citing this article alone would not qualify a blog post to use the BPR3 icon. If a legitimate, peer-reviewed article is cited to show why the article in question is wrong, then the icon could be used.

2. The article actually offers good information, yet it appears in a journal whose claims to be "peer-reviewed" are suspect. My take: If the work appearing in this journal is not generally accepted as reliable by true experts, then one exceptional article should still not qualify to use the icon. If the article didn't undergo a rigorous peer review process, then citing the article offers no more authority than citing a newspaper article or a blog post.

A final question is perhaps the most difficult: How do we identify journals offering acceptable levels of peer review? Who's to say whether a given journal is good enough? After all, even the most rigorous scholarly journals sometimes make errors -- indeed, one of the most important parts of the scientific process is identifying and correcting problems in earlier work. Too rigorous a standard of peer review can stifle research just as much as too lax a standard.

I think the best way to maintain standards for BPR3 is probably to not start off with too many standards, but to be open to feedback from readers and bloggers about what factors we should take into account. In the case of the example Danny Chrastina offers, we can clearly see why the articles he describes don't qualify as peer-reviewed. While so-called "experts" may have reviewed the articles, the articles misrepresent well-established scientific principles, so it's immediately clear that the review process was not rigorous. When a journal exhibits a pattern of accepting this type of article and makes no effort to correct the problem, then we may be able to make a definitive statement and say that no article from that journal should qualify for our icon. But such a statement should only be made after we receive extensive feedback from readers and bloggers.

What we're trying to do with BPR3 is establish a clear way to show where thoughtful discussions about serious research are taking place throughout the blogosphere. This isn't to say that there isn't great discussion taking place elsewhere. There are lots of great posts debunking pseudoscience; however, if they don't cite serious research, in my view, they shouldn't qualify for our icon. Our guidelines for using the icon, while offering flexibility, make that clear.

But I'd be interested to hear what others have to say -- BPR3 is a community-run site, and it's important to have reader input into our policies. Is there a place for posts that don't cite serious research -- perhaps to discuss a general scientific concept or debunk a problematic news article? Should these posts be marked with a different sort of icon? What standards would we use to ensure that they were high-quality posts?



I think it's easy to "jump the gun" when worrying about these matters; it's awfully hard to tell what policies and procedures BPR3 has to institute before we've actually seen the system be abused. (Remember all those civics lessons about where the different amendments in the Bill of Rights came from, and how they responded to problems the ex-colonials had experienced first-hand?) The advantage of the BPR3 system is that, first, posts can be aggregated and more easily subjected to their own "peer review", allowing problems to be detected more readily. Second, with an infrastructure in place, additional icons — "ArXiv preprint", "Basic Concept in Science", "Debunking Pseudoscience" — can be introduced in parallel with the existing mechanism.

I would suggest that an article, to be eligible for the BPR3 icon, be required to appear in an "eligible" journal. Here are some off-the-cuff ideas for what would constitute an eligible journal.

1. The initial presumption would be that ALL journals are eligible.
2. A journal would be deemed ineligible if the BPR3 team found that it purposefully obscured its owners, publishers or other interested parties.
3. A journal would be deemed ineligible if the BPR3 team, after public input from other bloggers, concluded that such journal repeatedly and frequently published articles found to be demonstrably false and wholly without merit.
4. The willingness to issue clear and concise corrections to false and misleading papers would be a mitigating factor intended to protect "edgy" journals.
5. Journals deemed ineligible would be re-considered for eligibility on a yearly basis.
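As a rough illustration only, the five rules above could be sketched as a simple check. Everything here is hypothetical: the `JournalRecord` fields, the `MERITLESS_SHARE_LIMIT` threshold, and the decision to let a corrections policy fully offset rule 3 are assumptions made for the sketch, not anything BPR3 has defined.

```python
from dataclasses import dataclass


@dataclass
class JournalRecord:
    """Hypothetical record of what the BPR3 team knows about a journal."""
    name: str
    obscures_ownership: bool = False    # rule 2
    articles_reviewed: int = 0          # articles examined after public input
    articles_found_meritless: int = 0   # rule 3
    issues_corrections: bool = False    # rule 4 (mitigating factor)


# Assumed cutoff for "repeatedly and frequently"; the real value would
# have to come out of community discussion.
MERITLESS_SHARE_LIMIT = 0.25


def is_eligible(journal: JournalRecord) -> bool:
    """Apply rules 1-4; rule 5 amounts to re-running this check yearly."""
    if journal.obscures_ownership:
        return False                    # rule 2
    if journal.articles_reviewed == 0:
        return True                     # rule 1: presumed eligible
    share = journal.articles_found_meritless / journal.articles_reviewed
    if share > MERITLESS_SHARE_LIMIT and not journal.issues_corrections:
        return False                    # rule 3, unless mitigated by rule 4
    return True
```

In this simplified form a journal with a clear corrections policy is never excluded under rule 3; a real policy would probably treat rule 4 as a matter of degree.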

From a quantitative standpoint, here are some suggestions:

(1a) When possible, bloggers who report research under the BPR3 icon could report stats that professionals in the field use as rough indicators of article quality. Some important stats might include the age, impact factor, and rejection rate of a journal. So when a blogger references an article from a journal that has been around for quite a while, has a high impact factor, and rejects 99% of submitted articles during peer review, that article may be of higher quality than one from a journal that launched two months ago, has a low impact factor, and has a 30% rejection rate. Those statistics can be reported in addition to referencing the article and perhaps can be used as a method of assessing the quality of articles that a particular blogger reports over time (i.e., the mean of the reported stats). However, I'm not sure how easy it is to get those stats from journals.

(1b) The actual quality of reporting a particular article by a blogger could involve ratings by readers. I'd suggest checking out the rating system over at Yahoo!Answers which allows one to rate both the main post and reader comments to the post.

(2) Not too long ago, the APA introduced Psycoloquy, which is an online, peer-reviewed psychology journal that allows Open Peer commentary. However, the brief Wikipedia article reports that Psycoloquy has been temporarily suspended due to open access issues. I'm thinking that in the not-too-distant future, most journals will convert to Psycoloquy's system, and BPR3 looks like a fairly close approximation of this vision. It might be worth checking out APA's guidelines (if they have any), assuming Psycoloquy is reinstated at some point.
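To make the averaging idea in (1a) concrete, here is a minimal sketch. The function name, the dictionary keys, and the numbers in the usage note are all made up for illustration; they are not real journal statistics.

```python
from statistics import mean

# Hypothetical keys for the three stats mentioned in (1a).
STAT_KEYS = ("age_years", "impact_factor", "rejection_rate")


def blogger_profile(cited_journals):
    """Average each journal stat over the articles a blogger has cited.

    cited_journals: one dict per cited article, holding the stats of the
    journal it appeared in.
    """
    return {key: mean(j[key] for j in cited_journals) for key in STAT_KEYS}
```

For example, a blogger who cited one article from a 50-year-old journal with a 99% rejection rate and one from a two-month-old journal with a 30% rejection rate would get a mean rejection rate of roughly 0.645. As item (1a) already notes, this only works if journals make the stats available.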

By Tony Jeremiah (not verified) on 28 Dec 2007

I fear that you are too sanguine about the possibility of preventing exploitation of the weaknesses of the peer-review system that Danny Chrastina points out. You assume that there is a highly unified body of general knowledge "out there" that some group of universally "informed scholars" agree on.

As a person with PhDs in both psychology and philosophy, I can say that there are loads of examples of psychologists getting their philosophy hilariously wrong, and of philosophers getting their psychology incorrect as well. The journals involved rarely do anything about this. They aren't much interested in the "other" discipline, and don't have the relevant expertise to hand in any case.

As you note, groups of (what "we" regard as) "pseudoexperts" have cottoned on to this weakness in the peer review system and have started producing not only journals, but publication houses, academic conferences, and whole colleges with the explicit intention of assembling the apparatus of scholarly credibility. And to the outside world, it works. The two are pretty indistinguishable. Chiropractors, homeopaths, and naturopaths have probably gone furthest along this route. Creationists are not far behind. We may well think that what they do is nonsense, but there is no way for "peer review" to distinguish the two. You might think to ask homeopaths to submit their "findings" to a medical journal, but if you were going to accept that, then, by the same token, you would have to accept what a medical journal has to say about most work in psychology as well. It would be pretty ugly, I suspect.

It is simply up to each individual to decide for him- or herself which "scholarly communities" s/he trusts.

Also, I think you'll find that peer review has not guided scholarly publishing "for centuries." Until the 20th century, it was mostly the journal's owner and editor (often the same individual) who decided what would and would not go into a journal... just like in any magazine, scholarly or not. He might call informally on an outside expert or two, but it was far from standard practice. Check out how James McKeen Cattell ran his journals (including Psychological Review and Science, among many others) from the 1890s to the 1940s. In fact, *blind* peer review did not become the standard until the second half of the 20th century, mainly because of the complaints of women scientists that their submissions were not judged by the same standards as those of men.

By Chris Green (not verified) on 28 Dec 2007

It took me a few months to understand what peer review actually means (the first time I heard of it was on your blog), and now I (sort of) do.

I think peer review needs to be done by "official" peer reviewers or something like that, whose job is to review others' work. And then making sure that peer-review icons are legitimate would require new security measures to be taken somehow. Perhaps there could be a certain website that lists the officially peer-reviewed articles, allowing people to find the fake ones. Peer-review, however, shouldn't be the end of the line. It still needs to stand the test of time and observational/statistical evidence, or something like that. Anyway, here, I'm only suggesting. I'm not an expert or anything.

"If experts in a field find a research report noteworthy, then that report deserves to be published."

I would amend this to: "If experts in a field find a research report noteworthy within their field of expertise, then that report deserves to be published."

Researchers normally work within some kind of 'common framework', consisting of accepted methods, common knowledge, long-term goals, etc. For someone who is not familiar with this framework (and its non-written rules and standards) it is very hard to judge someone else's work on its merits. This is the reason why editors need help from those working in the field. The same applies to the problem of setting standards for 'peer-reviewed research': as soon as a journal is beyond the borders of one's expertise, it gets difficult to judge. For a general psychologist it would be possible to assess the quality of most psychology journals, but he would have a hard time when asked to compare different journals in, let's say, mathematics. The same principle applies when a non-homeopath tries to judge a homeopathy journal (note: I am not defending homeopathy here, but assume a neutral opinion). Therefore I would avoid any general 'quality-based' classification as much as possible, and let the reader judge for himself.

What might be relevant too is that the 'peer' part of 'peer review' has two different meanings: 'peer of the author' and 'peer of the reader'. On the one hand, it is important that the peer reviewer understands what the author has done, so he needs to be familiar with the subject and the methods used. On the other hand, the peer reviewer represents a reader community, meaning he can judge an article on its value for this community. It is quite possible that an article which is reviewed as 'excellent' in this way would be considered irrelevant (or bad) when read by someone else. I have indeed seen many critiques from 'people outside a field': psychologists vs. philosophers, biology vs. medicine, neurology vs. psychiatry. The only thing 'peer reviewed' means is what it says: something was reviewed by peers and considered acceptable by their standards. This is relevant information in itself; let's not try to complicate things by adding quality levels for peers and/or journals.

By Marielle Winarto (not verified) on 29 Dec 2007

You may be interested in this blog post in which I describe the peer review of an article published in BMC Anesthesiology.

The journal uses open peer review, so the authors and all readers can see the workings of the peer review, and this particular example is interesting because it concerns acupuncture, which is a CAM therapy.

"The system of peer review, the bulwark of academic publishing, has served scholars for centuries."

Actually, it was introduced in the early 20th century. Einstein thought it was ludicrous and refused to have anything to do with it. I think its real problem is that, as practiced, it doesn't degrade well. In a perfect world it works just fine, but when the editors, reviewers, and submitters fall short of being infinitely intelligent, rational actors in a peer review game, it falls apart.

At the moment I'm rather bitter about it since most of the papers I have read recently were so actively, horribly wrong, often referring to data which cleanly contradicted their conclusions.

According to this article (which was itself peer-reviewed), peer review can be traced to the mid-1700s. However, as you point out, the process didn't become widely prevalent until the 20th century.

As a psychologist and a parent, I am often amazed that the media picks up reports from less-than-stellar, and in some cases clearly deficient, "peer-reviewed" journals. And for the average, non-research-geek parent, there's no way to put this potentially flawed research in perspective. There's a whole slew of biased "studies" that are trotted out by parents who believe mercury in vaccines causes autism. That case is extreme, but it happens to a lesser extent all the time.

Of course, people also have to be aware of the bias of the top journals - we know they tend to publish only significant findings. And with a lot of issues the insignificant findings can also be important. I'm thinking of the unpublished studies showing, perhaps, no link between breastfeeding and child IQ, or television viewing and childhood obesity. I find this irritating - as have many researchers and observers over the years. There are good reasons for a bias for significant findings but it skews the public's perception of the issues. There may be many null results on a certain topic, but if there's one significant finding, we may hear it without the context of those other studies.

Some time ago, meta-analysis was introduced as a quantitative technique for determining what a body of research had to say about a particular topic, relative to the traditional (qualitative) review article. However, two main drawbacks of this technique are the file drawer problem (i.e., non-reporting of insignificant results) and the actual quality of the research conducted. Theoretically, high-quality meta-analyses should include studies that report both significant and insignificant data, along with a technique for assessing the quality of the studies used for the meta-analysis (usually involving 'weighting'). In practice, I'm not sure how often these issues are considered when conducting meta-analyses.
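A small sketch may help show why the file drawer problem matters. It uses inverse-variance weighting, a common fixed-effect pooling scheme that weights each study by its precision; that is an assumption swapped in here for illustration and is not the same thing as a quality rating. The effect sizes and variances in the usage note are invented.

```python
def pooled_effect(effects, variances):
    """Fixed-effect pooled estimate using inverse-variance weights.

    effects:   per-study effect sizes
    variances: per-study sampling variances (smaller = more precise)
    """
    weights = [1.0 / v for v in variances]
    weighted_sum = sum(w * e for w, e in zip(weights, effects))
    return weighted_sum / sum(weights)
```

Suppose three equally precise studies found effects of 0.6, 0.0, and 0.0, but only the first was published. Pooling the published study alone gives 0.6, while `pooled_effect([0.6, 0.0, 0.0], [0.04, 0.04, 0.04])` gives about 0.2, so leaving the null results in the file drawer triples the apparent effect.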

These details aside, a consumer's best bet for finding out the current main conclusion about a topic would be to identify meta-analyses conducted on the topic. At the least, a meta-analysis should have gathered together in one article a large number of studies focused on the topic of interest.

By Tony Jeremiah (not verified) on 04 Jan 2008

Below I have pasted a recent example from the literature, demonstrating how the field of research effectively polices itself.

Cook, J. M., Palmer, S., Hoffman, K., & Coyne, J. C. Evaluation of clinical
trials appearing in Journal of Consulting and Clinical Psychology: CONSORT and
beyond. The Scientific Review of Mental Health Practice 2007, 5: 69-80

This paper examines the quality of reporting of randomized controlled trials
(RCTs) of adult psychotherapy interventions in the Journal of Consulting and
Clinical Psychology as judged by Consolidated Standards of Reporting Trials
(CONSORT) and items from the Evidence-Based Behavioral Medicine (EBBM)
Committee of the Society of Behavioral Medicine. Nine RCTs from 1992 and 12
from 2002 were identified and rated by two independent judges. There was a
significant improvement in reporting from 1992 to 2002, but a substantial gap
remained between RCTs published in 2002 and full compliance with CONSORT and
the EBBM. No articles specified primary and secondary endpoints, and
deficiencies were noted in features empirically related to confirmatory bias:
randomization, blinding, and reporting of intent to treat analyses. Compliance
with CONSORT will require education and enforcement of standards and will
yield a literature that is discontinuous with the existing literature in terms
of quality of reporting.

To Dr. Polly (post #10),

The popular media print what is in 'style'; they seem to pick content that complements their Oscar movie picks, hot TV shows, or current celebrity opinions.

They don't care where the information comes from or whether it has any validity.

A few comments:
1) I think it would be beneficial to have a look at Nature's Peer-to-peer, a blog aimed at discussing issues of peer review in science journals.

2) Concerning peer-review in general, I'll just drop Hendrik Schön's name and let it simmer for a bit.

3) Peer-review is, to me, primarily a filter to keep 'bad things' out of a journal. It reminds me of the first part of the good-practice programming axiom: "filter input". But to what extent are BPR3 bloggers themselves peer-reviewers, and not simply reporters or commentators on what others have discovered and reviewed? Is it BPR3's goal to be yet another layer of post-publication peer-review? Or is it simply to try to recognize 'junk science' and to not blog on it?

4) What do we do with on-line open-access journals, many of which have policies of retroactive peer-review in which a paper is posted and then reviewed by peers? I'm sure there are some very good ones out there, but their rejection rate is virtually zero (as one journal whose name I've forgotten put it, 'we will generally only reject a paper initially if it shows glaring misuse of science or fraud').

5) I do not believe that quantitative metrics like impact factor (IF) or rejection rate (RR) are useful because, for one, they are highly variable across fields and disciplines. An esoteric field will, by definition, have fewer people attempting to publish and fewer people reading. The IF and RR of such a journal would necessarily be tiny compared to, say, Nature. To compensate, some kind of readership:IF,RR ratio would have to be developed. This kind of system quickly becomes unmanageable (and thereby impotent) without a dedicated staff.
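One possible shape for such a compensating ratio, purely as a sketch: divide a journal's metric by the median of the same metric across journals in its own field. The function name and the numbers in the usage note are hypothetical, and this is only one of many normalizations one could pick.

```python
from statistics import median


def field_normalized(value, field_values):
    """Express a journal metric relative to its own field's median.

    value:        the journal's metric (e.g. its impact factor)
    field_values: the same metric for journals in that field
    """
    return value / median(field_values)
```

With made-up numbers, a journal with an IF of 2.0 in a small field whose journals sit at 0.5, 1.0, and 2.0 scores 2.0, while a journal with an IF of 30.0 in a field at 10.0, 30.0, and 50.0 scores only 1.0. The maintenance burden described in point 5 still applies, since someone has to collect and update the per-field numbers.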

Also, things like allegations of fraud or use of junk science against an author, a journal, or an entire field are hard to track! How will you know if I, for example, have ever published anything that's not purely scientific? It's not always as simple as checking Google for first + last + "fraud".

I think an effective starting point to deal with potential problems is to devise a very general set of guidelines and then tack on reactive rules as problems arise. The one drawback to a system like this is a possible low initial credibility for the BPR3 icon, and this can't be overlooked, either.

If more rules are added (initially and in the course of the development of the BPR3 brand), they should be easily modifiable, regularly reviewed, and clear to the public who reads articles with the BPR3 icon. It should be clear to readers that, say, use of the icon denotes that "this is a PR journal with an IF > 5 and RR > 60%". All of this is, of course, the prerogative of the BPR3 team; I'm just more afraid of not including good science than including bad science (though that does indeed worry me).

Perhaps a series of icons could be used, say, to signify different classes of research? For example: working papers, OAJ papers, and peer-reviewed? Just a thought.