Science is supposed to be a project centered on building a body of reliable knowledge about the universe and how various pieces of it work. This means that the researchers contributing to this body of knowledge — for example, by submitting manuscripts to peer reviewed scientific journals — are supposed to be honest and accurate in what they report. They are not supposed to make up their data, or adjust it to fit the conclusion they were hoping the data would support. Without this commitment, science turns into creative writing with more graphs and less character development.
Because the goal is supposed to be a body of reliable knowledge upon which the whole scientific community can draw to build more knowledge, it’s especially problematic when particular pieces of the scientific literature turn out to be dishonest or misleading. Fabrication, falsification, and plagiarism are varieties of dishonesty that members of the scientific community look upon as high crimes. Indeed, they are activities that are defined as scientific misconduct and (at least in theory) prosecuted vigorously.
You would hope that one consequence of identifying scientists who have made dishonest contributions to the scientific literature would be that those dishonest contributions would be removed from that literature. But whether that hope is realized is an empirical question — one taken up by Anne Victoria Neale, Justin Northrup, Rhonda Dailey, Ellen Marks, and Judith Abrams in an article titled “Correction and use of biomedical literature affected by scientific misconduct” published in 2007 in the journal Science and Engineering Ethics. Here’s how Neale et al. frame their research:
Journals occasionally report on notorious research integrity violations, summarizing information from scientific misconduct investigations, and noting the affected publications. Many other lesser-known cases of fraudulent publications have been identified in official reports of scientific misconduct, yet there is only a small body of research on the nature and scope of the problem, and on the continued use of published articles affected by such misconduct.
The purpose of this study was to identify published research articles that were named in official findings of scientific misconduct that involved Public Health Services (PHS)-funded research or grant applications for PHS funding, and to investigate compliance with the administrative actions contained in these reports for corrections and retractions, as represented in PubMed. This research also explored the way in which such corrections are indicated to PubMed users, and determined the number of citations to the affected articles by subsequent authors. (6)
Worth noting here is that the research described in this paper focused on one particular part of the scientific literature, namely the published findings in biomedical sciences — so the findings here may not tell us a whole lot about the situation in chemistry, or physics, or astronomy, or geology, or any other scientific field whose publications are not indexed in PubMed. It’s also important to notice that this study is concerned with the fate of publications of authors who have actually been caught being dishonest to their scientific peers.
The standard for “being caught” the researchers apply here is having an official finding of misconduct (by the Office of Research Integrity, or ORI) against you. In part, this is because such a finding usually includes consequences connected to publications that may embody the dishonesty toward fellow scientists. Neale et al. write:
When the final report of an institutional inquiry into misconduct deems that the allegation of scientific misconduct has been substantiated, the ORI issues a “Finding of Scientific Misconduct” report, which is published in its Annual Report, and also in the NIH Guide to Grants and Contracts. These reports usually specify administrative actions against the respondents. Routine administrative actions include debarment from applying for PHS funding or participating in study sections for a period of time, and notifying editors of any published articles determined to be fraudulent, plagiarized and/or in need of some type of correction, or directing the respondents to make such notifications. (7)
One hears (sometimes with only a vague gesture to the empirical data) that the incidence of fabrication, falsification, and plagiarism is much higher in the biomedical sciences than in other scientific fields, especially the “hard sciences”. (Neale et al. don’t make that claim, as far as I can tell.) Whether or not that is so, the body of literature associated with the biomedical sciences is well-indexed and powerfully searchable through PubMed, a service of the U.S. National Library of Medicine and the National Institutes of Health.
But, as Neale et al. point out, there may be reason to worry that articles indexed in PubMed that are retracted or corrected will not always be identifiable as such:
The National Library of Medicine (NLM) policy for tagging articles with corrections states that notices of errata and retractions will be linked to articles indexed and available on its online PubMed database only if the journal publishes the errata or retraction in a citable form. The citable form requirement stipulates that the errata or retraction is labeled as such, and is printed on a numbered page of the journal that published the original article. The NLM does not consider unbound or tipped error notices, and for online journals, only considers errata listed in the table of contents with identifiable pagination. (7)
They also note:
In a 2002 survey of journal retraction policies, Michel Atlas noted one participant who stated that his journal did not publish retractions. Some journals allow one author to retract an article, but other journals require that every coauthor consent to the retraction. Fear of litigation is behind the inaction in some cases. (7)
Of course, not every retraction is the result of a finding at the end of an inquiry into misconduct. But, in situations where there has been an inquiry into misconduct, and the finding is that there has been misconduct that requires correction of the literature via a correction or a retraction, you would hope that the coauthors of the paper would consent to the appropriate action.
You’d also hope that scientific journals would recognize their interest in serving their readers by ensuring the scientific quality of the articles they publish. The pre-publication screening (via peer review and editorial oversight) can do part of the job here, but even in situations where there is nothing like misconduct on the part of authors, occasionally honest mistakes are discovered after publication. Why on earth would a journal have a policy that would prevent authors who become aware of such mistakes from communicating the relevant information to their fellow scientists who have access to the published work now known to be mistaken, whether through a correction or a retraction?
In any case, the present study points to policies and facts on the ground that might make us worry about how completely errors in the scientific literature (whether honest mistakes or intentional deceptions) are corrected.
Neale et al. set out to quantify the extent to which papers found needing to be retracted on account of misconduct were actually identified as retracted in the literature that scientists draw upon in their scientific work. To do this, they looked at the NIH Guide for Grants and Contracts and the ORI Annual Reports for 1991-2001. From all the published “Findings of Scientific Misconduct” in these two sources, they collected the information on publications identified as affected by the misconduct, on what administrative actions were taken against the people found to have committed scientific misconduct, and on whether those found to have committed scientific misconduct accepted responsibility for the misconduct.
Next, Neale et al. searched PubMed to see if the publications in question had been retracted, or if errata for them had been published. (As I understand their methodology, they searched for the articles themselves, and were checking to see if the retractions or erratum notices came up in these search results.)
Finally, they searched Web of Science to locate instances where other publications cited these articles which had been flagged as needing to be corrected or retracted. Remember that they were considering misconduct findings that were published through 2001 (and thus, one would assume, affected scientific work published prior to the findings of misconduct). Their examination of the citation history of these papers focused on a time interval after these findings:
Data collection from the ISI Web of Science was repeated two times during 2003, and once during 2004 to refine the data collection methodology. Because citations increase over time, a final citation analysis was conducted during the week of May 17, 2005 and the citations as of that week are reported here. This allowed for a minimum period of 3 years between the publication of an affected article and the cut-off date for the citation analysis.
Publishing may not always move quickly, but you might hope that three years is sufficient to communicate to the scientific community that draws on the literature whether a particular piece of that literature is not as reliable as it was first thought to be.
What did they find?
For the published findings of misconduct in the NIH Guide for Grants and Contracts and the ORI Annual Reports for 1991-2001, 102 articles were identified as needing retraction or correction. (There were 41 researchers whose misconduct was tied to these 102 articles, 19 of them identified as responsible for a single problematic paper and 22 of them responsible for two or more problematic papers. One of those 41 researchers was responsible for a whopping 10 articles in need of retraction or correction.)
Of those 102 articles, 79 reported results that were fabricated, falsified, or misrepresented; two contained plagiarism; 16 gave inaccurate reports of the methodology the researchers actually used; and five reported “results” from fabricated experimental subjects.
Just over half of the 41 researchers here (responsible for 53 of the flagged articles) accepted the findings of misconduct, while five were recorded as disagreeing with the findings or denying responsibility for the misconduct. (The other misconduct findings didn’t record the respondents’ response to the findings.)
By the time the findings of misconduct were published, corrigenda (corrections) had already been published for 32 of the flagged articles and were “in press” for 16 more. The administrative actions prescribed in the findings of misconduct called for retractions or corrigenda to be published for another 47 of these flagged articles.
For those doing the math, this leaves seven of the articles flagged (as reporting results that were fabricated, falsified, or misrepresented, or as containing plagiarism, or as giving inaccurate reports of the methodology the researchers actually used, or as reporting “results” from fabricated experimental subjects) for which the administrative actions did not specifically call for correction or retraction. However, it’s not unreasonable to think that articles flawed in these ways ought to be corrected or retracted, in order to protect the reliability of the scientific literature and the trust scientists need to be able to place in the reports published by their fellow scientists if they don’t want to have to do the whole damned scientific job themselves.
How many of those 102 flagged articles turned up with corrections or retractions in the PubMed searches Neale et al. conducted? By May 2005 (which was their data collection cut-off), 47 of the articles were indexed as having a retraction, 26 were indexed as having an erratum, and 12 others “had pertinent information in the PubMed ‘Comment’ field” (12). Ten more of the articles had no notice of corrigenda in PubMed but did have “an open access link to the NIH Guide ‘Findings of Scientific Misconduct’ that indicated the article was affected by misconduct” (13). Three had no such link that came up through PubMed.
This adds up to 98 articles — which means that four of the 102 flagged articles didn’t come up in PubMed searches at all.
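For readers who want to check the bookkeeping, here is a minimal sketch of the tallies in Python. The counts are the ones reported above; the dictionary keys and the `unaccounted` helper are my own labels for illustration, not anything from Neale et al.

```python
# Counts as summarized above from Neale et al.; the labels are illustrative.
TOTAL_FLAGGED = 102

by_problem = {
    "fabricated_falsified_misrepresented": 79,
    "plagiarism": 2,
    "inaccurate_methodology": 16,
    "fabricated_subjects": 5,
}

correction_status = {
    "corrigenda_already_published": 32,
    "corrigenda_in_press": 16,
    "correction_or_retraction_called_for": 47,
}

pubmed_status = {
    "indexed_with_retraction": 47,
    "indexed_with_erratum": 26,
    "comment_field_only": 12,
    "link_to_misconduct_finding_only": 10,
    "no_link": 3,
}


def unaccounted(total, breakdown):
    """Return how many of `total` articles the breakdown does not cover."""
    return total - sum(breakdown.values())


if __name__ == "__main__":
    for name, breakdown in [
        ("type of problem", by_problem),
        ("correction status", correction_status),
        ("PubMed status", pubmed_status),
    ]:
        print(name, "leaves", unaccounted(TOTAL_FLAGGED, breakdown), "unaccounted for")
```

The problem-type breakdown covers all 102 articles, while the other two breakdowns leave the seven and four articles discussed above.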
These results show some variation in how the problematic articles were flagged as problematic to those searching PubMed, but the fact that only three of the 102 articles here were not flagged as such doesn’t seem like a terrible level of compliance with the goal of correcting the scientific literature. (Of course, to scientists basing their work on the three problematic papers that seemed to be indexed without warning labels, this might not be terribly comforting.)
The next question, though, is whether this correction of the scientific literature was successfully communicated to its intended audience — especially to other researchers who might be basing their own published work in part on these problematic papers. This is where the question of citations of the 102 problematic papers comes in.
By their data collection cut-off date (May 17, 2005), Neale et al. found listed in the Web of Science database nearly 6,000 citations to the 102 problematic papers. For the whole set of problematic papers, the median number of citations was 26, with a higher median number of citations (36) for the 13 problematic papers for which PubMed didn’t have a linked corrigendum. The median number of citations of the papers with errata was 33, while the median number of citations of the retracted papers was 27. One of these problematic papers was actually listed as cited by 592 other journal articles.
Potentially, this is a problem.
Neale et al. note that some of these citations may be a result of the long turnaround time between when researchers submit manuscripts and when the manuscripts are published. If the manuscripts are published at about the same time that the retractions or corrections of the sources they cite are published, it’s understandable that the authors citing the problematic papers could not have been in possession of the information that the papers they were citing were, in fact, problematic. (This might, however, take some of the wind out of the sails of those with the firm conviction that errors in the literature will be reliably detected by the other researchers who draw on and cite that literature in their own research.) However, given that Neale et al. were looking at a fairly long window of time after the last of these 102 papers had been flagged, in official findings of misconduct, as problematic, it seems clear that some significant number of these papers were still being cited after retractions or corrections were published (or after information in their comments field in PubMed indicated a problem, or after links in PubMed could have connected searchers to the official misconduct findings relevant to these papers).
Why didn’t the authors citing these problematic papers know that they were problematic? (It’s hard to imagine that they would cite them if they knew, for example, that they had been retracted.*) Neale et al. have this to say:
Most journals are not open access and on-line availability of corrections in the PubMed ‘Comment’ is determined by institutional subscriptions, making it difficult for some to learn more about the particular details related to the corrigenda. Researchers should be alert to ‘Comments’ linked to the open-access NIH Guide for Grants and Contracts, as its ‘Findings of Scientific Misconduct’ usually provide the most detail about the nature of the problem in the affected articles and are often more informative than the statements about the retraction or correction found in the journals (which do not always reveal that the article was affected by scientific misconduct).
How can the continued citation of research affected by scientific misconduct be reduced? More prominent labeling in the PubMed database is desirable to alert users to notices of retraction and errata. This could take the form of larger or bold fonts for these notices. In addition, a prominent placement of the word ‘retraction’ on the first page of such articles would be useful, because once a user downloads an article, these notices are left behind.
Some of the problem, in other words, may be due to the vigilance (or lack thereof) displayed by those using the scientific literature, but some of it may come down to the extent to which that scientific literature is accessible to the researchers. Yet another instance where Open Access journals could make life better! (There is an irony in this observation being reported in Science and Engineering Ethics, which is not an Open Access journal.)
Neale et al. note that weeding such problematic papers out of the pool of scientific literature that researchers cite may require journal editors, manuscript authors, and even journal readers to take on more responsibility — for example, before they submit a manuscript for publication (either initially or after the last set of revisions), ensuring that none of the sources they cite have been retracted or corrected. Failing to exercise such vigilance could inadvertently render their own paper a problematic one (if it depends in part on another problematic paper).
However, until the scientific community is on board in recognizing such vigilance as a duty, it’s unlikely that failing to exercise it could itself rise to the level of scientific misconduct.
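The pre-submission check Neale et al. suggest could, in principle, be as simple as comparing a manuscript's reference list against a list of known retractions and corrections. What follows is only a sketch under stated assumptions: the `notices` mapping is a hypothetical table an author would have to compile (for example, by looking up each reference in PubMed by hand), and the identifiers shown are made-up placeholders, not real articles.

```python
def normalize(identifier):
    """Normalize an article identifier so trivial differences don't hide a match."""
    return identifier.strip().lower()


def flagged_references(cited_ids, notices):
    """Return {identifier: notice} for every cited source that has a notice.

    cited_ids: identifiers for the sources a manuscript cites.
    notices:   hypothetical mapping of identifier -> "retraction" or "erratum",
               compiled by the author from whatever index they trust.
    """
    known = {normalize(key): value for key, value in notices.items()}
    result = {}
    for ref in cited_ids:
        key = normalize(ref)
        if key in known:
            result[key] = known[key]
    return result


if __name__ == "__main__":
    # Placeholder identifiers, invented for illustration only.
    notices = {"example-id-001": "retraction", "example-id-002": "erratum"}
    cited = ["Example-ID-001 ", "example-id-003"]
    for ref, notice in flagged_references(cited, notices).items():
        print(f"Check before submitting: {ref} has a published {notice}")
```

The hard part, of course, is not the comparison but keeping the list of notices complete and current, which is exactly the labeling and access problem the study describes.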
*It’s possible that at least some of the citations to the problematic papers were citations to identify now-discredited theories or claims. The methodology in the study doesn’t seem to have involved tracking down every article that cited one of the problematic papers and characterizing each of those citations (e.g., as approving or disapproving). It might be interesting to see research in how frequently scientists cite papers as reliable prior work or independent support for their findings versus how frequently they cite papers to disagree with them.
– – – – –
Anne Victoria Neale, Justin Northrup, Rhonda Dailey, Ellen Marks, & Judith Abrams (2007). Correction and use of biomedical literature affected by scientific misconduct. Science and Engineering Ethics, 13, 5-24. DOI: 10.1007/s11948-006-0003-1