Let’s have a look and see if we can decide.
Sciencedaily.com has this piece on a paper published in the Journal of Medical Ethics in which the claim is made that “US Scientists Significantly More Likely to Publish Fake Research.” The problem is that the statistics given don’t show that.
The study is said to look at papers withdrawn according to PubMed between 2000 and 2010. Taken literally, that means 2001 through 2009 inclusive (though I’m guessing that is not what was meant). Here’s the data:
Papers retracted: 788
… because of error: 545
… “attributed” to fraud (the write-up gives no indication of what this attribution is based on): the rest of the papers. We are left to do our own math, possibly because the number is so tiny that it might fail to impress us: 243
So that’s 243 papers over 9, 10 or 11 years.
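For anyone who wants to check the subtraction, here is a minimal sketch using only the two figures quoted above. The three period lengths are my reading of the ambiguous "between 2000 and 2010", not something the write-up states.

```python
# Figures quoted in the write-up.
total_retracted = 788
retracted_for_error = 545

# The fraud count is whatever is left over.
retracted_for_fraud = total_retracted - retracted_for_error

# "Between 2000 and 2010" could mean 9, 10 or 11 years,
# depending on whether the endpoints are included.
fraud_per_year = {years: retracted_for_fraud / years for years in (9, 10, 11)}

print(f"Retracted for fraud: {retracted_for_fraud}")
for years, rate in fraud_per_year.items():
    print(f"Over {years} years: {rate:.1f} fraud retractions per year")
```

Whichever period length you pick, the worldwide figure comes out at a couple of dozen papers per year.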
Now, here’s the tricky part:
The highest number of retracted papers (260) were written by US first authors.
Interesting. We were talking about fraudulent papers, but now we are citing numbers that include papers with errors. We are also told that “one in three” of these are fraudulent; again, we need to do the math ourselves. The answer is about 86.67.
Now, at this point, I should mention that there are between about 3200 and 4000 days in the period “between 2000 and 2010” and there are probably more than 100 scientific papers published every day. So the rate of production of US based alleged fraudulent papers is minuscule.
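Here is that back-of-the-envelope estimate as a rough sketch. The 365.25 days per year and the floor of 100 papers per day are assumptions (the second is just my guess from the paragraph above), so treat the result as an order of magnitude, not a measurement.

```python
# Figures from the write-up.
us_retracted = 260
us_fraud = us_retracted / 3  # "one in three" attributed to fraud

# Assumptions, not data: average year length and a floor on papers per day.
DAYS_PER_YEAR = 365.25
PAPERS_PER_DAY = 100

# The period is 9 to 11 years long, depending on how you read it.
days_low = 9 * DAYS_PER_YEAR
days_high = 11 * DAYS_PER_YEAR

# Use the shortest period and the lowest paper count, which makes
# the fraud share look as large as it possibly can.
papers_low = days_low * PAPERS_PER_DAY
fraud_share = us_fraud / papers_low

print(f"US papers attributed to fraud: {us_fraud:.2f}")
print(f"Days in the period: {days_low:.0f} to {days_high:.0f}")
print(f"Share of all papers (upper bound): {fraud_share:.4%}")
```

Even with every assumption tilted toward inflating the number, the share stays well under a tenth of a percent.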
But that is not the real problem with this report. This is the problem: “The highest number of retracted papers were written by US first authors (260), accounting for a third of the total. One in three of these was attributed to fraud” … “The UK, India, Japan, and China each had more than 40 papers withdrawn during the decade. Asian nations, including South Korea, accounted for 30% of retractions. Of these, one in four was attributed to fraud.”
Well, that’s pretty meaningless unless we want to get the calculator out again, and even then, useful only if we want to assume that the rate of “fraud” against “withdrawn” is identical for all cases, but here we have a hint that it varies by as much as about 30%. But who cares about that … you see the problem, right? We are told (in the article’s title) that the RATE of fraud in the US is higher than anywhere else, but we are then told in the body of the piece that the NUMBER of fraudulent cases is highest in the US.
Not the same thing at all.
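To see why a count and a rate can point in opposite directions, here is a toy sketch. Both countries and all of their numbers are invented for illustration; nothing here comes from the paper or from Sciencedaily.

```python
# Hypothetical numbers, made up purely to illustrate count versus rate.
countries = {
    "Country A": {"papers": 1_000_000, "fraud": 80},
    "Country B": {"papers": 100_000, "fraud": 20},
}


def fraud_rate_per_100k(papers: int, fraud: int) -> float:
    """Fraudulent papers per 100,000 papers published."""
    return fraud / papers * 100_000


rates = {
    name: fraud_rate_per_100k(c["papers"], c["fraud"])
    for name, c in countries.items()
}

for name, c in countries.items():
    print(f"{name}: {c['fraud']} fraudulent papers, "
          f"{rates[name]:.0f} per 100,000 published")
```

Country A has four times as many fraudulent papers as Country B, yet Country B's rate is more than twice as high. Without the number of papers each country published, the count alone tells us nothing about who is "more likely" to commit fraud.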
But, wait, there’s more. Or should I say, less.
One would think that the rate of fraud being alarmingly higher (or even moderately but statistically significantly higher) in the US would be important enough that it would be mentioned in the abstract of the original published paper. Let’s see if it is:
Background Papers retracted for fraud (data fabrication or data falsification) may represent a deliberate effort to deceive, a motivation fundamentally different from papers retracted for error. It is hypothesised that fraudulent authors target journals with a high impact factor (IF), have other fraudulent publications, diffuse responsibility across many co-authors, delay retracting fraudulent papers and publish from countries with a weak research infrastructure.
Methods All 788 English language¹ research papers retracted from the PubMed database between 2000 and 2010 were evaluated. Data pertinent to each retracted paper were abstracted from the paper and the reasons for retraction were derived from the retraction notice and dichotomised as fraud or error. Data for each retracted article were entered in an Excel² spreadsheet for analysis.
Results Journal IF³ was higher for fraudulent papers (p<0.001). Roughly 53% of fraudulent papers were written by a first author who had written other retracted papers ('repeat offender'), whereas only 18% of erroneous papers were written by a repeat offender (χ²=88.40; p<0.0001). Fraudulent papers had more authors (p<0.001) and were retracted more slowly than erroneous papers (p<0.005). Surprisingly, there was significantly more fraud than error among retracted papers from the USA (χ²=8.71; p<0.05) compared with the rest of the world.
Conclusions This study reports evidence consistent with the ‘deliberate fraud’ hypothesis. The results suggest that papers retracted because of data fabrication or falsification represent a calculated effort to deceive. It is inferred that such behaviour is neither naïve, feckless nor inadvertent.
A few notes:
¹ Note that since only English-language papers are used, there will be a slight reduction of papers from non-US sources in the sample.
² Funny how an article looking into fraud and retracted papers would use a non-open-source, and thus unverifiable, mathematics tool in its research. Is this a case of product placement or merely a misplaced understanding of the concept of replicability and transparency?
³ IF = Impact Factor
The headline from Sciencedaily is not even mentioned.
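What the abstract does report for the US is a comparison of fraud against error among retracted papers only. A small sketch shows why that is not the same as a fraud rate among published papers. Both groups and every number in it are invented for illustration, including the publication totals, which neither the abstract nor the news piece gives.

```python
# Hypothetical groups; all counts are invented for illustration.
groups = {
    "Group X": {"published": 100_000, "fraud": 30, "error": 60},
    "Group Y": {"published": 100_000, "fraud": 40, "error": 120},
}

summary = {}
for name, g in groups.items():
    retracted = g["fraud"] + g["error"]
    summary[name] = {
        # Share of retractions attributed to fraud (what the abstract compares).
        "fraud_share": g["fraud"] / retracted,
        # Fraud retractions per 100,000 published papers (what the headline implies).
        "fraud_rate": g["fraud"] / g["published"] * 100_000,
        # Error retractions per 100,000 published papers.
        "error_rate": g["error"] / g["published"] * 100_000,
    }

for name, s in summary.items():
    print(f"{name}: fraud share of retractions {s['fraud_share']:.0%}, "
          f"fraud rate {s['fraud_rate']:.0f}, error rate {s['error_rate']:.0f} "
          f"per 100,000 papers")
```

Group X has the higher fraud share among its retractions (one in three against one in four), but the lower fraud rate per paper published. Its share is higher only because it retracts far fewer papers for error.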
Now, please revisit this statement from the abstract: “Surprisingly, there was significantly more fraud than error among retracted papers from the USA (χ²=8.71; p<0.05) compared with the rest of the world.” Surprising? Why? Because of the original hypothesis that fraud would more likely come from places with “weak research infrastructure.” I’m not sure why they thought that, but the evidence contradicted it, so they are surprised.
Does the quoted sentence say that “US Scientists Significantly More Likely to Publish Fake Research, Study Finds”? No. It says that, among retracted papers, fraud makes up a higher proportion relative to error in the US. Perhaps US scientists have the same or even lower rates of fraud, but have fewer papers retracted for honest error.
Conclusion: Interesting paper mauled by crappy science reporting. I expect to see a retraction!
Steen, R. (2010). Retractions in the scientific literature: do authors deliberately commit research fraud? Journal of Medical Ethics. DOI: 10.1136/jme.2010.038125