Did you publish a scientific mistake? Blame the software!

When computers first entered the mainstream, it was common to hear them blamed for everything. Did you miss a bank statement? That darned computer! Miss a phone call? Again, the computer!

The latest issue of Science had a new twist on this old story. Now, instead of a researcher taking responsibility for doing sloppy science, we're back to blaming the computer. Never mind that the lab was using homemade software that they "inherited from someone" (and apparently never tested); the five retracted papers were the fault of the software! Not the scientists who forgot to include positive controls!

To quote the Science news article:

In September, Swiss researchers published a paper in Nature that cast serious doubt on a protein structure Chang's group had described in a 2001 Science paper. When he investigated, Chang was horrified to discover that a homemade data-analysis program had flipped two columns of data, inverting the electron-density map from which his team had derived the final protein structure. Unfortunately, his group had used the program to analyze data for other proteins. As a result, on page 1875, Chang and his colleagues retract three Science papers and report that two papers in other journals also contain erroneous structures.

[snip]

and

The most influential of Chang's retracted publications, other researchers say, was the 2001 Science paper, which described the structure of a protein called MsbA, isolated from the bacterium Escherichia coli. MsbA belongs to a huge and ancient family of molecules that use energy from adenosine triphosphate to transport molecules across cell membranes. These so-called ABC transporters perform many essential biological duties and are of great clinical interest because of their roles in drug resistance. Some pump antibiotics out of bacterial cells, for example; others clear chemotherapy drugs from cancer cells. Chang's MsbA structure was the first molecular portrait of an entire ABC transporter, and many researchers saw it as a major contribution toward figuring out how these crucial proteins do their jobs. That paper alone has been cited by 364 publications, according to Google Scholar.

[snip]

Ironically, another former postdoc in Rees's lab, Kaspar Locher, exposed the mistake. In the 14 September issue of Nature, Locher, now at the Swiss Federal Institute of Technology in Zurich, described the structure of an ABC transporter called Sav1866 from Staphylococcus aureus. The structure was dramatically--and unexpectedly--different from that of MsbA. After pulling up Sav1866 and Chang's MsbA from S. typhimurium on a computer screen, Locher says he realized in minutes that the MsbA structure was inverted. Interpreting the "hand" of a molecule is always a challenge for crystallographers, Locher notes, and many mistakes can lead to an incorrect mirror-image structure. Getting the wrong hand is "in the category of monumental blunders," Locher says.

On reading the Nature paper, Chang quickly traced the mix-up back to the analysis program, which he says he inherited from another lab.

I'm flabbergasted!

Chang's publication record is impressive, but I'm stunned that 364 publications cited his 2001 paper and no one tested the software with data sets from positive controls!

Computer programs aren't magic. They are written by human beings and contain the same kinds of logical errors and mistakes that humans make, plus some other interesting problems that are unique to computers, like running out of memory, kernel panics, and so on.
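To make the kind of bug described in the Science news article concrete: mishandling the sign of a single coordinate column is equivalent to reflecting a structure through a plane, which gives you the mirror image - the "wrong hand." Here is a minimal sketch of that idea in Python; the coordinates and the handedness helper are hypothetical toy code, not the actual crystallography program:

```python
# Hypothetical sketch: how a sign/column mix-up turns a 3D structure
# into its mirror image (the "wrong hand").
import numpy as np

# Four points forming a chiral, tetrahedral-like arrangement.
coords = np.array([
    [0.0, 0.0, 0.0],   # central atom
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])

def handedness(xyz):
    """Sign of the signed volume spanned by the three bonds from atom 0.
    +1 and -1 distinguish a structure from its mirror image."""
    v1, v2, v3 = xyz[1] - xyz[0], xyz[2] - xyz[0], xyz[3] - xyz[0]
    return np.sign(np.linalg.det(np.stack([v1, v2, v3])))

# A bug that flips the sign of one coordinate column reflects the
# whole structure through a plane.
flipped = coords.copy()
flipped[:, 2] *= -1

print(handedness(coords))   # 1.0  -> one hand
print(handedness(flipped))  # -1.0 -> the mirror image
```

No amount of rotation will superimpose the two structures; only a check against data of known handedness, a positive control, would catch the flip.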

We can't just throw all of our scientific training out the window because there's a computer involved. If you're going to use software, you have to use controls and good experimental design, just like when you're doing a wet-bench type of experiment.

Back when I was a young, impressionable graduate student, I never ceased to be amazed by the one female post-doc in our lab. I think she left academic science once she started having children, like so many of the other women I knew, but she was an incredibly careful researcher, and famous, in our lab, for her obsession with controls. In fact, I don't think she was ever satisfied unless an experiment included more controls than experimental samples. Where are these people in computational biology and bioinformatics?

Controls are a cornerstone of biological research
For those of you who don't do wet-bench biology, controls are fundamental to this type of work. Since we can't possibly know or predict all of the variables, we do the next best thing. We use controls. Since many procedures have multiple steps, we often include both positive and negative controls.

A positive control is a sample that should exhibit predictable behavior. Often, it's a sample that we've used before. For example, in a PCR experiment, a positive control would be a sample that has worked before and produced a DNA fragment of a specific size. In a DNA sequencing experiment, it would be a sample that we've sequenced before, with success. We use positive controls to help troubleshoot experiments and identify points of failure. If a positive control fails to behave as we expect, we know that there is a problem with the entire experiment.

A negative control is a sample that is identical to our experimental sample, with the exception that it's missing the thing we want to test. In PCR, a negative control might be missing the template DNA. If we saw DNA appear in the negative control, after the PCR, we would suspect a problem with contamination. If we were testing the effect of an antibiotic, the negative control sample would be a bacterial culture, grown to the same density as the test culture, in the same media, and under identical conditions, but without the antibiotic.

In commercial software testing, and in bioinformatics and computational biology work, we use control samples as well. I use data sets with predictable behavior, or include positive controls - that is, data that I know should behave a certain way - whenever I try a new program or method. I do these kinds of things partly because it's part of my job to find bugs, and partly so that I can be confident that the algorithms or programs are behaving the way they're supposed to.
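In practice, this can be as simple as an automated check that reruns an analysis on a known-good data set and compares the result to the answer we got before. Here's a minimal sketch; the file name, the expected value, and the ./analyze program are hypothetical placeholders, not any particular package:

```python
# Minimal positive-control test for an inherited analysis program.
# CONTROL_INPUT, EXPECTED_PEAK, and the ./analyze executable are
# hypothetical placeholders.
import subprocess

CONTROL_INPUT = "control_dataset.txt"   # data set with a known-good result
EXPECTED_PEAK = 42.7                     # value we obtained and verified before
TOLERANCE = 0.01

def run_analysis(path):
    """Run the analysis program and parse its numeric output."""
    result = subprocess.run(["./analyze", path],
                            capture_output=True, text=True, check=True)
    return float(result.stdout.strip())

def test_positive_control():
    observed = run_analysis(CONTROL_INPUT)
    assert abs(observed - EXPECTED_PEAK) < TOLERANCE, (
        f"Positive control failed: expected {EXPECTED_PEAK}, got {observed}"
    )

if __name__ == "__main__":
    test_positive_control()
    print("Positive control passed.")
```

If the positive control ever stops giving the expected answer, the software - not the new data - is the first suspect.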

Somehow, I think we have to impress on young researchers that testing software is at least as important as being able to use it.

Reference:

Greg Miller (2006) "A Scientist's Nightmare: Software Problem Leads to Five Retractions." Science 314:1856-1857.



One further comment ... the title of the article cited is flawed:

"A Scientist's Nightmare: Software Problem Leads to Five Retractions"

"Software problem"? I strongly suspect that the software did exactly what it was asked to do. It's a "human sloppiness" problem ...

By Scott Belyea (not verified) on 01 Jan 2007 #permalink

In macromolecular crystallography, there has been a growing viewpoint that the technique has become so automated that anyone can do it; i.e. a biologist with no special expertise or experience in crystallography could crystallize a protein, then turn the crank and get a structure. Hopefully this episode will restore some sense. If you cannot tell when a piece of software has given you a wrong answer, you are not justified in claiming that it is giving you correct answers.

By Mustafa Mond, FCD (not verified) on 01 Jan 2007 #permalink

I think I need to point out that there are standard crystallographic software packages that are almost what Mustafa suggests - data in, map out, little user input - and that give statistics to tell you how 'right' or 'wrong' your structure is. I have solved a couple of structures, and I honestly don't know the nuts and bolts of how each algorithm works.

BUT, the Science article points out that he used a homebrewed processing package, which is where I have little forgiveness. The program flipped the structure to a mirror image (and since the resolution was so low, they made an incorrect model).

That said, I find it implausible that no one in his lab tried another crystal suite (most are freely available). What 'answer' did the standard xtal software packages give? I think someone needs to look past the 'oops' and look for fraud. Were the xtal statistics worse in the standard package, reflecting the actual map quality? Would the statistics from a standard package have been unpublishable? How did the electron density maps in the 2001 Science paper look so right, only to wind up so wrong?

The last question that needs asking is whether 4.5 angstrom structures really have value. At some point, stuffing models into blobs is just guesswork. Should the reviewers have rejected the paper in favor of waiting for a higher-resolution structure? (I think that even at 3.8 angstroms the handedness of the helices would have been apparent.)

Thanks for the comments, Robert. You make some very good points. I work at a company that develops and sells commercial scientific software. As you say, using the commercial stuff is often easy. If commercial software doesn't make life easier for the people who buy it, it won't survive. Commercial software, in general, also ends up getting tested regularly and validated over the years, because customers pay for those services.

I'm not sure if the work in the papers really constitutes fraud, BUT I thought the Science reporter let the PI off far too easily by blaming the software rather than the PI. If you use homemade software, you have a responsibility to validate that it actually works. No one would ever use an antibody without making sure that it binds to the right target. Software should be subject to the same criteria for scientific rigor as any other kind of reagent or tool.

I definitely think that this guy and the people in his lab should have tried their software out with data sets from published structures, and I agree that they should also have tried out some other packages to make sure that they were getting the correct results.

Letting this stuff go for five years without correction was irresponsible and most certainly an example of poor science.

I've run across a few bugs in commercial software over the years as well. Unlike open source academic packages, commercial software generally does not come with the source code, so you can't track down and fix the bug yourself.

BUT I thought the Science reporter let the PI off far too easily by blaming the software rather than the PI.

I was struck by the fact that, after publishing three of the errant structures, Science does not spend a single word questioning whether they themselves or their editorial policies contributed to the fiasco. If 4.5 Angstrom maps are not adequate, then this is a matter for the editors, not just the PI. Are adequate materials made available to reviewers to verify the correctness of the models? Does the race to publish the hottest structures lead both authors and editors to overlook questions of quality?

By Mustafa Mond, FCD (not verified) on 03 Jan 2007 #permalink

Mustafa,

Good points about Science magazine. I wonder if the reporter talked to the reviewers or not.

Presumably, the Chang lab had the source code for the homemade software program that they inherited. I don't think having open source software really mattered here.

If you don't test software, you don't find bugs. Whether the code is open source or not wouldn't make much of a difference if you don't know that there's a bug that needs fixing.

Not directly relevant, but amusing:

Way back in 1982, I was in high school, and visited another school's computer center where somebody was showing off their new poster-making program -- put in a phrase, and it would be printed in giant letters on banner paper. I asked for a poster of the classic motto: "To err is human; to really foul things up requires a computer".

The program mangled the printout....

By David Harmon (not verified) on 03 Jan 2007 #permalink

Hi,

Thanks for your helpful post. I would like to ask about your opinion.
I am a Postdoctoral Research Fellow. During my research, I found a serious error in a 1972 paper published by authors from the University of Minnesota. They have published several papers based on that erroneous paper. Other research groups have since used that model to analyze their data!

So the question is: what would be the best way to report this error? Is it best to publish a paper that explains the error? Or would it be better to write letters to the editors of those journals about the errors?

Thanks for your help.

Regards

1972? That mistake is over 37 years old. Hopefully, the scientific process has found it out and moved on!

It's important to remember that science is a process. We always try to come up with the best conclusions that we can, given the data that we have available. Better data and better conclusions are always likely to come along. The real question is whether or not the erroneous conclusions are leading people astray and causing them to follow the wrong direction.

But your question is about what to do with that information. If you think the error is important, maybe a paper that sets the field straight would be a good idea. I think your idea about contacting the editors of the journal is a good one. So, contact the editors. Tell them what you've found and ask them what they think about your idea for a publication.