In today's Chronicle of Higher Education there's an article about the methods journal publishers are deploying to detect doctored images in scientific manuscripts. From the article:
As computer programs make images easier than ever to manipulate, editors at a growing number of scientific publications are turning into image detectives, examining figures to test their authenticity.
And the level of tampering they find is alarming. "The magnitude of the fraud is phenomenal," says Hany Farid, a computer-science professor at Dartmouth College who has been working with journal editors to help them detect image manipulation. Doctored images are troubling because they can mislead scientists and even derail a search for the causes and cures of disease.
Ten to 20 of the articles accepted by The Journal of Clinical Investigation each year show some evidence of tampering, and about five to 10 of those papers warrant a thorough investigation, says [executive editor] Ms. [Ushma S] Neill. (The journal publishes about 300 to 350 articles per year.)
Maybe that frequency isn't alarming -- at worst, fewer than 7% of the articles they publish in a year contain doctored images. However, if one of those doctored images is in a paper upon whose accuracy you are relying (say, as a starting point for your own promising line of research), it will likely seem like too high an incidence. The images, after all, are supposed to be conveying something useful about the results actually obtained -- not about the results the researchers were expecting to get, or wished that they had gotten.
Back to the article:
Experts say that many young researchers may not even realize that tampering with their images is inappropriate. After all, people now commonly alter digital snapshots to take red out of eyes, so why not clean up a protein image in Photoshop to make it clearer?
"This is one of the dirty little secrets--that everybody massages the data like this," says Mr. Farid. Yet changing some pixels for the sake of "clarity" can actually change an image's scientific meaning.
Why does this sound so much like, "Everyone fibs on their taxes" or "Everyone lies about sex" to me? I'll grant that there may be an awful lot of people doing it, but it's surely not everyone. Moreover, to the extent that researchers are not advertising that they have massaged their data or doctored their images, it seems to be an indication that they at least suspect that they ought not to be doing it. Otherwise, why hide it?
And really, in a discourse where you are supposed to make your case based on the data you actually collected, doctoring an image to "improve" the case you can make only works by misleading the scientists you're trying to convince. If your data actually supports your claims, you can present that very data to make your case.
To the article again:
The Office of Research Integrity says that 44 percent of its cases in 2005-6 involved accusations of image fraud, compared with about 6 percent a decade earlier.
New tools, such as software developed by Mr. Farid, are helping journal editors detect manipulated images. But some researchers are concerned about this level of scrutiny, arguing that it could lead to false accusations and unnecessarily delay research.
You know what else can unnecessarily delay research? Relying on the veracity of an article in the peer-reviewed literature whose authors have tampered with a crucial visual representation of their results.
"Only a few journals are doing full image screening," says Mike Rossner, executive director of Rockefeller University Press. Mr. Rossner became a leading crusader for such checks after he accidentally stumbled upon manipulated images in an article submitted to The Journal of Cell Biology six years ago, when he was the publication's managing editor.
He worked with researchers to develop guidelines for the journal outlining proper treatment of images, and several other journals have since adopted them. Some enhancements are actually allowed--such as adjusting the contrast of an entire figure to make it clearer. But adjusting one part of an image is not permitted, because that changes the meaning of the data.
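The distinction in those guidelines, a uniform adjustment applied to every pixel versus an edit applied only to a hand-picked region, can be sketched in a few lines. This is a hypothetical illustration using NumPy; the percentile choices and toy array are my assumptions, not any journal's actual screening code:

```python
import numpy as np

def global_contrast_stretch(img, low_pct=1, high_pct=99):
    """Rescale the WHOLE image linearly between two percentiles.

    Every pixel passes through the same mapping, so relative
    intensities (e.g. how dark one gel band is compared to
    another) are preserved.
    """
    lo, hi = np.percentile(img, [low_pct, high_pct])
    stretched = (img.astype(float) - lo) / (hi - lo)
    return np.clip(stretched, 0.0, 1.0)

def selective_edit(img, mask, value):
    """Overwrite only a hand-selected region. This is the kind of
    change the guidelines forbid: it alters the meaning of the data."""
    out = img.astype(float).copy()
    out[mask] = value
    return out

img = np.array([[10, 50], [100, 200]], dtype=float)
uniform = global_contrast_stretch(img)       # same mapping everywhere
mask = np.array([[True, False], [False, False]])
doctored = selective_edit(img, mask, 200.0)  # one pixel "cleaned up"
```

The first function is allowed under rules like the Journal of Cell Biology's because it is reversible and uniform; the second quietly rewrites a datum.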
He says all papers accepted by The Journal of Cell Biology now go through an image check by production editors that adds about 30 minutes to the process. If anything seems amiss, the authors are asked to send an original copy of the data--without any enhancements.
So far the journal's editors have identified 250 papers with questionable figures. Out of those, 25 were rejected because the editors determined the alterations affected the data's interpretation.
Having your original data available is always a good idea, and it seems like the best defense against a "false positive". I don't doubt that it might be inconvenient or upsetting for an honest scientist to be asked by a journal to prove that her images are not doctored -- that it could feel like a journal editor is doubting her integrity. However, journal editors have an interest in protecting the quality of the scientific record (at least the piece of it published in their journal), and the whole scientific community depends on the soundness of that shared body of knowledge.
The people to get mad at are not the journal editors who are scrutinizing the manuscripts, but the cheaters who have made such scrutiny necessary.
At Nature Publishing Group, which produces some of the world's leading science journals, image guidelines were developed in 2006, and last year the company's research journals began checking two randomly selected papers in each issue for image tampering, says Linda J. Miller, U.S. executive editor of Nature and the Nature Publishing Group's research journals.
So far no article has been rejected as a result of the checking, she says.
Ms. Miller and other editors say that in most cases of image tampering, scientists intend to beautify their figures rather than lie about their findings. In one case, an author notified the journal that a scientist working in his lab had gone too far in trying to make figures look clean. The journal determined that the conclusions were sound, but "they wound up having to print a huge correction, and this was quite embarrassing for the authors," she says.
Ms. Miller wrote an editorial for Nature stressing that scientists should present their images without alterations, rather than thinking polished images will help them get published. Many images are of gels, which are ways to detect proteins or other molecules in a sample, and often they are blurry.
No matter, says Ms. Miller. "We like dirt--not all gels run perfectly," she says. "Beautification is not necessary. If your data is solid, it shines through."
In the real world, data are seldom perfect. And, to support your scientific claim, they don't need to be perfect -- just good enough to meet the burden of proof. If the data you've collected aren't clean enough to persuade other scientists in your field of the conclusion to which they lead you, this usually means you need to be a little more skeptical yourself, at least until you can find better evidence to support the conclusion that appears so attractive.
Cadmus Professional Communications, which provides publishing services to several scientific journals, has also developed software to automatically check the integrity of scientific images.
The Journal of Biological Chemistry, which uses Cadmus for its printing, sends 20 to 30 papers a year through this system, at a charge of $30 per paper, says Nancy J. Rodnan, director of publications at the American Society for Biochemistry and Molecular Biology, which publishes the journal. She says the journal cannot afford to send every paper through (without passing the cost on to authors), so its editors send only those that they suspect, usually because some figures look like they have gel patterns that have been reused. Last year about six of the checked papers led to more serious investigations, and a couple of those were eventually found to have been altered inappropriately.
This almost makes me wonder whether scientists would be inclined to exert more peer pressure on each other not to cheat if the journals decided to screen every manuscript and to pass that cost on to every author.
In any case, the Chronicle article notes that not every journal has undertaken such screening measures -- and that manuscripts rejected on account of suspect images by journals that do screen have then been submitted to (and published by) journals that do not. Given the lack of uniformity in the scrutiny applied by different journals, misleading data can find its way into print. Moreover, not all cheaters are so risk averse that they won't take a chance on slipping through the spot checks.
This is not a new problem, nor are journals' efforts to deal with it new. (I discussed it back in January of 2006.) But clearly, it's not a problem that has gone away in response to journal editors indicating that doctoring your images amounts to lying about your results.
I commend the journal editors who have stepped up to try to combat this problem. I assume that PIs are also making a point of communicating to their trainees that lying with a pretty image is not OK in scientific discourse.
First, complete and utter agreement about the despicableness of manipulating your data, in whatever form.
But we have to be aware that there is a gray area of completely necessary manipulation going on, one that has always been a factor but which is, with digital photography, more relevant than ever.
When you take an image, you need to develop it. And yes, with high-grade digital cameras (the kind that give you the raw, unadulterated sensor data) you need to do it too. But when you do, exactly how you choose to do it is a completely open process. There is little in the image data that tells you how much contrast to give it, what light level to use, how to set the white balance, how much to denoise the image (noting that denoising is increasingly something your camera does whether you want it to or not), or whether to keep it in color or turn it into black and white (black and white is a lot easier on the per-page charges).
Remember that Time cover picture of O.J. Simpson where he looked dark, menacing and brooding? Time's defense was, in effect, that choosing the light level is well within the editorial purview of the photographer - you must in fact choose _some_ light level, and which one is correct becomes a matter of interpretation, not an outside ground truth.
My point is, we do need to find some standard or "best practice" for this kind of thing. And preferably before a case blows up in public ending with hurt researchers and journals alike. In the end, I suspect the standard will be some variation on "make the picture look like what it did when you saw it in the lab", but we probably need some explicit consensus that this kind of subjective evaluation in the end is necessary and allowable.
Adding to the point raised above by "Janne", I'm wondering where one draws the line between acceptable (and even necessary) image manipulation and fraudulent image manipulation.
For example, my own field is astronomy (and yes, I'm still a graduate student in that field). Any image that I publish has been manipulated quite a bit (especially since I work with spectra). I've removed the bias voltage, divided out the flat field, extracted the spectrum, changed the scale to wavelength, corrected the flux levels, normalized the continuum, etc. In the case of radio spectra, often the spectrum has been smoothed a bit too so that the spectral lines are more visible. All this is necessary of course, but when does it become too much?
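The first two reduction steps listed above can be sketched like this. The arrays here are synthetic stand-ins I made up for illustration; a real pipeline would use observed bias and flat-field calibration frames:

```python
import numpy as np

# Synthetic calibration frames (assumptions for illustration only).
raw = np.array([[120.0, 140.0], [130.0, 150.0]])  # raw CCD counts
bias = np.full_like(raw, 100.0)                   # detector bias level
flat = np.array([[1.0, 0.8], [1.0, 0.8]])         # pixel sensitivity map

def reduce_frame(raw, bias, flat):
    """Standard first steps of CCD reduction: subtract the bias
    level, then divide by the normalized flat field so that
    pixel-to-pixel sensitivity variations cancel out."""
    debiased = raw - bias
    return debiased / (flat / flat.mean())

reduced = reduce_frame(raw, bias, flat)
```

Note that, like an allowed contrast adjustment, each step applies the same physically motivated transformation to every pixel, and each step is reported in the paper's data-reduction section.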
Now, obviously, changing the flux values for some pixels because I don't like what's there would be the wrong thing to do. But removing night sky interference isn't. I guess I'm just saying that the dividing line can be a bit fuzzy.
Perhaps the most important thing is that you describe the process. Pretty much every astronomy paper based on original observations has a section where you do nothing but talk about how you reduced the data. And, well, telescopes make their data public after a proprietary period (one to a few years), so anyone can get the raw data and then follow your procedure to reduce it. So maybe the line is drawn when you change something but don't describe the change (or, even if you do describe it, if the community as a whole thinks that the change distorted your conclusions).
I realize that figuring these things out is part of being a graduate student, but it can be exceptionally hard to get the faculty to actually answer questions, especially general questions. It seems like not only is there a fuzzy line, but that the position of the line changes depending on what field you're in.
Brian York's comment points out an important cultural difference. For an astronomer, an image is an array of numbers. You make measurements on it. For a biologist, an image is something to be perceived by human visual organs.
The guidelines needn't be any different than for anything else: you must provide raw data on request, and you must say how you did your experiment and analyzed your data. The problem is really that more and more biology journals are moving materials and methods into supplementary information.
For digital-camera photographs, there is an engineering solution for this problem: Canon makes a Data Verification Kit (DVK-E2) that creates a tamper-proof code for each photo taken when the device is attached to a camera. It is used in law enforcement and insurance to prove that an image has not been modified in any way - not even a single pixel.
There is some initial expense - the device costs over $700, and it only works with Canon's quite expensive top-of-the-line digital SLRs. But once you own it, there's zero cost to verify every photo you take. Could it make sense for a biology department to purchase the gear for individual researchers to share when taking key photographs for publication?
Also, could journals promote or reward the use of data-verification technology? To start the discussion, could journals knock a few bucks off of the page costs when photos are backed with the verification files? After all, it would save the journal the expense of having to analyze piles of photos looking for unethical modifications.
This technology is also a very good way to prevent an unfortunate false-positive accusation of a photo manipulation that didn't really occur. Considering that an incorrect accusation of manipulation could destroy an entire career, the price of the device plus a fancy DSLR starts to look like a reasonable insurance premium to pay against complete job-and-reputation destruction.
As far as I know, the technology is only available for Canon DSLRs right now, but there's no reason why you couldn't design, say, a scanning electron microscope with basically the same verification electronics built in. That would cost the manufacturer some serious money to implement it the first time, but not much at all once it became a standard feature. If the damage due to funky photos really is high enough, spreading the technology to all sorts of scientific imaging devices that create digital files might make sense both financially and scientifically.
(And, no, I'm not a salesman for Canon :) )
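The general idea behind such a tamper-proof code can be sketched with an authenticated hash over the raw image bytes. Canon's actual scheme is proprietary, so everything below (the device key, the use of HMAC-SHA256) is an assumption illustrating the principle, not their implementation:

```python
import hashlib
import hmac

# Hypothetical secret key, assumed to be burned into the camera
# hardware at manufacture and never exposed to the user.
DEVICE_KEY = b"secret-key-inside-the-camera"

def sign_image(image_bytes):
    """Compute an authentication tag over the raw bytes at capture time."""
    return hmac.new(DEVICE_KEY, image_bytes, hashlib.sha256).hexdigest()

def verify_image(image_bytes, tag):
    """Recompute the tag; any change to even a single byte of the
    image produces a different tag, so the comparison fails."""
    return hmac.compare_digest(sign_image(image_bytes), tag)

original = b"\x00\x01\x02 raw sensor data"
tag = sign_image(original)
untouched_ok = verify_image(original, tag)        # passes
edited_ok = verify_image(original + b"!", tag)    # one byte changed: fails
```

The same logic could sit inside any imaging instrument, which is why extending it to microscopes is plausible in principle.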
I think the simplest and most absolute rule of thumb is that if you are performing a manipulation only on some pixels in an image, and those pixels are being directly manually selected by you, and you are not reporting in your methods section exactly how and why you are manually selecting pixels for manipulation, then you are likely doing something wrong. And if you are not treating different images meant to be compared to one another identically, then you are likely doing something wrong.
As a paleontologist, I also immediately found myself thinking that we routinely and rightly manipulate our images. No-one would submit a palaeo paper to a respectable journal in which the specimen photographs hadn't been cropped, had their backgrounds trimmed out, contrast-balanced, and maybe Gaussian-blurred or unsharp-masked. This stuff is necessary to bring out the details of complex, coloured three-dimensional objects in 2d B&W photos.
I can think of two things that would help. The first is to only apply filters such as blur, unsharp and contrast-change uniformly across the entire image, or at least to disclose when this rule has been violated and explain how and why. The second is to include the original downloaded-from-the-camera images in the supplementary information. These days, when anyone can have a web-site easily and cheaply, EVERY descriptive paper should have supplementary information (especially images), whether rubber-stamped by the journal or not.
I assume that PIs are also making a point of communicating to their trainees that lying with a pretty image is not OK in scientific discourse.
First, complete and utter agreement about the despicableness of manipulating your data, in whatever form.
As one of those despicable liars, I'd like to give my perspective: I was an early adopter of Photoshop and digital images in biology, and happily used it to eliminate bubbles and crud from photos of my slides. The interpretation of data was completely unaffected, there was not the slightest intention to mislead and it never occurred to me that it might be considered objectionable.
Acceptable practice in the field has evolved such that that's no longer considered appropriate, and I no longer do it. But can we at least drop the shocked, shocked pose that anyone might think that "scientific discourse" values aesthetics?
I have been thinking about this issue a lot lately because I've been preparing my figures for publication and most of my figures are images of cells taken with a digital camera. I find myself going to my advisor a lot with questions about what's okay to do and what's not okay to do. I'm scared to death of doing something that's inappropriate. Can I change the brightness and contrast? By how much? Can I make the background darker? If I do that, does it affect the interpretation of my data? And the answers depend on understanding what exactly the program is doing when you ask it to do any of the above. Generally speaking, I try to make sure the image looks as close to what I see when I look through the scope as possible and do as little electronic manipulation of the image as I can and still have it represent my data. Of course, the human eye is not as sensitive as a camera, so that is not a perfect solution, but it's as close as I can get.
My husband is an astrophysicist and we talk about this sort of thing from time to time. As Fredrick said above, there is a fundamental difference in how we approach our data. His is quantitative and mine is mostly qualitative. It is unfortunately all too easy to manipulate a qualitative result until it says what you want it to say. You really need to be careful to not let your personal bias dictate your actions.