In a recent post, Candid Engineer raised some interesting questions about data and ethics:
When I was a graduate student, I studied the effects of many different chemicals on a particular cell type. I usually had anywhere from n=4 to n=9. I would look at the data set as a whole, and throw out the outlying points. For example, if I had 4 data points with the values 4.3, 4.2, 4.4, and 5.5, I would throw out the 5.5.
Now that I am older, wiser, and more inclined to believe that I am fully capable of acquiring reproducible data, I am more reluctant to throw away the outlying data points. Unless I know there is a very specific reason why a value was off, I’ll keep all of the data. This practice, naturally, results in larger error bars.
And it’s occurred to me that it might not even be ethical to throw out the oddball data points. I’m really not sure. No one has ever taught me this. Any opinions out there?
My two different ways of handling data actually reflect an evolution in the way I think about biological phenomena. When I was a graduate student, I very much believed that there was a “right answer” in my experiments, and that the whole point of me collecting all of that data was to find the single right answer to my question.
But anymore, I’m not so sure that cells contain one right answer. For a lot of phenomena that I study, it’s totally possible that my cells might display a range of behaviors, and who am I to demand that they decide on doing only one thing? As we all know, cells are dynamic beings, and I no longer feel bad about my 20% error bars. I’ve become more accepting that maybe that’s just the way it is.
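To make the arithmetic in the quoted example concrete, here is a minimal Python sketch (my illustration, not anything Candid Engineer describes actually using) that applies Tukey's IQR fences, one common convention for flagging outliers, to those four values. Notably, whether 5.5 even counts as an outlier depends on which quartile convention you pick:

```python
import numpy as np

def tukey_outliers(data, k=1.5):
    """Flag points outside Tukey's fences: [Q1 - k*IQR, Q3 + k*IQR]."""
    data = np.asarray(data, dtype=float)
    # np.percentile defaults to linear interpolation between order
    # statistics; other quartile conventions give different fences,
    # especially for tiny samples like n=4.
    q1, q3 = np.percentile(data, [25, 75])
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return data[(data < lo) | (data > hi)], (lo, hi)

# Candid Engineer's example: does 5.5 get flagged?
values = [4.3, 4.2, 4.4, 5.5]
flagged, fences = tukey_outliers(values)
print(f"fences = {fences}, flagged = {flagged}")
# This convention gives fences of roughly (3.675, 5.275), so 5.5 is
# flagged; Tukey's original median-of-halves hinges give an upper
# fence of 6.0, under which nothing is flagged.
```

With n=4, reasonable conventions disagree about the very same data point, which is exactly why the method used should be reported rather than left implicit.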
The shift in attitude that Candid Engineer describes seems pretty accurate to me. When we learn science, it isn’t just problem sets that feed the expectation that there is a right answer. Lab courses do it, too, sometimes subjecting your work to grading schemes that dock points depending on how far your measured results are from the expected answer, regardless of your care and skill in setting up and executing the experiment. Under these circumstances, it can seem like the smart thing to do is to find that right answer by whatever means are available. If you know roughly what the right answer is, throwing out data that don’t fit your expectations seems reasonable, because those are the measurements that have to be wrong.
While this may be a reasonable strategy for getting through the canned “experiments” in a lab course, it’s not such a great strategy when you are involved in actual research, that is, trying to build new knowledge by answering questions to which we don’t know the answers. In that situation, the data are all you have with which to figure out the phenomenon, and ultimately you’re accountable to the phenomenon rather than to the expectations you bring to your experimental exploration of it. Your expectations, after all, could be wrong. That’s why you have to do the research rather than just going with your expectations.
This attitudinal shift, and the difficulty one might experience making it, points to another issue Candid Engineer raises: there isn’t as much explicit teaching of data handling and analysis as there ought to be, especially for graduate students who are learning to do real research. This is why it’s easy for habits picked up in laboratory courses to set down deeper roots in graduate school. Such habits are also encouraged when mentors identify “good data” with “data that show what we want to show” rather than with robust, reproducible data that give us insight into features of the system being studied.
Depending on what the phenomena under study are really like (something we’re trying to work out from the data), it’s quite possible that messy data are an accurate reflection of a complicated reality.
Now, in the comments on Candid Engineer’s post, there is much discussion about whether Candid Engineer’s prior strategies of data handling were ethical or scientifically reasonable (as well as about whether other researchers standardly deal with outliers this way, and what methods for dealing with outliers might be better, and so on). So I figured I would weigh in on the ethics side of things.
Different fields have different habits with respect to their treatment of outliers, error bars, and statistical analyses. Doing it differently does not automatically mean doing it wrong.
Whatever methods one uses for dealing with outliers, error bars, and statistical analyses, the watchword should be transparency. Communications about results at every level (whether in published articles or in discussions within a lab group) should be explicit about the reasons for dropping particular data points, and clear about what kinds of statistical analyses were applied to the data and why.
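As a rough sketch of what that transparency might look like in practice (the function name and the exclusion reason below are hypothetical, invented for illustration), the idea is to let the dropped points and the stated reason travel alongside the summary statistics, rather than silently trimming:

```python
import numpy as np

def summarize_with_provenance(data, exclude_idx=(), reason=""):
    """Summarize a dataset while keeping an explicit record of which
    points were dropped and why."""
    data = np.asarray(data, dtype=float)
    keep = np.ones(len(data), dtype=bool)
    keep[list(exclude_idx)] = False  # mark the excluded points
    kept = data[keep]
    report = {
        "n_total": len(data),
        "n_kept": len(kept),
        "excluded_values": data[~keep].tolist(),
        "exclusion_reason": reason,
        "mean": kept.mean(),
        "sem": kept.std(ddof=1) / np.sqrt(len(kept)),
    }
    return kept, report

# Hypothetical example: the exclusion and its reason are reported
# together with the statistics computed from the remaining points.
kept, report = summarize_with_provenance(
    [4.3, 4.2, 4.4, 5.5],
    exclude_idx=[3],
    reason="replicate 4 flagged in lab notebook (hypothetical)",
)
print(report)
```

Nothing about this forces a particular statistical choice; it just makes whatever choice was made visible to readers and labmates.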
Honesty is central to scientific credibility, but I think we need to recognize that what counts as honest communications of scientific results is not necessarily obvious to the scientific trainee. The grown-up scientists training new scientists need to take responsibility for teaching trainees the right way to do things — and for modeling the right ways themselves in their own communications with other scientists. After all, if everyone in a field agrees that a particular way of treating outlying data points is reasonable, there can be no harm in stating explicitly, “These were the outliers we left out, and here’s why we left them out.”
On the other hand, if your treatment of the data is a secret you’re trying to protect — even as you’re discussing those results in a paper that will be read by other scientists — that’s a warning sign that you’re doing something wrong.
Hat-tip: Comrade PhysioProf