The Significance of Negative Data

Spurred on by some comments left by Coturnix on the Three Types of Experiments entry, and by the Microparadigm paper (see my entry, and another discussion of this paper at In the Pipeline), I now present to you ... the significance of negative data.

Now most of the older (and well-read) philosophers of data such as Kuhn, Popper and Feyerabend were obsessed with the physical sciences, and as Ernst Mayr has pointed out in several books, their ideas are less applicable to the life sciences. Even the principle of Occam's Razor often fails biology. The main mechanisms that operate within a living organism arose from evolution, a producer of diversification. Every system (think signal cascades, morphogen programs, circadian mechanisms, molecular motors) is a baroque assembly of proteins, each with its own intricate parts. Each protein may resemble other related proteins, but has diverged to meet the needs of the organism at this point in time. At first glance, our assumptions as to what the underlying mechanisms "should be" are often much more simplistic than the mechanisms that are eventually found. Often this is not because nature is weird but because we make naive assumptions about natural systems.

So how does one study biology?

Well, in simplistic terms, lots of biology is exploratory ... much more so than the physical sciences. We are explorers of the weird and wonderful mechanisms produced by natural selection.

Hold on ... is this really true for all research done in biology?

At this point we come back to the three types of experiments. From my previous post:

Type A Experiment: every possible result is informative.
Type B Experiment: some possible results are informative, other results are uninformative.
Type C Experiment: every possible result is uninformative.

There is even a little saying that accompanies this ...

The goal is to maximize type A and minimize type C.

In that post I then stated that type B experiments were actually the most important. Well, let's rephrase our three types of experiments in terms that more accurately describe their function in how the life sciences operate:

Type B Experiment: exploratory science
Type A Experiment: explanatory science
(Type C Experiment: forget about it or ask yourself what am I trying to demonstrate)

Just like an expedition team, as a life scientist you first attempt to find something new (type B experimentation). Then once the strange and wonderful product of natural selection is discovered, you then work out the details of the mechanism (type A experimentation).

As you can see my argument is that:

Type B Experiment = exploratory science = some possible results are informative, other results are uninformative

What do I mean? When exploring the vast possible space of what "might be out there" you will often get negative data. The question becomes: is that negative data informative? (This is the point that Coturnix and others have raised.) To a certain degree, all data is informative. However I would argue that negative data from exploratory experiments is not that informative for two reasons:

1 - In exploring the vast space of possibilities, to say that a particular biological mechanism does not occur is not saying very much. If I told you that I couldn't find any evidence for my crazy idea, you would probably reply "big deal". But if I had convincing evidence that supported my crazy idea, you would say "wow" (I hope).
2 - A failure to find such a mechanism can occur through an improper experiment. Furthermore the reason that a certain experiment is "improper" may not be apparent to any reasonable researcher. If I told you that I couldn't find any evidence for my crazy idea, you could always say "maybe you didn't try hard enough". Or you could say "maybe you used the wrong technique". This is why grad students and postdocs on occasion hate their PIs.

I'll explain with three examples, one fictional, two real.

Example 1: The lunar cell cycle check point (fictional) from a comment I left:

Say that your hypothesis is the lunar check point ... cells divide normally unless there is a full moon. Well that would be easy to test - measure doubling rates of your cells every day and see if there is a drop in multiplication the day that there is a full moon. Now if the drop happens, you've gained some potential insight: "Ha, the moon may activate a cell division check point - although my evidence is correlative". And that would be a major discovery. But if you never saw a decrease in cell division, you would have negative data. This (IMHO) makes this a type B experiment.

Now, negative data is data. But it's weak data. It doesn't formally disprove the hypothesis. For example one may say - "Well these are tissue culture cells and so they are transformed and maybe missing the check point, but perhaps other cells have the check point." or "Maybe the cells have to be grown in direct contact with moon light for the checkpoint to be activated" or "Maybe a subset of cells in the hypothalamus detect the signal and send a secondary signal to other cells" ... etc.
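To put rough numbers on why a negative result is weak data, here is a minimal sketch using Bayes' rule for a yes/no experiment. Everything in it is an assumption made for illustration: the function name `posterior_after_result` is hypothetical, and the prior, the chance of detecting the check point if it exists, and the false positive rate are invented values, not measurements.

```python
def posterior_after_result(prior, p_detect_if_true, p_false_positive, positive):
    """Belief in the hypothesis after one yes/no experiment (Bayes' rule).

    prior             -- belief in the hypothesis before the experiment
    p_detect_if_true  -- chance the experiment shows the effect if it is real
    p_false_positive  -- chance the experiment shows the effect if it is not real
    positive          -- True if the effect was seen, False for negative data
    """
    if positive:
        like_true = p_detect_if_true
        like_false = p_false_positive
    else:
        like_true = 1.0 - p_detect_if_true
        like_false = 1.0 - p_false_positive
    evidence = prior * like_true + (1.0 - prior) * like_false
    return prior * like_true / evidence


# Illustrative, made-up numbers for the lunar check point:
prior = 0.01        # a crazy idea: 1% belief before the experiment
sensitivity = 0.3   # wrong cells, no moonlight, etc.: the effect is easy to miss
false_pos = 0.01    # a spurious drop in doubling rate is rare

after_negative = posterior_after_result(prior, sensitivity, false_pos, positive=False)
after_positive = posterior_after_result(prior, sensitivity, false_pos, positive=True)

print(f"belief after negative data: {after_negative:.4f}")
print(f"belief after positive data: {after_positive:.4f}")
```

With these invented numbers the negative result only nudges belief from 1% down to about 0.7%, while a positive result would raise it to about 23%. The asymmetry comes from the low chance of detection: the more ways the experiment could have missed a real effect, the less a negative result tells you.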

Example 2: Neurogenesis.

It was long believed that you are born with all the neurons that you'll ever have. Researchers have recently discovered that this is FALSE. From a previous entry:

Dan Gilbert explained that stress came in two forms: a positive stress that challenges us and that may enhance neurogenesis, and a negative stress (think despair) that actually inhibits neurogenesis. Animals in cages are under heavy negative stress, and as Gilbert explained "that's why no one ever observed neurogenesis in the lab."

Now to get back to the three types of experiments ... Many inexperienced researchers fail to perform control experiments and this often transforms explanatory science from type A into type B experiments (or into type C experiments). It is important to recognize that if you aren't performing exploratory experiments you should try to perform enough controls so that any particular result that you get from your experiment is informative. You should also perform controls when you undertake exploratory science, but it's hard to control for everything.

Last example:

How could Ferdinand and Isabella control for Columbus' exploratory trip to China? And he made a big discovery - not the one he was looking for. But even if he was just looking for new land it would have been hard to perform a control experiment. And if Columbus turned back a day before he reached the Caribbean, could he have said, "there is no land out there"?

I like the new terminology much better.

I am still struggling, though, so set me straight if this, for example, is A or B:

There are 2 competing hypotheses about the way something works. An experiment is performed. Data are negative. This eliminates one hypothesis and leaves the other one the winner by default.

How about this one:

There are 10 competing hypotheses about the way something works. An experiment is performed. Data are negative. This eliminates 1 hypothesis. We are left with 9, which is an improvement. We do the 2nd experiment, which, with negative data (i.e., "no effect of X on Y"), eliminates another one, leaving 8 hypotheses alive. And so on, until, by elimination, over a course of several experiments, all yielding negative data, we are left with one winning hypothesis by default, sorta like the Survivor show.

How about this variation:

There are 10 competing hypotheses about the way something works. We assume that the biological mechanism is evolutionarily quite conserved and should work the same way, with minor variations, in, let's say, all vertebrates. We run into walls in our work on mice. The problem appears intractable for the time being. Idea? Try a different species (e.g., a bird or lizard or fish). You get negative results that automatically eliminate 5 out of 10 hypotheses from the "vertebrate basic mechanism". This is a huge improvement. Subsequent work in mice, now that we know what we are looking for, just confirms that the mouse and the other animals do actually have basically the same mechanism, and the follow-up in either mice or other animals can then be done to eliminate another 4 hypotheses.

"Negative Data is a Window in the Temple of Knowledge". Ha, my prof. Bill Lovis put it that way and I've never forgotten it.

I like the new terminology much better.

Thanks, although the old terminology is not mine.

I am still struggling, though, so set me straight if this, for example, is A or B:
There are 2 competing hypotheses about the way something works. An experiment is performed. Data are negative. This eliminates one hypothesis and leaves the other one the winner by default.

A real-life example of this would be whether the Golgi gets reabsorbed into the ER during mitosis. Is this exploration? You won't like my answer but it's somewhere in between exploration and explanation. It's not really anything beyond our current knowledge; it's like saying process X either requires ATP or it doesn't. In the end it's explaining how process X works. As for one experiment knocking out one hypothesis, in the best of scenarios, yes. But as I point out (and I guess Feyerabend would agree although I dissed him) a negative result could always be explained away with "you didn't use the right method and your negative result is the product of your potentially flawed procedure." And it's sometimes hard (or even impossible) to tell if an experiment is flawed (look at the neurogenesis example). And this is more so for exploratory experiments. If you read up on the Golgi war - each side criticizes the other side's negative results in this precise way.

In many cases exploratory work does not deal with 1 of 10 possibilities, but with shots in the dark. It's usually "does this weird thing happen, or not" rather than 10 theories. And even if there are 10 main theories, the ultimate winner is usually something that wasn't thought of at first (theory 11) or a hybrid theory (which you can argue is also another theory in itself).

I guess if you can define the parameters (it has to be 1,2,3 ... 10) then you are defining the system and your work is explanatory. If you have no clue what the parameters might be (i.e. Columbus) then you are exploring.