The Significance of Negative Data

By apalazzo on September 28, 2007.

Yes you've guessed it, I'm in Italy. Here is another entry dealing with scientific thinking.

Spurred on by some comments left by Coturnix on the Three Types of Experiments entry, and by the Microparadigm paper (see my entry, and another discussion of this paper at In the Pipeline), I now present to you ... the significance of negative data.

Now most of the older (and well-read) philosophers of data such as Kuhn, Popper and Feyerabend were obsessed with the physical sciences, and as Ernst Mayr has pointed out in several books, their ideas are less applicable to the life sciences. Even the principle of Occam's Razor often fails biology. The main mechanisms that operate within a living organism arose from evolution, a producer of diversification. Every system (think signal cascades, morphogen programs, circadian mechanisms, molecular motors) is a baroque assembly of proteins, each with its own intricate parts that may resemble other related proteins but have diverged to meet the needs of the organism at this point in time. At first glance, our assumptions as to what the underlying mechanisms "should be" are often much more simplistic than the mechanisms that are eventually found. Often this is not because nature is weird but because we make naive assumptions about natural systems.

So how does one study biology?

Well, in simplistic terms, lots of biology is exploratory ... much more so than the physical sciences. We are explorers of the weird and wonderful mechanisms produced by natural selection.

Hold on ... is this really true for all research done in biology?

At this point we come back to the three types of experiments. From my previous post:

Type A Experiment: every possible result is informative.
Type B Experiment: some possible results are informative, other results are uninformative.
Type C Experiment: every possible result is uninformative.

There is even a little saying that accompanies this ...

The goal is to maximize type A and minimize type C.

In that post I then stated that type B experiments were actually the most important. Well, let's rephrase our three types of experiments in terms that more accurately describe their function in how the life sciences operate:

Type B Experiment: exploratory science
Type A Experiment: explanatory science
(Type C Experiment: forget about it or ask yourself what am I trying to demonstrate)

Just like an expedition team, as a life scientist you first attempt to find something new (type B experimentation). Then once the strange and wonderful product of natural selection is discovered, you then work out the details of the mechanism (type A experimentation).

As you can see my argument is that:

Type B Experiment = exploratory science = some possible results are informative, other results are uninformative

What do I mean? When exploring the vast possible space of what "might be out there" you will often get negative data. The question becomes: is that negative data informative? (This is the point that Coturnix and others have raised.) To a certain degree, all data is informative. However I would argue that negative data from exploratory experiments is not that informative for two reasons:

1 - In exploring the vast space of possibilities, to say that a particular biological mechanism does not occur is not saying very much. If I told you that I couldn't find any evidence for my crazy idea, you would probably reply "big deal". But if I had convincing evidence that supported my crazy idea, you would say "wow" (I hope).
2 - A failure to find such a mechanism can occur through an improper experiment. Furthermore the reason that a certain experiment is "improper" may not be apparent to any reasonable researcher. If I told you that I couldn't find any evidence for my crazy idea, you could always say "maybe you didn't try hard enough". Or you could say "maybe you used the wrong technique". This is why grad students and postdocs on occasion hate their PIs.

I'll explain with three examples, one fictional, two real.

Example 1: The lunar cell cycle check point (fictional) from a comment I left:

Say that your hypothesis is the lunar check point ... cells divide normally unless there is a full moon. Well that would be easy to test - measure doubling rates of your cells every day and see if there is a drop in multiplication the day that there is a full moon. Now if the drop happens, you've gained some potential insight: "Ha, the moon may activate a cell division check point - although my evidence is correlative". And that would be a major discovery. But if you never saw a decrease in cell division, you would have negative data. This (IMHO) makes this a type B experiment.

Now, negative data is data. But it's weak data. It doesn't formally disprove the hypothesis. For example one may say - "Well these are tissue culture cells and so they are transformed and maybe missing the check point, but perhaps other cells have the check point." or "Maybe the cells have to be grown in direct contact with moon light for the checkpoint to be activated" or "Maybe a subset of cells in the hypothalamus detect the signal and send a secondary signal to other cells" ... etc.
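As a rough illustration of the lunar check point test described above, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the function name `full_moon_drop`, the day numbering, and the doubling rates are invented, not real measurements. It only computes the difference in mean doubling rate between ordinary days and full-moon days, which is the correlative comparison the example talks about.

```python
from statistics import mean

def full_moon_drop(doublings_per_day, full_moon_days):
    """Mean doubling rate on ordinary days minus mean rate on full-moon days.

    doublings_per_day: one measured rate per day (index = day number).
    full_moon_days: set of day numbers on which there was a full moon.
    Both inputs are hypothetical; this is not a real assay protocol.
    """
    moon = [r for d, r in enumerate(doublings_per_day) if d in full_moon_days]
    other = [r for d, r in enumerate(doublings_per_day) if d not in full_moon_days]
    return mean(other) - mean(moon)

# Made-up numbers: ten days of measurements, full moon on day 4.
rates_with_drop = [1.0, 1.0, 1.0, 1.0, 0.5, 1.0, 1.0, 1.0, 1.0, 1.0]
rates_flat = [1.0] * 10

drop = full_moon_drop(rates_with_drop, {4})
no_drop = full_moon_drop(rates_flat, {4})
print(drop, no_drop)
```

A clear positive difference would be the "wow" result, although still only correlative. A difference near zero is the negative result, and the code cannot tell you whether the check point does not exist or whether the cells, the growth conditions, or the measurement were simply the wrong ones.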

Example 2: Neurogenesis.

It was long believed that you are born with all the neurons that you'll ever have. Researchers recently have discovered that this is FALSE. From a previous entry:

Dan Gilbert explained that stress came in two forms: a positive stress that challenges us and that may enhance neurogenesis, and a negative stress (think despair) that actually inhibits neurogenesis. Animals in cages are under heavy negative stress, and as Gilbert explained "that's why no one ever observed neurogenesis in the lab."

Now to get back to the three types of experiments ... Many inexperienced researchers fail to perform control experiments and this often transforms explanatory science from type A into type B experiments (or into type C experiments). It is important to recognize that if you aren't performing exploratory experiments you should try to perform enough controls so that any particular result that you get from your experiment is informative. You should also perform controls when you undertake exploratory science, but it's hard to control for everything.

Last example:

How could Ferdinand and Isabella control for Columbus' exploratory trip to China? And he made a big discovery - not the one he was looking for. But even if he had just been looking for new land, it would have been hard to perform a control experiment. And if Columbus had turned back a day before he reached the Caribbean, could he have said, "there is no land out there"?

I have thought for a while now that a Journal of Negative Results could be very useful, btw. It would serve as a means of seeing how previous researchers have tried to address a problem, and let future workers avoid making the same mistakes.

I love good posts on methodology.

Even the principle of Occam's Razor often fails biology.

I would say that the principle is fine, but it is less likely to give correct theories as default in biology.

Also AFAIU the razor is used with great effect in specific areas. Likelihood methods are used in cladistics, and interpreted as Bayesian likelihoods they express parsimony.

The principle of parsimony is used for several reasons:

- It suggests using simpler formal theories first, which are simpler to test and less likely to contain errors.
- It gives fewer reversals in knowledge, i.e. the theory may be wrong but we save on the number of failures.
- Nature is basically simple for reasons of symmetries.

But the basic simplicity of fundamental theories guarantees that applications are complicated instead. For example in physics, which I know more of, EM theory may be simple but the near- to far-field coupling and its description around an antenna is anything but. And evolutionary biology may have a simple formal principle in inheritance, but the applications (first mechanisms and then, twice removed, specific outcomes) are anything but.

On top of that inherent complexity evolution is the ultimate tinkerer. Life is just not fair. :-)

Occam's Razor doesn't fail in all areas of biology; genetics, especially genetics from the mid-20th century, used this a ton. Jacob and Wollman, for example, brilliantly applied it to deducing that bacteria had a circular genome, though the particular mechanism they cited was a little inaccurate.
