The Three Types of Experiments

I'm not sure about the history of "the three types of experiments" (3tes), but they are referred to quite often in the labs I've been in. So what exactly are they?

Here goes ...

Type A Experiment: every possible result is informative.
Type B Experiment: some possible results are informative, other results are uninformative.
Type C Experiment: every possible result is uninformative.
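One compact way to state the classification (a minimal formalization of my own, not part of the canonical 3tes):

```python
# Classify an experiment by which of its possible results would
# actually be informative (my own minimal formalization of the 3tes).
def experiment_type(results_informative):
    """One boolean per possible result: True if that result teaches you something."""
    if all(results_informative):
        return "A"  # every possible result is informative
    if any(results_informative):
        return "B"  # some results informative, others not
    return "C"      # every possible result is uninformative

print(experiment_type([True, True]))    # A
print(experiment_type([True, False]))   # B
print(experiment_type([False, False]))  # C
```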

There is even a little saying that accompanies this ...

The goal is to maximize type A and minimize type C.

There are some who even number the 3tes 1 through 3 instead of A, B, and C. I have a few comments about the whole "three types of experiments" thing (hey, that's what blogs are for!).

First, the 3tes are very helpful for younger, inexperienced scientists. Whenever you perform an experiment, ask yourself "what will I learn from this?" or, better, "what will I learn from each potential result?" This not only helps you avoid performing nonsensical (i.e. type C) experiments, it also helps you think about setting up the proper control experiments. These controls often turn experiments of the type C variety into the type A/B variety.

Second, about that saying I quoted above ... I have a problem with it. The saying implies that type A experiments are the most important ones, when really it's type B experiments that give you the most insight.

What do I mean by that?

One main activity of scientists is to figure out how things work. At the beginning of a project we mull over observations we have made, we read the literature to see what others have said, and then we come up with a hypothesis. "This may be how it works," we exclaim.

Our hypothesis may be ingenious or pedestrian, but there are probably hundreds of valid hypotheses that one could conceive. Well then what? ... start an institute and hire a PR firm?

No. The best approach is to figure out the implications of your hypothesis, and then to see whether they correspond to reality. (That's science!) These implications are called "testable predictions", and testing them is a type B experiment. If the prediction holds, the hypothesis is good (for now); if the prediction doesn't hold, our poor scientist must go back to the drawing board. In the end, science advances by this line of research. Type A experiments always suppose that your hypothesis is essentially correct and help address details such as "does it work this way or that way", "fast or slow", "does it require energy in the form of ATP" ... important questions, but as Kuhn would say, in the end they are filling in the holes.

So that's all I have to say about the 3tes, but if anyone knows "the history", please let me know.

{update 4/12/06} I just posted an entry on the significance of negative data that is relevant to this topic.


Dunno, I understood Type A to be the kind of experiment in which you test a hypothesis: if you get the positive result, your hypothesis is correct; if you get the negative result, the alternative hypothesis is right.

In other words, it is not just figuring out the details, but pitting competing hypotheses (2 or more) against each other, so that each result is informative.

Type B would be one in which a positive result confirms a hypothesis but a negative result is not informative at all, i.e., it can be explained by a whole slew of alternative hypotheses. Which is not bad: for instance, this type of experiment can eliminate one out of five alternative hypotheses, which leads you to design the next experiment to eliminate one out of four, then one out of three, then one out of two (and that last experiment is Type A).

Did I mischaracterize that?

I understood Type A to be the kind of experiment in which you test a hypothesis: if you get the positive result, your hypothesis is correct; if you get the negative result, the alternative hypothesis is right.

I guess, in the sense of a negative hypothesis. But I'd argue that often a negative hypothesis is uninformative. For example ... say that your hypothesis is the lunar checkpoint: cells divide normally unless there is a full moon. Well, that would be easy to test - measure the doubling rate of your cells every day and see if there is a drop in multiplication on the day of the full moon. Now if the drop happens, you've gained some potential insight ("Ha, the moon may activate a cell division checkpoint - although my evidence is correlative"). And that would be a major discovery. But if you never saw a decrease in cell division, you would have negative data. This (IMHO) makes this a type B experiment.
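To put the test in concrete terms, here is a minimal sketch (the doubling rates are invented, and the t-test is just one reasonable way to check for a drop):

```python
# Toy version of the full-moon test described above: compare cell
# doubling rates on full-moon days against all other days.
# All numbers are made up for illustration.
from scipy.stats import ttest_ind

full_moon_days = [0.92, 0.95, 0.90]  # doublings/day, hypothetical
other_days = [1.01, 0.98, 1.03, 0.99, 1.02, 0.97, 1.00, 1.04]

t_stat, p_value = ttest_ind(full_moon_days, other_days)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")

# A significant drop is (correlative) evidence for the checkpoint.
# No drop is only weak evidence against it - the cells might be
# transformed, or the conditions wrong - hence type B.
```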

Now as you point out, negative data is data. But it's weak data. It doesn't formally disprove the hypothesis. For example, one may say "well, these are tissue culture cells, so they are transformed and maybe missing the checkpoint, but perhaps other cells have the checkpoint" or "maybe the cells have to be grown in direct contact with moonlight for the checkpoint to be activated" ... etc.

I guess another definition of a type B experiment is one where one of the possible outcomes is a negative result that in most cases is uninterpretable.

(Aside: I was thinking of writing an entry on negative data ... I'll have to think about it.)

So basically, type A involves statements that are both verifiable and falsifiable, type B involves statements that are verifiable or falsifiable, and type C involves statements that are neither. Is that right or am I horribly mischaracterising?

Here's another way of thinking about type B experiments. They are the equivalent of sticking your neck out and asking whether your hypothesis might be true. If the experiment doesn't work, you haven't gained much. You can look at your failed attempt and ask: did it fail because my hypothesis is wrong, or because I did not perform the experiment right? It's this ambiguity of the negative result that makes type B experiments so challenging. But they are critical for science to proceed.

So basically, type A involves statements that are both verifiable and falsifiable, type B involves statements that are verifiable or falsifiable, and type C involves statements that are neither. Is that right or am I horribly mischaracterising?

I guess that's a Popperian way of looking at it. Although Popper's philosophy of science is good to think about, it mischaracterizes a lot of what science is really like. A better way of thinking about this is likelihood statistics.

You can find an excellent discussion of it here: http://mikethemadbiologist.blogspot.com/2005/11/krauthammer-we-are-all-…

In terms of likelihood statistics, all of type A's results are informative, in that each increases the statistical likelihood that your model resembles reality ("process X likely requires ATP", or "process X likely does not require ATP"). In type B experiments, one result increases the likelihood that your hypothesis is valid; the other result either diminishes or does not help your ability to ...
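To put a toy number on that (my own invented example, not from the linked post):

```python
# A hedged sketch of a likelihood comparison (all numbers invented).
# H1: process X requires ATP; H2: it does not. Deplete ATP in a
# series of trials and count how often X still occurs.
from scipy.stats import binom

n_trials, n_occurred = 20, 3

p_given_h1 = 0.10  # H1: X needs ATP, so only a 10% background rate
p_given_h2 = 0.80  # H2: X is ATP-independent, so it mostly still runs

like_h1 = binom.pmf(n_occurred, n_trials, p_given_h1)
like_h2 = binom.pmf(n_occurred, n_trials, p_given_h2)

print(f"likelihood ratio L(H1)/L(H2) = {like_h1 / like_h2:.3g}")
# A ratio far above 1 means the data favor H1. Nothing is "proven";
# the evidence just shifts the relative credibility of the models.
```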

APalazzo: thanks for the link. I haven't really come across likelihood statistics before.

Would I be right in saying that Popperian statistics is an absolutist approach (always pick the hypothesis that's most predictive and most parsimonious of those that have been explicitly tested) whereas likelihood statistics is a discriminative approach (pick the hypothesis that has the most "credibility" out of a number of viable hypotheses)?

Would I be right in saying that Popperian statistics is an absolutist approach ...

Yes - almost, but Popper was even more absolutist. According to Popper there are an unlimited number of possible theories, so he would say "pick a model that is not yet falsified by any experiment" (i.e., if a model is falsified, then pick another model, because there are always more models). But reality is messy. Experiments don't always work, there is a lot of bad data, and some experiments are more insightful than others. In this light, Popper is black and white and likelihood statistics is different shades of gray. Likelihood statistics would say (as you put it):

pick the hypothesis that's most predictive and most parsimonious of those that have been explicitly tested

Now I'm not sure what the difference is between the two statements (the one above, and the second one you stated: "pick the hypothesis that has the most 'credibility' out of a number of viable hypotheses"). They're different, but they both ask you to make a statistical judgement, in the sense that they ask you to evaluate the likelihood of each theory and pick the best one (I guess the picking method is slightly different).

I've been in grad school for 3 years and never heard of this 3T approach. It makes intuitive sense, although apparently the devil is in the details from the comments. Good food for thought. Thanks for posting it.

It is difficult to think about these in the abstract, so having an example, like the cells and the moon, is good for clearing one's thinking. I just think that negative results can be hugely informative, even if difficult to write up and publish. If you can say with certainty that something definitely DOES NOT work a certain way, especially if that is the way people thought it worked, you have made an important discovery.

Now I'm not sure what the difference is between the two statements (the one above, and the second one you stated: "pick the hypothesis that has the most 'credibility' out of a number of viable hypotheses"). They're different, but they both ask you to make a statistical judgement, in the sense that they ask you to evaluate the likelihood of each theory and pick the best one (I guess the picking method is slightly different).

The "pick the hypothesis that's most predictive and most parsimonious of those that have been explicitly tested" allows us to choose one hypothesis as being right at that particular point in time. The "credibility" version allows us to choose one hypothesis as being optimal at any given point in time. The difference being that the former implies that believing anything else is in some sense unacceptable, whereas the latter merely implies that believing anything else is in some sense not as good.

I personally think the former sounds better because it implies some sort of ideal - an ideal that I tend to refer to as the "reality-based community" - whereas the latter is more shades-of-grey. I'm aware that this is probably a feature of being young and idealistic.

Of course, a more immediately relevant partition is Aa: Experiments where every outcome results in a paper; Ba: Experiments where some outcomes result in a paper; and Ca: Experiments where no outcome will result in a paper. This partition, by the way, is not coincident with the original one.

You _definitely_ want to spring for Aa over the other two.

You have left out another popular statistical approach, called the "flinging snot at a blanket" method.

Janne knows what I'm talking about. Who cares if it's right? Just crank out everything you can squeak past the reviewers and let the "market of ideas" sort everything out.

The most efficient way to do this, of course, is to pick the pet theories of the people who sit on the NIH study sections, and design your experiments to falsify or support those theories. Then do a lot of immunohistochemistry, throw in some overloaded gels, maybe a 40-cycle PCR if you have to. Et voila, instant career!

So what percent of papers are a result of slung snot? That's what I'd like to know.

By Acme Scientist on 06 Apr 2006

Wow, I hadn't been back here in a while, and it seems as if we have descended into a discussion about snot.

As for fiction vs. reality in the science lit, one PNAS article that modeled the scientific literature (see this entry: http://scienceblogs.com/transcript/2006/04/microparadigms_what_universe… ) estimates that 90% of the lit in certain fields is crap. Although in another model compatible with the data, the estimate of crap ain't so bad.

I think the key is: how well matched is your technique to what you are trying to learn?

For example, say you want to know "Do glial calcium elevations cause a modulation of synaptic transmission?".

The perfect experiment would be to trigger large calcium elevations in all the glia near your synapse, with a stimulus that has no effects on anything but glial calcium. Then either a positive or a negative result is interesting. Since this is difficult to do, you compromise and use a technique that is not so well suited to your question.

On one extreme, you could use a stimulus that activates *everything*, including glial calcium. Then if you see modulation, you don't learn anything, but if you see no modulation, you learn that the answer to your question is "no".

On the other extreme, you could use a selective stimulus that triggers a calcium elevation in only one glial cell. Then, if you see modulation, your answer is yes, but if you see no modulation, you learn nothing.
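Tabulating that logic (my own summary of the three cases above, not the commenter's words):

```python
# Outcome logic of the glial-calcium example: how informative each
# result is depends on how selective the stimulus was.
outcomes = [
    # (stimulus, modulation seen?, what you learn)
    ("all nearby glia, glial Ca2+ only", "yes", "glia modulate the synapse (type A)"),
    ("all nearby glia, glial Ca2+ only", "no",  "they do not (type A)"),
    ("activates everything",             "yes", "nothing - too many possible causes"),
    ("activates everything",             "no",  "the answer is no"),
    ("a single glial cell",              "yes", "the answer is yes"),
    ("a single glial cell",              "no",  "nothing - stimulus may be too weak"),
]
for stimulus, modulated, lesson in outcomes:
    print(f"{stimulus:<34} modulation={modulated:<3} -> {lesson}")
```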