Gene Expression

Every ratio 3:1!!!

Science isn’t perfect; it often misses obvious truths. Consider the 2005 Nobel in medicine, awarded for the work of Barry Marshall and J. Robin Warren in establishing the connection between Helicobacter pylori and ulcers. After the fact you hear many stories of doctors who had stumbled onto the solution, antibiotics, long before the scientific consensus. Many others now understood why they always saw these pathogens in samples taken from patients with ulcers. Now it all makes sense, but these sorts of screw-ups make you wonder how far we’ve gone past Galen! Falsification is a decent formalization of the scientific process if you distill it down to its bare essentials, but it ignores the reality that science is executed by people, not computers. Thomas Kuhn’s work in The Structure of Scientific Revolutions speaks to that sociological reality: instead of a gleaming geometrical crystal city, natural science is filled with booming unplanned towns, citadels being swarmed by unexpected squatters, and castles in the hinterlands striving in vain to maintain their relevance. Even mathematics, that most rational of disciplines, is driven by an engine of intuitive insight and gestalt understanding, no matter the clean final product carved from axioms. Alas, science has a low signal-to-noise ratio, but to paraphrase Winston Churchill, it’s the best system we’ve got.

Of course, because of the socially contextual nature of much of science there is a niche for historians and sociologists to study it as a subculture. It is on the great mound of noise in which the signal swims that Will Provine has established his career as the historian of evolutionary genetics. His biography of the American population geneticist Sewall Wright displayed not only an encyclopedic knowledge of the personalities who touched Wright’s life, but the technical details of the theoretical biology which served as his legacy. It was with an understanding of this background that I came to Provine’s The Origins of Theoretical Population Genetics.

Basically a slim elaboration on his Ph.D. thesis at the University of Chicago, this text explores the social and scientific dynamics between the initial high tide of the Darwinian phase in evolutionary theory and the reemergence of its primacy during the 1920s, as population genetics fused the Mendelian framework with the wealth of statistical tools that were found in the biometrical school. In the interregnum Darwin’s original ideas, which emphasized the importance of natural selection on continuous variation as the primary motive force for evolutionary change, were relegated to the margins. A thorough survey of this period can be found in Peter J. Bowler’s The Eclipse of Darwinism, but Provine’s work is more narrowly focused, and tends to put the spotlight upon individuals rather than grand social movements. The importance of personality in inflating semantic confusions and mediating sociological dynamics shows exactly where much of the noise in the scientific system comes from.

In short, Provine’s thesis centers around the conflict between the Mendelians, led by William Bateson, and the biometricians, headed by Karl Pearson (of Pearson’s correlation coefficient fame), and the subsequent fusion which culminated in R.A. Fisher’s 1918 paper, The Correlation between Relatives on the Supposition of Mendelian Inheritance. The conflict between these two groups was in part on genuine scientific grounds, but Provine makes it clear that personal animosity, turf wars and inability to master the methodologies of the other side perpetuated a discord which was really much ado about nothing (and resulted in far less getting done).

The dispute had its seeds in the somewhat confused ideas of Francis Galton in the field of evolution. Unlike his cousin Charles Darwin and Alfred Russel Wallace, Galton did not believe that natural selection upon continuous variation within populations was sufficient to explain evolutionary change. Like many scientists, including Thomas Huxley, Galton contended that evolution was due to the emergence of unique mutant forms, “sports,” which were at sharp discontinuity with the normal variation within a population. Galton did not accept that selection upon continuous variation would induce evolutionary change because he had some peculiar ideas in regards to regression toward the population mean. He seemed to posit some sort of innate stabilizing factor within a population which kept it around a species typical mean, bounded by its range and characterized by a particular variance. So individuals at the extremes would give rise to offspring who would regress back toward the mean of the population. Mutant varieties on the other hand might offer the opportunity to break out of this tendency by generating de novo a new central tendency. Pearson, Galton’s protege, pointed out that he neglected to consider that repeated generations of assortative, or selective, mating of exceptional individuals would avoid the problem of regression back toward the ancestral mean as “mediocrity” (that is, random mating of exceptional individuals with less than exceptional ones) would not dilute the offspring and successive population means would be established.1

But Galton’s specific issue with natural selection was only a small part of the puzzle. The bigger problem was that the dominant mode of thinking in regards to inheritance of traits from parent to offspring in Darwin’s day was blending, that is, the characters of parents are synthesized in a fashion where the offspring are a byproduct which reflects a mix of both parents. But there is a problem here: this process exhausts the variation that Darwin envisages natural selection uses to drive evolution (natural selection resulting in differential fitness of individuals correlated with variation on heritable traits). Certainly there are ways to get around this problem, but the hand waving explanations proffered by Darwin seemed to lack the power to dodge the homogenizing tendency of blending inheritance. Blending inheritance is intuitive; I recall reading in science fiction many times about a distant future where all people are golden-brown, the natural range of human coloration supposedly erased by admixture. Unfortunately intuition is not always a good guide to how the world works, and in the case of inheritance it is wrong. Mendel was right: discrete transmission of traits allows variation to avoid obliteration in the process of sexual reproduction.

Of course it is famously known that Mendel’s paper of 1866, the solution to the nagging problem of the day in evolutionary biology, was roundly ignored. By 1900 his work seemed more propitious and suitable for the times, as multiple researchers rediscovered it simultaneously. Unfortunately the scientific landscape had changed in a negative fashion as well, and the Mendelian hypothesis became associated with macromutationists who generally rejected Darwinian natural selection as having any relevance for evolution. Additionally, in the interim between Mendel’s original work and 1900 a very peculiar thing happened to Francis Galton: his followers rejected his discontinuous model of evolution in favor of Darwinian gradualism, and attributed to him the foundation of a school of biology, biometrics, which would bear the selectionist banner for a generation. Much of Provine’s book details the bizarre tendrils of association between Galton, his protege Pearson and his intellectual fellow traveler in regards to evolution, William Bateson. Bateson agreed with Galton about the process that drove evolution, but Pearson was Galton’s intellectual heir, and Pearson claimed Galton as the precursor of the school of biology which opposed Bateson and the Mendelians and mutationists. Add to this the complexity of the fact that Galton posited a Law of ancestral heredity which was radically reworked by Pearson. The law kept its name and association with Galton and served as the linchpin of anti-Mendelian thought (Provine devotes an appendix to this Law, but he also admits in the text it was quickly thrown into the dust-bin of history once Mendelianism arrived).

I suspect that I’ve lost many people with the minutiae at this point, but I wanted to highlight how Byzantine and nonsensical the personal relationships were, and how they had a secondary effect on the march of ideas and the progress of science. By the first decade of the 20th century Bateson had assembled enough data to argue for the existence of Mendelian transmission of traits, but the biometricians dismissed Mendelianism as a trivial phenomenon. Mutationists around Hugo de Vries had connected their own ideas to Mendelianism, so the rise of both models tracked each other, and both took advantage of the eclipse of Darwinism because of the a priori problems relating to blending inheritance. Add to that that Wilhelm Johannsen’s research with pure lines, genetically uniform lineages, showed that variation within such lines was not heritable, and the muddle was complete. If there is no heritable variation for selection to act upon, then selection has no power to effect change, Q.E.D.

In the midst of this there were lone voices in the wilderness. For example, George Udny Yule presaged many of the obvious implications of both continuous traits and Mendelianism, and showed that they could be reconciled with ease. When the mathematically fluent Karl Pearson took up the attack against Mendelianism, Bateson had to rely on Yule to respond because he himself was notoriously mathematically inept.2 Nevertheless the rational arguments of Yule were far less impactful than those of the empiricists who showed that selection did have power. William E. Castle’s work with mice, Thomas Hunt Morgan’s groundbreaking research with Drosophila and H. Nilsson-Ehle’s experiments on seed color all showed that long term selection could change the character of a population.

By 1918, when R.A. Fisher published his groundbreaking paper, the world was ready, and science finally caught up with its lying eyes. I’ve read Fisher’s paper three times, and it is pretty hard to follow if you don’t have a background in statistics (the math isn’t technically hard, but the prose is difficult to follow and the mathematical logic makes leaps far too often for comfort). It is one of those “and it naturally follows” works where the author assumes that you can track logic across the empty chasms of unexplicated algebraic manipulations and derivations. The gist of Fisher’s paper is that the empirical data that the biometricians processed with their finely tuned statistical techniques was totally explicable assuming a large number of Mendelian loci a priori. For example, the variance between siblings on traits, like height, which are extremely heritable (assuming a reasonable nutritional background) is natural if you consider that the parents are highly heterozygous. Fisher proceeds to use his model to analyze data from the time period, and he begins down the road of using variance partitioning to ascertain the various factors which result in a statistical distribution.3

But really the convergence of Mendelianism and continuous traits does not necessitate immersing yourself in familial correlations and moments of a probability distribution. I mean, after all, for a large number of trials the binomial distribution approximates the normal. No seriously, look at these images:

I just took screen captures after slotting different N values into this applet (N=3, 6, 12, 24, 48). As N gets large the normal approximation gets better.
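If you don’t have the applet handy, the same point can be checked numerically. What follows is only a rough sketch, not the applet’s method: it assumes a fair binomial (p = 0.5, my choice) and compares each binomial probability with the normal density that has the same mean and standard deviation. The function names are made up for illustration.

```python
import math

def binomial_pmf(n, k, p=0.5):
    # Probability of exactly k successes in n independent trials.
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def normal_pdf(x, mean, sd):
    # Density of the normal distribution with the given mean and sd.
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def max_abs_gap(n, p=0.5):
    # Largest pointwise difference between the binomial probabilities
    # and the matching normal curve, over k = 0..n.
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    return max(
        abs(binomial_pmf(n, k, p) - normal_pdf(k, mean, sd))
        for k in range(n + 1)
    )

# Same N values as the screen captures; p = 0.5 is an assumption.
gaps = {n: max_abs_gap(n) for n in (3, 6, 12, 24, 48)}
for n, gap in gaps.items():
    print(f"N={n:2d}  largest gap between binomial and normal: {gap:.4f}")
```

The printed gap shrinks at every step as N doubles, which is the numerical version of the bars filling in under the bell curve.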

To translate this into a more genetical sense, imagine you have a locus, a position in the genome, that controls density of hair follicles on your skin. If the gene is on, it is O, and if it is off, it is o. Assume that the organism is diploid, so you have two copies of a given gene. So,

oo = hairless
Oo = hairy
OO = hairy

So O is dominant, right? But what if the density of hair is proportional to the number of O’s you have? Then you have:

oo = hairless
Oo = hairy
OO = hairiest

This is an additive situation. You now have three discrete phenotypes. Now, imagine you had a second locus. Let’s use a different letter, but the casing convention remains the same. Also, let’s assume that how hairy you are is proportional to the number of upper case letters, the “on” genetic variants (alleles). So, imagine:

oo, pp = hairless
OO, PP = hairiest

But there are many combinations in between, for example, the median hairiness can be expressed by any of these combinations:
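The in-between combinations can be listed mechanically. This is a minimal sketch under my own assumptions: “median hairiness” is taken to mean two of the four alleles are “on”, and the offspring ratios assume a cross between two heterozygotes at every locus (Oo, Pp × Oo, Pp), with unlinked loci and each allele equally likely to be passed on. The function names are hypothetical.

```python
from collections import Counter
from itertools import product

def genotype_classes(loci):
    """Group diploid genotypes by their number of upper-case ("on") alleles."""
    # Each locus has three unordered genotypes, e.g. "oo", "Oo", "OO".
    per_locus = [
        [letter.lower() * 2, letter.upper() + letter.lower(), letter.upper() * 2]
        for letter in loci
    ]
    classes = {}
    for combo in product(*per_locus):
        on_count = sum(ch.isupper() for genotype in combo for ch in genotype)
        classes.setdefault(on_count, []).append(", ".join(combo))
    return classes

def expected_ratio(n_loci):
    """Offspring ratio by "on" count when every allele copy is "on" with chance 1/2.

    Assumption: both parents are heterozygous at each locus and loci are unlinked.
    """
    counts = Counter()
    for alleles in product((0, 1), repeat=2 * n_loci):
        counts[sum(alleles)] += 1
    return [counts[k] for k in range(2 * n_loci + 1)]

classes = genotype_classes("op")
for on_count in sorted(classes):
    print(on_count, classes[on_count])

print(expected_ratio(1))  # one additive locus
print(expected_ratio(2))  # two additive loci
```

Under these assumptions the middle class holds three genotypes (oo, PP; Oo, Pp; OO, pp). Adding loci adds phenotype classes, 2n + 1 of them for n loci, and the ratios across them are the binomial coefficients, so the histogram starts to look like a bell curve.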



  1. #1 Steve Sailer
    January 28, 2006

    Razib writes:

    “[Galton] seemed to posit some sort of innate stabilizing factor within a population which kept it around a species typical mean, bounded by its range and characterized by a particular variance.”

    Reading N.W. Gillham’s biography of Galton, I was struck by how reminiscent Galton’s 19th Century thinking on the stability of species was to Stephen Jay Gould’s famous theory of “punctuated equilibria.”

  2. #2 Steve Sailer
    January 28, 2006

    This history reminds me of my seven-year-long argument with U. of Chicago economist Steven D. Levitt of “Freakonomics” fame over his theory that legalizing abortion cut the crime rate dramatically in America.

    Since 1999, Levitt has been playing the Karl Pearson role of the mathematically sophisticated insider by waving away my empirical criticism of his theory by claiming that his complex econometric modeling of state level abortion and crime data proves its validity. And I’ve been playing the William Bateson role of the mathematically simple-minded outsider who keeps pointing out that if you look at the national level data, the opposite of Levitt’s theory actually happened — the first cohort born after the legalization of abortion had triple the teen homicide rate of the last cohort born before legalization.

    Well, when two econometricians finally went through Levitt’s model in detail last year, it turned out that he had two technical errors that were fatal to his theory. Of course, Levitt’s theory remains the new conventional wisdom …

  3. #3 razib
    January 28, 2006

    I was struck by how reminiscent Galton’s 19th Century thinking on the stability of species was to Stephen Jay Gould’s famous theory of “punctuated equilibria.”

    yes, you aren’t the first.

  4. #4 Mark
    January 28, 2006

    Fascinating. Readers may also enjoy:
    Evolution by Jumps: Francis Galton and William Bateson and the Mechanism of Evolutionary Change, by Nicholas W. Gillham (Duke University)

    On “evolvability” and “facilitated variation”: New book, The Plausibility of Life: Resolving Darwin’s Dilemma by Marc W. Kirschner, John C. Gerhart (Yale UP October 2005). For review and discussion of “facilitated variation”:

    “Evolvability” Gerhart + Kirschner 1998

    On genetic robustness, environmental robustness, canalization, epistasis, evolvability, etc., see:
    Mutational Robustness, Modularity and Evolvability:
    Walter Fontana, Andreas Wagner.

    Continuity in Evolution: On the Nature of Transitions: Fontana + Schuster

    On “sin” and scientific knowledge, see “Saintly Resonances,” review by science historian Lorraine Daston of Dying to Know: Scientific Epistemology and Narrative in Victorian England by George Levine

  5. #5 Amit
    January 28, 2006

    Nice write up, Razib. I attended a seminar given by Will Provine recently. He actually brought in some of Sewall Wright’s original lab notebooks!

  6. #6 Matt McIntosh
    January 28, 2006

    Great post. Regarding the whole Popper/Kuhn thing I just see them as operating at different levels. Popper sketches out the logical/epistemic skeleton, Kuhn (and Polanyi) are more interested in describing the sociological/psychological meat. (Lakatos is mostly just Popper with a few questionable tweaks, and Feyerabend is the anti-philosopher who relishes in demolishing standards and leaving only anarchy in his wake.) I see these two angles as complements rather than substitutes when viewed properly, and find the messy sociological aspects of scientific inquiry just as interesting as the cleaner epistemic structure of it. Hence, Provine is now appended to my ever-growing reading list…

  7. #7 Steve Sailer
    January 28, 2006

    1. I believe that Bateson suggested the correct answer — that Mendelian genetics worked if you assume a lot of different genes — quite early in the controversy, but it somehow got dropped.

    2. The biometric approach is analog, the Mendelian approach is digital. Most things in the real world look analog, and it was quite difficult for scientists in the first half of the 20th Century to realize that the seemingly smooth curves of analog reality can be based on a microscopic digital reality. So, we should have some sympathy for those who couldn’t make the leap.

    3. By analogy, Pearson’s failure is somewhat reminiscent of Einstein’s distaste for quantum mechanics.

    4. At a macro level, the real world is often more like the statistical vision of Pearson than that of the earliest Mendelians. For example, the only thing most college graduates remember about genetics is the blue eye-brown eye model of dominant and recessive genes, which ill-equips them for thinking about more common issues in real world genetics, where a probabilistic perspective is more realistic.

  8. #8 David Boxenhorn
    January 29, 2006

    You don’t need a lot of variables to get a normal distribution, just a few variables and a little noise.

  9. #9 razib
    January 29, 2006

    You don’t need a lot of variables to get a normal distribution

    i have read in the pop genetics lit that anything with 4 or more loci is operationally normal and polygenic in terms of many experimental contexts. the power is simply too weak to detect even such large (presumed) discrete transitions.

  10. #10 Matthew Cromer
    January 30, 2006

    The great scientific sin is that we insist on certainty. Therefore we make our models more and more rigid in our minds, and throw out ideas and facts that do not fit within them. Lacunae develop in our vision of the world. And so we miss the importance of Helicobacter pylori, we miss the fact that moving plates generate mountains, we miss the fact that rocks really do fall from the sky. Because they just don’t fit in.

    The solution is a new kind of skepticism: skepticism towards our own models of reality. And the more ingrained and unquestioned the assumptions of that model are, the more skepticism we will need to muster about them.

    I like Rupert Sheldrake’s formulation of this:

    “I am skeptical of people who believe they know what is possible and what is not. This belief leads to dogmatism, and to the dismissal of ideas and evidence that do not fit in. Genuine skepticism involves an attitude of open-minded enquiry into what we do not understand, and this is the approach I try to follow.”

    Sheldrake was, of course, a full-fledged cleric of the scientific priesthood with publication credits for properly reductionistic chemo-morphological research in Nature, Science News and scores of more specialized journals. But his theory explaining how wholes (atoms, molecules, cells, organisms) were more than the sum of their constituent parts led to a sentence of anathema and a judgement that his book deserved burning. Was that condemnation an example of science working correctly, or did the collective enterprise of science just ignore the cure for ulcers yet again?

    Here is the collection of Rupert Sheldrake’s published research, both on conventional and more controversial topics:
