Science isn’t perfect; it often misses obvious truths. Consider the 2005 Nobel Prize in medicine, awarded to Barry Marshall and J. Robin Warren for establishing the connection between Helicobacter pylori and ulcers. After the fact you hear many stories of doctors who had stumbled onto the solution, antibiotics, long before the scientific consensus. Many others now understood why they always saw these pathogens in samples taken from patients with ulcers. Now it all makes sense, but these sorts of screw-ups make you wonder how far we’ve gone past Galen! Falsification is a decent formalization of the scientific process if you distill it down to its bare essentials, but it ignores the reality that science is executed by people, not computers. Thomas Kuhn’s work in The Structure of Scientific Revolutions speaks to that sociological reality: instead of a gleaming geometrical crystal city, natural science is filled with booming unplanned towns, citadels being swarmed by unexpected squatters, and castles in the hinterlands striving in vain to maintain their relevance. Even mathematics, that most rational of disciplines, is driven by an engine of intuitive insight and gestalt understanding, no matter how clean the final product carved from axioms. Alas, science has a low signal-to-noise ratio, but to paraphrase Winston Churchill, it’s the best system we’ve got.
Of course, because of the socially contextual nature of much of science, there is a niche for historians and sociologists to study it as a subculture. It is on the great mound of noise in which the signal swims that Will Provine has established his career as the historian of evolutionary genetics. His biography of the American population geneticist Sewall Wright displayed not only an encyclopedic knowledge of the personalities who touched Wright’s life, but also a command of the technical details of the theoretical biology which served as his legacy. It was with an understanding of this background that I came to Provine’s The Origins of Theoretical Population Genetics.
Basically a slim elaboration on his Ph.D. thesis at the University of Chicago, this text explores the social and scientific dynamics between the initial high tide of the Darwinian phase in evolutionary theory and the reemergence of its primacy during the 1920s, as population genetics fused the Mendelian framework with the wealth of statistical tools developed by the biometrical school. In the interregnum, Darwin’s original ideas, which emphasized natural selection on continuous variation as the primary motive force for evolutionary change, were relegated to the margins. A thorough survey of this period can be found in Peter J. Bowler’s The Eclipse of Darwinism, but Provine’s work is more narrowly focused, and tends to put the spotlight upon individuals rather than grand social movements. The importance of personality in inflating semantic confusions and mediating sociological dynamics shows exactly where much of the noise in the scientific system comes from.
In short, Provine’s thesis centers around the conflict between the Mendelians, led by William Bateson, and the biometricians, headed by Karl Pearson (of Pearson’s correlation coefficient fame), and the subsequent fusion which culminated in R.A. Fisher’s 1918 paper, The Correlation between Relatives on the Supposition of Mendelian Inheritance. The conflict between these two groups was in part on genuine scientific grounds, but Provine makes it clear that personal animosity, turf wars and an inability to master the methodologies of the other side perpetuated a discord which was really much ado about nothing (and resulted in far less getting done).
The dispute had its seeds in the somewhat confused ideas of Francis Galton in the field of evolution. Unlike his cousin Charles Darwin and Alfred Russel Wallace, Galton did not believe that natural selection upon continuous variation within populations was sufficient to explain evolutionary change. Like many scientists, including Thomas Huxley, Galton contended that evolution was due to the emergence of unique mutant forms, “sports,” which were at sharp discontinuity with the normal variation within a population. Galton did not accept that selection upon continuous variation would induce evolutionary change because he had some peculiar ideas in regards to regression toward the population mean. He seemed to posit some sort of innate stabilizing factor within a population which kept it around a species-typical mean, bounded by its range and characterized by a particular variance. So individuals at the extremes would give rise to offspring who would regress back toward the mean of the population. Mutant varieties, on the other hand, might offer the opportunity to break out of this tendency by generating de novo a new central tendency. Pearson, Galton’s protege, pointed out that Galton had neglected to consider repeated generations of assortative, or selective, mating among exceptional individuals. Such mating would avoid the problem of regression back toward the ancestral mean, because “mediocrity” (that is, random mating of exceptional individuals with less than exceptional ones) would not dilute the offspring, and successive new population means would be established.1
But Galton’s specific issue with natural selection was only a small part of the puzzle. The bigger problem was that the dominant mode of thinking about the inheritance of traits from parent to offspring in Darwin’s day was blending: the characters of the parents are synthesized so that the offspring reflect a mix of both. But there is a problem here: this process exhausts the variation that Darwin envisaged natural selection using to drive evolution (natural selection being the differential fitness of individuals correlated with variation in heritable traits). Certainly there are ways to get around this problem, but the hand-waving explanations proffered by Darwin seemed to lack the power to dodge the homogenizing tendency of blending inheritance. Blending inheritance is intuitive. I recall reading in science fiction many times about a distant future where all people are golden-brown, the natural range of human coloration supposedly erased by admixture. Unfortunately intuition is not always a good guide to how the world works, and in the case of inheritance it is wrong. Mendel was right: discrete transmission of traits allows variation to avoid obliteration in the process of sexual reproduction.
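To put a rough number on how quickly blending eats variation, here is a minimal simulation sketch. The assumptions are mine, not Provine’s: mating is random, every child is exactly the average of its two parents, and nothing adds new variation. The function name blend_generation and the population size are hypothetical choices for illustration. Under these assumptions the trait variance should fall by roughly half each generation.

```python
import random
import statistics

def blend_generation(population, rng):
    """Produce the next generation under blending inheritance.

    Assumption for this sketch: mating is random and each child is exactly
    the average of its two parents, with no new variation added.
    """
    return [
        (rng.choice(population) + rng.choice(population)) / 2
        for _ in range(len(population))
    ]

rng = random.Random(0)  # fixed seed so the run is repeatable
population = [rng.gauss(0, 1) for _ in range(20000)]
for generation in range(6):
    # Print the generation number and the current trait variance.
    print(generation, round(statistics.pvariance(population), 4))
    population = blend_generation(population, rng)
```

After a handful of generations there is very little left for selection to work with, which is the bind that Darwin’s hand waving could not escape.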
Of course it is famously known that Mendel’s paper of 1866, the solution to the nagging problem of the day in evolutionary biology, was roundly ignored. By 1900 his work seemed more propitious and suitable for the times, as multiple researchers rediscovered it simultaneously. Unfortunately the scientific landscape had changed in a negative fashion as well, and the Mendelian hypothesis became associated with macromutationists who generally rejected Darwinian natural selection as having any relevance for evolution. Additionally, in the interim between Mendel’s original work and 1900 a very peculiar thing happened to Francis Galton: his followers rejected his discontinuous model of evolution in favor of Darwinian gradualism, and attributed to him the foundation of a school of biology, biometrics, which would bear the selectionist banner for a generation. Much of Provine’s book details the bizarre tendrils of association between Galton, his protege Pearson and his intellectual fellow traveler in regards to evolution, William Bateson. Bateson agreed with Galton about the process that drove evolution, but Pearson was Galton’s intellectual heir, and Pearson claimed Galton as the precursor of the school of biology which opposed Bateson and the Mendelians and mutationists. Add to this the complication that Galton posited a Law of Ancestral Heredity which was radically reworked by Pearson. The law kept its name and association with Galton and served as the linchpin of anti-Mendelian thought (Provine devotes an appendix to this Law, but he also admits in the text that it was quickly thrown into the dust-bin of history once Mendelianism arrived).
I suspect that I’ve lost many people in the minutiae at this point, but I wanted to highlight how Byzantine and nonsensical the personal relationships were, and how they had a secondary effect on the march of ideas and the progress of science. By the first decade of the 20th century Bateson had assembled enough data to argue for the existence of Mendelian transmission of traits, but the biometricians dismissed Mendelianism as a trivial phenomenon. Mutationists around Hugo de Vries had connected their own ideas to Mendelianism, so the rise of both models tracked each other, and both took advantage of the eclipse of Darwinism because of the a priori problems relating to blending inheritance. Add to that the fact that Wilhelm Johannsen’s research with pure lines (genetically uniform lineages) showed that variation was not heritable, and the muddle was complete. If there is no heritable variation for selection to act upon, then selection has no power to effect change, Q.E.D.
In the midst of this there were lone voices in the wilderness. For example, George Udny Yule presaged many of the obvious implications of both continuous traits and Mendelianism, and showed that they could be reconciled with ease. When the mathematically fluent Karl Pearson took up the attack against Mendelianism, Bateson had to rely on Yule to respond because he himself was notoriously mathematically inept.2 Nevertheless the rational arguments of Yule were far less impactful than those of the empiricists who showed that selection did have power. William E. Castle’s work with hooded rats, Thomas Hunt Morgan’s groundbreaking research with Drosophila and H. Nilsson-Ehle’s experiments on seed color all showed that long-term selection could change the character of a population.
By 1918, when R.A. Fisher published his groundbreaking paper, the world was ready, and science finally caught up with its lying eyes. I’ve read Fisher’s paper three times, and it is pretty hard to follow if you don’t have a background in statistics (the math isn’t technically hard, but the prose is difficult to follow and the mathematical logic makes leaps far too often for comfort). It is one of those “and it naturally follows” works where the author assumes that you can track logic across the empty chasms of unexplicated algebraic manipulations and derivations. The gist of Fisher’s paper is that the empirical data that the biometricians processed with their finely tuned statistical techniques was totally explicable assuming a large number of Mendelian loci a priori. For example, the variance between siblings on traits like height, which are extremely heritable (assuming a reasonable nutritional background), is natural if you consider that the parents are highly heterozygous. Fisher proceeds to use his model to analyze data from the time period, and he begins down the road of using variance partitioning to ascertain the various factors which result in a statistical distribution.3
But really the convergence of Mendelianism and continuous traits does not necessitate immersing yourself in familial correlations and moments of a probability distribution. I mean, after all, for a large number of trials the binomial distribution approximates the normal. No seriously, look at these images:
[Images: binomial distributions for N = 3, 6, 12, 24 and 48]
I just took screen captures after slotting different N values into this applet (N = 3, 6, 12, 24, 48). As N gets large the normal approximation gets better.
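If you would rather check this numerically than squint at pictures, here is a minimal sketch that compares the binomial probabilities against a normal curve with the same mean and standard deviation, for the same N values. The helper names (binomial_pmf, normal_pdf, max_gap) and the choice of p = 0.5 are my own assumptions for illustration; the applet’s settings are not recorded here. The largest gap between the two should shrink as N grows.

```python
import math

def binomial_pmf(n, k, p=0.5):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def normal_pdf(x, mean, sd):
    """Density of the normal distribution with the given mean and sd."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def max_gap(n, p=0.5):
    """Largest absolute difference between the binomial pmf and the
    normal density that shares its mean and standard deviation."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    return max(
        abs(binomial_pmf(n, k, p) - normal_pdf(k, mean, sd))
        for k in range(n + 1)
    )

for n in (3, 6, 12, 24, 48):
    # Print N alongside the worst-case mismatch at any integer outcome.
    print(n, round(max_gap(n), 4))
```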
To translate this into a more genetical sense, imagine you have a locus, a position in the genome, that controls the density of hair follicles on your skin. If the gene is on, it is O, and if it is off, it is o. Assume that the organism is diploid, so you have two copies of a given gene. So,
oo = hairless
Oo = hairy
OO = hairy
So O is dominant, right? But what if the density of hair is proportional to the number of O’s you have? Then you have:
oo = hairless
Oo = hairy
OO = hairiest
This is an additive situation. You now have three discrete phenotypes. Now, imagine you had a second locus. Let’s use a different letter, but the casing convention remains the same. Also, let’s assume that how hairy you are is proportional to the number of upper case letters, the “on” genetic variants (alleles). So, imagine:
oo, pp = hairless
OO, PP = hairiest
But there are many combinations in between. For example, the median hairiness can be expressed by any of these combinations:
OOpp
ooPP
OoPp
OopP
oOpP
oOPp
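The same bookkeeping can be done by brute force. The sketch below is only an illustration of the additive model described above: the function name hairiness_counts is hypothetical, each allele copy is coded as 1 for “on” (upper case) and 0 for “off” (lower case), and maternal and paternal copies are counted separately, as in the list above. It counts how many ordered genotypes give each hairiness score.

```python
from collections import Counter
from itertools import product

def hairiness_counts(n_loci):
    """Count ordered diploid genotypes by number of 'on' alleles.

    Each locus carries two allele copies, and each copy is either on (1)
    or off (0), so a genotype is a tuple of 2 * n_loci zeros and ones.
    The additive score is simply the number of ones.
    """
    return Counter(
        sum(genotype) for genotype in product((0, 1), repeat=2 * n_loci)
    )

counts = hairiness_counts(2)
for score in sorted(counts):
    print(score, counts[score])
# Two loci give 1, 4, 6, 4, 1 genotypes for scores 0 through 4.
```

The six genotypes with a score of 2 are exactly the six combinations listed above, and the 1, 4, 6, 4, 1 pattern is the binomial for four trials. Call hairiness_counts with more loci and the counts fill in toward the bell shape.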