Over the past month or so I’ve been blogging chapter 5 of Evolutionary Genetics: Concepts & Case Studies. This chapter covers “stochastic processes,” basically the random elements in the flux of gene frequencies in biological populations. Now, I’m a selection man for real, but to understand selection you need to put it into the context of evolutionary dynamics as a whole, and chance is essential to properly comprehending necessity.
First I covered the immediate danger of extinction for a new mutant allele, even one favored by positive selection’s kiss. Then I traced out the possibilities which influence the transition out of the boundary of emergence. Next I cruised into the familiar paths of genetic drift. Now I will hit genetic draft. Like Iran vs. Iraq there is only a one letter difference here between draft & drift, and in some ways the two processes do mirror each other, but the ultimate take home lesson is that it is in their differences that the solutions to biological problems may lie. Before you go on, I highly suggest you read Robert Skipper’s comments on this topic, his two blog posts on the topic, as well as this paper. If you read them first and understand them well enough, you don’t really need to read the rest of this. Also, since Rob has outlined John Gillespie’s mathematics in nearly exact detail I won’t explore that as much as I would have, and will simply refer you to his paper if you can’t get a copy of Evolutionary Genetics: Concepts & Case Studies and don’t have access to the original papers via academic journal access. Much of the formalism mimics aspects of the genetic drift post in any case.
To understand genetic draft you really have to understand selective sweeps, hitchhiking effects and recombination. These are all predicated fundamentally on a physical understanding of the genome. To the left the image illustrates the general concepts visually. You have two strands of physical DNA, each a mirror of the other. I am obviously assuming the organism here is diploid, that there are two copies of each gene, two alleles for each locus. The physical relationship of alleles across a chromosome results in synteny. If alleles A and C, and B and D, for loci 1 & 2 respectively, are always found along the same chromosomal strand in a small isolated population, then the alleles are in linkage disequilibrium. That means that basically allele X on locus 1 can predict allele Y on locus 2 along the length of the DNA within this population. But this is not a permanent state: recombination breaks apart such associations through crossing over. That said, the rate of crossing over is far less than the immediate shuffling power of independent assortment.
OK, I threw a lot of terminology at you there (with links!). But the basic idea is simple. Imagine a population where a new mutant arises on one DNA strand. Let’s call this mutant “super selected,” SS. SS increases fitness a great deal (e.g., a selection coefficient > 0.10), and by chance (remember that even selectively favored alleles tend to go extinct) it sweeps to fixation. In other words it substitutes at its locus and replaces the ancestral variant. Now, consider another locus adjacent to the locus where SS emerges. Consider that there are two allelic variants at that locus, “Lucky Posse” (LP) and “Shit Out of Luck” (SOL). In generation 1, when SS emerged de novo and the ancestral allele at its locus was fixed, the adjacent locus was polymorphic, with both variants extant at frequency 0.5. This means that each variant has a 1 out of 2 chance of being the one paired with SS. Which one will it be? Who knows! Their very definition is contingent upon which one happens to be positioned next to the emerging SS variant. LP will be that one allele which sits next to SS, and is fortuitously carried to fixation along with SS’s sweep. This one allele may replace all of its sibling alleles as well as the SOL variant.
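The SS / LP / SOL story can be made concrete with a toy calculation. What follows is only a minimal sketch under assumptions that are not in the post: a deterministic haploid two-locus model (no drift, no diploid dominance), with hypothetical parameter names `s` (selection coefficient of SS), `r` (recombination fraction between the two loci) and `x0` (starting frequency of the single SS copy). It tracks how far the lucky LP allele gets dragged upward from its starting frequency of 0.5.

```python
def sweep_with_hitchhiker(s=0.1, r=0.0, x0=1e-4, generations=2000):
    """Toy haploid two-locus recursion (an illustrative assumption,
    not Gillespie's model). Returns (final LP frequency, final SS frequency)."""
    # SS arises on one LP chromosome; the ancestral background is 50/50 LP and SOL.
    f = {"SS_LP": x0, "SS_SOL": 0.0, "anc_LP": 0.5 - x0, "anc_SOL": 0.5}
    w = {"SS_LP": 1 + s, "SS_SOL": 1 + s, "anc_LP": 1.0, "anc_SOL": 1.0}
    for _ in range(generations):
        # Selection: reweight each haplotype by its fitness.
        mean_w = sum(f[h] * w[h] for h in f)
        f = {h: f[h] * w[h] / mean_w for h in f}
        # Recombination: linkage disequilibrium D decays by a fraction r.
        d = f["SS_LP"] * f["anc_SOL"] - f["SS_SOL"] * f["anc_LP"]
        f["SS_LP"] -= r * d
        f["anc_SOL"] -= r * d
        f["SS_SOL"] += r * d
        f["anc_LP"] += r * d
    return f["SS_LP"] + f["anc_LP"], f["SS_LP"] + f["SS_SOL"]


if __name__ == "__main__":
    for r in (0.0, 0.01, 0.5):
        lp, ss = sweep_with_hitchhiker(r=r)
        print(f"r={r}: final LP frequency {lp:.4f}, final SS frequency {ss:.4f}")
```

With no recombination LP rides all the way to fixation alongside SS; with free recombination (`r=0.5`) it barely moves off 0.5; intermediate values land in between. The exact numbers depend entirely on the toy assumptions above.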
Jason Rosenhouse of Evolutionblog has asked why we need another term to describe what seems to be straight up hitchhiking. I don’t really know, though I think the point is that if hitchhiking is the fizzing of the process, genetic draft is the general shape that the foam takes. Genetic draft is to hitchhiking what genetic drift is to sampling variance. But as I said, an important point is that there is a difference between genetic draft and genetic drift: the former is not very dependent upon population size, while the latter is. Recall that the power of drift to produce a generation-to-generation deviation in the frequency of an allele is inversely proportional to population size. This makes sense, since the sampling variance declines as you add more and more draws from the population. In contrast hitchhiking is not affected by this because it is tied to selection.1 As a population increases in size there is less variation in the power of selective events which might draw upon a novel allele, and concurrently there will be a continuous dragging along of hitchhiking alleles as sweeps run themselves through populations all across the range of population sizes. The main brake in this model is recombination: remember, this breaks apart syntenic physical associations across the genome, so while a selected mutant sweeps along it will, over time, lose its hitchhikers. If the sweep is powerful enough and/or the recombination muted enough, draft will be significant, expunging variation around the locus of selection. In contrast, weaker selection or robust recombination will inevitably diminish the staying power of drafting processes.
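To put a number on the claim that drift weakens as population size grows, here is a small sketch. It assumes the standard Wright-Fisher setup (each generation is 2N independent draws from the parental allele pool), under which the one-generation variance in allele frequency is p(1 - p) / (2N). The function names and the simulation settings are hypothetical choices for illustration.

```python
import random


def drift_variance(p, n_diploid):
    """Expected one-generation variance in allele frequency under
    binomial (Wright-Fisher) sampling of 2N alleles."""
    return p * (1.0 - p) / (2 * n_diploid)


def simulated_variance(p, n_diploid, replicates=2000, seed=1):
    """Estimate the same variance by actually drawing 2N alleles many times."""
    rng = random.Random(seed)
    n_alleles = 2 * n_diploid
    freqs = []
    for _ in range(replicates):
        copies = sum(1 for _ in range(n_alleles) if rng.random() < p)
        freqs.append(copies / n_alleles)
    mean = sum(freqs) / replicates
    return sum((f - mean) ** 2 for f in freqs) / (replicates - 1)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(f"N={n}: expected variance {drift_variance(0.5, n):.6f}")
    print("simulated, N=100:", simulated_variance(0.5, 100))
```

Each tenfold increase in N cuts the expected variance tenfold, which is the sense in which drift is "inversely proportional" to population size. Nothing comparable appears in the hitchhiking contribution, which is the contrast the paragraph above is drawing.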
Gillespie’s model uses a few sleights of hand. He assumes that selective sweeps follow a Poisson distribution over time (rare events with equal mean and variance). Additionally these sweeps are independent events which exhibit no temporal overlap. If one assumes that the rate of sweeps remains constant as one increases population size, then the model of genetic draft suggests that effective population size levels off to an asymptote. The reasoning is simple: the increased absolute number of sweeps cleans the overgrowing gene pool. The sweeps and their hitchhikers act as a brake upon the tendency toward increased polymorphism as neutral allele frequencies spend more and more of their time in the transition between states of fixation (recall that in a neutral scenario the time until substitution is proportional to effective population size). In contrast, another model which posits that sweeps increase in frequency with larger population size generates a scenario where effective population size eventually begins to decrease with increased census size! As the power of the sweeps to homogenize regions of the genome increases, the overall effective “genetic ancestry” evolution is truncated.
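The two scenarios can be sketched numerically. This is a rough illustration, not a reproduction of Gillespie's derivation: it assumes the commonly quoted pseudohitchhiking form in which the variance in allele frequency change is p(1 - p)[1/(2N) + rho * E(y^2)], so that effective size is N / (1 + 2 * N * rho * E(y^2)). The symbols `rho` (rate of sweeps per generation), `ey2` (mean squared frequency jump a sweep gives a linked allele) and the constant `c` are assumptions introduced here, and the numerical values are arbitrary.

```python
def effective_size(n, rho, ey2):
    """Effective population size when drift and draft both contribute,
    under the assumed variance form p(1-p) * [1/(2N) + rho * E(y^2)]."""
    return n / (1.0 + 2.0 * n * rho * ey2)


def effective_size_constant_rate(n, rho=1e-4, ey2=0.1):
    # Scenario 1: the rate of sweeps does not change with census size.
    return effective_size(n, rho, ey2)


def effective_size_scaling_rate(n, c=1e-10, ey2=0.1):
    # Scenario 2: the rate of sweeps grows in proportion to census size.
    return effective_size(n, c * n, ey2)


if __name__ == "__main__":
    for n in (1e3, 1e4, 1e5, 1e6, 1e7):
        print(f"N={n:.0e}  constant-rate Ne={effective_size_constant_rate(n):.0f}"
              f"  scaling-rate Ne={effective_size_scaling_rate(n):.0f}")
```

Under the constant-rate assumption the effective size flattens out near 1 / (2 * rho * E(y^2)) no matter how large N gets; under the scaling-rate assumption it rises, peaks, and then falls as census size keeps growing.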
Of course these are all tweaks on various assumptions, and Gillespie himself admits that the second scenario, where the rate of sweeping is proportional to population size, is controversial. Though such simplistic models (e.g., one locus, two alleles, etc.) don’t come close to allowing us to understand evolution, all great things must begin from humble seeds. A reductionistic scientific enterprise is built upon the backs of such simple models. If you read Robert Skipper’s posts you’ll see that draft is important precisely because its relative invariance (possibly) vis-a-vis population size might explain the lack of variation across taxa.
Addendum: For those of you interested in the mathematical formalisms, I recommend Rob’s paper; he duplicates Gillespie’s model pretty faithfully.
1 – The relationship between selection and population size isn’t totally as simple as some might make it out to be, but compared to drift it is pretty stark and straightforward, so I’ll use the approximate truth instead of engaging in precise exposition which is off topic.