More on Misconceptions

I wrote previously about a couple of misconceptions in evolutionary genetics (random mutation and natural selection and decoding genomes). Razib and John Hawks have been rapping on genetic drift and neutrality. Razib thinks it's important to distinguish between molecular evolution and phenotypic evolution -- I agree, by the way, but drawing the line can be difficult. As John pointed out in another post on misconceptions, the one gene, one protein model is greatly flawed. However, there is a relationship between the genotype and the phenotype, and if much of molecular evolution can be explained by neutrality, then some aspects of the phenotype must have evolved by chance.

But I digress. I really wanted to take some time to discuss neutrality and selection. As John mentions, neutrality is an excellent null hypothesis, and many of the tests for detecting selection are built on rejecting that null. The challenge in interpreting these tests lies in how we can distinguish selection from demography. This can be accomplished by examining multiple unlinked loci to see if the non-neutral patterns are found throughout the genome or if they are isolated at our putative locus under selection. But I've already written about detecting selection (and I promise to write more), so I'll stop here.

I really want this post to be about how drift and selection are not separate entities. Razib shows us the simple derivation revealing that the rate of substitution of neutral mutations depends only on the mutation rate:

rate of substitution = probability of fixation X number of new mutations

number of new mutations = rate of mutation X population size

probability of fixation = 1/(population size)

ergo, when you multiply the equation out, population size cancels and only the rate of mutation remains

So simple. So elegant. So only applicable to neutral mutations. Yes, the rate of substitution of neutral mutations is independent of population size -- intuitively, you may think that small populations would fix more neutral variants, since each new variant is more likely to fix there, but large populations produce more mutations, so population size cancels out. But what about non-neutral mutations? Mutations that are under selection have to express themselves in the phenotype (at some level), so how do they evolve in populations?
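The cancellation in the neutral case is easy to check with a few lines of arithmetic. This is only a sketch of the derivation above: the function name and the example mutation rate are made up for illustration, and it uses the same 1/(population size) fixation probability as the derivation.

```python
def neutral_substitution_rate(population_size, mutation_rate):
    """Rate of substitution for strictly neutral mutations.

    Hypothetical helper: it multiplies out the derivation in the post.
    """
    # New mutations entering the population each generation.
    new_mutations = mutation_rate * population_size
    # Chance that any one new neutral mutation eventually fixes.
    probability_of_fixation = 1 / population_size
    return new_mutations * probability_of_fixation


# The mutation rate here is an arbitrary example value.
for size in (10, 1_000, 1_000_000):
    print(size, neutral_substitution_rate(size, 1e-8))
```

Whatever population size goes in, the result comes back as the mutation rate (up to floating-point rounding).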

The neutral theory is great as a null hypothesis. The nearly neutral theory is a much better model of molecular evolution. The theory posits that there are three classes of mutations: deleterious (will be removed by selection), beneficial (will be fixed by selection), and nearly neutral. Natural selection operates on all three classes of mutations -- purifying (or stabilizing) selection purges deleterious mutations and positive selection fixes beneficial mutations. The two extreme classes seem to express themselves in the phenotype at some level, but what about the nearly neutral mutations? They are called nearly neutral, but a better name would be 'dependently neutral'. As in, they depend on the context in which they arise.

There are, in fact, few truly neutral mutations. Even if two amino acids have the same polarity and nearly identical sizes (resulting in no difference in the translated protein if one were to be substituted for another), one amino acid could still be selectively favored if it is easier to produce. Changes to synonymous sites appear to be under selection for translational efficiency. Insertions of non-coding DNA into noncoding sequence can be selectively deleterious. Of course, a single base pair mutation in a completely non-functional region has about as close to a neutral effect as one could imagine, but many mutations that get labeled as 'neutral' are not quite.

The nearly neutral model tells us that the probability of fixation for a new mutation depends on two parameters: effective population size (N) and the fitness effect of a new mutation (s). The product of the two factors (Ns) tells us how likely a new mutation is to fix in the population. Extremely deleterious mutations (negative s) will be purged by selection regardless of the population size. Beneficial mutations (very positive s) will fix no matter in what size population they arise. The probability of fixation of nearly neutral mutations (s close to zero) depends on the population size. As the population size increases, selection will more efficiently remove the slightly deleterious mutations and more efficiently fix the slightly beneficial mutations. This model has been used to explain the evolution of eukaryotic gene structure.
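One way to put numbers on the role of Ns is the standard diffusion approximation for the fixation probability of a single new mutant (Kimura's formula). The post does not give this formula, so treat the sketch below as an assumption layered on top of it: it uses the haploid Wright-Fisher form, P = (1 - e^(-2s)) / (1 - e^(-2Ns)), lets N stand in for the effective population size, and the function name is made up for illustration. In the limit s = 0 it reduces to the 1/(population size) used in the neutral derivation.

```python
import math


def fixation_probability(N, s):
    """Approximate fixation probability of one new mutant copy.

    Assumes a haploid Wright-Fisher population of size N and selection
    coefficient s (diffusion approximation; not taken from the post).
    """
    if s == 0:
        return 1.0 / N  # neutral limit
    exponent = -2.0 * N * s
    if exponent > 700:
        # Strongly deleterious in a large population: the denominator
        # would overflow, and the probability is effectively zero.
        return 0.0
    # expm1(x) = e^x - 1, which stays accurate for tiny s.
    return math.expm1(-2.0 * s) / math.expm1(exponent)


# Compare each case with the neutral expectation 1/N.
for s in (-0.001, 0.0, 0.001):
    for N in (100, 10_000):
        relative = fixation_probability(N, s) * N
        print(f"s={s:+.3f}  N={N:>6}  Ns={N * s:+6.1f}  P/neutral={relative:.3g}")
```

With s = -0.001, the small population (Ns = -0.1) fixes the mutation at close to the neutral rate, while the large population (Ns = -10) almost never does. The same s behaves as nearly neutral in one population and as plainly deleterious in the other.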

The nearly neutral model is a probabilistic model, much like the neutral model. Both assume that extremely deleterious mutations will be lost and extremely beneficial ones will fix. The difference between the two is that the nearly neutral model incorporates the effects of selection on mutations that the neutral model labels as neutral. Both models assign probabilities of fixation to the (nearly) neutral mutations, but in the neutral model that probability is just 1/(population size), which is why the rate of substitution ends up independent of population size. In the nearly neutral model, slightly deleterious and mildly beneficial mutations are still fixed by stochastic processes, but the extent of stochasticity depends on population size. We can be more certain that a slightly deleterious mutation will be lost in a large population than if that same mutation arose in a smaller population. Put another way, small populations are subject to more fixations due to random chance than large populations, because stochasticity plays a larger role in the evolutionary fate of a nearly neutral mutation in small populations.
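A toy simulation makes the point about stochasticity concrete. This is a minimal sketch, not anyone's published method: it assumes a haploid Wright-Fisher population of constant size, a single new mutant copy, and a fitness of 1 + s for the mutant. The function names, the seed, and the example values of N and s are all made up for illustration.

```python
import random


def fixes(N, s, rng):
    """Follow one new mutant copy until it is lost or fixed.

    Returns True if the mutant fixes. Haploid Wright-Fisher sampling
    with mutant fitness 1 + s (an illustrative assumption).
    """
    count = 1
    while 0 < count < N:
        p = count / N
        # Frequency after selection, before random sampling.
        p_sel = p * (1 + s) / (1 + p * s)
        # Binomial sampling of the next generation (this is the drift).
        count = sum(1 for _ in range(N) if rng.random() < p_sel)
    return count == N


def fixation_fraction(N, s, replicates, seed=1):
    """Fraction of independent new mutants that reach fixation."""
    rng = random.Random(seed)
    return sum(fixes(N, s, rng) for _ in range(replicates)) / replicates


if __name__ == "__main__":
    s = -0.05  # slightly deleterious; example value only
    for N in (20, 100):
        observed = fixation_fraction(N, s, replicates=2000)
        print(f"N={N:>4}  observed={observed:.4f}  neutral 1/N={1 / N:.4f}")
```

In runs like this the slightly deleterious mutant should still fix now and then in the small population, while in the larger one it is lost essentially every time. The selection coefficient is the same in both; only the population size differs.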

What used to be a dichotomy -- selection versus neutrality -- has now been integrated into a unifying theory of natural selection and random fixations. Despite the fact that the nearly neutral model is more realistic, the neutral model is still used as a null hypothesis and for good reasons. For one, the extra parameter in the nearly neutral model (s) makes it harder to apply. Also, the neutral model is damn close to what happens to nearly neutral mutations in natural populations. As long as we recognize that certain mutations traditionally classified as neutral -- synonymous substitutions, indels -- may be subject to some selection and we include that in our model, we should be alright. Better yet, we should recognize the mutations that are closest to neutral (single nucleotide mutations in non-functional regions) and use those as our control when applying statistical tests for deviations from neutrality.


Beneficial mutations (very positive s) will fix no matter in what size population they arise.

You have to be careful saying this - in fact, many (if not most) beneficial mutations that arise in a population are lost by chance within the first few generations. It is only once a beneficial mutation has reached a sufficiently high frequency that natural selection virtually guarantees its fixation in the population.

i had the same issue with the post RPM (that sentence seemed too imprecise). though you did qualify very positive s, which is different from s = 0.01....

Yeah, guys, you're right. I kinda painted a picture of three classes (deleterious, beneficial, and nearly neutral). In reality, there is a gradient from extremely deleterious to beneficial mutations, and where we draw the lines between beneficial and nearly neutral and between nearly neutral and deleterious could be somewhat sketchy. And, yes, random sampling will lead to the loss of beneficial mutations in large populations -- I was creating too much of a caricature.