Chung-I Wu's Other Paper on Adaptive Evolution

In addition to the paper on adaptive evolution in the Drosophila melanogaster genome (reviewed here yesterday), Chung-I Wu is also senior author on a sort-of companion paper studying adaptive evolution in the human genome. Yeah, I know, who really cares about the human genome, human evolution, or humans? The real interest is in Drosophila. But, believe it or not, there are some people who find human population genetics totally engrossing.

The paper using sequences from the human genome, which Wu co-authored with Jun Gojobori (who is not this Gojobori), Hua Tang and Josh Akey, looks at polymorphisms in protein coding sequences compared to divergence from chimpanzee. They employ a modification of the McDonald-Kreitman test to determine if there is an excess of differences between humans and chimps relative to the amount of variation found within humans. But rather than doing this test on individual genes, they organized the data into amino acids. The standard genetic code allows for twenty amino acids and three codons that signify termination of translation. There are 190 possible changes from one amino acid to another ([20*19]/2), of which 75 can be achieved by a single nucleotide change; Gojobori et al refer to these as elementary changes.
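The counts of 190 possible amino acid pairs and 75 elementary changes can be checked by brute force. The following is a minimal sketch, not the authors' code: it builds the standard genetic code and collects every unordered amino acid pair connected by a single nucleotide change. The names `CODON_TABLE` and `elementary_changes` are made up for this illustration.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def elementary_changes():
    """Return the unordered amino acid pairs that are one nucleotide change apart."""
    pairs = set()
    for codon, aa in CODON_TABLE.items():
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                neighbor = codon[:pos] + base + codon[pos + 1:]
                other = CODON_TABLE[neighbor]
                # Skip synonymous changes and anything involving a stop codon.
                if other == aa or "*" in (aa, other):
                    continue
                pairs.add(frozenset((aa, other)))
    return pairs


amino_acids = set(AMINO_ACIDS) - {"*"}
print(len(amino_acids) * (len(amino_acids) - 1) // 2)  # 190 possible pairs
print(len(elementary_changes()))                       # 75 elementary changes
```

Pairs are stored as `frozenset`s so that a change and its reverse (say, Phe to Leu and Leu to Phe) count once, matching the unordered count of ([20*19]/2).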

The researchers employed two metrics to represent the amount of polymorphic elementary changes and divergent elementary changes. First, they divided polymorphisms into two classes: rare polymorphisms found at a frequency of less than 20% and common polymorphisms found at a frequency greater than 20%. The polymorphisms within humans and substitutions between humans and chimps are further divided into those that change the amino acid encoded by the codon (non-synonymous or A) and those that do not (synonymous or S). They then estimated the rate of non-synonymous and synonymous mutations using the amount of polymorphism at those sites.
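The bookkeeping described above amounts to a two-way tally. Here is a rough sketch, assuming each polymorphism is recorded as a hypothetical `(frequency, kind)` tuple; the record format, the `tally` function, and the decision to treat a frequency of exactly 20% as common are all assumptions made for illustration, not details from the paper.

```python
from collections import Counter

RARE_CUTOFF = 0.20  # the 20% frequency threshold quoted in the post


def tally(polymorphisms, cutoff=RARE_CUTOFF):
    """Count polymorphisms by (frequency class, site class).

    `polymorphisms` is an iterable of (frequency, kind) tuples, where kind is
    "A" (non-synonymous) or "S" (synonymous). A frequency exactly at the
    cutoff is counted as common (an assumption; the post does not say).
    """
    counts = Counter()
    for freq, kind in polymorphisms:
        freq_class = "rare" if freq < cutoff else "common"
        counts[(freq_class, kind)] += 1
    return counts


# Made-up records for illustration only; these are not data from the paper.
toy = [(0.05, "A"), (0.10, "S"), (0.35, "S"), (0.60, "A"), (0.15, "A")]
counts = tally(toy)
print(counts[("rare", "A")], counts[("rare", "S")])      # 2 1
print(counts[("common", "A")], counts[("common", "S")])  # 1 1
```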

The aforementioned parameters are used to estimate the polymorphism index (PI) and fixation index (FI). They are calculated using the following equations:

PI = (A_rare / S_rare) / (A_mutation / S_mutation)

FI = (A_divergence / S_divergence) / (A_common / S_common)

Synonymous changes are assumed to be selectively neutral -- they act as a control. PI captures the probability with which new mutations reach appreciable frequencies (but still <20%), while FI represents the likelihood of a common allele (>20%) reaching fixation. These two metrics were calculated for each of the 75 elementary changes (amino acid changes that can be achieved by a single nucleotide change), pooling all codons in the genome. A higher PI means that a particular amino acid change is less likely to be removed by purifying natural selection, allowing it to become polymorphic. A higher FI means that a particular non-synonymous polymorphism is more likely to fix between species than is expected based on within-species polymorphism, which suggests the action of positive selection.
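The two indices are just ratios of A/S ratios, so they are easy to write down. The sketch below assumes the eight counts (or rate estimates) for one elementary change are already in hand; the function names and the input numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def ratio(a, s):
    """Non-synonymous to synonymous ratio, A/S."""
    return a / s


def polymorphism_index(a_rare, s_rare, a_mut, s_mut):
    """PI: A/S among rare polymorphisms, scaled by A/S among new mutations."""
    return ratio(a_rare, s_rare) / ratio(a_mut, s_mut)


def fixation_index(a_div, s_div, a_common, s_common):
    """FI: A/S among fixed differences, scaled by A/S among common polymorphisms."""
    return ratio(a_div, s_div) / ratio(a_common, s_common)


# Hypothetical inputs for a single elementary change; not values from the paper.
pi = polymorphism_index(a_rare=50, s_rare=100, a_mut=2.0, s_mut=1.0)
fi = fixation_index(a_div=50, s_div=100, a_common=25, s_common=100)
print(f"PI = {pi:.2f}, FI = {fi:.2f}")  # PI = 0.25, FI = 2.00
```

In this toy case PI is well below 1 (most new non-synonymous mutations of this type never become polymorphic) while FI is above 1 (those that do become common fix more often than the synonymous control), which is the pattern the post associates with positive selection.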

[Figure: FI plotted against PI for each elementary change; Perlegen data in the top graph, HapMap data in the bottom graph. From Gojobori et al 2007.]

The authors then plotted FI versus PI for two separate data sets. The top graph represents data from Perlegen, and on the bottom graph are data from HapMap. The Perlegen data are less biased, so we'll focus on those, although the same patterns can be seen in both graphs. Each point on the graph represents an elementary change, summarizing all the polymorphism and divergence for that amino acid change across the entire human genome.

The big trend to glean from these plots is that elementary changes with large PIs (ie, those that are highly likely to become polymorphic) have low FIs. This leads Gojobori et al to conclude that mutations that are not removed by natural selection -- those that are able to stick around long enough to be measured in a survey of polymorphism -- are unlikely to be fixed by natural selection between species. Conversely, the elementary changes with the largest FIs (those that are likely to be fixed by natural selection) have low PIs (they tend to be removed by natural selection when they arise as new mutations). Elementary changes with FI values significantly larger than the rest lie within the dashed box.

A simple conclusion from these trends is that some of the mutations that are the most likely to be removed by purifying selection are the same types of changes that are likely to be fixed by positive selection between species. The finding that FI and PI are negatively correlated throws a stick in the spokes of some methods to detect adaptive evolution using polymorphism and divergence. But it looks like selection (both purifying and positive) is only operating on a subset of all amino acid changes, and it's the same changes that are being operated on by both types of natural selection. And some amino acid changes appear to be quite neutral, with a high probability of reaching an appreciable frequency, but with no bias toward fixation.
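For readers who want to see what a negative relationship between the two indices looks like numerically, here is a small sketch using Spearman's rank correlation. The post does not say which statistic the authors used, so the choice of Spearman is an assumption, and the PI and FI values below are invented purely for illustration.

```python
def ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Invented values, one pair per elementary change; not data from the paper.
pi_values = [0.1, 0.2, 0.4, 0.6, 0.9]
fi_values = [3.0, 2.5, 1.2, 1.0, 0.8]
print(spearman(pi_values, fi_values))  # -1.0 for this perfectly monotonic toy set
```

Real data would be far noisier than this toy set; the point is only that a rho below zero corresponds to the pattern in the plots, where changes with high PI tend to have low FI.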


Gojobori J, Tang H, Akey JM, and Wu C-I. 2007. Adaptive evolution in humans revealed by the negative correlation between the polymorphism and fixation phases of evolution. PNAS 104: 3907-3912. doi: 10.1073/pnas.0605565104


A question: Hasn't there been some research recently which showed that "synonymous changes" have an effect on protein structure (by changing the rate at which ribosomes translate RNA into a string of amino acids -- a function of tRNA concentration for a given codon)? If that is a relatively common effect, how can synonymous changes be assumed to be selectively neutral?

By Craig Helfgott (not verified) on 12 Apr 2007

If that is a relatively common effect...

It's not common, so it's not much of an issue. A bigger problem is with Drosophila synonymous sites, which are under more selection for major codon usage. But, even in Drosophila, synonymous sites evolve faster than non-coding sequences near genes.

No, he is not that Gojobori, but he is that Gojobori's son.

No shit, Josh! Not eager to comment on the post about your paper? I must've fucked something up. You guys didn't really say Brian Charlesworth is wrong, did you?