In Natural selection of a human gene: FUT2 I referred to a paper, Signals of recent positive selection in a worldwide sample of human populations (see my earlier review). Now the same group has a follow-up paper which takes a slightly different tack, The Role of Geography in Human Adaptation:
Since the beginning of the study of evolution, people have been fascinated by recent human evolution and adaptation. Despite great progress in our understanding of human history, we still know relatively little about the selection pressures and historical factors that have been important over the past 100,000 years. In that time human populations have spread around the world and adapted in a wide variety of ways to the new environments they have encountered. Here, we investigate the genomic signal of these adaptations using a large set of geographically diverse human populations typed at thousands of genetic markers across the genome. We find that patterns at selected loci are predictable from the patterns found at all markers genome-wide. On the basis of this, we argue that selection has been strongly constrained by the historical relationships and gene flow between populations.
The authors note that while their earlier paper exhibited a great deal of specificity, this one takes a broader, panoramic view. Looking at the shape of human variation, they attempt to clarify the influence of the various parameters which affect the way in which populations vary. By parameters, imagine mutation (which introduces variation), selection (which removes variation if it is deleterious, or removes it as a byproduct of driving a favorable mutation to fixation), drift (the random fluctuation in allele frequencies from generation to generation, whose magnitude is inversely proportional to population size) and migration (whose effect is in proportion to the number of migrants and the genetic difference between the two groups across which it occurs). The parameters themselves can be complicated. Consider selection, which comes in various flavors. Forms of balancing selection are driven by disparate processes such as frequency dependence and overdominance. Selection on "standing variation" suggests that no novel mutations are necessary: the population already has all the variation it needs, and selection simply shifts the balance between the genotypes (e.g., imagine that a population has a normal distribution in height; selection can change the mean value simply by altering the proportions of the underlying alleles in the population).
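To make the four parameters concrete, here is a minimal sketch of one generation of a Wright-Fisher style model at a single biallelic locus. It is a textbook illustration, not anything taken from the paper: the function names, the genic (haploid-style) fitness scheme, the symmetric mutation rate and the order in which the forces are applied are all assumptions chosen for simplicity.

```python
import random


def wright_fisher_step(p, n_diploid, s=0.0, mu=0.0, m=0.0, p_migrant=0.0, rng=random):
    """Advance the frequency p of a focal allele by one generation.

    All parameters are illustrative assumptions:
      s         selection coefficient (focal allele has relative fitness 1 + s)
      mu        symmetric mutation rate between the two alleles
      m         fraction of the population replaced by migrants each generation
      p_migrant frequency of the focal allele among the migrants
    """
    # Selection: deterministic shift toward the fitter allele.
    p = p * (1 + s) / (1 + s * p)
    # Mutation: introduces variation in both directions.
    p = p * (1 - mu) + (1 - p) * mu
    # Migration: pulls p toward the frequency in the source population.
    p = (1 - m) * p + m * p_migrant
    # Drift: binomial sampling of 2N gene copies for the next generation.
    copies = 2 * n_diploid
    count = sum(1 for _ in range(copies) if rng.random() < p)
    return count / copies


def simulate(p0, n_diploid, generations, seed=0, **forces):
    """Return the allele-frequency trajectory, starting from p0."""
    rng = random.Random(seed)
    trajectory = [p0]
    for _ in range(generations):
        trajectory.append(wright_fisher_step(trajectory[-1], n_diploid, rng=rng, **forces))
    return trajectory


# Drift alone in a small population versus the same population with selection.
neutral = simulate(0.5, n_diploid=50, generations=100)
selected = simulate(0.5, n_diploid=50, generations=100, s=0.05)
print(neutral[-1], selected[-1])
```

Shrinking `n_diploid` makes the trajectories noisier, which is the sense in which drift scales inversely with population size.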
Throughout the text the authors seem to take on one particular model and reject its ability to explain the variation that they see. This model implies strong positive selection upon novel mutations (selection coefficients of 1% or greater) which are driven to fixation rapidly. Concretely, imagine that there is a new mutant at a gene which confers an adaptive benefit and increases fitness against the population mean. Within 10,000 years it has gone from 0 to 100% in frequency within the population because of positive selection. Some tests of natural selection do seem to yield very high numbers of positively selected variants of just this form, but it seems that this group believes that some of these signals are due to a complex of evolutionary processes, and not just fixation of strongly selected alleles. The complexity of their "answer" is important: while rejecting the elegant model of ubiquitous selective sweeps occurring due to positive selection, they do not offer an alternative elegant model. Rather, through the door come many questions, because the exact nature of the parameters which have shaped human variation is yet to be determined.
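For a rough feel for the time scales in a sweep model like this, the sketch below counts how many generations a favored allele needs to rise from a single copy to near fixation. It is a deterministic toy calculation, not the authors' method: it assumes genic selection, ignores drift entirely (in reality most new mutations are lost by chance), and the population size, the 99% cutoff and the 25-year generation time are illustrative assumptions.

```python
def generations_to_sweep(s, n_diploid=10_000, target=0.99):
    """Generations for an allele to go from one copy to `target` frequency.

    Deterministic recursion p' = p(1 + s) / (1 + s p); drift is ignored.
    """
    p = 1.0 / (2 * n_diploid)  # a single new copy among 2N gene copies
    generations = 0
    while p < target:
        p = p * (1 + s) / (1 + s * p)
        generations += 1
    return generations


YEARS_PER_GENERATION = 25  # assumed human generation time

for s in (0.01, 0.05):
    g = generations_to_sweep(s)
    print(f"s = {s}: {g} generations, about {g * YEARS_PER_GENERATION} years")
```

Under these assumptions the sweep time scales roughly as 1/s, so how quickly an allele can fix depends heavily on the selection coefficient one is willing to posit.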
But as the title of the paper implies, geography and descent have more to do with their presumed explanation than not. Figure 2 B, D and F illustrate their concern:
This illustrates the allele frequencies for the 50 SNPs which exhibited the greatest between-group differences for three representative populations: the Han, French and Yoruba. The first chart shows the case where the Yoruba are the outgroup, the second where the French are, and the third where the Han are. The geographic pattern is clear. There are three clusters of populations: West Eurasian, East Eurasian + the Americas, and African. Though one can drill down to a more granular level (see the Supplement), these are the large geographic units which are important for this paper.
It seems that the main issue is that you can predict the between-population differences in putatively selected alleles using total genome content differences between populations. The latter famously corresponds to geography, and in particular "chunks" out into the three macro-regions above, with the non-Africans being one clade and the Africans another. If selection was driving alleles to fixation due to local adaptation, why not more variation within the macro-regions? Why not more gradual clines? They suggest that selection is simply not a powerful enough parameter to swamp out the homogenizing process which occurs due to migration between adjacent populations. If selection was a more powerful process, they presume that there would be more local variation which could not be inferred from ancestry alone. As it is, they see a recurrent pattern of clustering by the three macro-regions, with Africans exhibiting greater differences from the other two groups (the last is in line with a long line of research papers on genetic variation and differences).
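The tension between local selection and homogenizing migration can be sketched with a toy two-population model. This is a standard textbook-style illustration and not a model from the paper; the function name, the opposing selection scheme (the allele is favored in one population and disfavored in the other) and the parameter values are all assumptions.

```python
def frequency_difference(s, m, generations=5000):
    """Allele-frequency difference between two populations after many generations.

    The allele has relative fitness 1 + s in population 1 and 1 - s in
    population 2; each generation a fraction m of each population is
    exchanged with the other. Deterministic, so drift is ignored.
    """
    p1 = p2 = 0.5
    for _ in range(generations):
        # Local selection pushes the two populations apart.
        p1 = p1 * (1 + s) / (1 + s * p1)
        p2 = p2 * (1 - s) / (1 - s * p2)
        # Symmetric migration pulls them back together.
        p1, p2 = (1 - m) * p1 + m * p2, (1 - m) * p2 + m * p1
    return p1 - p2


# Same selection strength, weak versus strong migration.
for m in (0.0001, 0.1):
    print(f"s = 0.01, m = {m}: difference = {frequency_difference(0.01, m):.3f}")
```

When migration is much weaker than selection the two populations end up nearly fixed for different alleles; when migration is much stronger, the difference stays small even though selection is acting the whole time.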
But they do not wipe selection away from the story and replace it with migration and geography. There are other data which suggest that adaptation has played a role in between-population differentiation. This is evident from the first figure:
As you probably know, a large portion of the genome does not consist of functional sequences which are eventually translated into amino acids. What this figure shows is that the tails of the distribution of Fst values are enriched for genic regions. In short, those regions of the genome which are most clearly functional are overrepresented among the variants which exhibit a lot of between-population difference. A representative example is SLC24A5, whose allele frequencies are nearly disjoint between European populations and African and East Asian ones. This gene is likely responsible for a great deal of the between-population pigmentation difference. It obviously has functional relevance.
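For readers who have not worked with Fst, here is a minimal sketch of how it can be computed for one biallelic SNP in two populations, using the simple heterozygosity-based definition Fst = (H_T - H_S) / H_T with equal population weights. The paper may well use a different estimator that corrects for sample size; the function name and the example frequencies below are hypothetical.

```python
def fst_two_populations(p1, p2):
    """Fst for one biallelic SNP, given the allele frequency in two populations.

    Uses Fst = (H_T - H_S) / H_T with both populations weighted equally.
    """
    p_mean = (p1 + p2) / 2
    # Expected heterozygosity of the pooled population.
    h_total = 2 * p_mean * (1 - p_mean)
    # Mean expected heterozygosity within each population.
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    if h_total == 0:
        return 0.0  # the SNP is monomorphic overall, so there is no differentiation
    return (h_total - h_within) / h_total


# Hypothetical frequencies: a nearly disjoint SNP versus a typical one.
print(fst_two_populations(0.99, 0.02))
print(fst_two_populations(0.45, 0.55))
```

A SNP fixed for different alleles in the two populations gives Fst = 1, and identical frequencies give Fst = 0, which is why nearly disjoint variants sit in the extreme tail of the distribution.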
So selection is part of the story. But how much?
The above figure comes from HapMap data. YRI = Yoruba, CEU = Utah Whites and ASN = Chinese & Japanese. The lines represent genes where there is more than a 90% difference between the two populations above the chart. Since the maximum is 100%, and average differences are far lower (on the order of 10%), these are extremely divergent alleles. The red lines are derived alleles at high frequency in the first population in the comparison, and the blue lines are the allele frequencies in the other population. Derived implies that the alleles are relatively new in relation to the total family of alleles, that is, they are descended from an ancestral state (often the other allele, or in the other allele group).
There are several patterns to note:
1) Big difference between East Asians and the Yoruba.
2) When there is a big difference between these groups the Europeans seem to be in the middle.
3) Derived alleles tend to be at lower frequency in the Yoruba than the other two groups.
4) The derived alleles at high frequency in the East Asians are at high frequency in the Americas.
The last serves as a peg for the contention that the East Asian derived alleles are old, dating back at least 15,000 years. Additionally, it is noted that 80% of the 90%+ differences are between Yoruba and East Asians. Finally, they don't seem to find strong signatures of selection using haplotype-based tests, which would detect recent events.
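The 90% cutoff described above amounts to a simple filter over per-population allele frequencies. The sketch below shows the idea on made-up data; the SNP names and frequencies are invented for illustration and are not HapMap values, and the function name is hypothetical.

```python
from itertools import combinations


def extreme_snps(freqs, threshold=0.9):
    """Return (snp, pop_a, pop_b) for every pair of populations whose
    derived-allele frequencies differ by more than `threshold`."""
    hits = []
    for snp, by_pop in freqs.items():
        for a, b in combinations(sorted(by_pop), 2):
            if abs(by_pop[a] - by_pop[b]) > threshold:
                hits.append((snp, a, b))
    return hits


# Invented derived-allele frequencies for two toy SNPs.
toy_freqs = {
    "snp_A": {"YRI": 0.02, "CEU": 0.50, "ASN": 0.97},
    "snp_B": {"YRI": 0.40, "CEU": 0.50, "ASN": 0.45},
}
print(extreme_snps(toy_freqs))
```

Counting which population pair appears most often in the output is then enough to arrive at a summary of the kind quoted above, such as the share of extreme differences that fall between Yoruba and East Asians.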
It is notable to me that this is a paper with a relatively long and thorough discussion with accompanying figures. Discussions are supposed to be about future possibilities in research and inferences, and it seems that this paper poses many questions. In rejecting a relatively simple model of recent human evolution, a model which emerged in part from preliminary work by some of the individuals within this research group, they don't seem to have settled on a similarly simple replacement. Perhaps there isn't one. The authors bring up drift, weak selection, selection on polygenic traits and fluctuating selection, and all their various combinations, as possibilities. They conclude with a reaffirmation of their rejection of the model of recent human evolution driven by powerful positive selection on new mutations:
Finally, since high-FST SNPs are rare in the human genome, our study raises the question of whether human populations can effectively adapt to new environments or new selective pressures over time-scales of, say, ten thousand years or so. Our results seem to suggest that rapid adaptation generally does not occur by (nearly) complete sweeps at single loci. If human populations can adapt quickly to new environments, then we propose that this might instead occur by partial sweeps simultaneously at many loci.
There is much more in the whole paper. The supplementary figures are also very interesting. It seems that the apple-cart has been turned upside down, so let's see what will come out of the chaos.
Citation: Coop G, Pickrell JK, Novembre J, Kudaravalli S, Li J, et al. (2009) The Role of Geography in Human Adaptation. PLoS Genet 5(6): e1000500. doi:10.1371/journal.pgen.1000500
They clearly have a new model or perspective on these data but I can't for the life of me figure out what it is. Perhaps one of the authors could figure it out and tell us in a concise paragraph or two.
Speaking of Fisherian waves...
What on earth is a "spherical cow" and where does it fit into population genetics?
A physics joke about ludicrously simple first approximations.