Adaptive acceleration

The acceleration story has finally cooled down a bit, judging by my Google News feed. That said, I suggest you check out the comment threads on p-ter's two posts, here & here. John Hawks and some of the other authors of the paper have been participating in the back and forth. The paper is finally on PNAS's site (Open Access), with the supplementary information. It is also important that you read this paper in concert with Global landscape of recent inferred Darwinian selection for Homo sapiens, which has a detailed explication of the methods and more specific data.

I've been hoping that you, and some of the genetic expert posters and commenters here and on the classic GNXP beyond p-ter (and the article's authors), would weigh in on the issues p-ter has raised with the study's experimental design.

in all seriousness, i understand p-ter's points, but i can't see how to make it much simpler because i don't understand these issues as well as p-ter. i think the dispute here is going to remain technical and somewhat arcane for a while.

An informative exchange between two experts, P'ter and Hawks, can be disrupted if too many others jump in. For that reason I've been reluctant to give my opinions or ask questions on P'ter's threads. My contributions wouldn't improve that discussion.

Here is my overview of the exchange...

P'ter agrees that the theoretical arguments based on population genetics, demographic history, and cultural history are very strong. P'ter is also very familiar with genetic research showing evidence of fairly recent selection in the human genome.

P'ter's criticisms focus on the statistical analysis of the HapMap data. He is not claiming that accelerated adaptation didn't occur, only that there are problems with the evidence that the authors presented.

Here are the issues that P'ter raises that I find most interesting:

1) The method used to detect selection events depends critically on a few parameters. I believe Hawks when he says that their results are fairly insensitive to algorithm settings, but including a parameter sensitivity analysis in the appendix would make the paper stronger. Run a few simulations and show how the results change with different settings.

2) The HapMap data used 30 "trios", i.e., mother-father-child. Thus the data already included familial correlations, which may have distorted the LDD test results. This doesn't mean the authors are wrong, but I think they need to address this issue.

3) P'ter isn't satisfied with the way the authors treated varying recombination rates across the genome. The authors excluded genome regions with the slowest recombination rates but treated other regions as having the same, constant rate of recombination. This is important because the LDD test depends strongly on local recombination rate. P'ter would like to compare their results to a "Null hypothesis" simulation that models varying recombination rates across the genome. I think this would be a useful exercise; perhaps someone will do it.

4) P'ter feels the demographic model was too simplistic. E.g., a more realistic model would include varying population sizes, bottlenecks, founder effects, and population substructure that match historical data. Hawks justified their simple model by noting that such factors affect the entire genome whereas "selection events" affect specific loci. P'ter, based on his experience in statistically detecting "selection" signals in the HapMap data, believes that stochastic processes can generate "false positive" selection signatures that may distort the LDD test results.

I believe I understand P'ter's concerns with regard to issues 3 and 4. However, I have a different take. The paper strongly contradicts conventional wisdom about recent human adaptation. Keeping the computer models simple and the simulations to a minimum makes the paper more accessible to a wide audience. Following papers will better model reality and give more precise estimates of how adaptation has changed over time. (This is an emotional issue. Suppose P'ter believes the claims but doesn't believe that their statistical analysis proves their claims. If he spends considerable effort on simulation work to "fix" the flaws he gets little credit. Why should he do the work and then get no credit? It may not be fair but I believe that is the way science works.)

In summary, the theoretical arguments are strong and very likely to correctly predict accelerated human adaptation. Issue 1 could be dealt with by adding a parameter sensitivity study to the appendix. Issue 2 needs a substantive response from the authors. Issues 3 and 4 are worthwhile criticisms but I think it is a judgment call as to whether or not the authors used sufficiently realistic simulations. (Both P'ter and Hawks have far more experience and are far better qualified to make this judgment call than I.) Other researchers can now use more complex models to extend or refute their claims.

Here are some questions I have about the acceleration paper.

1) Does the LDD test miss copy number selection events?

Based on the Venter diploid genome, Europeans differ by about 1%. So there is more copy number variation than SNP variation. Potentially adaptation could be significantly higher than they claim. (They do claim that their estimate is conservative.)

2) How does this paper fit with the failure of large association and linkage studies to find loci of even modest effect for intelligence? Perhaps there are many good intelligence alleles of small effect that have partially swept different subpopulations.

3) Could accelerated adaptation contribute to the Flynn Effect?

Assume that prior to fifty years ago, intelligence increased fitness. Based on the new paper there should be many new mutations that increase intelligence in various stages of sweeping different subpopulations. Assume that about fifty years ago the widespread use of birth control by upper and middle class women, a feminist movement that devalued motherhood, and a welfare state that encouraged poor women to have more children together reversed the fitness benefit of intelligence.

Note that over many generations good alleles tend to accumulate on the same chromosome.

E.g., suppose that, on the same chromosome, allele A1 has a fitness of 1.1 relative to wild-type allele a1, and allele A2 has a fitness of 1.05 relative to wild-type allele a2. These alleles will then increase at rates 1.1 and 1.05 until each reaches a significant allele frequency. At some point a single person will carry one homologous chromosome with the A1 allele and another with the A2 allele, and these will recombine to produce a new chromosome with both the A1 and A2 alleles. This chromosome will have a fitness of approximately 1.15 relative to a1a2 chromosomes, so the new A1A2 chromosome will sweep even faster. In this manner, good alleles tend to accumulate on sweeping chromosomes.
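A minimal sketch of this argument, assuming a deterministic haploid two-locus model with selection followed by recombination. The starting frequencies, the recombination rate r = 0.01, and the multiplicative fitness for the double haplotype (1.1 × 1.05 = 1.155, close to the 1.15 quoted above) are illustrative assumptions, not values from the paper or from the comment.

```python
# Deterministic haploid two-locus model: selection, then recombination.
# All numbers here are illustrative assumptions.

HAPLOTYPES = ("a1a2", "A1a2", "a1A2", "A1A2")

# Relative fitness of each chromosome type (multiplicative across loci).
FITNESS = {"a1a2": 1.0, "A1a2": 1.1, "a1A2": 1.05, "A1A2": 1.1 * 1.05}

# A1 and A2 start rare and on different chromosomes; no A1A2 exists yet.
START = {"a1a2": 0.98, "A1a2": 0.01, "a1A2": 0.01, "A1A2": 0.0}


def step(freqs, fitness, r):
    """Advance haplotype frequencies by one generation."""
    mean_w = sum(freqs[h] * fitness[h] for h in HAPLOTYPES)
    sel = {h: freqs[h] * fitness[h] / mean_w for h in HAPLOTYPES}
    # Linkage disequilibrium after selection.
    d = sel["A1A2"] * sel["a1a2"] - sel["A1a2"] * sel["a1A2"]
    # Recombination at rate r moves frequencies toward linkage equilibrium.
    return {
        "a1a2": sel["a1a2"] - r * d,
        "A1a2": sel["A1a2"] + r * d,
        "a1A2": sel["a1A2"] + r * d,
        "A1A2": sel["A1A2"] - r * d,
    }


def simulate(freqs, fitness, r, generations):
    """Return the list of haplotype frequency dicts, one per generation."""
    history = [dict(freqs)]
    for _ in range(generations):
        freqs = step(freqs, fitness, r)
        history.append(dict(freqs))
    return history


if __name__ == "__main__":
    history = simulate(START, FITNESS, r=0.01, generations=600)
    for gen in (0, 50, 100, 200, 400, 600):
        print(gen, {h: round(f, 4) for h, f in history[gen].items()})
```

Under these assumptions the A1A2 chromosome only appears once recombination brings the two alleles together, and it then displaces the single-allele chromosomes. Setting r to 0 in the same run leaves A1A2 at zero, which is the sense in which recombination is what lets good alleles accumulate on one chromosome.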

Now suppose A1 increases fertility and A2 increases intelligence. Up until fifty years ago the A1A2 chromosome would sweep at rate 1.15 and might be at a frequency of 50% in some subpopulation. However, in the last fifty years intelligence became a disadvantage (assume A2 now has fitness 0.95), and the A1A2 chromosome now has a fitness of 1.05. It would continue to sweep, and average intelligence would rise, until recombination finally breaks the link between A1 and A2. So it is theoretically possible that many intelligence-raising alleles became linked to other sweeping beneficial alleles, with the net effect being an average increase in IQ even as the brightest people have fewer kids.
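The hitchhiking scenario can be sketched the same way, again assuming a deterministic haploid two-locus model. The 50/50 starting mix of A1A2 and a1a2 chromosomes, r = 0.01, and multiplicative fitnesses (so A1A2 is 1.1 × 0.95 = 1.045 rather than the additive 1.05 used above) are assumptions chosen only to illustrate the argument.

```python
# Hitchhiking sketch: A2 is now mildly deleterious but linked to beneficial A1.
# All numbers here are illustrative assumptions.

HAPLOTYPES = ("a1a2", "A1a2", "a1A2", "A1A2")

# Fitness after the assumed reversal: A1 = 1.1, A2 = 0.95, multiplicative.
FITNESS_AFTER = {"a1a2": 1.0, "A1a2": 1.1, "a1A2": 0.95, "A1A2": 1.1 * 0.95}

# The A1A2 chromosome is assumed to have reached 50% before the reversal.
START = {"a1a2": 0.5, "A1a2": 0.0, "a1A2": 0.0, "A1A2": 0.5}


def step(freqs, fitness, r):
    """One generation of selection followed by recombination at rate r."""
    mean_w = sum(freqs[h] * fitness[h] for h in HAPLOTYPES)
    sel = {h: freqs[h] * fitness[h] / mean_w for h in HAPLOTYPES}
    d = sel["A1A2"] * sel["a1a2"] - sel["A1a2"] * sel["a1A2"]
    return {
        "a1a2": sel["a1a2"] - r * d,
        "A1a2": sel["A1a2"] + r * d,
        "a1A2": sel["a1A2"] + r * d,
        "A1A2": sel["A1A2"] - r * d,
    }


def a2_trajectory(freqs, fitness, r, generations):
    """Return the frequency of allele A2 in each generation."""
    traj = [freqs["a1A2"] + freqs["A1A2"]]
    for _ in range(generations):
        freqs = step(freqs, fitness, r)
        traj.append(freqs["a1A2"] + freqs["A1A2"])
    return traj


if __name__ == "__main__":
    traj = a2_trajectory(START, FITNESS_AFTER, r=0.01, generations=1000)
    peak = max(traj)
    print("A2 frequency at start:", traj[0])
    print("A2 peak:", round(peak, 4), "at generation", traj.index(peak))
    print("A2 frequency at end:", round(traj[-1], 6))
```

In this toy model the frequency of A2 first rises, carried along by A1, and then falls once recombination has produced enough A1a2 chromosomes to outcompete A1A2. How high and how long the rise lasts depends entirely on the assumed recombination rate and fitness values.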

(My guess is that the Flynn Effect isn't due to increasing frequencies of good intelligence alleles. The fitness penalty for intelligence in developed nations is just too great to be overcome by the genetic draft of other beneficial alleles. To really know what is happening we will have to identify the genome variants that contribute to g variation and see how their frequencies have changed over time in the population.)

P'ter isn't satisfied with the way the authors treated varying recombination rates across the genome

to be clear, this isn't a major issue in my mind. it's more of an example of how the power of this test varies with a parameter. a more important question is: how does the power of the test vary with the major parameter they're discussing--the age of a selected allele.

P'ter feels the demographic model was too simplistic.

well, all demographic models are simplistic. Their simulations were not based on any population genetic model at all (artificially throwing together chromosomes to "simulate" a bottleneck is not really a model--a model has parameters like the strength of the bottleneck, its duration, effective population size--all these parameters they use in their theory are not in the simulations at all!), and the simulated data looks rather unlike the actual data. if you're going to use simulations to decide on thresholds, the simulated data have to be at least something of an approximation of the real data.