Detecting Natural Selection (Part 7)

By evolgen on March 5, 2006.

Polymorphism and Divergence

This is the eighth of multiple postings I plan to write about detecting natural selection using molecular data (i.e., DNA sequences). The introduction can be found here. The first post described the organization of the genome, and the second described the organization of genes. The third post described codon-based models for detecting selection, and the fourth detailed how relative rates can be used to detect changes in selective pressure. The fifth post dealt with classical population genetics methods for detecting selection using allele and genotype frequencies. The sixth post described how to calculate nucleotide sequence polymorphism, and the seventh explained how we can use measures of polymorphism to detect signatures of selection. In this entry we'll review how polymorphism and divergence can be used together to detect natural selection.

We have previously discussed how nucleotide divergence between species and polymorphism within populations can each be used to detect natural selection on DNA sequences. This entry will detail how divergence can be used to estimate the expected polymorphism at a locus (and vice versa) if it is evolving according to a neutral model. The approaches I will describe here are based on contingency tables, in this case 2x2 tables. I will describe two approaches: one designed for comparing two separate loci, and another designed for comparing two types of sites within a single region.

Assuming the selective constraint on a particular sequence has been constant since the divergence of two species, the amount of divergence between the two species at that locus should be a good predictor of the amount of polymorphism at the locus. As we discussed previously, positive selection is expected to decrease the amount of polymorphism, while balancing selection will elevate the amount of polymorphism. We will use a locus that we think is not under positive or balancing selection to determine the expected relationship between polymorphism and divergence. The amount of polymorphism and divergence at that locus will be compared to that at another locus to determine whether the polymorphism at the second locus is on par with that expected under neutrality. An example of such data is shown below.

	Sequence 1	Sequence 2
Polymorphism	9/414	8/79
Divergence	210/4052	18/324

Number of polymorphic sites (S) and divergent sites, each shown over the number of nucleotide sites compared, for two D. melanogaster loci. Data taken from Hudson et al. (1987).

We can then compare the polymorphism and divergence at the two loci to determine whether the locus we are interested in (Sequence 2 in the table above) is evolving in a manner similar to the control locus (Sequence 1). If there were a deficiency of polymorphism in Sequence 2 relative to its divergence, we would have evidence for a recent selective sweep within or near this region. The data shown above instead indicate excess polymorphism in Sequence 2, consistent with balancing selection maintaining polymorphism at this locus. These polymorphism data come from the Drosophila melanogaster alcohol dehydrogenase (Adh) gene, and the divergence estimates come from comparisons with the D. sechellia Adh gene. Sequence 1 is the region flanking the Adh gene, and Sequence 2 comprises the synonymous sites and introns of the Adh gene. The D. melanogaster Adh gene has two different allozyme alleles, and the pattern of nucleotide polymorphism at the locus is consistent with balancing selection maintaining the two alleles.
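To make the comparison concrete, here is a minimal Python sketch of the arithmetic using the counts from the table. It is not the test statistic from Hudson et al. (1987), which fits a model to both loci and accounts for sampling variance. It only asks how many polymorphic sites Sequence 2 would show if it had the same polymorphism-to-divergence ratio as the control. The function and variable names are my own, chosen for illustration.

```python
# Counts from the table above: (number of differences, number of sites compared).
# seq1 is the control (flanking region); seq2 is the locus of interest.
polymorphism = {"seq1": (9, 414), "seq2": (8, 79)}
divergence = {"seq1": (210, 4052), "seq2": (18, 324)}


def per_site(count, sites):
    """Differences per nucleotide site compared."""
    return count / sites


def poly_to_div_ratio(locus):
    """Per-site polymorphism divided by per-site divergence at one locus."""
    return per_site(*polymorphism[locus]) / per_site(*divergence[locus])


def expected_polymorphic_sites(control, target):
    """Polymorphic sites expected at `target` if it shared `control`'s ratio.

    Illustrative simplification only: this ignores sampling variance and
    is not the full model-based test.
    """
    _, target_sites = polymorphism[target]
    target_divergence = per_site(*divergence[target])
    return poly_to_div_ratio(control) * target_divergence * target_sites


ratio_seq1 = poly_to_div_ratio("seq1")
ratio_seq2 = poly_to_div_ratio("seq2")
expected_seq2 = expected_polymorphic_sites("seq1", "seq2")
observed_seq2 = polymorphism["seq2"][0]

print(f"polymorphism/divergence, Sequence 1: {ratio_seq1:.2f}")
print(f"polymorphism/divergence, Sequence 2: {ratio_seq2:.2f}")
print(f"Sequence 2 polymorphic sites: observed {observed_seq2}, "
      f"expected {expected_seq2:.1f}")
```

Sequence 2 shows several times more polymorphic sites than the control's ratio would predict, which is the excess described above. Whether that excess is statistically significant is a separate question that this sketch does not answer.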

The other test I will describe involves examining the polymorphism and divergence within a single coding region. Recall that we can divide a protein coding sequence into synonymous and non-synonymous sites, and we can use the divergence between two sequences at those two classes of sites to infer selection. Tests based on divergence alone often lack power to detect positive selection because only a small fraction of sites will be under selection, and the signal of selection will be drowned out by all of the other sites. By incorporating polymorphism we can get an idea of the amount of divergence at synonymous and non-synonymous sites expected under neutrality (or, conversely, the amount of divergence can give us an idea of the expected amount of polymorphism).

	Synonymous	Non-synonymous
Polymorphism	21	0
Divergence	17	6

Number of polymorphic and divergent synonymous and non-synonymous sites for the Drosophila Adh locus. Data taken from McDonald and Kreitman (1991).

The data shown above are also from the Drosophila Adh locus (a sort of model locus for molecular evolution and population genetics). In this example, we see an excess of divergent non-synonymous sites (relative to non-synonymous polymorphism), suggesting that natural selection has fixed beneficial non-synonymous mutations since these two species diverged. If we had attempted to detect selection using only divergence, we would see that there are fewer non-synonymous differences than synonymous differences (6 versus 17), providing no evidence for directional selection. Only when we incorporate polymorphism data can we determine that natural selection has fixed beneficial non-synonymous mutations.
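One way to ask whether the 2x2 table above departs from the neutral expectation is an exact test on the counts. The sketch below is a small hand-rolled Fisher's exact test in Python. The choice of Fisher's exact test is my assumption for illustration (it is a common option for small contingency tables) and is not necessarily the statistic used in the original paper. The function name and argument order are also my own.

```python
from math import comb


def fisher_exact_2x2(poly_syn, poly_non, div_syn, div_non):
    """Fisher's exact test on the table [[poly_syn, poly_non], [div_syn, div_non]].

    Returns (one_sided_p, two_sided_p). The one-sided p-value is the
    probability, with all margins fixed, of seeing as few or fewer
    non-synonymous polymorphisms as were observed.
    """
    n_poly = poly_syn + poly_non      # row total: polymorphic sites
    n_syn = poly_syn + div_syn        # column total: synonymous sites
    n_non = poly_non + div_non        # column total: non-synonymous sites
    total = n_syn + n_non

    def prob(k):
        # Hypergeometric probability that k of the non-synonymous
        # sites fall in the polymorphism row.
        return comb(n_non, k) * comb(n_syn, n_poly - k) / comb(total, n_poly)

    lowest = max(0, n_poly - n_syn)
    highest = min(n_poly, n_non)
    observed = prob(poly_non)

    one_sided = sum(prob(k) for k in range(lowest, poly_non + 1))
    # Two-sided: sum every table that is no more probable than the observed one.
    two_sided = sum(
        prob(k)
        for k in range(lowest, highest + 1)
        if prob(k) <= observed * (1 + 1e-9)
    )
    return one_sided, two_sided


# Counts from the table above.
p_one, p_two = fisher_exact_2x2(poly_syn=21, poly_non=0, div_syn=17, div_non=6)
print(f"one-sided p = {p_one:.4f}, two-sided p = {p_two:.4f}")
```

With these counts both p-values fall below 0.05, in line with the excess of non-synonymous divergence described above.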

This marks the end of the detecting natural selection series. I would say something deep and meaningful at this point, but there isn't really anything else to say. So, goodbye, I guess.

Hudson, RR, M Kreitman, and M Aguadé. 1987. A test of neutral molecular evolution based on nucleotide data. Genetics 116: 153-159.

McDonald, JH, and M Kreitman. 1991. Adaptive protein evolution at the Adh locus in Drosophila. Nature 351: 652-654.
