Selection in structured populations

Evolutionary genetics is subject to parameters: forces which pull and push and shape the nature of dynamic processes over time and space. Population size, mutation rate, migration, selection, etc.: these are all parameters we have to keep in mind when attempting to analyze the nature of evolutionary dynamics. The acceleration paper was predicated on the consequences of changing one parameter, population size, upon other parameters such as selection, drift and the number of mutations. Of course I have wondered about the nature of population substructure and what it means for our species. My thinking has been influenced by these maps, which show sharp geographic discontinuities in the frequencies of recently selected alleles. Quotes like this make me wonder: "As long as s < m, the new allele will migrate out of its original deme before it reaches fixation there," where s is the selection coefficient and m is the migration rate (specifically, the probability that an individual in generation t in deme x was a member of deme !x, i.e. some deme other than x, during generation t - 1). Selection coefficients on the order of 0.1 are enormous; those on the order of 0.01 are substantial. A migration probability for an individual on the order of 0.1 within a deme also seems very high. My own intuition, though, is that the distribution of m exhibits both a greater mean and a greater variance than that of s.

Fixation Probability and Time in Subdivided Populations is the first of several papers I've been reading to obtain a greater insight on population substructure and how it might have played out in the evolutionary history of our species. From the abstract:

...Population structure changes the effective size of the species, often strongly downward; smaller effective size increases the probability of fixing deleterious alleles and decreases the probability of fixing beneficial alleles. On the other hand, population structure causes an increase in the homozygosity of alleles, which increases the probability of fixing beneficial alleles but somewhat decreases the probability of fixing deleterious alleles. The probability of fixing new beneficial alleles can be simply described by 2hs(1 - FST)Ne/Ntot, where hs is the change in fitness of heterozygotes relative to the ancestral homozygote, FST is a weighted version of Wright's measure of population subdivision, and Ne and Ntot are the effective and census sizes, respectively. These results are verified by simulation for a broad range of population structures, including the island model, the stepping-stone model, and a model with extinction and recolonization.

This is a technical paper, with diffusion equations and integration and simulations of stepping-stone and island models. I'll skip over the details, but there are a few general issues that need to be noted. The work in this paper extends and adds granularity to the famous 2s which I had spoken of before, the probability of fixation of a selectively advantageous allele within a new population. In short, if an allele confers a selection coefficient of 0.1, a 10% increase in fitness above the population mean, then it has a 0.2 probability of fixation. This is within a very large, operationally infinite, population. Why only 0.2 when the allele is favored? Stochastic factors are strong when an allele is at low frequency, basically present in only a few copies. There is reproductive variance for any given individual (usually assumed to be Poisson distributed), and that variance is not perfectly correlated with an idealized genetic fitness. If a random "Act of God" exterminates a clutch with a very beneficial allele, then so be it.
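To make the 2s rule of thumb a bit more concrete, here is a minimal sketch of the standard branching-process argument behind it. It is not the diffusion machinery from the paper. The assumptions are mine: each copy of the new allele leaves a Poisson-distributed number of descendant copies with mean 1 + s, the population is effectively infinite, and the function name `establishment_probability` is just an illustrative label. Under those assumptions the extinction probability q is the smallest solution of q = exp((1 + s)(q - 1)), which can be found by simple iteration.

```python
import math


def establishment_probability(s, tol=1e-12, max_iter=1_000_000):
    """Chance that a single new copy escapes loss in a Poisson branching process.

    Assumption (illustrative, not the paper's model): every copy leaves
    Poisson(1 + s) descendant copies and the population is effectively infinite.
    """
    q = 0.0  # extinction probability; iterating from 0 converges to the smallest root
    for _ in range(max_iter):
        q_next = math.exp((1.0 + s) * (q - 1.0))
        if abs(q_next - q) < tol:
            return 1.0 - q_next
        q = q_next
    return 1.0 - q


# Compare the 2s approximation with the branching-process value.
for s in (0.001, 0.01, 0.1):
    print(s, 2 * s, round(establishment_probability(s), 4))
```

For small s the two columns are nearly identical. For s = 0.1 the branching-process value comes out near 0.176 rather than 0.2, which is a reminder that 2s is an approximation that works best when s is small.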

In terms of the formalism, Ne & Ntot are the effective and total population sizes. Effective basically refers to the fact that not all individuals contribute to the next generation, or contribute to the same extent. Random reproductive variance will always result in a lower Ne than Ntot. FST is basically a measure of how genetic variation is partitioned between and within populations. If most of the variation is partitioned between populations then FST is high and approaches 1, but if most of it is extant within populations, it approaches 0 (FST across "races" is famously on the order of 0.15, so that 85% of the variance at a single locus is present within a race). h measures the extent of dominance: if the heterozygote is in between the two homozygotes it is 1/2, while if there is perfect dominance it is 1.
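For anyone who wants to plug numbers in, the expression from the abstract, 2hs(1 - FST)Ne/Ntot, is easy to put into a few lines of Python. This is only a sketch of that one formula as quoted; the function name and the example values below are hypothetical and are not taken from the paper.

```python
def fixation_probability(h, s, fst, ne, ntot):
    """Approximate fixation probability of a new beneficial allele,
    2hs(1 - FST)Ne/Ntot, as quoted in the abstract.

    h    : dominance coefficient
    s    : selection coefficient
    fst  : FST, between 0 and 1
    ne   : effective population size
    ntot : census (total) population size
    """
    if not 0.0 <= fst <= 1.0:
        raise ValueError("fst must lie between 0 and 1")
    if ne <= 0 or ntot <= 0:
        raise ValueError("population sizes must be positive")
    return 2.0 * h * s * (1.0 - fst) * ne / ntot


# Hypothetical values, chosen only for illustration.
unstructured = fixation_probability(h=1.0, s=0.01, fst=0.0, ne=10_000, ntot=10_000)
structured = fixation_probability(h=1.0, s=0.01, fst=0.15, ne=5_000, ntot=10_000)
print(unstructured, structured)
```

With no structure (FST = 0 and Ne = Ntot) the expression collapses to 2hs. In the second call both the (1 - FST) term and the Ne/Ntot ratio pull the probability down, which is the "reduced power of selection" effect in its simplest form.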

There are a few general results from the analyses and simulations within this paper. A great deal of substructure can result in a reduced power of selection to drive beneficial alleles to fixation within and across demes. It can also result in the fixation of deleterious alleles due to drift. Finally, it can also favor the fixation of recessive alleles which are beneficial as well, because they are more likely to be expressed as homozygotes within a small relatively inbred population (drift kicks their frequency up just high enough so that many more copies are expressed to selection).

Here's a figure from the paper. By "hard selection," the authors mean that the genotypes within a deme are relevant to that deme's replication in the next generation ("soft selection" seems to refer more to events which result in variant success across demes uncorrelated with their genotypes, so differences in allele frequencies are driven simply by initial differences across demes, as captured by FST). Note that the selection coefficient, 0.001, is rather modest. You see that as the rate of migration increases the probability of fixation quickly converges upon 0.002, what one would predict from 2s. This is simply because migration increases the effective population size as the demes are connected together into a larger breeding metapopulation.

This is from figure 5, and it shows the trend for a recessive allele which is only expressed as a homozygote, with an s of 0.002. Note that as the migration rate increases it becomes less and less able to manifest its advantage. This is because the effective population size is increasing, and under Hardy-Weinberg equilibrium the frequency of the homozygote, q^2, is getting smaller and smaller.
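The point about recessives can be illustrated with the textbook relation for homozygote frequency under inbreeding, q^2 + Fq(1 - q), which falls back to the Hardy-Weinberg value q^2 when F = 0. The sketch below uses F as a stand-in for the excess homozygosity that population structure generates. Treating it as interchangeable with FST is my simplifying assumption, and the numbers are hypothetical.

```python
def homozygote_frequency(q, f=0.0):
    """Expected frequency of the recessive homozygote.

    q : frequency of the recessive allele
    f : inbreeding coefficient (0 gives plain Hardy-Weinberg, q**2)
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie between 0 and 1")
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must lie between 0 and 1")
    return q * q + f * q * (1.0 - q)


# A rare recessive allele at 1% frequency (hypothetical numbers).
q = 0.01
panmictic = homozygote_frequency(q)           # random mating
structured = homozygote_frequency(q, f=0.15)  # some excess homozygosity
print(panmictic, structured, structured / panmictic)
```

For a rare allele the Fq(1 - q) term dominates q^2, so even modest structure exposes far more copies to selection as homozygotes than random mating would. As migration erodes that structure, the homozygote frequency drops back toward q^2.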

The whole paper is open access, so I encourage you to read it. At this point, the question I'm asking myself is this: effective population size seems to have been increasing over time, but how has population substructure changed, if at all, over human history? Would small isolated groups such as the Andaman Islanders exhibit a non-trivial number of beneficial recessively expressed alleles because of their insulation from migration? As history progressed and migration increased, have alleles increased in their chance of fixation in part because of the breakdown of population structure? Was the 2s limit reached relatively early on in the course of our history so that, as R.A. Fisher might contend, we can ignore substructure? And how might Shifting Balance dynamics play into this?

Reference: Fixation Probability and Time in Subdivided Populations, Michael C. Whitlock, Genetics 164: 767-779 (June 2003)

A high rate of beneficial mutations may generate many simultaneous, beneficial mutations on the same type chromosome in a population. Recombination together with selection should produce "super fit" chromosomes that have accumulated multiple beneficial alleles. Such "super fit" chromosomes should spread beneficial alleles faster than predicted by "single mutation" models.

From your description this paper only looks at a single mutation model. (Or it assumes mutations are inherited independently.) How would the simulation results change if multiple mutations with recombination were included? Does anyone do that kind of simulation?

"Was the 2s limit reached relatively early on in the course of our history so that as R.A. Fisher might contend we can ignore substructure?"

My guess is that accelerated adaptation together with recombination means that substructure is important. With high levels of selection activity, population substructure and migration should be more important.

Also, it is not clear to me how important fixation time is when looking at a highly dynamic genome. More interesting is the probability of a beneficial allele attaining a moderate frequency in a sub population. At that point recombination and migration become important.

I wonder if our ability to track "single gene" or "few gene" traits distorts our perception of how most human adaptation occurs. A "single gene" trait that provides a significant fitness advantage would appear as a "super wave" spreading across a lake of chaotic ripples. In such cases a "single mutation" simulation might be pretty good. However, many traits will depend on hundreds or thousands of small beneficial mutations. In such cases the "single mutation" model will be inadequate.

Would a population that was expanding and subdividing have faster rates of fixation (within those subdivisions) for a given selection coefficient? Furthermore, wouldn't estimates of selection coefficients overshoot the real numbers unless that process of subdividing was taken into account?

Recombination together with selection should produce "super fit" chromosomes that have accumulated multiple beneficial alleles.

do you think the rate of recombination is high enough? you're basically talking about a supergene, right?

How would the simulation results change if multiple mutations with recombination were included? Does anyone do that kind of simulation?

i don't know yet. i'm digging through the lit right now.

My guess is that accelerated adaptation together with recombination means that substructure is important. With high levels of selection activity, population substructure and migration should be more important.

this is plausible. i'll be posting stuff on how selection changes based on structure soon...a lot of it is pretty theoretical.

However, many traits will depend on hundreds or thousands of small beneficial mutations. In such cases the "single mutation" model will be inadequate.

i think we're getting into shifting balance territory; in which case substructure is of the essence.

Would a population that was expanding and subdividing have faster rates of fixation (within those subdivisions) for a given selection coefficient?

i assume that the average will be the same. in fact, subdivision will increase the power of drift and so reduce the rate of fixation (though if the divisions are large enough this might not matter). to get at what i'm thinking here, assume you have 10 large demes which are sampled from an even larger deme. assuming that the 10 demes are pretty good representations of the larger deme (if their sample size is big, then variance shouldn't be an issue) then they'll all have the same fixation probability. in some demes the allele will go extinct, in others it will fix. just like with drift, i assume that the expected number of demes in which it fixes is proportional to the probability of fixation.

"do you think the rate of recombination is high enough? you're basically talking about a supergene, right?"

Yes, supergene. "Super chromosome" is misleading since each chromosome will have multiple cross-over events during gamete formation. Loci that are widely separated on a chromosome are essentially inherited independently. The loci should be far enough apart that they will likely be united by an occasional rare recombination event but not so far apart that there is a high probability that common recombination events will separate the loci.

I'm guessing that the recombination rate is high enough to generate high fitness supergenes (when many simultaneous mutations with modest benefit are present in the population) and low enough that the supergenes will be fairly long. (Long supergenes would permit the combination of many modest benefit mutations into one high benefit supergene.) I'd like to see simulations of this process with different recombination rates and different levels of mutations of modest benefit.

Note that a mutation of moderate effect might act as the seed of a supergene. The supergene would slowly spread in a sub population and pick up nearby beneficial mutations of small effect. Over time the fitness of the supergene would increase and it would spread more rapidly. (This same process would filter out slightly harmful mutations.)

My mind kept nagging me about the term "supergene". "Supergene" usually refers to a group of nearby genes that share an epistatic relationship. When considering long term stable genetic patterns this makes sense as the epistatic relationship keeps the linkage from breaking. My usage is slightly different.

The linkage is only maintained while the component genes have a fitness advantage over wild type. Once the supergene sweeps the populace there is no longer a fitness advantage in maintaining the linkage. New beneficial mutations can replace component genes without damaging an epistatic relationship. This type of "supergene" is a temporary "selection" structure, not an epistatically reinforced, stable structure.