Gene Expression

Longer term effective population

In a few posts below I mentioned long term effective population. The effective population is basically the breeding population as opposed to the census size, and depending on the species the two can differ quite a bit. One important point to consider (and this is obviously relevant to inbreeding and genetic diversity) is that the breeding generation alive today must be placed in its historical context: how many ancestors does this population have?

For non-overlapping generations, long term effective population size is the harmonic mean of the effective population sizes of each generation. It is defined by the following equation:

$$\frac{1}{N_e} = \frac{1}{t} \sum_{i=1}^{t} \frac{1}{N_i}$$

Ne is the long term effective population (it appears in the equation as its reciprocal, 1/Ne), t is the number of generations, and Ni is the effective population in a given generation i, for i from 1 up to t.

Consider a population sampled over 10 discrete generations. Now, fix its effective population as 10 in every generation. What’s the long term effective population? As I said, it’s the harmonic mean, and in this case it is equal to the arithmetic mean (the conventional average, 10).

But what happens if you drop the effective population to 5 in one of the generations? If you took the arithmetic mean you would get 9.5. Since all other generations are 10, that makes “sense.” But what about the harmonic mean? That’s about 9.1. How about two generations of 5? The arithmetic mean drops to 9, but the harmonic mean is now 8.33. OK, how about two generations of 3, and eight of 10? Now the arithmetic mean is 8.6, but the harmonic mean is 6.82. So you get the picture: the harmonic mean is never higher than the arithmetic mean, and it is pulled down hardest by the smallest values. Now let’s consider an extreme case: the effective population is 1000 individuals in nine out of ten generations, but in one generation it drops to 25! Arithmetic mean: 902.5, harmonic mean: 204!
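The arithmetic above is easy to check for yourself. Here is a minimal Python sketch; the function name and the scenario labels are mine, purely for illustration, and the third case is taken as two generations of 3 plus eight of 10, since that is what reproduces the quoted 8.6 and 6.82 over ten generations.

```python
def arithmetic_mean(sizes):
    """Conventional average of the per-generation effective sizes."""
    return sum(sizes) / len(sizes)


def long_term_ne(sizes):
    """Harmonic mean of the per-generation effective sizes:
    t divided by the sum of the reciprocals 1/Ni."""
    return len(sizes) / sum(1.0 / n for n in sizes)


# Ten discrete generations in every scenario (labels are illustrative).
scenarios = {
    "constant 10": [10] * 10,
    "one generation of 5": [10] * 9 + [5],
    "two generations of 5": [10] * 8 + [5] * 2,
    "two generations of 3": [10] * 8 + [3] * 2,
    "one generation of 25": [1000] * 9 + [25],
}

for label, sizes in scenarios.items():
    print(f"{label:22s} arithmetic={arithmetic_mean(sizes):8.2f}  "
          f"harmonic={long_term_ne(sizes):8.2f}")
```

Python's standard library also ships `statistics.harmonic_mean`, which should give the same answers; the hand-rolled version is only there to make the reciprocals visible.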

[Image: Bottleneck.jpg]

What does this have to do with biology? Two words: population bottleneck. The effective population of 25 is the bottleneck generation, which drives the long term effective population down. What is the biological significance of this? A drastic reduction in genetic diversity. If you consider the genomes of all breeding individuals in the population as the sample space whose complexity encodes information, then as you decrease from 1000 individuals to 25 you shrink the sample space to 2.5% of its previous size. When the population expands in the subsequent generation you increase the sample space again, but information has been lost in the bottleneck generation. The genome is discrete, not analog, so a) the nucleotide state space of 25 individuals is not likely to represent the proportions of the previous generation, due to sampling error, and b) the very limited nature of the sample space means it is likely that rare variants will not make it past the bottleneck generation. If you take a 25 X 25 pixel bitmap image and resize it to 100 X 100 you don’t “rescale” appropriately as you would with a vector image, which calculates the shape and contour on the fly; rather, you get a pixelated and distorted image. Similarly, the bounce back generation is a shadow of the information content of the pre-crash population.

Over time mutation replenishes variation, to the tune of 10^-8 per gene, and as much as 10^-4 on polygenes. Populations can obviously recover from crashes given enough time. Large numbers are crucial for selection to operate: a large population fosters a wider realized range of variation, because there is the potential for a greater number of positively selected mutations, and it dampens the stochastic effects which might further expunge genetic variation and lead to fixation of deleterious alleles.
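The point above about rare variants failing to make it past the bottleneck generation can be put in rough numbers. The sketch below rests on assumptions of mine that are not in the post: a diploid population, a breeding generation whose gene copies are a random draw from the previous generation's allele frequencies, and a hypothetical variant sitting at a frequency of 1%. Under those assumptions the chance that the variant is missed entirely is (1 - p) raised to the number of gene copies sampled.

```python
def prob_allele_lost(freq, n_breeders, ploidy=2):
    """Probability that a variant at frequency `freq` is absent from every
    gene copy carried by `n_breeders` individuals, assuming each copy is an
    independent random draw (an idealization, not a claim about real data)."""
    copies = ploidy * n_breeders
    return (1.0 - freq) ** copies


rare = 0.01  # hypothetical frequency of a rare variant before the crash

for n in (1000, 25):
    print(f"breeders={n:5d}  P(variant lost)={prob_allele_lost(rare, n):.3g}")
```

With 1000 breeders the variant is all but guaranteed to be sampled, while with 25 breeders it is lost more often than not, which is the sense in which the bounce back generation cannot recover what the bottleneck threw away.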

Addendum: There are various types of effective population and skew. Here is the formula which governs effective population in the case of imbalanced breeding sex ratios:

$$N_e = \frac{4 N_m N_f}{N_m + N_f}$$

Nf is the breeding population of females and Nm that of males. This ratio is relevant because many organisms exhibit far greater reproductive skew among males than among females, and our own species exhibits this to a mild extent (as implied by our sexual dimorphism). Imagine a population where the breeding population of males is 25, and that of females is 100. The formula above implies that the real effective population (in the genetic sense, since fewer males are ancestors for the next generation) is 80, not 125. Again, consider the biological reality here: males contribute far more than “their share” to the next generation if they do contribute at all, so the male genomic proportion across the offspring is more genetically homogeneous, because it is derived from the variation in a smaller initial set. Though we all (as humans) have an equal number of male and female ancestors, it seems we have more unique female ancestors, because particular males recur throughout lineages more often owing to their outsized demographic footprint.
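To play with the sex ratio yourself, here is a small Python sketch. It assumes the formula referred to above is the standard Ne = 4 Nm Nf / (Nm + Nf), which does give the 80 quoted in the text; the function name and the extra example values are mine.

```python
def sex_ratio_ne(n_males, n_females):
    """Effective population size given unequal numbers of breeding
    males and females: 4 * Nm * Nf / (Nm + Nf)."""
    return 4.0 * n_males * n_females / (n_males + n_females)


print(sex_ratio_ne(25, 100))   # the example in the text: 80.0
print(sex_ratio_ne(50, 50))    # equal sexes: Ne equals the head count, 100.0
print(sex_ratio_ne(1, 1000))   # one breeding male: Ne stays just under 4
```

Notice that the rarer sex sets the ceiling: with a single breeding male the effective population can never reach 4, no matter how many females there are.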