Graph theory & selection


A few days ago I posted about selection and population structure. The basic idea is to imagine demes, breeding populations, and consider how variation in the standard parameters such as selection coefficient and migration might affect the overall frequencies of the alleles. The paper, Fixation Probability and Time in Subdivided Populations, was rather "old school" despite the recourse to simulation. It emerged out of the theoretical population genetic tradition of R.A. Fisher & Haldane, and their successors starting with Kimura. It utilized diffusion equations and rested upon standard evolutionary genetic models of population structure such as the "Stepping-Stone." Today I'm going to take a different tack, standing upon the shoulders of our friend Martin Nowak, and his foray into Graph Theory. This post is derived in large part from his paper Evolutionary Dynamics on Graphs, as well as the equivalent chapter in his book Evolutionary Dynamics.

Nowak isn't focusing on demes per se here; rather, the nodes or vertices within the network can be thought of as individuals, or points from which the mutation might emerge. A background assumption here is that you're reasonably familiar with the Moran process and linear algebra, but if you aren't you can hum through pretty easily I think. There aren't any major algebraic manipulations here anyway.

[Image: figure 1b of the paper] This is figure 1b. You see the vertices in blue, while the directional arrows represent "edges." The matrix to the right is stochastic, so each row is a set of probabilities which sum to 1. Each element is w_ij, where i is the row and j is the column. It represents the probability that the offspring of individual i will replace individual j. Many of the elements, as you can see, are 0 because i's offspring cannot "replace" i itself (so the top leftmost element, w_00, is 0), and because some pairs of vertices have no edge between them. If you want to know the relevance of these probability matrices to evolutionary processes over time I suggest you consult the notes (also see Markov process). I just want you to keep in mind that Nowak's paper is focusing on the networks and their concomitant probability matrices.
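To make the matrix idea concrete, here is a minimal sketch in Python. The 4-vertex graph is made up for illustration (it is not the graph in figure 1b), and I'm assuming each vertex spreads its offspring evenly over its outgoing edges.

```python
def row_stochastic(weights):
    """Normalize each row of a non-negative weight matrix so it sums to 1."""
    W = []
    for row in weights:
        total = sum(row)
        if total == 0:
            # Assumption of this sketch: every vertex has somewhere to send offspring.
            raise ValueError("every vertex needs at least one outgoing edge")
        W.append([x / total for x in row])
    return W

# Hypothetical directed graph: edges[i][j] = 1 means there is an edge i -> j.
edges = [
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 1],
    [0, 1, 0, 0],
]

W = row_stochastic(edges)
for row in W:
    print(row)  # each row is a probability distribution over "who gets replaced"
```

The diagonal stays at 0 because no vertex here has an edge to itself, which is the "can't replace itself" point above.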

[Image: figure 2 of the paper] The above is from figure 2 of the paper, and it shows a number of graphs which I will give a quick overview of. But first, an equation:

$$\rho_1 = \frac{1 - 1/r}{1 - 1/r^{N}}$$

This represents the probability of fixation of a single new mutant in a population governed by the Moran process. N remains fixed, and at each step one individual is selected to reproduce with a probability proportional to its fitness (r for the mutant, relative to 1 for everyone else); its offspring then replaces a randomly chosen individual (you can see how this relates to the matrix above). The population is also homogeneous. Note that fixation is a matter of chance, governed by r and N; as with the classic 2s approximation for the fixation of a beneficial allele, there is a role for both deterministic selection and various stochastic factors (e.g., drift).
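If you'd rather poke at this numerically, here is a rough sketch: the formula as a function, plus a bare-bones simulation of the well-mixed Moran process to check it against. The function names and the parameter values are mine, chosen for illustration.

```python
import random

def rho_moran(r, N):
    """Fixation probability of one mutant with relative fitness r among N."""
    if r == 1:
        return 1 / N  # neutral limit of the formula
    return (1 - 1 / r) / (1 - 1 / r ** N)

def simulate_fixation(r, N, trials, seed=0):
    """Monte Carlo estimate; only the number of mutants m needs tracking."""
    rng = random.Random(seed)
    fixed = 0
    for _ in range(trials):
        m = 1
        while 0 < m < N:
            # Parent is chosen in proportion to fitness.
            parent_is_mutant = rng.random() < r * m / (r * m + (N - m))
            # The replaced individual is chosen uniformly at random.
            victim_is_mutant = rng.random() < m / N
            m += parent_is_mutant - victim_is_mutant
        fixed += (m == N)
    return fixed / trials

print(rho_moran(1.5, 10))                 # formula
print(simulate_fixation(1.5, 10, 2000))   # should land close to the formula
```

The simulated number wobbles from run to run (change the seed to see it), which is the drift part of the story.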

But the equation above applies to more than homogeneous (that is, panmictic) populations. In figure 2, graphs a, b and c have the same fixation probability as a homogeneous population, ρ1. This is because W, the stochastic matrix, is symmetric. More generally, if T, the 'temperature,' is the same for every vertex then the probability of fixation is ρ1 as well. Temperature measures the total weight of the edges leading into a vertex, or T_i = Σ_j w_ji. 'Hot' vertices, in orange above, are often replaced, while 'cold' ones, blue, are not. Graphs where all vertices have equal temperature are termed 'isothermal.' Graph d in the figure above shows a non-symmetric, but isothermal, network where the probability of fixation is ρ1.
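Here is a small sketch of the temperature idea, taking the temperature of a vertex to be the total weight of the edges pointing into it (the column sums of W). The two example graphs, a 4-vertex directed cycle and a 4-vertex star, are my own stand-ins rather than the exact graphs in the figure.

```python
def temperatures(W):
    """Temperature of vertex j = sum over i of W[i][j] (incoming weight)."""
    n = len(W)
    return [sum(W[i][j] for i in range(n)) for j in range(n)]

def is_isothermal(W, tol=1e-12):
    """True when every vertex has the same temperature."""
    T = temperatures(W)
    return max(T) - min(T) <= tol

# Directed cycle 0 -> 1 -> 2 -> 3 -> 0: not symmetric, but isothermal.
cycle = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
]

# Star: centre 0 picks one of three leaves evenly; each leaf points at the centre.
star = [
    [0, 1 / 3, 1 / 3, 1 / 3],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]

print(temperatures(cycle), is_isothermal(cycle))
print(temperatures(star), is_isothermal(star))
```

In the star the centre is very hot (everyone points at it) and the leaves are cold, so it is not isothermal.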

Obviously this is kind of a boring result, but not all roads lead to ρ1. Look at graphs f & g; their probability of fixation is 1/N. Why? This is pretty clear verbally: because of the direction of the edges, unless the mutant starts in the cold position it can't sweep through the population. The chance of a mutant occurring in the cold position is...you guessed it, 1/N. This is the familiar probability of fixation of a neutral allele. All you learn here is that some population structures, graphs, can theoretically prevent selection from fixing an allele. Additionally, if there are multiple cold positions upstream of a large number of hot positions then the probability of fixation is 0, since obviously a mutant in one cold position can never penetrate another.
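A quick way to convince yourself of the 1/N result is to simulate it. The sketch below runs the replacement process on an arbitrary stochastic matrix and tries it on a 5-vertex "burst" style graph: one cold root feeding four hot leaves. Giving the leaves self-loops is my assumption, made only so that every row of W sums to 1; the function name is also mine.

```python
import random

def simulate_on_graph(W, r, trials, seed=0):
    """Estimate the fixation probability of one mutant placed at a random vertex."""
    rng = random.Random(seed)
    n = len(W)
    vertices = list(range(n))
    fixed = 0
    for _ in range(trials):
        mutant = [False] * n
        mutant[rng.randrange(n)] = True  # mutation arises at a uniformly random vertex
        count = 1
        while 0 < count < n:
            fitness = [r if mutant[v] else 1.0 for v in vertices]
            parent = rng.choices(vertices, weights=fitness)[0]    # fitness-weighted
            child = rng.choices(vertices, weights=W[parent])[0]   # follows an edge
            count += mutant[parent] - mutant[child]
            mutant[child] = mutant[parent]
        fixed += (count == n)
    return fixed / trials

# Root 0 sends offspring to leaves 1..4; leaves only "replace" themselves.
burst = [
    [0, 0.25, 0.25, 0.25, 0.25],
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1],
]

print(simulate_on_graph(burst, r=2.0, trials=3000))  # hovers around 1/5 even though r = 2
```

Even a mutant with double the fitness only fixes when it happens to land on the root, so selection is switched off.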

[Image: figure 3 of the paper] But enough with throwing cold water on selection. How about structures which amplify the probability of fixation? You know you want some of that! Above is figure 3 from Nowak's paper. Pretty, huh?

The fixation probability for graph a, the "star structure," is:

$$\rho_2 = \frac{1 - 1/r^{2}}{1 - 1/r^{2N}}$$

Since r spans 0 to ∞, with 1 being population mean fitness, any beneficial mutation is effectively amplified from r to r^2. For example, 1.1 is converted to 1.21 (note that everything else remains as in ρ1). If you look at the network, the power of selection to take it over is pretty obvious: the central node acts as a mediator across the population.
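To put numbers on the amplification, here is a tiny sketch that plugs r and r squared into the same fixation formula. N = 100 and the r values are arbitrary choices of mine, and the star formula is taken at face value as quoted from the paper.

```python
def rho_moran(r, N):
    """Moran fixation probability for a single mutant of relative fitness r."""
    if r == 1:
        return 1 / N
    return (1 - 1 / r) / (1 - 1 / r ** N)

def rho_star(r, N):
    """Star-structure formula: identical, but with r replaced by r squared."""
    return rho_moran(r ** 2, N)

for r in (0.9, 1.0, 1.1):
    print(r, rho_moran(r, 100), rho_star(r, 100))
```

With r = 1.1 the fixation probability goes from about 0.09 to about 0.17, while with r = 0.9 the already tiny probability gets smaller still: the star helps good mutations and hurts bad ones.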

But things really get going when we hit graphs b, c and d, the "super star," "funnel" and "meta-funnel." Here's their fixation probability:

$$\rho_K = \frac{1 - 1/r^{K}}{1 - 1/r^{KN}}$$

K is a parameter set by the architecture of the graph. The star structure corresponds to K = 2; the latter three structures, as drawn, to K = 3. The important thing about these amplifiers is that as K grows (with N large) the probability of fixation of a beneficial allele converges upon 1! That means that a beneficial allele will fix, and a disadvantageous allele be eliminated! OK, back to earth. It's just a model...reality isn't a Moran process or perfectly defined by Graph Theory. But in any case, Nowak observes that these amplifying structures tend to have a few primary nodes which serve as shuttles for beneficial alleles. Good to know.
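For those who want to see the limits spelled out, here is a short worked version using nothing but the ρK formula above. Hold K fixed and let N grow; for a beneficial mutant (r > 1) the term 1/r^{KN} vanishes, and for a deleterious one (r < 1) it blows up:

$$\lim_{N\to\infty}\rho_K = \lim_{N\to\infty}\frac{1 - r^{-K}}{1 - r^{-KN}} = \begin{cases} 1 - r^{-K}, & r > 1 \\ 0, & r < 1 \end{cases}$$

$$\lim_{K\to\infty}\left(1 - r^{-K}\right) = 1 \quad (r > 1)$$

So with r = 1.1, a large population gives roughly 0.09 for K = 1, 0.17 for K = 2 and 0.25 for K = 3, creeping toward 1 as K increases.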

So what does this tell us? The easiest way to imagine this is that the vertices are individuals. But what if it is interdemic competition? In other words, a competition, extinction & replacement meta-population model. And could this apply to gene flow even without replacement (let's break out of the Moran process derived box for a moment)? Perhaps there are particular dynamics at work when there is asymmetrical gene flow, when the structure of demes is irregular, and so forth. Most readers know enough data that they could produce many conjectures trying to fit the data and theory together....

Reference: Evolutionary dynamics on graphs, Erez Lieberman, Christoph Hauert, & Martin A. Nowak, Nature, 433:20 January 2005


Razib, I'm not sure I understand what you are getting from these graphs...

It appears that these graphs represent substructure population advantages not related to genetics. E.g., over time the children of the town's wealthiest family may displace children of the poorer families. (See model F in figure 2.) The link weights represent the advantage of being born into the wealthy family. Then any mutation (not just a beneficial mutation) that originates in the wealthy family is more likely to fixate than one that originates in a poor family.

Other examples of such non-symmetric flow would be noble vs serfs, town vs. countryside, trade-route-nexus village vs. nearby villages, inhabitants of fertile land vs. non-fertile land.

I don't really see how the graph weights could directly represent positive selection. The link weights would change as soon as a highly beneficial mutation occurred, a hot node under the old weights would become a cold node. Once the cold node acquired the mutation then it would in turn become hot.

What if we consider link weights that have a fixed substructure component and a variable "fitness" component. Suppose a beneficial mutation occurs at one of the nodes. The selective advantage of the allele would temporarily alter the link weight, i.e., nodes that have the allele would be slightly more likely to displace the offspring of those that don't have the allele. Even if a node is hot and so is often replaced, the beneficial allele should still pass into a cold node and then sweep that cold node. I do see that when a beneficial mutation occurs in a wealthy family it has a much greater chance of surviving stochastic elimination than if it arose in a poor family.

The more complex "amplifying" graphs don't seem realistic. I doubt substructures with many one-way flows have been common in human history. Even a little reverse flow would allow beneficial alleles to introgress.

These graph models seem more appropriate for studying drift (where the link weights don't change) than for studying selection.

Guess I need to read the paper.

Whoops, "Once the cold node acquired the mutation then it would in turn become hot." should read, "Once a downstream hot node acquires the mutation its link-weight-sum changes and it becomes cold."

"The more complex "amplifying" graphs don't seem realistic. I doubt substructures with many one-way flows have been common in human history. Even a little reverse flow would allow beneficial alleles to introgress."

empirically i'm thinking africa here...i think it might be a cold node and that outside alleles have a hard time penetrating. i'm also thinking that some island biogeography models might work as analogs.

i am going to reread the paper and chapter and answer you more directly later so i make sure i know what nowak is getting to.

Stop using difficult equations and formulas!!! But on the bright side I adore the information. Take Care...