Jim Manzi on epistasis

By razib on October 30, 2008.

Jim Manzi has a long post up on epistasis, that is, gene-gene interactions:

We could call this process of competing algorithms struggling to find the best solution as fast as possible "meta-evolution". That is, each potential search method must compete for survival. The fact that the algorithm that has won this (idealized) competition in the real world has the form of a GA seems to indicate that there is some structure to the relationship between gene vectors and physical outcomes, but that it is much more complex than simple linear combinations without interaction terms, otherwise nature never would have evolved the evolutionary algorithm with all of its computational overhead. If epistatic interactions were not central, meta-evolution should have killed off evolution as we know it a long, long time ago.

That's a mouthful. Read the whole thing for context.

Remember that there are different ways to conceive of epistasis. Molecular biologists naturally think of epistasis as biochemical interactions between genes; e.g., locus A produces a gene product which modulates the activity of gene B. Obviously, if you consider the mechanistic processes of genes, they are interlaced with a nearly infinite number of epistatic cross-linkages. But from an evolutionary perspective you are thinking about something different: specifically, you're interested in interaction effects within traits which exhibit heritable variation. For example, you can conceive of a trait's genetic architecture as being additive and independent; each gene's effect has no relation to any other gene, and the effects are cumulative. This is obviously not true for all traits; changing the state at one locus might modulate a set of other loci.

R. A. Fisher and Sewall Wright are generally understood to have disagreed somewhat on the role of epistasis in evolution, with Fisher being more skeptical of its ubiquity than Wright. In particular Wright relied on epistatic effects in his Shifting Balance Theory which imagined populations exploring adaptive landscapes. Here is what James F. Crow, who knew both Fisher and Wright, told me several years ago:

1) In 2002 in "Perspective: Here's to Fisher, additive genetic variance, and the fundamental theorem of natural selection," you conclude, "is there any other quantity that captures so much evolutionary meaning in such a simple way?" in reference to additive genetic variance. And yet, what about other factors like statistical epistasis? Do gene-gene interactions pack enough of an evolutionary punch to be anything more than a footnote in God's Book? Have you seen Loren Rieseberg's work at Indiana which points to the importance of loci of large effect? [my question]

The remarkable thing about additive genetic variance is that it predicts the effect of selection, even in the presence of dominance and epistasis. Nature seems to follow least-squares principles. The result is that the additive component of variance pulls out of dominance and epistatic variance those components associated with allele frequency change under selection. Of course the theory is not exact, but it is a very good first approximation. Fisher did not ignore epistasis, as some have said; rather he showed how selection can utilize epistatic (and dominance) components of variance.

On a more technical level, Kimura showed that under selection with loose linkage the population rather soon attains a state in which the linkage-disequilibrium variance approximately cancels the epistatic variance. Thus, under this circumstance the effects of selection are better predicted by ignoring additive by additive epistatic variance than by including it. See my book with Kimura (1970, p. 217 ff).

I am aware of Rieseberg's work on sunflowers. QTL mapping and various other molecular methods are indeed finding alleles with large effect in many species. It is inevitable that the first genes discovered will be those with largest effect, so I expect alleles with smaller effects to follow. How large a part genes with large effect have played in evolution is still up in the air, as far as I know. But they are getting more emphasis now than in the recent past.

I am wary of saying much more; this is a complex topic. After all, consider that epistatic components of variation may be converted to additive genetic variance. If I had to guess, I would offer that perhaps epistatic dynamics play an essential role in what we would term speciation, while most microevolutionary action is along the dimension of additive genetic variance. But whatever the truth of it, I do not think that the importance of interaction effects negates the fact that as a first approximation a linear model can be highly fruitful. For a full understanding of the shape of reality we must map epistasis, but without it we may still have sight of the major landmarks necessary for our journey. Though I suppose that is conditional upon where we wish to go....
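As a toy illustration of how interaction between genes can still show up mostly as additive variance, here is a minimal sketch. Everything in it is an assumption made for illustration, not something taken from Crow or Manzi: a haploid organism, two loci in linkage equilibrium, and a phenotype that is purely the product of the two allele states, so an allele at one locus matters only when the partner allele is present at the other. The function name `variance_components` is hypothetical.

```python
from itertools import product

def variance_components(p1, p2):
    """Split the variance of y = x1 * x2 into additive and interaction parts.

    p1, p2: frequencies (strictly between 0 and 1) of the '1' allele at
    each of two haploid loci, assumed to be in linkage equilibrium.
    Returns (additive, interaction, total) variance.
    """
    genotypes = list(product((0, 1), repeat=2))
    freq = {(a, b): (p1 if a else 1 - p1) * (p2 if b else 1 - p2)
            for a, b in genotypes}
    pheno = {(a, b): a * b for a, b in genotypes}  # pure gene-gene interaction

    mean = sum(freq[g] * pheno[g] for g in genotypes)
    total = sum(freq[g] * (pheno[g] - mean) ** 2 for g in genotypes)

    def average_effect(locus):
        # With independent loci, the least-squares additive effect is the
        # mean phenotype of carriers minus the mean of non-carriers.
        def cond_mean(allele):
            weight = sum(freq[g] for g in genotypes if g[locus] == allele)
            return sum(freq[g] * pheno[g]
                       for g in genotypes if g[locus] == allele) / weight
        return cond_mean(1) - cond_mean(0)

    a1, a2 = average_effect(0), average_effect(1)
    additive = p1 * (1 - p1) * a1 ** 2 + p2 * (1 - p2) * a2 ** 2
    return additive, total - additive, total

for p in (0.5, 0.9):
    additive, interaction, total = variance_components(p, p)
    print(f"p={p}: additive share of variance = {additive / total:.3f}")
```

In this toy model, with both alleles at frequency 0.5 two thirds of the variance is additive, and at frequency 0.9 the additive share is about 95%, even though the underlying gene action is entirely interactive. It says nothing about how real traits behave; it only shows that the two notions of epistasis can come apart.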

"The fact that the algorithm that has won this (idealized) competition in the real world has the form of a GA seems to indicate that there is some structure to the relationship between gene vectors and physical outcomes, but that it is much more complex than simple linear combinations without interaction terms, otherwise nature never would have evolved the evolutionary algorithm with all of its computational overhead."

Wrong, nature didn't "evolve the evolutionary algorithm" since it's not just the biological organisms that make up the EA. Some parts of nature's EA are not optimisable by biological evolution while some are (sexual reproduction, mutation rate).

His 'factory' analogy is interesting but misleading. The goal is not just to "maximize output" but to keep the factory from blowing up. Also, the whole question of "what evolutionary process would we expect to prevail, given different types of epistasis" seems a bit backwards to say the least. Isn't it the other way around?

"Here's why I think it is plausible (though this is hardly a proof) that the nature of genetic evolution indicates that the actual functioning of the human genome is closer to the complex end of this spectrum than the simple end."

Or: The evidence showing otherwise doesn't matter because I can speculate capriciously.

Manzi's embarrassing "The Fallacy of Genetic Determinism" article shows that he is not trustworthy.

In Manzi's mind acceptable standards of evidence can be shifted around to whatever degree through convenience of preference alone:

"Thus we could sequence the DNA of all 6.7 billion human beings and still not know... [anything more than] a social-science study showing a correlation between watching lots of violent TV and aggressiveness?"

"... there are severe limits to what we know and can know ... First, there is the complexity of the genetic process. As Jim J. Manzi pointed out in a recent essay in National Review, if a trait like aggressiveness is influenced by just 100 genes, and each of those genes can be turned on or off, then there are a trillion trillion possible combinations of these gene states."

Wow, the David Brooks. Narrow sense heritability must not exist. With such omni-vacillating unstable genetic chaos isn't it extraordinary that evolution can happen at all. A genetic variant makes me stronger, but my son weaker, and my other son die of brain cancer. How do it spread?

In this new modern environment of ours it's astounding the genes originally selected for lactose digestion don't give us green hair instead. Hopefully global warming won't interact with our genes in some surprising new, yet utterly plausible, way and morph us into some tiny lizard-like creature. It's all so complex, every scenario is equally likely, except, of course, the one where we can make predictions and keep new discoveries coming on the expectations of tractable naturalistic regularity.

windy:

Thanks for the thoughtful comments.

How do you know that nature didn't "evolve" this? (note that "evolve" this doesn't mean using the specific algorithmic operators of crossover, mutation and so forth, as we are using the term here at a higher level of abstraction). It seems to me that different methods of evolution (i.e., inheritable change in organisms over time in response to environments) should be in competition in nature.

I agree with the limitations of the factory analogy. In the original article that is linked I go into some of these simplifications, and in the context of the relevant argument in that article (which is different than the argument here) argue that they do not materially impact the argument.

Jason:

I'm sorry that you find my work embarrassing.

I'm pretty confident that I've never said that h^2 doesn't exist. (In fact I've asserted that heritability is a common-sense idea, and that observation of it at some crude level probably predates writing). The problem that I've raised is basically a version of the knowledge problem. I have argued in some detail that we (currently) have a limited capacity to quantify reliably the causal relationship between named loci on the human genome and normal mental states. For example, we do not have replicated experiments demonstrating that if I physically manipulate the following list of genes in the following manner, I will produce the following change in non-pathological mental states in the experimental subjects.

The point of the specific post to which Razib links is somewhat different than this, however. It is an argument that the algorithmic machinery of crossover, mutation and so forth seems like an incredibly inefficient method for searching the space of possible genomes if the structural relationship between a given vector of genes and phenotype outcomes can be represented by linear equations without interaction terms. I don't see anything in your response that addresses this, but I may not be understanding your points.

Razib:

Fascinating. I've been trying to work my way through some of your references. I'm not (close to) done.

Here's a (non-rhetorical) question: if additive variance without interaction effects is a close engineering approximation to the causal map between genome and normal phenotypic mental states, then why haven't we discovered them through GWAS? It seems to me that under this assumption they should pop right out.

then why haven't we discovered them through GWAS?

tiny effect size. REALLY tiny. way tinier than this. lots and lots of tiny QTLs. linkage based studies were well designed to pick up rare alleles of larger effect. never found anything. GWA can hit more common alleles of smaller effect. not finding anything. the genetic component of variation in the DNA sequence for IQ probably is dispersed through on the order of 1/3 of the genic regions.

i think it is different for personality, look at the drd loci. i think that's perhaps because there is much more selection for a stability of distinct morphs within a population for personality, so you have a small number of large effect alleles segregating in the population. modulating temperament by changing dosage might be a lot easier and have fewer pleiotropic effects than shifting general intelligence by substituting at a locus of large effect.

"I'm pretty confident that I've never said that h^2 doesn't exist."

h2 contradicts your argument that GWAS have been unsuccessful in identifying genes associated with 'mental states' because of massive epistasis.* Intelligence is almost entirely additive. Heritability of intelligence calculated from identical twins raised apart (who share all their gene-gene interactions) is not higher than heritability calculated from ordinary siblings raised apart, who do not share those interactions.

And here's the thing, both data and theory indicate this is the case more generally. Here is an important paper on this from PLoS Genetics earlier this year:

Data and Theory Point to Mainly Additive Genetic Variance for Complex Traits

We address a long-standing controversy and paradox about the contribution of non-additive genetic variation, namely that knowledge about biological pathways and gene networks imply that epistasis is important. Yet empirical data across a range of traits and species imply that most genetic variance is additive. We evaluate the evidence from empirical studies of genetic variance components and find that additive variance typically accounts for over half, and often close to 100%, of the total genetic variance. ... We conclude that interactions at the level of genes are not likely to generate much interaction at the level of variance.

* The real reason GWAS have been unsuccessful as of yet in finding variants for complex traits (not just for intelligence) is that the sample sizes need to be much larger, to detect common loci of very small effect.

Heritability of intelligence calculated from identical twins raised apart (who share all their gene-gene interactions) is not higher than heritability calculated from ordinary siblings raised apart, who do not share those interactions.

That's actually quite impressive. Do you have a source for this result so I can read more about it?

Razib:

If I understand you correctly, your point could be crudely summarized as follows:

Mental states are believed to be impacted by (to take your IQ example) on the order of 7,000 genes. If each of these contributes ~1/7,000th of the difference in the phenotypic mental outcome, but has no material interaction terms, then a GWAS with a sample size of a few thousand in the effect and control groups will never identify them. If we do the power calculations, however, there is a theoretically feasible sample size at which a GWAS could find these genes, again given the assumption of no material interactions. Once we scale up the studies, we will find them.

Is that about right?

If so, until we have accomplished this how do we know that this is true? (Again, a non-rhetorical question, as I assume you will point to some prior work that indicates that this is the reasonable assumption to make about what these future GWAS studies will find, as opposed to one contrary possibility that the interaction effects will be material.)
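The power calculation gestured at in this comment can be roughed out in a few lines. This is a back-of-the-envelope sketch under assumptions that are not the commenters': a quantitative trait, a single variant explaining a fraction `var_explained` of phenotypic variance, a normal approximation to the test statistic, the conventional genome-wide significance threshold of 5e-8, and 80% power. The function name `required_sample_size` is hypothetical, and reading "1/7,000th of the difference" as a share of phenotypic variance is itself a simplification.

```python
from statistics import NormalDist

def required_sample_size(var_explained, alpha=5e-8, power=0.8):
    """Approximate N needed to detect one variant in a quantitative-trait GWAS.

    var_explained: fraction of phenotypic variance due to the variant (assumed).
    alpha: two-sided significance threshold (5e-8 is a common genome-wide choice).
    power: desired probability of detecting the variant.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_power = std_normal.inv_cdf(power)
    # Normal approximation: require sqrt(N * q / (1 - q)) >= z_alpha + z_power
    return (z_alpha + z_power) ** 2 * (1 - var_explained) / var_explained

# Extreme case from the comment: 7,000 loci of equal, purely additive effect
n = required_sample_size(1 / 7000)
print(f"approximate sample size: {n:,.0f}")
```

Under these assumptions the answer comes out on the order of a few hundred thousand individuals, far beyond a study with a few thousand in each group. If heritability is below 1, each locus explains less than 1/7,000th of the phenotypic variance and the required sample grows accordingly.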

How do you know that nature didn't "evolve" this? (note that "evolve" this doesn't mean using the specific algorithmic operators of crossover, mutation and so forth, as we are using the term here at a higher level of abstraction). It seems to me that different methods of evolution (i.e., inheritable change in organisms over time in response to environments) should be in competition in nature.

They are, partially. But organisms can't change all the parameters of evolution because part of it is implemented by physical factors. For example, you can never exclude the "flipping switches randomly" guy (mutation) completely, you mostly try to limit the damage that he does.

When we do see different methods of evolution competing, like asexual and sexual reproduction, it is often the less 'efficient' method that wins over time! Also, epistasis evolves too, and I'd guess that it is generally easier to change epistatic interactions than to change the whole mode of evolution. So you can't say "the mode of evolution must have evolved to match the underlying epistatic interactions which have stayed constant."

In fact, evolution implies a conclusion towards the other end of the spectrum - organisms simply can't afford completely unpredictable interactions since most random interactions are likely to be detrimental to survival.

John Hawks quotes an explanation of the two views of epistasis that razib was talking about:
"The role of epistasis in adaptive evolution has been a controversial issue ever since Sewall Wright and R.A. Fisher first formalized their views in the early 1930s. According to Wright (113, 114), natural selection retains favorably interacting gene combinations. Therefore, as a result of the highly integrated nature of the genome, selection may lead to the production of what Dobzhansky (43) has termed "coadapted" gene complexes. In contrast, Fisher (48) argued that natural selection acts primarily on single genes, rather than on gene complexes. In Fisher's view, therefore, selection favors alleles that elevate fitness, on average, across all possible genetic backgrounds within a lineage. Such alleles have been termed "good mixers"..."

But even the Wrightian view isn't that epistatic interactions are complex and ineffable, it's more that some interactions are much better than others and these come to take over. You could also look up canalization.

I second toto's request for Jason. I've somehow managed to never discover this despite having a passing familiarity with the literature. I seem to remember Devlin et al estimating the narrow sense heritability at 0.35 from siblings and Plomin pegging the broad sense at about 0.7 with twin studies. Am I misremembering or is one or both of them wrong?

Maybe everyone is already onto this, but it is extremely interesting to read Cosma Shalizi's very austere take on the epistemology of heritability, separated monozygotic twin studies, etc. I'm not saying I necessarily swallow every word of it, or that I don't, but it's a damn fine read. Google [cosma 'yet more on the heritability and malleability].

Do you have a source for this result so I can read more about it?

I've put together convergent numbers from studies using various kinds of kinship for later, but until then:

DZ twins raised apart. Heritability = .76.
MZ twins raised apart. Heritability = .75.

http://www.ncbi.nlm.nih.gov/pubmed/9549239

Thanks, Jason.

From the reference, just so everyone else can see:

"The weighted mean (0.38) is close but somewhat less than the typical value reported for same-age first-degree relatives reared together - if twins are excluded (Bouchard and McGue 1981). Doubling the correlation for DZ twins reared apart yields a heritability of 0.76, a value close to the heritability estimated by the correlations for MZ twins reared apart."

(For anyone who's unclear, the idea here is that if you assume the heritability is entirely additive, the correlation between those with an average of 50% identity-by-descent should be half that of those with 100% identity once shared environment is eliminated.)
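That logic fits in a couple of lines. A minimal sketch, assuming purely additive effects and no shared environment; the function name `additive_heritability` is hypothetical, the 0.38 correlation is the figure quoted above, and 0.5 is the expected identity-by-descent for DZ twins.

```python
def additive_heritability(phenotypic_correlation, relatedness):
    """Heritability implied by a correlation between relatives reared apart.

    Assumes all genetic variance is additive and no environment is shared,
    so the expected correlation is relatedness * heritability.
    """
    return phenotypic_correlation / relatedness

# DZ twins reared apart share on average half their alleles by descent
dz_estimate = additive_heritability(0.38, 0.5)
print(round(dz_estimate, 2))
```

If non-additive effects were large, the estimate from MZ twins reared apart (0.75 in the figures Jason cites) would be expected to sit noticeably above the doubled DZ correlation. In these data the two are about the same.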

sure, the only quibble is that there might be a distribution of effects. so that on average the contribution is 1/7,000th, but there are some relatively large effect QTLs.

If so, until we have accomplished this how do we know that this is true?

jason's point about narrow sense heritability is on point. epistatic variation is outside of this (sometimes collapsed into environment). if narrow sense heritability is high then it is mostly additive and independent (linear) effects.

Razib:

"sure, the only quibble is that there might be a distribution of effects. so that on average the contribution is 1/7,000th, but there are some relatively large effect QTLs."

I was purposely taking the extreme case most amenable to the argument that the effect of any one gene is "very tiny" which is why it has escaped identification in GWAS.

"jason's point about narrow sense heritability is on point. epistatic variation is outside of this (sometimes collapsed into environment). if narrow sense heritability is high then it is mostly additive and independent (linear) effects."

Again, my question sounds rhetorical, but I don't mean it that way - isn't this assuming the conclusion?

"Again, my question sounds rhetorical, but I don't mean it that way - isn't this assuming the conclusion?"

Narrow sense heritability is a measure of the degree to which you can predict offspring trait values using a linear model when you know the trait values of the parents and the population. E.g. if the population average value for a trait is 100, and when two parents have an average value of 140 you can expect offspring to have a trait distribution centered around 130, narrow heritability is 0.75.
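The arithmetic in that example can be written out as a short sketch. The function names are hypothetical, and the calculation assumes the simple linear midparent-offspring relationship described in the paragraph above.

```python
def narrow_sense_heritability(pop_mean, midparent, offspring_mean):
    """Fraction of the parents' deviation from the population mean
    that shows up, on average, in their offspring."""
    return (offspring_mean - pop_mean) / (midparent - pop_mean)

def predict_offspring_mean(pop_mean, midparent, h2):
    """Linear prediction of the expected offspring trait value."""
    return pop_mean + h2 * (midparent - pop_mean)

h2 = narrow_sense_heritability(pop_mean=100, midparent=140, offspring_mean=130)
print(h2)
print(predict_offspring_mean(pop_mean=100, midparent=140, h2=h2))
```

The first call recovers the 0.75 from the example; the second runs the same relationship forward to predict offspring from parents, which is the sense in which narrow heritability measures how well a linear model works.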

Broad-sense heritability is what we get by looking at monozygotic twins, who are matched for genotype, and includes the effects of dominance and epistasis. If broad heritability were much greater than narrow, then we would know that non-additive effects like epistasis and gene-environment interactions played large roles.

Empirically, however, narrow heritability for IQ is over 75% of broad heritability in the Western populations used for twin studies: you can take parents' IQ scores and predict offspring IQ using a linear model. Thus we should expect high-powered GWAS using whole-genome sequencing and large sample sizes to be capable of picking up most of the genetic influence. Bioinformatics, the structural organization of the genome, and other sources can reduce the brute force sample sizes required. Also, once many alleles have been identified, their effects can be controlled for in estimating the effects of still rarer alleles.
