How big does big genetics need to be?

In the comments to a previous post defending big genetics, Andro Hsu relates an anecdote that warrants repeating:

IIRC, at the December NIH/CDC meeting Francis Collins suggested that the way to get to the bottom of the missing heritability, the common disease common variant hypothesis, gene-gene and gene-environment interactions, etc. etc. is to run a population-wide, 20-year longitudinal study in which genome-wide data and detailed environmental and behavioral minutiae were tracked for 100,000 participants.

The follow-up commenters starting with John Ioannidis each upped the sample size by an order of magnitude, until someone suggested that the entire U.S. population be sequenced, which it was then realized would require universal health care.

At that point, the meeting ended.

The scary part is that even a study that large might not be large enough: under certain models for the genetic architecture of complex traits (e.g. thousands of common risk variants with tiny effect sizes and substantial interaction between them) the whole US population (or even the entire world's population) might not be large enough to capture the missing heritability, especially given the challenges introduced by sample heterogeneity and measurement error in a study of that magnitude.
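
To give a rough sense of the scale involved, here is a minimal back-of-the-envelope sketch. It is not taken from the meeting or from any particular study. It assumes a quantitative trait, a single additive variant explaining a fraction `variance_explained` of trait variance, the conventional genome-wide significance threshold of 5e-8, and a simple normal approximation. The function name and its defaults are my own illustrative choices, and the calculation ignores interactions, heterogeneity and measurement error, all of which would push the required numbers higher.

```python
from statistics import NormalDist


def required_sample_size(variance_explained, alpha=5e-8, power=0.8):
    """Approximate N needed to detect one additive variant.

    Normal approximation, reasonable when variance_explained is small:
        N ~ (z_{1 - alpha/2} + z_{power})^2 / variance_explained
    All parameter names and defaults are illustrative assumptions.
    """
    if not 0 < variance_explained < 1:
        raise ValueError("variance_explained must be between 0 and 1")
    norm = NormalDist()  # standard normal distribution
    z_alpha = norm.inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_power = norm.inv_cdf(power)          # quantile for the desired power
    return (z_alpha + z_power) ** 2 / variance_explained


if __name__ == "__main__":
    # Hypothetical effect sizes, expressed as fraction of variance explained.
    for q in (1e-3, 1e-4, 1e-5):
        print(f"variance explained {q:g}: N ~ {required_sample_size(q):,.0f}")
```

Under these assumptions the required sample size scales inversely with the variance explained, so a variant explaining 0.01% of trait variance already calls for a sample in the hundreds of thousands, and each further tenfold drop in effect size multiplies that by ten.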

"the whole US population (or even the entire world's population) might not be large enough to capture the missing heritability"

the string theory of genetics!

The Genetics & Public Policy Center has received funding to study public attitudes toward a large, U.S. population-based study of genes and environment.

http://dnapolicy.org/news.release.php?action=detail&pressrelease_id=134

"A large, population-based study likely would involve the participation of hundreds of thousands of U.S. volunteers, who would be followed for a period of many years to ascertain and quantify the major environmental and genetic contributors to common illnesses."

By Andro Hsu on 26 Aug 2009

What's interesting to me is that the numbers Collins started out with sound very much like what we would expect to see if the Genomics and Personalized Medicine Act (GPMA), which had Obama as its sponsor in the Senate in 2006 and 2007 and was introduced in the House in 2008, is ever passed. With Collins at NIH and Obama in the White House, there's a real possibility that the GPMA, or something like it, will finally get off the ground.

If it requires 3000 million person-years of observation to detect, it ain't very important.

By antipodean on 15 Sep 2009