To the barricades in defence of Big Genetics

By dgmacarthur on August 19, 2009.

Over at Gene Expression, p-ter has a post up defending the "big genetics" approach, noting that large-scale hypothesis-free genetics studies have consistently yielded important results for follow-up detailed fine-scale studies.

It's a sound argument. I've argued in the past that many of the fears expressed about Big Genetics are overblown:

Will Big Genetics eventually swallow the entire field, as some critics of the Human Genome Project argued towards the end of the last millennium? I'd argue that this is unlikely, and that in fact the Big Genetics approach carries within it the seeds of its own constraint.
My reasoning is this: firstly, the sheer size of these projects encourages the emergence of a public data-sharing mentality that now (thankfully) permeates most of the field, because, with no one group feeling complete ownership of the resulting data, there are fewer barriers to the idea of dumping it all online for the benefit of the community as a whole. The free release of data into the research community, like an influx of nutrients into an ecosystem, ultimately results in the increased availability of niches for researchers to exist in. Basically, Big Genetics generates far more data than its participants can ever hope to analyse themselves, and the hefty remainder is fodder for a plethora of small labs exploring small but important facets of the bigger picture.

The vast number of small-scale studies that have relied on the human genome reference sequence or the HapMap is an obvious testament to this process. We are also beginning to see small groups seize on the wealth of data from genome-wide association studies to drive both targeted genetic studies and functional and mechanistic analyses. The increasing hunger of high-impact journals for multi-disciplinary research will ensure that the drive for collaboration is always there, but groups won't need to be absorbed within these massive consortia in order to take advantage of their data output.

My guess is that - contrary to the Big critics - the human genetics ecosystem will continue to fluctuate around an equilibrium point marking a fairly comfortable balance between Big and Small Genetics. However, the crucial symbol in the equation is the free release of data, meaning anything that interferes with open data access is a threat to the research community as a whole - so we need to be wary both of an excessive focus on commercialisation within academia, and of well-meaning but excessive attempts to control the flow of data, like this.


There are a couple of fallacies thrown around as if they were facts when critics argue about this point. The major one is the notion that it was a general assumption amongst the proponents of large scale science that this was the answer to diseases like cancer, heart disease, diabetes or genetic disorders. I can't remember anyone seriously suggesting at the time that there would be rapid cures coming out of the genome project.
The other fallacy is the ridiculously overinflated role now assigned to Craig Venter in pushing the genome project to completion. I think he's a smart guy with some good ideas but come on, Celera's initial genome was a mess, full access to it cost a fortune, and their gene prediction was a joke.

Seems to me that the main objection about FC's appointment isn't big vs small genetics: it's the fact that he's a *geneticist*. By analogy, Zerhouni's appointment was the triumph of imaging at the expense of other diagnostic forms, and his resignation was the MRI's fall from grace.

Really, is this what political/policy discourse has devolved into? Pah.

We are also beginning to see small groups seize on the wealth of data from genome-wide association studies to drive both targeted genetic studies and functional and mechanistic analyses.

Is there any decent evidence that the vast sums of time, money, and effort spent on Genome Wide Association studies are actually any better at driving targeted genetic studies and functional and mechanistic analyses--which are the things that are actually required to get at real biology--than the fast, free, and easy process of Wild Ass Guessing?

20 years of candidate gene association studies for complex diseases = less than a dozen reliable associations, and a whole lot of wild goose chases that wasted time and resources (how many knockout mice were made to study non-existent genetic associations, I wonder?).

2 years of genome-wide association studies = well over 400 replicated genetic variants associated with more than 70 complex traits and diseases, most of which are now the target of fairly intensive mechanistic follow-up.

QED.

It's also worth emphasising that wild-ass guessing was neither fast, nor free, nor necessarily easy. :-)

yes. this is not up for debate in the field--Wild Ass Guessing was an abject failure, and GWAS work infinitely better (admittedly, it's not hard to do much better than abject failure). i echo daniel's comment; also, for a couple concrete examples, see my post linked above.

IIRC, at the December NIH/CDC meeting Francis Collins suggested that the way to get to the bottom of the missing heritability, the common disease common variant hypothesis, gene-gene and gene-environment interactions, etc. etc. is to run a population-wide, 20-year longitudinal study in which genome-wide data and detailed environmental and behavioral minutiae were tracked for 100,000 participants.

The follow-up commenters, starting with John Ioannidis, each upped the sample size by an order of magnitude, until someone suggested that the entire U.S. population be sequenced, which, it was then realized, would require universal health care.

At that point, the meeting ended.

Perhaps Celera's initial assembly was a mess (and I'm not trying to defend it; I just don't have a basis for comment), but Celera's formation caused an (IMHO) undeniable acceleration of the public effort.

The crisis is that when we look at things that are straightforwardly hereditary, for example height, we don't get much. We should. Why do we not?

We should be able to predict someone's height from their genome at least as accurately as by looking at their parents - and similarly for their IQ, the shape of their nose, and so on and so forth. What is the problem? Anything that is sufficiently straightforward for breeders to select for in rats, we should be able to predict from genes.

