Small delays due to Big Genetics

[Added in edit in response to concerned emails: The original title was deliberately provocative, and contrary to the message in the text; I apologise for any misunderstanding. I've largely rewritten the post to make my point more clearly.]

One of the curious and paradoxical effects of Big Genetics projects like the 1000 Genomes Project - which plans to generate low-coverage whole-genome sequences for ~1,500 people by the end of this year, providing a map of human genetic variation of unprecedented resolution - is that while they considerably accelerate research in the long term, they can actually slightly delay some research projects in the short term.

Over the last six months I've heard four separate researchers note in presentations or conversations that they have abandoned large-scale sequencing projects performed as part of a larger association study, since any data generated by their sequencing would be made largely or entirely obsolete by 1000 Genomes. Essentially, the scientists are (understandably, given limited scientific resources) temporarily setting a project aside and waiting for the 1000 Genomes project to discover the variants they need, before resuming their gene discovery efforts.

In some cases 1000 Genomes will actually generate the data faster than the researchers could themselves, thus accelerating their research - but this is not always the case. For some projects, avoiding the duplication of work that is already being done by a Big Genetics consortium will actually delay their project, albeit usually only for a few months. 

I gather from informal conversations I had at AGBT last week that previous Big Genetics projects had a similar effect: at least a few research groups held off disease gene mapping studies while they waited for the completion of the Human Genome Project, and some large-scale SNP discovery and genotyping studies were similarly put on hold while labs awaited data from the HapMap project.

Of course, I'm not arguing that this is in any way a reason not to do big genetics projects - there's no question that the overall scientific outcome of all of these projects far, far outweighs any costs due to short-term research delays. In addition, it's hard to see how this short-term effect could possibly be avoided - it's simply an inevitable consequence of any large-scale collaborative project generating a resource for the scientific community that would otherwise be constructed piecemeal by many separate research groups. (Nor am I saying, by the way, that 1000 Genomes could have been completed any faster than it has - I actually think the pace of progress has been astonishing, and its free release of data during the process has substantially reduced the likelihood of research delays.)

Finally, I think I need to emphasise that Big Genetics creates many opportunities for researchers to expand on the details of massive data-sets; I've said previously that "Big Genetics generates far more data than its participants can ever hope to analyse themselves, and the hefty remainder is fodder for a plethora of small labs exploring small but important facets of the bigger picture."

Still, it makes me wonder how many researchers (and especially graduate students) have had their research suddenly change direction, while their main project is put on hold to await results from some collaborative behemoth. How many readers have experienced this themselves?

Why is this a bad thing?
It's not like those researchers would now sit around and do nothing.
The researchers will instead put their resources somewhere else to work where they can hope for a better long term payoff.

By ChristianK (not verified) on 15 Feb 2009

While perhaps not the sort of massive project you describe, the PGP10 is certainly high profile. Despite reassurances of 'any day now', it's been a long time since that data was produced, and yet most of it is still offline.

Interesting point which highlights the front line of the war between 'big science' and 'small science' advocates. I'm with you, in believing that these large collaborations are worthwhile and even necessary for the advancement of science.

With sequencing in particular, there's something to be said for economies of scale. I have to imagine that the 1000 Genomes Project will get this data out more cheaply and with uniform standards, as opposed to 20 different labs doing it, each releasing data in a different format, with different quality metrics, etc. As a bioinformatician, I know which data set I'd rather work with.

I honestly think that we're better off having the grunt work (massive sequencing) done by specialized centers.

(Disclaimer: I'm at BCM, a major player in the 1000 genomes, but am not involved with the project)

Hi Christian,

I don't really think it's a bad thing overall (despite my deliberately provocative title).

You are correct that researchers will simply shift their focus onto other topics, although the end result may be substantial delays for individual projects.

Mainly, though, I just think it's an interesting effect, and I'm wondering how common it is.