The next generation of genome-wide association studies

BioArray News (subscription required) reports that genomic analysis technology provider Illumina has launched a new family of genotyping chips designed to simultaneously assay 4 million sites of variation in the human genome.

The chips are a major step up from the 1-million-feature chips that
currently represent the state of the art, and take advantage of several
public projects generating catalogues of human genetic variation (such
as the 1000 Genomes Project).
Illumina has also increased the density of markers in and around genes,
and fleshed out regions that have previously been associated with
complex traits and diseases.

The new chips are specifically designed to increase the coverage of two
types of variants that tended to be poorly captured by previous
generations of chips: rare variants, and structural variation.

Chip-based genotyping is very much a place-holder technology while we
wait for whole-genome sequencing to become cheap and accurate enough to
use for large-scale studies. Illumina clearly expects the market to
persist for at least another couple of years before sequencing takes
over completely:

There might be some customers who will hold off for the next generation
of arrays. "We think it will be a year to a year and a half until all
the content is out there and we arrive at a penultimate array that has
the content that everyone will want," [Illumina CEO Jay Flatley] said.

Of course, Illumina is well-placed to ride the wave regardless of when
the sequencing transition occurs; in addition to its genotyping
products it provides the most successful current second-generation
sequencing technology, the Genome Analyzer, and has secured an exclusive marketing contract for one of the most promising third-generation platforms, Oxford Nanopore.

Bigger is not always better
The BioArray News article also notes that the most recent generation of genotyping chips (the 1M series, with one million features) has "not seen adoption ... to the extent of other chips". There's a good reason for that, which is spelled out in an article in PLoS Genetics this week: despite the increased number of variants on the 1M chip, its value for money (in terms of power for a fixed study cost) is actually lower than that of earlier-generation chips.

Here's a table from that paper illustrating this point:

[Table image: GWAS cost comparison]

The table assumes a fixed budget of $2 million for genotyping. Despite having nearly twice the number of markers, the Illumina 1M chip actually has substantially lower power than the earlier-generation 610K chip, for a simple reason: the 1M chip is almost twice as expensive, so researchers have to settle for genotyping far fewer individuals, and the increased power from adding more markers doesn't make up for this drop in sample size.
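To make the trade-off concrete, here's a rough sketch in Python. Only the $2 million budget comes from the table; the per-sample prices, the coverage values (mean r² between chip markers and the causal variant), the allele frequency and the odds ratio are all hypothetical numbers picked for illustration, and the power calculation is a simple normal approximation for an allelic case-control test rather than the method used in the paper.

```python
from statistics import NormalDist

BUDGET = 2_000_000  # fixed genotyping budget in dollars, as in the table

# Hypothetical per-sample prices and tagging coverage (mean r^2); NOT from the paper.
CHIPS = {
    "cheaper chip": {"price": 400.0, "r2": 0.90},
    "pricier chip": {"price": 700.0, "r2": 0.93},
}


def samples_affordable(budget, price_per_sample):
    """Number of individuals that can be genotyped for a given budget."""
    return int(budget // price_per_sample)


def approx_power(n_cases, n_controls, maf, odds_ratio, r2, alpha=5e-8):
    """Normal-approximation power of a two-sided allelic case-control test.

    maf is the control allele frequency; r2 scales the effective sample
    size to account for imperfect tagging of the causal variant.
    """
    p0 = maf
    p1 = odds_ratio * p0 / (1 - p0 + odds_ratio * p0)  # case allele frequency
    # Each individual contributes two alleles, hence the factor of 2.
    se = (p0 * (1 - p0) / (2 * n_controls) + p1 * (1 - p1) / (2 * n_cases)) ** 0.5
    z = abs(p1 - p0) / se * r2 ** 0.5
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(z - z_crit)


def power_for_chip(chip, maf=0.2, odds_ratio=1.3):
    """Power when the whole budget is spent on one chip, half cases, half controls."""
    n = samples_affordable(BUDGET, chip["price"])
    return approx_power(n // 2, n // 2, maf, odds_ratio, chip["r2"])


for name, chip in CHIPS.items():
    n = samples_affordable(BUDGET, chip["price"])
    print(f"{name}: {n} samples, power ~ {power_for_chip(chip):.2f}")
```

Under these made-up numbers, the slightly better coverage of the pricier chip doesn't come close to compensating for the smaller cohort it leaves you with.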

The same economics may well apply to the new chips (depending on their pricing, of course). The addition of rare variants adds an extra element to the equation, but genotyping studies have low power to detect rare disease-causing variants, so such studies will require very large sample sizes. If the new chips are too expensive, such studies may well be impractical for most research groups, encouraging them to lean towards targeted resequencing of candidate genes instead.
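For a feel of how quickly the required sample size grows as variants get rarer, here's a back-of-the-envelope sketch in Python. It assumes the variant is genotyped directly, equal numbers of cases and controls, a genome-wide significance threshold of 5e-8, and a fixed odds ratio of 1.3 at every frequency; these are all illustrative assumptions rather than figures from the paper, and real rare variants may well have different effect sizes.

```python
from math import ceil
from statistics import NormalDist


def cases_needed(maf, odds_ratio, power=0.8, alpha=5e-8):
    """Approximate number of cases (with an equal number of controls)
    needed for an allelic test to reach the requested power."""
    p0 = maf
    p1 = odds_ratio * p0 / (1 - p0 + odds_ratio * p0)  # case allele frequency
    z_total = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    variance_term = p0 * (1 - p0) + p1 * (1 - p1)
    # Two alleles per individual, hence the 2 in the denominator.
    return ceil(z_total ** 2 * variance_term / (2 * (p1 - p0) ** 2))


for maf in (0.20, 0.05, 0.01):
    print(f"MAF {maf:.2f}: about {cases_needed(maf, 1.3)} cases needed")
```

Multiply the resulting cohort size by a per-sample chip price and it's easy to see how a rare-variant study could outgrow a typical genotyping budget.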

Interesting post; while the 1M chip may suffer (or already be suffering) from the fact that its high price isn't justified for common variation (since the smaller chips provide nearly equivalent power, and far greater power per dollar spent), the 4M or 10M or ... chips won't necessarily have that problem if new studies are designed to aggressively target rare variants, a challenge which requires a much larger number of SNPs in European populations.

Hi Jeff,

I completely agree, although I guess that the next generation of chips after this will be where the value of rare variants really starts to kick in (these new chips only contain ~100K variants from the 1KG project, so there are already two orders of magnitude more to draw on).

It will be interesting to see how the pricing works out, though - a 10M chip with loads of rare variants would be pretty powerful, but if it's too expensive I'm guessing people will go for other options instead: pulldown-reseq of a bunch of candidate genes, or maybe even low-coverage WGS with 1KG-style imputation. The chip manufacturers don't have much of a window left before large-scale WGS becomes feasible, so they'd better get their pricing structure right.

One final thought - as the chips dig lower and lower into the frequency spectrum it will be interesting to see if the companies start going for population-specific designs. Most of these super-rare variants are population-specific anyway, so it makes sense to create targeted chips rather than clutter up an East Asian GWAS with a whole bunch of European-specific variants.

Right, though we won't know what is worth typing in a population until it is typed in a very large panel. Variants at low frequency in Europe and absent in East Asia in the 1000 Genomes data could just be due to sampling noise. I suspect it'll be a while before we know how population-specific low-frequency alleles are (though clearly they will be more restricted in range).

You might be confusing the fact that the new Omni1-Quad chip will look at 4 million variants with the fact that 4 samples can be looked at on a chip. So it is actually just a reconfigured 1M chip, possibly cheaper, with new content swapped in.

That info is available here from Illumina:
http://www.illumina.com/downloads/HumanOmni1-Quad_DataSheet.pdf