Genetic Future

Personal genomics is a rapidly evolving game, with a clear end goal in sight: offering consumers an accurate, affordable and complete genome sequence, and providing them with tools to dig out the useful nuggets of information contained therein. That goal remains out of reach, and while DNA sequencing technology continues to mature, companies in the personal genomics space are offering products at various points on the trade-off curve between information content and cost.

At the low-information/low-cost end, companies such as 23andMe and deCODEme offer cheap (sub-$1000) genome scans looking at between 500,000 and a million sites of common variation throughout the genome. These provide insight into a small fraction of your genome, but include the variants we know the most about (due to the recent explosion of genome-wide association studies, which look for common genetic variants associated with complex disease risk).

Meanwhile, at the other end of the spectrum, we have the boutique service offered by Knome: sequencing of the entire human genome, or at least the 85-90% of it that can be reached with current short-read technologies, for the princely sum of close to $100,000. It’s difficult to justify this cost given the interpretable information currently obtainable from a genome sequence, but a full genome sequence does offer the possibility of getting insight into rare, severe disease-causing variants lurking in your genome that are largely invisible to genome scans.

Now Knome has launched a new product that provides a substantial chunk of the information value of a whole genome sequence at a quarter of the cost, by focusing exclusively on the 2-3% of the genome that codes for proteins:

Unlike low-priced SNP-based genotyping, which captures genetic changes
known as common variants by taking a sample of less than 0.05% of the
genome, comprehensive gene sequencing captures the entire coding region
of an individual’s genes, collectively known as the exome, enabling the
detection of rare variants – mutations that many scientists believe
account for the majority of the genetic burden of disease.

That last claim is pretty optimistic – it’s now clear from genome-wide association studies that the majority of the common variants associated with common diseases are actually found outside protein-coding regions. However, it’s also true that more rare, severe disease-causing mutations do tend to cluster within and around protein-coding regions.

Perhaps more importantly, there is a sound pragmatic reason for focusing on this fraction of the genome: it’s simply much easier to interpret a mutation in a protein-coding region than outside it. Right now, our dismal ability to predict the functional impact of variants in non-coding regions means that sequencing the majority of the genome falling outside protein-coding genes actually adds very little in terms of health prediction. For the moment, combining a cheap genome scan (to pick up genome-wide patterns of common variation) with exome sequencing (to detect any rare, clearly pathogenic mutations) would give you pretty much everything you’d be likely to get from a whole genome sequence.

Knome plans to offer the service for $24,500 for individuals, or $19,500 per person for couples and families. That’s still well and truly in the boutique price range – but you should see this as yet another waypoint on the road towards affordable, complete genome sequencing.
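
For readers who want to check the numbers quoted in this post, here is a rough back-of-envelope sketch. The genome size constant is an assumed round figure (about 3.1 billion bases for the haploid human genome), and the whole-genome price is rounded from "close to $100,000"; the function names are purely illustrative.

```python
# Back-of-envelope check of the figures quoted in the post.
# ASSUMPTION: a round figure for the haploid human genome length.
GENOME_BP = 3_100_000_000

# Prices as quoted in the post; the whole-genome figure is rounded
# from "close to $100,000".
WHOLE_GENOME_PRICE = 100_000
EXOME_PRICE = 24_500


def fraction_sampled(n_sites: int, genome_bp: int = GENOME_BP) -> float:
    """Fraction of genome positions directly assayed by a SNP chip."""
    return n_sites / genome_bp


def exome_size_bp(coding_fraction: float, genome_bp: int = GENOME_BP) -> float:
    """Approximate number of bases in the protein-coding portion."""
    return coding_fraction * genome_bp


if __name__ == "__main__":
    # Genome scans: 500,000 to 1,000,000 sites of common variation.
    for n_sites in (500_000, 1_000_000):
        print(f"{n_sites:>9,} sites: {fraction_sampled(n_sites):.4%} of the genome")

    # Exome: the 2-3% of the genome that codes for proteins.
    for coding_fraction in (0.02, 0.03):
        megabases = exome_size_bp(coding_fraction) / 1e6
        print(f"{coding_fraction:.0%} coding: about {megabases:.0f} Mb")

    # Exome price relative to the whole-genome price.
    print(f"price ratio: {EXOME_PRICE / WHOLE_GENOME_PRICE:.3f}")
```

Under these assumptions, even a million-site scan directly reads well under the 0.05% ceiling quoted in the press release, and the individual exome price comes out at roughly a quarter of the whole-genome price.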


  1. #1 JAShapiro
    May 18, 2009

    You are missing a word in one sentence. It is clear that the majority of COMMON variants associated with common diseases are found outside protein-coding regions. We really don’t know anything yet about what is causing the “majority of the genetic burden” since we are getting pretty small portions of the heritability for most common diseases. One of the strongest effects from GWAS, in macular degeneration, was a coding change, so who knows.

  2. #2 Daniel MacArthur
    May 18, 2009

    Oops – fixed. Thanks for picking that up.

  3. #3 Paul Jones
    May 18, 2009

    I’m cheering on all business models, though I myself won’t get my genome sequenced “fully” until it is 5k or so.

    I was thinking about one of the “heredity” services, but it seems that they genotype only 12 microsatellite markers plus a smattering of SNPs, likely mtDNA and Y chr DNA.

    Does 23 and Me give compelling anthropological/hereditary info?

    I’d pay 400$ for that plus the medical stuff.

  4. #4 Johar
    May 19, 2009

    Currently, the whole genome shot gun sequencing cost stands at around 10K for 17X coverage so it is a bit expensive to pay 24K for exome. I will say for 24K, one can get the whole genome with SNPs and structural variation analysis right now.

  5. #5 Daniel MacArthur
    May 19, 2009

    Hey Johar,

    Currently, the whole genome shot gun sequencing cost stands at around 10K for 17X

    That’s different from the estimates I’ve seen. Including all costs I understand a full Solexa run now costs somewhere on the order of $15K, and with paired-end 100 bp reads would give you maybe 14X coverage if you’re very lucky.

    Two things: firstly, that coverage is too low to regard a genome sequence as “complete” (I think you need around 30X to have a 99% chance of catching any given heterozygous SNP), even allowing for the fact that current technology can’t even “see” the 10-15% of the genome lying in highly repetitive regions.

    Secondly, that cost doesn’t allow for either interpretation costs or a profit margin, both of which are essential for Knome to be a viable business. After all, the costs of building the high-quality databases and analytical tools necessary for interpretation are non-trivial.

  7. #7 Senthil Sundaram
    March 5, 2010

    Currently we get exome sequencing at 25X coverage for about 8k in academic centers. So, it seems quite expensive. Secondly, while the common variants are identified (with very low odds ratios!) in GWAS, they are only markers for disease and not the actual causal variants. So, the jury is still out on whether most disease-causing variants are common or rare.
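
The coverage argument in comment #5 (that low average coverage leaves a real chance of missing a heterozygous SNP) can be illustrated with a deliberately idealized model. This is a minimal sketch, not how any real variant caller works. It assumes read depth at a site is Poisson-distributed around the mean coverage, that each read comes from the variant-carrying chromosome with probability one half (so variant-supporting reads are Poisson with mean coverage/2), and that a call needs at least `min_alt_reads` variant reads. That threshold and both function names are hypothetical choices for illustration.

```python
import math


def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) for X ~ Poisson(lam); returns 0.0 when k < 0."""
    return sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k + 1))


def het_detection_probability(mean_coverage: float, min_alt_reads: int = 3) -> float:
    """Chance of seeing at least `min_alt_reads` reads of the variant allele.

    ASSUMPTIONS (idealized): Poisson read depth, each read drawn from the
    variant allele with probability 0.5, no sequencing or mapping errors.
    `min_alt_reads` is a hypothetical calling threshold.
    """
    lam = mean_coverage / 2.0  # expected reads carrying the variant allele
    return 1.0 - poisson_cdf(min_alt_reads - 1, lam)


if __name__ == "__main__":
    for coverage in (14, 17, 30):
        for threshold in (3, 8):
            p = het_detection_probability(coverage, threshold)
            print(f"{coverage}X, >= {threshold} variant reads: {p:.4f}")
```

Real sequencing data has uneven coverage and base-calling errors, so figures from a model like this will be optimistic. The useful part is the shape of the relationship: the probability of detection climbs steeply with coverage and falls as the evidence threshold is made stricter.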