Not all next-gen sequencing technologies are created equal

The Next Generation Sequencing blog has a post on low coverage of A/T regions with Solexa sequencing. The post is in reference to a paper in Nature Methods on genome resequencing in C. elegans (doi:10.1038/nmeth.1179). Here's how the NextGen Sequencing blog summarizes it:

However, it points to a general lack of coverage in A/T rich regions (see figure 2 of the supplementary material) which leaves a number of zero size gaps in the assembly - places where reads sit shoulder to shoulder but simply do not overlap. Having found these problematic A/T rich regions, the authors went back and took a look across the genome, where they found a general correlation between A/T content and read coverage. This correlation was stronger when examining a 200 bp window than when examining a 32 bp window. 200 bp corresponds to the size of the amplicons that are amplifying during the cluster generation step prior to sequencing and 32 bp corresponds to the number of cycles in the actual sequencing by synthesis procedure. This finding made Hillier et al. conclude that failure to amplify A/T rich regions during cluster generation is the cause of the low coverage (other reasons for the bias such as hairpin formation were also explored but discarded).
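To make the window comparison concrete, here is a minimal sketch of how one might test for that kind of correlation. It is not the method Hillier et al. used. It assumes you already have a reference sequence as a string and a per-base read depth list of the same length, and it uses non-overlapping windows for simplicity. The function names (`at_fraction`, `windowed_at_and_coverage`, `pearson`) are made up for illustration.

```python
from math import sqrt


def at_fraction(seq, start, window):
    """Fraction of A or T bases in seq[start:start + window]."""
    chunk = seq[start:start + window].upper()
    return sum(base in "AT" for base in chunk) / len(chunk)


def windowed_at_and_coverage(seq, coverage, window):
    """Per-window A/T fraction and mean read depth.

    Assumption: non-overlapping windows; a trailing partial window is dropped.
    `coverage` is a hypothetical per-base depth list aligned to `seq`.
    """
    if len(seq) != len(coverage):
        raise ValueError("seq and coverage must have the same length")
    at_values, cov_values = [], []
    for start in range(0, len(seq) - window + 1, window):
        at_values.append(at_fraction(seq, start, window))
        cov_values.append(sum(coverage[start:start + window]) / window)
    return at_values, cov_values


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return sxy / sqrt(sxx * syy)
```

Running `windowed_at_and_coverage` once with `window=200` and once with `window=32`, then passing each result to `pearson`, gives the two correlations to compare. A more negative value at 200 bp than at 32 bp would match the pattern described in the quote.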

This is an issue if Solexa is to become a dominant way to resequence genomes. However, there are other applications of the Solexa technology that will probably not be affected. These include using Solexa to quantify gene expression and to genotype known variants segregating in a population (both of these jobs are currently dominated by microarrays). The low read coverage in A/T rich regions shouldn't affect the genotyping of known variants. Problems will arise when the goal of a resequencing project is to identify novel variants. However, 454 sequencing should work well for that goal.

As a related aside, if you don't know much about next generation sequencing, but would like to learn, check out these two reviews:

Mardis ER. 2008. The impact of next-generation sequencing technology on genetics. Trends Genet 24: 133-141 doi:10.1016/j.tig.2007.12.007

von Bubnoff A. 2008. Next-Generation Sequencing: The Race Is On. Cell 132: 721-723 doi:10.1016/j.cell.2008.02.028

Thanks for this. As someone far from the cutting edge of sequencing technology (our main mode of data collection is still PCR followed by direct sequencing on an ABI 377), this is really helpful.

funny that i happen to c this new paper on the significance of A-/AT-tracts.

Genome-wide Analysis of Fis Binding in Escherichia coli Indicates a Causative Role for A-/AT-tracts. PMID: 18340041 [PubMed - as supplied by publisher]

I quote
"Analysis indicates that A-tracts and AT-tracts are an important signal for preferred Fis binding sites, and that A(6)-tracts in particular constitute a high-affinity signal which dictates Fis phasing in stretches of DNA containing multiple and variably-spaced A-tracts and AT-tracts"