It seems like only yesterday (okay, less than two years ago) that I learned about 454 sequencing. It's the new technology that many folks think will replace dye-terminator Sanger sequencing using capillary arrays (the method used to sequence the human genome and many other genomes). Now another technology is coming on the scene that may make 454 obsolete before it ever gets a foothold in the market (making it the LaserDisc of DNA sequencing).
454 sequencing works by copying small stretches of DNA sequence that have been attached to tiny beads. As each nucleotide is added to the growing sequence, light is emitted via a chemical reaction (this approach is known as pyrosequencing). The light is read by a sensor inside the machine and translated into a DNA sequence. Many sequencing reactions can be run in parallel in a single machine, generating millions of nucleotides of data within a couple of hours. The 454 technology is limited in that the individual reads are only 100-200 nucleotides long, which is short compared to Sanger reads, which are approaching 1000 bases. Furthermore, pyrosequencing techniques struggle with mononucleotide repeats (stretches of DNA sequence of the same nucleotide, i.e., AAAAA or CCCCC).
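To make the mononucleotide repeat problem concrete, here is a minimal Python sketch of an idealised, noise-free pyrosequencing run. It is a toy model, not the instrument's actual signal processing: the function name `flowgram`, the cyclic flow order `TACG`, and the simplification of working directly on the synthesized strand are all assumptions made for illustration.

```python
def flowgram(read, flow_order="TACG", max_flows=400):
    """Idealised flowgram: one (nucleotide, bases incorporated) pair per flow.

    Assumptions (not from the post): nucleotides are flowed cyclically in
    `flow_order`, and the light signal is exactly proportional to the number
    of bases incorporated in that flow.
    """
    flows = []
    pos = 0          # next position in the read still to be synthesized
    flow_index = 0   # how many nucleotide flows have been applied so far
    while pos < len(read) and flow_index < max_flows:
        base = flow_order[flow_index % len(flow_order)]
        run = 0
        # A whole homopolymer run is incorporated within a single flow.
        while pos < len(read) and read[pos] == base:
            run += 1
            pos += 1
        flows.append((base, run))
        flow_index += 1
    return flows


# A run of five A's shows up as ONE flow whose signal is five units high.
print(flowgram("TTAAAAAC"))   # [('T', 2), ('A', 5), ('C', 1)]
```

In this model the sequence AAAAA is not read as five separate events but as a single flash roughly five times as bright as a single A. The base caller has to tell a signal of 5 from a signal of 4 or 6, and that gets harder as the run gets longer.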
The newest, hippest, and sexiest sequencing technology comes from Solexa (see here). This approach allows one to generate 1 billion nucleotides of data in a single run (a few orders of magnitude more than 454). As with moving from Sanger to 454, the tradeoffs of moving from 454 to Solexa are mostly in read length. Solexa takes the parallel sequencing strategy of 454 to a new level, but generates reads of only 25 nucleotides. As read lengths get shorter and shorter, assembly of the sequences into a complete genome becomes more difficult. But the sheer volume of sequence generated by the Solexa technology will force some people to take a closer look -- it can sequence a complete human genome in a single run, albeit at 1x coverage.
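For a back-of-the-envelope feel for what low coverage means, here is a small Python sketch. The function names are made up for illustration, the genome size of about 3 billion bases is my own round-number assumption (it is not a figure from the Solexa material), and the fraction-covered estimate uses the standard Poisson (Lander-Waterman style) approximation, which assumes reads land uniformly at random.

```python
import math


def mean_coverage(total_bases, genome_size):
    """Average number of times each genome position is sequenced."""
    return total_bases / genome_size


def expected_fraction_covered(coverage):
    """Poisson approximation: chance that a given base is hit by at least one read."""
    return 1.0 - math.exp(-coverage)


def reads_needed(read_length, genome_size, coverage):
    """Number of reads of a given length needed to reach a target mean coverage."""
    return math.ceil(coverage * genome_size / read_length)


HUMAN_GENOME = 3_000_000_000  # assumed round figure for a haploid human genome

print(mean_coverage(1_000_000_000, HUMAN_GENOME))  # about 0.33
print(expected_fraction_covered(1.0))              # about 0.63
print(reads_needed(25, HUMAN_GENOME, 1.0))         # 120000000
```

Under these assumptions, even a full 1x of randomly placed reads would be expected to leave roughly a third of the bases untouched, and reaching 1x with 25-nucleotide reads takes on the order of 120 million reads. The quoted 1 billion nucleotides per run works out closer to a third of 1x on a genome of this size, so a single run is a first pass rather than a finished genome.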
The choice of which technology to use depends on the sequencing project one is carrying out. 454 has been shown to be useful for sequencing bacterial genomes de novo, but it's not clear whether it will be practical for the larger, more repeat-rich genomes found in eukaryotes (although it seems to be a good strategy for sequencing ancient DNA). But short read lengths aren't as big a problem for resequencing projects -- those that generate sequences from multiple individuals from a species with an already sequenced genome -- because assembly of the reads can be done on the backbone of the previously published genome. The days of de novo genome sequencing with Solexa are far off, but I know of one group that plans to perform a large-scale resequencing project using Solexa machines. This new technology may prove to be useful for genotyping individuals at disease-associated loci or performing large-scale population genetics surveys.
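The idea of assembling short reads on the backbone of a published genome can be sketched in a few lines of Python. This is a deliberately naive illustration with exact matching only, not how any real read mapper works; `build_index`, `map_reads`, and the toy reference string are all hypothetical.

```python
from collections import defaultdict


def build_index(reference, k):
    """Map every k-mer in the reference to the list of positions where it starts."""
    index = defaultdict(list)
    for start in range(len(reference) - k + 1):
        index[reference[start:start + k]].append(start)
    return index


def map_reads(reads, reference):
    """Place each read on the reference by exact lookup (no mismatches allowed).

    All reads are assumed to have the same length. Returns {read: [positions]};
    an empty list means the read was not found, and more than one position
    means the read falls in a repeat and cannot be placed uniquely.
    """
    if not reads:
        return {}
    k = len(reads[0])
    index = build_index(reference, k)
    return {read: index.get(read, []) for read in reads}


reference = "ACGTACGTTTGACCA"  # toy stand-in for a published genome

print(map_reads(["ACGTA", "ACGTT", "GACCA"], reference))
# {'ACGTA': [0], 'ACGTT': [4], 'GACCA': [10]}

print(map_reads(["ACGT"], reference))
# {'ACGT': [0, 4]}
```

The second call shows the catch: shorten the read by one base and it matches two places in the reference. That is the repeat problem in miniature, and it is why very short reads are easier to use for resequencing unique regions than for de novo assembly of repeat-rich genomes.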
I agree with you that 454 won't work well for sequencing repeat-rich areas. Most genome centers, I think, have more than one sequencing instrument and use multiple technologies to address the problem.
In most cases though, once you have a reference genome in hand, it's probably not necessary to struggle through the repeat regions once again. I think 454 will be very, very useful for re-sequencing or for sequencing things like, uh, certain viruses, where people become infected with heterogeneous populations.
The biggest downside, I think, to 454 technology is just all the baggage that goes along with having to set up a clean room and do all your work in one of those funny white suits.
The choice of which technology to use depends on the sequencing project one is carrying out.
You hit the nail on the head there. Instead of seeing one method dominate, I predict that we'll see all of these methods fall into more or less discrete niches once the smoke clears.
Doesn't Solexa need a lot more coverage than the 454 system for large resequencing projects? Meaning its effective throughput is more like 250 Mb in 3 days, against 600 Mb on the 454 in a like-for-like comparison? I also heard that storing the Solexa data is very difficult?
The single run is nice and pretty amazing. For statistical accuracy I have heard that 16 runs are required, though this of course varies with the process. How is the signal-to-noise ratio with shorter reads?
You guys know just as much about Solexa as I do (and maybe more). Sorry I can't be of much help in clarifying the quality of reads or the quantity required for meaningful analyses.
Sounds to me as if the information on Solexa is confusing. I guess most of us won't find out until they are able to publish something. :(
While I agree with most of the comments here, I find 454 the most mature technology compared to Solexa's and other promises like S.O.L.I.D. This leads me to think that if one technology is going to be embraced by a vast majority of users now, then the first choice is no doubt 454. Also, as we have seen with the recent upgrade of the GS-20 to the GS-FLX, the technology is ready for frequent upgrades, which will satisfy most of the very high-end users.
But those high-end users with gigabase needs, how many of them are there on earth compared to those of us with requirements of less than 1 megabase? How many genome sequencing centers are there compared to MDx labs with small-genome resequencing needs? Let's come back to reality!
I wish the latter could drive this market, so that we may have access to this technology in molecular medicine asap, at a reasonable cost.