About Gazelles, Elephants, Blue Whales and Dodos: A Microbial Perspective

At least if we're talking about microbes.

Nick Loman has a good piece summarizing the state of genome sequencing technologies. It's pretty accurate, although I'm always skeptical about a technology's capabilities until I've seen it function. But from my point of view, which is focused on microbial genomics, the actual sequencing (determining the nucleotides on a piece of DNA) is already incredibly cheap. In that context, there are two models that seem relevant:

1) Really rapid sequencing. Here, you could have a bacterial genome in a matter of hours, even if it's not that efficient in terms of cost per nucleotide.

2) High throughput. Here, it might take days to generate your sequence, but you generate hundreds of bacterial genomes at once (e.g., Illumina).

With both models, the cost of sequencing is already very small. DNA extraction (getting the DNA from a bacterial cell) and creating a 'DNA library' (preparing the DNA for sequencing) are already as expensive, if not more so (we'll ignore for now the cost of acquiring strains with appropriate metadata, which is also expensive). Add in other factors, such as having the infrastructure to track thousands of samples, uploading data to various databases, assembling and annotating genomes, and maintaining the IT support for the computational side of sequencing (on an annual basis, we're talking files that, in total, are many, many petabytes), and the sequencing itself becomes an even smaller share of the total cost.

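What seems interesting to me is that we've reached the point where advances in sequencing capacity are yielding diminishing marginal returns: once sequencing is a small part of the total cost of a genome, making it cheaper still barely changes that total.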
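To make the arithmetic behind that concrete, here is a minimal sketch. Every number in it is a made-up, unitless placeholder chosen only for illustration, not a real price, and the function names are hypothetical. It assumes extraction and library prep each cost about as much as the sequencing itself, with tracking and IT lumped together as a larger item.

```python
def total_cost(costs):
    """Sum the per-genome cost of every step."""
    return sum(costs.values())


def saving_from_halving(costs, item):
    """Fraction of the total per-genome cost saved if `item` becomes half as expensive."""
    return 0.5 * costs[item] / total_cost(costs)


# Hypothetical, unitless per-genome costs (placeholders, not real prices).
costs = {
    "sequencing": 1.0,
    "dna_extraction": 1.0,
    "library_prep": 1.0,
    "sample_tracking_and_it": 2.0,
}

for item in costs:
    share = saving_from_halving(costs, item)
    print(f"halving {item}: total cost falls by {share:.0%}")
```

Under these assumed numbers, halving the cost of sequencing trims the total by only a tenth, while the same improvement in tracking and IT would save twice as much. The smaller sequencing's share gets, the less another sequencing breakthrough matters.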

Now we just have to figure out all of the other stuff...

Caveat: For charismatic (and even non-charismatic) megafauna, we're not quite there yet, but we probably will be in a year, even with second-generation technologies.
