Stephen Turner from Pacific Biosciences gave a dramatic presentation this afternoon launching PacBio’s new third-generation sequencing instrument. The room was packed for the seminar, with a palpable buzz, and Turner’s presentation was preceded by a theatrical introduction from PacBio CEO Hugh Martin.
The presentation was impressive overall, but light on numbers – and in question time Turner was almost comically evasive on the issue of error rates (more on this below). No doubt Luke Jostins will have some more technical details online soon, but here are my initial thoughts:
Very long read lengths: Turner scrolled along a single read 10,351 bases long, and claimed that the system can already achieve reads as long as 20,000 bases. By circularising the template before sequencing it’s possible to use these extremely long reads to sequence the same molecule numerous times in a single run, potentially creating very accurate consensus sequences (albeit at a cost in terms of throughput). Alternatively, using a linear template would allow users to read accurately through the types of complex, repetitive sequence that confound current short-read sequencing technologies.
However, it’s worth noting that only a small fraction of reads will actually reach these extreme lengths. Users will be able to decide how long to run the instrument for; short runs will give large numbers of short reads, while longer runs will yield a small number of very long reads, but at the cost of a much lower total sequence yield per unit time.
Strobe sequencing: this allows you to generate pulses of sequence data from a molecule separated by arbitrary stretches of unsequenced DNA during which sequencing is “paused”. That’s an extremely handy feature for a number of applications; I’m particularly interested in the possibilities for calling large-scale structural variants in mammalian genomes.
Error rates: Turner said nothing concrete about error rates during his presentation, but this issue dominated the questions from the audience. Turner skilfully equivocated, steering clear of providing any hard numbers on the raw error rates and focusing on the system’s ability to generate accurate consensus sequences through circular reads. Still, it’s clear that deletion errors due to missing bases will pose a non-trivial problem for the system: Turner referred to algorithms, currently in development, for assembling sequence dominated by insertion/deletion errors.
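To get a feel for why circular reads matter so much here, the following is a back-of-the-envelope sketch of how a majority-vote consensus might improve with the number of passes over the same molecule. Everything numeric in it is an assumption on my part: the raw error rate is a placeholder (no figure was given in the talk), the template lengths are hypothetical, and the model treats errors as independent substitution-like miscalls, which is a poor fit for a platform dominated by insertions and deletions. It is an illustration of the trade-off, not PacBio’s method.

```python
from math import comb

# HYPOTHETICAL raw per-base error rate: PacBio gave no figure in the talk.
ASSUMED_RAW_ERROR = 0.15

# The single long read shown in the presentation.
READ_LENGTH = 10_351


def passes_over_template(read_length: int, template_length: int) -> int:
    """Complete passes one read makes around a circular template."""
    return read_length // template_length


def consensus_error(raw_error: float, passes: int) -> float:
    """Chance that a strict majority of passes are wrong at one position.

    Assumes errors are independent between passes and that every wrong
    call agrees with the others (a pessimistic simplification). With an
    even number of passes, ties are counted as correct.
    """
    needed = passes // 2 + 1
    return sum(
        comb(passes, k) * raw_error**k * (1 - raw_error) ** (passes - k)
        for k in range(needed, passes + 1)
    )


# Hypothetical template sizes: shorter templates mean more passes per read,
# but fewer unique bases sequenced per read (the throughput cost).
for template_length in (500, 1_000, 2_000):
    n = passes_over_template(READ_LENGTH, template_length)
    err = consensus_error(ASSUMED_RAW_ERROR, n)
    print(f"template {template_length:>5} bp: {n:>2} passes, "
          f"consensus error ~ {err:.2e}")
```

Under these assumptions the consensus error falls quickly as passes accumulate, which is presumably why Turner kept steering the discussion towards consensus accuracy. The real behaviour will depend on the raw error rate and on how well indel-heavy reads can be aligned to one another, neither of which we were told.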
Ease of use: the platform seems to have a very straightforward user experience: samples go in one drawer, sequencing cartridges (arrays of zero-mode waveguides) in another drawer, and robotic fluid-handling systems inside the machine take care of the rest. The downstream informatics sound as though they will be reasonably straightforward (especially compared to the horrible complexities of the SOLiD system’s colour space), with the minor complication of having multiple non-independent reads from the same circular molecule.
My overall impression: an impressive show, and a consummate demonstration of the salesmanship that has netted PacBio $266M in funding, but it remains hard to judge the value of PacBio’s system without some much harder numbers on error rates. Certainly this is not yet a system that will transform the field of genomics – we’re still a long way from the 15-minute, $100 genome promised by PacBio two years ago.