Sequencing Technology Adoption and the Power of Informatic 'Lock In'

Gabe Rudy, blogging at our 2 snps, has a really good introduction to sequencing technology and its history. It's worth the read, but I don't entirely agree with the reason given for why ABI SOLiD lost out to Illumina:

Coming to market at the same time, but seeming to have just missed the wave, was the Applied Biosystems (ABI) SOLiD system of parallel sequencing by stepwise ligation. Similar to the Solexa technology of creating extremely high throughput short reads cheaply, SOLiD has the added advantage of reading two bases at a time with a fluorescent label. Because a single base pair change is reflected in two consecutive di-base measurements, this two-base encoding has inherent accuracy in detecting real single nucleotide variations versus potential sequencing errors. In a seemingly otherwise head-to-head competitive spec sheet with Illumina's Solexa technology, the momentum of the market went to the company that shipped working machines out to the eager sequencing centers first. That prize was won by Illumina by a margin of nearly a year.

I don't disagree in the sense that people won't move to a technology they don't have access to, but that doesn't explain why people haven't moved to ABI SOLiD since its release. Another reason SOLiD hasn't been widely adopted is "color space"--the two-base encoding.

Whether the technology is ol' timey Sanger, 454, or Illumina, these technologies are alike in that every individual base in a sequencing read is assigned a quality score (also called a Q-score or Q-value) that reflects the probability that the base call is correct. Even though these three technologies work in very different ways, after we process the raw data, we convert it to FASTQ format: a string of bases (e.g., ATG....) along with a Q-score linked to each base. Most downstream informatics processes, whether they be whole genome assembly or looking for nucleotide variation, start with a FASTQ file (which then is often converted into other formats).
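To make the "one Q-score per base" idea concrete, here is a minimal Python sketch of reading a single FASTQ record. It assumes the Sanger-style convention where each quality character is the Phred score plus an ASCII offset of 33; the read itself and the function names are made up for illustration, and real parsers handle many more edge cases.

```python
def phred_to_error_prob(q):
    """Convert a Phred quality score to the probability the base call is wrong."""
    return 10 ** (-q / 10)


def parse_fastq_record(record, offset=33):
    """Split one four-line FASTQ record into (name, bases, per-base Q-scores).

    offset=33 is the Sanger convention and an assumption here; some
    pipelines have used other offsets.
    """
    header, bases, _plus, quals = record.strip().split("\n")
    scores = [ord(ch) - offset for ch in quals]
    return header[1:], bases, scores


# A made-up three-base read, purely for illustration.
record = "@read1\nATG\n+\nII5\n"
name, bases, scores = parse_fastq_record(record)

# Every base pairs up with exactly one Q-score.
for base, q in zip(bases, scores):
    print(base, q, phred_to_error_prob(q))
```

The point is the one-to-one pairing in that last loop: whatever the chemistry upstream, downstream tools can count on each base carrying its own score.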

SOLiD's di-nucleotide approach, on the other hand, doesn't readily lend itself to Q-scores for individual bases, which means there's one more thing you have to figure out in order to adopt this technology. I don't think this issue should be underestimated, as it really does inhibit people from adopting SOLiD. Granted, if SOLiD were much cheaper and faster than Illumina, then it would have been adopted. But all things being equal--and they were pretty much equal--people are going to stick with data formats they know and are set up to handle. Nobody wants to spend time, effort and money puzzling out one more step when they don't have to.
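For readers who haven't worked with color space, here is a minimal Python sketch of two-base encoding. It assumes the usual four-color scheme (equivalent to XOR-ing 2-bit base codes) and a known leading primer base of T; the sequences and function names are invented for illustration and this is not ABI's software. It shows both sides of the trade: a real SNP changes two adjacent colors, but a single miscalled color corrupts every base decoded after it, which is why a measurement doesn't map neatly onto one base with one Q-score.

```python
# 2-bit base codes; the color for a pair of adjacent bases is their XOR.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASE = "ACGT"


def encode(seq, primer="T"):
    """Turn a base sequence into colors, one per adjacent base pair.

    The leading primer base (assumed to be T here) anchors the first color.
    """
    full = primer + seq
    return [CODE[a] ^ CODE[b] for a, b in zip(full, full[1:])]


def decode(colors, primer="T"):
    """Walk the colors from the primer base to recover the base sequence."""
    bases = []
    prev = CODE[primer]
    for color in colors:
        prev ^= color  # each base depends on every color before it
        bases.append(BASE[prev])
    return "".join(bases)


ref = "ATGGCA"  # made-up reference
snp = "ATGTCA"  # same sequence with one real base change (G -> T)

ref_colors = encode(ref)
snp_colors = encode(snp)

# A true SNP shows up as two adjacent color changes.
changed = [i for i, (x, y) in enumerate(zip(ref_colors, snp_colors)) if x != y]
print(changed)

# A single miscalled color shifts every base decoded after it.
miscalled = list(ref_colors)
miscalled[3] = 1
print(decode(miscalled))
```

Decoding the miscalled read gives a sequence that disagrees with the reference from the error onward, so tools generally align in color space rather than converting to bases first. That is exactly the extra step FASTQ-based pipelines weren't set up for.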

Don't sell short the power of 'informatic lock in'....


When you use the PCR procedure, it amplifies (increases the amount of) the DNA so that there is enough to work with. Without this, there would not be enough DNA to run on gels or to use in making recombinant DNA in a plasmid.

By wirthnancy (not verified) on 19 Nov 2010

Hey Mike,

I actually agree with you. I think a big reason SOLiD has a hard time gaining market share is the momentum of the bioinformatics in Illumina's favor. I'll explore this a bit more in my Part 2 of the series. But as to 'informatics lock-in', at the end of the day, ABI is now providing good tools to handle color space alignment and variant calling.

Would things have been different if SOLiD had led Illumina by a year instead of the other way around? I don't know, but I do think they would have had a chance to get people used to color-space bioinformatics and possibly reached the tipping point of market mind-share.

At this point they have a hard time differentiating from Illumina in cost and throughput, and the extra difficulty of requiring alternative bioinformatics makes switching platforms an even harder sell.