Barcoding in Biology: A Tale of Two Meanings

Over at evolgen, ScienceBlogling RPM discusses a paper that describes a new barcoding technique for plants. It struck me while reading his post that barcoding has two very different meanings, even though both techniques are used in genomics--and often at the same time.

One meaning of barcoding, and the one discussed by RPM, is the use of a gene to assign different groups of organisms a taxonomic DNA label (or barcode...). In other words, we're replacing Latin binomials, like Escherichia coli or Homo sapiens, with a DNA sequence from a single gene (or a set of closely related sequences).

This is useful for several reasons. First, if you pick the appropriate molecule, you have a reproducible tag that doesn't require an expert taxonomist to make an identification*. Second, the barcode can contain phylogenetic information: using the DNA sequence in the tag, we can construct an evolutionary history of the organisms of interest.

Third, using high-throughput DNA sequencing methods, hundreds of thousands of these barcodes can be sequenced simultaneously (conservatively, I estimate one could generate one million reads in a twenty-four hour period, depending on what sequence length is required). One could take a sample, such as a gram of poop, extract its DNA, and characterize roughly 200,000 barcodes, representing the same number of organisms.

Which brings me to the second meaning of barcoding. In the procedure I described above, obtaining 200,000 taxonomic tags (i.e., barcodes) from a sample is massive overkill--and very expensive overkill. Instead we pool samples, so if you wanted only 10,000 tags per DNA sample, you could pool twenty samples. But how do we tell the samples apart?

Wait for it...barcoding. To each DNA taxonomic tag (a barcode sensu RPM), we attach a barcode that identifies which sample the taxonomic tag came from. So I suppose that this is barcoding barcodes.

Or something.
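For the code-minded, here is a minimal sketch of the pooling arithmetic and the "barcoding barcodes" step described above. It assumes each read starts with a fixed-length sample barcode followed by the taxonomic tag. The four-base barcodes, the sample names, the reads, and the function name demultiplex are all made up for illustration; real protocols use longer, error-tolerant barcodes.

```python
from collections import defaultdict

# Pooling arithmetic from the post: 200,000 tags per run,
# 10,000 tags wanted per sample.
TAGS_PER_RUN = 200_000
TAGS_PER_SAMPLE = 10_000
samples_per_run = TAGS_PER_RUN // TAGS_PER_SAMPLE

# Hypothetical sample barcodes (invented for this example).
SAMPLE_BARCODES = {
    "ACGT": "sample_01",
    "TGCA": "sample_02",
}
BARCODE_LENGTH = 4  # assumed fixed length of the sample barcode


def demultiplex(reads):
    """Group taxonomic tags by the sample barcode at the start of each read."""
    by_sample = defaultdict(list)
    for read in reads:
        sample_tag = read[:BARCODE_LENGTH]
        taxonomic_tag = read[BARCODE_LENGTH:]
        # Reads whose prefix matches no known barcode are set aside.
        sample = SAMPLE_BARCODES.get(sample_tag, "unassigned")
        by_sample[sample].append(taxonomic_tag)
    return dict(by_sample)


# Made-up pooled reads: sample barcode + taxonomic tag.
reads = ["ACGTGGATTAGC", "TGCAGGATTAGC", "ACGTCCTTAAGG", "NNNNGGATTAGC"]
binned = demultiplex(reads)
```

Exact prefix matching is the simplest possible choice here; a sequencing error in the barcode sends the read to "unassigned" rather than to the wrong sample.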

*Taxonomic barcoding has met considerable opposition from many biologists. One reason, and I think a legitimate one, is that it will reduce the need for, and the training of, biologists in taxonomy and natural history: as we gain molecular knowledge, we are losing natural history and ecological knowledge.
