tags: researchblogging.org, new species, insects, American cockroach, Periplaneta americana, DNA barcoding, Brenda Tan, Matt Cost, Mark Stoeckle, Rockefeller University, American Museum of Natural History, AMNH
Moving overseas has been a challenge, but worst of all for me has been the fact that my writing has suffered. I still read scientific papers and science news stories, but have been unable to find the time necessary to write these stories for you. Hopefully, my life is returning to some semblance of predictability, which means I can now start working again. I have several half-finished stories that I am working on and will be publishing over the next few days. The first story I want to share with you is about a simple high school DNA barcoding project that yielded an astonishing discovery: a new species that has been living in one of the largest urban areas in the world, New York City.
As a New Yorker, I am both surprised and not surprised at the same time by the discovery of a new species of cockroach hiding in our cabinets and showers and running around under our feet. I mean, where else would a new species of pest insect most likely be found?
Like something from an episode of the popular television series CSI: NY, two high school seniors sought to identify hundreds of specimens that they had collected throughout Manhattan. Their goal? To identify the species by analyzing a small portion of their DNA using a technique known as “DNA barcoding.” As a method for quickly identifying species, DNA barcoding has become increasingly accepted over the past six years.
The two “DNAHouse investigators” made a number of surprising discoveries using DNA barcoding, including mislabeled food items and, most astonishing of all, a species of cockroach that is new to science. The insect, which looks like the American cockroach, Periplaneta americana, a widespread pest in NYC and other large cities, turned out to have a different “DNA barcode” from that species.
A DNA “barcode” is a short nucleotide sequence shared between organisms. Although the identity of the “barcode” gene is not yet standardized, a 648-base-pair region of the mitochondrial cytochrome c oxidase subunit I (CO1) gene is typically used as a DNA “barcode” for most eukaryotes. This genetic region is ideal because it is nearly universal, it is small and easily sequenced using current technology, and it contains large nucleotide variation between species (but relatively small variation within a species). Additionally, as of 2009, there were more than 620,000 known CO1 sequences from over 58,000 species of animals, a larger collection than the databases available for any other gene. These features allow for direct sequence comparisons and analyses between different species.
“It’s genetically distinct from all the other cockroaches in the database,” said DNAHouse investigator Brenda Tan. Ms. Tan, a senior at Manhattan’s Trinity School, worked on the project, along with fellow classmate Matt Cost.
“[Closely-related] species don’t differ [by] more than one percent, [while] this cockroach is four percent different,” agreed Professor Mark Stoeckle. “This suggests it is a new species of cockroach.”
Professor Stoeckle, a medical doctor who conducts genomic and DNA barcoding research at Rockefeller University, supervised Ms. Tan and Mr. Cost.
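The percent-difference figures that Professor Stoeckle quotes can be illustrated with a minimal sketch. This is not the students' actual analysis: the function name `percent_difference` and the two toy sequences below are invented for illustration, and the sketch assumes the two barcodes have already been aligned to the same length.

```python
def percent_difference(seq_a: str, seq_b: str) -> float:
    """Percent of comparable aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Skip alignment gaps and ambiguous bases rather than count them.
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * mismatches / compared


# Toy 25-base fragments, invented for illustration (not real CO1 data).
reference = "ACGTACGTACGTACGTACGTACGTA"
query = "ACGTACGAACGTACGTACGTACGTA"  # differs from reference at one position
print(percent_difference(reference, query))  # prints 4.0
```

One mismatch in 25 compared positions gives four percent, the same size of difference reported for the cockroach. A real 648-base-pair barcode would need about 26 differing positions to reach that figure.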
Further investigation is essential before it can be determined whether the students’ discovery is a new species or a subspecies. If their finding is confirmed, it is traditional that the discoverers — Ms. Tan and Mr. Cost in this case — will be granted naming rights for the new species.
This analysis is a continuation of the original 2008 study, also conducted by two Trinity students, that found that one-quarter of food fish that they purchased at restaurants and markets had been mislabeled — often by replacing expensive fish (like tuna) with cheaper species (like tilapia), triggering a public furor in NYC that came to be known as “sushi-gate.”
To conduct this study, Ms. Tan and Mr. Cost collected a total of 217 specimens between November 2008 and March 2009. They rummaged around in supermarkets, on streets and in New York apartments, including that of Professor Stoeckle, where they found the new cockroach species.
“The superintendent of the apartment building was surprised when we wanted to save rather than squash the cockroach,” remarked Ms. Tan.
After the specimens were collected, they were photographed and labeled before their DNA was isolated and sequenced by scientists at the American Museum of Natural History.
After the DNA was sequenced, Ms. Tan and Mr. Cost analyzed the sequences by matching them to known sequences. Their collected specimens yielded 170 usable genetic codes that were matched to known DNA sequences for 95 different animal species. These DNA sequences are stored in GenBank and in the Barcode of Life Database (BOLD).
GenBank is an open access, annotated database of all publicly available DNA sequences and their protein translations. This database, established in 1979, is maintained by the National Center for Biotechnology Information (NCBI) as part of the International Nucleotide Sequence Database Collaboration, or INSDC. Containing more than 65 billion nucleotide bases in more than 61 million sequences, this remarkable database doubles in size roughly every 18 months.
BOLD is a newer database that is maintained by the Biodiversity Institute of Ontario at the University of Guelph, Canada, where DNA barcoding was pioneered. So far, scientists the world over have DNA barcoded over 750,000 individual specimens from more than 65,000 species. Their ultimate goal is a reference library of barcodes for all animals and plants on Earth.
These DNA databases are publicly accessible, so all researchers have to do is enter a DNA sequence to compare it to those stored there.
The students were astonished to find that DNA was ubiquitous.
“We may think we live in a sterile, urban environment seemingly untouched by nature,” Mr. Cost marvels. “We imagine objects are purified and cleansed in order to pass into our personal world with evidence of their original source all but erased. But DNA is amazingly resilient to damage through all the processing to which it is subjected. We got usable DNA from 151 of 217 of the items tested — including dried soup mix, dog biscuits, beef jerky, butter, a feather lying on the sidewalk, a dried bit of horse manure from Central Park, even a feather duster.”
Ms. Tan and Mr. Cost found that DNA is very durable.
“[A]fter we realized that DNA was, indeed, omnipresent, an important question arose: How much abuse can this genetic material take before it becomes unintelligible or even unrecognizable?” Ms. Tan asked.
“Could we find decipherable DNA in a piece of cooked meat? A piece of cheese? A highly processed dog treat?” Ms. Tan continued. “What we found was astonishing. Few specific conditions proved able to destroy the DNA consistently.”
Canned foods were the one exception. Processed at high temperatures, canned foods contain DNA that has been broken into tiny pieces, often making it impossible to identify the contents.
The students were also impressed by the precision and power of the DNA barcoding technique.
“You could have a filet of fish, just the stuff you might throw on your grill, and an expert who spent his whole life [working with it] couldn’t tell you what it was by looking at it,” Mr. Cost observed. “But with this, it’s so simple.”
Once a specimen was identified from its DNA “barcode”, learning more about the species was easy.
“Learning the species name was like finding a key that opened a new book,” Ms. Tan explained. “It’s exciting to learn still more after you know a species name. For example, ‘dried shredded squid’ turned out to be jumbo flying squid (Dosidicus gigas). We looked up jumbo flying squid and found it grows to 100 lbs, swims at depths up to 2,000 feet, travels in large schools containing hundreds of individuals, and hunts in cooperative packs like wolves. This gave us new thoughts about the oceans and about calamari salad.”
Most results were expected, but the pair also made some unexpected discoveries.
“There were a lot of surprises,” said Mr. Cost. “We tested ‘buffalo mozzarella’ cheese and found it is made from the milk of Water Buffaloes. We asked some adults who have ordered it on restaurant menus and they didn’t know that.”
They ran across a few other surprises, too: sixteen percent of food items examined were mislabeled, including venison dog treats that were made of beef, dried shark meat that turned out to be Nile perch, sturgeon caviar that was Mississippi paddlefish, and sheep’s milk cheese that was made of cow’s milk, a potentially dangerous labeling error for those with food allergies.
But the biggest surprise of all was the cockroach whose genetic code did not appear in any of the DNA sequence databases.
“By appearance it looks like the American cockroach but it is genetically different from other American cockroaches in the databases,” the two DNAHouse investigators said.
Ms. Tan and Mr. Cost got help on their work from the American Museum of Natural History and Rockefeller University in New York.
Both Ms. Tan and Mr. Cost graduate at the end of the 2010 school year. Ms. Tan plans to pursue biology in college next year, while Mr. Cost will study music.
[NOTE: Even though this was a fun story for me to write, and I absolutely love encouraging students to get involved in science, I have to tell you that I am suspicious of these data. Those of you who are familiar with molecular phylogenies will notice a few oddities in the phylogenetic tree above. First, the position of humans relative to birds implies that humans and birds are each other's closest relatives, while "other mammals," reptiles and fish are more distantly related to humans and birds. In short, this figure claims paraphyly in the evolution of mammals! Another oddity is the claimed relationship between "other arthropods" and insects, which, again, suggests that "other arthropods" are paraphyletic. Second, the branches of this tree lack any bootstrap data so it is impossible to distinguish real data from noise. DNA "barcodes" do not provide sufficient resolution to describe evolutionary relationships between organisms higher than the generic or family level -- contrary to what is implied by the students' phylogenetic tree. Basically, this phylogenetic tree is not publishable data by any peer-reviewed journal for many reasons, not the least of which are the strange evolutionary relationships that it claims. It really bothers me that none of the journalists covering this story noticed these irregularities, leaving an unemployed (and presumably unemployable) evolutionary biologist to point this out.]
Musante, S. (2010). DNA Barcoding Investigations Bring Biology to Life. BioScience, 60 (1), 14-14 DOI: 10.1525/bio.2010.60.1.4
Press Release [PDF] (quotes).