The Implementation of Molecular Evolution for the Masses

A couple of years ago, there was talk in the bioblogosphere about getting the general public interested in bioinformatics and molecular evolution.

The idea was inspired by the findings of armchair astronomers -- people who have no professional training, but make contributions to astronomy via their stargazing hobbies. With so much data available in publicly accessible databases, there's no reason we can't motivate armchair biologists to start mining for interesting results.

But how do we train these new comp-bio code-monkeys? Bioinformatics requires both computational skills and an understanding of biology, and finding people with both skill sets (and interests) can be tricky. Well, a recent paper in PLoS Biology lays out a framework for teaching those skills (doi:10.1371/journal.pbio.0060296). The authors present Annotathon, a web-based interface through which students apply standard online tools for DNA sequence analysis.

The course described in the paper takes advantage of the vast amount of data deposited in sequence repositories by metagenomic projects (specifically, the Global Ocean Sampling sequences). Starting with these data, the students perform simple molecular evolutionary analyses, including gene prediction, sequence alignment, and phylogenetic tree construction. Here's how the authors summarize their course:

The goal of the course is to teach students how to computationally annotate biological sequences (DNA and protein sequences). The starting point is a short stretch of DNA sequence (such as a single metagenomic sequencing read) that students are asked to study according to two major lines of inquiry: (1) prediction of gene product putative function and (2) prediction of taxonomic group of origin.
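To give a flavor of what the very first step of annotating a read can look like, here's a minimal sketch in Python of a naive open reading frame (ORF) scan. To be clear, this is my own illustration and not the Annotathon's method: the function names (`find_orfs`, `reverse_complement`) and the `min_codons` cutoff are made up for the example, and it only catches complete ATG-to-stop stretches, whereas a real metagenomic read will often contain just part of a gene.

```python
# Naive ORF scan: an illustrative sketch, NOT the Annotathon pipeline.
# All names and the min_codons threshold are assumptions for this example.

COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=30):
    """Find complete ORFs (ATG ... stop) in all six reading frames.

    Returns a list of (strand, frame, start, end, orf_sequence) tuples.
    Coordinates are 0-based and relative to the strand being scanned.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            # Walk the strand one codon at a time in this frame.
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first start codon opens a candidate ORF
                elif start is not None and codon in STOP_CODONS:
                    end = i + 3  # include the stop codon
                    if (end - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, end, s[start:end]))
                    start = None  # look for the next ORF in this frame
    return orfs


if __name__ == "__main__":
    toy_read = "CCATGAAATTTGGGTAACC"  # made-up sequence, not real data
    for orf in find_orfs(toy_read, min_codons=3):
        print(orf)
    # prints: ('+', 2, 2, 17, 'ATGAAATTTGGGTAA')
```

From there, a predicted ORF would be translated and compared against known proteins to guess at function and taxonomic origin, which is the part the course hands off to the standard online tools.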

The question remains: how can we translate these courses offered at universities to the general public? Can we inspire armchair computational biologists to analyze data outside of the classroom?


Hingamp P, Brochier C, Talla E, Gautheret D, Thieffry D, et al. (2008) Metagenome Annotation Using a Distributed Grid of Undergraduate Students. PLoS Biol 6(11): e296. doi:10.1371/journal.pbio.0060296


Very interesting. There is a long tradition of this sort of thing in natural history, and there are groups out there who promote 'citizen science'. Christmas Bird Counts and Adopt a Pond (a program run by the Toronto Zoo and other partners to monitor wetlands) are some contemporary examples. Perhaps starting with general citizen science groups and then promoting their discoveries would help it catch on. I think once people realize that amateurs can make contributions to a field that otherwise seems impenetrably complex and technologically inaccessible, those who are interested would recruit themselves.
