Playing in the dirt: metagenomics on the JHU campus

We have lots of DNA samples from bacteria that were isolated from dirt. Now it's time to do our own metagenomics project and figure out what they are. Our class project is on a much smaller scale than the honeybee metagenomics project that I wrote about yesterday, but we're using many of the same principles.

The general process is this:

  1. We sort the chromatogram data to identify good data and separate it from bad data. Informatics can help you determine whether data is good, and measure how good it is, but it cannot turn bad data into good data. And there's no point in wasting time with crappy data.
  2. We use FinchTV to take a closer look at the data and determine whether our sequences represent a pure culture.
  3. We use BLAST to find the best matches in GenBank.
  4. We evaluate the results and decide on the genus for our sample.
  5. We edit the read in FinchTV and run BLAST again, if need be, to see if we can improve the match.
  6. We enter the genus of our sample and the biome where it was found (see the overview) in the FinchTV comment field, then save our results back to the iFinch database.
  7. We will use SQL to query the iFinch database and determine which bacteria were found, compare the bacteria in the different biomes, and compare the bacteria from different years.

I'll write more about each of the steps as we go along.
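
To give a feel for step 7, here's a minimal sketch of the kind of SQL query we might run. The real iFinch schema isn't described in this post, so the table name `sample_reads`, its columns, the biome names, the years, and the rows below are all made up for illustration. The sketch uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for iFinch.

```python
import sqlite3

# Stand-in for the iFinch database. The table and column names are
# hypothetical; they are NOT the actual iFinch schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sample_reads ("
    "read_name TEXT, genus TEXT, biome TEXT, year INTEGER)"
)

# Invented placeholder rows, only here so the queries return something.
conn.executemany(
    "INSERT INTO sample_reads VALUES (?, ?, ?, ?)",
    [
        ("read_001", "Bacillus", "woods", 2006),
        ("read_002", "Bacillus", "lawn", 2006),
        ("read_003", "Pseudomonas", "woods", 2007),
        ("read_004", "Bacillus", "woods", 2007),
        ("read_005", "Streptomyces", "lawn", 2007),
    ],
)


def genus_counts(conn, group_by):
    """Count reads per genus within each biome or each year."""
    # Column names can't be bound as SQL parameters, so only allow
    # the two columns we expect before building the query text.
    if group_by not in ("biome", "year"):
        raise ValueError("group_by must be 'biome' or 'year'")
    query = (
        f"SELECT {group_by}, genus, COUNT(*) AS n_reads "
        f"FROM sample_reads "
        f"GROUP BY {group_by}, genus "
        f"ORDER BY {group_by}, genus"
    )
    return conn.execute(query).fetchall()


by_biome = genus_counts(conn, "biome")
by_year = genus_counts(conn, "year")

for biome, genus, n_reads in by_biome:
    print(biome, genus, n_reads)
```

Grouping by `biome` shows which genera turn up in each location, and swapping in `year` gives the year-to-year comparison. Against the real database, the same `GROUP BY` idea would apply, but the table and column names would need to match whatever iFinch actually uses.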

You're all welcome to do a few samples yourself and help us out, especially since I have three years' worth of data. If you're a teacher and you want a data set to use with your class, you're welcome to log in and download the data from iFinch.

Write to me at digitalbio at gmail dot com and I'll send you information for logging in to iFinch and playing along. Of course, you could always get a trial account anyway, but this way you get to play with our data and be part of our project.
