Discovering Biology in a Digital World

My thoughts on biology, teaching, life, and exploring the living world via the digital one. These postings represent only my opinions; they do not represent the viewpoints of any funding agency or Geospiza, Inc.

Profile

Sandra Porter

I am a microbiologist and molecular biologist turned tenured biotech faculty turned bioinformatics scientist turned entrepreneur. My passion is developing instructional materials for 21st century biology (Digital World Biology).


Basics: How do you sequence a genome?

Category: Ask Dr. Science, Basics, Bioinformatics, Genomics, Science education, biotechnology
Posted on: January 22, 2007 8:59 AM, by Sandra Porter

About a week ago, I offered to answer questions about subjects that I've either worked with, studied or taught.

I haven't had many questions yet, but I can certainly answer the ones I've had so far. Today, I'll answer the first question:

How do you sequence a genome?

Before we get into the technical details, there are some other genomic questions that you might like answered.

How much does it cost to sequence a genome?

I remember being at the O'Reilly bioinformatics conference in 2002 and hearing Lee Hood challenge the DNA sequencing community to lower the cost of sequencing a human genome to $1000. It was all pretty exciting!

We're not there yet. But, we're getting closer. I've heard secondhand, from one of our customers, that it costs about $10,000 to sequence an average-sized bacterial genome, once you've purchased your sequencers, bought your software, and built your lab. Just for a bit of perspective, an average bacterial genome is about 750 times smaller than the human genome.

I'll leave you to do the math, but I imagine it scales pretty well. Ten million for a human genome seems about right, especially considering the original version was estimated to cost about 3 billion dollars.
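
If you'd rather let a computer do the math, here is a minimal sketch. It uses the two figures quoted above ($10,000 per bacterial genome and a 750-fold size difference) and assumes that cost grows linearly with genome size, which is a simplification for illustration and not a real pricing model. The function name is made up for this example.

```python
# Figures quoted in the post; the linear scaling is an assumption.
BACTERIAL_GENOME_COST_USD = 10_000
HUMAN_TO_BACTERIAL_SIZE_RATIO = 750


def scaled_cost(cost_per_small_genome, size_ratio):
    """Estimate the cost of a larger genome, assuming cost is proportional to size."""
    return cost_per_small_genome * size_ratio


human_estimate = scaled_cost(BACTERIAL_GENOME_COST_USD, HUMAN_TO_BACTERIAL_SIZE_RATIO)
print(f"Estimated human genome cost: ${human_estimate:,}")  # $7,500,000
```

That works out to 7.5 million dollars, the same order of magnitude as the ten million mentioned above.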


What kind of infrastructure do you need to have?

You will need lots of robots for pipetting and preparing DNA, DNA sequencing instruments, computers, and software for tracking samples, evaluating sequence quality, and assembling the sequences at the end.

Some of the other types of equipment will depend on the methods that you're using. If you're using an older method, you'll need autoclaves and special incubators for growing bacteria. If you're using a newer method, like pyrosequencing, you need to have a special clean room where you can work with a lower risk of contamination.


Fine, so how do you go about doing it?

This used to be an easier question to answer. But now that pyrosequencing (from 454) has come along, this answer isn't as simple.

Still, I can divide the steps into three general parts, and then, since there are some nice movies and Flash® animations on the internet, I will send you out to go watch them.

Here are the steps:

  • Break the genome into lots of small pieces at random positions.
  • Determine the sequence of each small piece of DNA.
  • Use an assembly program to figure out which pieces fit together.

The last two steps are a lot like determining what was written in the Dead Sea Scrolls.
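
The three steps above can be sketched as a toy program. This is only an illustration under strong assumptions: reads are error-free, all come from the same strand, and the sequence has no repeats. It uses a simple greedy merge of the best-overlapping pair of reads, which is one textbook approach and not necessarily what any particular assembly program does. All function names and the example sequence are made up.

```python
import random


def shotgun_reads(genome, read_len, n_reads, seed=0):
    """Steps 1 and 2 (stand-in): copy out fragments starting at random positions."""
    rng = random.Random(seed)
    last_start = len(genome) - read_len
    starts = [rng.randint(0, last_start) for _ in range(n_reads)]
    return [genome[s:s + read_len] for s in starts]


def overlap(a, b, min_len):
    """Length of the longest suffix of a that equals a prefix of b (0 if below min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def drop_contained(seqs):
    """Remove exact duplicates and sequences that lie entirely inside another one."""
    unique = list(dict.fromkeys(seqs))
    return [s for s in unique if not any(s != o and s in o for o in unique)]


def greedy_assemble(reads, min_overlap=3):
    """Step 3: repeatedly merge the pair of sequences with the largest overlap."""
    contigs = drop_contained(reads)
    while len(contigs) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_overlap)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # nothing left that overlaps; return separate contigs
        merged = contigs[best_i] + contigs[best_j][best_n:]
        rest = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs = drop_contained(rest + [merged])
    return contigs


# Three hand-picked overlapping reads from the made-up sequence ATGGCGTACGTTAGC.
reads = ["ACGTTAGC", "ATGGCGTA", "GCGTACGT"]
print(greedy_assemble(reads))  # prints ['ATGGCGTACGTTAGC']
```

With randomly drawn reads (for example from `shotgun_reads`), the result depends on whether the reads happen to cover the whole sequence with enough overlap, so you may get several contigs back instead of one.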


Stay tuned, there will be more.

And there is:
Part II: Sequencing strategies
Part III: Reads and chromats
Part IV: How many reads does it take?
Part V: checking out the library


Comments

1

I'm not sure if this is the right place to ask, but what is shotgun sequencing? I've always wondered how they're able to sequence AND differentiate different species...

Posted by: Brian | January 22, 2007 10:45 AM

2

This is a fine place to ask.

Shotgun sequencing is a strategy for determining a DNA sequence that involves breaking a DNA molecule into several smaller pieces, then determining the sequence of DNA in each piece, and last, using software to put the smaller pieces together into a longer piece.

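It's called "shotgun sequencing" because the DNA is broken at random positions, like the scatter from a shotgun blast, and the pieces are sequenced without mapping them first.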

As far as differentiating between species, this is pretty easy to do. You know where you got your DNA sample, so you only need to distinguish between the DNA pieces that you're trying to sequence and DNA from the vector or from E. coli. That's pretty easy to do using standard sequence comparison programs like BLAST or cross_match.

I'll discuss shotgun sequencing in more detail in future posts on this subject.

Posted by: Sandra Porter | January 22, 2007 11:15 AM
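
The vector-screening idea in the reply above can be illustrated with a toy sketch. To be clear, this is not how BLAST or cross_match work internally, and it does not call either program. It only flags a read as vector-like when a large fraction of its short subsequences (k-mers) also occur, exactly, in a known vector sequence. The function names, the threshold, and both example sequences are assumptions made for illustration.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def looks_like_vector(read, vector_kmers, k, min_fraction=0.5):
    """Flag a read if at least min_fraction of its k-mers occur in the vector."""
    read_kmers = kmers(read, k)
    if not read_kmers:
        return False  # read shorter than k: nothing to compare
    shared = len(read_kmers & vector_kmers)
    return shared / len(read_kmers) >= min_fraction


# Short illustrative stand-in for a vector sequence (not a full real vector).
VECTOR = "GGATCCTCTAGAGTCGACCTGCAGGCATGC"
K = 8
vector_kmers = kmers(VECTOR, K)

vector_read = VECTOR[5:25]                 # a read taken from the vector itself
insert_read = "ATGAAACGCATTAGCACCACCATT"   # hypothetical read from the sample

print(looks_like_vector(vector_read, vector_kmers, K))  # True
print(looks_like_vector(insert_read, vector_kmers, K))  # False
```

Exact k-mer matching breaks down as soon as reads contain sequencing errors, which is one reason alignment-based comparison programs are used in practice.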

3
"Ten million for a human genome seems about right, especially considering the original version was estimated to cost about 3 billion dollars."
According to a current press release from Solexa:
"Solexa expects its first-generation instrument, the 1G Genome Analyzer, to generate over a billion bases of DNA sequence per run and to enable human genome resequencing below $100,000 per sample, making it the first platform to reach this important milestone."
Their 1G machine allows sequencing of 1 billion base pairs per run. It is a chip-based, massively parallel, modified Sanger sequencing method. The principle is depicted here:
http://www.solexa.com/technology/sbs.html
and
http://www.solexa.com/technology/demo.html

Posted by: sparc | January 22, 2007 12:27 PM

5

"I'm not sure if this is the right place to ask, but what is shotgun sequencing? I've always wondered how they're able to sequence AND differentiate different species..."

maybe this is a reference to metagenomics?

Posted by: p-ter | January 26, 2007 12:21 AM

6

like in this paper:
http://www.sciencemag.org/cgi/content/abstract/304/5667/66

If you sequence DNA from a microbial community, there's a certain stretch of DNA that acts as sort of a tag for a bacterial species. The number of times you see the tag and all the variants of it acts as a count of the abundance of different species, and you can make a phylogeny from them.

Sequences can still be assembled as normal, it's just that it's difficult to know when you have a complete genome from a given species. In metagenomics, however, that isn't the goal-- instead, you want to look at which genes are present, which sequences are already in databases, which are novel, etc.

If there are only a couple species, you can distinguish them by GC content or some other measure of base composition.

Posted by: p-ter | January 26, 2007 12:25 AM
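
The GC-content measure mentioned in the comment above is easy to compute. Here is a minimal sketch, assuming reads are plain strings of A, C, G, and T. The function names, the 0.5 cutoff, and the example reads are hypothetical, and a single cutoff like this would only be a reasonable choice when the species in the sample differ clearly in base composition.

```python
def gc_content(seq):
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def split_by_gc(reads, cutoff=0.5):
    """Sort reads into a low-GC bin and a high-GC bin around an assumed cutoff."""
    low = [r for r in reads if gc_content(r) < cutoff]
    high = [r for r in reads if gc_content(r) >= cutoff]
    return low, high


# Made-up reads: two AT-rich, two GC-rich.
reads = ["ATATTAATTA", "GCGCGGCCGC", "TTAAGATATT", "CCGGGCGACG"]
low, high = split_by_gc(reads)
print(low)   # ['ATATTAATTA', 'TTAAGATATT']
print(high)  # ['GCGCGGCCGC', 'CCGGGCGACG']
```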

7

P-ter

Good point about the possible metagenomics slant to that question. You're right, in those instances you are not sequencing genomes, you're taking a sample and looking to find out what's present in that sample. Usually, people identify bacteria by looking at the genes for ribosomal RNA, but GC content is helpful, too.

Posted by: Sandra Porter | January 26, 2007 12:56 PM


© 2006-2009 Seed Media Group LLC. ScienceBlogs is a registered trademark of Seed Media Group. All rights reserved.
