Digital Biology Friday: What was that gene anyway?

By sporte on July 21, 2006.

Welcome back!

If you've just joined us, we're in the middle of a quest to find the identity of an unknown nucleotide sequence. To summarize our results so far, we used this sequence to do a blastn search of GenBank, using all the default settings at the NCBI. You can see the beginning of the project here.

And we had some rather curious results.

It appeared that our sequence matched sequences from very diverse organisms, like Dengue virus, E. coli, and Simian Immunodeficiency virus. Very strange!

There was another curious word, too, that appeared in the descriptions for each of the results.

That word was VECTOR. "Vector" is a word that I imagine Sherlock Holmes would have used if he wanted to interrogate a scientist or mathematician and find out what they did without having them realize that he was trying to do so.

To a mathematician or a physicist, a vector is a quantity with both a magnitude and a direction, usually drawn as an arrow. To a public health official, a vector is a rat, mouse, louse, or insect: anything capable of carrying a disease.

And, to a molecular biologist, a vector can be a plasmid, phage, or eucaryotic virus that is used to move genes around from place to place. This information can help us make some good guesses about the function of our unknown bit of DNA, because vectors have been engineered to have some common features. Some of these are special DNA sequences that allow plasmids to be copied. Others are genes that encode enzymes that make bacteria resistant to different antibiotics. If a bacterial cell contains a plasmid with one of these antibiotic resistance genes, it produces a protein that allows it to live in the presence of that antibiotic. These features are helpful for biologists because we can add the drug, keep the bacteria that are resistant to it, and kill off all the rest.

Okay, where were we?

Back to our results:

Here is our list of matching sequences from the blastn search. We had some good guesses last week about answers, and one was right, but involved far too much work.

I think it's far easier to look at the data.

Here's how.

We click the link to the alignment score.

This shows us where our sequences match each other. Pay attention to the positions of the subject sequence that match our query! We need to remember this. Our sequence starts matching at 44,246 and ends matching at 44,665.

Then we click the link to the matching sequence, and scroll down the page.

Eventually, we reach numbers. These numbers represent positions in the DNA sequence.

Here's the region where our sequence matches:

And our answer is the beta-lactamase gene. This gene codes for an enzyme that breaks the beta-lactam ring, thus disabling antibiotics like penicillin.
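If you'd rather not scroll through the record by eye, the same lookup can be sketched in a few lines of Python. This is only an illustration, not the method used in this post: the match positions (44,246 to 44,665) come from the alignment, but the `Feature` type, the `overlapping_features` helper, and every feature coordinate in the list are made up for the example and are not copied from the real GenBank record.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    """One annotated region of a sequence record (1-based, inclusive positions)."""
    kind: str
    label: str
    start: int
    end: int


def overlapping_features(features, match_start, match_end):
    """Return (feature, overlap_length) pairs for features that touch the match."""
    hits = []
    for f in features:
        # Length of the shared stretch; zero or negative means no overlap.
        overlap = min(f.end, match_end) - max(f.start, match_start) + 1
        if overlap > 0:
            hits.append((f, overlap))
    return hits


# Subject positions read from the BLAST alignment in this post.
MATCH_START, MATCH_END = 44_246, 44_665

# HYPOTHETICAL annotations: these coordinates are invented for illustration
# and do not come from the actual GenBank record.
FEATURES = [
    Feature("rep_origin", "plasmid origin of replication", 43_100, 43_700),
    Feature("CDS", "beta-lactamase", 44_000, 44_860),
    Feature("CDS", "some viral gene", 45_500, 47_000),
]

if __name__ == "__main__":
    print("match length:", MATCH_END - MATCH_START + 1, "bases")
    for feature, overlap in overlapping_features(FEATURES, MATCH_START, MATCH_END):
        print(f"{feature.kind} '{feature.label}' overlaps the match by {overlap} bases")
```

With a real record, you would fill the list from the FEATURES section of the GenBank entry instead of typing in pretend numbers. The comparison itself stays the same: a feature counts as a hit if its start and end positions bracket any part of the matched region.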



hmmmm....
Well it is easier, but still you do not know exactly what part of your DNA sequence is matching to the annotated protein.

To know that, it is much better to do a blast search against a protein DB. Then you will have information about the conservation of your sequence, which can also be useful.

And after that you can use PFAM to be sure that the protein has a "functional" conserved domain.

As you have it, it would be like the very first step, but then you have to carry on, and verify your initial findings using more specific tools.

Hi Diego,

Actually, you can look at the GenBank record and see how the DNA sequence corresponds to the encoded protein. I show it here.

I agree, PFAM is helpful if you're trying to understand the function of a truly unknown protein, or if your match isn't as good as it was in this case (100%). I also really like the Conserved Domain Database.

Hey, thanks for the really informative posts. I've been trying to get a handle on this stuff for a while, and seeing these tasks done in context just made it all click.
