Digital Biology Friday: Those BLASTed results!

By sporte on July 14, 2006.

Last week, we embarked on an adventure with BLAST.

BLAST, short for Basic Local Alignment Search Tool, is a collection of programs, written by scientists at the NCBI (1), that are used to compare sequences of proteins or nucleic acids. BLAST is used in multiple ways, but last week my challenge to you, dear readers, was to pick a sequence, any sequence, from a set of 16 unknown sequences and use BLAST to identify that sequence.

This week, we'll examine the results.

I did the experiment, too, with a completely different unknown sequence that's pasted below. This sequence is not part of the data set that I put at the Geospiza Education site.

>unknown_seq

ATGAGTATTCAACATTTCCGTGTCGCCCTTATTCCCTTTTTTGCGGCATTTTGCC
TTCCTGTTTTTGCTCACCCAGAAACGCTGGTGAAAGTAAAAGATGCTGAAGATC
AGTTGGGTGCACGAGTGGGTTACATCGAACTGGATCTCAACAGCGGTAAGATCC
TTGAGAGTTTTCGCCCCGAAGAACGTTTTCCAATGATGAGCACTTTTAAAGTTCT
GCTATGTGGCGCGGTATTATCCCGTGTTGACGCCGGGCAAGAGCAACTCGGTCG
CCGCATACACTATTCTCAGAATGACTTGGTTGAGTACTCACCAGTCACAGAAAA
GCATCTTACGGATGGCATGACAGTAAGAGAATTATGCAGTGCTGCCATAACCAT
GAGTGATAACACTGCGGCCAACTTACTTCTGACAACGATC

Looking at the letters, of course, doesn't really help me at all. All I see are A's, G's, C's, and T's.
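
If you want to convince yourself that the raw letters really don't give much away, here is a minimal Python sketch that tallies them. The names `UNKNOWN_SEQ` and `base_composition` are just labels chosen for this example; nothing here comes from BLAST or the tutorial.

```python
from collections import Counter

# The unknown sequence from this post, joined into a single string.
UNKNOWN_SEQ = (
    "ATGAGTATTCAACATTTCCGTGTCGCCCTTATTCCCTTTTTTGCGGCATTTTGCC"
    "TTCCTGTTTTTGCTCACCCAGAAACGCTGGTGAAAGTAAAAGATGCTGAAGATC"
    "AGTTGGGTGCACGAGTGGGTTACATCGAACTGGATCTCAACAGCGGTAAGATCC"
    "TTGAGAGTTTTCGCCCCGAAGAACGTTTTCCAATGATGAGCACTTTTAAAGTTCT"
    "GCTATGTGGCGCGGTATTATCCCGTGTTGACGCCGGGCAAGAGCAACTCGGTCG"
    "CCGCATACACTATTCTCAGAATGACTTGGTTGAGTACTCACCAGTCACAGAAAA"
    "GCATCTTACGGATGGCATGACAGTAAGAGAATTATGCAGTGCTGCCATAACCAT"
    "GAGTGATAACACTGCGGCCAACTTACTTCTGACAACGATC"
)


def base_composition(seq):
    """Return a Counter of how often each letter occurs in seq."""
    return Counter(seq.upper())


counts = base_composition(UNKNOWN_SEQ)
print("length:", len(UNKNOWN_SEQ))
for base in sorted(counts):
    # Print each base, its count, and its share of the sequence.
    print(base, counts[base], f"{counts[base] / len(UNKNOWN_SEQ):.1%}")
```

The tally tells you how long the sequence is and how the four bases are distributed, but not what the sequence is. For that, it has to be compared with sequences that someone has already identified.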

To solve the problem and identify the sequence, I have to compare my unidentified sequence to a collection of sequences that have already been identified by other people and see if my sequence matches any of them.
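
To make the idea of "comparing against a collection" concrete, here is a toy Python sketch. It is not how BLAST works internally: it simply slides the query along each known sequence, without gaps, and counts identical letters. The two database records and their names are made up for illustration, and the query is only the first 22 letters of the unknown sequence.

```python
def best_ungapped_match(query, subject):
    """Slide query along subject; return (best identity count, offset).

    Only offsets where the whole query fits inside subject are tried.
    Returns (0, -1) if subject is shorter than query or nothing matches.
    """
    best = (0, -1)
    for offset in range(len(subject) - len(query) + 1):
        window = subject[offset:offset + len(query)]
        matches = sum(q == s for q, s in zip(query, window))
        if matches > best[0]:
            best = (matches, offset)
    return best


def rank_hits(query, database):
    """Return (fraction identical, name, offset) tuples, best hit first."""
    hits = []
    for name, seq in database.items():
        matches, offset = best_ungapped_match(query, seq)
        hits.append((matches / len(query), name, offset))
    return sorted(hits, reverse=True)


# First 22 letters of the unknown sequence from this post.
query = "ATGAGTATTCAACATTTCCGTG"

# Hypothetical mini "database": both records are invented for this example.
known = {
    "toy_record_A": "GGG" + "ATGAGTATTCAACATTTCCGTG" + "GGG",
    "toy_record_B": "T" * 27,
}

for identity, name, offset in rank_hits(query, known):
    print(f"{name}: {identity:.0%} identical at offset {offset}")
```

A real database holds millions of records, so checking every position this way would be far too slow. BLAST relies on faster heuristics to find promising matches (1), but the goal is the same: rank the known sequences by how well they match the query.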

First, I copy my unknown sequence, then I follow the steps that are outlined in the BLAST for Beginners tutorial at the Geospiza Education web site. In the tutorial, I click the bright green arrows to move from page to page and see what to do.

My favorite way to use the tutorials is to open two web browser windows and resize the windows so they fit side by side on a computer screen. Then, I go through the tutorial in one window and do the steps myself in the other window.

(FYI: I started making these tutorials because I thought I would go crazy if I had to teach classes by spending fifty minutes saying "Click here" then "Click here" then "Click here".)

Eventually, I get to a page with results.

BLAST has looked into its crystal ball and we get:

Hmm, I see......

A graph with lots of red lines.

What does this mean?


To put it simply, the graph shows me that at least one hundred sequences in GenBank match my entire sequence.

If I look farther down the page, I come to more curious results.


To summarize what I see, I have a list of fifty results (only some of them are shown in this image). All the results have a score of 833 and an E value of 0.0, but the descriptions look like different things. C'mon, what do Dengue virus, SIV, and E. coli have in common?

(at least if we don't read carefully, wink, wink, nudge, nudge)

Strange....

Why would my sequence match (at least) 50 different sequences in the nucleotide database?

Can you solve the mystery?

Copy the sequence at the beginning of this post and give it a try. Feel free to submit comments with your answer.

Or wait until next week for more of the story.

References:

1. Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schäffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402.



Hmm, I'm reaching here ... could this sequence be the origin of replication for plasmids as well as some viruses? Curious.

It is a beta-lactamase, an enzyme related to antibiotic resistance. BLAST it against a protein DB, and/or run it against PFAM

Coleen,
Good guess. It is a gene that's found in many plasmids.

Diego,
You are right but you're solving the problem the hard way. I'll show you an easier way to find the answer next week.
