Digital Biology Friday: It's still Friday!

"Hey Rocky, watch me pull a rabbit out of my hat!"

I realized that I should add just a bit more information to my last answer on gene identification, so here it is.

After the last installment, Diego commented:

but still you do not know exactly what part of your DNA sequence is matching to the annotated protein.

Ahh, but we do.

And I was negligent in not showing you.

There are multiple ways to view the GenBank record that we arrived at while following links from our matching sequence.

A very handy way, especially if you're looking at where sequences align to a larger subject sequence, like a big eukaryotic gene, is to view the GenBank record and the associated annotations as a graph.

To do this, I select Graph from the Display pull-down menu.

And voila!
I see a graph and I see where beta lactamase is located on my graph.

I click the graph right above the beta lactamase (bla) arrow and I see where the sequence of the DNA, the sequence of the RNA, and the sequence of the protein all map relative to each other.

Remember, there was a 100% match between my unknown sequence (the match started at 1 and ended at the last nucleotide) and the nucleotides between positions 44,246 and 44,665? You can check this here.

So, in fact, we do know what part of our query DNA sequence matches (all of it) and where it matches, too.
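
If you'd rather check this kind of thing with a few lines of code than by clicking on a graph, here is a minimal sketch of the same idea in Python. The match coordinates (44,246 to 44,665) come from the result above; everything else is an assumption for illustration. In particular, the `FEATURES` table is hypothetical: its names and coordinates are made up and are not copied from the GenBank record, and the helper functions `overlap` and `features_hit` are my own names, not part of any library.

```python
# Match coordinates on the subject sequence, taken from the post
# (1-based, inclusive on both ends).
MATCH_START, MATCH_END = 44_246, 44_665

# HYPOTHETICAL annotation table: these names and coordinates are invented
# for illustration and are NOT taken from the real GenBank record.
FEATURES = [
    ("upstream_gene", 43_000, 43_900),
    ("bla", 44_000, 44_860),
    ("downstream_gene", 45_000, 45_700),
]


def overlap(a_start, a_end, b_start, b_end):
    """Return how many positions two inclusive intervals share."""
    return max(0, min(a_end, b_end) - max(a_start, b_start) + 1)


def features_hit(start, end, features):
    """List (name, shared_positions) for every feature the match overlaps."""
    hits = []
    for name, f_start, f_end in features:
        shared = overlap(start, end, f_start, f_end)
        if shared:
            hits.append((name, shared))
    return hits


# Length of the matched region on the subject sequence.
match_length = MATCH_END - MATCH_START + 1
hits = features_hit(MATCH_START, MATCH_END, FEATURES)

print(f"Match covers {match_length} nucleotides")
for name, shared in hits:
    print(f"  overlaps {name}: {shared} shared positions")
```

With a real record, you would replace the made-up table with the feature coordinates listed in the GenBank annotations. The interval arithmetic stays the same.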

Copyright Geospiza, Inc.

I did not know about the graph option. Thanks a lot for that, and for the time invested in this exercise.

Have a nice weekend.