Digital Biology Friday: hot plants and viruses V

By sporte on May 25, 2007.

tags: plants, bioinformatics, sequence analysis, viruses, fungi

How does grass grow in the extremely hot soils of Yellowstone National Park? The quest continues.

Read part I, part II, part III, and part IV to see how we got here.

And read onward to see where we will go.

In our last episode, I discovered a new tab in the protein database (well, new to me anyway).

Related structures

If you select this tab, you get a list of protein sequences that are similar, by blastp, to the amino acid sequences in protein structures.

Naturally, I clicked the tab, and then the Links link, to see what this was all about.

But I was still bothered by the whole question of how a sequence gets identified as having a "Related Structure." What are the cutoff values? Where do they come from? Via Pfam? Via blast?

I spent (all right, wasted) a long time searching around the NCBI website, trying to figure out how sequences got promoted to the Related Structures category. Finally, I gave up and asked a friend of mine at the NCBI, who kindly referred me to the January publication in Nucleic Acids Research (2).

To paraphrase the paper (2), a protein sequence is considered to have a related structure if the amino acid sequence of the protein matches an amino acid sequence in a structure. (Okay, I guessed that already, but how well do they have to match?)

The criteria for matching are pretty conservative:

1. There must be 50 or more aligned amino acids

2. Of those aligned amino acids, 30%, or more, must be identical.
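If you like to see rules written out as code, here is a minimal sketch of those two cutoffs in Python. The function name and its inputs are my own invention for illustration; this is not how the NCBI actually implements the check, just the two criteria above applied to the numbers you would read off a blastp alignment.

```python
def has_related_structure(aligned_length: int, identical: int) -> bool:
    """Apply the two cutoffs described above to one pairwise alignment.

    aligned_length: number of aligned amino acids
    identical: how many of those aligned positions are identical
    (Hypothetical helper, not an NCBI function.)
    """
    # Criterion 1: there must be 50 or more aligned amino acids.
    if aligned_length < 50:
        return False
    # Criterion 2: 30% or more of the aligned amino acids must be identical.
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return identical * 100 >= 30 * aligned_length


print(has_related_structure(120, 40))  # long enough, about 33% identical
print(has_related_structure(45, 45))   # perfect match, but too short
```

Note that both conditions have to hold: a short stretch of perfect identity does not count, and neither does a long alignment with only a few identical positions.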

So, the sequence of our unidentified protein is related to the sequence of a protein in a structure file. Fine! What does that mean? How does that help us?

On the other side of the Related Structures tab and what I found there

I clicked the tab, clicked the Links link (named "Related Structures"), and saw that the Aspergillus terreus sequence that was 28% identical to my amino acid sequence (with an E value of 4 x 10^-14) is related to a protein that has 8 identical subunits in a structure named 2CLB.

Since my 331 amino acid virus protein is significantly similar to the Aspergillus sequence, I think it's quite likely that my sequence is similar to the protein in the related structure, too. (Yes, it's my protein now, at least until I find a new pet molecule to play with.)

Before going any further, though, I wanted to make sure that the part of the Aspergillus protein that matched the eight chains of 2CLB was the same part that matched my pet viral protein. Returning to the Blink page, I clicked the linked blastp score to see how the two sequences aligned. The results were reassuring. The part of the Aspergillus protein that matched 2CLB was the same part that matched the virus protein.
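I did this check by eye, but the same comparison can be sketched in a few lines of Python. Each alignment covers a start-to-end range on the Aspergillus protein, and the question is how much those two ranges overlap. The coordinates below are made up for illustration (I'm not reporting the real alignment positions here), and the function names are hypothetical.

```python
def overlap_length(region_a, region_b):
    """Number of residues shared by two (start, end) ranges.

    Coordinates are 1-based and inclusive, the way blastp reports them.
    """
    start = max(region_a[0], region_b[0])
    end = min(region_a[1], region_b[1])
    return max(0, end - start + 1)


def shared_fraction(region_a, region_b):
    """Overlap as a fraction of the shorter of the two ranges."""
    length_a = region_a[1] - region_a[0] + 1
    length_b = region_b[1] - region_b[0] + 1
    return overlap_length(region_a, region_b) / min(length_a, length_b)


# Made-up positions on the Aspergillus protein, for illustration only.
structure_match = (20, 180)  # region aligned to a 2CLB chain (assumed)
virus_match = (35, 210)      # region aligned to the virus protein (assumed)

print(overlap_length(structure_match, virus_match))
print(round(shared_fraction(structure_match, virus_match), 2))
```

A fraction near 1 means the two matches sit on the same stretch of the protein. A fraction near 0 would mean the structure and the virus protein resemble different parts of the Aspergillus sequence, and the chain of reasoning would fall apart.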

Next week, we'll see what happens when we align the sequences to the structure.

Oh yeah, and we might even get some ideas about what the protein does that helps the plant survive the heat. Who knows? Join us next week for the end of the story.

References:

1. Márquez, L., et al. 2007. A Virus in a Fungus in a Plant: Three-Way Symbiosis Required for Thermal Tolerance. Science 315: 513-515.

2. Wang Y, Addess KJ, Chen J, Geer LY, He J, He S, Lu S, Madej T, Marchler-Bauer A, Thiessen PA, Zhang N, Bryant SH. 2007. MMDB: annotating protein sequences with Entrez's 3D-structure database. Nucleic Acids Res. 35(Database issue): D298-300.


Dear Sandra,
this is a great series! Just the kind of thing I like to do whenever I can find a spare minute or two. Currently, I am trying to do something similar on my blog.

Please allow one remark: My experience in sequence analysis has taught me never to trust any of the precomputed NCBI links. I have to admit that before reading your blog I had no idea about the 'related structure' links, but the way the NCBI computes them appears, hmm, suboptimal. In this particular case, I am absolutely unconvinced that your viral protein has anything to do with ferritin. I couldn't resist doing my own analysis, but unfortunately that didn't lead me very far. I think that this protein is found in many filamentous fungi, but is very fast-evolving and thus hard to spot.

Good luck with the continuation of your series!

Kay

Thanks Kay!

I think I'll have to add your blog to my list in the blog roll. It looks good to follow!

As for the protein, I'm not going to give anything away now, 'cause I post the finale tomorrow morning.
