Hunting for huntingtin, part II: In which we're reminded that database searches are experiments

By sporte on December 19, 2007.

In which we're reminded that database searches are experiments, too.

One of the trickiest things about bioinformatics experiments is repeating them. The challenge isn't related to the validity of the original results. It's that, unless you made your own database and kept it in the same state, the database that you use at a later time, sometimes even a day later, is a different database. And if you query a different database, you may get a different result.
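One way to treat a search like an experiment is to write down what you searched, and when, right next to the result. Here's a minimal sketch in Python. The `SearchRecord` class, the `record_search` helper, and all of the field names are hypothetical, made up for illustration; they aren't part of any NCBI tool.

```python
import json
from dataclasses import dataclass, asdict
from datetime import date


@dataclass
class SearchRecord:
    """Notebook-style entry for one database search (hypothetical format)."""
    program: str     # e.g. "blastp"
    database: str    # name of the database as shown on the search page
    query: str       # the sequence or text that was searched
    run_date: str    # ISO date on which the search was run
    notes: str = ""  # anything else worth remembering, such as a database release


def record_search(program, database, query, run_date=None, notes=""):
    """Build a record, defaulting the date to today."""
    if run_date is None:
        run_date = date.today().isoformat()
    return SearchRecord(program, database, query, run_date, notes)


if __name__ == "__main__":
    entry = record_search("blastp", "human proteins", "Q" * 15,
                          notes="repeat of a 2005 search")
    print(json.dumps(asdict(entry), indent=2))
```

A record like this won't make an old search repeatable, but it does tell you which database, and which day, a result belongs to.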

The series that I'm currently posting is one that I started working on a couple of years ago. Originally, I was going to repost these stories as is, but it seemed best to add another twist and see if I could reproduce some of the results, or at least find out which results have changed. In the next few posts, you'll see the results of those experiments.

Playing catch-up with the latecomers
Hi, for those of you who've just joined us, we've gotten lost in some databases while hunting for information on huntingtin. If you'd like to catch up a bit and come back later, you might want to read Hunting for huntingtin (part I).

If not, here's a brief synopsis of the plot and what we've done so far:

learned about Woody Guthrie and Nancy Wexler
found a couple of reviews describing Huntington Disease
got the HD gene sequence and counted the number of CAGs
learned that CAG codes for glutamine and that glutamine can form hydrogen bonds
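If you'd like to see what counting CAGs could look like in code, here's a small sketch in Python. It's only one possible approach, not necessarily how the counting was done in part I, and the sequence in the example is a made-up toy, not the real HD gene. The function name `longest_cag_run` is also just an illustration.

```python
import re


def longest_cag_run(dna):
    """Return the number of CAGs in the longest uninterrupted run of CAGs."""
    # Each match is one stretch of back-to-back CAG triplets.
    runs = re.findall(r"(?:CAG)+", dna.upper())
    return max((len(run) // 3 for run in runs), default=0)


if __name__ == "__main__":
    # Made-up toy sequence, NOT the real HD gene sequence.
    toy = "ATGGCG" + "CAG" * 5 + "CCGCCA" + "CAG" * 2
    print(longest_cag_run(toy))  # 5
```

Since CAG codes for glutamine, a run of CAGs in the coding sequence corresponds to a run of glutamines (Q) in the protein.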

Then we got curious about those extra CAGs and wanted to know if they result from the disease or cause the disease. So we looked up huntingtin at the UCSC genome browser and saw that there are similar genes in mice, pigs, and zebrafish (plus a few other members of the animal kingdom that were not discussed).

Ah hah!

Since mice have a similar gene, and we know that the Jackson Lab is the place to go for all things mouse, we looked there. Sure enough, the Jackson mouse breeders have made mice with extra CAGs, and... the mice get the symptoms of HD.

So, you guessed it: the extra CAGs are the problem, not the result.

As the fearless leader of this expedition, I vote that we now look at those extra CAGs a little more closely.

Searching for the lost glutamines
You might remember that, in part I, I mentioned looking for 3-D structures with polyglutamine. I did find one structure with a polyglutamine sequence, but it looked like the crystallographers weren't able to resolve the part of the structure where the glutamines were supposed to be. Cn3D shows the missing glutamines in grey in the sequence window. The structure window shows this:

Looking for other structures

Okay, so what can I do now? What would you do?

I decided to do a blastp search, since NCBI has this cool new feature where protein sequences that have a corresponding structure are linked to the structure record in the MMDB.

So I used blastp to search the human protein database with a sequence of 15 glutamines.
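For anyone who wants to try the same search, the query is easy to build. This is a small Python sketch that writes a run of glutamines (one-letter code Q) as a FASTA record. The function name `polyq_fasta` and the record name `polyQ_query` are my own inventions for this example, and the sketch only builds the text of the query; it doesn't run the search.

```python
def polyq_fasta(n=15, name="polyQ_query"):
    """Return a FASTA record for a run of n glutamines (one-letter code Q)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # First line is the FASTA header, second line is the sequence itself.
    return f">{name}_{n}\n{'Q' * n}\n"


if __name__ == "__main__":
    print(polyq_fasta(), end="")
```

The text it prints can be pasted into the query box on the blastp search page.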

What did I find?

In 2005, my search gave this result: No significant similarity found.

This year, I got results.

But they're strange.

I have some perfect matches to things that I've never heard of, like Vanderwaltozyma polyspora and Brugia malayi, and to some things that I have heard of, like Anopheles gambiae (some type of mosquito) and Chlamydomonas.

Where are the human proteins?

Right. I said these experiments are hard to repeat.

See ya next time. We'll try to muddle through the mystery and get back on track with the story.

