Seed Media Group

Discovering Biology in a Digital World

My thoughts on biology, teaching, life, and exploring the living world via the digital one. Only my opinions are represented by these postings; they do not represent the viewpoints of any funding agency or Geospiza, Inc.

Profile

Sandra Porter

I am a microbiologist and molecular biologist turned tenured biotech faculty turned bioinformatics scientist turned entrepreneur. My passion is developing instructional materials for 21st century biology (Geospiza Education).


    Hunting for huntingtin, part II: In which we're reminded that database searches are experiments

    Category: Bioinformatics, Genetics & Molecular Biology, Genomics, sequence analysis
    Posted on: December 19, 2007 9:13 AM, by Sandra Porter

    In which we're reminded that database searches are experiments, too.

    One of the trickiest things about bioinformatics experiments is repeating them. The challenge has nothing to do with the validity of the original results. The challenge is that, unless you made your own database and kept it in the same state, the database you use later, sometimes even a day later, is a different database. And if you query a different database, you may get a different result.
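One way to make a search easier to revisit is to save a small note about it next to the results. Here's a minimal sketch of that idea in Python. The `SearchRecord` class, its field names, and the helper functions are all made up for illustration; they aren't part of any NCBI tool.

```python
import json
from dataclasses import dataclass, asdict
from datetime import date


@dataclass
class SearchRecord:
    """Notes describing one database search (hypothetical structure)."""
    program: str      # e.g. "blastp"
    database: str     # name of the database that was searched
    query: str        # the query sequence itself
    searched_on: str  # ISO date of the search
    hits: list        # descriptions of the hits (empty if none)


def make_record(program, database, query, hits, searched_on=None):
    """Build a record, defaulting to today's date when none is given."""
    when = searched_on or date.today().isoformat()
    return SearchRecord(program, database, query, when, list(hits))


def to_json(record):
    """Serialize a record so it can be saved next to the raw output."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)


# Example: a blastp search with 15 glutamines that returned nothing.
record = make_record("blastp", "human proteins", "Q" * 15, hits=[])
print(to_json(record))
```

A note like this won't bring back an old version of the database, but it does tell you what was asked and when, which is the first thing you need when a result changes.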

    The series that I'm currently posting is one that I started working on a couple of years ago. Originally, I was going to repost these stories as is, but it seemed best to add another twist and see if I could reproduce some of the results, or at least find out which results have changed. In the next few posts, you'll see the results of those experiments.

    Playing catch-up with the latecomers
    Hi, for those of you who've just joined us, we've gotten lost in some databases while hunting for information on huntingtin. If you'd like to catch up a bit and come back later, you might want to read Hunting for huntingtin (part I).

    If not, here's a brief synopsis of the plot and what we've done so far:

    • learned about Woody Guthrie and Nancy Wexler
    • found a couple of reviews describing Huntington Disease
    • got the HD gene sequence and counted the number of CAGs
    • learned that CAG codes for glutamine and that glutamine can form hydrogen bonds
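If you'd like to try the CAG counting step yourself, here's a rough sketch in Python. The function name is my own, and the toy sequence is made up for the example; it is not the HD gene sequence. The count also ignores reading frame, so it only tells you how long the longest uninterrupted stretch of CAGs is.

```python
import re


def longest_cag_run(dna):
    """Return the number of CAGs in the longest uninterrupted CAG run."""
    # Find every stretch made of back-to-back CAG triplets.
    runs = re.findall(r"(?:CAG)+", dna.upper())
    # Each triplet is three letters long; an empty list means no CAGs.
    return max((len(run) // 3 for run in runs), default=0)


# Toy sequence (an assumption, for illustration only): five CAGs in a row.
toy = "ATG" + "CAG" * 5 + "CCG"
print(longest_cag_run(toy))  # prints 5
```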

    Then, we got curious about those extra CAGs and wanted to know if they result from the disease or cause the disease. So we looked up huntingtin at the UCSC genome browser and saw that there are similar genes in mice, pigs, and zebrafish (plus a few other members of the animal kingdom that were not discussed).

    Ah hah!

    Mice have a similar gene, and we know that the Jackson Lab is the place to go for all things mouse. Sure enough, the Jackson mouse breeders have made mice with extra CAGs, and... the mice get the symptoms of HD.

    So, you guessed it: the extra CAGs are the problem, not the result.

    As the fearless leader of this expedition, I vote that we now look at those extra CAGs a little more closely.

    Searching for the lost glutamines
    You might remember, in part I, I mentioned looking for 3-D structures with polyglutamine. I did find one structure with a polyglutamine sequence, but it looked like the crystallographers weren't able to resolve the part in the structure where the glutamines were supposed to be. Cn3D shows the missing glutamines in grey in the sequence window. The structure window shows this:


    Looking for other structures

    Okay, so what can I do now? What would you do?

    I decided to do a blastp search, since NCBI has this cool new feature where protein sequences that have a corresponding structure are linked to the structure record in the MMDB.

    So I used blastp to search the human protein database with a sequence of 15 glutamines.
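The query itself is easy to build. Q is the one-letter code for glutamine, so 15 glutamines is just 15 Qs. Here's a small sketch that writes the query in FASTA format, ready to paste into a search form. The function and the sequence name are my own inventions for this example.

```python
def polyq_fasta(length=15, name="polyQ_query"):
    """Return a FASTA-formatted query made of `length` glutamines (Q)."""
    if length < 1:
        raise ValueError("length must be at least 1")
    # First line is the FASTA header, second line is the sequence.
    return f">{name}_{length}\n{'Q' * length}\n"


print(polyq_fasta(15))
```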

    What did I find?

    In 2005, my search gave this result: No significant similarity found.

    This year, I got results.

    But they're strange.

    I have some perfect matches to things that I've never heard of, like Vanderwaltozyma polyspora and Brugia malayi, and to some things that I have heard of, like Anopheles gambiae (some type of mosquito) and Chlamydomonas.

    Where are the human proteins?

    Right. I said these experiments are hard to repeat.
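If you keep the hit lists from both searches, comparing them takes only a few lines. This is a sketch with helper names of my own choosing. The 2007 list below contains only the organisms named in this post, not the full set of hits.

```python
def compare_hits(earlier, later):
    """Report which hits appeared, disappeared, or stayed between searches."""
    earlier, later = set(earlier), set(later)
    return {
        "gained": sorted(later - earlier),
        "lost": sorted(earlier - later),
        "kept": sorted(earlier & later),
    }


def has_organism(hits, organism):
    """True if any hit description mentions the organism (case-insensitive)."""
    return any(organism.lower() in hit.lower() for hit in hits)


hits_2005 = []  # "No significant similarity found"
hits_2007 = [   # only the organisms mentioned in the post
    "Vanderwaltozyma polyspora",
    "Brugia malayi",
    "Anopheles gambiae",
    "Chlamydomonas",
]

print(compare_hits(hits_2005, hits_2007))
print(has_organism(hits_2007, "Homo sapiens"))
```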

    See ya next time. We'll try to muddle through the mystery and get back on track with the story.
