sequence analysis

The next time you bite into a crisp, juicy apple and the tart juices spill out around your tongue, remember the honeybee. Our fall harvest depends heavily on honeybees carrying pollen from plant to plant. Luscious fruits and vegetables wouldn't grace our table, were it not for the honeybees and other pollinators. Lately though, the buzz about our furry little helpers hasn't been good. Honeybees have been dying, victims of a new disease called "colony collapse disorder," with the US alone losing a large number of hives in recent years. Why? Researchers have speculated about everything…
Would you like to have some fun playing with chromatograms and helping our class identify bacteria in the dirt? This quarter, my bioinformatics class at Shoreline Community College will be working with chromatograms that were obtained by students at Johns Hopkins University, and graciously made available by Dr. Rebecca Pearlman. (See "Sequencing the campus at the Johns Hopkins University" for more background.) We are going to do a bit of metagenomics by using FinchTV and blastn to identify the soil bacteria that were sampled from different biomes and then use an SQL query that I…
During the past few Fridays (or at least here and here), we've been looking at a paper that was published from China with some β-lactamase sequences that were supposedly from Streptococcus pneumoniae. The amazing thing about these particular sequences is that β-lactamase has never been seen in S. pneumoniae before, making this a rather significant (and possibly scary) discovery. If it's correct. tags: DNA sequence analysis, antibiotic resistance, microbiology, blastn The way this sequence was identified as β-lactamase was through a blastn search at the NCBI. And in fact, it was correct to…
One of my readers asked: Why does genome sequencing cost so much? My short answer is because it's big. But I thought it would be fun to give a better answer to this question, especially since I'm sure many of you are wondering the same thing. Okay, so let's do some math. Don't worry, this math isn't very complicated and I'll explain where most of the numbers come from. Estimating costs from salaries First, we'll take the easy route. My experience with grant budgets has taught me that the greatest cost for any project comes from salaries. If we look at the PLoS paper with Craig Venter's…
If you've read any of the many stories lately about Craig Venter or Jim Watson's genome, you've probably seen a "SNP" appear somewhere. (If you haven't read any of the stories, CNN has one here, and my fellow bloggers have posted several here, here, here, here, here, and here.) You may be wondering, and rightly so: just what is a SNP? Never fear; hopefully this post will answer some of those questions. tags: DNA sequencing, DNA, SNPs, genetic testing SNP stands for Single Nucleotide Polymorphism. That's a mouthful. It means some people will have one base at a certain position, in a…
"Come quickly, Watson," said Sherlock Holmes, "I've been asked to review a mysterious sequence, whose importance I'm only now beginning to comprehend." The unidentified stranger handed Holmes a piece of paper inscribed with symbols and said it was a map of unparalleled value. Holmes gazed thoughtfully at the map, then slowly lifted his eyes and coldly surveyed his subject's beaming countenance. "You have an affinity for the ocean," said Holmes, "that you indulged to excess as a reckless youth. An experience as a medic in the military changed your life and gave you a reason to do more than…
Why, the ABRF of course! I spend a fair amount of time every summer giving workshops for college and high-school teachers on genomics and bioinformatics. One of the things that always surprises them is the amount of lab work that's carried out by people working in shared, or core, lab facilities. For example, if I were working at a research university and I wanted to sequence some DNA, maybe several patient samples, or a bacterial genome, I would send the DNA to a core lab and they would send me the sequences. I would analyze the data and write the paper. I've simplified that process a bit in…
I began this series last week with a question about a DNA sequence that was published and reported to be one of the first beta-lactamases to be found in Streptococcus pneumoniae. Mike has a great post about one of the problems with this paper. I think the data themselves are awfully suspicious. So, last week I suggested that you, dear readers, go and find out why. I gave you a link to the abstract and a place to get started. Perhaps that was too hard. Sigh. Okay, here's a little more help and another clue. I highlighted the accession numbers. Post your guesses in the comments.
If you've read the previous posts on this topic, here and here, you're probably aware by now that I have this weird (okay, maybe fanatical) obsession with data. Or at least, with knowing if my data are right so I can get on with life, do the analysis and figure out the results. My results from last week suggested that re-processing chromatogram data (from the ABI 3730) with phred was probably a bad idea, but still, I only had one data point and I really wanted to know if anyone had done a more thorough study and compared larger numbers of chromatograms. Naturally, someone had. tags: DNA…
One time I was watching a football game on TV and they had a short quiz, called "You make the call" or something like that, and you had to watch a play and pretend to be a referee. A short video clip showed football players falling over each other. Then you were shown three possible calls that a referee might make and asked to choose which was correct. After the commercial, the announcer would tell you which choice was right and explain why it was correct. I suppose this was a trick to make us watch the commercials, but I thought the game was kind of fun. My SciBling "Mike the Mad" had a great…
Sometimes asking a question can be a mistake. Especially when your question leads to more questions and having to question things that you didn't want to question, and pretty soon you begin to regret ever opening the file and looking at the data and asking the question in the first place. Sigh. Take a deep breath. Yesterday through a twist of fate, I ended up taking a look at the DNA sequences produced by two different base calling programs from the same chromatogram file, from an ABI 3730 DNA sequencing instrument. I thought they would be the same, or at least similar. tags: DNA…
Yes, you can! Really, I thought this was going to be more challenging, but the nice folks at the NCBI have made a special personal genomics FTP site. You can also get Craig Venter's genome, and maybe even do some comparative genomics and see if one has a few deletions. After all, don't you want to find out whose is bigger? Oh, I can tell this is going to be fun! Get the traces at ftp://ftp.ncbi.nih.gov/pub/TraceDB/Personal_Genomics
What do you do when base-callers disagree? Okay, DNA sequencing community, I want your help with this one. One of these sequences was called by phred and the other by the ABI KB base calling program. Which one should I believe? tags: DNA sequencing, DNA, base-calling programs Sometimes I open up files and do short experiments just because - well, I'm curious. And sometimes I immediately wish I hadn't done that because what I opened looks like a larger can of worms than I really want to see. These graphs show the quality of each base in a DNA sequence on the y axis and the position of…
Many medical conditions today are treated but never cured. Imagine a child with a genetic disease like juvenile diabetes or hemophilia. This child will be taking expensive medications for their entire life. In the case of some diseases, the cost of the medications might be more than the child or their parents can ever hope to earn in their lifetimes, much less spend on a life-saving drug. This is one of the many reasons why people have placed such great hopes in gene therapy. If a disease results from a defective gene, and we could replace it or supplement it with a functional gene, perhaps we…
Last week I found a bug in the new NCBI BLAST interface. Of course, I reported it to the NCBI help desk so it will probably get fixed sometime soon. But it occurred to me, especially after seeing people joke about whether computer science is really a science or not, that it might surprise people to learn how much of the scientific method goes into testing software and doing digital biology. tags: blast, software testing, scientific method, science education What happens when the scientific method isn't used? I wrote earlier in January about applying scientific principles from the wet…
BLAST is a collection of programs that are used to compare sequences (DNA, RNA, or protein) to larger collections of sequences that are stored in databases. I've used BLAST as a teaching tool for many years, partly because it's become a standard tool for biological work and partly because it's very good at illustrating evolutionary relationships on a molecular level. A few months ago, the NCBI changed the web interface for doing BLAST searches at their site. I wrote earlier about changes that I made to our animated tutorial in response to the new BLAST. Now, I want to mention some of the…
By now, many of you have probably seen the new BLAST web interface at the NCBI. There are many good things that I can say about it, but there are a few others that caught me by surprise during my last couple of classes. tags: blast, BLAST tutorial, science education Because of these changes, and because I'm giving a workshop for teachers on BLAST at the Fralin Biotechnology Conference in Blacksburg, VA, next week, it seemed like a good time to update our animated BLAST tutorial at Geospiza Education and save myself some trouble. I originally created the BLAST for beginners tutorial to…
It was only a couple of weeks ago but it seems like years. I had spent a month learning how to use most of the features on my shiny new phone and we were in Alaska using Google maps to find our way around Fairbanks. My thumbs were getting sore, but so what? I could give a slide show on my phone, I could read my Gmail messages, and we could find a friend's house in the Google map satellite view and amaze our older relatives with the thrill of technology. I'm not even a materialistic, gadgety sort of person, but I was in love. And now, well, maybe you guessed it. tags: chromatograms,…
How does grass grow in the extremely hot soils of Yellowstone National Park? Could a protein from a virus help plants handle global warming? Okay, that second sentence is wild speculation, but we will try to find the answer to our mystery by aligning our protein sequence to a sequence from a related structure. tags: plants, bioinformatics, sequence analysis, viruses, fungi, global warming. Read part I, part II, part III, part IV, and part V to see how we got here. This week, in our last installment, we will seek the answers in a related structure. Last week, I found that my…
The first research assignment for our Alaska NSF Chautauqua course has been posted. Your task is to find a wound-inducible plant gene, learn something about it, and post a description in the comment section. We've already had one excellent answer, but I know there are at least 54 wound-inducible genes, so I expect to see more. Once we get our genes in order (and possibly before), we'll talk more about designing an experiment for detecting gene expression. In the meantime, I have some pre-course reading assignments to help you prepare. tags: plants, Alaska, NSF Chautauqua courses,…