bioinformatics

"Digital biology," as I use the phrase, refers to the idea of using digital information for doing biology. This digital information comes from multiple sources such as DNA sequences, protein sequences, DNA hybridization, molecular structures, analytical chemistry, biomarkers, images, GIS, and more. We obtain this information either from experiments or from a wide variety of databases and we work with this information using several kinds of bioinformatics tools. The reason I'm calling this field "digital biology" and not "bioinformatics" (even though I typically use the terms as…
A friend of mine, Joe Ashley, a serial entrepreneur and former president of Genetic Systems, once told me that starting a business is an unnatural act. Now that I've done it, I agree. Even with my multiple back-up plans, possible grants, and part-time activities, my stomach still hurts and my mind is racing. My new company has "spun out" of another. Spinning out of control until you fall down from exhaustion. It's a great metaphor all right. Sure, there's excitement and adventure. I love my new shiny business cards and my new shiny web site! It's fun to do things that I like and would…
A recent article that examined the relationship between antibiotic use and antibiotic resistance in Finland made me realize one very sad fact: what is easy to do in Finland is nearly impossible in the U.S. because we lack a national healthcare system (note: I'm not talking about how healthcare is paid for, which is an argument about reimbursement, but about a uniform system of record keeping and informatics protocols). Consider this from the introduction (italics mine; citations removed for clarity): According to current Finnish care recommendations, the first-line antimicrobial agents for the…
Have you ever wondered how to find things in the NCBI databases? Maybe you tried to find something but didn't know how it was spelled. Or maybe you tried to use a common name like "pig" or "deer" to find information in a database, not knowing that all the organism names are in Latin. Or perhaps you're wondering just what kind of information is stored for different kinds of records and whether you could search for this information. A couple of years ago, I wrote a book that covered this topic quite thoroughly for the NCBI structure database. Now, I've decided to make some movies, too. This…
A new paper in Bioinformatics describes an efficient compression algorithm that allows an individual's complete genome sequence to be compressed down to a vanishingly small amount of data - just 4 megabytes (MB). The paper takes a similar approach to the process I described in a post back in June last year (sheesh, if only I'd thought to write that up as a paper instead!). I estimated using that approach that the genome could be shrunk down to just 20 MB - compared to about 1.5 GB if you stored the entire sequence as a flat text file - with even further compression if you took advantage of…
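To put the figures quoted above in perspective, here is a minimal sketch that only does the arithmetic on the three sizes mentioned in the excerpt. It says nothing about how the paper's algorithm works. The constant and function names are hypothetical, and decimal units (1 GB = 1000 MB) are an assumption.

```python
# Sizes quoted in the excerpt, in megabytes.
# Assumption: decimal units, so "about 1.5 GB" is treated as 1500 MB.
FLAT_TEXT_MB = 1500.0        # whole sequence stored as a flat text file
EARLIER_ESTIMATE_MB = 20.0   # the blog author's earlier estimate
PAPER_RESULT_MB = 4.0        # the size reported for the new paper


def compression_ratio(original_mb: float, compressed_mb: float) -> float:
    """Return how many times smaller the compressed form is than the original."""
    if compressed_mb <= 0:
        raise ValueError("compressed size must be positive")
    return original_mb / compressed_mb


if __name__ == "__main__":
    for label, size in [("earlier estimate", EARLIER_ESTIMATE_MB),
                        ("paper result", PAPER_RESULT_MB)]:
        ratio = compression_ratio(FLAT_TEXT_MB, size)
        print(f"{label}: {size:g} MB, roughly {ratio:g}x smaller than flat text")
```

Under those assumptions, the 20 MB estimate works out to about a 75-fold reduction and the 4 MB figure to about a 375-fold reduction relative to the flat text file.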
xkcd has some good advice for high-schoolers: That goes doubly for anyone even vaguely interested in a career in biology, and particularly genetics - right now, even some basic scripting experience will take you further than any amount of pipette-wrangling. I wish I'd known this when I was back in high school...
It's nice to see Carl Bergstrom, the driving force behind the Eigenfactor metric for measuring publication use, get written up in Seed magazine. Sez Bergstrom: ...Bergstrom has bigger plans than merely analyzing the importance of various journals. He wants to see how scientists have made science organize itself and change the world like a cartographer once could: "If we can map science, we can help researchers get their bearings, move efficiently among fields in their interdisciplinary endeavors, and find what they need to be reading." Carl was always smarter than the Mad Biologist.... (but…
The above pie chart shows the relative proportions of described species in various groups of organisms. As we can see, most species are invertebrate animals. Things like snails, flatworms, spiders, sponges, and insects. Now compare that slice of pie to the proportion of GenBank sequences that represent invertebrates: Yes, that thin blue wedge is all we've got. While most mammal species have had at least a gene or two sequenced, the vast majority of non-vertebrate species have yet to meet a pipettor. Entire families of insects haven't received even a cursory genetic study. Of…
Now, I realize with this title, lots of people are thinking that I'm trying to do away with scientific articles. Far from it. But the use of published articles as 'scientific currency' can retard the adoption of new breakthroughs. A recent personal experience is in order. I recently heard an invited speaker give a talk about a new way of handling DNA sequence data*. After the talk, in a private meeting, I asked the speaker if this software was available for implementation, and said speaker looked horrified. "We haven't submitted for publication yet." It turns out that no one will have…
I suppose I should have expected this. I thought it might be fun to see what the databases had to say about turkeys. So, I queried the NCBI databases, found a taxonomy reference, and started clicking related links to see pictures of the different species. Why? Because it would be great to see what the different species of turkeys look like and compare them. Here's a species of wild turkey that I didn't expect to find. Wild turkey, humph!
I just love this title! It's nerdy and cute, all at the same time. I read about this on www.researchblogging.org and had to check out the paper and the blog write-up from The Beagle Project. (BTW: some of you may be interested in knowing that The Beagle Project is not a blog about dogs.) The paper describes a class where students from Marseilles University investigate the function of unidentified genes from a Global Ocean Sampling experiment. All the sequences are obtained from the environmental sequence division at the NCBI. Students follow the procedure outlined below: This is a great…
Ebola virus has impressed me as creepy ever since I read Richard Preston's "The Hot Zone: A Terrifying True Story" some years back. (I guess he has a new book, too, "Panic in Level 4: Cannibals, Killer Viruses, and Other Journeys to the Edge of Science," but I haven't been in an airport for the past couple of weeks, so I haven't read it yet.) Infectious agents that cause diseases with gruesome symptoms really excite those of us with an interest in microbiology. Tara has written about this paper, too, and summarized the details. I…
I'm sure everyone else thinks the big news today is the announcement by the Washington State Health Department requiring hospitals to report MRSA cases to the state. I think the cool news is their on-line database. We'll get to that a bit later. What is MRSA? MRSA stands for methicillin-resistant Staphylococcus aureus. It's a serious pathogen that causes skin infections and greater damage if it enters the body. The Seattle Times report - a quick summary: For the past three days the Seattle Times has been running a series on hospital-acquired cases of MRSA. According to the report, 6…
Want to learn more about Parkinson's disease? See why a single nucleotide mutation messes up the function of a protein? I have a short activity that uses Cn3D (a molecular viewing program from the NCBI) to look at a protein that seems to be involved in a rare form of Parkinson's disease and I could sure use beta testers. If you'd like to do this, I need you to follow the directions below and afterwards, go to a web form and answer a few questions. Don't worry about getting the wrong answers. I won't know who you are, so I won't know if you answered anything wrong. If you have any concerns…
HealthMap is a great site that could be an excellent resource when teaching a biology, microbiology, or health class. Not to mention, I can picture people using it before they travel somewhere or even just for fun. I learned about HealthMap a while ago from Mike the Mad Biologist, but I didn't get time to play with the site until today. Here's an example to see how it works. How do I use HealthMap? I began using HealthMap by changing the number of diseases selected to "none." Then I scrolled through the list until I found something interesting. I chose "Poisoning." The number of…
Let's play anomaly! Most of this week, I've written about the fun time I had playing around with NCBI's Blink database and finding evidence that at least one mosquito, Aedes aegypti, seems to have been infected at some point with a plant paramyxovirus and that the paramyxovirus left one of its genes behind, stuck in the mosquito genome. During this process, I realized that the method I used works with other viruses, too. I tried it with a few random viruses and sure enough, I found some interesting things. You've got a week to give it a try. Let's see what you find! The method is…
Lots of bloggers in the DNA network have been busy these past few days writing about Google's co-founder Sergey Brin, his blog, his wife's company (23andMe), and his mutation in the LRRK2 gene. I was a little surprised to see that while other bloggers (here, here, here, and here) have been arguing about whether or not the mutation really increases the risk to the degree (20-80%) mentioned by Brin, no one has really looked into the structure and biochemistry of the LRRK2 protein to see if there's a biochemical explanation for Parkinson's risk. I guess that task is up to me. Let's begin at…
Do mosquitoes get the mumps? Part V. A general method for finding interesting things in GenBank This is the last in a five-part series on an unexpected discovery of a paramyxovirus in mosquitoes and a general method for finding other interesting things. In this last part, I discuss a general method for finding novel things in GenBank and how this kind of project could make a good discovery- and inquiry-based project for biology, microbiology, or bioinformatics students. I. The back story from the genome record II. What do the mumps proteins do? And how do we find out? III.…
Part IV. Assembling the details and making the case for a novel paramyxovirus This is the fourth in a five-part series on an unexpected discovery of a paramyxovirus in a mosquito. In this part, we take a look at all the evidence we can find and try to figure out how a gene from a virus came to be part of the Aedes aegypti genome. I. The back story from the genome record II. What do the mumps proteins do? And how do we find out? III. Serendipity strikes when we Blink. IV. Assembling the details of the case for a novel mosquito paramyxovirus V. A…
Part III. Serendipity strikes when we Blink In which we find an unexpected result when we Blink while looking at the mumps polymerase. This is the third in a five-part series on an unexpected discovery of a paramyxovirus in mosquitoes. And yes, this is where the discovery happens. I. The back story from the genome record II. What do the mumps proteins do? And how do we find out? III. Serendipity strikes when we Blink. IV. Assembling the details of the case for a mosquito paramyxovirus V. A general method for finding interesting things in GenBank To paraphrase Louis Pasteur,…