Metagenomics and the mystery of the dying bees

The next time you bite into a crisp juicy apple and the tart juices spill out around your tongue, remember the honeybee. Our fall harvest depends heavily on honeybees carrying pollen from plant to plant. Luscious fruits and vegetables wouldn't grace our table, were it not for the honeybees and other pollinators.

Lately, though, the buzz about our furry little helpers hasn't been good. Honeybees have been dying, victims of a new disease called "colony collapse disorder," with the US alone losing a large number of hives in recent years.


Researchers have speculated about causes ranging from cell phone towers, genetically modified food, and pesticides to too much travel, since beekeepers cart their hives around the country to catch the flowering plants.

Metagenomics is a term that's used to describe the process of obtaining nucleic acids (RNA and/or DNA), determining the sequence of the nucleotides, and identifying the source of the RNA or DNA by comparing it to known sequences. This technique can be used to answer many different questions and discover lots of new things. We're using it in our bioinformatics class right now and will be using it in the spring as well. (If you want to try it yourself, I'm making the data and some of the tools available; contact me at digitalbio at gmail dot com.)

Yesterday, I listened to a Science/AAAS webinar to learn more about how metagenomics has been used to investigate the mystery of the dying honeybees (if you're interested, the tape and slides are here) and the identification of a possible suspect. The speakers were W. Ian Lipkin, M.D., from Columbia University, New York, NY, and Michael Egholm, PhD, from 454 Life Sciences.

In this project, the researchers gathered samples of bees that had died from colony collapse disorder, isolated DNA and RNA, and sequenced them. They generated hundreds of thousands of short sequences, averaging 250 bases long, according to the webinar.

The informatics part of the project was largely glossed over. In short, they grouped similar sequences together (clustering), assembled sequences into longer sequences (contigs), and used programs like blastx and blastn to identify the sequences by comparing them to a database of sequences. Through this process, they found sequences that showed a strong statistical association with the bees that were killed. These sequences came from the Israeli acute paralysis virus (IAPV).
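Since the webinar didn't go into detail, here is a toy sketch of what the "clustering" step can look like in principle. It is not the researchers' pipeline: the function names (kmers, jaccard, greedy_cluster), the k-mer length, and the similarity threshold are all made up for illustration, and real tools are far more sophisticated. The idea is simply that reads sharing many short subsequences (k-mers) probably came from the same source and can be grouped before assembly and BLAST searches.

```python
def kmers(seq, k=8):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def greedy_cluster(reads, k=8, threshold=0.5):
    """Put each read in the first cluster whose first member it resembles.

    Hypothetical illustration only: similarity is the Jaccard index of
    the reads' k-mer sets, and each cluster is represented by the
    k-mer set of the first read that was placed in it.
    """
    clusters = []  # list of (representative k-mer set, member reads)
    for read in reads:
        profile = kmers(read, k)
        for rep_profile, members in clusters:
            if jaccard(profile, rep_profile) >= threshold:
                members.append(read)
                break
        else:
            # No existing cluster was similar enough, so start a new one.
            clusters.append((profile, [read]))
    return [members for _, members in clusters]


if __name__ == "__main__":
    # Three made-up reads: the first two differ only in the last base.
    reads = ["ACGTACGTTGCAAGCT", "ACGTACGTTGCAAGCA", "TTTTGGGGCCCCAAAA"]
    for group in greedy_cluster(reads, k=4):
        print(group)
```

With real data, each cluster would then be assembled into a contig and the contigs compared against a sequence database.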

I was curious to know how they managed to store and work with all those data files - I think they said they have over 400,000 flowgrams. Where did they put them all? How did they evaluate the quality of the flowgrams?

No one mentioned that.

Dr. Egholm said that Sanger dideoxy sequencing remains the gold standard, in terms of quality and read length, but that 454 will replace it because you can gather so much more data. In fact, it seemed that in order to identify the bee-killing culprit, they did need a very sensitive method. In one slide, only 65 reads out of 97,435 were identified as viral sequences. And, of those sequences, I think they said that 14 were from IAPV.
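To get a feel for how thin that haystack is, here is the arithmetic on the numbers from that slide. The counts are the ones I jotted down during the webinar (the IAPV count in particular is from memory), and the variable names are just mine.

```python
# Read counts as noted from one webinar slide (the IAPV count is from memory).
total_reads = 97_435
viral_reads = 65
iapv_reads = 14

viral_fraction = viral_reads / total_reads
iapv_fraction = iapv_reads / total_reads
iapv_share_of_viral = iapv_reads / viral_reads

print(f"viral reads: {viral_fraction:.4%} of all reads")
print(f"IAPV reads:  {iapv_fraction:.4%} of all reads")
print(f"IAPV share of viral reads: {iapv_share_of_viral:.1%}")
print(f"about 1 read in {round(total_reads / viral_reads):,} was viral")
```

That works out to roughly one viral read in every 1,500, and only about a fifth of those viral reads were IAPV.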

Once they had sorted through the haystack of reads and picked out IAPV, they took pools of bees and tested them directly for the presence of the virus. They found that 83% of the bees that had died from colony collapse disorder were infected with IAPV, while IAPV could only be found in 5% of the bees obtained from healthy colonies.

Koch's postulates, unfortunately, are a bit difficult to satisfy in many cases. Statistical associations between the presence of a pathogen and the existence of a disease are often the best we can do.

At the end, they discussed the potential application of metagenomic analysis to the problem of identifying other diseases with unknown causes. Perhaps the puzzle of diseases like autism and diabetes could be solved, if we could sequence everything that's present and find out if there are pathogens hiding in the plumbing.

What new pathogens will we find when we've sequenced the world? Whatever we find, I hope we can treat them.


"What new pathogens will we find when we've sequenced the world? Whatever we find, I hope we can treat them."

I agree! I saw an excellent show on CCD. Much of the time was spent on the situation in China. They have lost almost all the bees in certain areas. In one area, the crop, I think it was pears, was so important to the economy that they actually pollinated all of them manually with feathers on a stick!
By the end of the show they said it was all still a mystery, with a few inconclusive theories.
They also talked about the cost, in many billions of dollars, if the US can't get this figured out and save the bees.
Dave Briggs :~)

May 20th 2009
I was looking into this, these past three years and the only logical conclusion I could reach was that the earth has tilted a bit throwing the guidance systems of the bees off.
I found this theory bizarre but then read an article of someone far wiser than me stating the same conclusion.

Bee hives get diseases every once in a while but not to the degree that is happening now, around the world.

Therefore, for me, it is a world problem and it has to be something that has affected the bees' internal guidance system.
I am no expert but this makes more sense to me.


I'm very interested in this project.

Could you help me to find a post-doc or a phd to work on?

I pursued a PhD on the fluidity of mobile genes in metagenomics data.

My pleasure and my dream would be the bees bacterial pathogens.