New tool anyone can use to track disease outbreaks

While CDC and FDA struggle to figure out where the Salmonella saintpaul in a large multistate outbreak is coming from, they are not being forthcoming about where it has gone. We know the case total but not much about who is getting sick, where, and when. There is no good scientific or privacy reason not to release more information. It's just the usual tendency to keep control. But some of the information is "out there" anyway, in news reports and other sources. People interested in disease outbreaks discovered years ago that this information could be harvested and disseminated to the public health community, and in 2003 this informal system provided the first evidence of the SARS outbreak in Guangdong, China, weeks before there was any official confirmation.

Many of us subscribe to and use the no-cost ProMed Mail service, a pioneering effort by volunteer experts to collect information on disease outbreaks in people and animals worldwide, using official and unofficial sources. The ProMed concept has now been taken several steps further by a team of disease surveillance experts at Boston's Children's Hospital. They use automated internet data-mining, with some additional curating by human experts, to provide a web-based breaking-news disease reporting system organized by disease agent, time, and geographic location, all displayed on a map of the world.

The system is free and has no registration or subscription barriers. It started in prototype in 2006 and now gets about 20,000 unique visits a month, mainly from the public health community (for comparison, this blog gets 30,000 to 40,000 visits a month). The system, called HealthMap, is pretty impressive. It was just highlighted on the Wired Blog, so its traffic is going to increase.

Here's what the HealthMap Salmonella map looks like as of 3:45 pm July 8:


By clicking on one of the outbreak icons in Kansas, a balloon came up with links to news media reports. The one at the top, from Google News, is from the website of a TV station. Clicking on the link produces this report:

The Harvey County Health Department and the Kansas Department of Health and Environment are continuing to investigate the Salmonella outbreak in south-central Kansas. As of Monday, there have been 19 cases.

During the course of the investigation, the Acapulco Restaurant in Newton was identified as the probable source of the illness. The exact cause of the outbreak has not been determined at this time.

"The management of the restaurant has been extremely helpful in providing information throughout the investigation," said Charlie Hunt, Deputy State Epidemiologist with KDHE. "Based on the information we have at this time, we do not feel that the restaurant poses any immediate health risks to the community."

Restaurant management provided customer information for those who might have been exposed to determine if they have experienced any symptoms of the illnesses. Interviews with the customers will be used to further pinpoint the source of the illness.

There have not been any onsets of new cases that have eaten at the restaurant since June 10, 2008. KDHE conducted a food service inspection on June 17. Although a few violations were found as part of that inspection, they were corrected immediately. (KDKE website, Kansas)

You can't get this information from CDC or FDA. Both agencies have been singularly unhelpful and unresponsive to requests for information.

As noted, HealthMap uses data-mining techniques:

HealthMap relies on a variety of electronic media sources, including online news sources through aggregators such as Google News, expert-curated discussion such as ProMED-mail, and validated official reports from organizations such as the WHO. Currently, the system collects reports from 14 sources, which in turn represent information from over 20,000 Web sites, every hour, 24 hours a day. Internet search criteria include disease names (scientific and common), symptoms, keywords, and phrases. The system collects an average of 300 reports per day, with the majority acquired from news media sources (85.1%). Although most of the reports collected to date have been in English, HealthMap also monitors information sources in Chinese, Spanish, Russian, and French, with additional languages such as Hindi, Portuguese, and Arabic under development. As HealthMap reports are acquired solely from free news sources, operational costs are minimal. The Web site is freely accessible on the Internet without subscription fees. (Surveillance Sans Frontières: Internet-Based Emerging Infectious Disease Intelligence and the HealthMap Project
John S. Brownstein, Clark C. Freifeld, Ben Y. Reis, Kenneth D. Mandl, PLoS Medicine [cites omitted])
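
To make the idea of searching reports by disease names and keywords concrete, here is a minimal sketch in Python. It is not HealthMap's code: the keyword tables, the function names, and the sample headlines are all made up for illustration, and plain substring matching stands in for whatever text processing the real system uses.

```python
from collections import defaultdict

# Hypothetical keyword tables; a real system would need far larger dictionaries.
DISEASE_TERMS = {
    "salmonellosis": ["salmonella", "salmonellosis"],
    "avian influenza": ["avian influenza", "bird flu", "h5n1"],
}
PLACE_TERMS = {
    "Kansas, US": ["kansas", "harvey county"],
    "Texas, US": ["texas"],
}


def match_labels(text, term_table):
    """Return every label whose terms appear in the text (case-insensitive)."""
    lowered = text.lower()
    # Naive substring test: "kansas" would also match "Arkansas", one reason
    # a real system needs smarter matching plus human curation.
    return [label for label, terms in term_table.items()
            if any(term in lowered for term in terms)]


def index_reports(reports):
    """Group report headlines by each (disease, place) pair found in them."""
    index = defaultdict(list)
    for headline in reports:
        for disease in match_labels(headline, DISEASE_TERMS):
            for place in match_labels(headline, PLACE_TERMS):
                index[(disease, place)].append(headline)
    return dict(index)


# Made-up headlines for illustration only.
SAMPLE_REPORTS = [
    "Salmonella outbreak in Harvey County under investigation",
    "Texas confirms more salmonella cases",
    "City council approves new budget",
]

if __name__ == "__main__":
    for key, headlines in index_reports(SAMPLE_REPORTS).items():
        print(key, len(headlines))
```

Each (disease, place) key in the result corresponds roughly to one icon on a map; headlines that mention no known disease or place simply drop out.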

As the research team observes, these sources of data are subject to certain kinds of inaccuracies:

The use of international news media for public health surveillance has a number of potential biases that merit consideration. While local news sources may report on incidents involving a few cases that would not be picked up at the national level, such sources may be less reliable, lacking resources and training, and may report stories without adequate confirmation. Furthermore, other biases may be intentionally introduced for political reasons through disinformation campaigns (false positives) or state censorship of information relating to outbreaks (false negatives). We have attempted to better understand some of these issues through ongoing analysis and evaluation research. We ran a 43-week evaluation of HealthMap data, covering the period of October 1, 2006 through July 18 2007. We found that pathogen diversity was substantial across news sources, with 141 unique infectious disease categories reported through the Google News feed alone (Table 1). We found the frequency of reports about particular pathogens to be related not to their associated morbidity or mortality impact, but rather to the direct or potential economic and social disruption caused by the outbreak.


For instance, we found substantial skew towards reporting on stories about avian influenza and food-borne illnesses. Over the evaluation time period, 174 countries had reports of infectious disease outbreaks, with the greatest reporting from the United States (n = 4351), the United Kingdom (n = 1018), Canada (n = 880), and China (n = 737) (Figure 3A). There was a clear bias towards increased reporting from countries with higher numbers of media outlets, more developed public health resources, and greater availability of electronic communication infrastructure (approximated by number of Internet hosts) (Figure 3B). These trends are highly relevant for users of the system, and thus the individual impact of these factors on surveillance will form the basis of a detailed user guide currently under development.

But as the Salmonella example shows, HealthMap can also provide a lot more information than official sources, even in highly developed countries. This is another example of a highly distributed, collaborative system. The internet excels at this and this is an outstanding example of what can be done.

If you have any interest at all in infectious disease outbreak surveillance you'll find the HealthMap site extremely easy to use. These guys have done a really terrific job, financed in part by a grant from Google. Kudos all the way around.


I agree, a cool tool.

I might note that, while ProMED is free to the user, it does cost money to maintain, and they have periodic fund drives. One just concluded, but I'm sure they would still take a contribution from anyone (and everyone) who finds their service valuable.

Likewise, there might be a way to help support HealthMap as well.

Having done outbreak investigations in the past, I can't say that official sources would be any better. State and local health departments and non-public health authorities aren't exactly stellar.

I use this tool, and it really is the greatest professional toy I have played with. I love to see really exotic diseases in really non-exotic places -- it makes the disease that much more exotic. Filariasis in Rome? Anthrax in Georgia? Reminds me, I need to comb through my ProMed folder and clean it.

By Rogue Epidemiologist (not verified) on 10 Jul 2008 #permalink

Dear Revere,
The "Wall Street Journal" reported in the Saturday/Sunday, July 5-6 edition that jalapeno peppers are the new suspect in the salmonella outbreak. I was particularly interested in these paragraphs:
"Earlier laboratory research has shown that salmonella grows rapidly when inoculated onto jalapeno peppers. According to a 2003 paper by a team of Texas A&M University researchers, salmonella grew faster on extracts of jalapeno peppers than those of tomatoes, lettuce, broccoli and bell peppers. Their research was published in the journal 'Bioresource Technology.'" I would like to know which salmonella strain the Texas A&M researchers worked with.


"The state (Texas) has confirmed more than 350 cases, the highest number of any state in the U.S."

So...Texas A&M is involved in salmonella research on jalapeno peppers, and Texas is also the state with the most cases.

Does anyone know where this rare form of salmonella, Saintpaul, is being kept and studied? Surely the CDC started their investigation at the places which own strains of Saintpaul.

Texas was also the source for the monkeypox Gambian rat/prairie dog outbreak. At the time Texas A&M was also involved in a program of inoculating wildlife against rabies with a vaccinia/rabies recombinant vaccine placed into bait and salted all over Texas land that was habitat to both coyotes (the target) and prairie dogs.

The CDC said the Gambian rats were the culprits but they never identified the importer. No one was ever prosecuted.
There is a permanent ban on prairie dogs sold as pets. Why?
Was this outbreak an unintended consequence of a rabies control program using a recombinant vaccinia? There were no follow-up studies on the non-target animal populations.
The program was also de-funded by the governor that same year.

Perhaps MRK would like to expound on my suspicion that someone is paying really close attention to what the researchers at Texas A&M are doing, or they are just plain sloppy with their germs in Texas.

Not usually so suspicious,
Library Lady

By Library Lady (not verified) on 11 Jul 2008 #permalink