Bird flu: testing, testing . . .

Whenever confirmed human cases of bird flu appear in an area, there usually follows heightened sensitivity to new cases of severe pneumonia. Are they bird flu too? Severe pneumonia is pretty common, so you can't automatically assume that "if it quacks like a duck and walks like a duck it must be a duck." It turns out there are a lot of birds that look like ducks that aren't ducks, at least when it comes to influenza-like illnesses. On the other hand, "testing negative" with PCR, which on its own is a pretty sensitive and specific test, is also not foolproof. "On its own" means under the best conditions. But the conditions in the field aren't usually the best. Specimens can be taken too late, taken improperly, masked by antiviral treatment, stored improperly, analyzed improperly, etc. So false negative tests do happen. Let's pause to think about testing more generally.

There are two measures of the accuracy of a test: its sensitivity and its specificity. Sensitivity is the probability that the test will actually pick up that a person is infected (how "sensitive" it is to detecting a signal that's really there). Specificity is the same thing for the absence of infection: the probability that an uninfected person will be correctly classified as uninfected. A false positive results from a test that is not 100% specific; a false negative from one that is not 100% sensitive. For any test it is always possible to make it either 100% specific or 100% sensitive. In fact it's trivial: call all tests negative or all tests positive, respectively. Yes, in the first case you get no false positives and in the second no false negatives, but this strategy usually isn't very helpful. Of course if the test is perfect (100% sensitivity and at the same time 100% specificity) then you are home free. Usually tests aren't that good, either on their own or because of ancillary conditions (like imperfect specimens or improper storage or whatever). Then you adjust your test conditions so that you get the best result you can in terms of the mix of false negatives and false positives. In doing this you may take into account the relative "costs" of the two kinds of mistakes, false positives and false negatives.
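To make the two measures concrete, here is a toy confusion-matrix calculation; the counts are invented for illustration, not real flu-test data:

```python
# Toy confusion matrix: all counts invented for illustration only.
tp, fn = 90, 10     # infected people: 90 correctly positive, 10 falsely negative
tn, fp = 950, 50    # uninfected people: 950 correctly negative, 50 falsely positive

sensitivity = tp / (tp + fn)   # P(test positive | infected)
specificity = tn / (tn + fp)   # P(test negative | uninfected)

print(sensitivity)   # 0.9
print(specificity)   # 0.95
```

Calling every specimen positive would drive sensitivity to 1.0 (no false negatives) while specificity collapsed, and calling every specimen negative would do the reverse, which is why the two must be traded off.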

For any particular test and its conditions of application you will almost certainly have less than 100% accuracy (on both measures). So there is a cost. You can either decide to ignore the consequences of your mistakes or decide that the cost of a false negative, for example, is so severe in terms of the risk of missing an infected person that you want to invest more effort into confirming it wasn't a mistake. If the cost is high in terms of money or use of resources you won't confirm every negative test, especially if there are a lot of them, as there are for influenza-like illnesses. But for those where the index of suspicion is high (for example, a symptomatic contact of a confirmed case) you may go ahead and do some follow-up testing. For H5N1 the most common way to do this is to examine paired acute and convalescent sera to see if there is development of antibodies to H5N1, which would indicate recent infection. This requires getting a blood sample at the time of the initial negative test and another one several weeks later.
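The "index of suspicion" reasoning is essentially Bayes' rule: how likely a negative result is to be a false negative depends on the prior probability that the person is infected. A minimal sketch, with made-up test performance figures (not real H5N1 numbers):

```python
# Bayes' rule sketch: P(infected | negative test).
# Sensitivity/specificity values are illustrative, not real H5N1 figures.
def p_infected_given_negative(prior, sensitivity, specificity):
    false_neg = (1 - sensitivity) * prior    # infected but tests negative
    true_neg = specificity * (1 - prior)     # uninfected and tests negative
    return false_neg / (false_neg + true_neg)

# Random ILI patient during flu season (low prior): a negative is reassuring.
print(round(p_infected_given_negative(0.001, 0.9, 0.95), 5))   # 0.00011
# Symptomatic contact of a confirmed case (high prior): worth following up.
print(round(p_infected_given_negative(0.3, 0.9, 0.95), 3))     # 0.043
```

The same negative result means very different things in the two settings, which is why follow-up effort is concentrated on high-suspicion cases.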

Here's an example of the use of paired serology in the recent Pakistani cases:

Blood testing has confirmed that a U.S. resident whose brother was Pakistan's first confirmed case of H5N1 infection never contracted the disease.

The New York State health department revealed that the man's blood showed no antibodies to H5N1, indicating he had not caught the virus while attending his brother's funeral in Pakistan late last year.

"His final test came back. He showed no avian flu and no antibodies to avian flu, which means he never got it," Claudia Hutton, the department's director of public affairs, said in an interview from Albany.

[snip]

The brother from Long Island experienced mild cold-like symptoms after returning from Pakistan. And his young son, who did not make the trip with him, also had a cold; it appeared to get worse after his father's return.

The man went to his doctor, the doctor notified local public health authorities and they in turn alerted the state. The U.S. Centers for Disease Control even sent a plane to New York to collect specimens from the man and his son for testing in the CDC's Atlanta labs.

They were both negative. But, a negative test isn't proof positive there was no infection. A test taken too late in the course of an infection could come back negative.

To close the book on the incident, authorities collected blood samples from the man and the son to look for the antibodies that would be present if they had been infected with the virus. Both the father and the son were negative in antibody testing.(Helen Branswell, Canadian Press)

There is also no guarantee the confirmatory tests are accurate either. What if it is possible to be infected but not show the particular antibody that is used for follow-up testing, or the confirmatory test is itself insensitive or poorly done or . . . ? While there is no guarantee, a missed infection would mean that both tests (the first one and the follow-up) had to be wrong. The probability of this is presumably less than the probability of the first test alone being wrong, so you are still ahead. But it is still a possibility, and it depends on the second test having a reasonable sensitivity.
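If the two tests fail independently (a simplifying assumption; in practice errors can be correlated, e.g. both degraded by the same poor specimen), the chance that both miss a real infection is the product of their individual miss rates. With invented numbers:

```python
# Probability that both tests miss a real infection, assuming independent
# errors. Both miss rates are invented for illustration.
miss_initial = 0.10      # false-negative rate of the initial (e.g. PCR) test
miss_followup = 0.05     # false-negative rate of the antibody follow-up
both_miss = miss_initial * miss_followup
print(round(both_miss, 3))   # 0.005 -- much smaller than either rate alone
```

Correlated failures would push the real number back up toward the worse of the two individual rates.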

The problem of test accuracy is common to many disease outbreak investigations. False negatives and false positives are particular to tests and test settings. As the tests get better and easier the accuracy improves, so we are not stuck with the same level of false positives and negatives forever and all time. Still, we need to interpret test results in the light of what is known about the test's accuracy and that requires information that is not always ready to hand.

Decision-making under conditions of uncertainty is a fact of life. Since this requires judgment and not just the application of some mechanical algorithm, questions of transparency, trust and credibility now enter the mix. Need I say more?


Thank you for your explanation.

So it would be correct to say that testing is not 100% accurate? That there are many
variables? And, you have to use your best judgment about the results? Is that what
you are saying? I just want to be certain that I understand the message here.

Thanks again for your patience in explaining difficult concepts.

What I read from various places where H5N1 has been definitely found in birds is that they are generally treating people with AI-like symptoms as though they had the disease, even though tests come back negative, and even negative several times. As someone else posted elsewhere, treat to the patient's symptoms, and not to the test. AnnieRN

indigo girl: "Tests" in general can be 100% accurate (i.e., 100% sensitivity and 100% specificity), although this is rare. Diagnostic tests for flu are not 100% accurate, i.e., there is a false positive and a false negative rate (it need not be both but usually is). Various factors go into this. One is the inherent properties of the test under the best of conditions. Then there are the real world conditions under which the test is done (well or poorly, appropriately or not, etc.). So evaluating these things is the first place where judgment is needed. Then there is what to do with the outcome. Confirm it or not with another (better) test (you may not do this originally because the confirmatory test takes a lot of time or effort or money). Judgment again. Then you have to evaluate the confirmatory test by the same standards. It helps if you have some kind of "gold standard" but often you don't.

Take the test to see if someone has a fever. We don't have a thermometer in the house so I just feel the forehead (I'm pretty good at this after many decades). It's usually good enough to tell me whether they "have a fever or not," although if it is only a slight fever I might make a mistake (but I usually don't care about slight fevers). If I'm really concerned I'll go out and buy a thermometer, but there are all sorts of thermometers of various accuracies, ease of use and costs. Usually it doesn't matter much, but for some purposes (e.g., research) I want to use a standardized, accurate and reliable (i.e., repeatable) method. But when I do, is that the "real" temperature? I'll only know this if I have some kind of gold standard, like the temperature of the blood circulating in the brain or the heart or however I want to define it.

This simple example shows that "testing" is not always a simple matter. It depends.

Regarding your remark about pneumonia being "common," I strongly disagree, especially for community-acquired pneumonia among working-age people. About 20% of community-acquired pneumonia is Legionnaires' disease.

Nor is hospitalization for a respiratory diagnosis, or a respiratory illness severe enough for the victim to stay home from work, "common." One case has to be somewhere. But two cases among similarly exposed persons in the same week or month are not common, and three would constitute a "sentinel event." A "sentinel event" should trigger an effort to get a real diagnosis, which is not so easy to get.

By Frank Mirer (not verified) on 10 Jan 2008 #permalink

Revere says: This simple example shows that "testing" is not always a simple matter. It depends.

--------------------------------------------------------
I have been associated with a PCR lab for about 10 years, conducting shrimp virus tests (both DNA and RNA viruses). Because ours is commercial research for commercial shrimp production, after 10 years and more than ten thousand cases we finally gave up on PCR tests for shrimp virus screening in production. For instance, the biggest corporation in Thailand shut down its lab in 2006; prior to that, four private labs had all closed. The reason is that a report of negative (we call it non-detectable) at the seed stage still ends in an outbreak of viral infection (WSSV, a DNA virus) within 4 months 30% of the time. This is for white shrimp, which is currently the main cultured species. The previous one, tiger shrimp, had prevailed from 1970-1989 but collapsed after the 90s; its non-detectable seed would end up in WSSV outbreaks 90% of the time.

I present this example not to confuse the situation, though it is very extreme. In our case in the shrimp industry, what we have learned is that a positive (detectable) is easy to make a judgement on. As for how far a confirmed negative can be relied on for commercial production, that is pretty confusing; we have several speculations:

1. Sensitivity of PCR machines. Now that the government has adopted real-time PCR, the results are very similar.
2. Sample size. A few thousand samples from a production population (usually a million) is too small. But over four years with more than 100,000 cases, a 30% discrepancy (outbreak rate) is not acceptable.
3. Whole-body samples and target-organ samples are tested under the same production conditions.
4. Leakage of bio-security: airborne transmission or transmission by bird droppings. It is possible.

You may wonder how the production systems are doing now. One approach is to keep upgrading the bio-security system and screening brood-stocks to be free from contamination by specific viruses. The second is selective breeding directed at virus resistance. The second seems more feasible, because the outbreak rate can come down from 30% to less than 5%.

I am not sure poultry production can do what shrimp production has done; nevertheless, the concept and technology have been developed. The one consideration is that invertebrates usually have plentiful gametes; a shrimp produces 150,000 eggs each spawning. But that is a different topic.

Anyway, for us the judgement of a positive (detectable) is much easier than of a non-detectable.

I keep thinking of the H5N1 outbreak in Turkey a couple years ago, where the tests kept coming back negative, over and over, but the doctors kept running the test because the duck looked so darn duck-ish that they were sure that was what it was.

This has led me to believe that there are more false negatives than false positives with H5N1; that the tests have a far higher specificity than sensitivity, and that the sensitivity is potentially below 50%, at least when real world conditions are in play. Would that be your sense as well? In a case with a high index of suspicion, how many tests would you consider a responsible number to run before trusting the negative? Would that change if a healthy young person died of their ILI?

(I'd ask if you could take a rough stab at the percentages for both specificity and sensitivity, but I suspect it's one of those things you'd say is unknown and perhaps unknowable.)
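One crude way to frame the "how many repeats" question above: if each test run independently misses a true infection with probability (1 − sensitivity), the chance of k straight false negatives shrinks geometrically. In reality repeated tests on the same patient are not independent (the same specimen-handling and timing problems recur), so these numbers are optimistic, and the 50% sensitivity below is just the commenter's hypothetical:

```python
# Chance that k consecutive tests all miss a true infection, assuming
# (unrealistically) independent errors. Sensitivity value is illustrative.
def p_k_false_negatives(sensitivity, k):
    return (1 - sensitivity) ** k

for k in (1, 2, 3):
    print(k, p_k_false_negatives(0.5, k))   # 0.5, 0.25, 0.125
```

Correlated errors mean the true curve flattens out well above these values, which is why a run of negatives on a very duck-ish case never fully closes the question.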

I think flubbies are too hung up on how many cases are true negatives: tunnel vision, failure to see the big picture. Assuming the tests are not dependable, then as long as the error rates of the tests are consistent, whether we have x cases or 2x cases out of 100 is not critically important information for use as an action trigger. The relevant information is the rate of change of the infection rate from whatever baseline.

In general, I think we should take the "use what we have" stance i.e. use the most of what we have instead of worry about what we are missing. After all, one of the main reasons for tracking is to find early indicators of a pandemic strain to trigger action. If we only use the official numbers and nothing else, we will still be able to recognize a change in pattern, based on the assumption that all parties conduct tests and report results in good faith. However, if we start adjusting the numbers based on the personal judgment of 10 different news reporters, then we are going to get a very fuzzy picture if a new pattern is evolving.

The science of the dependability of these tests is important, but it will take forever to argue over, which means it will introduce more fuzziness, not more clarity, into any attempt to identify trends.

anon.yyz:

You make an interesting point. I see two possible issues with it, though. One is that it's not clear that samples are taken and stored in equivalent ways in different countries, or even different hospitals. Maybe certain hospitals in Indonesia are better at collecting samples by now than others; maybe they're better or worse at it than those in Egypt. So I'm not sure it's safe to assume a consistent rate of false negative.

The other problem I see is that false negatives could mask other cases. If they miss the first link in the chain when one case tests negative, then the index of suspicion falls for all their contacts, neighbors, relatives, co-workers. If those people aren't tested, and then in turn their contacts have a low index of suspicion, then the potential exists for larger clusters to form without anybody noticing... particularly if H5N1 happened to become less deadly as it became more h2h efficient (which I know is not certain, but is possible).

For that reason, and because negative tests have been used for spin, I think flubies are rightfully concerned about them.

anon yyz, caia: Unfortunately I have to agree with both of you (even though you disagree with each other to some extent). There is no answer to this question. Remember, though, that during flu season most suspect cases will be negative because they will be in a sea of non-H5 cases. I did some rough calculations of that a few posts ago. So it is, as always, a balancing act. You can't do confirmatory tests on thousands of ILIs, so you use various ancillary pieces of info to decide which ones you will go after more thoroughly. That's why they do contact tracing and seroprevalence surveys when confirmed cases are found. The existence of sick birds is often a reason to look more closely, although a lot of cases have no evidence of sick poultry (especially true in China) while the use of poultry vaccination might mask the appearance of bird disease even though there is infection.
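The "sea of non-H5 cases" point can be roughed out numerically. Suppose (all figures invented for illustration) 10,000 ILI patients in flu season include only 5 true H5N1 infections, and the test is 90% sensitive and 99% specific:

```python
# Rough arithmetic for the "sea of non-H5 cases" point; all figures invented.
n_ili, n_h5 = 10_000, 5
sens, spec = 0.90, 0.99

expected_false_negatives = n_h5 * (1 - sens)            # true cases missed
expected_false_positives = (n_ili - n_h5) * (1 - spec)  # false alarms raised

print(round(expected_false_negatives, 2))   # 0.5
print(round(expected_false_positives, 2))   # 99.95
```

Even with a quite accurate test, false alarms vastly outnumber missed true cases at low prevalence, which is why confirming every negative is infeasible while targeted follow-up of high-suspicion negatives still makes sense.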

In the last analysis it is a rather difficult task, and it is one reason I am not as hard on the local and national authorities as many people on the flu boards and comment threads. The diagnostic tools have limited accuracy, but it is neither feasible nor reasonable to do repeated confirmatory tests on all the negatives when you know that most of them are probably true negatives (not because false negatives don't exist but because they are lost in the noise of seasonal influenza). You are working with very limited resources in terms of money and manpower and you have many other pressing public health concerns: dengue, malaria, maternal and child health, infant diarrhea, etc., etc. Those things are killing people on a minute by minute basis and you can't drop everything to track down false negatives.

Unfortunately many local and national authorities also are guilty of bad decisions, are influenced by political considerations and all the other things they are rightly accused of. But even if they weren't, there would still be a problem.

caia,

There is going to be some variability from one country to another. The solution is to add a credibility index per country that perhaps flublogia as a whole can more or less agree on. It doesn't have to be exact; a reasonable approximation will do. This index can be adjusted over time, again based on flublogia debating and reaching rough consensus, and if applied to cumulative cases it should be a usable metric. Even if there is no consensus, a spreadsheet could be built for individuals to obtain their own assessments of anomaly and therefore triggers of pattern change. This is just a way to remove momentary emotion from reading the news and hopefully arrive at something more actionable. What I just posted was a thought process more than an end solution, but I think it beats staring at each case.

On the question of masking other cases: if the locals cannot figure it out, again assuming good faith (adjusted by a credibility index), what chance do we have, sitting in front of a computer thousands of miles away, of figuring it out? I have said many times in my posts that the meaning of a piece of information is tied to the context of the event. That is why real epidemiologists from the WHO have to travel to the host country to investigate. We may be hungry for a black-and-white or numerical answer, but we don't have a good chance of getting one. As humans, we are more than likely to come up with something with an error margin too high to be useful. Worse still is if we rely on the first piece of suspect guesstimation as the basis for evaluating the second, and then the third, and before you know it the errors are amplified beyond recognition.

I wrote something on the FW about Metric Based Public Health Risk Communication, or why Fear, Uncertainty and Doubt are no longer enough. There are some concepts there that will explain my views of how to assess one's exposure. There are micro details that we don't need to know, and yet we are no less safe as preppers than if we do try to micro-guesstimate.

http://newfluwiki2.com/showDiary.do?diaryId=1518

caia,

As a supplement to what I just posted, we can give each country a High and a Low credibility index, then we have two aggregates for each country. If an anomaly becomes visible within the bracket of the two aggregates, then we can decide for our individual comfort zones what to do e.g. whether to start SIP.

Revere-A kind of on subject question. Thompson Financial is now reporting a girl who has been hospitalized since the 4th is going critical in the hospital near Jakarta.

She became ill in her home when several birds died. Now here is the question. If BF can breach the placenta, would it be reasonable to assume it could infect the unfertilized embryo of an egg if a bird laid it while infected? They haven't confirmed BF in the birds that died, but they were in the house. The suggestion so far is that it was on the outside of the eggshell from fecal matter. It was also a half-cooked egg (soft boiled?), so it's obviously possible, but I don't know the mechanism for how an egg forms into a large hard-shelled one. How long does an egg take to form? Lots of material in there. Cross infection?

By M. Randolph Kruger (not verified) on 10 Jan 2008 #permalink

Revere: If the beginning of an H5N1 pandemic coincides with the annual seasonal influenza, is an H5N1 test with a high false-negative rate or one with a high false-positive rate preferred/helpful for the treatment decisions of health officials determining which patients should get the limited supplies of anti-viral drugs?

By flulearner (not verified) on 10 Jan 2008 #permalink

caia says: I keep thinking of the H5N1 outbreak in Turkey a couple years ago, where the tests kept coming back negative, over and over, but the doctors kept running the test because the duck looked so darn duck-ish that they were sure that was what it was.

This has led me to believe that there are more false negatives than false positives with H5N1; that the tests have a far higher specificity than sensitivity, and that the sensitivity is potentially below 50%, at least when real world conditions are in play. Would that be your sense as well?
-----------------------------------------------------------

I tend to believe that if an area has a record of BF outbreaks, the false-negative rate will be much higher than in non-outbreak areas.

I am not sure if there is statistical software to assess infection status as backup surveillance to lab tests. I believe the relevant interpretation eventually has to take a scientific outlook, in terms of quantitative and qualitative data based on statistical methodology.

Otherwise the severe response in surveillance areas, like Korea's culling of all possible vectors except humans (including pigs, dogs, etc.), is not considered unreasonable.

Out of 18,000 samples taken from healthy poultry in Vietnam, 5% were H5N1 carriers, via Crof's H5N1: http://news.xinhuanet.com/english/2008-01/09/content_7391149.htm

The interpretation of test results seems not an easy job?

MRK, from germ cell to laying is approximately 23 hours. The shell is added on last. While one egg is being finished with the shell, there are several behind it, in the various previous stages of development. Occasionally, an egg can be laid with no shell. AnnieRN

MRK, Revere: I think this is the same girl mentioned by MRK. A report said the girl fell ill two weeks after she ate three half-cooked eggs. If this is true, it would be a sharp contrast from the recent family cluster in China where reports said the son fell ill a day after he and his family ate a cooked chicken at a restaurant.

By flulearner (not verified) on 11 Jan 2008 #permalink