Continuing our discussion of causation and what it might mean (this is still a controverted question in philosophy and should be in science), let me address an issue brought up by David Rind in his discussion of our challenge. He discussed three cases where a rational person wouldn’t wait for an RCT before taking action, even though there remained uncertainty. The first was a single case report of rabies survival after applying an ad hoc protocol. The next was use of parachutes while sky diving. The third was the first reports of antiretroviral therapy for AIDS. Here’s what Rind said in his post:
For instance, rabies is historically a 100% fatal illness once clinical symptoms appear in someone who has never been vaccinated. In 2005, a 15-year-old girl survived rabies after treatment with a novel experimental regimen. You could imagine that anything can happen once, and it may have been coincidence that this girl survived and received this novel regimen. However, if one more person with rabies were to receive the regimen and survive, it would seem spectacularly unlikely that the explanation would be anything but that the regimen works. A rational clinician would treat any new patient with rabies with that regimen from that moment until a superior regimen was found. (As far as I know, no other patient has survived rabies on this regimen.)
Similarly, in response to the humorous attack on EBM, asking what the RCT data are supporting parachutes when skyjumping, the GRADE group would respond that the magnitude of effect of parachutes is sufficient to constitute high quality evidence for their use (okay, clinical epidemiologists can be somewhat humor-challenged). That is, we have all sorts of historical evidence about what happens when people fall from great heights (and that we might consider only slightly indirect to falling from an airplane), as well as lots of observational data about what happens when people fall from airplanes wearing parachutes. Not everyone who falls from a great height without a parachute dies, and not everyone wearing a parachute lives, but the effect size is so large that we have high quality evidence for parachutes in the absence of a clinical trial.
To give one example that might feel more real, I was doing a lot of AIDS care in the 1990s. In 1995 an abstract was published from one of the ID meetings about the effects of ritonavir in about 50 people with late AIDS. (I’ve tried in vain in the recent past to find this abstract — if anyone can point me to it, I would be grateful.) The results were like nothing we had seen before — patients’ CD4 counts rose dramatically, opportunistic infections improved, and many patients who would have been expected to die improved instead. We did not know what would happen long-term, but it was obvious, without any RCT, that ritonavir was effective therapy for AIDS, at least in the short term. By 1996, we were treating people with triple therapy “cocktails” for HIV, again without any RCTs with clinical endpoints, and watching people who had been dying walk out of hospitals and hospice care as their OIs resolved. The magnitude of effect was such that we had high quality evidence for these cocktails based on observational data alone. (Not that this actually prevented researchers from proceeding with an RCT of triple therapy, but that’s a post for another day.) (David Rind, Evidence in Medicine)
In all three cases Rind identifies the salient feature as the size and clinical importance of the effect. That prompts me to bring up the Hill viewpoints (dearly beloved by epidemiologists although they rarely use most of them in practice). The eponymous Hill here is A. Bradford Hill, an influential biostatistician of the mid-twentieth century who has a double connection to our subject. One is through his nine viewpoints that an epidemiologist should consider when trying to judge if an association has characteristics of one that is causal (or, as he put it, after considering these factors, “is there any other way of explaining the set of facts before us, is there any other answer equally, or more, likely than cause and effect?” [emphasis in original]). The other is that Hill is credited with successfully promoting the introduction of randomized trials into medicine. (That was in 1948, so it is surprisingly recent. Note that R.A. Fisher developed the theory of randomization in statistics but scholars have shown he had little influence in medicine.)
We intend to discuss randomization later and won’t discuss all of Hill’s nine viewpoints (often mistakenly referred to as criteria and even more mistakenly used in checklist form). In fact I’m only going to discuss one of them. If you want a probing and in-depth discussion of the Hill viewpoints you can’t do better than epidemiologist Kenneth Rothman’s text (with Greenland and Lash) Modern Epidemiology (now in its third edition, but all three have essentially the same version of the Hill viewpoints).
The one aspect of Hill’s list of nine I want to discuss (often shortened to a sublist of five) is biological plausibility. Interestingly Hill gave this in a highly qualified version:
(6) Plausibility: It will be helpful if the causation we suspect is biologically plausible. But this is a feature I am convinced we cannot demand. What is biologically plausible depends upon the biological knowledge of the day.
To quote again from my Alfred Watson Memorial Lecture (Hill 1962),
there was “… no biological knowledge to support (or to refute) Pott’s observation in the 18th century of the excess of cancer in chimney sweeps. It was lack of biological knowledge in the 19th that led a prize essayist writing on the value and the fallacy of statistics to conclude, amongst other ‘absurd’ associations, that ‘it could be no more ridiculous for the stranger who passed the night in the steerage of an emigrant ship to ascribe the typhus, which he there contracted, to the vermin with which bodies of the sick might be infected.’ And coming to nearer times, in the 20th century there was no biological knowledge to support the evidence against rubella.”
In short, the association we observe may be one new to science or medicine and we must not dismiss it too light-heartedly as just too odd. As Sherlock Holmes advised Dr. Watson, “when you have eliminated the impossible, whatever remains, however improbable, must be the truth.” (Austin Bradford Hill, “The Environment and Disease: Association or Causation?,” Proceedings of the Royal Society of Medicine, 58 (1965), 295-300 [available here])
Despite Hill’s strong caveats, biological plausibility is an extremely powerful factor. Rind’s parachute example is probably more related to biological plausibility than to the size of the effect. Here’s another example. For many years it was my pleasure(?) to teach epidemiology to graduate students. When we came to the subject of clinical trials I used to use this example, which I think I got from a PBS Nova show on plants, called “The Green Machine” (truthfully, I’m not sure I got this from the show, but it doesn’t matter for purposes of illustration here). Anyway, a part of the show was an experiment involving a man (I seem to remember he was a clergyman) who said that if he prayed over plants they would grow taller. So they set up two hoods, one with plants that were prayed over by this gentleman and one with plants that were just given the usual care. After 3 months it was found that indeed the objects of prayer were taller. I asked my students to explain this.
There was always a lively discussion and year after year the same explanations were advanced: the carbon dioxide exhalations involved in praying made a difference; more care was taken over these plants than the control plants; it was a chance event — some plants will grow taller for random reasons; etc. The one explanation that year upon year was almost never advanced was the one the experiment set out to test, i.e., that prayer makes plants grow taller.
There could be several reasons for this (including belief on the part of a student that his professor didn’t believe it), but it was clear that one very strong reason was that graduate students at a health sciences school didn’t find this a biologically plausible explanation. There is more to this than the prejudice of scientists. If you were to accept a biologically implausible explanation (think homeopathy, where the number of molecules left after 25 dilutions is essentially zero), then you would have to make quite a lot of adjustments to other scientific facts, perhaps whole disciplines, to accommodate it. Biological plausibility exists within a very complex web of evidence, theory and experience, all of which might well need to be rethought on the basis of a single, potentially flawed experiment or observation. Given that, it’s no wonder it is a powerful factor in causation judgments.
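For readers who want to check the homeopathy arithmetic, here is a rough back-of-the-envelope sketch. The assumptions are mine, not part of the argument above: the preparation starts with one mole of active substance, and each dilution step is either 1:10 or 1:100 (the two conventions usually quoted). The function names are made up for illustration.

```python
# Expected number of original molecules left after n serial dilutions.
# ASSUMPTIONS (illustrative only): we start with one mole of active
# substance, and every step dilutes by the same fixed factor.
AVOGADRO = 6.02214076e23  # molecules per mole


def expected_molecules(n_dilutions: int, factor: float = 100.0,
                       start_moles: float = 1.0) -> float:
    """Expected count of original molecules after n serial dilutions."""
    return start_moles * AVOGADRO / factor ** n_dilutions


def dilutions_until_below_one(factor: float = 100.0,
                              start_moles: float = 1.0) -> int:
    """Smallest number of dilutions leaving fewer than one expected molecule."""
    n = 0
    while expected_molecules(n, factor, start_moles) >= 1.0:
        n += 1
    return n


if __name__ == "__main__":
    for factor in (10.0, 100.0):
        steps = dilutions_until_below_one(factor)
        left = expected_molecules(25, factor)
        print(f"1:{factor:.0f} steps: below one molecule after {steps} "
              f"dilutions; after 25 dilutions about {left:.2g} remain")
```

Under these assumptions the expected count drops below one molecule after 12 steps at 1:100 and after 24 steps at 1:10, so by 25 dilutions there is, on average, far less than one molecule of the original substance in the vial under either convention.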
There are some implications to this. One is that a seemingly adequate clinical trial that produced biologically (“scientifically”) implausible results wouldn’t be taken as seriously as one that produced plausible results, even though the designs might be identical. The fault isn’t with the judgment but with the notion that a particular design trumps all other evidence. By the same token, as scientific knowledge changes, how evidence is viewed might change. Suppose someone found a reproducible and biologically plausible mechanism for certain kinds of acupuncture anesthesia. That would likely alter the way existing clinical trials of acupuncture are viewed (NB: this is not a defense of acupuncture; it’s an example that is imaginable as a scenario). Another implication is that taking RCTs or any kind of study in isolation, even if adding them together via a meta-analysis, doesn’t tell the whole story, nor should it. There is an interlocking body of evidence that exists as context. It might be just as reasonable to say that a clinical trial which showed no effect of a vaccine that clearly raised neutralizing antibodies was not biologically plausible.
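One way to make the point about identical designs concrete is Bayes’ rule in odds form. This framing is my own illustration rather than anything Hill or Rind proposed, and every number in it is an assumption: a trial with 80% power and a 5% false positive rate, a prior of 0.5 for a plausible hypothesis and one in a million for an implausible one.

```python
# Same trial result, different prior plausibility, different conclusion.
# All numbers below are illustrative assumptions, not data from any study.

def likelihood_ratio_for_significant_result(power: float = 0.8,
                                            alpha: float = 0.05) -> float:
    """How much more likely a 'significant' result is if the effect is real."""
    return power / alpha


def posterior_probability(prior: float, likelihood_ratio: float) -> float:
    """Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio."""
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


if __name__ == "__main__":
    lr = likelihood_ratio_for_significant_result()
    for label, prior in (("plausible", 0.5), ("implausible", 1e-6)):
        post = posterior_probability(prior, lr)
        print(f"{label:>11}: prior {prior:g} -> posterior {post:.3g}")
```

With these made-up inputs the same positive trial leaves the plausible hypothesis at roughly 94% and the implausible one still well under one in ten thousand. The design and the p-value are identical; what differs is the web of prior evidence each result has to fit into.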
The reason for raising this question in the context of the challenge was to indicate that when we interpret studies, of any design, there is (or there should be) a lot going on besides looking at p-values and what label the study has. Most of you know this, but it is hard not to be seduced by the latest headlines that say, “Drug X shown not to work for disease Y” or “Clinical trial shows efficacy of latest anticancer drug,” or to dismiss out of hand the studies that don’t meet some set of standards.
I’ll move on to other things in the next post, so now is your chance to weigh in on this subject.