Orac sent me a link to some more HIV denialist material, presumably on the theory that since I'm already being peppered with insults by the denialist crowd, I might as well cover this now.
What I'm looking at today is a [paper by Mark Craddock called "HIV: Science by press conference".][craddoc] The paper is purportedly about how the AIDS research community, in cahoots with the media, is deceiving the public about the nature of the results of AIDS research. In his words: "One of the most disturbing aspects of what passes for AIDS research these days, is the separation between what researchers actually find, what they tell the press conference and what the media tells the public."
As an example of this, he discusses work by George Shaw, Xiping Wei, et al., which is one of the first papers to show viral dynamics of HIV in human lymph tissue. Craddock's criticism is allegedly based on the bad mathematics of Shaw & Wei. Below, I reproduce the section of Craddock's paper where he explains the basis of his criticism. Since the paper is only online in bitmapped PDF form, I've transcribed it by hand; any transcription errors are entirely my fault, and I will correct them immediately if any are discovered. This comes from page 2 of Craddock's paper; that's page 13 of the PDF. I'll interject a few brief comments that belong in-line, and hold the detailed discussion for after the quotation.
>To begin with, what were these people trying to do? They wanted to measure the
>rate at which both HIV and T cells are produced in infected people. The idea is
>deceptively simple. You measure the viral load in a patient at a given time,
>and then you pump them full of 'antiviral' drugs. The drugs reduce the amount
>of virus present in the blood by some factor. (Claimed to be an implausible
>98.5% in these papers. Implausible because there is no possible way that viral
>load can be measured as accurately as the figure of 98.5% suggests.)
*(This is an interesting claim: that one cannot measure viral load to 2 significant figures. Because that's all that's mentioned here. The viral load after is 1.5% of the viral load before. One can determine a figure of 1.5% if load before and after are accurate to 2 significant digits. I find it astounding that Craddock is actually making such a ridiculous claim.)*
>wait till HIV magically mutates into 'drug resistant' strains, and wait till
>the viral load returns to pre treatment levels.
*(Another interesting tidbit; apparently, Professor Craddock doesn't believe in mutation and evolution, at least where HIV is concerned.)*
> This gives an estimate, through
>some relatively simple mathematics, for the rate at which the virus replicates.
>Both Ho and Shaw's groups found that in the absence of viral clearance, the
>total amount of virus in the body should double every two days. So suddenly,
>the low levels of viral replication found over the past decade are thrown out
>the window, and HIV is now the cause of a relentless battle, a battle that
>takes place over a decade or more. The measurement of T cell production can be
>made in much the same way.
*(This is a rather obnoxious misrepresentation. The fundamental new discovery of the work being discussed is that it describes how HIV reproduces; in particular, they discovered that HIV replication occurs in the lymphatic system. Comparing the rates of viral replication found before this work in non-lymphatic tissue and the rates discussed in this work in lymphatic tissue is comparing apples and oranges.)*
>So our question must be whether or not we should believe these results? After
>all at the last big press conference, Ashley Haase's group (Embretson *et al.*,
>Nature, March 25, 1993) found low levels of HIV RNA in the T cells of patients
>studied (4 people, one of whom had no HIV proviral DNA at all) indicating 'low
>levels' of viral replication. So what do we do when one press conference seems
>to contradict the other? Clearly we have to examine both studies carefully.
>As a mathematician, I was intrigued by the claim of John Maddox, editor of
>Nature, that the new results provide a new mathematical understanding of the
>immune system. Unfortunately, my confidence in this claim was badly shaken when
>it turned out that the very first page of the Shaw paper (Wei *et al*., p 117,
>Nature, Jan 12, 1995) they make an appalling mathematical error. And in the
>same paragraph, make an assumption which turns out, by their own admission to
>have no basis in observation, and which they give no justification for.
*(Take a look at that paragraph again. It's going to be important later.)*
>The authors of Wei *et al*. are attempting to give a mathematical formula for the
>amount *v* of free virus at time *t*. They state that the virus is produced by
>virus producing cells *y*, at rate *k*, and decays exponentially at rate *u*.
>These two statements are mutually contradictory but that is not a real problem.
>If they change the word 'decays' to 'is cleared' then all is well.
*(Because using "decays" instead of "clears" when you're describing something that matches what's called a "decay curve" clearly totally invalidates the mathematical statement.)*
> This leads
>to what is known as a differential equation for *v* which may be solved easily.
>(Craddock, letter to Nature unpublished), They state a formula for *v* based on
>their assumption, which unfortunately is completely wrong. Confidence that
>anything good will come out of this paper plummets at this point. Their result
>for *v* is not only wrong, but it does not even look right.
*(Because, you see, the solutions to differential equations are always intuitively obvious, and looking at a solution, one can easily tell if it looks right.)*
>You do not have to
>be a mathematician to realise that if the rate at which *v* is produced depends
>on *ky(t)*, where y(t) is the number of virus producing cells at time *t*, then
>*v* is going to depend upon *ky(0)*, where *y(0)* is the initial size of the
>virus producing cell population. So one wonders how they manage to produce a
>formula for *v* which does not depend on ky(0) at all?
>And they state in the same paragraph that virus producing cells can to 'a good
>approximation' be assumed to decline exponentially. They then state a few lines
>further down that they 'have data only for the decline of free virus, and not
>for virus producing cells'. If they have no data for virus producing cells,
>then how can they possibly know that these cells decline exponentially? They
>might do anything. That is the whole point of not having any data. You do not
>know what is happening.
So, now we've seen the core of his criticism. Basically, he's got two real problems with the math of the paper. First, he questions the validity of the way they fit the data to an exponential decline; and second, he believes that their descriptive equation is invalid. (He has several reasons for claiming that it's invalid, but they're part of the same argument that the way the exponential curve was fitted to get an equation is incorrect.)
Before we look at what's wrong with his criticism, it's only fair to look at the passage from the Shaw paper that he's criticizing. What follows is taken from
the Shaw paper, pages 2/3 of the PDF, pages 117/118 in the journal. Again, it is transcribed by hand, so any errors introduced are entirely my fault.
>The overall kinetics of virus decline during the initial weeks of therapy with
>all three agents corresponded to an exponential decay process (Figs 1 and 2a).
>The antiretroviral agents used in this study, despite their differing
>mechanisms of action, have a similar overall biological effect in that they
>block *de novo* infection of cells. Thus the rate of elimination of plasma
>virus that we measured following the initiation of therapy is actually
>determined by two factors: the clearance rate of plasma virus *per se* and the
>elimination (or suppression) rate of pre-existing, virus-producing cells. To a
>good approximation, we can assume that virus-producing cells decline
>exponentially according to $y(t) = y(0)e^{-at}$, where *y(t)* denotes
>the concentration of virus-producing cells at time *t* after the initiation of
>treatment and *a* is the rate constant for the exponential decline. Similarly,
>we assume that free virus *v(t)* is generated by virus-producing cells at the
>rate *ky(t)* and declines exponentially with rate constant *u*. Thus for the
>overall decline of free virus, we obtain
>$v(t) = v(0)[ue^{-at} - ae^{-ut}]/(u-a)$. The kinetics are largely determined by the slower of
>the two decay processes. As we have data only for the decline of free virus,
>and not for virus producing cells, we cannot determine which of the two decay
>processes is rate-limiting. However, the half-life ($t_{1/2}$) of
>neither process can exceed that of the two combined. With these considerations
>in mind, we estimated the elimination rate of plasma virus and virus-producing
>cells by three different methods: (1) first-order kinetic analysis of that
>segment of the viral elimination curve corresponding to the most rapid decline
>in plasma virus, generally somewhere between days 3 and 14; (2) fitting of a
>simple exponential decay curve to all viral RNA determinations between day 0 and
>the nadir or inflection point (Fig. 1); and (3) fitting of a compound decay
>curve that takes into account the two separation processes of elimination of
>free virus and virus-producing cells, as described. Method (1) gives a
>$t_{1/2}$ of 1.9 +/- 0.9 days; method (2) gives a $t_{1/2}$ of 3.0
>+/- 1.7 days; and method (3) gives a $t_{1/2}$ of 2.0 +/- 0.9 days for
>the slower of the two decay processes and a very similar value, 1.5 +/- 0.5
>days for the faster one. These are averages (+/- 1 s.d.) for all 22 patients.
>Method (3) arguably provides the most complete assessment of the data, whereas
>method (2) provides a simpler interpretation (but slightly slower estimate) for
>virus decline because it fails to distinguish the initial delay in onset of
>antiviral activity due to the drug accumulation phase, and the time required
>for very recently infected cells to initiate virus expression, from the
>subsequent phase of exponential virus decline. There were no significant
>differences in the viral clearance rates in subjects treated with ABT-538,
>L-735,524 or NVP, and there was also no correlation between the rate of virus
>clearance from plasma and either baseline CD4+ lymphocyte count or
>baseline viral RNA level.
So, Craddock's first criticism is of the use of exponential decay. He asserts that they "make an assumption which turns out, by their own admission to
have no basis in observation". How does this stand up?
Very poorly. There are two good reasons why the exponential decay was used.
1. Past observation has shown us that most infectious diseases respond to medications following an exponential decay; so we know that it's a plausible pattern;
2. Observing the data, it fits an exponential decay quite well.
So the data appears to follow an exponential pattern, and we have experience to show us that that is a likely outcome, and that the fit of the data to that kind of curve is not likely to be an artifact. This is what, in science, is known as "developing a hypothesis, and then testing it against the data". Shaw et al. think that an exponential decay is a likely response; they take the data and analyze it, and the math shows a very clean and consistent fit. So we tentatively accept the assumption that the exponential curve is correct, either until more data contradicts it; or more data supports it to the point that we no longer consider it tentative.
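To make "fit an exponential and read off the half-life" concrete, here is a minimal sketch of the simplest version of that procedure: a least-squares line through the logarithm of the viral load. The data are synthetic and made up for illustration (a 2-day half-life is built in), the function names are my own, and this is not the fitting code used in the paper.

```python
import math

def fit_exponential_decay(times, loads):
    """Least-squares fit of log(load) = log(v0) - rate * t. Returns (v0, rate)."""
    logs = [math.log(v) for v in loads]
    n = len(times)
    mean_t = sum(times) / n
    mean_l = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (l - mean_l) for t, l in zip(times, logs))
    slope = sxy / sxx
    intercept = mean_l - slope * mean_t
    return math.exp(intercept), -slope

def half_life(rate):
    """Half-life of an exponential decay with the given rate constant."""
    return math.log(2) / rate

# Synthetic, noise-free data (an assumption for illustration only):
# a load of 1e5 decaying with a 2-day half-life, sampled every 2 days.
true_rate = math.log(2) / 2.0
times = [0, 2, 4, 6, 8, 10, 12, 14]
loads = [1e5 * math.exp(-true_rate * t) for t in times]

v0, rate = fit_exponential_decay(times, loads)
print("fitted half-life (days):", round(half_life(rate), 3))
```

With real, noisy measurements the points would scatter around the line, and how tightly they hug it is what tells you whether the exponential assumption is reasonable.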
What about Craddock's second criticism? He claims that the equations that they fit to the data have several problems. The differential equation describing viral growth must include y(0), the *initial* number of virus-producing cells.
Does that make sense? Well, no. We're working in differential equations; that is, equations that measure *rates of change*. In fact, it's an elementary property of differential equations that they do not work in terms of raw values, but in terms of *rates of change*.
Think back to college calculus. What's the integral of $x^2\,dx$? $x^3/3 + c$. Notice that "+c" there. What does it mean? It's there because the derivative - the differential equation - only measures the rate of change. The value of the rate of change for $y = x^3/3$ is *not* dependent on the value of *y* when *x=0*.
*(Note: the following paragraphs have been substantially rewritten for clarity. See the discussion in the first few comments to see how things were changed.)*
Remember, the original equation describing the number of virus producing cells is $y(t) = y(0)e^{-at}$. Then, using that equation, they generate the *solution to* the *differential* equation for decay rate, $v(t) = v(0)[ue^{-at} - ae^{-ut}]/(u-a)$. This second equation is the important one, the one that is the focus of their attention. What does this second equation mean, and what is it used for?
The paper proposes a hypothesis: that the growth rate of the virus follows a pattern that can be modeled by an exponential function. It then goes on to test this hypothesis: they collect data about the decay rate in a number of infected patients, and see whether the data match the hypothesis. This equation is a template, whose behavior is determined by three variables. If the exponential curve fits the data correctly, *two* of those variables should have essentially constant values: these are the variables that describe the growth rate of the virus, *u* and *a*. The third variable is v(0) - the initial population of the virus; this one varies by patient, because different patients have different initial viral loads; the load of an infected patient can vary by several orders of magnitude. (That's not a statement about HIV; the viral population in different individuals infected by *any* virus can vary by several orders of magnitude.)
So why does the equation include *v(0)* (the initial population of virus), and not *y(0)*, the original population of virus-producing cells? Because in the context of this equation, the way that it's used, that leading value in the equation is basically a scaling factor: it's the factor that describes the unique initial starting point of each individual patient. Since the experiment starts in an equilibrium state, *it doesn't matter* whether you use v(0) or y(0) - in the equilibrium state, the two are directly proportional. Choosing one of them will fix the specific curve that fits a particular patient in a different position - but the key properties of the curve - the rate of change - will be completely unchanged.
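For readers who want to see where the v(0)-only form comes from, here is a sketch of the derivation. It rests on two assumptions that the paper describes in words but does not write out as formulas: that the model is $dv/dt = k\,y(t) - u\,v(t)$, and that production and clearance balance exactly at the moment treatment starts.

```latex
\begin{align*}
\frac{dv}{dt} &= k\,y(t) - u\,v(t), \qquad y(t) = y(0)\,e^{-at} \\
v(t) &= v(0)\,e^{-ut} + \frac{k\,y(0)}{u-a}\left(e^{-at} - e^{-ut}\right) \\
\text{steady state at } t=0:\quad & k\,y(0) = u\,v(0) \\
v(t) &= \frac{v(0)}{u-a}\left(u\,e^{-at} - a\,e^{-ut}\right)
\end{align*}
```

The second line is the general solution, which does contain $k\,y(0)$; substituting the steady-state condition from the third line turns it into the fourth, where $v(0)$ is the only patient-specific quantity left.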
Craddock is *supposed* to be a professor of mathematics, doing an unbiased analysis of the math of this paper. I don't believe that anyone teaching college level math could have made a mistake like this by accident. This is deliberate: he's practicing what I call *obfuscatory mathematics*. That is, you want to slip something past people who aren't particularly comfortable with math. So you talk fast, and use lots of mathematical words to obscure the fact that you're saying something patently ridiculous; you count on the fact that the overwhelming majority of your readers will not notice the stupid lie hidden by your obfuscation.
His argument *sounds good*: how can you calculate something that depends on the initial number of virus producing cells if you don't include the initial number of virus producing cells? But when you look carefully: we're creating a differential equation - an equation that describes the viral load in terms of its change over time. The number of virus producing cells *is* important; but our differential equation is written in terms of the number of viruses. It includes the term v(0): the number of viruses at time 0. Since we're measuring the *change* in viral load, the important initial factor is *the initial number of viruses*. The initial number of viruses is, of course, related to the initial number of virus-producing cells: it's directly proportional; but you don't need y(0) to be an explicit part of the solution to the differential equation. It's just a scaling factor *which doesn't change the result of the computation of viral replication speed*. Whether you use v(0), y(0), or some computed combination of the two *doesn't matter*. In the end, they're effectively just constant scaling factors that have *no effect* on the values of *u* and *a*, which are what will allow us to determine the rate of viral replication.
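The claim that y(0) drops out can be checked numerically. The sketch below integrates an assumed model, dv/dt = k·y(t) − u·v(t) with y(t) = y(0)·e^(−at) and production balancing clearance at t = 0, and compares the result with the formula quoted from the paper. The values of `u`, `a`, `k` and `v0` are illustrative assumptions of mine (the half-lives are loosely borrowed from the 1.5 and 2.0 day figures in the quoted passage), not fitted values.

```python
import math

def closed_form(t, v0, u, a):
    """Formula quoted from the paper: v(t) = v(0)[u e^{-at} - a e^{-ut}]/(u - a)."""
    return v0 * (u * math.exp(-a * t) - a * math.exp(-u * t)) / (u - a)

def integrate_model(v0, u, a, k, t_end, steps=2000):
    """RK4 integration of the ASSUMED model dv/dt = k*y(t) - u*v(t),
    with y(t) = y0*exp(-a*t) and steady state at t = 0, i.e. k*y0 = u*v0."""
    y0 = u * v0 / k  # initial producing-cell count implied by equilibrium

    def dv_dt(t, v):
        return k * y0 * math.exp(-a * t) - u * v

    h = t_end / steps
    t, v = 0.0, v0
    for _ in range(steps):
        s1 = dv_dt(t, v)
        s2 = dv_dt(t + h / 2, v + h * s1 / 2)
        s3 = dv_dt(t + h / 2, v + h * s2 / 2)
        s4 = dv_dt(t + h, v + h * s3)
        v += h * (s1 + 2 * s2 + 2 * s3 + s4) / 6
        t += h
    return v

# Illustrative parameters (assumptions, not values from the paper's fits).
u = math.log(2) / 1.5   # clearance rate of free virus, per day
a = math.log(2) / 2.0   # decline rate of virus-producing cells, per day
v0 = 1.0e5              # initial viral load
t_end = 7.0             # days after start of treatment

exact = closed_form(t_end, v0, u, a)
for k in (1.0, 100.0):
    numeric = integrate_model(v0, u, a, k, t_end)
    print("k =", k, "relative difference:", abs(numeric - exact) / exact)

# Changing v(0) only rescales the curve; the shape set by u and a is unchanged.
print("ratio for 10x initial load:", closed_form(t_end, 10 * v0, u, a) / exact)
```

Whatever value of `k` is chosen, the integrated curve agrees with the closed form to within numerical error, because under the equilibrium assumption `k` and y(0) only ever appear as the product u·v(0).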
Craddock also gripes about the fact that there are actually two factors involved in the viral load, but they can only measure one. But just go back and read the text carefully: they explain how, mathematically, they can account for it. Once again, this is just *how science works*: they've got a hypothesis; they've clearly stated their assumptions; and they've shown how the data fits their prediction. After the work is published, other people go back and try to reproduce it and refine it; and long term, either it stands up or it doesn't. They're very honest and open about the fact that they can only measure one of the two factors, and what assumptions they make for their model. In this paper, the very first paper to propose a model for HIV viral load *in vivo*, their model produces quite respectable results; results which have continued to hold up well over time.
So, after a quick look, how do Craddock's critiques hold up? Quite poorly. And as Craddock himself says, after his alleged discovery of an error on the first page of the Shaw paper: "Confidence that anything good will come out of this paper plummets at this point." I couldn't say it any better: after deliberate misrepresentations like what we see in the first two pages of Craddock, there's really not much point in paying attention to the rest. Craddock is just another true believer who's willing to tell deliberate lies if they support his position.
Uh, I don't follow you here. You write "the differential equation for decay rate $v(t) = v(0)[ue^{-at} - ae^{-ut}]/(u-a)$", but that is not a differential equation, for the simple reason that there are no differentials there.
I really wish Craddock would have written out his equation more in full, but the way I read him, v(t) is the amount of virus at time t, not the decay rate, as you seem to indicate. My guess from the description in his text is that the equation should be $v'(t) = ky(t) - uv(t)$, where the prime denotes the time derivative (I would have liked to write a dot above the v, but HTML being what it is, I won't try). And given $y(t) = y(0)e^{-at}$, I do in fact get the solution $v(t) = v(0) + ky(0)(e^{-at} - e^{-ut})/(u-a)$.
What is it I am misunderstanding here?
There is a mistake in my previous post: The solution should be $v(t) = v(0)e^{-ut} + ky(0)(e^{-at} - e^{-ut})/(u-a)$.
Now this is beginning to look like the result of Wei et al. All that is missing is the right initial condition. Excuse me while I am scribbling on the back of an envelope ...
Why, yes of course: The Wei et al solution is the one where v(0) and y(0) are connected in such a way that v'(0)=0. The idea being that, before the antiviral drug is administered, there is an equilibrium, in that the decay of virus exactly balances the production of same. Thus ky(0)=uv(0), and this can be used to eliminate ky(0) from the solution, and we get Wei et al's solution.
But I think it's bad writing on their part not to spell that out. Still, I also think your criticism misses the mark; see my first comment.
One more comment before I go to bed (it's after midnight here). What strikes me is that the paper under scrutiny is quite old; from 1996 in fact. Mark Craddock has six papers listed in math. reviews, all of them on symmetry methods for PDEs. The first one appeared in 1994, the last in 2004. So I would venture a guess that here is a rather pure mathematician by training, with a fresh PhD to boot, being a bit out of his depth when encountering some typical applied math, with its frequently incomplete statement of assumption (see my previous comments) and a style of writing that is very foreign to a mathematician. His criticism of the solution for not even containing ky(0) is indeed reasonable on a first look, and it does take a bit of thinking to come to the realization as I did, that there is an underlying assumption of equilibrium at the start of the experiment. Of course, he should have realized this before writing up his paper... It reminds me of the debacle surrounding the famous Monty Hall problem, where Marilyn vos Savant, who had presented the correct solution to the problem, was told off in rather arrogant tones by a bunch of eminent statistics professors. Lots of egg on lots of faces. You might wish to cover that story one day: There is a lesson in it.
I may not have been clear enough, so let me try again; if this helps, I'll edit the post and add this explanation.
What the paper is trying to do is to find a mathematical model that explains the rate of reproduction of HIV. They've got a set of experimental data - measurements of the viral load, which provide a metric for the amount of virus in infected individuals. They're proposing a model that should allow them to determine the replication rate. The model proposes a particular kind of exponential decay; the experiment attempts to see if there are coefficients that can be plugged into that equation to make it fit the measured data *for all of the patients*.
That *for all of the patients* is the key. The initial viral load *varies* between patients. The number of cells producing the virus *varies* between patients. The unifying factor is the differential: the rate of change. If the *rate of change* in the viral load is consistent among all patients, then *one* set of coefficients should fit *all* of the data; if the rate of change is *not* exponential, or if it varies from patient to patient, then they *won't* be able to fit the curve to all of the data with a single set of coefficients.
So the determining factor in this model *is the differential*. The initial number of infected cells doesn't really matter - what matters is the rate of change. There are going to be 3 variables used in matching the curve to the data. 2 of them - u and a - are expected to be constant for all patients; if they aren't, the hypothesis fails. The third variable is a constant that represents the initial load at equilibrium before the antivirals are administered. This third variable will be different for each individual patient. If the hypothesis in the paper is correct, then this number is *not important*; it's essentially a scaling factor that will adjust the decay curve to match the initial viral load of each patient. Since the initial state is equilibrium, it can be represented in terms of v(0), or y(0), or some combination of the two.
But the key to it all is the fact that the important factors; the factors that will determine whether the hypothesis is right or wrong are the values a and u which will allow them to compute the rate of change; which in turn will allow them to determine the rate of reproduction of the virus. If the equation works for fixed values of a and u for all patients, then the hypothesis is confirmed by the data. And it does - the rate of change fits the same exponential curve quite well for different patients - it just needs to be adjusted to match the initial load.
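As a rough illustration of "one rate, different starting loads", here is a sketch that estimates a single shared decay rate from several patients at once while letting each patient keep their own initial load. It is a simplification on two counts, both my own assumptions: it uses a plain single-exponential decay instead of the compound two-rate curve, and the three "patients" are synthetic, noise-free data made up for the example.

```python
import math

def shared_decay_rate(patients):
    """Common decay rate across patients, each with their own initial load.

    patients: list of (times, loads) pairs. Fits log(load) = c_i - rate * t,
    with a separate intercept c_i per patient and one shared slope, using the
    pooled within-patient least-squares estimate.
    """
    sxx = 0.0
    sxy = 0.0
    for times, loads in patients:
        logs = [math.log(v) for v in loads]
        mean_t = sum(times) / len(times)
        mean_l = sum(logs) / len(logs)
        sxx += sum((t - mean_t) ** 2 for t in times)
        sxy += sum((t - mean_t) * (l - mean_l) for t, l in zip(times, logs))
    return -sxy / sxx

# Three synthetic patients (made-up data): the same 2-day half-life,
# initial loads spanning two orders of magnitude.
true_rate = math.log(2) / 2.0
times = [0, 2, 4, 6, 8]
patients = [
    (times, [v0 * math.exp(-true_rate * t) for t in times])
    for v0 in (1e4, 1e5, 1e6)
]

estimated = shared_decay_rate(patients)
print("shared half-life (days):", round(math.log(2) / estimated, 3))
```

The per-patient intercepts absorb the differences in initial load, so they never influence the shared slope; that is the sense in which the initial value acts only as a scaling factor.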
Er, no, you haven't answered my concerns. Probably I have been unclear. Let's see ... You write
>What about Craddock's second criticism? He claims that the equations that they fit to the data have several problems. The differential equation describing viral growth must include y(0), the initial number of virus-producing cells.
But Craddock says nothing of the sort. He says that the solution to the differential equation (not the differential equation itself) must include y(0). And he's right that it must, since the differential equation after all includes y(t), which in turn depends on y(0). Unfortunately, Wei et al never write up the differential equation, but it has to be the one I put in an earlier comment. And please note: That equation includes y(t), which of course depends on y(0).
However, although I must agree with Craddock on this point, it does not follow that his criticism is valid. It is not a valid criticism of the substance of the paper, but it is a valid criticism of the writing: It is too terse, with enough left out that misunderstandings easily arise, and it could have been corrected with just an extra sentence or two, preferably including the actual differential equation they're using for the model. As I explained in my comments, although y(0) appears implicitly in the differential equation, it gets replaced by uv(0)/k thanks to the assumption of initial equilibrium. (Which the authors don't seem to mention, but you do in your comment above.)
It is where you write the following that I really disagree with you:
>Remember, the original equation for describing the number of virus producing cells $y(t) = y(0)e^{-at}$. And then using that equation, they generate the differential equation for decay rate $v(t) = v(0)[ue^{-at} - ae^{-ut}]/(u-a)$. This second equation is the important one, the one that is the focus of their attention. And because it's a differential equation, the constant y(0) does not appear.
And I disagree because what you call the differential equation here is not the differential equation: It is the solution of the differential equation. A solution in which, furthermore, the initial equilibrium condition has been employed to get rid of that pesky y(0) term.
In summary, I don't disagree with you that Craddock is off the mark. But I disagree with you on why he is off the mark, at least as far as this part of the argument goes.
Oh, just to hammer my point home some more: The way I read your original post, it looks like you're saying that differential equations don't contain constants because constants are constant (duh), and so their derivative is zero. And since differential equations are all about derivatives, there are no constants there.
If this is not what you're saying, I think you have some serious rewording to do. Of course differential equations have constants in them! That's a big part of what makes them exciting! True, some constants you can scale away, but others are just an inherent part of the equation, and if you try to scale them away in one spot, they just pop up elsewhere in the equation.
Craddock is a Dembskian - he doesn't understand how to model phenomena.
As Mark says, this is apparent when he confuses the prediction from the simple exponential fit in the graphs (or from the first-order approximate model) with measurement accuracy.
One wonders why Dembskians try to wander away from math into such territories, when they don't know how to read a map. (Literally!) I don't claim that it is always easy, but they already fail on the easy parts.
"They state that the virus is produced by virus producing cells y, at rate k, and decays exponentially at rate u."
Actually, the Wei et al paper says, much more carefully than Craddock "that free virus v(t) is generated by virus-producing cells at the rate ky(t) and declines exponentially with rate constant u." [Bold added.] Craddock is confused from the start.
He does go into a more adequate and detailed analysis in his "Supplementary Notes" (a separate note). He criticises Wei et al for not modelling viruses and virus-producing cells as a more basic coupled system, and for not presenting their implicit assumption of an initial steady-state viral load.
The latter formally eliminates y(0). But as Mark, who understands modelling, shows, both procedures are good simplifications for an empirical model that is to be fitted anyway.
An accompanying note verifies that other studies show that the steady state is a good assumption, as we already know since the model works so well: "longitudinal studies have shown that the viral load, of both free cells and that within infected cells, increases slowly but inexorably". (If it didn't, we would be looking at patients who would die in a few days, comparable to the clearing times that the measurements show and the model studies.)
Both Craddock and you seem to concentrate on the formality of the model, and the assumptions that go into converting it between forms. That is not essential when developing and verifying an empirical model. Mark gives a much better explanation for that than I would do. What is essential is to simplify the model as much as you can - here they are explicitly interested in modelling first-order effects on free viruses only, due to their inability to measure virus-producing cells.
I do think that there is a point in noting and remembering that the model assumes initial steady-state viral load when using it. But as per above Wei et al don't need to explain the full derivation, since it is rather easy to do, and they do verify the kinetics thoroughly. They state explicitly that they consider ongoing infection to "sustain steady-state levels of viraemia" and their graphs verify that assumption in initial counts.
My reading of Wei et al is that they considered it well known that the HIV viral load is approximately at steady state without treatment, and that virus researchers model viruses for a living.
"I do think that there is a point in noting and remembering that the model assumes initial steadystate viral load when using it."
Using it so far as to derive their later simpler empirical model, that is. And looking over the model, I can't see that Mark is wrong now that I've thought about it. The above assumption is obviously not needed when doing the fit here; it works on part of the fit too. My mistake.
My mistake, and the reason I made it was that *I was too formal too*. Hilarious! :-)
Mark: I see you have been rewriting. It is much better now. I am still not sure the short paragraph about college calculus is relevant, but never mind that.
Torbjörn: You write
>Both Craddock and you seem to concentrate on the formality of the model, and the assumptions that go into converting it between forms. That is not essential when developing and verifying an empirical model.
I am not so sure I agree with the first part (developing an empirical model), but certainly you're right about the verification bit. If I seemed to concentrate on the formality of the model, it is because that is where Mark went wrong. I have unfortunately not had the time to work through the empirical verification, curve fitting and all that, so I haven't commented on it.
If you read my comments in sequence you will see that I was indeed confused in the beginning, and my understanding has evolved with further comments.
Thanks for pointing out the "supplementary note" by Craddock, by the way. There, it becomes clear that he has (finally!) understood where the Wei et al solution comes from, with the initial condition and all. It took me maybe half an hour to figure that out after I posted my first comment here, but he had to publish a detailed criticism first. And it doesn't seem that his mistake has deterred him. He doesn't even acknowledge it as a mistake, but just blames it on Wei et al's "sloppy presentation" (I agree with him that it could have been better, but that is no excuse for not being able to work it out given adequate time).
>One wonders why Dembskians try to wander away from math into such territories, when they don't know how to read a map.
Ironically, MarkCC is physically incapable of reading a map, or even learning how to read one.
Extra! Extra! Darwinism undercuts mathematics! This one might be too simple for you.
That link to Panda's thumb is hilarious. Too simple you say? No, too weird for words.
I've previously looked at another of Craddock's "papers" here
It is amusing that Craddock cannot understand why the geometric mean is used for PCR viral load measurements.
Duh. Something about log-normal distributions in PCR measurements?
I've also found that Duesberg's wacky mathematics have been immortalised.
Fact: But, only 1 in 1000 unprotected sexual contacts transmits HIV (32-34), and only 1 of 275 US citizens is HIV-infected (29, 30) (Fig. 1b). Therefore, an average un-infected US citizen needs 275,000 random "sexual contacts" to get infected and spread HIV - an unlikely basis for an epidemic!
This "fact" has been credulously reproduced here
and in Bialy's book on page 209.
For some reason the "rethinker" mathematicians never found a problem with this.
The drugs reduce the amount of virus present in the blood by some factor. (Claimed to be an implausible 98.5% in these papers. Implausible because there is no possible way that viral load can be measured as accurately as the figure of 98.5% suggests.)
Actually the paper gives a figure of a mean 10^1.9-fold reduction in viral load following treatment with ABT-538 and L-735,524.
The paper gives this value to only two significant figures. Craddock apparently converts it from a log scale to a linear scale, calculates a percentage, quotes more significant figures than the two originally given, and then accuses the authors of claiming implausible accuracy.
He does the same thing here.
(The notation 10^4.6 is 10 to the power of 4.6, which you can work out on any scientific calculator. 10^4.6 =39810, or 19,905 virions per ml of blood. 10^7.2 = 15848932, or 7924466 virions per ml. 10^5.5 = 316228, or 158114 virions per ml. Of course the accuracy given here is ludicrous. They can't really mean that they can measure things this accurately)
Craddock has everything upside down.
Errors in QC-PCR are typically of the order of 0.5 log10, so a difference of a factor of less than 10^0.5 ~= 3 is not regarded as significant. A reduction of 10^1.9 is significant.
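To make the arithmetic concrete, here is a rough sketch in Python. The function names are mine and purely illustrative; the only inputs are the figures quoted in this comment (a 10^1.9-fold reduction and an assay error of about 0.5 log10). It shows how a log-scale value turns into a percentage, and why rounding the exponent to two figures already moves the percentage in the first decimal place.

```python
import math

def percent_reduction(log10_fold):
    """Convert a log10 fold reduction into a percentage reduction."""
    return 100.0 * (1.0 - 10.0 ** (-log10_fold))

def is_significant(log10_fold, assay_error_log10=0.5):
    """Treat a change as real only if it exceeds the assay error on the log scale.

    The 0.5 log10 default is the typical QC-PCR error quoted above; it is an
    assumption of this sketch, not a property of any particular assay.
    """
    return abs(log10_fold) > assay_error_log10

reported = 1.9  # mean log10 reduction quoted from the paper

print("fold reduction:", round(10 ** reported, 1))
print("percent reduction:", round(percent_reduction(reported), 2))

# An exponent reported as 1.9 could be anything from 1.85 to 1.95,
# so the percentage is only pinned down to within a few tenths.
low, high = percent_reduction(1.85), percent_reduction(1.95)
print("range from rounding alone:", round(low, 2), "to", round(high, 2))

print("smallest fold change regarded as real:", round(10 ** 0.5, 2))
print("is 10^1.9 significant?", is_significant(reported))
```

The point of the sketch is only that the precision lives in the exponent: any percentage quoted to three figures is an artefact of the conversion, not a claim made on the log scale.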
Craddock gets so many things wrong (some apparently deliberately wrong) that it is hard to take him seriously.
Appendix, added 6 August 1997. I have heard from Mark Craddock that his term "appalling mathematical error" is misleading. He wrote me to give a more precise description of what his objections were, for instance: "Next, they produce a solution to the above differential equation which is independent of ky(0). What they have obviously done is set ky(0)=uv(0) to obtain their solution. But this does not follow from anything in their paper. It comes from assuming that dy/dt = 0 initially...But the paper, unlike the Ho paper, does not say this. At least I can't find anywhere it has been mentioned on the first page where the formula appears. They want to get rid of ky(0), which should appear in the solution, because they don't know what ky(0) is. So they replace it by something, then make an assumption, without telling the reader."
I have attached a full copy of Craddock's further explanation to his article for circulation. His objections are really about the mathematical modeling and certain assumptions, not made explicit, and not justified by empirical evidence, rather than a mathematical error per se.
So in reality Craddock's objections are not based on mathematics but rather on his understanding (or lack thereof) of biology.
One more comment before I go to bed (it's after midnight here). What strikes me is that the paper under scrutiny is quite old; from 1996 in fact. Mark Craddock has six papers listed in math. reviews, all of them on symmetry methods for PDEs. The first one appeared in 1994, the last in 2004.
You know, that's an interesting observation-- that this paper is ten years old.
I'd be curious, if someone could somehow get hold of Craddock, whether he's still willing to stand by his 1996 HIV paper today, or whether he possibly would see errors in retrospect, or see his previous conclusions being changed by factual discoveries since then. Looking on Google, it doesn't appear Craddock has written anything publicly on the subject since...
(Random aside: And by coincidence, 1996 would have this paper being released the same year as Darwin's Black Box. Is it just me, or does it seem to be a running thing that while real science continually revisits and updates old results, pseudoscience will issue one set of pronouncements and not bother to ask the question ten years later of whether anything needs to be updated? I mean, we do have people like Dembski, who are constantly rewriting the basic definitions of their pseudoscience, but since, in that case, we are never given a clear idea what in the previous writings is being superseded or even whether the previous writings still stand, it isn't so much revising as not paying any attention to consistency with what they've written before. Anyway, Dembski aside, it should be obvious that there is something quite problematic about this AIDS denial site supporting itself with a 1996 response to a 1995 Nature paper. Even if the 1995 Nature paper had had problems, surely AIDS research has a little bit more to say by now on the subject covered by that basic paper? Surely the deniers ought to be responding to the research of today, not the research of eleven years ago? By the time of the unpublished 2005 Lang paper which is the most recent thing on that wiki, is anything different?)
By the way, MarkCC, I notice you've opened an "HIV denial" category for your blog but have not added your previous HIV-denial-related entry to it. You might want to fix that.
"all models are wrong, but some are useful" --George Box
In modeling phenomena we use assumptions that are known to be wrong. This is unavoidable. Reproducing the data exactly as observed would require a model at least as complicated as the system that generated the data. Borges has a story about this; I can't remember the title, but the premise has something to do with an obsessive mapmaker who finally completes his life's work by making a perfect 1:1 map of the city in which he lives. In modeling the correct question is not, "are our assumptions incorrect?", but, rather, "how incorrect are our assumptions?".
"I am not so sure I agree with the first part (developing an empirical model), but certainly you're right about the verification bit. If I seemed to concentrate on the formality of the model, it is because that is where Mark went wrong."
I don't think Mark went wrong, and Ethan explains why. But since I used to do a lot of different modelling for a living and never stopped to think about the bigger picture, I'm going to take the opportunity (as so often :-) to bloviate on and have fun with some parts of a probably large subject.
The purpose of a model is to post- and predict data. There will be compromises between simplicity and generality on one side and accuracy on the other. Generality is important for using prior knowledge and also for building further on a model for other predictions or higher accuracy. Simplicity and generality make for better predictive power. All three give more trust in the model.
Here an exponential model is enough. It doesn't matter much if it is developed by a domain-independent method (heuristically) or by a domain-dependent one (knowing that this is the general solution if there is a dominant decay process, knowing that this is the case for viruses). I believe the latter is what Lubos Motl at "The Reference Frame" IIRC calls the "morally correct" answer - knowing about the domain. The latter will of course make you trust the model more.
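As a rough illustration of the heuristic route, here is a minimal Python sketch that fits a single exponential decay by ordinary least squares on the log of the load. Everything in it is hypothetical: the function name, the starting load and the decay rate are made up, and the data are synthetic and noise-free, not taken from Wei et al.

```python
import math

def fit_exponential(times, loads):
    """Least-squares fit of log(load) = log(v0) - rate * t.

    Returns (v0, rate) for the model load(t) = v0 * exp(-rate * t).
    """
    n = len(times)
    logs = [math.log(v) for v in loads]
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return math.exp(intercept), -slope

# Synthetic, noise-free data generated from made-up parameters
# (NOT measurements from any paper).
true_v0, true_rate = 1.0e5, 0.35
days = [0, 1, 2, 4, 7, 10]
loads = [true_v0 * math.exp(-true_rate * t) for t in days]

v0, rate = fit_exponential(days, loads)
half_life = math.log(2) / rate

print("fitted v0:", round(v0))
print("fitted decay rate per day:", round(rate, 3))
print("half-life in days:", round(half_life, 2))
```

With real, noisy measurements the fit would of course not recover the parameters exactly, and the assay error on the log scale would have to be taken into account when judging how well the exponential describes the data.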
Going from basic differential equations (A) to exponential model (C) without fully solving the problem (B), as Wei et al. perhaps did, is an even better use of domain knowledge or trust. Sometimes we can't fully solve the problem from basics, or use only axiomatics when doing it.
I believe that is the case when physicists make quantum field theories - AFAIK the different quantization procedures aren't fully derived mathematically. But the results fulfill basic properties and are useful theories so theorists use them anyway. Other theories such as quasistatic thermodynamics come to mind. And what is good enough for theories is certainly good enough for applied models.
But even if we can go through A-B-C axiomatically, we still aren't assured applicability or trust.
The successful finegraining of the problem can be greater than the coarsegraining of the assumptions or the modelling. For example, the free virus load is measured in the blood. It is an assumption, and probably a good one, to take this as a measure of the average free virus load in the body. But in other cases an effective compartment model or a full partial differential model may be needed to arrive at a good enough answer.
"how incorrect are our assumptions?", indeed.
""One wonders why Dembskians try to wander away from math into such territories, when they don't know how to read a map."
Ironically, MarkCC is physically incapable of reading a map, or even learning how to read one."
I knew that. But he is so good at math (and mappings), so I hoped he wasn't going to be implicated. It seemed like such an appropriate analog (or model? :-) of the situation.
"And what is good enough for theories is certainly good enough for applied models."
Nope. If anything the implication goes the other way since we get a lot more trust in old and interlaced theories.
But I believe it is still a valid analog to discuss.