Lott on the Lancet study

The latest pundit to attack the Lancet study is somebody called John Lott. He writes:

I haven’t spent a lot of time going through the methodology used in this survey by Lancet, but I don’t know how one could assume that those surveyed couldn’t have lied to create a false impression. After all, some do have a strong political motive.

Well, unlike surveys of defensive gun use, where the people questioned can make up anything they like, the researchers tried to verify the deaths with death certificates and succeeded in 81% of the cases where they asked.

There is also the question of the comparability of the before and after war fatality rates. Andrew Bolt has a very extensive and interesting critique of the Lancet paper:

As I explained earlier, Bolt’s article contains some basic statistical errors. But Lott seems to be endorsing it. What does that say about Lott’s knowledge of statistics?

Lott also links to this New York Times article, claiming:

If the New York Times critiques you (even with caveats) from the right, you know that you are in trouble

Which is pretty weird, since the article defends the Lancet study:

Other critics referred to the findings of the Iraq Body Count project, which has constructed a database of war-related civilian deaths from verified news media reports or official sources like hospitals and morgues.

That database recently placed civilian deaths somewhere between 14,429 and 16,579, the range arising largely from uncertainty about whether some victims were civilians or insurgents. But because of its stringent conditions for including deaths in the database, the project has quite explicitly said, “Our own total is certain to be an underestimate.”

Comments

  1. #1 dsquared
    December 9, 2004

    I agree with your main point here, but I think that you’re arguing in a systematically biased way with respect to the possibility of an underestimate rather than an overestimate.

We know that one accident of the kind you describe actually happened; the Sadr City cluster had zero deaths. We also know that because of the grouping process, Samarra and Najaf were not sampled at all. That’s three separate highly violent regions which contribute nothing to the estimated death rate; in my previous post I referred to highly violent “regions” not “clusters”, because I was attempting (not very clearly) to make the point that there were lots of highly violent areas which weren’t sampled.

Meanwhile, you’re arguing on the basis of figures which exclude the one highly violent region which was sampled. The ex-Falluja figures can’t be taken to stand for “97% of Iraq” because they aren’t representative of Sadr City, Najaf, Samarra or Ramadi. These four cities would be equal to [DD rough guess] 15% of Iraq (based on the factoid that Sadr City has about 2m inhabitants, which is from memory and could be out by an order of magnitude). They are also likely to be disproportionate contributors to the death rate. I’d argue that in looking at causes of death, it makes much more sense to look at the cum-Falluja data than the ex-Falluja data, which is what the authors did.

  2. #2 Mike
    December 9, 2004

    D Squared:

    I have to take issue with some of your latest post.

    “In highly violent clusters, the excess deaths are massively biased toward violent deaths.”

We don’t have that many “highly violent clusters” to begin with. More than half, 18 of 33, reported no violent deaths whatsoever. Even excluding the 7 Kurdish clusters in the north that failed to tally a single violent death, we still have 11 of the 26 remaining clusters that did not report a violent death either.

Finally, we only have 21 violent deaths spread among 14 non-Falluja clusters. There simply aren’t enough violent deaths to go around to create a “massive bias toward violent death” in any cluster other than Falluja.

    You may have a point if we look at the governorates that encompass Falluja and Baghdad, but not the individual clusters themselves.

I’d also like to draw your attention to an earlier portion of this thread, where I advanced the theory that the non-Falluja bombing death numbers may well be quite small, as low as 6, and that this is inconsistent with the large death toll from the type of massive bombing that the Lancet study authors and its defenders were implying must have occurred. To put this another way, only the Falluja cluster sample reported deaths consistent with the type of bombing that the Lancet authors believe to be fairly common across Iraq outside the Falluja area.

At the time, the rebuttal to my point was two-fold. Scott challenged the supposition that there were only 6 bombing deaths in the non-Falluja clusters, voicing the possibility that the numbers could be significantly (for statistical purposes) higher.

    From you, D Squared, came the suggestion that possibly some of the non-Falluja bombing deaths (however many there were) could have been erroneously attributed to the coalition when in fact they were caused by insurgent mortars.

    But we now know that in fact there are 6 bombing deaths from the non-Falluja clusters. Further, we know that only 4 of the 32 non-Falluja clusters reported any bombing deaths. I believe this supports the point I made earlier, that bombing in Iraq was, and is, far from uniform, and in many areas was not, and is not, widely used by the coalition.

To further illustrate the argument I’ve made in the past, that the elements of the violent death toll are extremely fragile to manipulation, let’s assume that the one Thiqar cluster that reported 3 bombing deaths was swapped for another Thiqar neighbourhood that actually had 15 bomb deaths, something more consistent with (but still only a fraction of) the Falluja results. Such a bombing in Thiqar would represent what many have come to assume coalition bombing, as depicted in the Lancet study, must look like: a significant part of the neighbourhood devastated, with multiple deaths in multiple homes. By changing that single bombing, one event, we drive the non-Falluja bombing death rate up by a factor of 3, and create an extrapolated death toll of well over 50,000 from bombing alone. Of course, this puts the Lancet 100,000 excess death toll from all causes off the charts.

Now, let’s do the same thing with the Thiqar triple fatality, but instead have the survey pick another neighbourhood that reveals no bombing deaths. Now we’re down to only 3 bombing deaths outside of Falluja, for 97% of Iraq. The impact on the extrapolated excess death toll, and on the weight of criticism of Coalition culpability, is dramatically skewed from the data we currently have.

    Neither of the above possibilities is outlandish or unlikely, given the overwhelming number of clusters that experienced no bombing deaths.

    The difference in sampling a single neighbourhood other than those actually surveyed has the potential to cause huge swings in the overall extrapolations. I hate to keep bringing it up, but that’s what small numbers are vulnerable to.
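    To make that swing concrete, here is a back-of-the-envelope sketch in Python. The population and sample figures are rough assumptions on my part (approximately the study’s ex-Falluja sample size and Iraq’s 2004 population), not the study’s exact numbers, and the scale-up deliberately ignores design effects and confidence intervals:

```python
# Naive proportional extrapolation of the Thiqar substitution argument.
# SAMPLE_POP and IRAQ_POP are rough assumptions, not the study's figures.
SAMPLE_POP = 7_400       # approx. people in the 32 non-Falluja clusters
IRAQ_POP = 24_400_000    # approx. population of Iraq in 2004

def extrapolate(sample_deaths):
    """Scale sample deaths up to the whole population, naively."""
    return sample_deaths * IRAQ_POP / SAMPLE_POP

scenarios = {
    "actual data (6 bombing deaths)": 6,
    "Thiqar cluster at 15 instead of 3": 6 - 3 + 15,  # 18 deaths
    "Thiqar cluster at 0 instead of 3": 6 - 3,        # 3 deaths
}
for label, deaths in scenarios.items():
    print(f"{label}: ~{extrapolate(deaths):,.0f} extrapolated bombing deaths")
```

    Swapping a single cluster moves the extrapolation from under 10,000 to nearly 60,000, which is exactly the fragility described above.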

  3. #3 James Brown
    December 10, 2004

    dsquared argues that you should analyse the data with the Falluja cluster in.

    the authors specifically exclude this, because they say that the cluster is statistically unreliable. The numbers would extrapolate to 200k deaths (+/- confidence interval), and the authors could not contemplate justifying such numbers.

    so when you argue for including the falluja data, you have to accept that the authors themselves believe that data is not reliable for the purposes you wish to use it.

    j

  4. #4 Mike
    December 10, 2004

    D Squared:

    I don’t think I was neglecting the possibility of an overestimate being just as likely as an underestimate, at least not in the last post. To be sure, the thrust of my argument from the beginning is that I believe the 100,000 excess death figure is too high. But when it comes to this particular sampling methodology, I do believe that the actual study result could represent an underestimate, and I’m firmly convinced that identical studies could give us much higher extrapolations than what we have with the actual study. It goes without saying that I don’t believe this enhances the credibility of the results of the actual survey.

    If you look back at my previous post, I insert two hypothetical changes to the triple bombing fatality in Thiqar. One allows for a different neighbourhood that reports no bombing deaths, the other hypothetical supposes 15 deaths in a different neighbourhood. One creates a much lower death extrapolation, the other a much higher extrapolation. I think I was giving each (the underestimate and overestimate) equal billing.

The problem with using the actual study data to determine the probability of an overestimate or underestimate for the actual study is the shakiness of the totals for the unique causes of death that make up the lion’s share of the 100,000 excess death figure. As I’ve pointed out here and in the next thread, the actual data appears to grossly underestimate the number of insurgent dead, and provides extrapolations on bombing deaths which do not appear to corroborate the authors’ own conclusions. Because these two causes of death are so significant to an overall excess death calculation, the study’s apparent inability to provide reliable estimates for these reflects badly on the reliability of the 100,000 figure (and yes, it may actually be lowballing! I just can’t see it).

    “I’d argue that in looking at causes of death, it makes much more sense to look at the cum-Falluja data than the ex-Falluja data, which is what the authors did.”

    D Squared, I just can’t see a viable case for this. The reason that jumps out at me is the fact that including the Falluja numbers blows the confidence interval away. Including Falluja’s deaths unavoidably leads to a nationwide extrapolation of 300,000 excess deaths. The authors wanted no part of this, and are repeatedly on the record that their 100,000 estimate can be defended without relying on the Falluja data. The confidence interval can’t survive the re-introduction of the Falluja data.
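    To see why, consider a toy bootstrap over hypothetical per-cluster death counts. The 32 “quiet” clusters below mirror the 21-deaths-in-14-clusters pattern mentioned earlier in this thread; the outlier value of 52 is purely illustrative, not the study’s Falluja figure:

```python
import random

random.seed(0)

# Hypothetical per-cluster violent-death counts: 32 quiet clusters
# holding 21 deaths between 14 of them, plus one Falluja-like outlier.
# These are illustrative numbers, not the study's actual cluster data.
quiet = [0] * 18 + [1] * 9 + [2] * 3 + [3] * 2   # 32 clusters, 21 deaths
with_outlier = quiet + [52]                      # add one extreme cluster

def bootstrap_ci(clusters, reps=10_000):
    """95% percentile-bootstrap interval for mean deaths per cluster."""
    means = sorted(
        sum(random.choices(clusters, k=len(clusters))) / len(clusters)
        for _ in range(reps)
    )
    return means[int(0.025 * reps)], means[int(0.975 * reps)]

lo_ex, hi_ex = bootstrap_ci(quiet)
lo_cum, hi_cum = bootstrap_ci(with_outlier)
print(f"ex-outlier CI:  ({lo_ex:.2f}, {hi_ex:.2f})")
print(f"cum-outlier CI: ({lo_cum:.2f}, {hi_cum:.2f})")
```

    The interval that includes the outlier comes out several times wider: a single extreme cluster dominates the uncertainty, which is roughly why re-introducing Falluja blows the confidence interval out.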

    As I mentioned in an earlier comment, we have an unequivocal expression of this from Dr. Roberts in the Medialens e-mail exchange:

“It happens, that the one place with a lot of bombings, Falluja, and we excluded that from our 100,000 estimate….thus if anything, assuming that there has not been any intensive bombing in Iraq.” (the ellipsis isn’t mine, that’s how the e-mail reads)

As for Najaf, Samarra, Sadr City, etc, we really can’t project how sampling these areas might affect the outcome of a different survey. My suspicion is that large scale civilian casualties may well be specific to unique and limited geographic areas. I see this as a major limitation of the survey methodology. Sampling different clusters in the high violence areas not originally sampled may still provide a very misleading overall picture. Samplings from Samarra or Ramadi may not reveal a Falluja-like death toll, but what if each provided somewhere in the neighbourhood of 15 coalition-caused deaths? It’s unlikely the authors would have excluded them as outliers. Now you’ve got a much larger extrapolated death toll than 100,000. But could these clusters, AS A WHOLE, provide the necessary numbers to fill in the tens of thousands of extra bombing deaths? Targeted bombing may have hit certain very narrowly defined parts of the violent governorates. A survey that randomly selected one or more of the hard hit neighbourhoods in these governorates might result in violent death extrapolations the governorates are simply unable to even remotely sustain.

    Hopefully this makes sense, I’m getting tired!