Lancet study data

David Kane asked Les Roberts for the data for the Lancet study. He CC’d it to me, so
here it is.

Comments

  1. #1 nik
    December 20, 2005

    The scatterplot of Pre/Post CMRs is absolutely unfathomable.

  2. #2 Brendan H
    December 20, 2005

    I have run a quick cross-sectional time-series poisson regression on the data, and get increases in mortality of between 80% and 250%.

    See http://teaching.sociology.ul.ie/bhalpin/lancet/

    This takes account of regional differences in mortality, but perhaps not systematic differences in the increase.

  3. #3 dsquared
    December 20, 2005

    Tim, would it make sense to put some sort of digital signature on this file, just to be sure?

  4. #4 Tim Lambert
    December 20, 2005

    Umm, just to be sure of what?

  5. #5 dsquared
    December 20, 2005

    I’m not sure now. I just thought it was a cool techie thing to say. I sort of had the idea that it would be good to make sure that it was authenticated in case fraudulent versions showed up on the web, but it seems a bit paranoid when you say it out loud.

  6. #6 Eli Rabett
    December 20, 2005

    Even paranoids get spammed.

  7. #7 Brendan H
    December 21, 2005

    I’ve done more modelling, and feel my results tally well with the original paper’s estimate of an increase in mortality of about 50%.

    I’ve analysed this the “obvious” way, as I can’t figure out exactly what the authors did from the description in the paper. It may be that they did something more sophisticated, but the results are consistent.

    I have updated the details at http://teaching.sociology.ul.ie/bhalpin/lancet/

    The poisson model I mentioned above does not fit well due to overdispersion (more intercluster differences than the model assumes), so I have also fitted negative binomial models, which allow for this problem. With Falluja in the data these give CIs that include “no increase”, because including Falluja increases the variance a great deal. Excluding Falluja they generate a CI of 1.028–2.244 for the fixed effects model (point estimate 1.519), and 1.081–2.203 (point estimate 1.543) for the statistically more efficient random effects model. A Hausman test confirms that the random effects model is acceptable.

    I’m not sure how to scale these mortality factors up to a number for excess deaths, but the point estimates coincide closely with those in the paper. This is a reasonably conservative approach, and treats the data as providing only 64 observations (two mortality rates for each cluster, excluding Falluja). The clusters are treated as a simple random sample.

    The one thing I haven’t been able to do in this framework is to allow a cluster-specific increase in the mortality, as there are not enough degrees of freedom. This is something the paper suggests was done.
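As a rough illustration of the overdispersion check described in this comment, here is a minimal Python sketch. The counts are invented for illustration (they are not the Lancet cluster data), the function name is hypothetical, and equal exposure per cluster is assumed, which the real clusters do not have. Under a Poisson model the variance equals the mean, so a Pearson dispersion statistic well above 1 hints that a negative binomial model may be the safer choice.

```python
from statistics import mean

def pearson_dispersion(counts):
    """Pearson chi-square / (n - 1) for an intercept-only Poisson model
    with equal exposure in every cluster (an assumption of this sketch)."""
    mu = mean(counts)
    chi2 = sum((y - mu) ** 2 / mu for y in counts)
    return chi2 / (len(counts) - 1)

# Hypothetical death counts per cluster (NOT the study data).
hypothetical_counts = [0, 1, 0, 2, 9, 1, 0, 3, 12, 1]
dispersion = pearson_dispersion(hypothetical_counts)
print(f"dispersion = {dispersion:.2f}")  # far above 1 suggests overdispersion
```

A value near 1 would be consistent with Poisson variation; the invented counts above are deliberately lumpy so that the statistic comes out well above 1.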

  8. #8 Robert
    December 21, 2005

    Brendan is a Stata guy, so he posts his analysis of fixed and random effects models. I’m an R guy, so I post a picture.

  9. #9 Tim Lambert
    December 21, 2005

    Brendan, I think it’s good that you’ve analyzed it in a different way since it shows that the results aren’t very sensitive to the way you analyze the data.

  10. #10 Seixon
    December 21, 2005

    So… anyone want to create a graph of the distribution? Brendan, so according to you, the Fallujah data made the CIs include 0. Correct? Gee, I wonder why they threw it out…

    Needless to say, their data does not account for any variance between differences in rates between pairs of sampled and unsampled governates. I wonder what such included variance would have done to the CIs…

  11. #11 Ragout
    December 22, 2005

    Robert,

    Nice graph. But where’s Adhanya?

  12. #12 Robert
    December 22, 2005

    Ragout wondered:

    Where’s Adhanya?

    Aaaaaaarrrrrrgggggh. I clipped it off the top. I’ve put up a corrected graphic. Falluja is off the top of the graph but it would have been at (0,187). I’ve added “rugs” to the sides of the graph so you can see the distribution of pre- and post-invasion rates (with means 5 per thousand and 12.3 per thousand, giving a post/pre ratio of 1.54, as Brendan showed).

    There were clusters with lower post- than pre-invasion death rates (Sulyman 2 and 3 are examples) but you might be able to see from the graphic why the concerns about pairing governates are probably a red herring. Even adding a couple of clusters where the post-invasion rates were lower wouldn’t have brought the point estimate of the post/pre ratio down to 1.0.

    BTW, in case you were wondering why some cluster labels are larger than others, they’re scaled by the square root of the cluster size.

  13. #13 Tim Lambert
    December 22, 2005

    Ragout says “But where’s Adhanya”?

    It’s in Baghdad.

  14. #14 dsquared
    December 22, 2005

    the Fallujah data made the CIs include 0. Correct? Gee, I wonder why they threw it out

    Because it’s an outlier. Gee, since you clearly knew that, I wonder why you’re pretending not to?

  15. #15 Donald Johnson
    December 22, 2005

    Someone explained to me months ago why including Fallujah would widen the CI enough to include decreases in the death rate, but to me this suggests there’s something inadequate about normal statistical procedures. Inadequate because we know why Fallujah outliers exist in the post-invasion environment–there are places that have been heavily bombed. And we know that wasn’t going on in the months preceding the invasion. So unless someone wants to argue that Saddam’s security forces were committing large-scale massacres in 2002, we can safely assume that Fallujah outliers do not make it more likely that the death rate actually decreased after the invasion. Some clever Bayesian type ought to be able to come up with a procedure that shows the most likely range of death rates including the Fallujah data that incorporates the fact (if it is a fact) that such outliers probably didn’t exist in 2002.

  16. #16 dsquared
    December 22, 2005

    Donald is right on the theory but the intuition is even simpler; you have a dataset that suggests 8000-194000, a new data point comes in that would imply 300,000 – why would this make you think the true value was likely to be 0?

  17. #17 Seixon
    December 22, 2005

    dsquared,

    Exactly. I’d say that shows that this type of survey is not well-suited for estimating mortality. A single household can knock the entire estimate off the charts if it is either highly unique or there is any amount of deceit going on in the reporting. Just another reason not to trust this study at all. It is way too susceptible to fraud, manipulation, and not to mention things such as the Fallujah cluster.

    I’ll tell you one thing, that kind of thing had no effect on the UNDP study. For obvious reasons.

  18. #18 dsquared
    December 22, 2005

    Seixon said:

    A single household can knock the entire estimate off the charts

    No, that’s not true; do the sums yourself and you’ll see. You really do squander credibility by making these completely unchecked statements.

  19. #19 Seixon
    December 22, 2005

    dsquared,

    Look, when the entire basis of the 100,000 number comes from about 40 excess deaths, you’re telling me that if one household lies, misreports, or is oddly unique and reports 10 excess dead people, that this won’t affect the results? I mean, even 5 will make a difference, obviously. Or how about if 5 households misreport one extra death each? This is why these types of surveys are just no good for this kind of estimate. The UNDP survey had 22,000 households in it which would squash any such possible malfeasance.

  20. #20 dsquared
    December 22, 2005

    no, I’m saying it won’t “knock the entire estimate off the charts” which was your previous claim.

    This is pure Kaplan’s Fallacy; equally, if one household underreports deaths it will have the opposite effect and in the absence of any evidence you have no reason to believe that one or the other is the case.

  21. #21 Seixon
    December 22, 2005

    dsquared,

    Naturally it could go the other way, which doesn’t exactly make the estimate any more “robust” or precise. I’m just saying that the methodology is quite susceptible to manipulation. You can just admit it is a weakness, and we can move on.

  22. #22 Donald Johnson
    December 22, 2005

    Seixon, if a certain percentage of Iraqi households are likely to lie it’s going to mess up any survey, unless they are equally likely to lie about death tolls in either direction and by exactly the same amount. That wasn’t exactly the clearest sentence I’ve ever written–what I meant was that if ten percent of families exaggerates the death toll by one and another ten percent decreases death tolls by one, then it would tend to cancel out. Though not exactly, and you’d get closer to the true number the larger the survey. But if 20 percent lie in one direction and ten percent lie in the other, then you’ve got a systematic error no matter how big the survey.

    Les Roberts said in one of his interviews that the UNDP survey teams found they weren’t always getting a full accounting of all the deaths in a given family, btw. I forgot the details and also don’t remember which interview this was, but I think Tim linked to it.
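The point about balanced versus lopsided lying can be sketched with a small simulation. This is only an illustration under invented assumptions: each household either overstates by one death, understates by one death, or reports accurately, and the probabilities and the function name are hypothetical rather than anything taken from either survey.

```python
import random

def mean_reporting_error(n_households, p_over, p_under, seed=0):
    """Average error per household when a share p_over adds one death
    and a share p_under drops one death (hypothetical misreporting model)."""
    rng = random.Random(seed)
    total_error = 0
    for _ in range(n_households):
        u = rng.random()
        if u < p_over:
            total_error += 1      # household overstates by one death
        elif u < p_over + p_under:
            total_error -= 1      # household understates by one death
    return total_error / n_households

for n in (1_000, 100_000):
    balanced = mean_reporting_error(n, p_over=0.10, p_under=0.10)
    lopsided = mean_reporting_error(n, p_over=0.20, p_under=0.10)
    print(f"n={n:>7}: balanced {balanced:+.3f}, lopsided {lopsided:+.3f}")
```

In this toy model the balanced error shrinks toward zero as the sample grows, while the lopsided error settles near 0.10 deaths per household however many households are added, which is the systematic error the comment describes.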

  23. #23 Seixon
    December 22, 2005

    Donald,

    My point is that a single household could mess up the Lancet study and exaggerate the findings. This is not possible in the UNDP study as one household would be a drop in a cup compared to a drop in a thimble with the Lancet.

    I am very clear that the UNDP study underestimates the deaths, and that is not what we are talking about here. The Lancet methodology is simply not well-suited for the type of survey they were doing in Iraq. Hitting one or two bad neighborhoods, or missing any, will just screw the entire thing up. You will get wildly erratic results each time you do the damn thing.

    I would challenge Les Roberts to go back to Iraq, and conduct the exact same survey with the same exact methodology. He will not get the same results as before, not at all. Especially since his methodology will oversample or undersample governates within pairs that are not similar. In addition to cluster surveys not being suited for mortality due to wars such as that in Iraq.

  24. #24 Dano
    December 23, 2005

    I would challenge Les Roberts to go back to Iraq, and conduct the exact same survey with the same exact methodology

    I would challenge the annoying nitpicker who is, apparently, genetically incapable of backing down to go over there during hot wartime and see how GD good of a job his sitting on his *ss self could do.

    Crikey, lad, don’t you see you’re a joke?

    I hope Santa brings you peace.

    Best,

    D

  25. #25 Seixon
    December 23, 2005

    Dano,

    When did I become a well known mortality study author? If I went and did a study, nobody would care. Not to mention I have no resources available to me and…

    Oh, here I am acting as if your comment warranted an actual response. Silly me.

  26. #26 Tim Lambert
    December 23, 2005

    A single household could not mess up the Lancet study. The deaths were checked against death certificates so we know that they were not invented.

  27. #27 soru
    December 23, 2005

    The deaths were checked against death certificates so we know that they were not invented.

    That’s the same study that said:

    ‘Death certificates usually did not exist for infant deaths and asking for such certificates would probably inflate the fraction of respondents who could not confirm reported deaths.’

    and

    ‘In 63 of 78 (81%) households where confirmations were attempted, respondents were able to produce the death certificate for the decedent’

    and

    ‘Interviewers also believed that in the Iraqi culture it was unlikely for respondents to fabricate deaths.’

    ‘know’ would seem to be the wrong word, perhaps ‘are forced to assume’ would be better?

    If 20% of those surveyed were supporters of the insurgency, and one in 50 of that group was prepared to lie for the cause (perhaps reporting the death of a cousin as being in the household?), then what impact would that have on the figures?

    soru

  28. #28 Seixon
    December 23, 2005

    A single household could not mess up the Lancet study. The deaths were checked against death certificates so we know that they were not invented.

    Who are you kidding Lambert? Only 81% of them were checked for death certificates. For those of us who aren’t Lancet-apologists, that means that any of those 19% of households not asked could have invented whatever they wanted to.

    Also, the statement from the study:

    Interviewers also believed that in the Iraqi culture it was unlikely for respondents to fabricate deaths

    Wow, another belief. What is this, theism or a scientific study? I asked Iraq The Model about this, without revealing that it had to do with the Lancet study. The response I got from there is that they had no clue WTF this was about, they’d never heard of any such thing. They are Iraqis. Funny that.

  29. #29 Tim Lambert
    December 23, 2005

    No Seixon, it does not mean that they could make up whatever they wanted. They would have had to invent a believable reason why the death certificate was unavailable. And even in the implausible case where all those deaths were invented, it still doesn’t make much difference to the estimate.

  30. #30 David Kane
    December 24, 2005

    Thanks to Tim for posting this data. Comments:

    1) I assume/hope that Tim sought and received permission from Roberts to post this data. No such permission was sought or received by me.

    2) This data, while helpful, is not adequate for reproducing the results in the paper. For example, you can’t reproduce either figure 2 or table 2 from what Tim has posted. I hope to work with the Lancet authors after the holidays to solve this problem.

    3) I think that it is excellent that others are using this data to (try to) replicate portions of the Lancet results. I am pleased to have played a small part in this by requesting the data in the first place. The more that this occurs, the better. The more open that you are with your science, the more that other people will believe your claims.

  31. #31 soru
    December 24, 2005

    Only 81% of them were checked for death certificates. For those of us who aren’t Lancet-apologists, that means that any of those 19% of households not asked could have invented whatever they wanted to.

    That’s wrong – a tiny proportion were asked (2 households per cluster, and only for non-infant deaths). Of those asked, 81% were able to answer.

    It doesn’t say, but it looks like the interviewers had discretion as to who to ask (and presumably they wouldn’t pick the armed and angry-seeming man with local tribal, Islamist or Ba’athist connections).

    It also doesn’t say anything about whether checks were done to ensure the dead person, certificate or not, was actually a part of the household, and not a non-resident cousin, acquaintance, or colleague.

    Such checks are more or less impossible – it is probably a pretty much unavoidable weakness of this kind of study.

    soru

  32. #32 Tim Lambert
    December 24, 2005

    It is misleading to describe it as a “tiny proportion” since most households did not have a death. The interviewers would have asked the first two households that reported deaths (since otherwise reinterviews would have been required).

    Why would a household have a death certificate of an acquaintance and also falsely report their death as that of a household member?

  33. #33 Robert
    December 26, 2005

    In 16, d^2 wrote:

    you have a dataset that suggests 8000-194000, a new data point comes in that would imply 300,000 – why would this make you think the true value was likely to be 0?

    Yeah. The issue is that when Falluja is included, the estimate distribution is skewed so standard statistical inference is whack. You can see that by looking at the bootstrap estimates using the cluster data provided by Roberts. The bootstrap CI including Falluja way excludes zero.

    You can also see that the bootstrap estimates excluding Falluja are much better behaved.
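For readers who have not met the bootstrap, here is a minimal sketch of the idea. It uses the plain percentile interval, which is simpler than the BCa interval mentioned later in the thread, and the per-cluster values are invented for illustration (one extreme value stands in for an outlying cluster). Nothing here reproduces the graphics linked above or the actual cluster data.

```python
import random

def bootstrap_percentile_ci(values, n_boot=5000, alpha=0.05, seed=1):
    """Percentile bootstrap CI for the mean: resample the clusters with
    replacement, recompute the mean each time, and read off the quantiles."""
    rng = random.Random(seed)
    n = len(values)
    boot_means = sorted(
        sum(rng.choices(values, k=n)) / n for _ in range(n_boot)
    )
    lo = boot_means[int((alpha / 2) * n_boot)]
    hi = boot_means[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Hypothetical per-cluster excess rates (NOT the study data).
without_outlier = [2, 3, 1, 4, 2, 3, 5, 2, 3, 4]
with_outlier = without_outlier + [60]

lo_without, hi_without = bootstrap_percentile_ci(without_outlier)
lo_with, hi_with = bootstrap_percentile_ci(with_outlier)
print("without outlier:", lo_without, hi_without)
print("with outlier:   ", lo_with, hi_with)
```

Because every resampled mean is an average of observed values, a percentile interval built from all-positive data cannot dip below zero, and adding the extreme value stretches the upper end rather than dragging the lower end down. The real clusters include some with lower post-invasion rates, so this shows only the mechanism, not the study's result.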

  34. #34 Seixon
    December 28, 2005

    soru,

    That isn’t correct, since only a small number of households actually had deaths to report. Lambert, do you have the number for the total amount of households that reported a death? Does this parallel the number given for seeking death certificates?

    Also, Lambert, you cannot discount the possibility of fraudulent reporting by any one household which was not asked for death certificates. Anyone and their grandmother can make up a reason not to have one. Lying is easy, and I’m not so sure that Roberts’ team of Iraqi investigators would scrutinize allegedly grieving Iraqis on the supposed deaths of their kin.

    The fact that so many “beliefs” get thrown into this study is a bit unnerving. Especially when I have had confirmed by actual Iraqis that they had never even heard of one of the “customs” that the Lancet study alleges.

  35. #35 z
    December 28, 2005

    “Especially when I have had confirmed by actual Iraqis that they had never even heard of one of the “customs” that the Lancet study alleges.”

    And which we shall apparently never even hear of either.

  36. #36 Robert
    December 30, 2005

    Seixon wondered:

    Lambert, do you have the number for the total amount of households that reported a death? Does this parallel the number given for seeking death certificates?

    Perhaps I’ve overlooked it, but I couldn’t find that info in the article. However, we can get a rough idea about it. We know that they planned to ask for death certificates in cases with adult (not infant) deaths, and that the total of all non-infant deaths, both pre- and post-invasion, including Falluja, was 38+121 = 159. So presuming a maximum of only one (adult) death in any household, the absolute maximum number of households was 159. If there were a handful of households that had had two adult deaths, the number would be lower. 78 households were asked to provide death certificates, so an absolute lower bound on the proportion is 78/159 = 49%. So let’s say half.
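The bound worked out above can be written as a few lines of Python. The three input figures are the ones quoted in this comment; the variable names are just labels chosen for this sketch.

```python
# Figures as quoted in the comment above.
pre_invasion_deaths = 38      # non-infant deaths before the invasion
post_invasion_deaths = 121    # non-infant deaths after the invasion (incl. Falluja)
households_asked = 78         # households asked for a death certificate

# If no household had more than one such death, this is the most
# households that could have reported a death.
max_households_with_death = pre_invasion_deaths + post_invasion_deaths

# Fewer households with deaths would only raise the share that was asked,
# so this ratio is a lower bound.
lower_bound_share = households_asked / max_households_with_death
print(f"{max_households_with_death} households at most, "
      f"so at least {lower_bound_share:.0%} were asked")
```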

    Also, Lambert, you cannot discount the possibility of fraudulent reporting by any one household which was not asked for death certificates.

    Perhaps not, but it’d be kind of an odd story to tell about why the half that weren’t asked would be systematically more inclined to fraud than the half that were.

  37. #37 soru
    December 30, 2005

    Perhaps not, but it’d be kind of an odd story to tell about why the half that weren’t asked would be systematically more inclined to fraud than the half that were.

    In the paper, it explicitly says that the researchers felt that asking that question might put themselves at risk.

    Checking the story of someone you suspect might be lying sounds more risky than asking a question you think will be easily answered.

    soru

  38. #38 Robert
    December 31, 2005

    Soru wrote:

    In the paper, it explicitly says that the researchers felt that asking that question might put themselves at risk.

    Is this what you were referring to? “When deaths occurred, the date, cause, and circumstances of violent deaths were recorded. When violent deaths were attributed to a faction in the conflict or to criminal forces, no further investigation into the death was made to respect the privacy of the family and for the safety of the interviewers.”

    If that does mean that death certificates were not asked for in those cases, it means that the decision was protocol-based before going out into the field. Before going out, they were probably worried about getting a case or two where a death was of an enemy combatant, or a rape (in most cases, the overwhelming causes of post-conflict deaths are disease or malnutrition, and they probably weren’t expecting that violence-related causes would be very high), and you’d set up a protocol ahead of time so that the interviewer wouldn’t be in the position of probing into potentially embarrassing deaths. Apparently, not all violent deaths qualified: only those related to a faction in the conflict or to criminal forces. So the interviewee had to report a potentially embarrassing cause of death rather than hide it or give a more common (and less embarrassing) cause of death.

    There is a fair amount of detail one has to collect in order to reconstruct the person-months of exposure. You don’t just walk into a household and ask, “Anyone die in here?”, make a tally mark on a sheet of paper and then leave. I stress that I don’t know exactly how Roberts did it in this case, but I can tell you a common way to collect family event histories in other studies. Take a sheet of paper, turn it sideways, and draw a grid on it. Each row is labeled with a person’s name, age, and relationship to the household (son, daughter, cousin, mother-in-law). Each column is marked with a month from Jan 2002 to Sep 2004 (the study period). You draw a horizontal line through all the boxes that the individual was in the household. If a baby is born, or an adult enters the household, you start up a new line and extend it to the right. If someone moves out, you stop the line. If someone dies, you stop the line and mark it with an “X”, then record the date, cause of death, and circumstances of violent death. At any point you can look down the columns and tell how many people are in the household. Any line shorter than 2 months doesn’t count as a household member. In order to make up a fraudulent death, you’d also have to make up a birthdate, a deathdate, a relationship to the household head, and a potentially embarrassing cause of death, not knowing that the interviewer later on might ask for a death certificate.

    This is what I meant by “an odd story.” It’s certainly possible, just a bit odd.
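The paper grid described in this comment maps naturally onto a small data structure. The sketch below is one hypothetical way to encode it, not a description of how the Lancet team recorded anything: the class, field names, and example household are all invented, months are indexed 0 to 32 for January 2002 through September 2004, and the two-month rule is applied as "present for fewer than two months is not counted".

```python
from dataclasses import dataclass

STUDY_MONTHS = 33  # Jan 2002 through Sep 2004, indexed 0..32

@dataclass
class Member:
    name: str
    first_month: int      # index of first month in the household
    last_month: int       # index of last month in the household (inclusive)
    died: bool = False    # True if the line ends with an "X"

def months_present(member):
    """Length of the member's line, clipped to the study period."""
    start = max(member.first_month, 0)
    end = min(member.last_month, STUDY_MONTHS - 1)
    return max(0, end - start + 1)

def person_months(members, min_months=2):
    """Total exposure, skipping anyone present for fewer than min_months."""
    return sum(months_present(m) for m in members
               if months_present(m) >= min_months)

def headcount(members, month):
    """Read down one column of the grid: who was present that month?"""
    return sum(1 for m in members if m.first_month <= month <= m.last_month)

# Hypothetical household (invented for illustration).
household = [
    Member("head", 0, 32),
    Member("infant", 10, 32),            # born in month 10
    Member("uncle", 0, 19, died=True),   # died in month 19
    Member("visitor", 5, 5),             # one month only, not counted
]
print(person_months(household), headcount(household, 12))
```

The design mirrors the sheet of paper: each `Member` is a row, `headcount` reads a column, and `person_months` adds up the line lengths that form the denominator of a mortality rate.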

  39. #39 Seixon
    January 1, 2006

    Perhaps not, but it’d be kind of an odd story to tell about why the half that weren’t asked would be systematically more inclined to fraud than the half that were.

    I merely suggested the possibility of it happening, something which Lambert can’t allow to keep his house of cards up.

    If something is possible, Lambert will make it impossible, for the sake of his argument.

    In other words, taking the position that a single household in the sample could/would not have manipulated the results of the study is well, tantamount to a certain thing that I can’t say because then Lambert will censor me. :)

  40. #40 Robert
    January 2, 2006

    Seixon wrote:

    I merely suggested the possibility of it happening

    You’re saying that merely suggesting the possibility of fraud in a single household, no matter how odd the story would have to be, invalidates the results of any study? That’s a pretty high standard for invalidation.

    Let’s say a single household did, in fact, lie about the number of deaths that occurred, and that the lie wasn’t to understate but to overstate. The interviewers tried to verify death certificates for two cases per cluster, so by randomness the uncovered fraud would have had to occur in a household that wasn’t in the first two that were asked for verification. That’s an odd story.

    You can look at Roberts’ cluster level data to see that (excluding Falluja) no cluster experienced more than the 9 post-invasion deaths that occurred in Adhanya-Baghdad. That means that, at the very maximum, no single family could have overstated post-invasion deaths by more than 7. Piling up all overstatements in one family? That’s an odd story.

    In order to get the person-months of exposure, that family would have had to manufacture names, ages, dates of entry into the household, dates of death, and causes of death for each household member. That’s an odd story.

    Nonetheless, the most that one family could have influenced the estimate was 7 deaths’ worth, and it would have had to have occurred in the Adhanya cluster. So, subtract 7 deaths from there and calculate a “new” estimate of excess mortality: I get 76000 excess deaths as compared to Roberts’ 98000 (but you should double-check). Now bootstrap it. What you’ll see is that the 95% confidence interval around 76000 includes 98000.

    So, even piling odd story upon odd story upon odd story, fraud in no single household would have invalidated the original estimate.

    BTW, as long as you’re at it, add an additional hypothetical cluster from one of the governates that you think should have been included. Get a new estimate of excess mortality, and bootstrap it. Can you exclude Roberts’ original 98000 estimate? I suppose if you chose the right parameters for the additional hypothetical cluster you could, but that would be an odd story.

  41. #41 Seixon
    January 2, 2006

    You’re saying that merely suggesting the possibility of fraud in a single household, no matter how odd the story would have to be, invalidates the results of any study? That’s a pretty high standard for invalidation.

    Well you see, most surveys alleviate problems caused by this by not having such anemic sample sizes and sample methodologies…. ;)

    So, even piling odd story upon odd story upon odd story, fraud in no single household would have invalidated the original estimate.

    An odd story is still a possible one, is it not?

    I didn’t say it would have invalidated the original estimate, just that the result could have been skewed either up or down by quite a bit from the fraud of just one household.

    I’m speaking to the uncertainty of the study’s result, not that it is in fact invalid. I have long said that the 100,000 figure (98,000, whatever) might very well be correct. Yet I have also said that this study, its methodology, do not give any confidence in that result. The size of the confidence interval is just the beginning of the imprecise and uncertain nature of the results of this study.

  42. #42 Judge Ito
    January 2, 2006

    “a new data point comes in that would imply 300,000 – why would this make you think the true value was likely to be 0?”

    The “true value” isn’t the mean! it’s the mode you ninny.

    The mode of a skewed distribution isn’t pushed upward by an outlier like fallujah… the bootstrapped pdf seems like it’s pushed downward.

  43. #43 dsquared
    January 2, 2006

    The “true value” isn’t the mean! it’s the mode you ninny

    No, the true value is the number of excess deaths in Iraq. The “most likely” true number is the number which corresponds to the maximum of the likelihood function. No reasonable estimation method would have the property that a large positive outlier moved the ML estimate downward; I suspect that this might even be directly ruled out by Bayes’ Theorem.

  44. #44 Judge Ito
    January 2, 2006

    “most likely” true number is the number which corresponds to the maximum of the likelihood function.

    “the maximum of the likelihood function” and “the mode” are the same value. The mode is the most likely value.

  45. #45 Robert
    January 2, 2006

    the bootstrapped pdf seems like it’s pushed downward.

    Doesn’t look that way to me. Perhaps you overlooked that the horizontal scales are different? The BCa CI for the data including Falluja excludes 0 (as we’d expect it to).

  46. #46 Robert
    January 2, 2006

    Seixon wrote:

    I’m speaking to the uncertainty of the study’s result, not that it is in fact invalid.

    ????

    You’re saying that the main effect of a potential fraud in one household in the sample is to add to the estimate’s uncertainty?

    An odd story is still a possible one, is it not?

    One of the things that expertise bestows is (sometimes) the ability to put a rough ordering on “possible” errors. The possibility that 7 deaths in one single family were fraudulent and undetected may be non-zero, but it isn’t large. If you’re walking around the Rocky Mountains and you hear hoofbeats, it’s possible it could be a zebra but it’s much more likely to be a horse. It’s better to focus on likely errors than on monkeys-may-fly-out-of-my-butt possibilities.

  47. #47 Judge Ito
    January 2, 2006

    The BCa CI for the data including Falluja excludes 0 (as we’d expect it to).

    that wasn’t what i said. and “we” shouldn’t expect anything going in.

    I’d like to see the calculations that went into the CI. I would think the CI attached to an unusually shaped distribution wouldn’t lend itself to intuitive judgments. The mode doesn’t look different, and the mode measures the “most likely” true measure.

  48. #48 Judge Ito
    January 2, 2006

    The BCa CI for the data including Falluja excludes 0 (as we’d expect it to).

    Where is this variable given please?

  49. #49 Robert
    January 2, 2006

    “we” shouldn’t expect anything going in.

    Of course we should. If the CI for the data without Falluja excluded 0, when you add one observation way, way to the right of the rest of the data, you’d certainly expect the BCa CI to move to the right rather than to the left.

    Where is this variable given please?

    Any modern bootstrap program will give you the BCa CI.

  50. #50 Donald Johnson
    January 2, 2006

    I’m a little confused about what this argument is about anymore, though for purely selfish reasons I’m happy to see it continue. (It’s been an interesting way to learn a little about statistics.) Seixon says in post 23 that the UNDP estimate might easily be an undercount and agrees that the number of excess deaths might be 100,000 but says that the uncertainty in the Lancet study is huge and the sample size ought to be much larger. Well, yeah. Roberts and company, I thought, went in with the intention of getting a crude estimate of mortality rates and did the best they could under the circumstances and came out with a mid-range figure that wasn’t unreasonable. But there are huge uncertainties and ideally the study should be repeated on a much larger scale. Probably everyone agrees with this. The rest is not terribly important detail. Though again, it’s been educational, so keep it up.

  51. #51 Seixon
    January 2, 2006

    Robert,

    I’m simply saying that the study methodology leaves its conclusions open to fraud more than one would have wanted. The UNDP study is much less affected by such things due to the massive sample size and their methodology.

    Donald,

    If I get the raw numbers from the UNDP/FAFO on their question about deaths in the family (which for some reason were not published in their “Tabulation Report”), then we will be able to compare these with the Lancet numbers. In other words, the total amount of deaths in the post-invasion period should be equal in both the ILCS and the Lancet study, adjusted for the different time frames.

    The ILCS published a number of 24,000 for the amount of “war-related” deaths, but we were never given the numbers for the other causes of death. If the total numbers of death in the ILCS data are much lower than the Lancet ones (adjusted for time frame), then we’ll be much more able to see how much the Lancet data is overblown.

    I am also tempted to ask Roberts for the data on “accidents”, to see whether or not any of those were due to increased activity rather than things caused by the war.

    It would be seriously disingenuous for a study to talk about “excess deaths” caused by a war, if we are talking about increased traffic accidents caused by… increased traffic. The amount of traffic in Iraq, along with many other normal things (such as power usage) have gone up since the invasion. The latest report shows that the demand for electricity is twice what it was before the war. I’d imagine that the amount of traffic is also up a good amount, which would directly cause more traffic accidents and thereby deaths.

  52. #52 Judge Ito
    January 2, 2006

    you’d certainly expect the BCa CI to move to the right rather than to the left.

    confidence intervals attached to bootstraps are considered unreliable, esp. with small highly skewed samples as here. results are inconsistent and tails underrepresented.

    but since you seem to have such a bootstrapping program at your disposal (and I don’t), it should be easy for you to identify where the single “most likely” 5% interval for excess deaths occurs including fallujah. from your plot it seems to be the same as the “most likely” result with fallujah included (ie around 100k excess deaths). If you could share that it would be greatly appreciated.

  53. #53 z
    January 3, 2006

    “Of course we should. If the CI for the data without Falluja excluded 0, when you add one observation way, way to the right of the rest of the data, you’d certainly expect the BCa CI to move to the right rather than to the left.”

    You’d expect the upper confidence limit to move to the right since you’ve widened the spread a lot. And the estimate itself to move to the right, but not as much (because you’re adding what you’ve already defined as an outlier). And since the lower confidence limit and upper limit are symmetrical about the estimate…. do the math, as they say in an overused fashion although here it’s appropriate.

  54. #54 Robert
    January 3, 2006

    Seixon averred:

    I’m simply saying that the study methodology leaves its conclusions open to fraud more than one would have wanted.

    Hmmm. I would have said that we should always worry about fraud, but this study design was very reasonable about verifying a large fraction of reported deaths. Perhaps you think the CI on the estimates is large, but that’s a function of sample size, not particularly of fraud. People who criticize studies of this type for wide CIs (and by inference, for small sample size) have never done primary data collection in the field. It’s easy to say that there are imperfections. No study is perfect, so the way to evaluate research is whether its contributions outweigh its shortcomings; in the biz we say: don’t let the perfect be the enemy of the good.

    Judge Ito said:

    you seem to have such a bootstrapping program at your disposal (and I don’t)

    http://www.r-project.org

    confidence intervals attached to bootstraps are considered unreliable, esp. with small highly skewed samples as here.

    That’s why I showed you the BCa CI rather than the bootstrap percentile CI. The BCa CI is much better behaved with small skewed samples.

    Perhaps I misunderstand what you’re getting at, but the mode of the bootstrap distribution isn’t the maximum likelihood estimate of the excess deaths. The bootstrap distribution of the excess deaths is itself comprised of many replicates of estimates that are each maximum likelihood, so focusing on its mode is a red herring. The big picture issue is that when Falluja is included the sampling distribution is bimodal, skewed, and over-dispersed. These are symptoms that tell us that the Falluja cluster is very different than the rest of the clusters, and should have been (and was) dropped. Seixon intimated that the reason why Roberts dropped Falluja was because when included the (standard) CI did not exclude zero. My comment about the BCa CI including Falluja was simply to point out that the standard CI calculations are symmetric about the mean and thus included zero while the bias corrected and accelerated CI matched our intuition better in excluding it. Falluja wasn’t dropped to hide statistically insignificant results; it was dropped because it was atypical.

  55. #55 Robert
    January 3, 2006

    z suggested:

    And since the lower confidence limit and upper limit are symmetrical about the estimate. do the math, as they say in an overused fashion although here it’s appropriate.

    For reasons I’ve outlined above, a symmetric CI when including the Falluja cluster is almost surely inappropriate.

  56. #56 Robert
    January 3, 2006

    Donald Johnson courageously admitted:

    I’m a little confused

    Sometimes aren’t we all.

    Seeing Roberts’ cluster-level data summaries is tremendously helpful. For the past year, people have been making hypotheticals about the effect of this or that shortcoming, but actually seeing the data helps us to put rough bounds on the size of the effects. For example, Seixon has been focusing on the paired governates. Now, rather than make abstract arguments, if you were so inclined you could substitute in a “what-if” cluster to see how much influence it could have had.

  57. #57 z
    January 3, 2006

    “For reasons I’ve outlined above, a symmetric CI when including the Falluja cluster is almost surely inappropriate.”

    Yes. It makes the assumption of a Gaussian curve for the deaths, which is clearly false if you include Falluja. Thus calculations which rely on Gaussian statistics would give incorrect answers, thus the lower CI < 0, which is a bogus result stemming from the original incorrect attribution of Gaussian statistics to the data including Falluja. Which explains the question posed, i.e. why including an outlier with a very high rate would make the lower bound drop: because the calculation of the lower bound is now inappropriate.
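    The effect described here can be reproduced with a toy calculation. The sketch below uses made-up per-cluster counts (not the survey data) and the usual normal-approximation interval, mean plus or minus 1.96 standard errors; the function name `symmetric_ci` is just for illustration. Adding one very large value pulls the mean to the right but inflates the standard error even more, so the lower limit falls below zero.

```python
import statistics

def symmetric_ci(values, z=1.96):
    """Normal-approximation CI: mean +/- z * standard error."""
    mean = statistics.mean(values)
    se = statistics.stdev(values) / len(values) ** 0.5
    return mean - z * se, mean + z * se

# Hypothetical per-cluster excess deaths, NOT the Lancet cluster data.
typical = [0, 1, 2, 1, 3, 0, 2, 1, 2, 1]
with_outlier = typical + [50]  # one cluster far to the right of the rest

lo_a, hi_a = symmetric_ci(typical)
lo_b, hi_b = symmetric_ci(with_outlier)

print(f"without outlier: ({lo_a:.2f}, {hi_a:.2f})")
print(f"with outlier:    ({lo_b:.2f}, {hi_b:.2f})")
```

    With these toy numbers the first interval sits entirely above zero and the second straddles it, even though the added observation is strongly positive.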

    Are we agreeing or disagreeing? I need to know so that I know whether to insult you or not.

  58. #58 Robert
    January 3, 2006

    z wondered:

    Are we agreeing or disagreeing? I need to know so that I know whether to insult you or not.

    I spend a fair amount of time on Usenet, where one needn’t be predicated on the other.

  59. #59 z
    January 3, 2006

    “I spend a fair amount of time on Usenet, where one needn’t be predicated on the other.”

    Well, at least consider your sexuality impugned.

  60. #60 Robert
    January 3, 2006

    Well, at least consider your sexuality impugned.

    Ah. So you’re already familiar with rec.bicycles.racing.

  61. #61 z
    January 3, 2006

    “Ah. So you’re already familiar with rec.bicycles.racing”

    Nothing but heterocycles for me, just as God intended.

  62. #62 Seixon
    January 3, 2006

    Now, rather than make abstract arguments, if you were so inclined you could substitute in a “what-if” cluster to see how much influence it could have had.

    I have pondered doing this, but I think not having the actual data that the Lancet had stopped me from doing so. I’m not experienced enough, nor do I have the software they used, to re-create their CI and finding. If someone could, then they could toy around with how much this or that effect would have had.

    Such as my focus on incorrect pairings. Still haven’t received any response from UNDP or FAFO regarding the ILCS data…

  63. #63 nik
    January 4, 2006

    Robert;

    Can you spell out exactly how you made your bootstrap estimate of excess mortality. Did you get it by:

    (1) Generating a bootstrap sample of the clusters,
    (2) Calculating the pre- and post-invasion CMR of the sample,
    (3) Subtracting the sample’s pre-invasion mortality from post-invasion mortality (and multiplying by the estimated population of Iraq and the duration of the study),
    (4) Repeating (1)-(3) X number of times.

    Or did you take the average pre- and post-invasion CMR of clusters for step (2)? I’m confused both as to what was done and how it should be done.
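    For concreteness, steps (1)-(4) could be sketched as below. This is only an illustration under stated assumptions, not the code anyone in this thread actually ran: the cluster records are invented, the `population` and `months` values are placeholders, the rate in step (2) is the pooled one (total deaths over total person-months in the resample) rather than an average of per-cluster rates, and the interval is a plain percentile interval rather than BCa.

```python
import random

# Hypothetical cluster summaries: deaths and person-months of exposure.
# These values are made up for illustration; they are NOT the survey data.
CLUSTERS = [
    {"pre_deaths": 1, "pre_pm": 3000, "post_deaths": 3, "post_pm": 3600},
    {"pre_deaths": 2, "pre_pm": 3100, "post_deaths": 2, "post_pm": 3700},
    {"pre_deaths": 0, "pre_pm": 2900, "post_deaths": 4, "post_pm": 3500},
    {"pre_deaths": 3, "pre_pm": 3200, "post_deaths": 1, "post_pm": 3800},
    {"pre_deaths": 1, "pre_pm": 3000, "post_deaths": 5, "post_pm": 3600},
    {"pre_deaths": 2, "pre_pm": 3050, "post_deaths": 3, "post_pm": 3650},
]

def pooled_cmr(sample, period):
    """Crude mortality rate: total deaths / total person-months."""
    deaths = sum(c[f"{period}_deaths"] for c in sample)
    exposure = sum(c[f"{period}_pm"] for c in sample)
    return deaths / exposure

def bootstrap_excess(clusters, population, months, n_reps=2000, seed=1):
    """Return sorted bootstrap replicates of estimated excess deaths."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_reps):                                   # step (4)
        sample = rng.choices(clusters, k=len(clusters))       # step (1)
        pre = pooled_cmr(sample, "pre")                       # step (2)
        post = pooled_cmr(sample, "post")
        estimates.append((post - pre) * population * months)  # step (3)
    return sorted(estimates)

def percentile_interval(sorted_estimates, level=0.95):
    """Simple percentile interval from sorted replicates (not BCa)."""
    n = len(sorted_estimates)
    tail = (1 - level) / 2
    return sorted_estimates[int(tail * n)], sorted_estimates[int((1 - tail) * n) - 1]

# population and months are placeholder values for the sketch.
estimates = bootstrap_excess(CLUSTERS, population=24_000_000, months=18)
low, high = percentile_interval(estimates)
print(f"95% percentile interval: {low:,.0f} to {high:,.0f}")
```

    Swapping `pooled_cmr` for a function that averages per-cluster rates would give the other variant asked about; the two differ when clusters have unequal exposure.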

  64. #64 Robert
    January 4, 2006

    Seixon admitted:

    I’m not experienced enough, nor do I have the software they used, to re-create their CI and finding.

    Must. Resist.

    Sigh. Okay, first off, it’s always good to know how to do exact calculations with the proper software. That’s always the best way. However, let’s not lose sight of the forest: the CI is (mostly) a function of the sample size, and there’s nothing we can do about that. Let’s just focus on plausibility bounds on the central estimate. For that, we can come reasonably close without too much calculation at all.

    Without Falluja, the central estimate of excess mortality was 98000. This was based on 43 more deaths in the post-invasion period than in the pre-invasion period, so a rough way of thinking about magnitudes is that one extra death in a cluster translates to something like 2000 or 2500 final estimated excess deaths nationwide. We’ve known this since the Lancet article came out; this isn’t anything new. What’s new is that now we have the cluster-level death counts.
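    The rule of thumb is just a division. As a quick check using the two figures quoted above (98000 excess deaths, 43 net extra observed deaths), and assuming the final estimate scales roughly linearly with the observed count:

```python
central_estimate = 98_000   # excess deaths nationwide, Falluja excluded
net_extra_deaths = 43       # post-invasion minus pre-invasion observed deaths

# Rough nationwide excess deaths represented by one extra observed death.
per_observed_death = central_estimate / net_extra_deaths

def shift_in_estimate(extra_observed):
    """Approximate change in the national estimate (linear scaling assumed)."""
    return extra_observed * per_observed_death

print(round(per_observed_death))    # about 2279
print(round(shift_in_estimate(2)))  # about 4558
```

    That lands inside the “2000 or 2500” range; the linear scaling is only a back-of-envelope assumption, since the real estimate also depends on exposure and population weights.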

    You’ve been saying that the pairing of governates was a problem, but it would be a problem to the extent that a cluster from an unsampled governate would have a different observed number of deaths than the one that “took its place”. So, take a look at Roberts’ cluster-level data and focus on the clusters from paired governates. Even if you think that clusters from less violent governates should have a lower number of observed deaths than clusters from more violent governates (and that’s an ecological fallacy — but let’s overlook that for the moment) the fact is that there aren’t a lot of deaths in any single one of the paired clusters.

    Thus, even if you believed there was something fishy about pairing governates, over the six pairings we’re probably talking about no more than a couple of observed deaths in either direction (it’d be cherry-picking your hypotheticals to think that every pairing went in the same direction). A couple of observed deaths translates to something like 4 or 5 thousand in either direction in the final estimate of excess mortality.

    Which, if you think about it, may be kind of a coincidence. You’ve probably spent 4 or 5 thousand words talking about paired governates.

    Seixon, this has been a nice Christmas interlude, but break’s over and I have to get back to doing the things that keep me in perpetual penury. If I fail to respond, please don’t take it as a sign that I’ve suddenly started to agree.

  65. #65 Robert
    January 4, 2006

    nik wondered:

    Can you spell out exactly how you made your bootstrap estimate of excess mortality.

    Caught me just as I was figuratively walking out the door.

    I bootstrapped the clusters (Brendan showed that the random effects model wasn’t shabby), and recalculated the ratio of post-to-pre-invasion crude mortality based on the observed cluster deaths and the person-months of exposure. I could’ve calculated excess mortality for each governate and then added them up, but with this level of uncertainty it wasn’t worth it. It was simpler, and probably no less accurate, to calculate the global estimate. I knew Roberts’ central estimate of excess mortality was 98000 and that corresponded to a post-to-pre ratio of 1.5something, so I scaled everything to that.

  66. #66 Seixon
    January 4, 2006

    Sigh. Okay, first off, it’s always good to know how to do exact calculations with the proper software. That’s always the best way. However, let’s not lose sight of the forest: the CI is (mostly) a function of the sample size, and there’s nothing we can do about that.

    Of course, but as I said, I don’t have any real expertise on calculating this when it comes to cluster sampling. Lazy, I know, but it’s been Xmas break. :)

    A couple of observed deaths translates to something like 4 or 5 thousand in either direction in the final estimate of excess mortality.

    2 deaths in Sulaymaniyah, 8 in Ninawa, 8 in Missan, 11 in Dhi Qar, 5 in Karbala, and 5 in Salah ad Din. From post-invasion.

    You apply the 8 deaths in Ninawa as the rate for Dehuk, the 11 in Dhi Qar for Qadisiyah… You sure that this would not be having any effect? Are we talking about “a couple” observed deaths?

    If we’re talking about the resultant excess deaths, sure, but what about the total numbers of post-invasion deaths?

    In other words, Dehuk is essentially given the death rate of Ninawa, which had 11 deaths post-invasion, when those two provinces are not similar at all. In fact, that is by far the worst pairing out of all.

    To put it this way, if Dehuk had been the one chosen, and not Ninawa, that 11 post-invasion death number would have become much, much less, most likely 1-2, maybe even zero.

    That wouldn’t make a difference?

  67. #67 Seixon
    January 4, 2006

    Ooops, I mixed up Ninawa with Dhi Qar in that last bit, without altering the point of course.

  68. #68 Seixon
    January 4, 2006

    Let me try to emphasize this with the Dhi Qar – Qadisiyah pairing. Dhi Qar was the chosen province of the two.

    Pre-invasion: 3 deaths
    Post-invasion: 11 deaths
    Excess: 8 deaths

    Now, taking a look at the IBC data, Dhi Qar has a death rate of 660 per million, while Qadisiyah has a death rate of 70 per million. (If Lambert & Co want to complain about reporting bias, none of the cities visited by the Lancet survey are found on this list of Iraq’s major cities. Which means that any reporting bias between the two will be minimal at best…) Just for sake of argument, we will round Dhi Qar down to 600, and Qadisiyah up to 100, for a difference factor of 6.

    If we apply this to the numbers from the survey, we’d expect about 2 post-invasion deaths in Qadisiyah with 0-2 pre-invasion deaths.

    In other words, Dhi Qar gave them 8 excess deaths. If Qadisiyah had been chosen, it’s very likely that the number would have been less than 2.

    That’s just in this one pairing. In a survey finding based off 43 excess deaths, I dare say that having 8 instead of 0-2 in this one pairing would have its effects.

    Please show me the error of my ways.

  69. #69 Robert
    January 5, 2006

    Seixon challenged:

    Please show me the error of my ways.

    Dude, despite my better judgement, I clicked on this during my lunch break and you’ve suckered me back in.

    First: Ouch. You don’t get to replace all three of the Dhi Qar clusters. Only one got moved from Qadisiyah so you only get to move one back.

    Now I really have to get back to work.

    Second: You’ve got to do the same for all six of the pairings. You can’t cherry-pick Dhi Qar.

    Third: Remember the warning I gave you about the ecological fallacy? Notice that even in the Dhi Qar governate one of the clusters had lower post-invasion rates? That’s how variability works. If cherry-picking were allowed (and it ain’t) someone else could’ve cherry-picked the zero post-invasion rate cluster and replaced that one, which would’ve raised the post-invasion rate. That wouldn’t be fair, so what you were trying to pull ain’t fair, either.

    Fourth: The IBC numbers are a red herring. They’re numerators, while Roberts got both numerators (the deaths) and the denominators (the person-months of exposure).

    Fifth: The IBC numbers are numerators for what they figure to be deaths related to war violence, while Roberts gives the numerators and denominators for all causes of death. You wrote: “Dhi Qar has a death rate of 660 per million, while Qadisiyah has a death rate of 70 per million…for a difference factor of [6]” Let’s ignore your misuse of the word “rate”; more importantly, those numbers mean you’re estimating a death rate for Dhi Qar of .66 per thousand and .07 per thousand for Qadisiyah. Look up death rates–nationwide, pre-invasion, Iraq’s death rate was estimated to be about 5 per thousand. The IBC numbers aren’t all-cause death rates. If you add in the baseline all-cause death rate of around 5 per thousand, the difference between Dhi Qar and Qadisiyah ain’t a factor of 6 or a factor of 9; it’s closer to something like 5.7 to 5.1, which my lightning fast calculations that are accurate to a hair say is, um, carry the two, errr, 1.12 or 1.13. I don’t know where you live but in the twisted little world I inhabit a factor of 1.13 isn’t quite up to a factor of 6.
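    The arithmetic in the fifth point can be checked directly. The sketch below takes the per-million figures quoted in the thread, converts them to per-thousand, and adds the roughly 5 per thousand baseline mentioned above; treating the IBC-derived figures as simply additive to the baseline is the simplifying assumption here.

```python
PER_MILLION_TO_PER_THOUSAND = 1 / 1000

dhi_qar_violent = 660 * PER_MILLION_TO_PER_THOUSAND    # 0.66 per thousand
qadisiyah_violent = 70 * PER_MILLION_TO_PER_THOUSAND   # 0.07 per thousand
baseline_all_cause = 5.0                               # per thousand, approximate

# Ratio of the violence-only figures (what the "factor of 6 or 9" refers to).
violent_ratio = dhi_qar_violent / qadisiyah_violent

# Ratio once the baseline all-cause rate is included in both governates.
all_cause_ratio = (baseline_all_cause + dhi_qar_violent) / (
    baseline_all_cause + qadisiyah_violent
)

print(f"violence-only ratio: {violent_ratio:.1f}")   # about 9.4
print(f"all-cause ratio:     {all_cause_ratio:.2f}") # about 1.12
```

    Rounding the two rates to 5.7 and 5.1 first, as in the comment, gives essentially the same ratio of about 1.12.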

  70. #70 Seixon
    January 5, 2006

    First: Ouch. You don’t get to replace all three of the Dhi Qar clusters. Only one got moved from Qadisiyah so you only get to move one back.

    Not if we are following the Lancet methodology… Also, the 8 excess deaths for 3 clusters rate is now applied to the population of Qadisiyah.

    Second: You’ve got to do the same for all six of the pairings. You can’t cherry-pick Dhi Qar.

    True, as I said, I was only showing one example. My only point was to show that their methodology seriously screwed things up by doing it this way.

    Third: Remember the warning I gave you about the ecological fallacy? Notice that even in the Dhi Qar governate one of the clusters had lower post-invasion rates? That’s how variability works. If cherry-picking were allowed (and it ain’t) someone else could’ve cherry-picked the zero post-invasion rate cluster and replaced that one, which would’ve raised the post-invasion rate. That wouldn’t be fair, so what you were trying to pull ain’t fair, either.

    I was making an example out of that pairing, not cherry-picking. I didn’t attempt to pluck out certain clusters. The point I was trying to make, which you are trying to evade, is that the pairings were fraudulent and that this would have had an effect on their results.

    Fourth: The IBC numbers are a red herring. They’re numerators, while Roberts got both numerators (the deaths) and the denominators (the person-months of exposure).

    Red herring in terms of what? So suddenly the IBC data doesn’t show that Dhi Qar and Qadisiyah are wildly different in mortality?? The Roberts data is a sample of 33 clusters, the IBC data is a full tally of all reported deaths in Iraq. The IBC data has around 25,000 reported deaths. The conclusions in the Lancet study revolve around 43 reported excess deaths, and less than 200 reported deaths in total.

    Red herring?

    Fifth: The IBC numbers are numerators for what they figure to be deaths related to war violence, while Roberts gives the numerators and denominators for all causes of death.

    True. I wasn’t comparing the IBC data to the Roberts data. So what’s your point?

    I don’t know where you live but in the twisted little world I inhabit a factor of 1.13 isn’t quite up to a factor of 6.

    Well, sir, Roberts paired governates in the belief that violence was similar. The IBC data gives an indication of violence, and that indicator shows that most of the pairings are not similar. Forgive me for comparing the governates along the basis of which Roberts paired them up. Sigh.
