IBC vs Les Roberts

Stephen Soldz posts an exchange of letters between the IBC’s John Sloboda and Les Roberts. Sloboda accused Roberts of spreading misinformation about a NEJM study. Roberts said:

In a very prestigious journal called the New England Journal of Medicine there was an article published on 1 July 2004. Military doctors interviewed soldiers returning from Iraq.

They interviewed them because they were interested in post-traumatic stress disorder, so they asked the soldiers about stressful things that might have happened to them.

Among other things they found that 14 percent of the ground forces in the army had killed a non-combatant and 28 percent of returning Marines had killed a non-combatant.

If you work through the numbers you come up with a figure pretty darn close to our estimate in the Lancet.

Daniel Davies worked through the numbers as well:

Quite extraordinary. He refers to this article in the New England Journal of Medicine, which finds (Table 2) that in a survey of 894 US Army soldiers, 116 of them (out of 861 who responded to the survey question) regarded themselves as having been personally responsible for the death of a noncombatant. That’s 13.47% (I don’t know why the NEJM rounds it to 14% and suspect someone has made a transcription error).

I think that the most sensible way to extrapolate from this (which is not to say that this is a legitimate calculation; call it the least bad way to create a number) is to say that, given that it was an eight month tour of duty, we got 116 noncombatant deaths in about 215000 troop-days. There were 250,000 US and 45,000 British troops (plus other coalition forces) in the initial assault on Iraq and about 130k US and 20K coalition troops by December. I’m guessing that this gives us 3 months of 300k troops and 5 months of 150k troops. That would be roughly 50m troop-days in the eight months of the tour of duty of the troops surveyed.

50,000,000 x (116/215,000) = about 27,000 civilian deaths. Note that UK troops would have seen fewer noncombatant deaths per troop-day, but units like the 815 US Marines surveyed saw twice the rate of the regular Army units — also, I am not allowing for the fact that some soldiers might have been responsible for multiple noncombatant deaths.

This is really quite consistent with the Lancet study; if you crudely scale it up from eight months to eighteen you get 60,000 deaths, which is significantly more than the Lancet team would have attributed to coalition troops.
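Davies's arithmetic is easy to check. The lines below are only a sketch of the calculation as quoted: the 30-day month and the variable names are assumptions made here for illustration, and the troop levels are Davies's own guesses rather than official figures.

```python
# Reproduce the back-of-envelope extrapolation quoted above.
# Assumption (not in the quote): every month is treated as 30 days.
DAYS_PER_MONTH = 30

surveyed_soldiers = 894      # Army soldiers surveyed (NEJM Table 2)
reported_responsible = 116   # reported being responsible for a noncombatant death
tour_months = 8

# Troop-days covered by the surveyed sample: about 215,000.
sample_troop_days = surveyed_soldiers * tour_months * DAYS_PER_MONTH

# Davies's guess: 3 months at 300k troops, then 5 months at 150k
# (roughly 50 million troop-days in total).
total_troop_days = (300_000 * 3 + 150_000 * 5) * DAYS_PER_MONTH

deaths_8_months = total_troop_days * reported_responsible / sample_troop_days
deaths_per_day = deaths_8_months / (tour_months * DAYS_PER_MONTH)
deaths_18_months = deaths_8_months * 18 / tour_months

print(f"eight months:    {deaths_8_months:,.0f}")
print(f"per day:         {deaths_per_day:.0f}")
print(f"eighteen months: {deaths_18_months:,.0f}")
```

With unrounded inputs the eight-month figure comes out slightly under 27,000 and the eighteen-month figure close to 60,000, in line with the rounded numbers in the quote.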

That’s about 110 deaths per day. In a later article Roberts came up with a similar number:

A report in the New England Journal of Medicine in July 2004, based on interviews with returning U.S. soldiers, suggests an unintentional non-combatant death toll of 133 deaths per day.

Sloboda writes:

We recently contacted the first author of the NEJM paper, Dr Charles
Hoge, who replied as follows:

“In no way can our data be used to estimate civilian deaths. We ask two
questions related to killing of enemy combatants and civilians on
our anonymous questionnaire that we ask U.S. soldiers, but neither can be
used to estimate casualty rates. We ask if at any time in the deployment
the soldier perceived that he was responsible for the death of an enemy
combatant and another similar question pertaining to the death of
civilians. Since all members of a team may in some way be responsible
during a combat operation these questions can in no way be used to
estimate actual civilian casualty numbers.” (email to John Sloboda,
dated 8th May 2006)

In summary, you have published a claim, on the basis of the Hoge et al
paper, which the lead author of that paper says is unsustainable (just
as we had independently argued).

There are two matters of serious concern:

  1. You have misused the authority of the New England Journal of Medicine
    and the authors of this July 2004 paper to promote a claim which has no
    basis in that study and which is explicitly rejected by the authors of
    that study.

  2. The supposed 133 per-day-rate of civilian deaths is one of several
    “estimates” used by you and many of your readers to make unwarranted
    claims about the relative value of different studies of Iraqi mortality,
    and the likely overall death toll. Your use of this figure, and the use
    made of it by others, has thus helped to spread confusion and
    misinformation on a matter which is of the utmost gravity, and where
    therefore the highest standards of rigour and professionalism are needed
    from those claiming academic expertise and authority.

Now Sloboda does have one point here. Roberts cites the NEJM as the source for the 133 figure when it’s not contained in that paper but calculated from it. The cite should be to a document where the details of the calculation are given. But Sloboda’s other points are wrong. Clearly the NEJM paper can be used to construct an estimate, regardless of what Hoge thinks. And before the IBC accuses Roberts of spreading confusion, they might want to put their own house in order.

Roberts explains how the 133 number was calculated in his reply, but this bit is the most interesting:

Finally, we measured the sensitivity of your surveillance system during
the first 18 months, we found it was <5%. This is what we generously
referred to in the HPN paper as “cannot be more than 20% complete.” The
Falluja deaths in our data set were recorded by month and in IBC Falluja
deaths were not as distinct as elsewhere so that we could not match
them. However, among the other 21 violent deaths we encountered in our
random sample of 988 households, one was in the IBC data set. Thus,
unless you have evaluated the sensitivity of your system from some
independent data source, I hope you will temper the statements you make
about the complete nature of the IBC dataset.

5% completeness is the norm of newspaper reporting in times of war. (See
Patrick Ball’s work in Guatemala online with the AAAS) I suspect and
hope that the sensitivity has increased over time as systems have
improved and the role of major battles with airpower has diminished.
But, the speculation in the press that the real number might only be
twice the IBC tally is preposterous.

Last October 11th, I was invited at 2 hours notice before a flight left
to appear on the BBC program Newsnight with Jack Straw. I called you at
that time, hoping to hear that IBC had calibrated the system and to give
you the chance to defend the IBC sensitivity before saying on-air that I
had found it to be ~5% complete. Because we did not speak, I did not
then, and up until now, report our evaluation of your sensitivity in
public. I thought I was doing you a favor by calling.

5) As for your “Speculation is no substitute” paper, I discussed it with
some of my coauthors when it arrived. We decided that it was so devoid
of credibility, and so laden with self-interest rather than the interest
of the Iraqis, it did not merit a response.

The “Speculation is no substitute” paper contains serious errors which the IBC refuse to correct.
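One match out of 21 is a small sample, so the completeness figure implied by Roberts's reply is uncertain. As a rough illustration, the sketch below computes the point estimate and an exact (Clopper-Pearson) 95% binomial interval using only the standard library. It assumes the 21 deaths behave like independent draws, which a cluster sample does not guarantee, and the function names are made up for this example.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def bisect(f, lo, hi, tol=1e-10):
    """Root of a monotone function f that changes sign on [lo, hi]."""
    lo_is_positive = f(lo) > 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (f(mid) > 0) == lo_is_positive:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided confidence interval for a binomial proportion."""
    # Lower limit: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else bisect(
        lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2, 0.0, 1.0)
    # Upper limit: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else bisect(
        lambda p: binom_cdf(k, n, p) - alpha / 2, 0.0, 1.0)
    return lower, upper

matched, sampled = 1, 21   # deaths found in the IBC data set / deaths checked
low, high = clopper_pearson(matched, sampled)
print(f"point estimate: {matched / sampled:.1%}")
print(f"95% interval:   {low:.1%} to {high:.1%}")
```

The point estimate is a little under 5%, but the interval is wide, with an upper end a little under 24%, so this sample on its own cannot pin the completeness down precisely.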

Comments

  1. #1 Donald Johnson
    June 20, 2006

    Well, couldn’t it also be the case that 10 soldiers open up on one civilian vehicle and kill, say, 2 civilians? If all 10 soldiers are honest with themselves (soldiers who lie to themselves will result in an undercount), they will all admit to being responsible for a civilian death, but the true ratio in this case would be 1 civilian for every 5 soldiers.

    To me, the only solid number you can get from the NEJM survey is the number of soldiers and marines who’ve caused the death of civilians. Apparently it’s pretty darn high and it suggests there is a lot of killing going on (much or most of which doesn’t make it into the press, I think), but if one insists on getting a death toll out of it you’d have to make varying assumptions about the number of civilians per guilt-ridden soldier and the number could be less than 1.

  2. #2 Kevin Donoghue
    June 20, 2006

    Certainly the fact that IBC picked up only one out of 21 deaths is the most interesting point here. It is very hard to reconcile that with their claims regarding coverage.

    At this stage the Lancet data is too old to be of much interest in any case. Reports are scrappy, but AFAICT the general situation is actually worse now than it was in the post-war period.

  3. #3 Ken
    June 20, 2006

    This debate would not be happening if there were a reliable count of casualties in Iraq. It seems to me that such information is important both to the conduct of current military/police operations and to assessing the success or otherwise of methods employed with respect to the stated (and unstated) aims of the occupation of Iraq. Clearly the civil authorities there are unable to keep accurate records of what ought to be the fairly basic and fundamental process of recording deaths and correctly identifying causes (or of keeping records of relevant information for future review where there is confusion or doubt about precise circumstances). It can only be viewed as a profound failure to get civil authorities up and working within Iraq. As for the record keeping within the military hierarchy, if they are failing to keep such records, they are doing a grave disservice to their own, by failing to have accurate knowledge at hand for both present and future military planners. Neither overestimating nor underestimating can have any good consequence whilst misrepresentations will lead to longer term consequences in the form of entrenched distrust. It won’t win the hearts and minds of Iraqis to have real casualties dismissed as non-existent.

  4. #4 joshd
    June 20, 2006

    “Certainly the fact that IBC picked up only one out of 21 deaths is the most interesting point here.”

    I’d just like to point out here that nobody at IBC sees any reason to believe this claim. Roberts has been extremely careless in examining the IBC database, has repeatedly got even basic IBC figures wrong (and continues to do so to this day). I see no reason to believe he did any valid examination of our database.

    Furthermore, I highly doubt it’s even possible for him to make the claim. We have too many cumulative reports for him to be able to know if we caught any particular death or not.

    If Roberts would like to provide any proof of this to us, he can send it along and we’ll look at it. But we’ve long learned not to trust any assertions about us (or much of anything else) made by Roberts.

    “It is very hard to reconcile that with their claims regarding coverage.”

    IOW, it’s very hard to reconcile with ILCS. Yep, extremely hard in fact. Roberts claims this derives a “5%” coverage rate for IBC. Which would mean ILCS missed the true figure by a factor of five. I have no idea if Roberts even knows his ‘argument’ is implying this or not. He may still have no idea where we got that coverage rate from.

    Even if his claim were true, the assumption to draw from it would be quite different than what he suggests. The most plausible assumption would not be that IBC was off by 10X (or that ILCS was off by 5X) but that his Lancet sample was an unrepresentative sample of Iraq, perhaps helping to explain the discrepancy we find between the two studies.

    And as far as Tim Lambert’s claimed “serious errors” in our paper, I believe I more than ably refuted these claims of “errors” in the link Tim provides, while also correctly pointing out several of Tim’s own errors, only one of which he’s ever decided to own up to.

  5. #5 Lee
    June 20, 2006

    josh, are you telling us the IBC data is so poorly documented and organized that it isn’t possible to examine it comprehensively?

    BTW, if you believe you got more than one of those deaths, why don’t you document that?

  6. #6 joshd
    June 20, 2006

    “josh, are you telling us the IBC data is so poorly documented and organized that it isn’t possible to examine it comprehensively?”

    No. I’m telling you that I have no reason to believe Roberts actually did examine it comprehensively. And even if he did, I doubt he could tell whether we got each of these deaths or not, or how many of them, because we include lots of cumulative reports such as morgue figures, hospital tallies etc., where you can’t break down to individual deaths and exact locations or dates of each, and so forth.

    “BTW, if you believe you got more than one of those deaths, why don’t you document that?”

    How do you propose we do that Lee?

    As I said, I don’t think it could even be done.

    Regardless, Roberts has so far only asserted it, and given the track record of his assertions about our coverage (he continues to get basic facts wrong even now), his latest ones carry no weight with us. He’d have to do more than assert something before it would even merit consideration.

  7. #7 Donald Johnson
    June 21, 2006

    Some disjointed thoughts–I’m too sleepy to try to make them flow together better.

    I was thinking about the 1 out of 21 figure and I’d like to know more about the basis of the assertion myself. I could imagine how the Lancet team might be able to demonstrate that a death wasn’t reported correctly in IBC’s files– for instance, if they know that such and such a person was killed in such and such a town by a US military action in a given month, there are a great many months where such events seem so rare in IBC’s files it should be easy to look through every single one and see that it isn’t there. But they might show up under the wrong category–the massacre victims of Haditha might be partly accounted for in IBC’s original database, since some were reported as the victims of insurgents, while others wouldn’t make it under the original story since they were “insurgents” and not civilians. This kind of mis-identification and miscount was ubiquitous in the Vietnam War, but I guess media coverage in Iraq and levels of governmental honesty (both US and Iraqi) are so far superior to what happened in Vietnam there’s no reason to even suspect it could be happening again on a large scale. Sarcasm off. The simplest way to show a victim wasn’t in IBC’s files would be to ask the family if the victim was buried immediately. But you could look for the victim under one category and never realize the death was recorded as something else.

    I would think that the Lancet survey’s 7 murder victims might show up in morgue data (or whatever sort of morgue data is allowed into the press), unless they were simply buried first rather than taken to the morgue. (I couldn’t quite make out what Roberts was saying about the morgue data, btw, though the gist of it seemed to be that it showed IBC’s numbers are too small.)

    I don’t feel the need to give any great credence to IBC’s own statistics, but the 1 in 21 coverage rate is contradicted by the Lancet’s own estimate: if the midrange violent death toll estimate is about 60,000 in Sept 2004, and if most were civilians, then IBC’s coverage would be 25 percent. So if it’s accurate it shows IBC is undercounting, but by some statistical fluke it is overstating the problem. Roberts appears to think the Fallujah outlier shows that the true death toll is much higher, but it’s probably more sensible to treat Fallujah as its own unique statistical problem in need of a more detailed survey.

    The claim that in prewar Iraq the official death count was three times too low is interesting. If they counted deaths that poorly before the war, why would counts improve during a war?

  8. #8 Sortition
    June 21, 2006

    Statistical note: Due to the small sample size (1 pos, 20 negative), the CI for the completeness parameter is quite wide. I get a 95% CI of about (0.2%,23.8%).

    It is therefore an error to base arguments on the assumption that the 5% figure is a better estimate than, say, 10% (p-value ~70%) or 15% (p-value ~30%).

    joshd: Why don’t you work with the ILCS people to see what percentage of their sample you caught?

  9. #9 Urinated State of America
    June 21, 2006

    I think there are a couple of problems with Daniel’s extrapolations: 1) multiple soldiers may have been involved in, or witnesses to, an incident where a noncombatant was killed and hence feel responsible, so there’s a lot of double-counting; 2) the 13.5% applies to combat infantry troops alone, not to the support troops. So Daniel’s extrapolations, if corrected, would be closer to the IBC numbers.

  10. #10 Donald Johnson
    June 21, 2006

    I made that double-counting criticism too, URA, as did Sloboda, but you’re going too far with it. We don’t know what the average number of dead civilians per confessing soldier is, so it doesn’t back up IBC’s number either.

  11. #11 ajay
    June 21, 2006

    Of course, it’s also possible that soldiers could feel responsible for more than one civilian death.
    Good catch on the troops/combat troops thing. For example, there were 45,000 UK troops involved in the invasion of Iraq. But the only combat formations were 3 Cdo Bde, 16 Air Assault Bde and 7 Armd Bde – a total of about 10,000 combat troops. The study itself picked its subjects from two airborne and one mech brigade. I don’t know if the teeth-to-tail ratio is the same in the US armed forces, but if it is, that would cut the number of deaths expected by a factor of 4-5…

  12. #12 joshd
    June 21, 2006

    “I was thinking about the 1 out of 21 figure and I’d like to know more about the basis of the assertion myself. I could imagine how the Lancet team might be able to demonstrate that a death wasn’t reported correctly in IBC’s files– for instance, if they know that such and such a person was killed in such and such a town by a US military action in a given month, there are a great many months where such events seem so rare in IBC’s files it should be easy to look through every single one and see that it isn’t there.”

    But that isn’t easy Donald. As I said before, we have many cumulative incidents from hospitals and morgues and such. You couldn’t say whether or not the victim was counted in those. And since Roberts claims that 80% of the deaths he recorded had death certificates, it seems clear that most of them went through official sources somehow.

    “The claim that in prewar Iraq the official death count was three times too low is interesting. If they counted deaths that poorly before the war, why would counts improve during a war?”

    That too is more smoke and mirrors from Roberts. That refers, supposedly, only to a project of the “central government”, not all official sources. Also, it’s based on assuming an estimate of how many were dying in Iraq before. If the accounting of deaths by official sources is so low, Roberts needs to explain why 80% of deaths had death certificates. Who was issuing them?

    It’s high time people stopped simply believing every assertion that tumbles out of the mouth of king roberts.

  13. #13 z
    June 21, 2006

    Here even in civilized and highly bean-counted Connecticut, you can’t get an accurate death toll by reading the newspapers and querying the hospitals. You can’t even get a **completely** accurate account by querying the official state mortality register. Which the state has to update for three years after the original reports before the rate of change is low enough to be considered “final”. I know this, not only from having been paid to know it once upon a time, but also from having been in the company of those who were not only paid to know it but are also much wiser than I. So I feel it’s fairly safe to say that if the “direct count” method of death rate calculation has a significant slop factor in Connecticut (of all places), it’s probably not super super accurate in Baghdad right now.

  14. #14 joshd
    June 21, 2006

    Oh, and Tim:

    “Daniel Davies worked through the numbers as well…That’s about 110 deaths per day. In a later article Roberts came up with a similar number (133)”

    “Clearly the NEJM paper can be used to construct an estimate, regardless of what Hoge thinks.”

    Being a computer scientist, expert statistician and CI expert Tim, would you care to construct a CI for these estimates?

    We’d _love_ to see them.

    Perhaps when you begin to try you’ll understand why Hoge says what he does.

  15. #15 John Quiggin
    June 22, 2006

    Off the top of my head, given a sample size of about 900, we get a standard deviation of 30 (that is, a standard error a bit over 3 per cent) for a normal distribution. Non-sampling errors like those mentioned above seem more likely to be a problem.

  16. #16 Tim Lambert
    June 22, 2006

    95% CI for 116/861 is 11% to 16%. Applying this to Davies’ 110 yields a CI of 90-130. Of course, I’m ignoring the uncertainties in the factors that got us from 13.5% to 110, but that’s what IBC do in their defence, so I’m sure you have no objection to this, Josh.
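One way to arrive at an interval like the one above is a normal approximation to the binomial. That choice is an assumption here, since the comment does not say which method was used, and the sketch reflects sampling error in the 116/861 proportion only, exactly as the comment says.

```python
from math import sqrt

responsible, answered = 116, 861

# Sample proportion and its standard error (normal approximation).
p = responsible / answered
se = sqrt(p * (1 - p) / answered)

# 95% interval for the proportion.
lo_p, hi_p = p - 1.96 * se, p + 1.96 * se

# Scale the 110-per-day figure by the same relative uncertainty.
# Every other step in the extrapolation is treated as exact here.
per_day = 110
lo_day, hi_day = per_day * lo_p / p, per_day * hi_p / p

print(f"proportion: {p:.1%} ({lo_p:.0%} to {hi_p:.0%})")
print(f"per day:    {lo_day:.0f} to {hi_day:.0f}")
```

Rounded to the nearest ten, the per-day range is 90 to 130.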

  17. #17 joshd
    June 22, 2006

    Entirely different context Tim. We applied a known death rate found in the study you endorse more highly than any other, to make the correction – the one found in Lancet. That is the only assumption, and if Lancet data is sound, as you believe, then it’s a very, very small assumption. As I said before, I’ve already defended that assumption more than adequately in the link you provide, as the point was not to illustrate a CI of the correction anyway, but rather, one the study being illustrated would actually produce.

    That is nothing like this NEJM-imputed estimate which Roberts refuses to correctly cite Tim, or the assumptions necessary to give it a CI. There is *no* data about death rates in the NEJM data at all. A death rate must be assumed altogether based on no data, based on blind guessing about the implications of soldiers’ “feelings” of “responsibility”.

  18. #18 Donald Johnson
    June 22, 2006

    What’s exasperating about this debate (not this thread, but in general) is that there’s a perfectly reasonable guesstimate that everyone should be able to see is plausible–take IBC’s civilian death toll and double it. (Ignore Fallujah until such time as a detailed study can ever be made). It’s a little on the low side for the Lancet team, but not so low that it makes their data seem like a bizarre statistical fluke (which I think is the case if we took the Sept 2004 IBC numbers at face value). It fits nicely with ILCS. I think, btw, that surveys will undercount insurgent deaths–if your family actively supports the insurgents are you going to open up about war deaths in the family if a stranger knocks on your door and asks about it? I wouldn’t. Maybe some would, despite having lived under a totalitarian regime for decades, followed by a foreign occupation that arrests innocent people by the thousands. But if insurgent-supporting families are that trusting, the Pentagon should send out survey teams, find out who admits to wartime deaths, and send a patrol to come visiting that night.

    On the one out of three business, Josh, I don’t think it’s smoke and mirrors. It’s one thing to hand out a death certificate and another to expect the government to tally them up and report the numbers accurately either before or after the invasion. I know I’ve read stories about how the government pressures morgue officials to lie about the death toll. There was a UN official (John Pace, I think) who said this was happening. Anyway, doesn’t the Ministry of Health put out numbers which are lower than yours? If Sloboda is willing to say that the true death toll is a factor of two higher than IBC’s count, then he is implying something close to what Roberts is saying.

  19. #19 David Kane
    June 23, 2006

    Just stopped by to beat my favorite dead horse again.

    It would be possible for IBC (and the rest of us) to independently determine how well/much the Lancet data overlaps with their own records if the Lancet team were to release their data. (They could release this data slightly sanitized to protect individual identities.) But Roberts et al refuse to release the data.

    In my experience, scientists who refuse to release the underlying data (without a very good reason) are not to be believed. Your mileage may vary.

  20. #20 joshd
    June 24, 2006

    “On the one out of three business, Josh, I don’t think it’s smoke and mirrors.”

    The point is Donald, it’s an estimate of a subset of what was recorded by Iraqi officialdom generally, and based on statistics of vague origin. There’s just nothing there to draw any particular implication about how many deaths have been reported, or about IBC.

  21. #21 Robert
    June 24, 2006

    David Kane wrote:

    In my experience, scientists who refuse to release the underlying data (without a very good reason) are not to be believed.

    David, I know just what you mean, and how disappointing it must be to beat dead horses. Speaking of which, I’ve asked you a couple of times now to tell us how many papers you’ve reviewed for journals, whether those papers supplied you with all the data you needed to audit them, and whether you did, in fact, reproduce every calculation and figure during your review.

  22. #22 David Kane
    June 25, 2006

    Robert,

    Apologies for missing your previous questions about my reviewing experience. I have never done peer review for an academic journal, or at least not in a decade; my memories of graduate school are fading fast. But what does my personal experience have to do with the issue at hand?

    If you don’t know that data-sharing and replication are an important and growing part of academic research, then you should read about it instead of asking ad hominem questions.

  23. #23 Robert
    June 25, 2006

    David:

    Thanks for answering my question. You seemed to insist that the review that Roberts et al. received from Lancet was unusual, and that the Lancet reviewers (or by implication, those for other journals) do replicate every table and figure in a paper. I was interested in your personal experience because it was so at odds with my own personal experience. Is it ad hominem to point out that your view of peer review is wrong? If you find an error in the paper itself that seems like a reasonable criticism; but criticizing the paper as inadequate because reviewers were not provided with every data element needed to replicate the tables and figures seems unreasonable since I know of no paper reviewer who is regularly provided that sort of information (and I appear to have quite a bit more experience in paper review than you).

    As long as you’re answering my questions, you’d dismissed one of my questions earlier as “trollish” which I found quite puzzling. If you recall, you claimed that Roberts did not answer your request to release all of their data and methods, and that this was sufficient grounds to distrust their work. So now I repeat my question (which I promise is not intended to be trollish at all): do you meet the same standard in sharing all your data and methods with your clients? If not, how can you expect them to trust you? Or, if you do expect them to trust you, why is that not inconsistent with the standard you are holding Roberts to?

    I’ve already thanked you once for sharing the Excel spreadsheet that Roberts sent you. I sincerely thank you again for that. It allowed me to replicate both Roberts’ estimates, and his bootstrap CIs, which were the main finding of his paper. (As an aside, I suspect that Richard Garfield, not Les Roberts, was responsible for the estimates and CIs).

  24. #24 Kevin Donoghue
    June 26, 2006

    David Kane,

    In the earlier thread you quoted from the e-mail you sent to Roberts:

    On page 2, you write: “We obtained January, 2003, population estimates for each of Iraq’s 18 Governorates from the Ministry of Health. No attempt was made to adjust these numbers for recent displacement or immigration. We assigned 33 clusters to Governorates via systematic equal-step sampling from a randomly selected start. By this design, every cluster represents about 1/33 of the country, or 739 000 people, and is exchangeable with the others for analysis.” I would like to try and replicate this portion of the analysis. Can you provide the raw population estimates that you used as well as a description of the algorithm. Also, which software program did you use for doing this and how was the random start point selected?

    I was a bit surprised that you thought this needed examination. The population figures are given in the paper itself. I copied them into a spreadsheet and sorted the governorates by population. It is quite easy to check that SESS can give the reported initial assignment of clusters. Do you really think there is a problem here?
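For readers unfamiliar with it, systematic equal-step sampling can be sketched in a few lines. This is a generic illustration, not the Lancet team's actual procedure: the function name, the ordering of areas, and the toy populations are all hypothetical, and the real governorate figures from the paper are not reproduced here.

```python
import random

def systematic_equal_step(populations, n_clusters, start=None, rng=random):
    """Assign clusters to areas with probability proportional to population.

    populations: list of (name, population) pairs, in the order to walk through.
    start: offset in [0, step); drawn at random when omitted.
    """
    total = sum(pop for _, pop in populations)
    step = total / n_clusters
    if start is None:
        start = rng.random() * step
    # Sampling points fall at start, start + step, start + 2 * step, ...
    points = [start + i * step for i in range(n_clusters)]

    counts = {name: 0 for name, _ in populations}
    cumulative = 0
    idx = 0
    for name, pop in populations:
        cumulative += pop
        # Every sampling point inside this area's slice earns it one cluster.
        while idx < n_clusters and points[idx] < cumulative:
            counts[name] += 1
            idx += 1
    return counts

# Hypothetical toy data: three areas, ten clusters, fixed start for repeatability.
toy = [("A", 6_000_000), ("B", 3_000_000), ("C", 1_000_000)]
assignment = systematic_equal_step(toy, n_clusters=10, start=500_000)
print(assignment)
```

Given the population list, its ordering, and the start point, the assignment is fully determined, which is why the initial allocation of clusters can be checked from published population figures alone.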

  25. #25 z
    June 26, 2006

    Nobody talking about War’s Iraqi Death Toll Tops 50,000
    http://fairuse.100webcustomers.com/fairenough/latimes220.html?

  26. #26 Donald Johnson
    June 26, 2006

    The most interesting part of the LA Times article was the admission by Iraqi officials who say violent deaths have been grossly undercounted in some regions such as Anbar Province.

  27. #27 Robert
    June 26, 2006

    Think IBC will be correcting the Iraqi officials?

  28. #28 David Kane
    June 27, 2006

    Thanks for these responses.

    Robert wrote:

    You seemed to insist that the review that Roberts et al. received from Lancet was unusual, and that the Lancet reviewers (or by implication, those for other journals) do replicate every table and figure in a paper.

    1) There is little doubt that the review process used for the Roberts paper was unusual. It appeared in print within weeks (if that) of being submitted. It was rushed to appear before the US presidential election in order to influence the outcome. Alas, the Lancet editors won’t answer my questions either, but I believe that almost no paper went through the review process faster in 2004.

    2) I did not mean to imply that reviewers “replicate every table and figure in a paper.” They don’t. It is, however, standard practice to assist other researchers in examining your results by explaining the process that you used and sharing whatever data you can (assuming that there are no further publications to come). The refusal of Roberts et al to help me replicate their results is unusual (at least in my experience).

    3) In your experience, do academic authors answer questions about their work? Do they share their data when asked? In my experience, they do. Indeed, most authors love to have other people examine closely (and then cite!), their papers. But perhaps my experience is unusual.

    In other words, it is not the failure of the review process to check every calculation that is the issue. It is the failure of Roberts et al to share data and answer questions that is the problem.

    [D]o you meet the same standard in sharing all your data and methods with your clients? If not, how can you expect them to trust you? Or, if you do expect them to trust you, why is that not inconsistent with the standard you are holding Roberts to?

    In fact, I do meet this standard in my (highly limited!) academic work. You can see an article that I recently wrote for R News here. Every table and figure can be replicated from the R package available for download.

    In my non-academic work, it is hard to generalize. The data that I use often comes with restrictions. (I can’t pass along, say, Compustat data to a client who does not pay Compustat.) But I have always been comfortable answering client questions. (If you don’t, then they generally won’t be/stay clients for long.)

    It allowed me to replicate both Roberts’ estimates, and his bootstrap CIs

    Really? Not to be rude, but I don’t believe you. Roberts reported 98,000 with CI from 8,000 to 194,000. I bet that you can’t reproduce this using the data that Tim kindly hosts. (Note how the range is not symmetric around 98,000.) If you can, please tell us (exactly) how.

    I agree with your conjecture that Richard Garfield is more likely than Les Roberts to have done the actual calculation. Judging by his other work, would you say that he has the statistical background to handle this non-trivial problem alone?

    I appreciate your comments.

    To Kevin Donoghue: I agree that the raw populations are given in the paper but it is not clear to me that these are enough to replicate the sampling scheme. Can you replicate it? If so, tell us how. For example, it seems that there must be further population figures used on the town/city level. If not, how did they know which town in a given province to go to?

    Unless someone can replicate their sampling scheme, it is hard to know that they didn’t cherry pick the provinces/towns to visit.

  29. #29 Robert
    June 28, 2006

    David:

    I meant to mention that I saw your R News article.

    I do try to respond to my critics but while I welcome reasonable numbers of questions I have students to teach and my own research to further. Regretfully, sometimes after going over the same point with the same person several times I have to make a decision to be protective of my time and move on. Perhaps this is what happened to you.

    As for the accelerated review, I believe I addressed that in an earlier thread: I’ve been asked for accelerated turn-around on reviews before. I’ve never changed either the quality or scope of my review because it was accelerated–all it meant was that I didn’t procrastinate (as much). If you had ever been involved in accelerated review, either as a paper author or a paper reviewer, I suspect you would have known this.

    It allowed me to replicate both Roberts’ estimates and his bootstrap CIs

    Really? Not to be rude, but I don’t believe you. Roberts reported 98,000 with CI from 8,000 to 194,000. I bet that you can’t reproduce this using the data that Tim kindly hosts. (Note how the range is not symmetric around 98,000.) If you can, please tell us (exactly) how.

    I posted [this graph](http://anonymous.coward.free.fr/misc/roberts-iraq-bootstrap.png) back in December 2005 when Tim posted the data you kindly shared from Roberts. I get an estimate of 100665, and BCA CIs of [14150, 202183]. That’s pretty close considering I may have rounded slightly differently, I don’t know what their random seed was, and I’m using different software. Boot CIs (and especially BCA CIs) can differ slightly between packages. Note, for example, that the algorithm in R’s standard boot package differs slightly from the one used by Hastie and Tibshirani’s R package, so differences on the order of what we see here could be attributable to different implementations.
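    As a rough illustration of why two bootstrap runs need not agree to the last digit, here is a minimal sketch of a percentile bootstrap run with several random seeds. The `cluster_values` list is hypothetical (it is not the Lancet data), `percentile_ci` is a name made up for this example, and the plain percentile interval used here is simpler than the BCa interval mentioned above.

```python
import random
import statistics

# Hypothetical per-cluster figures for illustration only (NOT the Lancet data).
cluster_values = [0, 0, 1, 1, 2, 2, 3, 3, 4, 5, 6, 8, 12, 20]


def percentile_ci(values, n_boot=2000, seed=0, level=0.95):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)  # seeded so each run is repeatable
    n = len(values)
    # Resample n values with replacement and record the mean, n_boot times.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_boot)
    )
    tail = (1 - level) / 2
    lo = means[round(tail * n_boot)]
    hi = means[round((1 - tail) * n_boot) - 1]
    return lo, hi


# Same data, same method, different seeds: the bounds typically shift a little.
for seed in (1, 2, 3):
    lo, hi = percentile_ci(cluster_values, seed=seed)
    print(f"seed={seed}: 95% CI = [{lo:.2f}, {hi:.2f}]")
```

    Fixing the seed makes a single run reproducible, but matching someone else's published bounds exactly would also require their seed, replication count, and interval type.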

  30. #30 Donald Johnson
    June 28, 2006

    Meanwhile, back in “statistics for dummies” land I found something called Markov’s inequality in a probability book–

    Prob(that result equals or exceeds K when the expectation value is M) is less than or equal to

    M/K

    You can use this inequality no matter what the underlying probability distribution looks like–the derivation is about 5 lines of simple math (the kind of derivation I like).

    Anyway, in Sept 2004 IBC’s civilian body count was about 15,000, so in the Lancet survey that would be an expected value of 5 ( 1 Lancet death = 3000 extrapolated deaths). As best I can tell, there’s no need to throw out Fallujah with the Markov inequality, so the total number of violent deaths in the Lancet paper is 73. But some of these might be insurgents. There are 2 men of military age killed by coalition forces outside of Fallujah and 25 men of military age killed at Fallujah. Assume about half of them were insurgents–then you’d have about 58 dead civilians.

    The probability of getting that result given the IBC expectation value of 5 is guaranteed to be less than 5/58.

    You could multiply IBC’s numbers by 5 and the probability of getting the Lancet results is now guaranteed to be less than 43 percent.

    Am I interpreting this theorem correctly? It seems very simple, but maybe there’s some technical detail I’m not aware of that invalidates my attempt to apply the theorem here.

    Assuming I did apply it correctly, I can understand a little better why Les Roberts takes that Fallujah outlier so seriously. Not that 5/58 is so terribly low that it rules out the IBC numbers, but even quintupling the IBC death rate still tells you that the odds are less than even that you’d find as many dead civilians as the Lancet team found.
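    The arithmetic in this comment can be checked with a short sketch. Markov's inequality says that for a nonnegative quantity X with expectation M, P(X >= K) <= M/K; a count of deaths is nonnegative, so that condition holds here. The function name `markov_bound` is made up for this example, and the inputs (5, 25, and 58) are the figures from the comment, not independently verified numbers.

```python
def markov_bound(expected, threshold):
    """Upper bound on P(X >= threshold) for a nonnegative X with mean `expected`."""
    if expected < 0 or threshold <= 0:
        raise ValueError("Markov's inequality needs expected >= 0 and threshold > 0")
    # A probability can never exceed 1, so cap the bound there.
    return min(1.0, expected / threshold)


# IBC-based expectation of 5 sample deaths vs. 58 observed in the survey.
print(f"{markov_bound(5, 58):.3f}")   # 0.086

# Quintupling the IBC-based expectation (5 * 5 = 25).
print(f"{markov_bound(25, 58):.3f}")  # 0.431
```

    The bound is loose by design: it holds for any nonnegative distribution, so the true probability under a specific model could be far smaller.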

  31. #31 Kevin Donoghue
    June 28, 2006

    David Kane,

    Thanks for the clarification. The excerpt from your e-mail to Roberts gave the impression that all you needed at that point was the population by province, which is in the report. Drilling down to the level of towns is another thing entirely. If some of the towns are small, revealing that they are in the sample might breach undertakings given to the respondents. Still I am all for having as much information as possible, so kudos to you for trying.

  32. #32 joshd
    June 28, 2006

    Robert said:
    “Think IBC will be correcting the Iraqi officials?”

    Why would they? They released a figure 7,000 higher than IBC’s current max count (43,000), and said there were probably many uncounted in some areas for various reasons. This is no different than what IBC says, and all falls well into line with what IBC said of its own count in its recent piece. Further, note that the Iraqi officials (at least those from the MoH) claimed that 75% were from “terrorism”. IBC does not say this, but was criticized for supposedly under-representing US-caused deaths and over-representing “terrorism” deaths.

    Will IBC’s critics be correcting the Iraqi officials? Well, of course yes. Stephen Soldz’s recent piece (opportunistically and disingenuously attacking IBC of course) tries to do just that, accepting whatever they say that conforms with his prejudices, while rejecting whatever they say that does not: http://psychoanalystsopposewar.org/blog/2006/06/25/los-angeles-times-estimates-iraqi-dead-at-50000/

    With that kind of method, why bother reading what anyone says? You’re just making it all up yourself anyway, and grabbing whatever conforms with what you’ve made up, while rejecting whatever doesn’t.

    Donald says:
    “Anyway, in Sept 2004 IBC’s civilian body count was about 15,000, so in the Lancet survey that would be an expected value of 5 ( 1 Lancet death = 3000 extrapolated deaths).”

    If I understand you correctly, this is wrong. IBC’s counter may have read 15,000 on the day of Lancet’s release, but IBC continues adding data days, weeks and months after. As it stands, IBC recorded 19,000 (civilian only) deaths over the Lancet period. Lancet estimated 57,000 (not civilian only). 19 is 1/3 of 57. Not 1/5.

    If you want to take the figure that was on the counter on the day Lancet was released (presumably 15,000) for some strange reason, that would be <4X, again not 5X.
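    The two ratios in this comment can be reproduced in a few lines. The variable names are made up for this example; the figures are the ones quoted in the comment.

```python
lancet_estimate = 57_000   # Lancet figure cited in the comment (not civilian only)
ibc_updated = 19_000       # IBC civilian deaths over the Lancet period, after later additions
ibc_at_release = 15_000    # IBC counter reading on the day of the Lancet release (presumed)

# Ratio against the updated IBC count: a factor of 3, i.e. IBC is 1/3 of Lancet.
print(lancet_estimate / ibc_updated)     # 3.0

# Ratio against the counter reading at release: just under 4.
print(lancet_estimate / ibc_at_release)  # 3.8
```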

  33. #33 Robert
    June 28, 2006

    joshd wrote:

    [Iraqi officials] released a figure 7,000 higher than IBC’s current max count (43,000), and said there were probably many uncounted in some areas for various reasons. This is no different than what IBC says

    No different than what IBC says? So IBC is increasing its count to 50,000 and replacing “max count” with “at least”?

  34. #34 joshd
    June 28, 2006

    Robert, say something that isn’t stupid and willfully ignorant and I might reply to you.

  35. #35 Robert
    June 28, 2006

    joshd wrote:

    Robert, say something that isn’t stupid and willfully ignorant and I might reply to you.

    Oh dear, there you go again. Does your mother know how you speak to others? I’m sure she can’t be very proud. In any event, I think you’ve made an error of hubris: I didn’t really care whether you responded.

  36. #36 David Kane
    June 28, 2006

    I thank Robert and Kevin for their comments (and Tim for providing a forum where people of goodwill and different opinions might hash things out).

    I agree that peer review is often expedited. My claim is that no paper in 2004 went from submission to print in the Lancet faster than this one. (The editors refuse to answer questions on this and other topics.) The paper is unusual in that way.

    Robert writes:

    I get an estimate of 100665, and BCA CIs of [14150, 202183]. That’s pretty close considering I may have rounded slightly differently, I don’t know what their random seed was, and I’m using different software. Boot CIs (and especially BCA CIs) can differ slightly between packages. Note, for example, that the algorithm in R’s standard boot package differs slightly from the one used by Hastie and Tibshirani’s R package, so differences on the order of what we see here could be attributable to different implementations.

    Although your results are close to the published, they are not close enough to count as “replicated” in my book. In particular, your 6,000+ difference with the lower bound is almost big enough that, were it in the other direction, the CI would include zero.

    I am always very suspicious when someone’s 95% confidence interval barely rejects the null hypothesis, even more so when this interval is derived from a bootstrap procedure that no one else can replicate.

  37. #37 joshd
    June 28, 2006

    “I didn’t really care whether you responded.”

    You obviously don’t care if your contributions are stupid and willfully ignorant either.

  38. #38 Donald Johnson
    June 28, 2006

    Josh, thanks for the correction on the 15,000 figure. I meant to go back and look at the IBC two year analysis and see what it was in Sept 2004, but not wanting to spend much time last night looking through it I took the Lancet paper’s figure instead (13,000-15,000 from IBC as of Sept 2004). So 19,000 would correspond to about 6 deaths in the Lancet survey. I don’t know if I’m applying the Markov inequality correctly, but if so, my 5/58 becomes 6/58 and my quintupling of IBC numbers in the previous post would only be quadrupling.

    I took your statement about using old data from IBC to be snark, but actually, there is a bit of an issue here. Sooner or later there’s a chance someone may eventually do a fairly accurate count of the civilian dead from the Iraq war, it’ll be reported in the press and if IBC is still around, you guys can update your figures accordingly. From that perspective, in the long run you can’t be inaccurate.

    What Soldz said was logical. There’s no reason to take attributions of responsibility for deaths at face value when they come from one of the two sides in a war. This type of lying is normal in a war and if reporters or independent human rights organizations can’t verify things for themselves one should just expect lies and distortions happen all the time, without being able to quantify it. I saw a suspicious example of military reporting of deaths in the NYT a few weeks ago–the US military reportedly killed 36 insurgents and two civilians died in the crossfire. The two civilians happened to be small children. The cynic in me wonders whether adult male Iraqis who died were automatically classified as dead insurgents. Nir Rosen in a piece I read today gave an example he saw himself. An Iraqi riding a type of motorcycle used by insurgents was killed by Americans and obviously they felt the type of motorcycle he was riding was reason enough to shoot him.

    Robert, the rest of us are appreciative of your posts.

  39. #39 frankis
    June 28, 2006

    ummm you could be forgiven for not knowing it joshd but he outranks you both in academic credentials and in street cred, in the field of statistics. Have you considered the possibility of fallibility lately?

  40. #40 Robert
    June 29, 2006

    David wrote:

    Although your results are close to the published, they are not close enough to count as “replicated” in my book. In particular, your 6,000+ difference with the lower bound is almost big enough that, were it in the other direction, the CI would include zero.

    Wow, that book of yours is pretty tough. My experience is that exact replication is hard to do unless you use exactly the same program: the article says they used Stata but I only have R so that’s what I used. In addition, I tend to use the BCa CI in my own work so that’s what I presented but they don’t actually mention which CI they used. For example, the normal CI (in thousands) is [7, 187] and the percentile CI is [11, 193], so you can see that the choice of CI (and the exact algorithm used) matters. Here are the [data](http://anonymous.coward.free.fr/misc/iraq.csv) and [R script](http://anonymous.coward.free.fr/misc/iraq.r) files I used. In fact, in putting these numbers together I noticed that I erred in quoting the results: the mean was 97600 (bang on Roberts’ 98000) and the BCa CI was [14,196]. You can see it all if you run my code. If you decide to do that, you can get a feeling for how robust the results are if you choose a different number of replications or look at a different type of CI.
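    To see how the choice of interval can matter on skewed data, here is a minimal sketch comparing a normal-approximation interval with a percentile interval built from the same bootstrap replicates. The data are hypothetical (not the Lancet data or the linked `iraq.csv`), the function names are made up for this example, and the normal interval here omits the bias correction that some packages apply.

```python
import random
import statistics

# Hypothetical, right-skewed per-cluster figures (NOT the Lancet data).
cluster_values = [0, 0, 1, 1, 2, 2, 3, 3, 4, 5, 6, 8, 12, 20]


def bootstrap_means(values, n_boot=5000, seed=42):
    """Sorted bootstrap replicates of the mean (resampling with replacement)."""
    rng = random.Random(seed)
    n = len(values)
    return sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_boot)
    )


def normal_ci(values, means, z=1.96):
    """Estimate +/- z * bootstrap standard error; symmetric by construction."""
    est = statistics.fmean(values)
    se = statistics.stdev(means)
    return est - z * se, est + z * se


def percentile_ci(means, level=0.95):
    """Read the bounds straight off the sorted replicates; may be asymmetric."""
    tail = (1 - level) / 2
    n_boot = len(means)
    return means[round(tail * n_boot)], means[round((1 - tail) * n_boot) - 1]


means = bootstrap_means(cluster_values)
print("normal    :", normal_ci(cluster_values, means))
print("percentile:", percentile_ci(means))
```

    The normal interval is always centered on the estimate, while the percentile interval follows the shape of the bootstrap distribution, so on skewed data the two generally disagree at both ends.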

    I am always very suspicious when someone’s 95% confidence interval barely rejects the null hypothesis, even more so when this interval is derived from a bootstrap procedure that no one else can replicate.

    But in this case your suspicion is probably misplaced. The first paragraph of the “Methods” section explains that the sample size was chosen to be able to detect an effect with 95% confidence, with a small margin for error. That the CI excludes zero should be viewed as a success of the study design.

  41. #41 joshd
    June 29, 2006

    “But in this case your suspicion is probably misplaced. The first paragraph of the “Methods” section explains that the sample size was chosen to be able to detect an effect with 95% confidence, with a small margin for error. That the CI excludes zero should be viewed as a success of the study design.”

    Based on how many people have and continue to interpret the study, you’d think the CI excluded everything below 100,000.

  42. #42 Kevin Donoghue
    June 29, 2006

    “Max 43,140 civilians reported killed” is also readily misinterpreted.

  43. #43 joshd
    June 29, 2006

    “”Max 43,140 civilians reported killed” is also readily misinterpreted.”

    It’s easy to misinterpret in two cases imo:

    1) when you don’t give a s**t about how many people have been killed

    2) when you try really hard to pretend to misunderstand in order to help you criticize IBC

  44. #44 Donald Johnson
    June 29, 2006

    The tendency to ignore the lower half of the CI is because of the Fallujah outlier. You know that, Josh. Whether it’s actually correct is another story, but based on the Lancet team data it makes sense to think the true death toll probably is in the upper half of the CI.

    Robert has a graph upthread of what the probability distribution looks like if Fallujah is included and one can see not only why the Lancet team didn’t include it, but also why they think the real number is probably greater than 100,000. That’s one weird-looking probability distribution. A trimodal distribution, it looks like. I’m going to take a wild guess (not knowing much about bootstrapping) and assume this has something to do with the possibility that there could conceivably have been more than one Fallujah-type outlier present if one took another 33 cluster sample.

  45. #45 Palo
    June 29, 2006

    I thought this was an exceptional discussion from which people like me, very interested in the underlying issue, get to learn a lot from the statistics behind it. Josh, I’m truly disappointed at your recent outbursts of vacuous insults directed at respectful critics. It truly doesn’t help your position one bit.

  46. #46 joshd
    June 29, 2006

    Donald said:
    “The tendency to ignore the lower half of the CI is because of the Fallujah outlier. You know that, Josh.”

    Yeah, but i don’t think it’s a good reason.

    Palo said:
    “Josh, I’m truly disappointed at your recent outbursts of vacuous insults directed at respectful critics.”

    Try being called a war-criminal sometime, and facing critics who willfully pretend ignorance in order to make sarcastic (“respectful”) criticisms (insults), as Robert did above and Tim and others have done previously.

    And my “insults” tried to make this latter point. You can hold whatever opinion you like about how well it did this, but it was not “vacuous”.

  47. #47 Palo
    June 29, 2006

    Sorry, Josh, I completely missed the bit when they called you war-criminal. If you point out who it was I’ll go and punch him/her in the face for you.

  48. #48 Robert
    June 29, 2006

    joshd wrote:

    Based on how many people have and continue to interpret the study, you’d think the CI excluded everything below 100,000.

    Hmmm. Do you have a “max count” on the number of those people? I suggest that you count only those people who are independently named in two internationally respected English-language on-line websites.

    Nonetheless, I think it’s important to see the entire bootstrap distribution, as I showed [here](http://anonymous.coward.free.fr/misc/roberts-iraq-bootstrap.png), and I sort of wish Roberts et al. had included a graphic like this in their paper.

    Donald took a wild guess and wrote:

    I’m going to take a wild guess (not knowing much about bootstrapping) and assume this has something to do with the possibility that there could conceivably have been more than one Fallujah-type outlier present if you took another 33 cluster sample.

    Well, that’s not bad for a wild guess. The real lesson is that Falluja is quite different from the other clusters and it was probably wise to treat it separately. The 98000 estimate that everyone is talking about is thus for 32/33rds of Iraq.
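    One way to see why a single extreme cluster could produce a multi-humped bootstrap distribution: if 33 clusters are resampled with replacement and each is equally likely to be drawn, the number of times the outlier lands in a resample follows a Binomial(33, 1/33) distribution. This is a sketch under that assumption, which may not match the procedure actually used for the graph; `outlier_count_pmf` is a name made up for this example.

```python
from math import comb


def outlier_count_pmf(n_clusters, k):
    """P(one particular cluster appears exactly k times in a resample of n_clusters)."""
    p = 1 / n_clusters  # chance that any single draw picks the outlier cluster
    return comb(n_clusters, k) * p**k * (1 - p) ** (n_clusters - k)


# Probability that the outlier shows up 0, 1, 2, or 3 times in one resample.
for k in range(4):
    print(k, round(outlier_count_pmf(33, k), 3))
```

    Resamples with zero, one, or two copies of the outlier would each cluster around a different total, which is consistent with a distribution showing about three visible humps.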

  49. #49 joshd
    June 29, 2006

    “Sorry, Josh, I completely missed the bit when they called you war-criminal. If you point out who it was I’ll go and punch him/her in the face for you.”

    This kind of thing has mostly taken place on the Media Lens message board, which Tim and others here have frequented.

    I don’t think anybody here on this blog has said this. They’ve only done the latter of the things I referred to, but it should give you some idea of the level of debate I’ve been facing on this issue for the last several months.

    On the latter thing I mentioned, Robert is still doing this, now implying he believes IBC counts only people who have been “independently named” in order to help him make more “respectful criticisms” (vacuous insults).

  50. #50 Robert
    June 29, 2006

    joshd wrote:

    he believes IBC counts only people who have been “independently named”

    It doesn’t? From [IBC's page on methodology](http://www.iraqbodycount.org/background.php#methods):
    *”By requiring that two independent agencies publish a report before we are willing to add it to the count [...]“*

  51. #51 David Kane
    July 5, 2006

    Robert,

    Thanks for providing that data and R code. I plan on distributing an R package related to this topic at some point and, unless you object, will include this (or at least my own version inspired by this) as well (with proper citation, of course). With regard to our differing uses of the term replication, I can only point to this as being consistent with my own thinking. Note the requirement to be able “to reproduce the exact numerical results.”

    But I don’t want Tim to put us in the Lott-Levitt replicate debate category! ;-)
