Diane Farsetta on Iraqi deaths

By tlambert on February 27, 2008.

Diane Farsetta has an excellent and comprehensive write up on the Lancet and other studies on deaths in Iraq. A few extracts:

Theoretically, the public health surveys and polls that have been conducted in Iraq -- at great risk to the people involved -- should help inform and further the debate. But the data is complicated by different research approaches and their attendant caveats. The matter has been further confused by anemic reporting, with news articles usually framed as a "he said / she said" story, instead of an exploration and interpretation of research findings.

These are the conditions under which spin thrives: complex issues, political interests and weak reporting. So it's not too surprising that last month saw a spate of what international health researcher Dr. Richard Garfield calls "Swift Boat editorials."

...

Two of the authors of the Lancet study, Drs. Gilbert Burnham and Les Roberts, have responded directly to the National Journal article. Asked whether he accepted or rejected their explanations, Neil Munro told me that he didn't "want to get into a back and forth" argument.

...

Munro declined to tell me whether he found Lafta's previous research to be questionable. On CNN's Glenn Beck show, Munro wasn't so reticent, calling Lafta's earlier work "crummy scientific papers" that were "part of Saddam's effort to lift economic sanctions."

...

Overall, few of the many charges made in the National Journal article "Data Bomb" stick convincingly upon further examination. That is, unless you assume that the Lancet study authors and their colleagues have consistently lied without leaving a paper trail to the contrary. You would also have to assume that the independent health researchers and statisticians who have reviewed the study -- including the chief scientific adviser to the British Defense Ministry -- are either in on the plot, or are too naive or incompetent to notice major problems.

Read the whole thing.


given that the NJ begins with citing Bush's off the cuff estimate of 30,000 deaths without quibble, as though it were worth something (then enlarges on it by citing Bush's disbelief in the Roberts study as though he were some kind of learned authority), i find their difficulties with Roberts' methodology, veracity, etc. to be somewhere between amusing, and baffling.

This part is misleading.

The release announcement also stated that data would only "be provided to organizations or groups without publicly stated views that would cause doubt about their objectivity." Les Roberts told me that condition was added "as a result of mistakes I made with the 2004 study. ... I gave the data out to more or less anyone who asked, and two groups included ... the neighborhood in Fallujah," which the study authors had excluded from their calculations, due to the area's extremely high mortality rates. "As a result, they came up with an estimate twice as high as ours, and it repeatedly got cited in the press as 'Les Roberts reports more than 200,000 Iraqis have died,' and that just wasn't true," he said. "So, to prevent that from happening again, we thought, if a few academic groups who want to check our analysis and re-run their own models want to look, that would be OK, but we're not just going to pass it out [to anyone]."

There seems to be no question that other researchers have had access to the 2006 Lancet study data. So, Munro and Cannon's criticism is essentially that reporters like them have had limited access.

First, the issue is that academics like Spagat et al are not allowed access. They are not "reporters." Second, nothing prevents me or others who have seen the data from making claims, even claims in the press, about what the data show. That has happened, so this can't be the reason that Roberts does not share the data with Spagat and others. Third, all those statements about 200,000 deaths that Roberts mentions --- to the extent that they even exist --- were probably made by people who just read the article. You don't need the raw data that he distributed to make that claim. In fact, I bet that if Roberts were forced to make clear who he is referring to, it would be obvious that those folks did not have access to the 2004 data, which was not widely distributed for more than a year afterwards.

This part is worse.

Roberts confirmed that the Lancet study authors treat data requests from non-researchers differently. "What we wanted was to release [the data] to people that had the statistical modeling experience and a little bit of experience in this field," he explained. When I asked Neil Munro whether he accepted that security concerns kept the Lancet researchers from collecting personal data or releasing non-collated data, he said, "That's a perfectly coherent response. At the same time, others can judge whether it's a sufficient response."

Munro is too kind! The issue is not just who they released the data to but which data they released. Recall that Roberts promised, to Fritz Scheuer at JSM, to release information about which unnamed interviewers conducted which interviews. Roberts has since reneged on that promise.

Sorry for the multiple posts, but the further I read, the worse it gets.

Still, the information that the Lancet study authors gave Munro and Cannon was detailed enough to reveal what the journalists called examples of data heaping. One example involves when death certificates were reported missing. Burnham and Roberts refuted Munro and Cannon's claim that "all 22 missing certificates for violent deaths were inexplicably heaped in the single province of Nineveh." They pointed out that there were three regions in which "survey interviewers either forgot or chose not to ask for death certificates out of concern for their personal safety."

This is just not true, and I have demonstrated that fact to Roberts (and cc'd Tim!) in an e-mail exchange several weeks ago. Is Roberts still lying about this to reporters now? That would be problematic!

And, minor note: I do not think that Roberts gave any data to Munro. I am pretty sure that in answering questions for Spagat, I did the actual empirical work on Nineveh. Spagat passed it on to Munro/Cannon.

If I were a Johns Hopkins lawyer, I would be getting nervous about now. That misleading statement by Roberts/Burnham is on university letterhead, hosted on a university computer. It impugns the professionalism of Munro/Cannon/Spagat. Do they have a legal cause of action against Hopkins? Perhaps we have some lawyers among the Deltoid community who can help us out.

Thanks to Tim for pointing us to Farsetta who points us to this statement (pdf) from the WHO/IFHS team.

Q: Which figure is more reliable: the new survey or the 2006 household survey?

A: The 2006 household survey shows a very different trend than that seen in the new survey and the Iraq Body Count, with increasing numbers of deaths per day, rising from 231 during 2003-2004 to 491 during 2004-2005 and 925 during 2005-2006. The biggest difference between the 2006 household survey and the other two sources is in the figure for the third year. Most of those deaths occur in six high mortality governorates outside of Baghdad, while in the Iraq Body Count and the new survey, most deaths occur in Baghdad.

The difference between 925 and 126 violent deaths per day is very large. To reach 925, the Iraq Family Health Survey would have to have missed more than 80% of deaths detected in the smaller survey. This is highly unlikely given the much larger number of clusters and households visited in the new survey.
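As an arithmetic check on the figures quoted above, here is a minimal Python sketch. It assumes only the two per-day numbers given in the statement; the function and variable names are invented for illustration.

```python
# Violent deaths per day, as quoted in the WHO/IFHS statement above.
lancet_2006_per_day = 925   # 2006 household survey, third year
ifhs_per_day = 126          # the figure the statement compares it against


def share_missed(higher: float, lower: float) -> float:
    """Fraction of the higher count that the lower count does not capture."""
    return 1.0 - lower / higher


missed = share_missed(lancet_2006_per_day, ifhs_per_day)
ratio = lancet_2006_per_day / ifhs_per_day
print(f"share missed: {missed:.1%}, ratio: {ratio:.1f}x")
```

On these two numbers the implied shortfall is a little over 86%, consistent with the statement's "more than 80%".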

Damning stuff. The only way to reconcile the 925 and 126 numbers and believe that the WHO/IFHS authors know what they are doing is if the L2 data is fraudulent. The above statement may seem polite to those unversed in the niceties of academics and international organizations, but the clear implication is that the L2 data is fake.

David, pick a story, and stick with it. In post #3, you say that Roberts promised to give Scheurer [sic] the data. But in the link you provide, you stated that "Roberts also expressed no interest in providing Scheuren (or anybody else) with more data."

All these conflicting claims... hmmm. If I were a lawyer for the Harvard Institute for Quantum Social Demographics (or whatever group you're with) I would be getting nervous about now.

Good point! I think that it is fair to say both things. Roberts has no interest in sharing more (really any) data with anyone. Check with him if you don't believe me. But, at the same time, he did promise, at JSM, to get that specific data to Scheuren. I even recall the exact quote that Scheuren used when Roberts made that promise: "I am going to hold you to that."

But, again, this is another empirical matter! Roberts made that promise at JSM in front of several hundred people. Tim can check with Roberts to confirm, if he likes. (Roberts may have all sorts of good reasons for changing his mind, but I have not heard them.)

Instead of pestering Roberts with more (pointless) questions, or instead of asking Tim to bother him on your behalf, why don't you forward a link to this discussion to Dr. Scheuren? If he says that Roberts made the promise and reneged, I will believe him. Perhaps he would also weigh in on the broader topic of the actual death toll. That would be of greater interest to most of the participants here. Personally, I couldn't care less if Roberts is a wild-eyed, promise-breaking pinko-commie liar. I would, however, like to have some sense of how many excess deaths have been caused by the Iraq War.

David Kane:

[T]he issue is that academics like Spagat et al are not allowed access.

This is the same Spagat who put forth a trend line based on three cherry-picked data points and pointed to the high R^2 as evidence of fraud? If any "academics" are to be excluded surely it would be ones like him.

Is the programming on the David Kane bot Open Source, one wonders. The bot's programming seems to have a fairly high interactivity component.

Best,

David Kane wrote:

In fact, I bet that if Roberts were forced to make clear who he is referring to [about making death estimates], it would be obvious that those folks did not have access to the 2004 data

Hmmm. Perhaps, but another argument you could make is that even when given access to the 2004 data there are some who were unable to estimate the numbers of deaths, and it didn't seem to slow them down a bit.

Just because an undead faux scientist blog troll is tarted up in the psychotosphere does not mean anyone has to dignify him with a reply. Just a thought.

David - Excellent points!

If I were a lawyer for the Harvard Institute for Quantum Social Demographics (or whatever group you're with) I would be getting nervous about now.

I think "embarrassed" might be a much better description for how they feel.

After all, Harvard has a reputation to maintain.

Let's see. 5 out of the 14 comments above are from David Kane and most of the rest are dismissive toward him.

The guy seems to be carrying on a conversation with himself at this point.

Does that tell us anything?

Dear David,

So I presume you have taken the preliminary step of looking at the L2 published data for signs of fraud? What about a simple Benfordian analysis?

Here's what I get using the data delimited by mode of death and excluding the totals:

N, LII freq, expected freq
1, 0.43, 0.3
2, 0.15, 0.18
3, 0.08, 0.12
4, 0.07, 0.1
5, 0.11, 0.08
6, 0.07, 0.07
7, 0.02, 0.06
8, 0.05, 0.05
9, 0.03, 0.05
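The "expected freq" column appears to be the Benford first-digit probabilities, log10(1 + 1/d), rounded to two places; that reading is an assumption, since the comment does not say how the column was computed. A minimal Python sketch that recomputes the expected values and sets them beside the observed frequencies listed above:

```python
import math


def benford_expected(digit: int) -> float:
    """Benford's law: P(first digit = d) = log10(1 + 1/d)."""
    return math.log10(1 + 1 / digit)


# Observed leading-digit frequencies, transcribed from the table above.
observed = {1: 0.43, 2: 0.15, 3: 0.08, 4: 0.07, 5: 0.11,
            6: 0.07, 7: 0.02, 8: 0.05, 9: 0.03}

for d in range(1, 10):
    expected = benford_expected(d)
    diff = observed[d] - expected
    print(f"{d}: observed {observed[d]:.2f}, expected {expected:.3f}, diff {diff:+.3f}")
```

This only lines up the two columns. It does not test whether the gaps are larger than chance would produce, which would need the underlying counts rather than the rounded frequencies.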

Well, as much as I would like to use a Benford type analysis to look for fraud, I don't see it as possible, although I welcome suggestions.

For example, you are correct (I assume) that 43% of the causes of death are coded as having a "1" (whether as "1" or as "11" or "12" and so on), but I find it doubtful that there is any particular percentage that we should be looking for. As I understand it, Benford is all about looking for the frequency of different digits in a large number of examples. In this case, there are only 13 possible codes. You can use my R package to take a look as:

> x <- .prepdata("deaths")
> table(x$CAUSE_16C)

1 2 3 4 5 7 9 11 12 13 14 15 16
169 38 43 40 6 27 36 48 122 43 4 40 13
> dim(x)
[1] 629 13
> table(x$CAUSE_16C) / 629

1 2 3 4 5 7 9 11 12 13 14 15 16
0.2687 0.0604 0.0684 0.0636 0.0095 0.0429 0.0572 0.0763 0.1940 0.0684 0.0064 0.0636 0.0207
>

This, obviously, does not reproduce your result because you are doing something Benfordian about the digits. The point is that these numbers have no meaning, unlike say the numbers on an income tax return. These are just codes corresponding to different causes of death:

> y <- prep.deaths()
> table(y$cause.category)

gunshot   carbomb   other explosion   air strike   violent, unknown
169       38        43                40           6
old age   accident   cancer   heart disease or stroke   chronic illness
27        36         48       122                       43
infection disease   infant death   non-violent, other
4                   40             13
>

(Apologies for formatting.)

Anyway, I do not think that the fact that, arbitrarily, gunshot deaths are coded as a "1" tells us anything.
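For anyone who wants to tally the leading digits of the cause codes directly, here is a minimal Python sketch. The dictionary simply transcribes the counts printed in the table(x$CAUSE_16C) output above; nothing else about the data set is assumed.

```python
from collections import Counter

# Counts per cause code, transcribed from the table(x$CAUSE_16C) output above.
code_counts = {1: 169, 2: 38, 3: 43, 4: 40, 5: 6, 7: 27, 9: 36,
               11: 48, 12: 122, 13: 43, 14: 4, 15: 40, 16: 13}

total = sum(code_counts.values())  # number of deaths in the table

# Add up deaths by the first digit of each code.
leading = Counter()
for code, n in code_counts.items():
    leading[int(str(code)[0])] += n

shares = {digit: n / total for digit, n in sorted(leading.items())}
for digit, share in shares.items():
    print(f"leading digit {digit}: {share:.3f}")
```

Because the codes are arbitrary labels, renumbering the categories would change these shares without changing a single death in the data.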

I do think that there are lots of suspicious things about the data. See Spagat (pdf) for an introduction.

David Kane:

I do think that there are lots of suspicious things about the data. See Spagat (pdf) for an introduction.

David, I hope that was intended as sly humor. I do not know if it is the result of idiocy or of fraud but as you know one (at least) of the arguments in that Spagat paper you're recommending is transparently and utterly bogus, as Tim documented here. That doesn't necessarily mean the rest of the paper is junk (though it is suggestive), but if you want to cite it positively you should at least include some caveat.

I agree that the R^2 stuff is the weakest portion of Spagat's paper. See! Even we denialists have our disagreements. I would not go so far as to call that argument "utterly bogus." Anyway, see the thread you link to for more comments.

But the key issue here is not "How weak is that part of Spagat's paper?" The key is that there are so many other problems with L2. Let me pick just one: Were all streets included in the sample (including back alleys) or just streets which intersected main streets?

That should be a simple question. And yet, neither Tim nor Robert Chung nor sod nor my friend can tell us the answer. It is reasonable to say that the authors did not have the space in the actual paper to accurately describe the sampling scheme. But even now, over a year later, no one knows what the sampling scheme was. The authors continue to produce inconsistent claims. (See Spagat's paper for some examples.)

Tim deserves great credit for providing this excellent forum. But, if he really wanted to get to the truth, he would start threads that focused the conversation on the strongest portions of the anti-L2 case, not the weakest.

But, again, the challenge: Were all streets included in the sample (including back alleys) or just streets which intersected main streets?

If you can't answer that, how can you have much faith in L2?

David Kane is repeatedly saying "fraud" and "fake" now, but not *quite* saying outright the paper is a fraud and a fake. In fact, when pressed, he retreats to - 'the sampling scheme isn't well documented' and such like crap.
David, "the clear implication is that" you are...

David Kane: The authors continue to produce inconsistent claims. (See Spagat's paper for some examples.)

I've seen it. Spagat cites Munro, Munro cites Kane, Kane cites Spagat. Same old shite. So just for the hell of it, I'll cite Kane (March 16, 2007):

The slides [presented by Burnham] make clear that the main-street bias (MSB) issue is less serious than one might suppose (and that the methodology write up in the paper is misleading --- unintentionally, I think).

So Burnham's explanation seemed more-or-less okay back then, but then David Kane found that the rest of his powder really wasn't of the best quality, so he went trudging back to this stuff. The carpers can't keep their own story straight so they complain that the authors are being inconsistent.

But, if he really wanted to get to the truth, [Tim Lambert] would start threads that focused the conversation on the strongest portions of the anti-L2 case, not the weakest.

So "the truth" is arrived at by singling out one study for attack and ignoring the weaknesses in the others? Methinks there is something wrong with this approach. But tell us David, just what specific issue do you think is the strongest part of your case? To judge from the fact that you seem to mention it more frequently than anything else, it may be the fact that Burnham refuses to share his data with the author of that tripe you linked to upthread - is that really the best you've got, or is it just the thing that bugs you most?

"Burnham's explanation seemed more-or-less okay back then"

Correct. It did. It wasn't until much later that I saw Spagat's summary of all the conflicting descriptions that I became convinced that this was a real issue. Also, the one clump of car bomb deaths in the data (which had not been released at that stage, I think) makes it clear that MSB is a serious concern.

But, Kevin, how about a little help? You obviously have followed this closely. Can you answer what should be a simple question: Were all streets included in the sample (including back alleys) or just streets which intersected main streets?

If even experts like the Deltoid community don't know what the sampling scheme was, how can you have faith in the reported data?

David Kane: I agree that the R^2 stuff is the weakest portion of Spagat's paper. See! Even we denialists have our disagreements. I would not go so far as to call that argument "utterly bogus."

So you think the argument was merely bogus rather than utterly bogus. Or something. Still, you should not be citing the paper as a "tour de force", certainly not without further qualification. Which was my point.

As for the question of the sampling of side streets that do not intersect with main streets: Who cares? Is it even clear what the sign of the overall effect would be if they missed such streets completely? And surely, either way, the magnitude of the effect is going to be modest. (I should probably admit right here that you haven't yet fully converted me into a Lancet obsessive and I haven't actually read the mainstreet bias stuff. And I don't mean to pretend the Lancet study or any study for that matter -- e.g. think of NJEM not asking for death certificates -- is perfect. But let's get real here.)

I would go with "quite weak" rather than "merely bogus." I concede that tour de force was excessive praise.

The reason that the sampling scheme matters is not because it is that important. It is just the easiest way to see why I (and others) have so many doubts about the underlying data. In any survey situation, if you don't know how the sample was constructed, you don't know much of anything. The fact that the American authors are so confused about the actual methodology employed gives me reason to be suspicious about everything else.

Many smart people suggest that MSB might cause the estimate to be much (50% or more) too high. That doesn't get us all the way from the 600k L2 estimate to the 151k IFHS, but it does help.

But my main goal for now is to make small positive steps to increase our knowledge, as in our demographic fight of a few months ago. Surely it would be useful if we could figure out the sampling plan for L2.

Were all streets included in the sample (including back alleys) or just streets which intersected main streets?

As previously requested:

http://scienceblogs.com/deltoid/2008/02/spagat_goes_off_the_deep_end.ph…

So what do you make of the sensitivity of MSB to Spagat's f parameter?

And yet, neither Tim nor Robert Chung nor sod nor my friend can tell us the answer.

the main street bias part of the Spagat paper is weak. he can't show that the methodology produces a significant bias (a lot depends on what "MAIN streets" are..) and he can't produce anything but wild guesses on the effect.

look at this and laugh:

For example, Sunnis would not travel deep into Shiite territory, abduct some people and make a long drive to reach safe territory. Rather, they would make a quick foray in and out of enemy territory, perhaps just crossing over a main street that divides the two areas, and continuing only until they were just inside of a residential area.
(page 10)

the ethics part is absolutely horrible as well. do you seriously claim that those attacks make sense?

have you ever heard of a call for a "formal investigation" into a similar paper before?

Many smart people suggest that MSB might cause the estimate to be much (50% or more) too high.

oh, i'd love to read some names and links on this! bring them on!

But my main goal for now is to make small positive steps to increase our knowledge, as in our demographic fight of a few months ago.

you have many doubts about the lancet data, but believe that finding out that gender but not age data was collected in L2 "increased OUR knowledge"?

PS: i asked you before, i'll ask you again: why didn't Guha-Sapir find a SINGLE fault with the IBC data, but several pages of problems in the lancet paper? makes his work completely worthless!

David Kane: It wasn't until much later that I saw Spagat's summary of all the conflicting descriptions that I became convinced that this was a real issue.

Spagat gets his "conflicting descriptions" by making mountains out of molehills. Of course it's true that if you sample in a way that gives two streets of different lengths an equal probability of being selected, then Mr A of 4, Short Street is more likely to be included than Mrs B of 95, Long Street. In that sense the statement that every household had "an equal chance of being included" isn't literally true. I reckon that almost all of those who have read L2 spotted that immediately. Presumably you saw it yourself when you first read L2 and even if you didn't you can hardly have failed to think of it when Burnham gave his presentation, complete with a slide showing streets of different lengths. There are other equally obvious problems relating to population density. So I really can't see how anything in that section of Spagat's paper can have come as news to you.
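The Short Street / Long Street point can be made concrete with a toy calculation. This is a minimal Python sketch under invented assumptions: two streets, hypothetical household counts, one street drawn uniformly and then one household drawn uniformly on it. It is not a description of the scheme L2 actually used, which is exactly what is in dispute here.

```python
from fractions import Fraction

# Hypothetical neighbourhood: the household counts are invented for illustration.
streets = {"Short Street": 10, "Long Street": 100}


def inclusion_probability(street: str) -> Fraction:
    """Chance that one particular household is picked when a street is drawn
    uniformly and then one household is drawn uniformly on that street."""
    p_street = Fraction(1, len(streets))
    p_household = Fraction(1, streets[street])
    return p_street * p_household


p_a = inclusion_probability("Short Street")   # Mr A's household
p_b = inclusion_probability("Long Street")    # Mrs B's household
print(p_a, p_b, p_a / p_b)
```

Under these toy numbers the household on the short street is ten times as likely to be drawn, which is the sense in which "an equal chance of being included" is not literally true.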

No, what you learned from Spagat wasn't about sampling, it was about debating. He taught you a trick which was already old when Demosthenes was a boy, but it still works: dig up statements that your adversary has made in different contexts and say "See, citizens, where the rascal says X here and Y here - but X and Y are incompatible!" Of course your adversary will respond that X is a simplified account for a lay audience while Y is a more detailed account for a specialist audience; or that X is an inaccurate report of his words, etc. It won't much matter what he says. Those who are well-disposed to your message will announce that the enemy has been routed, debunked, discredited and exploded. Repeat at regular intervals and you win the political battle.

In any survey situation, if you don't know how the sample was constructed, you don't know much of anything.

I don't see that you know any more or less now than you did when you attended Burnham's lengthy presentation at MIT - the most complete account of the sampling he has given to date, AFAIK. Yet you were reasonably happy then, although you had already seen several of the "conflicting descriptions" which Spagat makes so much of; for example, the truncated description in L2 itself and the exaggerated "equal chance" claim in The Human Cost of War. Hence my conclusion that what Spagat brought to your notice was not new evidence, but an old debating trick whose power you had previously underestimated.

David, I've got to say "quite weak" is grading on a curve. We're talking about drawing a trend line with just three points and quoting the R^2 (rather than the adjusted R^2). To say nothing of the fact that the points were cherry picked from a larger set. Well, I suppose it could be worse. At least Spagat didn't quote the R^2 on a line drawn from two points...
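To show why R^2 from so few points says little, here is a minimal Python sketch. The points are invented for illustration and are not Spagat's data; the adjusted R^2 formula is the standard one for a model with one predictor.

```python
def r_squared(xs, ys):
    """R^2 of the ordinary least-squares line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy ** 2) / (sxx * syy)


def adjusted_r_squared(r2, n, predictors=1):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - predictors - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - predictors - 1)


# Three invented points that are clearly not on a straight line.
xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.0, 4.0]
r2 = r_squared(xs, ys)
adj = adjusted_r_squared(r2, n=len(xs))

# Any two distinct points fit a line perfectly, so R^2 is always 1.
two_point = r_squared([1.0, 2.0], [5.0, -3.0])

print(f"three points: R^2 = {r2:.3f}, adjusted R^2 = {adj:.3f}")
print(f"two points:   R^2 = {two_point:.3f}")
```

With three points and one predictor there is a single residual degree of freedom, so the adjustment doubles the unexplained share (1 - R^2); with two points the adjusted figure is undefined.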

(Typo in my previous comment: NJEM should have been NEJM, i.e. IFHS)

David Kane,

even Spagat hasn't proven anything about the ratio of main to back streets sampled. All he has done is written a little attack job saying that IF some streets are sampled at half the rate of other streets, the result will be biased. He hasn't proven that they were or weren't, even though he has the ability to do so.

I wonder why not?

It's now up to 9 comments from David Kane out of a total 29 (30 including this one) -- or fully 1/3 the total comments.

One would think Kane would have something better to do with his time -- like figure out how to actually calculate a CMR.

But then again, perhaps Harvard does not have any "hours on the job" requirements as part of their employment contract.

Or maybe blogging is included in that.

Doesn't Harvard have at least one explicit junk science brothel?? Something with 'risk' in the name. Perhaps David Kane is actually on the job when posting here.....

(Responding seriously to a joke is almost always a mistake, but anyway here goes.) JB, I don't think David is employed by Harvard (although he is affiliated; his day job is as CEO of Kane Capital Management).

"Responding seriously to a joke is almost always a mistake..."

My point exactly.
