If you read the comment threads on the Lancet study you will know that David Kane frequently pops up with dark hints that the authors committed some sort of fraud. Well, now he has argued that the Lancet study is likely to be a fraud because the response rate was so high. [Update: The post has been removed because the "tone is unacceptable, the facts are shoddy, and the ideas are not endorsed by myself, the other authors on the sidebar, or the Harvard IQSS".] Kieran Healy smacks him down:
Kane says, "I can not find a single example of a survey with a 99%+ response rates in a large sample for any survey topic in any country ever." I googled around a bit looking for information on previous Iraqi polls and their response rates. It took about two minutes. Here is the methodological statement for a poll conducted by Oxford Research International for ABC News (and others, including Time and the BBC) in November of 2005. The report says, "The survey had a contact rate of 98 percent and a cooperation rate of 84 percent for a total response rate of 82 percent." Here is one from the International Republican Institute, done in July. The PowerPoint slides for that one say that "A total sample of 2,849 valid interviews were obtained from a total sample of 3,120 rendering a response rate of 91 percent." And here is a report put out in 2003 by the former Coalition Provisional Authority, summarizing surveys conducted by the Office of Research and Gallup. In the former, "The overall response rate was 89 percent, ranging from 93% in Baghdad to 100% in Suleymania and Erbil." In the latter, "Face-to-face interviews were conducted among 1,178 adults who resided in urban areas within the governorate of Baghdad ... The response rate was 97 percent." So much for Iraqi surveys with extraordinary response rates being hard to find.
Oddly, the comment that Kane picked up from our thread was quickly rebutted by other commenters, who made the same point as the one above. I guess Kane didn't wait around to find out.
Accusations or insinuations of fraud are a serious matter, especially in a case like this. I have to say I am surprised -- and dismayed -- to see this balloon being floated at the Social Science Statistics Blog. I'm a fairly regular reader of theirs. It's run under the auspices of the Institute for Quantitative Social Science, an interdisciplinary group at Harvard. Most of the posts are by Harvard grad students, but the sidebar also includes respected heavy-hitters like Jeff Gill and the Institute's director, Gary King. The blogosphere being what it is, I expect posts with titles like "Harvard statistics blog says Iraq survey results may be fraudulent" to start popping up pretty soon. I wonder whether Prof. King is aware of Kane's post, and whether he thinks it's alright that his Institute is providing a platform to Kane to make his claims of fraud.
What's wrong with this explanation, Tim?
http://www.opinionjournal.com/editorial/feature.html?id=110009108
Moore doesn't know what he is talking about. I'll have a post up after BSG is over.
Shorter David Kane: "Some guy says the response rate is high and I'm not good at Googling so Roberts and Burnham must be frauds."
jc, the number of clusters affects the precision of an estimate but not its central value.
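A toy simulation can make this concrete (everything here is made up for illustration, nothing is from the study's data): the average of the estimates stays at the true value however many clusters you sample, while the spread shrinks as the number of clusters grows.

```python
import random

random.seed(1)

# A hypothetical population of 1,000 clusters whose true mortality
# rates vary around 5 per 1,000 (illustrative numbers only).
clusters = [random.gauss(5.0, 2.0) for _ in range(1000)]
true_mean = sum(clusters) / len(clusters)

def sampling_distribution(n_clusters, trials=5000):
    """Average and spread of the sample mean over many repeated draws."""
    means = [sum(random.sample(clusters, n_clusters)) / n_clusters
             for _ in range(trials)]
    avg = sum(means) / trials
    sd = (sum((m - avg) ** 2 for m in means) / trials) ** 0.5
    return avg, sd

for n in (10, 47, 200):  # 47 is roughly the number of clusters surveyed
    avg, sd = sampling_distribution(n)
    print(f"{n:>3} clusters: mean of estimates {avg:.2f} "
          f"(truth {true_mean:.2f}), spread {sd:.2f}")
```

The printed means all sit on the true value; only the spread column changes with the number of clusters.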
I've expressed my doubts about the validity of this survey elsewhere. What I'm looking for are responses to those doubts. Specifically, I question whether the sampling methodology may have caused densely populated urban areas to be sampled more heavily than less densely populated ones - the way the methods are reported seems to imply that only a residential side street with 40 or more homes qualified as a potential cluster. Now, I don't know for sure that a person living in a less densely populated region of Iraq is less likely to live on such a street, but that sure seems likely. Nor do I know for sure that people in densely populated areas are more likely to have been impacted by violence. But that also seems likely.
Of course, even if that were the case the sample is potentially salvageable (provided they were able to treat the sample as stratified along the density variable). But I don't see any reference to stratification on that basis in the article, so I wonder.
I've seen claims (I believe it was on this blog) that the idea that the sample was biased toward densely populated areas has been refuted. Can anyone point me to the meat of that refutation?
Second, like Kane, I worry that the results could have been biased by the selection of "alternative sites" due to a lack of security (and the mention in the article that the study was conducted under conditions of "extreme insecurity"). If the use of alternative sites was common, this could be a source of bias in the survey - intentional or unintentional. I consider this to be the major methodological problem with the survey.
If anyone has any additional pertinent information (or noticed something in the article addressing this issue that I did not), please let me know.
Finally, I wonder if anyone has seen estimates of the degree of association of violence within clusters versus across clusters - or, more to the point, knows whether the impact of this on effective sample size is accounted for in the analysis. The article doesn't specifically say it was accounted for (and I feel strongly that it's such an obvious objection that it should have been), but if I were building a piece of software to handle cluster survey sampling, I'd have built it in.
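For reference, the textbook way to handle this is Kish's design effect, DEFF = 1 + (m - 1) * rho, where m is the average cluster size and rho the intra-cluster correlation of the outcome. A minimal sketch with illustrative inputs (the household and cluster counts are the survey's reported totals; the rho values are pure guesses, since that's exactly the unknown here):

```python
# Kish's design-effect formula (standard survey-statistics arithmetic,
# not the authors' actual code): DEFF = 1 + (m - 1) * rho.
def effective_sample_size(n_total, cluster_size, rho):
    """Shrink the nominal sample size by the design effect."""
    deff = 1 + (cluster_size - 1) * rho
    return n_total / deff

# About 1,850 households in 47 clusters (~39 per cluster); rho is
# unknown, so try a range of hypothetical values.
for rho in (0.01, 0.05, 0.10):
    print(f"rho = {rho:.2f}: effective n ~ "
          f"{effective_sample_size(1849, 39, rho):.0f}")
```

Even modest within-cluster correlation of violence cuts the effective sample size substantially, which is why the objection matters.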
Thanks.
Morgan wrote:
The first and second stages of cluster assignment (to governorates and then to administrative units within governorates) were done on a population basis. So densely populated urban areas probably were sampled more heavily, but that's because they ought to be. So perhaps you meant it's the third stage you were worried about. At the third stage, locations were selected for the starting point. Location selection is usually weighted equally by area, not by population, so if anything selected clusters would tend to be lower density rather than higher.
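To illustrate the principle behind the first two stages, here is a generic probability-proportional-to-size (PPS) sketch - illustrative code, not the authors' actual procedure, with made-up unit names and populations:

```python
import random

random.seed(2)

# Hypothetical administrative units and their populations.
units = {"dense city district": 500_000,
         "town": 50_000,
         "rural area": 10_000}

def pps_draw(units):
    """Select one unit with probability proportional to its population."""
    names = list(units)
    return random.choices(names, weights=[units[n] for n in names], k=1)[0]

draws = [pps_draw(units) for _ in range(10_000)]
for name in units:
    share = draws.count(name) / len(draws)
    print(f"{name}: drawn {share:.1%} of the time")
```

The dense district is drawn roughly in proportion to its share of the population, which is exactly the sense in which urban areas "ought to" be sampled more heavily.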
Very possibly. We know that at least one cluster was moved (it was the lost cluster from Wasit) but we don't know how many others. Which way do you think the bias would go? Do you think the excess mortality estimate would have gone up or down because the interviewers avoided the most dangerous locations?
Kieran doesn't "smack Kane down".
Kane says "I can not find a single example of a survey with a 99%+ response rates in a large sample for any survey topic in any country ever." Kieran - after two minutes of googling - was unable to find a single survey with a 99%+ response rate. Obviously, that's not the result Kieran wanted, but he tries to present this as a refutation of Kane's claim. It clearly isn't. Frankly, that strikes me as dishonest.
Except for the ones with the 100% rate in Suleymania and Erbil.
Or does it have to be precisely 99%?
nik said:
[snip]
Burkina Faso, 2003 DHS. It took me about 10 minutes.
Aside from that 100 percent response rate (which was hard to miss since, well, he emphasized it), the point was that the very low response rates to surveys in America have led people to find a 99 percent response rate very implausible. It looks less implausible when you see 90 and 97 and 98.5 percent response rates in other countries (including Iraq).
I'd wondered about that response rate too, being the sort that will normally hang up on surveyors, but that issue is now dead.
Donald wrote:
Maybe, maybe not. First, it's a nice little cautionary tale that the things we're used to don't always translate well to Iraq so we need to be very careful when we claim that something is "unbelievable."
Second, speaking of issues being dead, things are happening over at the [Social Science Statistics blog](http://www.iq.harvard.edu/blog/sss/) where David made his claim of fraud. About half an hour ago, David's post was disappeared and a post by the editor went up saying it had been removed. Now the editor's post has also been removed, leaving no trace whatsoever.
Actually it hasn't vanished entirely. I think the editor needs instructions on how to make an embarrassing post disappear for good.
Would that involve extraordinary rendition using Learjets and third-party countries?
Robert:
Thanks for the reply. You're right that I'm concerned about the third stage. The problem is that areas were not equally weighted - effectively they were weighted based on the number of qualifying crossing residential streets (QCRSs) in the area.
My concern is that, depending on how you define a QCRS, you may have selection biased toward densely populated areas.
Maybe an example would help me to explain. Let's say that you define a QCRS as one that has at least 40 homes on it - which is one condition that seems to be built into the researchers' methodology. Let's say further that suburbs in Iraq were built in 1974 according to the prevailing idea at the time, which was that every residential street should have exactly 40 homes on it, ending in a cul-de-sac.
In these suburbs, you have one qualifying crossing residential street for every 40 homes in the area - by our definition, the highest ratio you can possibly get. This area has a very high probability of being selected as a cluster site relative to its population.
If, instead, the prevailing idea had been that 35 homes was the ideal, you would have zero QCRSs. Streets in this area have zero probability of being selected as a cluster.
But the ratio of QCRSs to population may not be constant with respect to population density. I think it probably decreases as population density falls below some limit - and it probably decreases above that limit, too, if you have high-rise apartment buildings with hundreds of units, for example.
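Here is a toy version of that argument in code (the 40-home threshold is the one described above; everything else, including the assumption that every street in an area has the same number of homes, is hypothetical):

```python
MIN_HOMES = 40  # the qualifying threshold discussed above

def qcrs_per_household(homes_per_street):
    """Qualifying streets per household, in an area where every
    residential street has exactly homes_per_street homes."""
    if homes_per_street < MIN_HOMES:
        return 0.0  # no street qualifies, so zero chance of selection
    return 1.0 / homes_per_street

for homes in (35, 40, 80, 200):
    ratio = qcrs_per_household(homes)
    print(f"{homes:>3} homes per street: "
          f"{ratio:.4f} qualifying streets per household")
```

The ratio is zero below the threshold, peaks exactly at 40 homes per street, and falls off as streets (or apartment blocks) get larger - which is the non-constant selection weight I'm worried about.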
With regard to the selection of alternative sites, we don't know whether the estimate of excess mortality would go up or down based on these choices. Yes, the interviewers were supposed to be changing locations due to concerns about safety, which tends to imply violence in that area (though we don't really know what constituted "insecurity"). But they may have chosen alternatives that were "comparable" in that they had clearly been impacted by violence previously. That's a reasonable thing to do if you want to avoid underreporting violence, but it's perfectly possible to overcorrect.
The simple fact is that if alternative sites were used frequently, then absent some information about the procedure used to identify them, or about how their characteristics compare with those of randomly chosen sites, there is no way to evaluate the representativeness of the sampling procedure.
Well, looks like a dead thread. I'd like to thank Tim for locating the Gilbert Burnham interview several posts later, which effectively answers the third question from my original comment. Exactly the kind of thing I was hoping to find here.
It's just a gut feeling, but I would think high response rates to surveys of this kind go hand in hand with how strongly people feel about the subject matter, combined with whether they feel the reality of their experiences has been (or in this case has not been) noticed. When people from your household are dead from war-related violence, and the media and relevant authorities (local and international) are ignoring or downplaying it, and then people come along who want to know the facts, there could well be a lot of motivation to tell them. This is nothing like the kinds of surveys that most Australians or (I suspect) Americans have little time or inclination to answer.
I don't like to believe that so many people have died in Iraq, or that the real numbers are being ignored or downplayed. I like even less the thought that so much of the world really doesn't care. To deny the reality of Iraqi people's suffering can only add credence to those who preach hatred of the West, and the ongoing consequence of that will be to make our future less safe and secure.
The blog editor and IQSS director now have posts up explaining the removal of David Kane's post. Discussion and links are at CT.