Lancet at the ASA meeting

The AP's Paul Foy reports on the American Statistical Association meeting's discussion of the Lancet studies:

Number crunchers this week quibbled with Roberts's survey methods and blasted his refusal to release all his raw data for scrutiny -- or any data to his worst critics. Some discounted him as an advocate for world peace, although none could find a major flaw in his surveys or analysis.

"Most of the criticism I heard was carping," said Stephen Fienberg, a professor of statistics at Carnegie Mellon University in Pittsburgh. "I thought the surveys were pretty good for non-statisticians." ...

"It puts you in a position where you are going to get attacked," said Fritz Scheuren, a senior fellow at the University of Chicago's National Opinion Research Center, who is trying to organize another Iraqi survey to see if he can match Roberts's results.

Scheuren, the American Statistical Association's former president, said he couldn't find anything wrong with The Lancet surveys. ...

"Verifying these numbers wouldn't be too difficult, so why isn't it being done?" asked Jana Asher, a graduate student at Carnegie Mellon University, who dissected Roberts's surveys and found little fault with them.

David Kane's report on the discussion is rather different:

When the history of this sad saga is written, the moment when a past president of the American Statistical Society termed the Lancet numbers "not credible" will mark the end of the beginning of the debunking of these flawed studies.

I think this might be what Fienberg described as "carping".


Looks like Kane's perked up and is resurrecting his claim that the survey is an outright fraud:

1) It was amazing to see such an accusation [by Scheuren] at a professional gathering. Scheuren was not suggesting that Roberts himself was guilty of fraud. Instead, he seems to believe that there is no way that the Iraqi interviewers achieved a 98% response rate.

Then Kane himself:

Note that this point (a 98%+ response rate is ipso facto evidence of fraud) was what first made me (in)famous in Lancet circles last November.

Note that in this quote, Kane is stating quite clearly that he thinks Scheuren believes the survey's a fraud.

But it appears Kane is exaggerating Scheuren's commentary just a bit:

Scheuren made clear that, while he did not view the response rate as "credible," he did not think that it was absolutely impossible either.

As to this:

When the history of this sad saga is written, the moment when a past president of the American Statistical Society termed the Lancet numbers "not credible" will mark the end of the beginning of the debunking of these flawed studies.

Kane's claiming that a past president of the American Statistical Association has never been, and cannot in this case be, wrong?

Of course, Kane handwaves Scheuren's statement that he couldn't find anything seriously wrong with the analysis itself (which is sort of different than Kane's claims about the analysis).

Not sharing the data and code is a big concern and not as much allowed in this field as in climatology. I'm sure that turned heads. Also the peacenik bias of the authors. Not saying they did anything sneaky, but...

The response rate implying fraud is a strange item to pick on, because the original paper says they kept trying households until they got 40 responses, although the fact that some clusters included fewer than 40 households shows that they did not always do this.

Has Roberts changed his description of the collection methodology? Has the inconsistency between saying they'd keep trying households and giving a 'response rate' based on the fraction of the target number (40 per cluster) sampled been clarified?
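The inconsistency raised here, a "response rate" computed against the 40-household target rather than against the households actually approached, is easy to see with a toy calculation (all numbers below are invented for illustration, not taken from the surveys):

```python
# Toy illustration of two ways to define a cluster "response rate".
# All figures here are hypothetical, not from the Lancet surveys.

target_per_cluster = 40   # households the protocol aims to interview
approached = 46           # households actually knocked on (invented)
completed = 39            # interviews actually completed (invented)

# Definition A: completed interviews / households approached
rate_approached = completed / approached

# Definition B: completed interviews / target sample size
rate_of_target = completed / target_per_cluster

print(f"vs. approached: {rate_approached:.1%}")
print(f"vs. target:     {rate_of_target:.1%}")
```

With these made-up numbers, definition B reports 97.5% even though only about 85% of the households approached yielded an interview, which is exactly why the definition in use matters for the "98% response rate" argument.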

Thanks to Tim for bringing this article to our attention. I want to highlight two parts. First:

Roberts organized two surveys of mortality in Iraqi households that were published last October in Britain's premier medical journal, The Lancet. He acknowledged that the timing was meant to influence midterm U.S. elections.

Is that true? As far as I can tell, Roberts has always denied that the timing of L2 had anything to do with the US elections. (The timing of L1 is another matter.) Did Roberts really acknowledge this? I did not see him do so but Foy seems to have spoken with Roberts privately. Perhaps Tim or dsquared or one of the other Lancet experts can confirm this claim.

Second:

In The Lancet article, Roberts and other researchers produced eight pages of text, maps, charts and many figures, but the number was the show-stopper: 654,965 war-related deaths, from the 2003 invasion of Iraq to last summer.

The estimate covered everything from battlefield casualties to civilians dying for lack of routine medical care. President Bush last fall said the estimate wasn't credible.

"Do I believe 650,000? No," Fienberg said. "Do I believe a lot of people are being killed? Yes."

So is Professor Fienberg (a well-known and respected statistician) just another part of the denialist cabal? Help us out Tim! Me, Michael Spagat, Iraq Body Count, President Bush, Tony Blair, Fienberg and all the other Lancet critics agree that lots of people are being killed and that the 650,000 number is very wrong. If Fienberg thinks that the Lancet results are way too high then why don't you criticize him as well? Just asking!

By David Kane (not verified) on 07 Aug 2007 #permalink

Here's a little story for TCO, David Kane, and all the other non-epidemiologists who think that refusal to release the data is "unusual" in public health circles:

a year or two ago, I was working for a large research institute in Australia with about 10 people on several different analyses of various parts of a major national survey. This survey had been conducted by a government department, and we had a contract with them to use their data. Even given that we had that contract with them, their own institutional ethics approval for the survey prevented them from releasing any information to us about the clusters which might have led to the identification of individuals. I was trying to develop what is called a "complex sample plan" in SPSS to disseminate to all my co-workers, and needed this information. Because the department couldn't release the data, my organisation had to pay for me to go to Canberra to work in that department on one of their computers, recoding all the cluster-level data so that it was incapable of identifying clusters, and in some cases aggregating or correcting clusters so that they didn't have too few people. Because the complex sample plan needs this information, and I didn't know what information I would need until I got the plan running, I had to do the whole thing - with documentation - in their office. I couldn't take the data to my hotel, even - everything had to be done in work hours, while the data custodian was there. And we had a contract with these people.

So when you complain about this being "unusual", remember this story, stick it in your pipe and smoke it. Health data is not climatology or whatever pointless data Kane works on; the ethical constraints attached to health data are much greater. In this case, we are talking about information which could potentially identify individuals in a country being terrorised by death squads. Don't expect to be getting that information any time soon, regardless of how many times you blather about being "honest" or "well-meaning". And don't pretend that this is unusual or suspect. The only suspect thing here is David Kane's constant accusations of fraud.

Also, David Kane, I think it is in very poor form for you to continue posting attacks on the Lancet team when you have clearly stopped replying to the criticisms of your own work. If you aren't brave enough to answer the mounting criticisms of your own paper, why do you think anyone should listen to you mounting yet more criticisms of someone else's?

And why should we think your attitude is either well-meaning, genuine, or honest?

SG,

1) I have never heard of a situation in which academic authors release data to some critics but not to others. The Lancet authors have released data to some people (like me) but not to others (Spagat et al and, I think, one other group). Have you ever heard of such a thing? If so, please tell us.

2) I agree that it is not unusual for people to refuse to share data.

3) I am trying to process all the excellent criticisms of the paper. A new (better!) draft will be forthcoming. When it is done, I hope that Tim might post it so that you and others can make comments. With luck, it will deal with the previous criticisms.

By David Kane (not verified) on 07 Aug 2007 #permalink

David, I have never heard of a situation where academic authors oblige people who accuse them of fraud by giving them access to their data. Have you?

1) Yes! Roberts et al gave me their data!

2) Spagat et al have never accused Roberts of fraud, as far as I know. Their point is Main Street Bias. Why should that critique preclude them from seeing the data?

3) I'll ask again: Have you ever heard of academics sharing data with some critics but not others? I haven't. That would make Roberts et al, uh, "unusual," wouldn't it?

By David Kane (not verified) on 07 Aug 2007 #permalink

David, 650,000 is the point estimate. L2 is consistent with the real number being 300,000, so you don't have to concoct some argument that there are fatal flaws in the survey if you think that is the likely number. I criticise Kane, Spagat, IBC, Fumento etc etc for their erroneous arguments.

Thanks to Tim for the reply.

1) Any thoughts on the election timing issue? I won't really believe that Foy has this right unless you or someone else with good information can confirm.

2) You claim that 300,000 is "consistent" with L2. Perhaps. I guess that zero is also consistent. But isn't the 95% confidence interval for L2 392,979 to 942,636? So 300,000 is not very consistent with that. Perhaps you meant 400,000?

3) If Fienberg says that 650,000 is wrong, isn't he (implicitly) saying something is wrong with the raw data? I doubt that he disputes that the Stata commands were correct. Again, this is like Pedersen's 150,000 estimate. When serious people start claiming that the mean estimate from L2 is too high, then there is a problem. If Fienberg believes that the true number is 400,000 (and he is right), then there was almost certainly something wrong with the raw data. There is almost no way for the raw data to be accurate and for Pedersen to be correct. Or am I missing something?

By David Kane (not verified) on 07 Aug 2007 #permalink
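The numerical argument in Kane's point 2 can be made concrete with a back-of-envelope calculation. This sketch treats the published interval as if it were a symmetric normal 95% CI, which it is not (the published L2 interval is visibly asymmetric), so the result is only indicative:

```python
import math

# Figures quoted in the comment above: L2 point estimate and 95% CI.
point = 654_965
lo, hi = 392_979, 942_636

# Implied standard error IF the interval were a symmetric normal 95% CI
# (a simplification: the actual interval is asymmetric).
se = (hi - lo) / (2 * 1.96)

# How many implied standard errors below the point estimate is 300,000?
z = (point - 300_000) / se

# Two-sided tail probability under the same normal approximation.
p = math.erfc(z / math.sqrt(2))

print(f"implied SE ~ {se:,.0f}, z ~ {z:.2f}, two-sided p ~ {p:.3f}")
```

Under this rough sketch, 300,000 sits about 2.5 implied standard errors below the point estimate, while 400,000 falls just inside the interval, which is the distinction the comment draws between the two figures.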

Any thoughts on the election timing issue?

This has absolutely no relevance to the accuracy of the survey, nor to your repeated claims that the survey is not only inaccurate but fraudulent.

When serious people start claiming that the mean estimate from L2 is too high, then there is a problem.

Classic science denialism argument. "Foo, Bar, and Fubar all say there are problems with (choose one or all of global warming, evolutionary biology, HIV causes AIDS, etc.), therefore there's a real problem."

Sorry, you should know better than to argue in this way.

By now, Kane's paper is nothing but a joke.

On the theoretical side, Kane has a fundamental misunderstanding of frequentist statistics, causing his entire analysis to be completely meaningless. I have tried repeatedly to get him to address this point, but he won't (or can't). [In brief, expressions like P(CMR > _this-or-that) are completely meaningless in frequentist statistics since parameters are not assumed to be random variables - Kane rests his entire "analysis" on manipulating those expressions.]

On the substantive side, he doesn't seem to know exactly what his point is (is it the wide CI for CMRpost if you include Fallujah, or is it the 98% response rate, or is it that some important people like Blair and Bush and Fienberg simply cannot _believe_ that 650,000 people have been killed in Iraq?).

On the personal side, he is clearly onto some self-promotion trip, reminding everyone that he is (in)famous in "Lancet circles", and rushing to get his name into Malkin's high visibility blog.

Why would anyone take him seriously?

David, re: your 2. Assessing Main Street Bias would require access to the raw data at its most detailed cluster level (to, you know, find the main streets). This is almost certainly a guaranteed no-no, for all the obvious reasons (need I say "IRB" until I'm blue in the face?).

Re: your 3. See 2. Also, consider that Spagat et al aren't epidemiologists as far as I know, and neither are you. "Serious" public health researchers have no obligation to release data against their IRB rules to a bunch of dilettantes on a mission to discredit a number.

I was trying to hint at this with 1: since they released the data to someone who has really been rather rude to them, might you perhaps be willing to extend to them an assumption of good faith?

TCO suggests that the Roberts et al. studies might be biased because of "The peacenik bias of the authors".

Yeh, that makes them really bad guys, not supporting aggression, theft, occupation and mass murder. I s'pose the only people that can be trusted to conduct impartial surveys are those who support rampant empire and its trappings.

By Jeff Harvey (not verified) on 07 Aug 2007 #permalink

Sortition wrote,

On the theoretical side, Kane has a fundamental misunderstanding of frequentist statistics, causing his entire analysis to be completely meaningless. I have tried repeatedly to get him to address this point, but he won't (or can't). [In brief, expressions like P(CMR > _this-or-that) are completely meaningless in frequentist statistics since parameters are not assumed to be random variables - Kane rests his entire "analysis" on manipulating those expressions.]

Yes, that indeed shows he has no understanding of frequentist statistics.

SG: I was making stuff up with my comment on epidemiology data sharing. I don't know squat about it. It is interesting that Tim's news story does have comments from the conference that some people had reservations about the study because of the unshared data and political leanings of the authors. ;)

Medical ethics seems like a key reason for not sharing data. I wonder (just curious) if there is some data that can be shared without risks to the patients, but that isn't because of hesitancy to have others find fault with it.

In general, I think there are differences in what is and isn't accepted in terms of data sharing, and they have more to do with the natural tendency of people not to want their work scrutinized. Even an honest man doesn't welcome an IRS audit. In crystallography, there is a long tradition of requiring data to be shared, deposited in archives, put in SIs, etc. In econometrics, there is a long tradition of sharing code. Such a tradition is beneficial. Many mistakes are found. And people do more careful work, knowing someone can check the details.

16: I've decided I don't trust those guys either.

SG writes:

David, re: your 2. Assessing Main Street Bias would require access to the raw data in its most detailed cluster level (to, you know, find the main streets). This is almost certainly a guaranteed no-no, for all the obvious reasons (need I say "IRB" until I`m blue in the face).

How many times do we need to go through this? While it is true that many groups (Spagat et al; Scheuren and his co-author) want more data than what the Lancet authors have provided (to check on Main Street Bias and other issues), the controversy here is: Why not release the same data that they have already released (to me and a dozen other groups) to Spagat et al? It is true that Spagat et al want other data, but, for a start, they would be eager to just see the data that everyone else has already seen. Don't believe me? Ask them!

I have never heard of a situation in which academic authors give data to some academic critics but not to others. Got a counter-example? If so, tell us.

By David Kane (not verified) on 08 Aug 2007 #permalink

"...political leanings of the authors."

Is there any proof of these supposed "leanings"?

Currently it looks like Wingnut A is saying "They're lying, they must be commies" and Wingnut B is going "They're commies, they must be lying."

Roberts actually said that they released the L1 survey before the 2004 election because they were hoping that the high level of casualties attributed to allied bombing might lead one or both candidates to promise to investigate what was happening and look at ways to reduce the death toll.

By Ian Gould (not verified) on 08 Aug 2007 #permalink

Nice to see that someone else wants to organize another Iraq mortality survey. That's the only way to settle this issue--sincere skeptics should have been pushing for this all along.
It's clear the only way the death toll couldn't be in the mid to high hundreds of thousands is if the data is wrong, and carping about the analysis in either paper has just been a waste of time (though personally I picked up one or two pieces of information about statistics).

Slightly off-topic, but Bill Clinton told some people at Aspen a month or two ago that as best he could tell, the death toll in Iraq was 300-400,000. It'd be nice to know if this number is based on anything--if the US government has done any classified studies presumably HRC might have access. Or maybe he pulled it out of his rear.

I don't have a link to the talk itself, though I've looked, so you'll have to settle for this--

http://www.tinyrevolution.com/mt/archives/001631.html

By Donald Johnson (not verified) on 08 Aug 2007 #permalink

Dumb--of course the talk is right there at the link I provided. I forgot about that. And yeah, Clinton says "Near as I can tell, the death rate in Iraq is three or four hundred thousand."

By Donald Johnson (not verified) on 08 Aug 2007 #permalink

One thing that hasn't been commented on yet is "'Verifying these numbers wouldn't be too difficult, so why isn't it being done?' asked Jana Asher, a graduate student at Carnegie Mellon University".

I thought this, too. Why isn't Jana assembling a team, flying to Baghdad and travelling all round Iraq using their excellent vernacular Arabic to rerun the survey? Why isn't Kane? Spagat? It could be great fun; like one long pub-crawl. They'd make lots of new friends.

Kane at 20: Spagat et al are amateurs. No-one has to give them anything. Roberts et al are also not required to give the data to Laura Bush just because she wants to show her family members aren't evil. Do I need to say this again: dilettantes are not welcome here. How many times do I have to say that?

Your question is disingenuous anyway. How many times are epidemiologists even asked to share their data, let alone share it with physicists?

"Not sharing the data and code is a big concern and not as much allowed in this field as in climatology. I'm sure that turned heads. Also the peacenik bias of the authors. Not saying they did anything sneaky, but..."

Posted by: TCO

Any *honest* person would realize that the probability of meteorologists being killed for participation in weather data collection is rather low.

Actually, dave heasman, Jana's question is valid and I read it differently from the way you apparently did. There was a large scale poll taken in Iraq in early 2007, and it asked very sensitive political questions. There was even a question about casualties in the household, though the study wasn't designed to provide mortality figures. So yes, it is perfectly possible to do surveys in Iraq, even in early 2007 and if the United States government had any interest in determining the casualty figures and in releasing this info they could fund a study by another group (hopefully an independent one respected by everyone). But they don't make any effort (at least, not publicly) to determine this. I assume they prefer the issue to remain foggy.

By Donald Johnson (not verified) on 08 Aug 2007 #permalink

[In brief, expressions like P(CMR > _this-or-that) are completely meaningless in frequentist statistics since parameters are not assumed to be random variables - Kane rests his entire "analysis" on manipulating those expressions.]

An alternative way to interpret Sortition's criticism is that Kane's "Bayesian" arguments treat a frequentist confidence interval as if it were a Bayesian "credible interval". This is only a legitimate move conditional on the assumption of a diffuse prior (Wikipedia IIRC says it isn't legitimate at all, but I think that's too harsh). Kane does in fact assume a diffuse prior for the death rate, but this is clearly wrong, as a diffuse prior puts nonzero probability on negative death rates.

Or, to give the guts of the intuition (I think I already posted this once on the Thread From Hell), consider the fundamental philosophical difference between Bayesian and frequentist statistics.

For a frequentist, the estimate s^ is one's estimate of the true value s which is a non-random variable and thus has no distribution. The confidence interval for s^ is the range of possible values you would expect to see in an infinitely repeated sequence of trials if the true value of s was s^.

For a Bayesian, s^ is the expectation of a random variable, which has a distribution which reflects the degree of belief that you are going to assign to the proposition s=s^ for different values of s^.

We can see from this that in estimating death rates, frequentist confidence intervals might give very different numbers from corresponding Bayesian credible intervals. The reason is that while, in an infinite series of trials, you might conceivably fuck up the experiment so badly as to measure a negative death rate (and thus have the confidence interval extending into negative territory), the *actual* probability of a negative death rate is zero, so the credible interval wouldn't. And of course, this also goes for positive-but-very-improbable death rates like 1 or 2 per 100k.
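dsquared's contrast between the two kinds of interval can be sketched numerically. The toy example below uses invented counts and a flat prior on nonnegative rates (just one arbitrary choice of prior confined to nonnegative values); it shows a Wald-style frequentist CI for a small Poisson rate dipping below zero while the corresponding credible interval cannot:

```python
import math

# Hypothetical tiny survey: 2 deaths observed in 1,000 person-years (invented).
deaths, exposure = 2, 1000.0
rate_hat = deaths / exposure

# Frequentist Wald 95% CI for a Poisson rate: rate_hat +/- 1.96 * sqrt(deaths)/exposure.
se = math.sqrt(deaths) / exposure
wald = (rate_hat - 1.96 * se, rate_hat + 1.96 * se)

# Bayesian: a flat prior on nonnegative rates gives a Gamma(deaths + 1, exposure)
# posterior. Take a central 95% credible interval numerically on a grid.
grid = [i * 1e-5 for i in range(1, 2001)]                   # candidate rates in (0, 0.02]
post = [r**deaths * math.exp(-exposure * r) for r in grid]  # unnormalised Gamma density
total = sum(post)
cdf, lo_b, hi_b = 0.0, None, None
for r, weight in zip(grid, post):
    cdf += weight / total
    if lo_b is None and cdf >= 0.025:
        lo_b = r
    if hi_b is None and cdf >= 0.975:
        hi_b = r

print(f"Wald CI:           ({wald[0]:.5f}, {wald[1]:.5f})")  # lower bound is negative
print(f"Credible interval: ({lo_b:.5f}, {hi_b:.5f})")        # lower bound stays positive
```

The confidence interval's negative lower bound is not a bug in the procedure, merely a reminder of what it does and does not promise; the credible interval respects the zero bound because the prior does, which is the point of the passage above.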

Regarding the timing of the 2006 mortality study and the mid-term elections in the U.S.: I commissioned this study and I began with an email to Les Roberts in October 2005. It was my intention to have the survey conducted in winter 2006 and released by spring. For various reasons, including the Samarra bombing (making it too dangerous for survey takers to go into the field), the survey could not be taken until May-July. It then took several weeks to enter and analyze the data, and write the report and article for The Lancet. Les Roberts was not involved in the early part of this process; Gilbert Burnham led the effort from December 2005 to completion. Les entered the picture again in the analytical and report-writing phase. He never had any decision making authority about when the results would be released. We decided to set a deadline in early October, after which we would not release until after the election. The deadline was made by one day. There was never an expectation that the results would affect the elections, since the war was already very unpopular anyway. I do not know, by the way, the political affiliations or views of anyone associated with the survey, apart from Roberts.

The quotation attributed to Roberts in the Daily Herald was some kind of mistake. I doubt very much that Roberts said what is reported. But, in my view, informing the U.S. public on this crucial issue is not merely a right but perhaps an obligation, particularly since so few news media outlets report on Iraqi casualties at all. The human cost of the war--as Oxfam reported last week---is exceptionally high (one million households headed by widows, for example; one-third of children undernourished, etc), yet receives little attention. Why quibble about the release date of this survey? Such quibbling strikes me as consistent, however: trying to discredit important information by trivializing, distracting, distorting, and generating innuendo. In these things, Kane is a master.

John Tirman, thanks for clarifying that. I'm sure we both wish there was no need for that sort of clarification at all.

dsquared:

You are really being too generous here. Kane's paper is a complete mess. He clearly is quite confused about what CIs are, and is in general mathematically out of his depth.

His claim that the existence of a positive outlier is an indication that the distribution mean could be significantly lower than the median, say, makes no sense, so he has to back it up by using the appearance of mathematical sophistication, which he clearly lacks.

BTW, you write:

> The confidence interval for s^ is the range of possible values you would expect to see in an infinitely repeated sequence of trials if the true value of s was s^,

which is inaccurate. The confidence interval is for s, not for the estimate s^. The 95% confidence interval for s is a random interval that will cover s 95% of the time in an infinitely repeated sequence of trials.
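Sortition's corrected definition, that the 95% interval is the random object and covers the fixed parameter s about 95% of the time over repeated trials, is easy to check by simulation. This is a standard known-variance normal-mean example, nothing to do with the Lancet data:

```python
import math
import random

random.seed(0)
s, sigma, n, trials = 10.0, 2.0, 25, 2000  # true mean (fixed), known sd, sample size
half_width = 1.96 * sigma / math.sqrt(n)   # 95% CI half-width for a normal mean

covered = 0
for _ in range(trials):
    sample_mean = sum(random.gauss(s, sigma) for _ in range(n)) / n
    # The interval is what varies from trial to trial; s never moves.
    if sample_mean - half_width <= s <= sample_mean + half_width:
        covered += 1

print(f"coverage ~ {covered / trials:.3f}")  # close to 0.95
```

The point of the simulation is that "95%" is a property of the interval-generating procedure across repetitions, not a probability statement about the fixed s, which is exactly the distinction Kane's P(CMR > ...) expressions blur.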

Paul Foy of the Associated Press wrote:

Roberts organized two surveys of mortality in Iraqi households that were published last October in Britain's premier medical journal, The Lancet. He acknowledged that the timing was meant to influence midterm U.S. elections.

John Tirman now claims:

The quotation attributed to Roberts in the Daily Herald was some kind of mistake. I doubt very much that Roberts said what is reported. But, in my view, informing the U.S. public on this crucial issue is not merely a right but perhaps an obligation, particularly since so few news media outlets report on Iraqi casualties at all.

What are we to make of this? Foy is either right or wrong. If he is right, we know something for sure that many of us have suspected all along. If he is wrong, then Tirman/Roberts should seek a correction. So far, no one has denied that Foy is correct. Tirman may "doubt" this, but doubt is not denial. Why doesn't Tirman ask Roberts? E-mail is a handy tool.

By the way, this part:

Les Roberts was not involved in the early part of this process; Gilbert Burnham led the effort from December 2005 to completion. Les entered the picture again in the analytical and report-writing phase. He never had any decision making authority about when the results would be released. We decided to set a deadline in early October, after which we would not release until after the election.

is largely inconsistent with various statements made by Roberts. He claims (can't find a cite right now) that he was insistent that the study appear before the election lest the surveyors be in danger from Iraqis who thought that they (the surveyors) had purposely withheld the results.

And, if elections have nothing to do with things, why is there such a big difference between survey completion and publication in L1 (finished Sep 22(?)) and L2 (finished July 10)? Just happened that one paper took two months longer to write up than the other?

By the way, is Sortition claiming that dsquared doesn't understand statistics? Say it ain't so!

By David Kane (not verified) on 12 Aug 2007 #permalink

David, care to explain why the US government hasn't funded an independent study to determine the number of casualties? Or why the subject seems to interest them so little? Or are Les Roberts' political motives more important than the dishonesty of a government which lied its way into a war and refuses to count the number of people who have died as a result?

And since you seem so interested in motives, what are yours? You seem to want to discredit the two Lancet reports by any argument you can think of. If you were interested in the number of deaths while being skeptical of the Lancet figures (a position I respect), you'd be more outraged by the US government's total lack of interest in finding out the truth and letting the American people and the rest of the world know what we've done. Iraq Body Count is intensely skeptical of the Lancet numbers, but they agree with Roberts that the death toll, whatever it is, is the responsibility of the US government. What's your opinion on that? And what is your opinion of the US government's lack of interest in finding out the truth? And what exactly is wrong with someone wanting to inform the public about the number of deaths in Iraq and possibly influence votes with this information? You act as though this is a discreditable motive. Why?

Not that I give a flying f*** about this, but there's no inconsistency in Roberts "insisting" on a particular publication date and someone else saying that he had no authority to make the decision. But it's idiotic to argue about this. Everyone knows Roberts is against the war--what you have to do is show that this fact discredits the paper.

By Donald Johnson (not verified) on 12 Aug 2007 #permalink

A Republican ex-Marine with a precarious background in statistics claims the famous statisticians who produced a mortality study in Iraq are wrong. And politically biased.

In Argentina this week, a doctor who was reprimanded for some anti-semitic remarks, filed suit for being discriminated for his anti-semitic views.

Dr Gunter Nimtz and Dr Alfons Stahlhofen, of the University of Koblenz, claim to have broken the speed of light. Dr Nimtz told New Scientist magazine: "For the time being, this is the only violation of special relativity that I know of."
Einstein, a famous physicist, "made some obvious mistakes", added Stahlhofen. As a confessed socialist, Einstein was obviously biased against fast cars.