Profile

Tim Lambert (deltoidblog AT gmail.com) is a computer scientist at the University of New South Wales.

Science on Lancet study

Category: LancetIraq
Posted on: October 21, 2006 3:28 PM, by Tim Lambert

I guess that the next time a new physics study comes out Science will ask epidemiologists what they think of it. You see, John Bohannon, the reporter for Science, decided that opinions from a couple of physicists and an economist were more important than getting comments from experts in epidemiology.

Bohannon's report on the Lancet study (subscription required) states:

Neil Johnson and Sean Gourley, physicists at Oxford University in the U.K. who have been analyzing Iraqi casualty data for a separate study, also question whether the sample is representative. The paper indicates that the survey team avoided small back alleys for safety reasons. But this could bias the data because deaths from car bombs, street-market explosions, and shootings from vehicles should be more likely on larger streets, says Johnson. Burnham counters that such streets were included and that the methods section of the published paper is oversimplified. He also told Science that he does not know exactly how the Iraqi team conducted its survey; the details about neighborhoods surveyed were destroyed "in case they fell into the wrong hands and could increase the risks to residents." These explanations have infuriated the study's critics. Michael Spagat, an economist at Royal Holloway, University of London, who specializes in civil conflicts, says the scientific community should call for an in-depth investigation into the researchers' procedures. "It is almost a crime to let it go unchallenged," adds Johnson.

Spagat offers more on his own site:

Researchers at Royal Holloway, University of London and Oxford University have found serious flaws in the survey of Iraqi deaths published last week in the Lancet.

Professor Michael Spagat of Royal Holloway's Economics Department, and physicists Professor Neil Johnson and Sean Gourley of Oxford University contend that the study's methodology is fundamentally flawed and will result in an over-estimation of the death toll in Iraq.

The study suffers from "main street bias" by only surveying houses that are located on cross streets next to main roads or on the main road itself. However many Iraqi households do not satisfy this strict criterion and had no chance of being surveyed.

Main street bias inflates casualty estimates since conflict events such as car bombs, drive-by shootings, artillery strikes on insurgent positions, and market place explosions gravitate toward the same neighbourhood types that the researchers surveyed.

This obvious selection bias would not matter if you were conducting a simple survey on immunisation rates for which the methodology was designed. But in short, the closer you are to a main road, the more likely you are to die in violent activity. So if researchers only count people living close to a main road then it comes as no surprise that they will over count the dead.

This is another example of what Daniel Davies called the "devastating critique", where the slightest flaw is grounds for dismissing the findings. Spagat, Johnson and Gourley (SJG) assert that it is a serious flaw and will result in an over-estimate.

Now, what the study actually states is this:

The third stage consisted of random selection of a main street within the administrative unit from a list of all main streets. A residential street was then randomly selected from a list of residential streets crossing the main street. On the residential street, houses were numbered and a start household was randomly selected.

Contrary to Bohannon's report, they did not avoid small back alleys for safety reasons. And contrary to the claims by SJG, they don't just survey houses close to or on main roads. According to the description above, they don't survey main roads at all, and the houses on cross streets aren't necessarily close to the main road -- a random house was chosen. To the extent that deaths tend to be concentrated on main roads, the survey will produce an under-estimate. Further, while deaths are more likely to occur on a main street, what matters for the study is not where people died but where they lived. If someone dies because a car bomb is set off in a marketplace, that person will be counted if the Lancet team surveys his home. "Main-street bias" exists only to the extent that people living on main streets would be killed in their homes. Deaths from car bombs and drive-by shootings will mainly be of people out in the open, not people in their homes. So while "main-street bias" would produce an under-estimate, it's unlikely to make much difference and it is wrong to call it a serious flaw.

SJG are also incorrect to argue that "main-street bias" would not matter if you were measuring immunisation rates. Houses in back alleys are likely to be smaller and the residents poorer and less likely to get immunised. It may well be a bigger source of bias for immunisation rates than for violent death.

Having said all that, according to Bohannon, Burnham says that all streets were included and that the published description was oversimplified. In that case, there isn't any bias at all.

I would like to see a more complete description of the actual procedure followed, but the procedure as described biases the estimate downwards. Note that the first study used GPS for sampling and produced similar numbers over the same time frame so the different sampling method doesn't seem to have made much difference.

Bohannon finishes with:

For now, Spagat says he is sticking with casualty numbers published by the United Nations Development Programme (UNDP). A UNDP survey of 21,668 Iraqi households put the number of postinvasion violent deaths between 18,000 and 29,000 up to mid-2004. "When a survey suggests so much higher numbers than all other sources of information," he says, "the purveyors of this outlier must make a good-faith effort to explain why all the other information is so badly wrong."

This is inaccurate. The UNDP numbers are just for "war-related" deaths and do not include criminal homicides, of which there have been vast numbers. Carl Bialik writes about the UNDP study:

But the death question was merely one of dozens, with the average interview lasting 83 minutes. The primary purpose of the survey was to assess living conditions and infrastructure.

Lead researcher Jon Pedersen, FAFO's deputy managing director, also told me that he had reservations about the Iraqi statistical workers who carried out the research -- not their impartiality, but their techniques. Statistical workers who have worked under dictatorships "tend to develop a fairly loose relation to data because they know that things will be changed by the government, and that leads to a sloppiness in field work," Mr. Pedersen says, though he adds that the statistical chief, installed after the removal of Saddam Hussein, was committed to accuracy.

Nor is the disagreement between the surveys as large as the raw numbers suggest. Because of the much shorter time frame and the narrower category of deaths, the UNDP number is only about half of the number from the new Lancet study. Given that it is easier to neglect to mention a death than to produce a death certificate for a death that never happened, the Lancet number is more likely to be correct. But even if we decide that the UNDP shows that the Lancet is somehow over-counting by a factor of two, that leaves us with a death toll of about 300,000. Will the various Lancet critics accept this number? Somehow I doubt it.

Finally, I demonstrate the power of the "devastating critique". Spagat and Johnson have a paper where they fit a power law to the distribution of deaths in conflicts in Colombia and Iraq. Guess what data source they use for Iraq? That's right, the Iraq Body Count. Their paper is seriously flawed because nowhere do they even consider the possibility that the IBC's coverage of violent deaths in Iraq is incomplete. They ascribe changes in the distribution of deaths to changes in the conflict when they could equally have been due to changes in the reporting rate of deaths. Their study is "fundamentally flawed". See what you can do with the devastating critique?

Their use of IBC data may explain why Spagat defended the IBC's work in comments here a few months ago.

Comments

#1

He also told Science that he does not know exactly how the Iraqi team conducted its survey; the details about neighborhoods surveyed were destroyed "in case they fell into the wrong hands and could increase the risks to residents."

Well then how can anyone ever check how they did it? Doesn't this seem kind of odd?

Posted by: tc | October 21, 2006 3:58 PM

#2

Is there a serious problem of under-employment in physics? Nothing left to do? I keep hearing accounts of a core of high-prestige physicists putting all their efforts into an a-empirical string theory paradigm, with the fringe physicists then applying superior intellects and an absence of substantive competence to fields like social network analysis and computer science, and now it seems epidemiology!

You might be really good applied mathematicians, guys, but you really ought to read the literature...

Posted by: BrendanH | October 21, 2006 4:37 PM

#3

Oddly enough a commenter on d-squared's latest CT post referred to this as a "devastating critique", evidently unaware of the irony. At least these guys are numerate, so maybe when they spell it out in full there will be something to chew on. They give some hints here about how they plan to demonstrate and quantify the bias but it will be a few weeks before it's ready.

Without a detailed description of the selection of households it's hard to guess whether they are on to anything.

Posted by: Kevin Donoghue | October 21, 2006 4:41 PM

#4

Spinning like a top Tim. It is a joy to watch. ;-)

Posted by: joshd | October 21, 2006 5:31 PM

#5

Josh, there's a bit more at stake here than clashing egos. The critique may be valid and if so, after a period of yelling and shouting the scientific community will probably come to a consensus about it. Of course if the point is to determine how much harm the Iraq War has caused, it seems like everyone would be calling for a new study. The US, British, and Iraqi governments are morally obligated (and maybe legally obligated for all I know) to find out the degree of harm their policies have caused--it shouldn't be up to a bunch of outsiders either using media reports or doing survey work under very dangerous conditions without government assistance to try and determine it themselves. But we have this interesting circumstance where apparently IBC and the Lancet teams have a kind of grudge match going. Maybe it's time to move past that?

Posted by: Donald Johnson | October 21, 2006 6:06 PM

#6

The "devastating critique" has been described elsewhere as a "magic bullet argument" or a "zinger". I think it's a side effect of the way that internet "debate" works, where the most effective arguments in terms of propagation are the very short ones.

What we end up with are "debates" where people take a long and extensive piece of writing and go through it looking for not only a flaw, but a flaw which they can then write a mocking response to. It's the online equivalent of a soccer goal.

This was a mocking description of a Usenet group, but it accurately describes most internet "debate":

This [rec.arts.sf.written] is a group of readers, interested in discussing books which we fit into the vague category "sf," along with other topics we wander into along the way. We are not an academic debating society. We do not follow Robert's -- or anyone's -- Rules of Order. We do not argue fairly. We have rather divergent tastes, and don't always respect those who disagree with us. We quite often do not even listen to those disagreeing with us. Some of us are sane and relatively normal, while others appear to be psychopaths or monomaniacs. We often wildly misrepresent books if that will make our posts more entertaining or our arguments seem stronger. The "zinger," sir, is what we're here for.

Facts are nice, but we're not going to let them get in the way of a good story. We do this for our own amusement, not for anyone else or in pursuit of any larger aim. If we are amused ourselves, then we have succeeded. If you are amused, then you have become one of us.

Posted by: brokenlibrarian | October 21, 2006 6:47 PM

#7

I've already addressed most of Tim's points on another thread, but let me add a few things. First, looking at a map of Basrah, I don't think that "back alley" is a good description of the interior streets that Burnham et al didn't sample. My guess is that these are quiet, pleasant, and wealthier blocks.

Second, even if (contrary to what they wrote) Burnham et al did sample every block, there's still a bias, since they're sampling addresses, not people. Hence, they're undersampling multi-family housing.

Third, it is a completely standard critique of this kind of sampling scheme that it misses households "off the beaten track." That you dismiss it suggests only that you should read a book about survey sampling. I recommend Kish (1965). I don't remember specifically what he says, but I'd be astonished if he didn't make this point.

Finally, when Roberts presented his Congo work to the NSF a few years ago, he acknowledged the problems with sampling proportional to area rather than proportional to population, and was told that he ought to consult more with survey statisticians. The NSF committee seems to have offered a pretty gentle critique, but as I read it, they were saying that Roberts lacked expertise in this area.

Posted by: Ragout | October 21, 2006 6:49 PM

#8

Donald, we've talked a lot before and I think we've had some good conversations etc., but stick a sock in it.

Given Tim's behavior toward critics, how in the world you manage to single me out - again - is ridiculous. There is nothing at stake on this blog more than egos Donald. Wake up. Every posting for the last weeks has been nothing but an exercise to deflect and evade any criticism of this study, regardless of its merits, and always dripping with condescension. This has nothing to do with the harm of the iraq war or anything else. Yeah, and I've lowered the tenor of this by my comment poking fun at it.

Look at the post above. Recall his former position on ILCS. This guy is dancing up a storm here. Forgive me, it's funny. And I had to comment to that effect. And look at how he's attempting to belittle these guys because, oh my gosh, they are not epidemiologists (as soon as this appeared several IBC-ers predicted this would be the deflection line).

Come on Donald.

Posted by: joshd | October 21, 2006 6:50 PM

#9

How can Tim Lambert justify this statement:

Contrary to Bohannon's they did not avoid small back alleys for safety reasons. And contrary to the claims by SJG, they don't just survey houses close to or on main roads. According to the description above, they don't survey main roads at all, and the houses on cross streets aren't necessarily close to the main road -- a random house was chosen.

Eh? The quote that you provide doesn't say anything of the sort. Read the Lancet quote again and you'll see.

They took from main streets and streets that intersect with main streets. Basically, they thought Iraq was like the US where nearly every town has a grid system of roads rather than the networks of roads away from main roads that you find in Europe and any old city in the world. That's why the figures are grossly inflated.

Furthermore, the highly respected Oxford scientists contacted the Lancet team and discovered they didn't really know how their study was conducted on the ground, as you can see from an email exchange, again bringing the whole project into serious doubt: http://www.medialens.org/forum/viewtopic.php?t=1949

The outrageous spinning simply won't wash; experts will continue to speak out using fact and reason.

Posted by: Ryan | October 21, 2006 7:17 PM

#10

Come on Josh, you can do better than that. What about the other criticisms of these people? You know, the ones that centre on the faults they claim to have found in the Lancet paper? Can you not see them?

Posted by: guthrie [TypeKey Profile Page] | October 21, 2006 7:24 PM

#11

Josh, when you first came here I remember Tim offering to let you put up a guest blog here. I think you could probably have the kind of interaction here that Heiko gets (which, to be honest, isn't welcoming from everyone if I recall, but it's generally respectful).

Everyone's ego is involved in this debate, mine included. It's another reason to cut back on the sarcasm if you want to try and persuade people that they are wrong.

Socks don't taste good, btw.

Posted by: Donald Johnson | October 21, 2006 7:35 PM

#12

Kevin: "Oddly enough a commenter on d-squared's latest CT post referred to this as a "devastating critique", evidently unaware of the irony."

Pennies would get you dollars that the person didn't come up with that term on their own; they were passing propaganda around.

Posted by: Barry | October 21, 2006 8:15 PM

#13

TC: I have had some experience with interviews of a much different nature, but we were always quick to tell the people we asked questions of that they would not be identified and that we would destroy all personal records after the study was done. So this is not unheard of.

Ryan: I just wasted 10 minutes of my life going through the thread that you linked to and as far as I can see they are still using the same flaw that Tim pointed out (i.e. that the survey did not use main roads).

Posted by: John Cross | October 21, 2006 8:56 PM

#14

John, the Oxford people have spoken to the Lancet study team extensively by phone and by email many times. I think they know what they are talking about better than Tim who is basing his judgment on a Lancet quote he found on the medialens message board. The Lancet don't even know how their study was conducted. Face it, guys, there are serious questions here; bluff and spin won't do. A proper report is coming out that will address this.

Posted by: Ryan | October 21, 2006 9:06 PM

#15

Ryan, the study SJG are purportedly critiquing clearly states:

"The third stage consisted of random selection of a main street within the administrative unit from a list of all main streets. A residential street was then randomly selected from a list of residential streets crossing the main street. On the residential street, houses were numbered and a start household was randomly selected." [1]

Whereas the Royal Holloway University page states:

"The study suffers from 'main street bias' by only surveying houses that are located on cross streets next to main roads or on the main road itself." [2]

The team pursued residential areas by deduction. Les Roberts said the wording (top) is something of an oversimplification, generally intended for the lay person, and not an exact, detailed description of methodology.

It is not enough to state the obvious and say 'all roads lead to....a car bomb!' As with attacks on street markets, mosques, office blocks, banks, police/army recruitment centers, police stations etc., the victims are likely to be a random cross section of society.

[1] http://tinyurl.com/et35c [2] http://tinyurl.com/ymk7xe

Posted by: SMB [TypeKey Profile Page] | October 21, 2006 9:10 PM

#16

Physicists do have an annoying tendency to stick their noses into things they know little or nothing about -- a tendency that borders on the silly.

Ironically, the best physicists like Einstein and Feynman are the exception rather than the rule in believing that they are not qualified to weigh in on matters outside their field. The lesser physicists are usually the least likely to abide by this rule.

Posted by: JB | October 21, 2006 9:47 PM

#17

I wonder what the denialists will say if Johnson and Gourley's study ends up producing a higher estimate than the Lancet study.

Posted by: Ian Gould | October 21, 2006 10:00 PM

#18

SMB, I refer you to my previous post. The Oxford people have spoken to the Lancet study team extensively by phone and by email many times. They won't have made that sort of mistake. The Lancet study is dishonest for even pretending they know how the methodology was done when they clearly don't. They also think Iraq has a grid system of roads when they do not. How can we trust these people?

Posted by: Ryan | October 21, 2006 11:08 PM

#19

tc writes:

He also told Science that he does not know exactly how the Iraqi team conducted its survey; the details about neighborhoods surveyed were destroyed "in case they fell into the wrong hands and could increase the risks to residents."

Can someone provide the paragraphs surrounding this quote so we see the context? I am not a Science subscriber.

I am especially interested in more details about this quote from the study.

The interview team were given the responsibility and authority to change to an alternate location if they perceived the level of insecurity or risk to be unacceptable.

Did this happen a lot or a little? Or does Burnham not know because the information was destroyed?

Posted by: David Kane | October 21, 2006 11:25 PM

#20

So the eclectic physicist Neil Johnson is going around tagging everything as "main street bias," and this "riposte" is making its way all over the internet.

What he has described is (possibly) main street SAMPLING BIAS, i.e. the population of Iraq is not being sampled uniformly.

But as much as people like to scream it on the Internet, sampling bias is not the same thing as, nor does it imply, MORTALITY BIAS.

Perhaps people who live on the main streets are more likely to be hit by collateral damage from car bombs. But they spend less time walking on the street, so they are less likely to be hit by stray bullets. etc.
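To put numbers on that distinction, here is a minimal sketch. Every figure in it is made up for illustration (the 40/60 population split, the death rates, and the assumption that the sample comes entirely from the "near main street" group); none of it comes from the Lancet study or from SJG.

```python
def survey_estimate(groups):
    """Death rate a survey would report, weighting groups by their share of the sample."""
    return sum(g["sample_share"] * g["death_rate"] for g in groups)

def true_rate(groups):
    """Population death rate, weighting groups by their share of the population."""
    return sum(g["pop_share"] * g["death_rate"] for g in groups)

def make_groups(near_rate, far_rate):
    # Hypothetical setup: 40% of people live near main streets,
    # yet they make up 100% of the sample (pure sampling bias).
    return [
        {"pop_share": 0.4, "sample_share": 1.0, "death_rate": near_rate},
        {"pop_share": 0.6, "sample_share": 0.0, "death_rate": far_rate},
    ]

def bias_ratio(near_rate, far_rate):
    """Survey estimate divided by the true rate; 1.0 means no mortality bias."""
    groups = make_groups(near_rate, far_rate)
    return survey_estimate(groups) / true_rate(groups)

same = bias_ratio(0.02, 0.02)    # equal risk in both groups
higher = bias_ratio(0.03, 0.02)  # near-main-street residents at higher risk
lower = bias_ratio(0.02, 0.03)   # near-main-street residents at lower risk
print(same, higher, lower)
```

The sampling is equally lopsided in all three cases. Whether the estimate comes out too high, too low, or about right depends entirely on how the death rates differ between the two groups, which is exactly the thing nobody has data on.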

Quantum complexity/financial researcher Neil Johnson has no idea whether "main street sampling bias" implies anything about mortality bias. He has no data to back up his handwaving, and any "simulations" he does will be worth considerably less than the data set he uses for them. And there are no good data sets for this problem.

Sure, it would behoove the Lancet study authors to disclose as much as they know about their sampling methods, even if it involves getting back in touch with their survey physicians (the ones who are still alive and in Iraq, anyway). But, you know, they could sing the Star Spangled Banner on Rush Limbaugh's radio show -- and their critics (from whom I've seen nothing constructive) still wouldn't stop carping.

Posted by: theo | October 22, 2006 12:39 AM

#21

BTW, The best way I've found to make silly physicists go away is to give them a ludicrously inaccurate approximation to your problem, formalized as a spin glass model.

While they're off solving that, you can get some work done.

Posted by: theo | October 22, 2006 12:41 AM

#22

The surrounding sentences are.

"But this could bias the data because deaths from car bombs, street-market explosions, and shootings from vehicles should be more likely on larger streets, says Johnson. Burnham counters that such streets were included and that the methods section of the published paper is oversimplified. He also told Science that he does not know exactly how the Iraqi team conducted its survey; the details about neighborhoods surveyed were destroyed "in case they fell into the wrong hands and could increase the risks to residents."

My impression is that Burnham was saying he does not know exactly which streets were surveyed.

Posted by: Eli Rabett | October 22, 2006 12:46 AM

#23

David Kane, the complete passage that quote is drawn from is in my post.

Ryan, if we go by what the Lancet article says, the bias is away from main streets. If we go by what Burnham now says they did, there is no main street bias at all. I don't see how any communications between Spagat and co and Burnham and co let Spagat conclude that they did something other than what they said they did.

Posted by: Tim Lambert | October 22, 2006 12:50 AM

#24

A devastating critique is one that identifies a potentially very large source of bias. The bias of a self-selecting Internet survey, for example, knows no bounds. Even if the flaws identified in these "devastating critique"s do introduce bias, I'd be surprised if the magnitude of the bias were very great. Since it's the magnitude of the deaths estimate that is causing such a stir rather than the exact number, identifying a number of small sources of bias wouldn't change things very much - would we really all think it wasn't significant if only 330,000 Iraqis had died?

Posted by: Paul Crowley | October 22, 2006 9:06 AM

#25

I had a look at the work of SJG on power-laws. I'm not too sure how much you know about maths. But let me spell it out to you: anyone who knows anything about maths knows that power laws do not depend on absolute numbers. They are called scale-free for a reason, i.e. they do not depend on any specific scale. [link]

Hence if the media data they use is consistently off by a factor of two (or ten) it does not affect the power-law distribution. I suggest you go back to school and relearn your basic mathematics before launching another 'devastating critique'.

Posted by: jkbaxter | October 22, 2006 9:54 AM

#26

Hence if the media data they use is consistently off by a factor of two it does not affect the power-law distribution.

How would they determine the consistency, or otherwise, of Iraq Body Count data? They seem to ignore the fact that it relies on reportage conducted under media laws resurrected from Saddam's time, in the most dangerous conflict since WW2 (for journalists), and on data from a Ministry of Health which is known to fiddle or suppress mortality data (when it takes time out from running death squads).

Posted by: Ron F | October 22, 2006 11:31 AM

#27

jkbaxter, are you saying there could be changes in the reporting rate of deaths if the media data they use is consistently off by a factor of two? I don't wish to be uncharitable, but I do think you have emitted a brainfart.

Posted by: Kevin Donoghue | October 22, 2006 11:33 AM

#28

Yes, Baxter if the data is consistently off by a factor of two or some fixed number, then it makes no difference. If, on the other hand, that factor has changed and is different at different scales, both of which seem very likely, then it does make a difference.
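A rough way to see the difference is a small simulation. This is only a sketch under stated assumptions: event sizes are drawn from a continuous power law with exponent 2.5 and minimum size 1, reporting is modelled as whole events being either reported or missed, and the size-dependent reporting function is invented for illustration. It is not SJG's method and it uses no IBC data.

```python
import math
import random

def draw_sizes(n, alpha, rng):
    # Inverse-transform sample from a continuous power law with x_min = 1.
    return [(1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]

def fit_alpha(sizes):
    # Maximum-likelihood exponent for a continuous power law with x_min = 1.
    return 1.0 + len(sizes) / sum(math.log(s) for s in sizes)

def report(sizes, prob, rng):
    # Keep each event with probability prob(size); the rest go unreported.
    return [s for s in sizes if rng.random() < prob(s)]

rng = random.Random(1)
true_sizes = draw_sizes(50000, alpha=2.5, rng=rng)

# Case 1: every event has the same 50% chance of being reported.
constant = report(true_sizes, lambda s: 0.5, rng)

# Case 2 (hypothetical): small events are reported less often than large ones.
size_dependent = report(true_sizes, lambda s: min(1.0, math.sqrt(s / 10.0)), rng)

alpha_true = fit_alpha(true_sizes)
alpha_constant = fit_alpha(constant)
alpha_size_dependent = fit_alpha(size_dependent)
print(alpha_true, alpha_constant, alpha_size_dependent)
```

With a constant reporting rate the fitted exponent should stay close to the true one. When small events are missed more often than large ones, the fitted exponent comes out noticeably shallower, even though the underlying conflict is identical. A reporting rate that drifts over time would likewise show up as an apparent change in the slope.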

Posted by: Tim Lambert | October 22, 2006 11:37 AM

#29

Theo said: "The best way I've found to make silly physicists go away is to give them a ludicrously inaccurate approximation to your problem, formalized as a spin glass model. While they're off solving that, you can get some work done."

String Theory is particularly fruitful in that regard (and if reality is any judge, probably that regard only) -- good to make them go away for a few decades (or is it centuries?)

Posted by: JB | October 22, 2006 11:45 AM

#30

The point about power-laws in general is that they only look at relationships between numbers not the absolute numbers themselves.

Ron F - the consistency of the IBC data has little to do with media laws conducted under Saddam's time. Reporters record deaths to the best of their ability under the circumstances, not sure where Saddam's media laws come in.

Kevin, I'm not quite sure what point you are trying to make? You say "there could be changes in the reporting rate if the media they use is consistently off by a factor of two" huh?

Let me try this: the media collect a sample of the data; they do not get it all but they get a percentage, and IBC records this. SJG then seem to take this data and count the number of attacks in which one person dies, two people die, or n people die. This gives them a distribution along the lines of P(X > x): for a given number of deaths x, the probability that an attack kills more than x people. This is a standard way to represent power law distributions and makes no mention of the absolute number of deaths.

From this distribution they then fit a power-law function to the data above a certain value x_min. Again, this is standard in mathematics, finding the value above which the relationship holds.
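For anyone who wants to see what those two steps look like in practice, here is a minimal sketch. It assumes synthetic event sizes drawn from a continuous power law (real death counts are integers, so this is an approximation), and it uses the standard maximum-likelihood estimator for the exponent. I am not claiming this is the exact procedure in the SJG paper; the function names are my own.

```python
import math
import random

def sample_power_law(n, alpha, x_min, rng):
    """Draw n values from a continuous power law p(x) ~ x^(-alpha) for x >= x_min."""
    # Inverse-transform sampling: x = x_min * (1 - u)^(-1 / (alpha - 1)).
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]

def ccdf(sizes, x):
    """Empirical P(X > x): the fraction of events larger than x."""
    return sum(1 for s in sizes if s > x) / len(sizes)

def fit_alpha(sizes, x_min):
    """Maximum-likelihood exponent using only the tail x >= x_min (continuous case)."""
    tail = [s for s in sizes if s >= x_min]
    return 1.0 + len(tail) / sum(math.log(s / x_min) for s in tail)

rng = random.Random(0)
sizes = sample_power_law(20000, alpha=2.5, x_min=1.0, rng=rng)

alpha_hat = fit_alpha(sizes, x_min=1.0)
tail_fraction = ccdf(sizes, 10.0)
print(alpha_hat, tail_fraction)
```

Note that both functions only ever use the relative frequencies of event sizes. The total number of events, and hence the total number of deaths, drops out of the fit.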

I'm not totally sure about the next part of the paper, but it looks like they then go on to track the war through time, measuring the value of the slope of the power law. But again there is no need to know absolute numbers for this measurement.

It seems to be quite an interesting result actually, a little difficult if you don't have the background in maths, but worth the read.

Posted by: jkbaxter | October 22, 2006 11:57 AM

#31

"It seems to be quite an interesting result actually, a little difficult if you don't have the background in maths, but worth the read."

There are lots of "interesting" results that have nothing whatsoever to do with reality and sometimes knowing how to do all the detailed math is not as important as being able to judge whether the result means anything physically speaking.

Posted by: JB | October 22, 2006 12:11 PM

#32

Tim, what you suggest is that the reporting rates at different ends of the scale might vary as the war progresses. This would mean that the trends over the war for the IBC data and the Lancet data would be different. However, if you plot the two together you will find that both the Lancet deaths and the IBC deaths follow a very similar trend, meaning the IBC people are capturing the same percentage of the Lancet deaths at the start of the war as they are at the moment.

You can read more about the IBC trends in here [link]

Also, if the result is specific to the reporting and recording of events in Iraq, why does Colombia, a different war conducted in a different country over a different time-scale, seem to follow the same trend towards a value of alpha=2.5?

Posted by: jkbaxter | October 22, 2006 12:20 PM

#33

Kevin, I'm not quite sure what point you are trying to make?

Tim wrote:

They ascribe changes in the distribution of deaths to changes in the conflict when they could equally have been due to changes in the reporting rate of deaths.

On my reading, the reporting rate is the ratio of the number of actual deaths to the number of reported deaths.

You wrote:

...if the media data they use is consistently off by a factor of two (or ten) it does not affect the power-law distribution.

Did you simply misread Tim's post? He raises the (surely very real) possibility that the reporting rate is variable and you reply, in effect, that if the reporting rate is constant the problem doesn't arise.

Posted by: Kevin Donoghue | October 22, 2006 12:21 PM

#34

I'm with Theo on this. Until somebody can explain why people who live on roads which intersect with a main street are systematically more likely to meet a violent end than the population as a whole, the "main street bias" is right up there with Saddam's as yet undetected WMD.

Posted by: z | October 22, 2006 12:23 PM

#35

Kevin, if the reporting rate is consistent then the criticism does not apply, i.e. the ratio remains similar throughout the conflict. Not constant as you suggest. The similarity between the trends in the Lancet paper and the IBC data would suggest that this consistency in reporting is indeed the case.

Posted by: jkbaxter | October 22, 2006 12:27 PM

#36

Theo, you are right it is difficult to determine the effects of 'main street bias'.

From the methods described in the Lancet article, main street bias sampling most definitely exists and it is significant. This bias means that you are only sampling from a small select group within the population.

You then have two options:

(1) you state that the death rates across all of Iraq are represented by the small group of people who live near main roads (using near here in a network sense of being one link away), which is the same as saying that in wars violence occurs homogeneously across the entire city.

or (2) you take the more robust position and state that yes, there are probably variations in the intensity of violence across cities, and the bias of the sample will mean that you cannot capture this variation.

Surely the responsibility for defending their sampling method and the supposed accuracy of the survey falls with the authors of the study? In order for the results to stand, the authors must prove that there exists a homogeneous distribution of violence across cities.

Either way this is where the game is at, and as you rightly point out this is a very tall order.

Posted by: jkbaxter | October 22, 2006 12:58 PM

#37

jkbaxter, I am not familiar with your notion of similarity. Similar triangles and similar matrices I have encountered, but similar ratios? What might they be?

Posted by: Kevin Donoghue | October 22, 2006 1:02 PM

#38

Ron F - the consistency of the IBC data has little to do with media laws conducted under Saddam's time. Reporters record deaths to the best of their ability under the circumstances, not sure where Saddam's media laws come in.

I'm talking about the laws in force now and that's why I referred to them as resurrected. Their implications for reportage should be obvious, even for a mathematician. If the government don't like what you report you go to jail. Or, if you're "lucky" you get thrown out of the country, like al-Jazeera, al-Sharqiyah and al-Arabiya TV stations, or shut down, like al-Zaman newspaper.

Use your imagination. What might the Iraqi government object to being reported? The Washington Post spelt it out on Friday -

"The Iraqi government has long resisted efforts by U.N. officials and human rights workers to obtain reliable government figures on mortality."

This phenomenon has been widely noted - though not by Iraq Body Count - right back to the early days of the occupation:

Iraq's Health Ministry ordered to stop counting civilian dead from war

and continuing to the present

Official Says Shiite Party Suppressed Body Count

Officials from all tiers of government routinely inflate or deflate numbers to suit political purposes

Posted by: Ron F | October 22, 2006 1:44 PM

#39

I'd say the Johnson/Spagat power law paper on Iraq is a very good example of the old CS saw -- Garbage in, Garbage out -- since they base their analysis on IBC numbers which are most likely underestimates of mortality in Iraq.

Anyone familiar with the physics journals sees this type of stuff all the time.

Some people are very proficient at "mathturbation", but the output is drivel, just the same.

Posted by: JB | October 22, 2006 2:08 PM

#40

Why was this point about bias not addressed in the original Lancet paper? Why did the referees of this 'prestigious' journal not think that this bias issue might warrant further investigation? Why was it not even mentioned in the 8 pages of accompanying material?

Posted by: jkbaxter | October 22, 2006 2:45 PM

#41

JB - in what sense does underestimation invalidate a power-law?

Provided the reporters consistently underestimate the deaths the power-law still holds. And by comparing the Lancet trends over time with the IBC trends over time you get a good match. Thus you can conclude that the media used by IBC is doing as good a job of reporting the violence now as it did at the start of the war.

Thus the power laws must hold, regardless of the underestimation.

Please let me know if I have missed something here, but this is how it seems to me.

Posted by: jkbaxter | October 22, 2006 2:54 PM

#42

Provided the reporters consistently underestimate the deaths the power-law still holds.

You can't be serious. In everyday speech, to "consistently underestimate" something means to present a number which is always too low. It may be a little too low in one year and a lot too low in another. In statistics consistency has a technical meaning which is not relevant here AFAICT.

What are you trying to say? That the errors are "similar", whatever that means?

Posted by: Kevin Donoghue | October 22, 2006 3:10 PM

#43

jkbaxter, the media is more likely to report an incident involving 20 deaths than one involving one death. This rather has an effect on your power law.

Posted by: Tim Lambert | October 22, 2006 3:23 PM
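
The disagreement here can be made concrete with a small sketch. It assumes an idealised power law with the alpha=2.5 mentioned in the thread and compares two reporting patterns: one where every event has the same chance of being reported, and one where the chance grows with the size of the event. Both reporting functions are hypothetical illustrations, not estimates of how the media in Iraq behave, and the sizes 10 and 100 are arbitrary comparison points.

```python
import math

ALPHA = 2.5  # exponent quoted in the thread


def true_density(x):
    """Unnormalised power law: relative frequency of events with x deaths."""
    return x ** -ALPHA


def constant_rate(x):
    """Hypothetical: every event has the same one-in-three chance of being reported."""
    return 1 / 3


def size_dependent_rate(x):
    """Hypothetical: reporting chance grows like sqrt(size), capped at 1."""
    return min(1.0, math.sqrt(x / 100))


def observed_density(x, report_prob):
    """What a media-based count sees: true frequency times reporting chance."""
    return true_density(x) * report_prob(x)


def log_log_slope(density, x_lo, x_hi):
    """Slope of log(density) against log(size) between two event sizes."""
    rise = math.log(density(x_hi)) - math.log(density(x_lo))
    run = math.log(x_hi) - math.log(x_lo)
    return rise / run


slope_true = log_log_slope(true_density, 10, 100)
slope_constant = log_log_slope(lambda x: observed_density(x, constant_rate), 10, 100)
slope_biased = log_log_slope(lambda x: observed_density(x, size_dependent_rate), 10, 100)

print(round(slope_true, 3), round(slope_constant, 3), round(slope_biased, 3))
```

Under these assumptions a constant reporting rate leaves the log-log slope at -2.5, while the size-dependent rate flattens it to about -2. So a uniform undercount is harmless to the exponent, but an undercount that depends on event size is not.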

#44

Kevin - try this simple exercise. Suppose if you will there is a war with 100 deaths in the first year, 200 deaths in the second year and 300 deaths in the third year.

One survey, let's call it the 'God survey', manages to know every death exactly. So they get year1=100, year2=200 and year3=300. If you plot these figures out you will get a straight line with a slope that is easy to calculate.

Now let's consider a second survey; let's call this one the 'media survey'. In the media survey a death is only added to the death toll if it is reported by the media.

Now there are two scenarios

(1) the media consistently under-report the deaths by a factor of 2. This will give them the following data points - year1=50, year2=100 and year3=150. Now if you were to plot these on a graph, you would get the same slope as the 'God survey'. Do you follow?

(2) Second option: the media survey does not consistently under-report the deaths. Instead it has the following results: in year 1 it under-reports by a factor of 2, in the second year it under-reports by a factor of 5 and in the third year it under-reports by a factor of 10. The data points for the second survey are then year1=50, year2=40, year3=30. Now if you plot these points on a graph you get a different slope to the 'God survey'. Is this making sense to you?

So what we are left with is this: if you calculate the slope of the media survey and compare that with the slope of the 'God survey', and they are equal, you must conclude that the media survey is consistently under-reporting the deaths. If they are different then the media survey is not consistently under-reporting the deaths and some years it does better/worse than others.

Thus because the slopes of the IBC deaths and the Lancet deaths are very similar we must conclude that the IBC database is consistently under reporting the death rates.

Posted by: jkbaxter | October 22, 2006 3:29 PM

#45

"this rather has an effect on your power law"

In most power-law distributions (and I assume it is the same in this paper) there is an xmin above which the power law holds. For a distribution such as this it would be around xmin=10. So you are not comparing the likelihood of reporting an attack with one death with the likelihood of reporting an attack with 20 deaths. You are comparing the relative likelihood of reporting different sized events above this value of xmin, i.e. the power law holds above this value of xmin. It is difficult to say that media will be better at reporting attacks of 30-50 deaths than they are at reporting attacks with 15 deaths.

Posted by: jkbaxter | October 22, 2006 3:49 PM
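
For readers unfamiliar with the xmin idea, here is a minimal sketch of fitting a power-law exponent using only events at or above a cutoff. It uses the standard continuous maximum-likelihood estimator, alpha = 1 + n / sum(ln(x_i / xmin)); whether the paper under discussion fits its exponent this way is not stated in the thread, so treat the method as an assumption. The event sizes are synthetic, generated from an exact power law with alpha=2.5, and are not Iraq or Colombia data.

```python
import math


def power_law_mle(sizes, xmin):
    """Continuous maximum-likelihood estimate of alpha from sizes >= xmin."""
    tail = [x for x in sizes if x >= xmin]
    return 1 + len(tail) / sum(math.log(x / xmin) for x in tail)


XMIN = 10    # cutoff suggested in the comment above
ALPHA = 2.5  # exponent discussed in the thread
N = 1000     # number of synthetic tail events (arbitrary)

# Synthetic event sizes placed at evenly spaced quantiles of a power law
# with exponent ALPHA above XMIN (inverse-CDF construction).
tail_sizes = [XMIN * (1 - (i + 0.5) / N) ** (-1 / (ALPHA - 1)) for i in range(N)]

# Made-up small events below the cutoff; the estimator ignores them.
small_sizes = [1, 1, 2, 3, 5, 8]

alpha_hat = power_law_mle(tail_sizes + small_sizes, XMIN)
print(round(alpha_hat, 2))
```

The estimate comes out very close to 2.5 however many small events are added, which is the sense in which reporting of one-death incidents does not enter the fit. What happens to reporting rates across sizes above the cutoff is a separate question.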

#46

jkbaxter: Now if you were to plot these on a graph, you would get the same slope as the 'God survey'. Do you follow?

I get slopes of 100, 50 and -10 for the three graphs. So what you mean by getting the same slope I can't imagine; perhaps merely that the first two graphs go through the origin and the third one doesn't. Something tells me that Tim Lambert will not be too perturbed by your disparagement of his mathematical ability.

Posted by: Kevin Donoghue | October 22, 2006 4:35 PM

#47

jkb, whether or not the use of the IBC numbers in the power law paper is valid, the paper does not justify it with anything like the thoroughness the authors are demanding of the JHU paper. The point is precisely that the JHU study, performed under very difficult circumstances and without the resources available to the US or UK governments, which refuse to collate data, will have some limitations; but a limitation is not a flaw, which in turn is not the same as a reason to believe that the results are an overestimate.

Tim is actually being less unfair to Spagat than Spagat is being to JHU.

The first JHU paper would not have been subject to high street bias because the randomisation of sampling was not based on addresses but random geographical coordinates.

Ryan is making things up. According to Spagat the conversations between the two teams were mediated by Science magazine journalists which is a bit different.

It is far from obvious what effect any high street bias would have on the numbers. Is the murder rate likely to be higher for residents of a large house on a main street or for dwellers in backstreet slums where interviewers think it is too dangerous to go? The King's Road or the North Peckham Estate? The Copacabana or a favela? It would be wrong to assume that, if high street (or street off a high street) bias actually exists, it means that the JHU estimate is high.

Posted by: Jack | October 22, 2006 4:37 PM

#48

Kevin - it is the slope relative to the starting value that is important. In order to see this you need to normalise with respect to the starting value. I should have made this clearer in my last post. If you do this and measure the slopes you will have the same value of +1/year for the first two surveys. For the third survey you will have a normalised slope of -(1/5) per year.

Sorry for the confusion - does it make sense now?

Posted by: jkbaxter | October 22, 2006 5:12 PM
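
The figures traded between jkbaxter and Kevin Donoghue can be checked in a few lines. The sketch below uses only the toy numbers from jkbaxter's three-year exercise; the function names are illustrative, and "slope" is taken as the average change per year between the first and last point, which is exact here because all three series are straight lines.

```python
def slope(series):
    """Average change per year between the first and last data point."""
    return (series[-1] - series[0]) / (len(series) - 1)


def normalised_slope(series):
    """Slope divided by the starting value, as jkbaxter proposes."""
    return slope(series) / series[0]


# Toy yearly death counts from the exercise (not real data).
surveys = {
    "God survey": [100, 200, 300],
    "media, constant factor 2": [50, 100, 150],
    "media, factors 2, 5, 10": [50, 40, 30],
}

for name, series in surveys.items():
    print(name, slope(series), normalised_slope(series))
```

The raw slopes are 100, 50 and -10, as Kevin says, and the normalised slopes are +1, +1 and -0.2, as jkbaxter says. The two agree on the arithmetic; the open question is whether matching normalised slopes requires a strictly constant reporting rate.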

#49

Hey Ragout,

How big do you think the biases you identify are in practice?

Posted by: lemuel pitkin | October 22, 2006 5:18 PM

#50

Jack - You are right to apply the same level of scientific rigour to the two papers. But there is a big difference: the paper by Roberts et al. was published in a peer-reviewed scientific journal. The paper you point to by Johnson et al. is on the physics pre-print server arxiv.org; the paper is most likely in draft format.

arxiv.org is a place where authors can post working papers and get feedback and comments before they write their final version and submit it to a journal for peer review. The key difference is that Johnson's paper has not been submitted to peer review; this paper is a draft version. Roberts' paper has been through peer review and is thus subject to a much higher level of analysis and criticism than the draft paper you point to by Johnson.

Posted by: jkbaxter | October 22, 2006 5:19 PM

#51

The suggestion that Burnham et al. may have substantially overestimated the risk due to undersampling of homes that are less immediately accessible from "main" streets (which might contain an elevated fraction of targets of violence or military action) implies that a large fraction of the reported deaths are of people killed at their homes, rather than out and about transacting their business. But it is hard for me to square this hypothesis with the low number of deaths among women, who presumably spend an elevated proportion of their time in the home. If a large fraction of deaths reflect people who live on dangerous main streets being killed in their homes, then wouldn't we also expect to see a large proportion of women casualties?

Posted by: trrll | October 22, 2006 5:21 PM

#52

jkbaxter, Your normalisation only gives you the slope = +1 because the function of time describing the "God survey", G(t) = 100t is a constant multiple of the function C(t) = 50t. That is to say, the reporting rate is constant. Yet you contend that this is not required. All that is required is a reporting rate which remains "similar", whatever that may mean. That is what I asked you to explain.

Posted by: Kevin Donoghue | October 22, 2006 7:05 PM

#53

Kevin, not sure if I follow, but yes, if the reporting rate is constant/consistent then you will get the same normalised slope in the media survey as you do in the 'God survey'. This is indeed what appears to be the case when you compare the IBC with the Lancet. You get slopes that are similar - thus IBC must be consistently recording the same fraction of deaths throughout the war. Which means that the initial criticism from Tim about the media reporting varying dramatically over time is not valid. So the paper by Johnson et al. seems to be unaffected by this 'devastating critique'.

Posted by: jkbaxter | October 22, 2006 7:42 PM

#54

Lemuel,

I think the biases are probably really big, and if this survey was our only source of information, I don't think that I'd conclude anything stronger than that excess deaths were greater than zero.

But this isn't our only source of information. We have the ILCS, the IBC, Health Ministry counts, and guesses about their bias; plus press reports suggesting that things keep getting worse and worse. I'm especially impressed by Patrick Ball's point (quoted in the Lancet article, I think) that as killings go up, the percentage that get reported goes down. So, I find it hard to believe that there have been less than 200,000 excess deaths (maybe half of them violent).

To me, the puzzle is why the Burnham et al estimates are so reasonable. My theory is that the biases are big but partially offsetting.

Here's another data point, the Washington Post reports today that "in the past few months, 10 people have been killed in Yusufan," a village of 4,000 about 30 minutes south of Basra. Assuming they mean the past 3 months, that's a rate of 10 per 1000. That's about 200 times the US murder rate, or 25 times the rate in the US's most violent cities. And it's right in line with the Burnham et al estimate of 12 per 1000 for the whole of Iraq.

Looking at it the other way, the IBC figures would imply a violent death rate of maybe 0.8 per 1000 (assuming 20,000 killed in a year). That's comparable to the US murder rate in the worst US cities 15-20 years ago, when the crime rate was higher. I've lived in such cities, and often in neighborhoods where the murder rate was higher than what the IBC claims for Iraq. And let me tell you, it was nothing like what you read in the news about Iraq. I did not feel compelled to hide in a barbed-wire covered bunker. I did not step over bodies every morning, or hear about multiple bodies found floating in the river every day. You won't be surprised to hear that I never saw a body. So obviously the IBC figures are huge underestimates.

Posted by: Ragout | October 22, 2006 7:42 PM
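
Ragout's back-of-envelope rates can be reproduced as follows. The sketch takes "the past few months" as 3 months, the same assumption the comment makes, and backs out the comparison figures the comment implies rather than looking them up. The implied US rates and the implied population are therefore derived from the comment's own numbers, not independent data.

```python
def annual_rate_per_1000(deaths, population, years):
    """Deaths per 1,000 people per year."""
    return deaths / population / years * 1000


# Village figures as quoted from the Washington Post in the comment.
village_rate = annual_rate_per_1000(deaths=10, population=4000, years=0.25)

# Rates implied by the comment's comparisons (per 1,000 per year).
implied_us_rate = village_rate / 200    # "about 200 times the US murder rate"
implied_city_rate = village_rate / 25   # "25 times the rate in the US's most violent cities"

# Population implied by 20,000 deaths a year at 0.8 per 1,000.
implied_population = 20_000 / 0.8 * 1000

print(village_rate, implied_us_rate, implied_city_rate, implied_population)
```

On these figures the village rate is 10 per 1,000 per year, the implied US murder rate is about 0.05 per 1,000 (5 per 100,000), the implied worst-city rate is about 0.4 per 1,000, and the IBC-based 0.8 per 1,000 corresponds to a population of about 25 million.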

#55

"in what sense does underestimation invalidate a power-law?"

In the sense that "underestimation" means "inaccurate estimation" -- period.

There is no reason why an inaccurate estimate should necessarily be consistent in any sense -- other than consistently wrong, perhaps (and even that is not guaranteed).

It is no secret that much of the reports coming out of Iraq are the result of "hotel journalism" reported from the safety of the Green Zone or from a few other "safe havens" (guarded by US troops) scattered here and there.

If one is basing death rates on media reports and where the media can go in Iraq at any given point in time is a function of how safe certain regions are for travel -- something that may be constantly changing as the situation on the ground changes -- it is a stretch (to say the least) to assume that the underestimates based on media reports (or the underlying conditions) should be in any sense consistent from day to day, week to week or even month to month.

The value of any conclusion is only as good as the data that go into the analysis. If the data is rubbish (or of unknown accuracy), it matters not how elegant the math is that led one to the (false) conclusion.

Posted by: JB | October 22, 2006 8:50 PM

#56

If the media consistently underreports the violence by the same factor each month, i.e. every month they get 1/3 of the total deaths (you don't need to know the exact ratio, just that it is constant or close to constant), then the power law holds.

This simple fact can be tested by comparing the trends of the Lancet paper and the trends of the IBC data. If the trends are the same or similar (renormalised to take into account the different total deaths) then the media must be under-reporting by the same factor each month.

If you do this you will see that indeed the trends over time for the IBC data and the Lancet data are in agreement. Hence the power-law holds.

Posted by: jkbaxter | October 22, 2006 9:04 PM