Too much has been made of the claims about main street bias in the new Lancet study — if you do a few calculations you’ll find that even if it exists, it doesn’t make much difference. As Jon Pedersen said:
Pedersen did NOT think that there was anything to the “Main Street Bias” issue. He agreed, I thought, that, if there was a bias, it might be away from main streets [by picking streets which intersect with main streets]. In any case, he thought such a “bias”, if it had existed, would affect results only 10% or so.
But now Johnson, Spagat and co have put together a working paper where they argue that main street bias could reasonably produce a factor of 3 difference.
How did they get such a big number? Well, they made a simple model in which the bias depends on four numbers:
q, how much more deadly the areas near main streets that were sampled are than the other areas that allegedly were not sampled. They speculate that this number might be 5 (i.e. those areas are five times as dangerous). This is plausible: terrorist attacks are going to be made where the people are in order to cause the most damage.
n, the size of the unsampled population over the size of the sampled population. The Lancet authors say that this number is 0, but Johnson et al speculate that it might be 10. This is utterly ridiculous. They expect us to believe that Riyadh Lafta, while trying to make sure that all households could be sampled, came up with a scheme that excluded 91% of households (10 out of every 11) and was so incompetent that he didn’t notice how completely hopeless the scheme was. To support their n=10 speculation they show that if you pick a very small number of main streets you can get n=10, but no-one who was trying to sample from all households would pick such a small set. If you use n=0.5 (saying that they missed a huge chunk of Iraq) and use their other three numbers, you get a bias of just 30%.
fi, the probability that someone who lives in the sampled area is in the sampled area, and fo, the probability that someone who lives outside the sampled area is outside the sampled area. They guess that both of these numbers are 15/16. This too is ridiculous. The great majority of the deaths were of males, so it’s clear that the great majority happened outside the home. So the relevant values of f are the ones for the times when people are outside the home. And when they are outside the home, people from both the unsampled area and the sampled area will be on the main streets because that is where the shops, markets, cafes and restaurants are. Hence a reasonable estimate for fo is not 15/16 but 2/16. If you use this number along with their other three numbers (including their ridiculous estimate for n) you get a bias of just 5%.
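If you want to check the arithmetic, here is a rough sketch in Python. The formula is a reconstruction from the definitions of q, n, fi and fo given above, not something copied from the working paper, and the function name `main_street_bias` is made up for illustration. It assumes risk is measured relative to the unsampled area (risk 1 there, risk q in the sampled area) and that the bias is the death rate of sampled-area residents divided by the death rate of the whole population. Under those assumptions it gives figures close to the three quoted above.

```python
def main_street_bias(q, n, f_i, f_o):
    """Ratio of the death rate seen in the sampled area to the true overall rate.

    q   -- risk in the sampled area relative to the unsampled area
    n   -- unsampled population divided by sampled population
    f_i -- chance a sampled-area resident is inside the sampled area
    f_o -- chance an unsampled-area resident is outside the sampled area
    (Reconstructed model; an assumption, not the paper's own code.)
    """
    # Relative risk faced by someone who lives in the sampled area.
    risk_sampled = f_i * q + (1 - f_i)
    # Relative risk faced by someone who lives in the unsampled area.
    risk_unsampled = f_o + (1 - f_o) * q
    # Survey estimate divided by the population-weighted true rate.
    return (1 + n) * risk_sampled / (risk_sampled + n * risk_unsampled)


# Johnson et al's numbers: q=5, n=10, fi=fo=15/16
print(round(main_street_bias(5, 10, 15 / 16, 15 / 16), 2))   # 3.03

# Same, but with n=0.5
print(round(main_street_bias(5, 0.5, 15 / 16, 15 / 16), 2))  # 1.33

# Their n=10, but with fo=2/16
print(round(main_street_bias(5, 10, 15 / 16, 2 / 16), 2))    # 1.05
```

Note that with n=0, the value the Lancet authors give, the ratio is exactly 1 whatever the other three numbers are, so there is no bias at all.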
In summary, the only way Johnson et al were able to make “main street bias” a significant source of bias was by making several absurd assumptions about the sampling and the behaviour of Iraqis.