# How can you tell if an elephant is hiding in your fridge?

Seixon is no great shakes with statistics, but he sure can do things with a metaphor:

I swear, commenting on this blog is like a game of hide-and-seek with the elephant somewhere in the room. Lambert and his friends giggle every time I open up a cupboard and don’t find the elephant, even though the elephant is somewhere to be found. In the process, they keep making false statements to distract me and make me look places where the elephant can’t be found

1. #1 Kevin Donoghue
November 9, 2005

Seixon asks: So is z also wrong then?

Well, z has already commented on what you wrote (no. 87):

Grk. Snort. Brzk. Flzrb. Snrg. I’m having trouble coming up with a reply; is it possible that this means something different in another language?

I certainly wouldn’t describe that response as wrong. Does it strike you as a ringing endorsement of your position? If it was intended as such, z can say so.

This is really starting to get out of hand Kevin.

Now on that we can agree. In the interests of restoring some semblance of order to the discussion, here is a statement of the true state of affairs as I see it:

Any random sample, by its very nature, excludes the vast majority of the population. Nonetheless, statistical theory and practical experience have shown that it is possible to use a random sample to obtain an unbiased estimate of a population parameter. It is also possible to compute a valid measure of the likelihood that the true value of the parameter differs greatly from the estimate. Although elementary textbooks focus on the use of simple random samples, in the real world modified versions of SRS are often used and give good results. They are also called random samples because they are, well, random. The Lancet team used one such method, with a modification which you find objectionable and which I don’t. The fact that a large segment of the population got left out of the sample is not a valid criticism. But it seems to be all you’ve got and you are evidently determined to bang on about it forever, saying it’s “unrepresentative” just because people who had as good a chance of being included as anyone else were randomly excluded.
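
The two claims in that paragraph (a random sample gives an unbiased estimate, and the uncertainty can be quantified) can be checked with a small simulation. This is only a sketch under stated assumptions: the population of body weights, its mean and spread, the sample size of 200, and the number of trials are all invented for illustration and have nothing to do with the Lancet data.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical population: 100,000 body weights in kg (made-up parameters).
population = [rng.gauss(70.0, 15.0) for _ in range(100_000)]
true_mean = statistics.mean(population)


def sample_interval(sample_size):
    """Draw one simple random sample; return its mean and a normal-approximation 95% interval."""
    sample = rng.sample(population, sample_size)
    mean = statistics.mean(sample)
    stderr = statistics.stdev(sample) / sample_size ** 0.5
    return mean, mean - 1.96 * stderr, mean + 1.96 * stderr


trials = 1000
results = [sample_interval(200) for _ in range(trials)]

# Each sample leaves out 99.8% of the population, yet the estimates centre on the truth.
average_estimate = statistics.mean(m for m, _, _ in results)

# Fraction of intervals that contain the true mean; should be close to 0.95.
coverage = sum(lo <= true_mean <= hi for _, lo, hi in results) / trials

print(f"true mean {true_mean:.2f}, average estimate {average_estimate:.2f}, coverage {coverage:.3f}")
```

Any single sample misses the true mean by some amount, but the misses fall on both sides and the interval reports how large they are likely to be.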

2. #2 Tim Lambert
November 9, 2005

Seixon, it’s a gem because it is off-the-planet class delusional. You are wrong on those two points because you have no clue how random sampling works.

Please note that if I don’t bother to comment on something you write it doesn’t mean I agree with you. More likely it’s because what you wrote is so obviously wrong that I don’t think any refutation is required.

3. #3 z
November 9, 2005

“There will always be 1/3 states that aren’t in the sample. Always. Every single time”

Not the same states every time, but what the heck. Similar to sampling 10,000 human beings to get an average weight. How valid can that be, when there will always be 6 billion humans who aren’t in the sample. Every single time. On purpose!

4. #4 z
November 9, 2005

“Now, aside from Lambert, z, and some others, including you, have even admitted that the sampling error exists and some are even brave enough to admit that the Roberts team are the one who introduced this error to the sample.”

Of course, they do state this in the paper. Giving credit where credit is due, I had not really given this any consideration until you brought it to my attention, and that’s probably true of most here. On the other hand, you were suggesting it was bias; I believe my initial post suggested that it was not bias, but it did increase the error.

While we’re on the subject, just to clear the air of mysteries, you are correct that the initial Lancet press release regarding the study was incorrect. On the other hand, you attribute this to malice aforethought; I’m more inclined to hypothesize human error on the part of dumbass deskjockey drones without a rigorous background manning the PR desk. Of course, that’s because as a pathetic liberal wimp I am required by law to avoid assuming evil motives where stupidity would suffice, whereas as an upstanding conservative you would only be required to do so in cases of invading another country on the basis of false information regarding WMD.

5. #5 z
November 9, 2005

“Error is random, bias is systematic.

This is what I wrote:

Sampling error and sampling bias are basically the same thing, although I think bias tends to mean that the result is skewed in a specific manner, whereas in this case, the error is unknown

The exact same sentiment as z.”

Grrk. Snzppf. Sptng.

6. #6 Seixon
November 9, 2005

Various obfuscators,

A sample bias is an error in the sample, a systematic error. You guys know this, that is what I was saying, and you keep pretending that I don’t know what I’m talking about since you have to always take what I say out of context. Zzzz….

Not the same states every time, but what the heck. Similar to sampling 10,000 human beings to get an average weight. How valid can that be, when there will always be 6 billion humans who aren’t in the sample. Every single time. On purpose!

Once again, you are in full denial mode. I’m talking about excluding entire segments of the population, not x amount of people from all over. There’s a difference between sampling 20 people from Basra and 10 people from Missan as opposed to 30 people from Basra and 0 from Missan.

In other words, going along with your example of doing a sample to find the average weight in the world, the Lancet would have purposefully excluded 1/3 of the countries in the world. Now tell me, if one of those happens to be China or Japan, what do you think happens to the sample and the result? Will that affect the result of the study or not?

Zzzzzzz….

I am not a conservative, jesus friggin christ. I am probably more liberal than you are. The “erroneous” headline on the Lancet website was there for over a week. You’d think that if it was a mistake, it would have eventually been fixed. You know, the editor of the Lancet should have taken a look at some point in time. You’d think that Roberts himself actually took a look at the damn thing. But nope, the “error” stayed up there.

In other words, the publication that supposedly peer-reviewed the damn thing can’t even get a headline about the study correct. Excuse me while I ponder whether that is due to incompetence or malice…..

I may have made a mistake in talking about “bias” even though I think most people here take that to mean fraud even though it doesn’t have to mean that at all.

In the example I gave above, the sample would be biased because it would negate the weights of a large segment of Asian people who have a lower average weight compared to much of the rest of the world. Thus the sample would be biased towards non-Asians and thus towards a higher average weight.

Right?

That is an effect created by the Lancet methodology, although the bias will not be the same every time. Depending on the outcome between all 6 pairs, the biases may cancel each other out, or there may not be a difference between the pairs so then there would not exist any bias.

The problem is, we just don’t know because those provinces were not sampled, and that is poor methodology.

I find it hilarious that Kevin and Lambert (especially) would defend a poll of the United States population that would systematically exclude 1/3 of the STATES.

z, you like to pretend that the exclusion of x amount of people is the same as the exclusion of x amounts of population segments. It’s not, and you know it’s not.

Clumping of clusters is also not a valid statistical methodology. I have asked you to find ONE SINGLE EXAMPLE of it other than the Lancet study, and I have received nothing.

I find it hilarious that you can sit and defend a methodology that changes the cluster size three times during the process of sampling. From 30 households, to 90 households, and then back to 30 households.

Lambert said that clustering of clusters was multistage clustering – it is not. Multistage clustering is distributing clusters (consistent size throughout the process) in different stages.

The Lancet methodology would have constituted multistage clustering if they didn’t clump clusters in the pairing process.

The error was increased by this methodology in a way that cannot be calculated, and the confidence intervals do not reflect it either.

Usually when a methodology purposefully introduces an error, that is frowned upon.

But not here! Anything goes! Cut 1/3 of the states? SURE! We don’t need California in a sample of the USA! Or Texas. Or the entire midwest. Nope, excluding them will certainly not affect the sample. Nothing will! Yeay!

Breaking statistics principles? SURE! It’s “innovation”! Breaking windows is a new way of cleaning them!

And you say I am off-the-planet??

You’ve got to be kidding me.

I should have a chat with John Zogby, he would be laughing his ass off hearing what you guys are saying about sampling.

7. #7 z
November 9, 2005

“That is an effect created by the Lancet methodology, although the bias will not be the same every time. ”

Then it’s not bias. It’s error. That is why the confidence limits are very large.

“I may have made a mistake in talking about “bias” even though I think most people here take that to mean fraud even though it doesn’t have to mean that at all.”

I think most people here take it to mean systematic nonrandom error such that the mean of the distribution of estimates in multiple samples is significantly different from the population mean. Given an alpha, say the usual .05, the confidence interval of a biased sample will fail to contain the true mean more than alpha of the time; i.e. >5%, with it falling on one side more often. With that in mind, it’s easy to determine whether a sampling procedure is biased if we compare a large number of samples to the true population mean. Aside from being self-evident, the math is pretty well defined. If the true mean falls outside the confidence interval >5% of the time but equally on both sides, then it is not biased; it just has a higher error rate than a Gaussian distribution. Repeat: Not Biased. And none of those multiple samples, which gave individual estimates of the true mean which were not exactly correct, was biased either.

It’s an old metaphor, but maybe needs trotting out again: a gun which shoots in a foot-wide circle centered around the bullseye is accurate, but not precise. A gun which shoots a tight little circle a foot away from the bullseye is precise, but not accurate. In neither case would you expect your first shot to hit the bullseye; but in either case, it would still be your best indicator of where the bullseye was, if you didn’t already know. From the results of just this one shot, however, you cannot say whether the gun is imprecise, inaccurate, or both, even if you knew where the bullseye is in relation to the bullet hole. You can inspect the mechanism of the gun, however, and determine what features may be causing a reduction in precision and/or accuracy.

You are telling us that this survey is not accurate, based on the fact that you have found a problem with its precision. Does not follow logically, they are distinct concepts.
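
The distinction between bias (accuracy) and random error (precision) in the gun metaphor can be made concrete with a toy simulation. This is a sketch, not anything taken from the study: the true value of 100, the offsets, and the spreads are hypothetical numbers chosen only to make the two cases easy to tell apart.

```python
import random
import statistics

TRUE_VALUE = 100.0  # hypothetical "bullseye"


def noisy_unbiased(rng):
    """Accurate but imprecise: centred on the truth, wide scatter."""
    return rng.gauss(TRUE_VALUE, 20.0)


def tight_biased(rng):
    """Precise but inaccurate: narrow scatter, centred 15 units off."""
    return rng.gauss(TRUE_VALUE + 15.0, 2.0)


def simulate(estimator, trials=2000, seed=0):
    """Repeat the estimator many times; return the mean and spread of its estimates."""
    rng = random.Random(seed)
    estimates = [estimator(rng) for _ in range(trials)]
    return statistics.mean(estimates), statistics.stdev(estimates)


mean_a, sd_a = simulate(noisy_unbiased)
mean_b, sd_b = simulate(tight_biased)

# Bias is the gap between the long-run mean and the truth; error is the spread.
print(f"noisy unbiased: bias {mean_a - TRUE_VALUE:+.2f}, spread {sd_a:.2f}")
print(f"tight biased:   bias {mean_b - TRUE_VALUE:+.2f}, spread {sd_b:.2f}")
```

A large spread in the first case says nothing about which side of the truth any one estimate falls on; only the second case has a direction to its error.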

8. #8 z
November 9, 2005

“The Lancet methodology would have constituted multistage clustering if they didn’t clump clusters in the pairing process.”

Why? The clusters represent provinces. Why do you assume that the death rate is uniform in any province, so that randomly leaving out whole chunks of a province by including only certain clusters from that province is biased or whatever you are calling it, but randomly leaving out those few clusters is not biased?

To use your analogy, tossing the dice to decide whether to leave out Texas or California is an absolute nono, but tossing the dice to decide whether Los Angeles or Eureka represents California; no problem.
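
Whether dropping one province from each pair biases the total can be worked out exactly for a toy case. The sketch below assumes, as a simplification, that one province in each pair is chosen with probability proportional to its population and that its death rate is then applied to the combined population of the pair. The three pairs, their populations, and their death rates are hypothetical, and the within-province sampling error is ignored.

```python
from itertools import product
import math

# Hypothetical (population, deaths per 1000) for the two provinces in each pair.
pairs = [
    ((1_000_000, 5.0), (500_000, 12.0)),
    ((800_000, 7.0), (800_000, 3.0)),
    ((300_000, 20.0), (900_000, 6.0)),
]


def true_deaths(pairs):
    """Total deaths if every province were measured exactly."""
    return sum(pop * rate / 1000 for pair in pairs for pop, rate in pair)


def outcomes(pairs):
    """Yield (probability, estimated deaths) for every way the draws can go."""
    for picks in product((0, 1), repeat=len(pairs)):
        prob = 1.0
        estimate = 0.0
        for pair, pick in zip(pairs, picks):
            pair_pop = pair[0][0] + pair[1][0]
            chosen_pop, chosen_rate = pair[pick]
            prob *= chosen_pop / pair_pop             # selection proportional to population
            estimate += pair_pop * chosen_rate / 1000  # chosen province stands in for the pair
        yield prob, estimate


results = list(outcomes(pairs))
expected = sum(p * e for p, e in results)
lowest = min(e for _, e in results)
highest = max(e for _, e in results)

print(f"true {true_deaths(pairs):.0f}, expected estimate {expected:.0f}, range {lowest:.0f} to {highest:.0f}")
```

Under these assumptions the expected estimate equals the true total, so the pairing adds no bias. The wide range of possible estimates is the extra variance that both sides of the thread agree the pairing introduces.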

9. #9 Seixon
November 9, 2005

z,

Then it’s not bias. It’s error. That is why the confidence limits are very large.

Each different sample from this methodology has the possibility of producing a bias. Think about it like this: if the provinces excluded all had very low death rates, then the sample will be biased upwards. Yet as I have said, and you repeat, this will not be the same every time, but there is still potential for a biased sample. Biased does not have to mean fraudulent, you know. Yet there is also the possibility of it not being biased at all. So to generalize, as I have already said, we’ll call it sampling error.

The methodology in itself is not biased, but it has the potential to produce biased samples.

The confidence intervals do not reflect the sampling error produced by this effect, as the text in the study hints at and tries to gloss over.

You are telling us that this survey is not accurate, based on the fact that you have found a problem with its precision. Does not follow logically, they are distinct concepts.

Not at all, you are ascribing to me the opposite of the Lambert & Co dogma.

My position is that the procedure is not precise, and we cannot know how accurate it is. I have said time and time again that the study might very well be accurate despite its woeful precision, but due to the precision woes, we just don’t know how accurate it is. We can have no confidence in its accuracy. We can’t say it is accurate, or that it isn’t. We don’t know, and can’t know.

The Lambert & Co position is that it isn’t precise, but that it is accurate.

Tell me which position is logically corrupt.

Why? The clusters represent provinces.

Ehm, no they don’t. Not sure what you mean by that, but each cluster represents 1/33 of Iraq’s population, which does not correspond with any province.

Why do you assume that the death rate is uniform in any province, so that randomly leaving out whole chunks of a province by including only certain clusters from that province is biased or whatever you are calling it, but randomly leaving out those few clusters is not biased?

Have you even read the methodology? Your comments seem to indicate that you have absolutely no clue about the methodology at all. It isn’t a matter of leaving out “chunks of a province”. It’s a matter of leaving out entire provinces. If you haven’t even realized this by now, no wonder we are arguing in circles.

To use your analogy, tossing the dice to decide whether to leave out Texas or California is an absolute nono, but tossing the dice to decide whether Los Angeles or Eureka represents California; no problem.

Except that’s not what the Lancet study does…. The Lancet study leaves out Texas or California. I think you’ll agree that Texas and California are larger areas than Los Angeles and Eureka, if I am not understanding what you are trying to get at….

10. #10 z
November 9, 2005

“The methodology in itself is not biased, but it has the potential to produce biased samples.”

Well, to beat the metaphor to death, that is the equivalent of saying that with an accurate but imprecise gun, each shot is inaccurate. OK, but it’s still your best shot, so to speak.

To beat the now dead metaphor to a pulp, your POV is that you have examined the gun thoroughly and found it’s precision not up to your standards; which the manufacturer does note in the small print in the technical manual, although the PR literature makes false statements. However, upon examining the gun, you have found no sign of anything that would cause it to be systematically inaccurate, other than the fact that each shot will have a random component that would make it inaccurate by a random amount, averaging zero. Nevertheless, you state that without having a bunch of shots you can’t tell that the imprecision is covering up an inherent inaccuracy that can’t be determined by thoroughly examining the mechanism; and therefore, you are assuming that the actual bullseye is somewhere below the location of the bullet hole. To my mind, that starts off well, gets less grounded, and then veers off totally at the end. The rest of us accept that the bullseye probably does not fall dead center on the bullet hole, but think the relationship between the two is random and unpredictable, so right now “near the bullet hole” is all we have to go on for the location of the bullseye, but it’s better than the previous estimate, which was “somewhere above the ground”.

11. #11 z
November 9, 2005

“Seixon has been banned for 24 hours for repeated violation of my comment policy. I’ve deleted his comments and all the ones criticizing him from this thread. Any discussion of Seixon should go in the Seixon thread, not here.”

Oops. In that case, I’ll refrain from posting anything else, lest I be suspected of beating him with a dead metaphor while he cannot defend himself.

12. #12 z
November 9, 2005

“it’s precision”
But I will say that I meant “its”.

13. #13 Donald Johnson
November 10, 2005

Is this the Seixon statistical criticism thread then? I assume it is, but just checking to be sure.

14. #14 Seixon
November 10, 2005

However, upon examining the gun, you have found no sign of anything that would cause it to be systematically inaccurate, other than the fact that each shot will have a random component that would make it inaccurate by a random amount, averaging zero.

You want to explain why it would average zero? Or did you just pull that out of a hat? The gun is systematically inaccurate, only it doesn’t pull to any one direction systematically. ;)

Nevertheless, you state that without having a bunch of shots you can’t tell that the imprecision is covering up an inherent inaccuracy that can’t be determined by thoroughly examining the mechanism; and therefore, you are assuming that the actual bullseye is somewhere below the location of the bullet hole.

Once again misrepresenting my position, again trying to juxtapose me opposite of the position that people like Lambert and you are taking: that the study is accurate. I’m not claiming it isn’t accurate, I’m claiming that the study is so imprecise and unaccountable that there is no reason to put any weight behind its conclusions. It might be accurate, might not be.

To use your metaphor against you: you know you have an imprecise gun, and you are missing enough shots to be able to make any meaningful conclusion about the accuracy of the gun, yet you claim the gun is accurate anyway.

The rest of us accept that the bullseye probably does not fall dead center on the bullet hole, but think the relationship between the two is random and unpredictable, so right now “near the bullet hole” is all we have to go on for the location of the bullseye, but it’s better than the previous estimate, which was “somewhere above the ground”.

Well, it looks like my characterization of you was accurate. How do you know that it is “near the bullet hole”? Your gun is so imprecise, how can you even claim it to be near? In fact, with the shots you have taken, all you can claim is that the bullseye is somewhere on the board.

Sure, something is better than nothing, but that doesn’t justify walking around claiming that this something is accurate, robust, and the damn truth.

I also note that you stayed far away from talking about the methodology once you found out you didn’t know what you were talking about regarding clusters and provinces… So in other words, I can see why you were thinking the way you did – you just didn’t have your facts straight on the methodology.

15. #15 Tim Lambert
November 10, 2005

Yes, this is Seixon’s thread. Posts by Seixon about the Lancet and any criticism of Seixon belong here and nowhere else.

16. #16 z
November 10, 2005

“You want to explain why it would average zero? Or did you just pull that out of a hat? The gun is systematically inaccurate, only it doesn’t pull to any one direction systematically. ;)”

Because you haven’t been able to show any reason why it would systematically go to one direction or another; i.e., bias. Or is this part of your position that it is not up to critics to come up with a better estimate, it is up to non-critics to come up with an estimate that proves themselves wrong?

17. #17 Seixon
November 10, 2005

Because you haven’t been able to show any reason why it would systematically go to one direction or another; i.e., bias. Or is this part of your position that it is not up to critics to come up with a better estimate, it is up to non-critics to come up with an estimate that proves themselves wrong?

You sound like Mary Mapes. “I don’t have to prove that my estimate is authentic. I don’t think that’s the standard.” LOL.

No, it doesn’t systematically go in any certain direction, namely because it is an unquantifiable error. That still doesn’t mean that it “averages zero”, and even if it did “average zero” that would have no relevance since you are only conducting the sample one time! The only way that would be relevant is if you did the sample 10 times and then combined those samples to publish a result.

You are hopping and skipping away from the fact that:

1) They cut the corners on the methodology
2) This resulted in a sampling error
3) This sampling error is not computable
4) Their confidence interval, result, and DE do not account for this error
5) That matters because THEY introduced this error, and not an external factor

18. #18 Seixon
November 10, 2005

I guess I will just have to post this here since Lambert can’t be bothered with defending his debunked arguments:

The ILCS does not agree well with the Lancet survey. The infant mortality rates are quite different, and the ILCS has much too vague questions about mortality for Lambert to be making the assertions he has made here and in the past. The ILCS asked respondents about “war-related” deaths, which would leave it up to the respondent to decide whether they would note it as that or disease, accident, pregnancy-related, or other. Lambert has excluded the possibility of a disease, accident, pregnancy or criminal death being “war-related” and the respondent assessing the question in this manner. As usual, Lambert eliminates all possibilities that do not mesh with his conclusions.

19. #19 z
November 11, 2005

“The ILCS asked respondents about “war-related” deaths, which would leave it up to the respondent to decide whether they would note it as that or disease, accident, pregnancy-related, or other.”

In other words, the ILCS can be considered to represent an underestimate of the total deaths.

20. #20 z
November 11, 2005

“The only way that would be relevant is if you did the sample 10 times and then combined those samples to publish a result.”

And statistical theory goes out the window.

“1) They cut the corners on the methodology ”

No study can ever be “perfect” (particularly by your definition). All studies are hampered by real world constraints. Medical trials, for example, are usually badly underpowered due to the cost of getting sufficient subjects. Nevertheless, they deliver information. One learns to scale the validity of such information on the basis of “what they actually did”, unlike the media, which immediately sells newspapers on the basis of “cure found for cancer!”. Certainly, you are entitled to your estimate of the validity of this survey; we all disagree. You’ve enlightened (not sarcastic) us to the fact that the precision of the survey is not great, not even as great as their confidence interval would indicate. OK, thanks, but of course that means that the true number is as likely to be > 98000 as it is to be less than that.

As with any study, the “so what?” question comes up. Nobody here was particularly vested in the number being precisely 98,000, I believe, as would life insurers, coffin manufacturers, etc. Nor the researchers involved, nor the Lancet editors. It’s not really a linear relationship; 50,000 excess deaths would be just as shocking and horrifying as 98,000, or 20,000. If I had to put the point of the survey into one sentence it would be that this war to save the Iraqi people from the ravages of Saddam has resulted in a large increase in their death rate, which is indicative of a certain lack of achieving its goal. Normally, this would cause one to rethink current and future strategies, with an eye towards getting back on track.

And the survey has indeed proved valuable and widely accepted, in that beforehand the war supporters were pooh-poohing the Iraqi BodyCount numbers as ridiculously high and now that has become their fallback position, despite being an obvious underestimate. Thus does information weasel its way into the general wisdom, even if the source of such information did not perfectly achieve its ends.

21. #21 Seixon
November 12, 2005

In other words, the ILCS can be considered to represent an underestimate of the total deaths.

Yes. And? So my criticism of Lambert’s cherry-picking stands then?

And statistical theory goes out the window.

Surely you’re not confusing the rest of statistics with sampling, now are you? Like how someone recently started talking about coin-tosses as if that were taking a “sample”… lol.

No study can ever be “perfect” (particularly by your definition). All studies are hampered by real world constraints.

Of course, but what this study does is attempt to seem more precise than it actually is. Also, they could have just extrapolated the results for those parts of Iraq they went to, and not for all of Iraq. That would have been an honest way of going about it. In other words, they should have just extrapolated the result for the 75% of Iraqis they sampled, and not for 97% (because of Fallujah) that they ended up doing.

I’m not saying that things have to be “perfect”, as no survey ever is, but with this one there are things that went wrong beyond the normal kind of constraints. “Clumping of clusters” is something that I cannot find in any other survey, which means that they sliced and diced this methodology a bit too much. The fact that this slicing and dicing doesn’t and can’t end up in the final estimates is also very troubling.

OK, thanks, but of course that means that the true number is as likely to be > 98000 as it is to be less than that.

Again, how do you know that? It depends entirely on those unsampled provinces.

Nobody here was particularly vested in the number being precisely 98,000, I believe, as would life insurers, coffin manufacturers, etc.

Oh really? You might want to check with Lambert on this one… And you know, all the anti-war people who keep using the 100,000 as if it were a fact.

It’s not really a linear relationship; 50,000 excess deaths would be just as shocking and horrifying as 98,000.

So why was 98,000 so important? Why was it rounded up to 100,000? Why was the “civilians” connotation added?

If I had to put the point of the survey into one sentence it would be that this war to save the Iraqi people from the ravages of Saddam has resulted in a large increase in their death rate, which is indicative of a certain lack of achieving its goal.

And as a response, I would say that anyone would be out of their mind to believe that the death rate would not go up in a time of war, and you are also leaving out the fact that the insurgents have killed over 10,000 innocent Iraqis in the last 2 years. You are also looking at this quite short-sightedly. Saddam’s regime lasted for around 30 years. Yet you are ready to claim that just 2 years after his removal, things are already going to be worse for the next 28? Hmmm.

Using that kind of logic would get you pretty screwed if you took a trip back to 1945, for example…

And the survey has indeed proved valuable and widely accepted, in that beforehand the war supporters were pooh-poohing the Iraqi BodyCount numbers as ridiculously high and now that has become their fallback position, despite being an obvious underestimate.

I did not accept the IBC numbers during the first months of the war. That is because the counts seemed too high at that time. A long time has passed since then, and the IBC numbers now seem very realistic and are in fact most likely an underestimate. So I think you are sort of leaving out the time frame here, in my case anyways. There are of course partisan pro-war people who acted as you just said, but there are people on the other side who do the same thing.

So I guess that means that at least one person grudgingly admits that there are problems with the methodology, but accepts the problems due to constraints.

As long as you don’t go around citing the Lancet study as some sort of fact, you are good on my list. It is those such as Lambert who continually keep up the farce that the Lancet study is “robust” and immune from all criticism that I find very hard to stomach. Not to mention the reliance on the results as if they were written in stone…

With such imprecise methodology and imprecise results, only the most rabid of partisans will claim that the results are proof of anything. Especially when they don’t acknowledge that additional imprecision was introduced by the pollsters that goes beyond the normal errors one typically finds in a survey.