How can you tell if an elephant is hiding in your fridge?

Seixon is no great shakes with statistics, but he sure can do things with a metaphor:

I swear, commenting on this blog is like a game of hide-and-seek with the elephant somewhere in the room. Lambert and his friends giggle every time I open up a cupboard and don't find the elephant, even though the elephant is somewhere to be found. In the process, they keep making false statements to distract me and make me look in places where the elephant can't be found.


Looks like classic conspiracy-theory "logic" to me. Repeated failures to prove a theory don't indicate that the theory might be wrong; they only prove the existence of diabolical conspirators determined to suppress the evidence, which of course they wouldn't do if the theory weren't correct. Oh brother.

Well, you would say that, wouldn't you, Platypus? Anything to muddy the waters, right? Some of us do know the real reason why you're dissing Sexion as just another autodidactic crank. Remember, the truth is out there.

And you really think we will believe it's just "coincidence" that the word "elephant" has eight letters?

Incidentally, it's real easy to tell if an elephant is hiding in your fridge. Just check the butter for really big footprints.

coop, if there was no elephant, why would Lambert be making so many false statements? Why can't he find a single piece of evidence to show that the methodology used was acceptable?

The metaphor was to say that some people are being intellectually dishonest here; they know that there is something wrong, but keep the less-inclined from discovering it.

It is quite obvious that Lambert has a good hold on statistical theory, and me less so, but that doesn't excuse him from making obviously false statements, especially since he knows better.

He thought he saw an Elephant,
That practiced on a fife:
He looked again, and found it was
A letter from his wife.
"At length I realize," he said,
"The bitterness of Life!"

-- Lewis Carroll

Bertrand Russell, in Portraits from Memory:

I wanted certainty in the kind of way in which people want religious faith. I thought that certainty is more likely to be found in mathematics than elsewhere. But I discovered that many mathematical demonstrations, which my teachers expected me to accept, were full of fallacies, and that, if certainty were indeed discoverable in mathematics, it would be in a new field of mathematics, with more solid foundations than those that had hitherto been thought secure. But as the work proceeded, I was continually reminded of the fable about the elephant and the tortoise. Having constructed an elephant upon which the mathematical world could rest, I found the elephant tottering, and proceeded to construct a tortoise to keep the elephant from falling. But the tortoise was no more secure than the elephant, and after some twenty years of very arduous toil, I came to the conclusion that there was nothing more that I could do in the way of making mathematical knowledge indubitable.

Henry David Thoreau, "Walking":

The Hindoos dreamed that the earth rested on an elephant, and the elephant on a tortoise, and the tortoise on a serpent; and though it may be an unimportant coincidence, it will not be out of place here to state, that a fossil tortoise has lately been discovered in Asia large enough to support an elephant. I confess that I am partial to these wild fancies, which transcend the order of time and development. They are the sublimest recreation of the intellect. The partridge loves peas, but not those that go with her into the pot.

Seixon, you are convinced that Lambert is making false statements because he is telling you that the elephant is a figment of your imagination. Now suppose, just suppose for the hell of it, that there is no elephant. In that situation, wouldn't his statements actually be true? If not, which statement of his is false (in the absence of an elephant)?

By Kevin Donoghue (not verified) on 20 Oct 2005 #permalink

Kevin,

Where do I begin?

He claimed that I did not understand sampling, cluster sampling, etc.

He claimed that "clustering of clustering is called multistage clustering and yes it's a valid method as has been explained several times already".

He claimed that the Lancet grouping process was cluster sampling, clustering at the governorate level, etc.

His plot graphs operated on the basis that the mortality found from a sample of people would vary as much as between 1 and 6 or 1 and 13, with each of the numbers being equally possible...

He said, "BruceR, they used the results they found to boot-strap a probability distribution, so the more different the pairs are, the more variation in the distribution and the larger the confidence interval you get."

Ehm, how exactly could they determine the distribution and the confidence interval then, when they didn't even know the difference between the pairs??

He said, "Why should I go into the details of all the many mistakes when folks have already calculated it for you in comments on your blog? You'll just ignore the answer because you don't like it."

Again, the comments on my blog did not calculate what I was looking for, and the mistake I had made earlier that I sent to Lambert.

And of course, "The Lancet did not lie about the conclusions of the study — you are just projecting."

Another unsubstantiated comment: "Seixon, the Lancet study used the mean."

A misleading one: "Heck, where is your condemnation of the UNDP survey? I mean, except for Baghdad, they had the same number of clusters in each governorate, so each household did not have the same chance of being sampled."

I think that covers most of them. I sure hope so for Lambert, anyways. You didn't care or respond when I pointed these out earlier, I doubt you will care now either.

Hey Seixon, I notice you think Lambert makes a ridiculous statement when he says:

"BruceR, they used the results they found to boot-strap a probability distribution, so the more different the pairs are, the more variation in the distribution and the larger the confidence interval you get."

Now you may be right, because I remember there was something about bootstrapping in my statistics book (I'm studying to become an engineer; we learn some very basic statistics). However, our course didn't cover it, and I was too idle to figure it out by myself. Since you say this statement is ridiculous, I suspect you are very knowledgeable about statistics, since you can call Tim Lambert (the computer science lecturer) on his bluff.

Go ahead. Explain bootstrapping a probability distribution to me, and why Tim Lambert's statistics speak above is ridiculous.

I am not an expert in statistics, but I can read. After reading this, I comprehend bootstrapping to be a process for ensuring that the sample you have collected is a typical one.

In other words, instead of having to collect your sample multiple times to ensure you got a typical result, you just use the sample you already took, thus being more efficient and cost-effective.

This is done by resampling your original sample, but with replacement.

In other words, say my sample is:

4, 15, 10, 13, 12, 16, 20, 7

Then my mean is: 12.125

Now I will bootstrap and resample my sample, with replacement. Since there are 8 elements in the sample, I will draw a number between 1 and 8 eight times, and conduct the experiment 3 times.

20, 15, 12, 20, 4, 13, 16, 7
7, 7, 4, 20, 13, 15, 10, 15
10, 16, 4, 15, 12, 10, 7, 15

This gives means of 13.375, 11.375, and 11.125. The mean for the bootstraps, or whatever you want to call them, is then 11.958.

The SE for the bootstrap is then: 1.233
The SE for the original is: 1.807

Of course, you're supposed to do this for larger samples, and you're supposed to repeat the bootstrap resampling many, many times. This was only to serve as an example of what bootstrapping is.
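For anyone who wants to check the arithmetic, here is a minimal Python sketch of the same procedure (NumPy assumed; the three resamples are hard-coded to match the example above, whereas a real bootstrap would draw thousands of random resamples):

```python
import numpy as np

sample = np.array([4, 15, 10, 13, 12, 16, 20, 7])
print(sample.mean())                               # 12.125
# SE of the original sample mean: s / sqrt(n)
print(sample.std(ddof=1) / np.sqrt(len(sample)))   # ~1.807

# The three resamples from the example above, drawn with replacement.
resamples = np.array([
    [20, 15, 12, 20, 4, 13, 16, 7],
    [7, 7, 4, 20, 13, 15, 10, 15],
    [10, 16, 4, 15, 12, 10, 7, 15],
])
boot_means = resamples.mean(axis=1)                # 13.375, 11.375, 11.125
print(boot_means.mean())                           # ~11.958
print(boot_means.std(ddof=1))                      # ~1.233 (bootstrap SE)

# In practice you would draw many random resamples instead:
rng = np.random.default_rng(0)
many = rng.choice(sample, size=(10_000, len(sample)))
print(many.mean(axis=1).std(ddof=1))               # settles near the analytic SE
```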

So I cannot see how the difference between the pairs would be alleviated by using the bootstrap, since the bootstrap only uses the values from the sample already taken, where the excluded partners of the pairs are not included.

The bootstrap doesn't help us with that, as the bootstrap idea is:

The original sample represents the population from which it was drawn. So resamples from this sample represent what we would get if we took many samples from the population. The bootstrap distribution of a statistic, based on many resamples, represents the sampling distribution of the statistic, based on many samples.

This doesn't help us figure out how much our sample varies due to the exclusion of 6 provinces. Yet as always, I am open to persuasion by Mr. Lambert if he actually tries to explain something instead of just steamrolling me with Authority.

That's a good question, Seixon, at least from where I sit. I doubt I'll be answering it, not having known what bootstrapping was before. Maybe the fact that any household had an equal chance of being sampled makes the 33 clusters representative enough to justify the bootstrapping procedure, but I skimmed my very first articles on the subject today (including yours), so darned if I know. The paper says that bootstrapping was used to obtain the CI under the assumption that the clusters were exchangeable, and I'm not sure what that means either.

By Donald Johnson (not verified) on 21 Oct 2005 #permalink

First one has to ask how many elephants can fit in a Volkswagen. Answer: two in the front, two in the back.

THEN you can ask the main question, to which Nabakov had the correct answer: Footprints in the butter.

Okay, but how can you tell if there are TWO elephants in the fridge? Answer: TWO sets of footprints in the butter!

Sure, but how can you tell if there are THREE elephants hiding in your fridge? Answer: THREE sets of footprints in the butter!

And how can you tell if there are FOUR elephants hiding in your fridge?

...Give up?? I thought so...

Answer: There's a Volkswagen parked in your living room!

(rim shot)

Well I once shot an elephant in my pajamas.

How he got in my pajamas I'll never know.

Q. Why don't you see more elephants drinking martinis?
A. Have you ever tried to get an olive out of your nose?

Oh yes, and if you're blind, how do you tell the difference between an elephant and a grape?

Jump up and down on it for a while. If you don't get any wine, it's probably an elephant.

Donald,

What they meant by the assumption that the clusters were exchangeable was that they assumed that the provinces they paired up were similar.

Also, as I have been trying to solicit an answer for, a random sample, by definition, is one where any sample of n elements has the same probability of being selected. Now how can this be with the Lancet's methodology, when there is 0% probability to select a sample that includes households from both of the provinces from any of the pairings, and more than 0% otherwise?

Bootstrapping simply emulates repeating the same sample you already got however many times, and it uses the sample you already gathered. Thus, how could it ever pick up the differences between pairs? Short answer: it couldn't.

I guess Tim would rather read elephant jokes than make comments about this.

Donald,

I'm no authority on bootstrapping either. Computers were truly elephantine when I learned statistics: big, slow-moving beasts that you certainly couldn't hide in a fridge (but you did have to keep them in a temperature controlled environment). However in a previous thread a slightly overweight thirty-something who knows a bit about these things provided this explanation of the number-crunching for the Lancet study:

For what it's worth, they fitted a generalised linear model to each cluster's pre- and post-invasion death rate. They created a likelihood function for the data they observed relative to an assumed multiplicative (or additive in logs) risk factor, and picked the risk factor which maximised the (log-) likelihood function. That's one model, by the way, taking the 32 ex-Fallujah clusters as data points, not "32 regression lines" (in fact there was no "regression" in the sense of least squares as the model was estimated by conditional ML; the word "regression" in the paper refers to the assumed underlying model, not the fit).

Then they bootstrapped the confidence intervals by taking each cluster's pre- and post-invasion death rates and shuffling them around (attaching one cluster's post-invasion death rate to another cluster's pre-invasion death rate to create a pretend cluster) and re-estimating the model. Repeat this enough times and you get a bootstrapped distribution of "risk factors" for the various pretend datasets. Take the [2.5] and [97.5] percentiles of that distribution and call them your confidence interval.

I found it possible to get the hang of what's involved by writing a short program which bootstraps 32 plausible-looking before-and-after crude mortality rates (using the figures and bar-charts in the report as a guide), to see what the distributions of the resampled means and risk ratios look like. Oddly normal: it seems the Central Limit Theorem is still in force. I haven't tried it with the Fallujah cluster thrown in, however.

As to the assumption that clusters are interchangeable, has anybody suggested a way to do a bootstrap without that assumption? It seems to me that such an assumption is implicit in any bootstrapping exercise.

As to Seixon's "n households", that requirement is violated by every cluster sample. Consider n=2, and note that if I get surveyed, chances are pretty good my next-door neighbour does too. With a purely random sample the probability that we both get sampled is microscopic.

By Kevin Donoghue (not verified) on 23 Oct 2005 #permalink
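For the curious, here is a rough Python sketch of the kind of cluster-level resampling described above. All the numbers are made up, and it keeps each cluster's pre/post pair intact (one common variant of the cluster bootstrap), whereas the description above shuffles post-invasion rates across clusters; treat it as an illustration of the mechanics, not a reconstruction of the study.

```python
import numpy as np

rng = np.random.default_rng(1)

# Made-up pre- and post-invasion crude death rates for 32 clusters
# (deaths per 1000 per year; illustrative only, not the study's data).
pre = rng.gamma(shape=5.0, scale=1.0, size=32)
post = pre * rng.lognormal(mean=0.4, sigma=0.5, size=32)

def risk_ratio(pre_rates, post_rates):
    return post_rates.mean() / pre_rates.mean()

# Cluster bootstrap: resample whole clusters with replacement,
# re-estimate the risk ratio each time.
boot = []
for _ in range(10_000):
    idx = rng.integers(0, 32, size=32)
    boot.append(risk_ratio(pre[idx], post[idx]))

lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"point estimate {risk_ratio(pre, post):.2f}, 95% CI ({lo:.2f}, {hi:.2f})")
```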

I don't think that this is an issue that requires statistics; it's possible to find the answer through simple legal principles. We're entitled to assume that any evidence one of the parties has but won't produce is to their discredit.
The American government has internal counts of dead Iraqis; they could release them if they wanted to; the only possible reason for not releasing them is that they're as bad as or worse than the Lancet figures.
You can't take seriously a debate that on one side just says "Think of a number between one and 200,000."
"100,000?"
"Wrong. try again."

I should, for the sake of completeness, add the footnote that if the American government has in fact not got such figures -- if it has in fact instructed every relevant section never to count dead Iraqis -- then this, too, is sufficiently unreasonable to be taken as evidence of a consciousness of guilt.

"I don't think that this is an issue that requires statistics, It's possible to find the answer through simple legal principles."

Exactly. Statistics doesn't get you the answer to a question, it gets you your best estimate. If you only have one estimate of the question, that is your best estimate. You may quibble all you like, but it's still your best estimate until somebody provides another. Then you have to judge which of two or more estimates is better. Even if you aren't satisfied with the Hopkins methodology, I don't see how you could go with an opposing methodology of "Seems more reasonable".

Kevin,

Since clusters are the sampling units in cluster sampling, then how about "n clusters"?

What is the probability of a cluster ending up in Basra and one in Missan? 0%. Repeat for 5 other pairs. In other words...

I have 6 red marbles, 3 green marbles, and 1 yellow marble. A random sample of 2 marbles from this population should have an equal chance of sampling any 2 of those marbles.

Now, let's say that the Lancet study decides that these marbles (clusters) will have a different rule. Which is, the yellow marble cannot be chosen if a green marble is chosen. Will a sample of these marbles continue to be random with this rule in place? Hardly. We have biased the selection by creating arbitrary rules.
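A quick simulation makes the bias concrete (a sketch, assuming the rule is enforced by redrawing any pair that combines the yellow marble with a green one):

```python
import random

marbles = ["R"] * 6 + ["G"] * 3 + ["Y"]

def draw(rule):
    while True:
        pair = random.sample(marbles, 2)
        # Under the rule, a sample pairing yellow with green is disallowed.
        if not (rule and "Y" in pair and "G" in pair):
            return pair

trials = 100_000
for rule in (False, True):
    hits = sum("Y" in draw(rule) for _ in range(trials))
    print(f"rule={rule}: P(yellow in sample) ~ {hits / trials:.3f}")

# Without the rule: ~0.200 (= 9/45). With it: ~0.143 (= 6/42).
# The rule changes the marbles' inclusion probabilities, so the
# draw is no longer a simple random sample.
```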

Cluster sampling does not void this principle, as long as you use the principle for the clusters. From what I can gather, the Lancet study did not use the principle for the clusters, thus violating it, and so the sample of clusters was not random.

As for bootstrapping, from everything that has been said, and what I read, I cannot fathom how it could correct the CI for the excluded provinces. Bootstrapping uses the sample you already had, so how can they make corrections to the CI in respect to the increased variance from the excluded ones?

When they said "it is likely", as I quoted, I cannot conclude anything other than that they knew that the exclusions increased the variance of their sample, but could not figure out how much. Just as they would say that "it is likely" that people would lie about rations, etc. There's no way to calculate the error from people lying in a survey. However, they themselves introduced this error caused by excluding the provinces. That they can't measure the error caused by this is therefore quite a flaw.

Seixon, until the sampling is done, there are no clusters to select, only potential clusters. So your question, as I understand it, relates to the probability that two distinct potential clusters become actual clusters. If they are both in Muthanna that probability is zero, but not if they are both in Baghdad. I don't see any reason to worry about that, since it's all about households, not clusters or provinces.

As for bootstrapping, from everything that has been said, and what I read, I cannot fathom how it could correct the CI for the excluded provinces.

If tens of thousands of people were slaughtered or smitten by a plague in Basrah or Najaf during the months before or after the invasion, then the CI is of course very misleading. Do you really think anyone disputes that? If so, you are probably just misreading. The bootstrap performs much the same function as the t-statistic used to perform in the old days. Of course there could be a problem with it, but you haven't made a convincing case that there is. As Ragout said earlier:

If Seixon is right that the pairings were poor, then the reported SEs could be underestimated substantially. Seixon isn't that persuasive, so I think Lambert is probably right to guess that the pairings don't affect the SEs that much.

Ragout does not think very highly of the Lancet study, but he hasn't reported seeing any elephants. That tends to reinforce my belief that there are none hereabouts.

By Kevin Donoghue (not verified) on 24 Oct 2005 #permalink

Kevin,

In cluster sampling, clusters are the sampling unit, not households. You would treat them as you would households. From Wikipedia:

Cluster sampling is used when "natural" groupings are evident in the population. The total population is divided into groups or clusters. Elements within a cluster should be as heterogeneous as possible. But there should be homogeneity between clusters. Each cluster should be a small scale version of the total population. Each cluster must be mutually exclusive and collectively exhaustive. A random sampling technique is then used on any relevant clusters to choose which clusters to include in the study. In single-stage cluster sampling, all the elements from each of the selected clusters are used. In two-stage cluster sampling, a random sampling technique is applied to the elements from each of the selected clusters.

So tell me, were the clusters in Basra mutually exclusive? Bzzzt. Face it, the sample of clusters was not random.

I love it, there could be a problem with the pairings giving a larger error than the CI the study used, but psh, we won't pay any attention to that. That's what I have been saying all along: the error produced by that is unknown thus any "guesses" are irrelevant.

The pairings could have had no effect, small effect, or horrendous effects. We can't know which it is, so quietly pretending that there was no effect seems very disingenuous, especially since the error stems from their own methodology.

When you combine that with the fact that my circumstantial indicators show that the pairings were not similar... I think that is a better argument than "no, it's fine, trust me".

I mean, there are always accepted biases in samples, such as self-reporting bias, the differences between household sizes, and so on. These are biases you cannot correct and are just a given. The error that they produced with this methodology was not a given, they created it.

Seixon, there were no clusters in Basrah. Are you really asking, in all seriousness, whether the elements of the empty set are mutually exclusive? If there are no angels on the head of a pin, are these non-angels distinct from each other? If the barber shaves all the men who don't shave themselves, does he need a shaving-mirror?

Enough already, you should be doing theology, not statistics.

By Kevin Donoghue (not verified) on 24 Oct 2005 #permalink

The other elephant, of course, is that they deleted Falluja from the primary results, based on wishes to be on the conservative side of the estimate, i.e. biased low. Would whatever effect of the paired province selection be strong enough to overcome that?

Kevin,

I might as well say that there were no households in Basra. In any case, the clusters were not mutually exclusive as they were grouped together in bunches of 3 in 6 pairs to give a result that was atypical of a random sampling. You obviously didn't even read the simple explanation from Wikipedia. In other words, Basra DOES have clusters, because the population is divided up into clusters, which is what makes the Lancet methodology even more wrong. Missan got 3 clusters?? Each cluster represents 739,000 people, and Missan has 685,000 people. Even a child sees that something is wrong with that.

And no, I should not be doing theology since I am atheist and have no interest in it.

z,

You're saying that the elimination of a clear outlier is not statistically sound? Dude, I learned that this was OK when I was back in high school. Funny stuff. Bootstrapping should have taken care of outliers anyways.

I think folks on this blog, Tim included, have been uncharitable to a dissenter who has remained fairly charitable himself. If nothing else, commenters should consider where they would be without dissenting views. I'll tell you where - reading a boring set of comments.

Additionally, Sexion's criticism has helped explore the issues and forced folks to spell out their positions in a little more detail. That's not bad either.

The discussion on this blog has led me to strengthen my arguments, and to root out errors I had made in the process.

My next step, whenever I get the time, will be to write up a mock study design that mirrors the Lancet methodology, without including any notion that it is about Iraq. I could then pass this off to any number of stats professors, and see what their responses to it are. Perhaps I can even get the stats professor I had to take a look. It is apparent that I won't get any further on this issue without some "authority" to stand behind what I am saying.

As expected, Kevin and other Lambert-apologists didn't care about his false statements, even though Kevin solicited them from me once more. I will take that as a sign of silent consent.

PS. I love it when people have a Freudian slip when writing my alias... ;)

Slick-something, the lack of charity has gone both ways. Seixon started out with a wrong position and was very argumentative about it, and harsh words were flung around in all directions. Seixon also had one good idea--the notion of using American troop deaths as a way to see how violent various provinces have been. The results were mixed, I think--taking Seixon's word for it (I haven't looked at the numbers myself), 5 of the 6 unsampled provinces were less violent than the ones chosen, which I take to be a bit of bad luck, something you could expect from sampling. On the other hand, possibly the most violent province, Anbar, also was excluded in the main analysis because its cluster was extremely violent, and Bruce R (again citing Seixon, since I didn't pay attention whenever the argument took place) pointed out that when you exclude Anbar, the selected provinces showed an almost identical violence level to the overall Iraq average. That seems to strengthen the case for the Lancet numbers--things worked out so that they sampled a representative area of Iraq. Not the conclusion Seixon wants to draw, but as I understand it that's what one would conclude using his idea.

Whether their pairing procedure caused a problem calculating the CI is something several of us have wondered. Being one of the non-experts here, I don't know and haven't had time to think about it much (not that I think I'd necessarily understand it better if I did.)
I'm a little befuddled--if I think about it more maybe I'll be more befuddled.

Thanks for the dsquared quote, Kevin. I almost felt like I understood much of it, but that's probably an illusion.

By Donald Johnson (not verified) on 25 Oct 2005 #permalink

I see that Seixon is trying to hijack the latest Lancet thread for issues which don't belong there. I'm not going to continue responding to him, having spent more than enough time doing so already, but I won't let this pass:

[Tim Lambert has] not addressed the numerous patently false statements [he] made in the process of the previous debate. Kevin wanted to see them, I gave them, and he never commented.

This refers to my suggestion in this thread:

Seixon, you are convinced that Lambert is making false statements because he is telling you that the elephant is a figment of your imagination. Now suppose, just suppose for the hell of it, that there is no elephant. In that situation, wouldn't his statements actually be true? If not, which statement of his is false (in the absence of an elephant)?

Seixon responded with his usual gripes, all derived from the assumption that there really must be an elephant around here somewhere: "He claimed that I did not understand sampling, cluster sampling, etc."

Indeed he did, Seixon. But you supplied the evidence to support his claim. That most certainly is not a "patently false statement" by any means. To illustrate it there are three threads on this blog and, last time I looked, three on your own.

By Kevin Donoghue (not verified) on 31 Oct 2005 #permalink

Good job Kevin, you picked out the only statement on that list about which there can be subjective opinions. I have demonstrated knowledge of sampling on this blog and my own, although I miscalculated things to do with probability. So if you and Lambert are going to accuse me of anything, accuse me of being sloppy with probability theory, not with knowledge of how sampling works and is done.

From the get-go, Lambert misrepresented what I was saying to make me sound like a complete idiot, just setting up one strawman after the other and knocking them down.

There were more than that statement on the list I made. You won't touch any of the other ones because they all show that, if we were to use Lambert's opinions about who knows how sampling works, then it would actually be Lambert that doesn't have a clue.

Clustering of clusters is multistage clustering? LOL. Just one of many priceless false statements that Lambert has squeezed out to keep from actually answering to the beef of the matter, beating around the bush, and now he has started censoring my comments. Priceless.

So, when you compiled your list of Lambert's alleged falsehoods, you chose to place at the top a claim which, on your own admission, you cannot substantiate. Think about that if you want to understand why you are not taken seriously.

By Kevin Donoghue (not verified) on 02 Nov 2005 #permalink

Kevin,

Since that was the first one I came across, I naturally listed it first. I cannot substantiate it? I have demonstrated that I know what cluster sampling is, and how sampling works. The only thing I did wrong was the probability theory. Lambert also misrepresented my blog post from the get-go to paint me as a complete moron.

Now why don't you stop fudging around and get to those other falsehoods? You know, instead of assuming that I listed them in order of importance just so you'd have a red herring to escape with. Assumptions seem to be a common theme around here...

The first complaint on your list isn't really any sillier than the others. They are all silly. You want me to play whack-a-mole with your other gripes, but you haven't actually withdrawn this one, nor have you offered any sensible argument to support even one of your assertions. Why should I do all the work? If you have a case to make, make it.

By Kevin Donoghue (not verified) on 02 Nov 2005 #permalink

Seixon, you still don't understand how sampling works. You're still claiming that the sample wasn't random even though Kevin showed that you were wrong earlier in this thread. The samples are only independent with simple random sampling. With cluster and systematic sampling they are not.

Kevin,

Just keep saying they are all silly, sure beats actually showing that they are! As for not "proving" the first one: I have asserted that I made mistakes in probability theory, not sampling theory. If you feel that is incorrect, please provide some evidence that I do not understand sampling theory. Making a mistake in probability theory doesn't mean I don't understand it, either. After figuring out what went wrong, I have acknowledged the mistake and I understand it full well.

Lambert,

Kevin showed no such thing, he played a bait and switch and then when I had him, he stopped responding.

Cluster sampling, SESS sampling, stratified sampling are all meant to ensure a representative distribution of either clusters or households. With cluster sampling, clusters are the primary sampling unit, same with SESS. Stratified sampling uses demographics to ensure a representative distribution.

YET! When the distribution of the sampling units has taken place, THEN SRS takes place. As you have all made abundantly clear, each household has an equal chance of selection in all of these methods.

However, with cluster sampling, clusters are the primary sampling unit. You either have to use a method that will give a representative distribution, or do it with SRS. SESS ensures a representative distribution of clusters, which is why it is used.

The Lancet method does not adhere to this at all, as it will ALWAYS produce an UNREPRESENTATIVE sampling of clusters. Do you want to argue with this? Go ahead and try it.

Also, what I have been saying for at least over a week now is that the clause about a sample of size n having to have the same probability of being chosen as any other sample of size n has to apply, and it does for all the above mentioned methodologies: except the Lancet method.

In none of the other methodologies is there a 0% probability of both Basra and Missan being sampled. None of them. That in itself violates this statistic principle. Then you multiply that by 6 since this happened with 6 pairs of provinces, and you've got yourself quite a problem.

In other words, let's look at this in microcosm with governorates A, B, and C. B and C are paired up as with the Lancet methodology, meaning only one of them can be in the sample at a time. A sample size of n=2 gives:

S = {AB, AC}

If it were a random sample, it would be:

S = {AB, BC, AC}

That is because every sample of size n has to have the same probability, and the Lancet methodology gives P(BC) = 0.

P(BC) will never have a probability of 0 with SESS, cluster sampling, stratified sampling, or anything else.

Only with the brilliant Lancet methodology.

Thank you sirs, I await your rebuttal.

Sigh. Let's look at a really simple example. Our population is size 4 and we are choosing a sample of size 2.

Simple random sampling chooses each subset of size 2 with equal probability, i.e. 12, 13, 14, 23, 24 and 34 are all equally likely.

If we instead use cluster sampling with a cluster size of 2, then 12 and 34 are the only possibilities, so all subsets are not equally likely.

And if we use systematic sampling with a step size of 2, 13 and 24 are the only possibilities so all subsets are not equally likely.

Kevin explained this to you, but it bounced off your fact shield.
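The three cases are easy to enumerate in a few lines of Python (a sketch of the example above, with clusters {1,2} and {3,4} and a step size of 2):

```python
from itertools import combinations

population = [1, 2, 3, 4]

# Simple random sampling: every 2-subset can occur, all equally likely.
print(list(combinations(population, 2)))
# [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]

# Cluster sampling with clusters {1,2} and {3,4}: pick one whole cluster.
clusters = [(1, 2), (3, 4)]
print(clusters)    # only (1, 2) or (3, 4) can occur

# Systematic sampling with step 2: random start, then every 2nd unit.
systematic = [tuple(population[start::2]) for start in (0, 1)]
print(systematic)  # only (1, 3) or (2, 4) can occur
```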

Lambert,

And if we use systematic sampling with a step size of 2, 13 and 24 are the only possibilities so all subsets are not equally likely.

You wouldn't use SESS with such a small sample. There are guidelines for the use of SESS. I guess that doesn't matter, since here the game seems to be omitting as many facts as it takes to be correct. Also, for SESS and cluster sampling, clusters are the primary sampling unit.

In other words, if 12 is cluster 1, and 34 is cluster 2, then both of those are equally likely to be picked, thus staying with the principle.

The Lancet study, however, does not allow for this because of the introduced "rule" about all clusters going to one of the provinces.

In your example, you were using 1 cluster of size 2, with sample units 1, 2, 3, 4. That means that 12 and 34 were your clusters, that's what you seemed to indicate. With only one cluster being chosen, then it will either be 12, or 34. Both equally likely to be chosen, which keeps with the principle because clusters are the primary sampling unit.

Now with the Lancet, they would have 3 clusters, and let's say that A had one, B had two (clusters to choose from).

Then, if you strike the "rule" they made, the possibilities would have been AAA, AAB, ABB, BBB, all equally likely considering PPS.

With the Lancet, however, only AAA or BBB was possible. Thus violating the principle, since not all samples of n clusters are equally likely to be chosen.
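For concreteness, here is a sketch of that comparison under the simplifying assumption that A and B have equal population shares, so each PPS draw is a fair coin (the study's actual shares were unequal):

```python
from collections import Counter
from itertools import product

# Unpaired: each of the 3 clusters is placed independently by a PPS draw.
unpaired = Counter("".join(sorted(draw)) for draw in product("AB", repeat=3))
print({outcome: count / 8 for outcome, count in unpaired.items()})
# {'AAA': 0.125, 'AAB': 0.375, 'ABB': 0.375, 'BBB': 0.125}

# Paired, all-or-nothing: a single draw decides where all 3 clusters go.
paired = {"AAA": 0.5, "BBB": 0.5}
print(paired)

# Mixed outcomes like AAB have probability 0 under pairing (Seixon's point),
# while the expected number of clusters in A is 1.5 under both schemes.
```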

Brilliant try by changing it to one cluster by the way, you almost fooled me. You are once again trying to conflate households with clusters when the primary sampling unit in cluster sampling is clusters, and not households.

In summary:

The point of creating a sample is to have it be representative of the population, and to choose the units in the sample with SRS at some stage in the sampling process.

SESS, stratified sampling, cluster sampling, SRS sampling, all of these accomplish this.

The Lancet methodology does not, as it will always produce an unrepresentative sample, because it violates the principles of a random drawing AND being representative.

Stratified sampling distributes units by area or population beforehand (representative) and then picks the units with SRS.

Cluster sampling distributes clusters by area or population (representative), and then picks units with SRS.

SESS uses clusters and distributes them representatively, and then picks units with SRS.

The Lancet methodology does not distribute the clusters representatively or randomly; it only uses SRS in the end.

The first stage has to be done representatively, or randomly. The last stage always has to be done with SRS.

The Lancet methodology does only the latter, while all other sampling methods cited do both.

Game. Set. Match.

If n = 4 is too small, try n = 6. Then try n = 8. Continue until the penny drops.

By Kevin Donoghue (not verified) on 04 Nov 2005 #permalink

It doesn't matter Kevin, you guys are conflating households and clusters to escape the fact that I am right. See my summary below.

I saw it. You're dodging the point.

By Kevin Donoghue (not verified) on 04 Nov 2005 #permalink

The point I made in comment no. 18, which Tim Lambert has also explained: you cannot say that a sample generated without pairing is random while a sample generated with pairing is not random, without being logically inconsistent and/or pulling some ad hoc principle out of your ass.

By Kevin Donoghue (not verified) on 04 Nov 2005 #permalink

Kevin,

Apparently I'm the only one who understands sampling. In cluster sampling, the clusters are the primary sampling unit, not households.

You either have to distribute the primary sampling units representatively (SESS, stratified, cluster sampling) or randomly (SRS).

The Lancet method does neither. That's not an ad hoc nothing, and you know it.

Yes, clusters are PSUs, so what? What's wrong with the Lancet's method of creating them? If all you've got is your Sir Humphrey argument (we must follow precedent, Minister) that's not very persuasive. But show how the estimators are affected and you'll certainly get me interested.

By Kevin Donoghue (not verified) on 04 Nov 2005 #permalink

I think I already explained it, Sir Humphrey.

Either you have to distribute the clusters randomly (as with straight up cluster sampling) or you have to distribute them representatively.

The Lancet method does neither.

I can't show anything about the effects on the estimators, namely because 6 of the provinces were excluded and never sampled. Neither can the study, WHICH IS THE DAMN POINT.

In a study, there will usually be mention of any biases that may have occurred, such as self-selection, lying, etc. The study has no control over these; they are externally created and the study has little or no power to change them.

However, the error that they cited, that the variances were likely to increase, is something that THEY INTRODUCED, not a typical external bias.

Their final confidence interval does not reflect the error they themselves introduced, thus there is no way to estimate the error associated with it, and thus undermines the finding.

You want to claim that the Lancet sample was neither random nor representative. You present no argument. You just keep saying it wasn't random or representative. Sorry, that doesn't cut it. As I explained earlier:

Seixon, until the sampling is done, there are no clusters to select, only potential clusters. So your question, as I understand it, relates to the probability that two distinct potential clusters become actual clusters. If they are both in Muthanna that probability is zero, but not if they are both in Baghdad. I don't see any reason to worry about that, since it's all about households, not clusters or provinces.

But in any case, if you want to lay down a principle which applies specifically to clusters, you need a principle which is not flouted by SESS. Otherwise you are just being inconsistent.

By Kevin Donoghue (not verified) on 04 Nov 2005 #permalink

SESS distributes the clusters representatively, thus it need not do so "randomly". Stratified sampling distributes the sampling units entirely representatively, there is nothing wrong with that.

I already presented why the Lancet sampling is not random, and you guys conflated households to clusters to try and get out of admitting it.

The principle of a random drawing is that any sample of n units has to be equally likely of being chosen as any other sample of n units.

With cluster sampling, these units are clusters, not households. You guys tried the bait and switch, by using households and not clusters to "prove" that it didn't work. As I have said, the PSUs of cluster sampling are clusters, so the principle is applied to clusters, and not households.

As I showed, this does not hold with the Lancet methodology.

A sampling needs to be done representatively AND/OR randomly.

SESS, stratified and some forms of cluster sampling do the former. SRS and some forms of cluster sampling do the latter.

The Lancet method accomplishes neither.

So you define "random" and "representative" in such a way that SESS is representative but not random. Presumably SRS is both random and representative on your definitions. But the Lancet method is neither. Very neat.

I see two problems. Firstly, you don't spell out your definitions. Maybe they look sensible, maybe not. If you want them to be accepted you had better put them on the table. Secondly, you offer no argument as to why it's quite all right to distribute PSUs in a "non-random" fashion (per your definition) but wrong to allow them to be redistributed randomly between paired governorates.

By Kevin Donoghue (not verified) on 04 Nov 2005 #permalink

SESS does not strictly follow the definition of a random drawing, no. However, that is irrelevant since it distributes clusters representatively. Stratified sampling does not distribute sampling units randomly at all, as each stratum is given a certain number of sampling units based on area or population.

I think you know perfectly well what I mean by representative, which is demonstrated quite ably with stratified sampling. If you don't buy that, then you can go ahead and argue against stratified sampling. Good luck on that. I already gave you the definition of a random drawing/sample.

I didn't offer an argument as to why it's alright to distribute them in a non-random fashion? Really? I said that they would have to be distributed representatively. That was the argument, such as stratified sampling. Argue against that if you wish; you're only making this a comedy hour.

They are not redistributed "randomly" between the paired governorates. The way the Lancet study does it violates the definition of a random drawing/sample.

If the Lancet had distributed the clusters in a representative manner, then this would make it valid, such as stratified sampling.

Yet they do NEITHER.

No, of course I don't know what you mean by representative. You spent weeks arguing that the Lancet sampling was biased, so it's quite clear that you use words differently from the rest of us. True to form, you attribute to me the view that there is something wrong with stratified sampling. By now you know quite well that isn't my view. I am merely drawing attention to your inconsistency. You insist on a strong criterion for randomness when you are attacking the Lancet study, but you have no objection to other methods which don't meet that standard.

In refusing to define your terms, you are just ducking the task of resolving the inconsistency in your position.

If the Lancet team had not paired governorates you wouldn't have a problem. Yet in that scenario they would have been using SESS in the first stage and a chance selection (using GPS) in the final stage - a mixed method. And your objection to what they actually did is? They used a mixed method! Devious bastards!

Still think you've got a case?

By Kevin Donoghue (not verified) on 04 Nov 2005 #permalink

Kevin,

There is no inconsistency in my argument. I have tried to make the case to you several times now, but you will not listen.

All sampling methods are "mixed" except for pure SRS. A "mixed" method is not a problem, as long as you follow one principle. This principle is that each step of your sampling process must be either:

1. Representative sampling (PPS)
2. Random sampling;

and the last step must always be a random sampling.

Stratified sampling uses 1, then 2.
SESS uses 1, then 2.
Cluster sampling uses either 1 or 2, and then 2.
The Lancet method used 1, then NEITHER, and then 2.

In other words, each step of your sampling process must be either representative, or random, and the last step must always be random.

All of those sampling methods follow this, yet the Lancet methodology does not.

Why? Because their pairing method is not representative, nor is it random.

Yes, I do think I have a case, because you keep claiming inconsistency when there is none.

Good day, sir.

Seixon, let's re-cap this discussion:

A random sample, by your preferred definition, is one where any sample of n elements has the same probability of being selected. Obviously that requirement is violated by every cluster sample. If I get surveyed, chances are my next-door neighbour does too.

To this you respond that clusters are the primary sampling units in cluster sampling. True, of course, but that doesn't mean you can sensibly apply your definition to them. PSUs are certainly not elements, only households are. Until the sampling is done, there are no clusters to select, only potential clusters. Really, your argument breaks down right there. If you really insist on the strong definition of randomness and apply it logically none of the methods being considered will measure up.

But never mind that. Let's follow your reasoning just to see where it leads. To rescue your definition of randomness, the probability that any two distinct potential clusters become actual clusters must be the same. That doesn't happen with SESS. If they are both in Muthanna the probability is zero, but not if they are both in Baghdad.

So you drop the demand for random samples. You say instead that SESS is representative. You don't define this but in your latest comment you chuck in "PPS", which (to me) means the probability of selection must be proportional to the size of the population. Is that just a slip on your part, or does PPS mean something different to you? Obviously the probability that Missan or Basrah gets the clusters is proportional to population in the Lancet method. If that's what you mean by "representative" then you are effectively conceding the point. You surely mean something else; but what? You haven't said.

By Kevin Donoghue (not verified) on 05 Nov 2005 #permalink

Whoops! That last comment was badly worded, taking us back to old territory. It would be more accurate to say that the probability that a potential cluster in Missan becomes an actual cluster is the same as for a potential cluster in Basrah. That's the heart of the matter.

By Kevin Donoghue (not verified) on 05 Nov 2005 #permalink
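A back-of-envelope check of that point, with made-up household counts (cluster placement within the winning province assumed to be PPS over households): whichever province wins the pair, every household ends up with the same chance of being surveyed.

```python
# Made-up household counts for one pairing.
hh_basra, hh_missan = 260_000, 70_000
clusters, hh_per_cluster = 3, 30

p_basra_wins = hh_basra / (hh_basra + hh_missan)

# P(a given Basra household is sampled)
p_basra_hh = p_basra_wins * clusters * hh_per_cluster / hh_basra
# P(a given Missan household is sampled)
p_missan_hh = (1 - p_basra_wins) * clusters * hh_per_cluster / hh_missan

print(p_basra_hh, p_missan_hh)   # both 90/330000, about 2.7e-4
```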

This is quite comical. I am sort of drunk right now, but I can still take you to town.

The Lancet method will never be representative, because it will always oversample one province and not sample the other. PPS means what it means: each section, slice, stratum, whatever, will get the piece it is due according to PPS. With the Lancet method, that gets thrown way out the window. With the Lancet method, a province gets either 0 clusters or all of them, regardless. That isn't representative.

SESS is representative. The Lancet method is not. Ever.

The principle for random drawing applies to the sampling unit at each stage. In cluster sampling, this is first clusters, and then households. Now you and Lambert have been trying to conflate the two the whole time to avoid being wrong.

When you're dealing with clusters, you have to either distribute them randomly, or representatively. The Lancet method does neither. Representatively, Missan should get 1 cluster, Basra should get 2. That will NEVER happen with the Lancet method.

Nor is it random, since they inserted the "all or nothing" rule, which violates the principle of random sampling as I have already explained.

I know you guys won't ever admit to it, so I will just have to submit this to some professors or something. Lambert has invested too much of his credibility in this, as have you Kevin. You are never going to admit you are wrong. I have admitted I was wrong about many other things, but I know I am right about this. This is what I was fighting for the whole time. This is the elephant I was always looking for. I found it, and now you are determined to obfuscate.

Be an honest man. Put your drink up and join me in the land of truth!

Night!

In vino veritas. I was wondering if you would ever come clean and tell us what you really mean by representative. To my surprise, you have. Evidently this is your "argument":

1st premise: A sample which excludes more than one or two governorates is not representative.

2nd premise: The Lancet sample excludes seven governorates (eight when Fallujah is excluded).

Conclusion: The Lancet sample is not representative.

Really, that's all you've got, isn't it? Maybe when you are in your cups it seems like a great idea to submit that thesis to a statistician. I suggest you have a few drinks with the guy before you present your case.

By Kevin Donoghue (not verified) on 05 Nov 2005 #permalink

Seixon, I encourage you to check out your arguments with some statistics professors.

Others have asked experts about the methodology of the study and they have all declared it sound.

Kevin and Lambert,

Yes, it looks like I will have to ask an "expert" because if you guys can obfuscate this hard on what "representative" means, then I'm guessing I'm not going to get anywhere here. You guys have erected the stonewall, and nothing is getting past that sucker.

Kevin, I already explained representative. You know, like stratified sampling, where each stratum receives a portion of the sampling units based on area or population size. In other words, proportionate sampling. Is that a better word for you? The sample should reflect each governorate proportionately.

This will never happen with the Lancet methodology. Nor does it select the clusters randomly in the paired governorates. Thus, it undermines the principle of sampling.

It also produces an unknown and unquantifiable sampling error, which they even admit in the text of the study. This is an error they produced themselves, not an external error.

"The Lancet method will never be representative, because it will always oversample one province and not sample the other."

OK, well then perhaps we could understand if you would tell us which province will always be oversampled and which one never sampled.

z,

6 of the paired provinces will always be oversampled, and 6 of them will be excluded. The resulting sample will never be proportionate, nor random.

Missan will always have 0 or N. Basra will always have 0 or N. Etc.

Repeat the Lancet methodology a million times, and your sample will never be representative of Iraq.

Except, Seixon, according to your own indicator, the American death toll from violence, by chance the provinces actually included in the 98,000 death toll estimate had the same violence level as the one for Iraq as a whole. Sounds kinda representative to me.

By Donald Johnson (not verified) on 06 Nov 2005 #permalink

Donald,

I've already said that the indicator I used was rough and by no means demonstrative. I think my source now has the death toll broken down by province, so there is now an up-to-date correct tally. Regardless, as I have always maintained, that indicator is not a direct one. Even though it is a rough indicator, the differences between most of the pairings are obvious.

Using that rough indicator to claim that the provinces picked were "representative" of Iraq would be stretching it a bit too far. That just tries to escape the fact that the Lancet study methodology is on the ropes.

Seixon's original principle was that each subset of the same size had to be equally likely. This is only true for simple random sampling, so, without actually admitting that his principle was wrong, he added an exception for systematic sampling (he calls this "representative"). However, cluster sampling is neither representative nor random using Seixon's definitions.

So, is it your position now that cluster sampling is invalid, Seixon? Note that your arguments apply equally to all forms of cluster sampling.

Lambert,

Again you *[OK Seixon, this is absolutely your last warning. If you violate my commenting rules again you will get a 24 hour ban. Tim]* conflate clusters and households so that your argument seems accurate. Clusters are the PSUs in cluster sampling, not households. So each subset of clusters have to be equally likely, not each subset of households.

Exception for systematic sampling? What about stratified sampling? Instead of actually debating on substance, you twist my argument and ask disingenuous rhetorical questions.

I already went through this one time before, and you have just hopped over it and repeated yourself. Zzzz....

OK, so I can't use the d-word, the l-word, the other d-word, none of which are profane. Sweet! So are you just going to censor my comments for mundane words, or are you actually going to respond to the argument? Zzzz...

I did not conflate households with clusters. If your principle was a real principle then it would apply at whatever level of sampling you were doing. What we actually have is Seixon making up his own rules for sampling that, surprise surprise, the Lancet study doesn't satisfy. Of course, you won't find anyone but Seixon who believes in these rules, or any mathematical basis why they should be followed.

Lambert,

It does apply at every level, just not all simultaneously. First you choose clusters. THEN you choose households. You don't do them at the exact same time.

I'm making up my own rules? Really? What I am describing here is what every sampling study does. Oh, well, except the Lancet study. Because THEY made up their own rules. I asked you weeks and weeks ago to find a single other study that did things the way the Lancet study did, and you have come up with nada. That is because they cut corners on the methodology to make it work out for their security constraints.

So basically Lambert disavows a sample being representative, done according to PPS, and meeting the definition of a random sample. Yeah, I don't know anything about sampling, right.

The Lancet methodology will NEVER produce a representative sample, but instead of just admitting it, we will just BS Seixon until he goes away. Not only that, but the other issue I have brought up, that the sampling error they produced, that they admit to themselves, is unquantifiable. They introduced it into their sample because of their methodology.

Just like I have been saying all along, the elephant is glaring you right in the face, but you don't want to admit it.

"6 of the paired provinces will always be oversampled, and 6 of them will be excluded."

Yes, but if it's not the exact same province being always excluded, or oversampled, then it's not a bias. I believe that's the sense in which the original reference which you used meant a province always being oversampled. If you mean that there will in every case be at least one province oversampled or excluded, then pretty much every sampling strategy can guarantee that will be the case, and no sample can ever be considered valid. The question is, given an infinite number of repetitions of the sampling procedure, will one province or another be over or undersampled in total? You say

"Repeat the Lancet methodology a million times, and your sample will never be representative of Iraq."

I disagree; overall, the oversampling and undersampling will balance out, since the probabilities are weighted by the populations of the provinces, and an infinite number of these procedures will give the same probability for each province as an infinite number of "true" clustering where the weary samplers trudge out to the outer provinces an infinite number of times and there is no pairing and second sampling.
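That "balance out" claim is easy to check in simulation. A sketch for a single pair, with made-up round populations (the claim concerns the long-run average, not any single realization):

```python
import numpy as np

rng = np.random.default_rng(2)

# One pairing: Basra vs Missan, 3 clusters at stake (made-up populations).
pop_basra, pop_missan = 2_600_000, 700_000
p_basra = pop_basra / (pop_basra + pop_missan)

trials = 100_000
basra_wins = rng.random(trials) < p_basra  # True -> Basra gets all 3 clusters

print(f"mean clusters, Basra:  {3 * basra_wins.mean():.3f}")
print(f"mean clusters, Missan: {3 * (1 - basra_wins.mean()):.3f}")
print(f"PPS targets:           {3 * p_basra:.3f} and {3 * (1 - p_basra):.3f}")

# The long-run averages match the PPS shares (z's point), yet every single
# realization gives one province 3 clusters and the other 0 (Seixon's point).
```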

It comes as no surprise that the Seixon Rules are being drafted in such a way as to discredit the method used in Iraq. What is interesting is that Seixon can't seem to rig his rules in such a way as to legitimise other methods. Even a stitch-up should have some logic to it. Consider this, for example:

In other words, each step of your sampling process must be either representative, or random, and the last step must always be random.

The problem here is that the last step is never random under Seixon's definition of randomness. It isn't random whether you look at households or clusters. The probability that two neighbouring households will both be sampled is much greater than the probability for two households at a distance from each other. Also, the probability that two disjoint clusters will both be sampled varies from province to province. In Muthanna it is zero, but not in Baghdad.

All this is ignoring the fact that it is obviously wrong to apply to PSUs a principle which expressly refers to elements, which PSUs are not, as well as the fact that, properly speaking, no PSU actually comes into existence until a household is selected using GPS coordinates etc. (So in the previous paragraph I should really refer to potential clusters; Iraq is not divided into disjoint clusters waiting to be selected.) When I pointed this out to Seixon previously he replied:

I might as well say that there were no households in Basra.

Hardly a compelling response. That was two weeks ago. He hasn't attempted to address the point since then.

By Kevin Donoghue (not verified) on 07 Nov 2005 #permalink
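Kevin's claim about neighbours is easy to quantify (a sketch with made-up numbers):

```python
from math import comb

N = 1_000_000   # households in the population (made up)
n = 30          # households sampled

# Simple random sample: P(two specific neighbours are both chosen)
p_srs = comb(N - 2, n - 2) / comb(N, n)   # = n(n-1) / (N(N-1))
print(f"SRS:     {p_srs:.1e}")            # ~8.7e-10, microscopic

# One cluster of 30 contiguous households with a uniformly chosen start:
# two adjacent households are both covered by 29 of the N possible starts
# (ignoring edge effects).
p_cluster = 29 / N
print(f"cluster: {p_cluster:.1e}")        # ~2.9e-05, about 30,000x larger
```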

z,

If you mean that there will in every case be at least one province oversampled or excluded, then pretty much every sampling strategy can guarantee that will be the case, and no sample can ever be considered valid.

Goodie, the Lambert bait-and-switch method. This is simply not correct at all. I have already done graphs on the effects the Lancet methodology has compared to a normal one and shown that what you are saying here is completely incorrect.

Here we are again appealing to the fallacy that "the Lancet sample was just unfortunate". No, it wasn't. There will always be 6 provinces excluded (not counting Muthanna) and always 6 will be oversampled. Always. Any other form of sampling, stratified, cluster, SESS, anything, this would NEVER occur. Never.

So can you please stop pretending that the Lancet methodology is just like everything else when it clearly is not? You are trying to conflate the Lancet methodology "with every other method" when they are not equal in the least.

This is what Lambert and friends have been doing the whole time instead of admitting that the Lancet methodology, by design, ALWAYS produces a farce of a sample.

I disagree; overall, the oversampling and undersampling will balance out, since the probabilities are weighted by the populations of the provinces, and an infinite number of these procedures will give the same probability for each province as an infinite number of "true" clustering where the weary samplers trudge out to the outer provinces an infinite number of times and there is no pairing and second sampling.

I already conducted an experiment where I did this about 500 times, and this is simply not true. Also, you talk about "averaging out". I'm sorry, but when you do a survey, you don't conduct the sample a thousand times and then take the average. You take ONE sample.

Each sample you get with the Lancet methodology, on its own, will never be representative of Iraq. Never.

The notion that the sample would produce OK results after taking it a thousand times doesn't help because you are just doing it ONCE. Every time you do the Lancet sample, it will turn out unrepresentative. Every single time.

This is not so with all the other methodologies.

Kevin,

The obfuscation continues.

The probability that two neighbouring households will both be sampled is much greater than the probability for two households at a distance from each other. Also, the probability that two disjoint clusters will both be sampled varies from province to province. In Muthanna it is zero, but not in Baghdad.

Now you aren't even talking about the same thing. Now you are talking about the probability of two certain households being sampled. Why? Each neighborhood in each province has the same PPS chance of being sampled. Again, this is conducted randomly.

All this is ignoring the fact that it is obviously wrong to apply to PSUs a principle which expressly refers to elements, which PSUs are not, as well as the fact that, properly speaking, no PSU actually comes into existence until a household is selected using GPS coordinates etc.

Where does the principle expressly refer to elements?

True that clusters are not designated before you choose, but what is your point?

The Lancet methodology of giving all the clusters of two provinces to one is not proportional, nor is it random.

I think I see what you mean, that when you choose a neighborhood, that you choose all the elements in that neighborhood. Yet this is alleviated by the fact that each neighborhood has the same PPS chance of being chosen.

It is obvious that cluster sampling steps on the toes of a "random sample" in many respects, but you guys are simply not applying to the provinces the same standard that is being applied to the neighborhoods.

Within each neighborhood, each sample of n households is equally likely since you randomly choose where to start in the neighborhood.

You guys obviously can't see for the life of you that the principle you talk about in the neighborhood sampling is not applied to the provinces.

In other words, let's say that Basrah gets 2 clusters. So now 2 neighborhoods are chosen randomly in Basrah. The two clusters aren't both given to the same neighborhood.

Yet, with the Lancet methodology, looking at provinces, both of those clusters would be given to the same province.

In other words, you don't see that the principle you think proves me wrong is in fact not the same as the one you are scurrilously trying to protect.

The obfuscation continues.

It surely does. You're a hell of an obfuscator, I'll give you that.

Now you are talking about the probability of two certain households being sampled.

Am I, indeed? I stated quite clearly that the last step isn't random (on your definition) whether you look at households or clusters. The probability that two disjoint clusters will both be sampled varies from province to province. That violates your principle (with n=2), as applied to clusters.

Where does the principle expressly refer to elements?

Seixon, comment number 17 of this thread:

Also, as I have been trying to solicit an answer for, a random sample, by definition, is one where any sample of n elements has the same probability of being selected. Now how can this be with the Lancet's methodology....

Inspect that quotation closely. Do you see the word "elements" in there?

I think I see what you mean, that when you choose a neighborhood, that you choose all the elements in that neighborhood. Yet this is alleviated by the fact that each neighbourhood has the same PPS chance of being chosen.

Exactly. That is true whether provinces are paired or not, if PPS means probability proportional to size, which is what most of us understand by the term. If you have your very own meaning for the term, that won't surprise me in the slightest.

You guys obviously can't see for the life of you that the principle you talk about in the neighborhood sampling is not applied to the provinces.

To the provinces? Why on earth should it be? Are you now claiming that provinces are sampling units? My suspicion is that you have been thinking precisely that all along, but you've never actually risked saying it, fearing (quite rightly) that it would look ridiculous.

By Kevin Donoghue (not verified) on 08 Nov 2005 #permalink

Kevin,

The probability that two disjoint clusters (I thought they didn't exist until chosen, hmm) will get chosen is equal to any other according to PPS.

That I said "elements" in my definition is beside the point; it can be replaced with "units" or any other synonym. What a ridiculous semantics argument.

Exactly. That is true whether provinces are paired or not, if PPS means probability proportional to size, which is what most of us understand by the term.

Once again, obfuscation. Yes, it will be true whether paired or not given that you follow the definition of a random drawing by PPS.

Unfortunately, that's not what the Lancet methodology does.

Let me ask you one question: is distributing 3 clusters between 2 provinces with a single drawing true to PPS?

To the provinces? Why on earth should it be? Are you now claiming that provinces are sampling units?

Your responses seem almost scripted. Next you're going to ask me if I am claiming that Santa Claus is a sampling unit. I think it's quite telling that you and a certain someone keep dropping strawmen to muddle the discussion.

No, provinces are not sampling units. You should apply the same statistical principles when distributing clusters between provinces as you would when distributing clusters within provinces. Should you not?

And I guess no one is denying the fact that they introduced an unquantifiable sampling error to the study now. Good.

I thought [clusters] didn't exist until chosen, hmm.

Of course they don't exist until chosen. I have explained several times that I am only going along with your nonsense about clusters being "elements" in order to show that it leads to the rejection of methods which you say you accept. If by now you don't understand this, you are probably quite determined not to. Your comments are becoming even sillier, which didn't seem possible.

And I guess no one is denying the fact that they introduced an unquantifiable sampling error to the study now.

Utter rubbish. Are you rejecting all statistical inference now? If you have an argument, present it. And please don't give yourself airs. The fact that people don't respond to assertions which are not supported by any reasoning whatever doesn't constitute acceptance. When I abandon the statistical theory I have swotted hard to understand in favour of the new-fangled Seixonian mysticism, rest assured I will let you know. In the meantime please note that the points I have actually responded to are the best of the bad lot you are proffering. You can take it that anything I ignore is even less worthy of comment.

By Kevin Donoghue (not verified) on 08 Nov 2005 #permalink

Kevin,

I have explained several times that I am only going along with your nonsense about clusters being "elements" in order to show that it leads to the rejection of methods which you say you accept.

Good thing you didn't respond to the rest of my previous comment! You might have to try to put up an argument! Why didn't you answer the one question I posed? Because the answer was an obvious "Non"?

Figures.

So Kevin, you are pretending that the sampling error that they introduced doesn't exist? You know, the one that they admit to by saying:

This clumping of clusters was likely to increase the sum of the variance between mortality estimates of clusters and thus reduce the precision of the national mortality estimate.

In other words: we introduced a sampling error by clumping clusters, and this sampling error was likely to increase the sum of the variance, but we have no idea how much because it is impossible to calculate it because we side-stepped statistics to allegedly save our asses by not traveling as much.

Are you denying that this exists?

I guess I will also have to ask again: does distributing 3 clusters between 2 provinces with a single drawing stay true to the principle of PPS?

"In other words: we introduced a sampling error by clumping clusters, and this sampling error was likely to increase the sum of the variance, but we have no idea how much because it is impossible to calculate it because we side-stepped statistics to allegedly save our asses by not traveling as much.
Are you denying that this exists?"

No, I think we all agreed on that long ago. I did in my first post on the subject. Back when you were still tossing the word BIAS around. I suggested at one point that the additional variance from this extra random procedure could be calculated for an estimate of how much additional error it adds, but nobody listens to Zathras. Anyway, I still don't see how this leads to the conclusion that no estimate is better than a wide one. Is there some threshold involved? Even the .05 alpha isn't a law of nature or God, just a commonly accepted standard. I suppose we might be better off not knowing, and if so then not making an estimate would follow, logically.
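
The calculation z suggests can at least be sketched by simulation. The following Python snippet uses invented death rates (the very quantity the survey was estimating, so this is illustration only) to compare the spread of a per-pair estimate with and without the pairing step:

```python
import random
import statistics

# One hypothetical pair: (population share within the pair, true rate).
# Both numbers are invented for illustration.
provinces = [(0.66, 0.010), (0.34, 0.013)]

def pair_estimate(paired: bool) -> float:
    """Average rate across 3 clusters, ignoring within-cluster noise."""
    if paired:
        # Pairing: one draw sends all 3 clusters to a single province.
        _, rate = provinces[0] if random.random() < 0.66 else provinces[1]
        rates = [rate] * 3
    else:
        # Direct PPS: each of the 3 clusters is drawn independently.
        rates = [provinces[0][1] if random.random() < 0.66 else provinces[1][1]
                 for _ in range(3)]
    return sum(rates) / len(rates)

for paired in (False, True):
    estimates = [pair_estimate(paired) for _ in range(100_000)]
    print("paired" if paired else "direct",
          "mean:", round(statistics.mean(estimates), 5),
          "stdev:", round(statistics.stdev(estimates), 5))
```

Both procedures give the same mean (no bias), but the paired version shows a larger spread: that extra spread is the additional variance the study's authors acknowledge, and a simulation along these lines gives at least a rough handle on its size.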

"I guess I will also have to ask again: does distributing 3 clusters between 2 provinces with a single drawing stay true to the principle of PPS?"

Yeah, just as distributing 1 cluster between 2 provinces with a single drawing does.

"And I guess no one is denying the fact that they introduced an unquantifiable sampling error to the study now. Good."

So.... what are we arguing about?

z,

No, I think we all agreed on that long ago. I did in my first post on the subject. Back when you were still tossing the word BIAS around.

Sampling error and sampling bias are basically the same thing, although I think bias tends to mean that the result is skewed in a specific manner, whereas in this case, the error is unknown.

You didn't catch Kevin throwing a tantrum just below? Some also suggested that bootstrapping fixed this problem. Which was also not correct.

Anyway, I still don't see how this leads to the conclusion that no estimate is better than a wide one.

That has never been my stance. My stance is that it was fine for them to do the study and publish the result, but for people to be using it as a fact is just absurd. The incessant whining about it being "buried" is also evidence that those who subscribe to it as some sort of fact are only doing so to further their own agenda. It isn't a fact that George Bush has an approval rating of 37%. Yet with a gigantic confidence interval in comparison, the Lancet study is given some sort of special place as a fact. That is just laughable.

Basically it was just a hapless study that produced an almost meaningless result. There is no confidence in the result they came up with. The manner and timing in which they released the results are also highly suspect.

IMHO, they should have waited for a better time to conduct a better survey, with far more resources, instead of rushing an anemic one to publication before the presidential election. The impression given is that one wasn't seeking a precise and accurate finding; one was just seeking any finding.

Yeah, just as distributing 1 cluster between 2 provinces with a single drawing does.

So a 3:1 clusters to drawing ratio is the same as 1:1? Please, do elaborate.

Seixon, the reason why I didn't answer your question is because, however much you piss me off at times, there is one thing I will say for you: unlike many Lancet-bashers, you have actually read the study, which says:

Because the probability that clusters would be assigned to any given Governorate was proportional to the population size in both phases of the assignment, the sample remained a random national sample [my emphasis].

I have already explained that where I come from, PPS means what any reasonable person reading that sentence would guess it to mean. So you knew the answer. Why did you ask it in the first place and why do you now repeat it? Do you get a kick out of wasting time?

There is a world of difference between saying that the precision of an estimate is reduced and saying "we side-stepped statistics to allegedly save our asses by not traveling as much." You know that too, of course. But you like posting nonsense. Still, you can hardly be accused of trolling since this is a thread devoted to Seixonian Statistics, which is pretty much nonsense from start to finish.

But even by your standards this is a gem:

Sampling error and sampling bias are basically the same thing....

By Kevin Donoghue (not verified) on 08 Nov 2005 #permalink

Kevin,

Thanks for telling me what PPS is. So then you will get to explain to me how that text you just cited is carried out by giving all 3 clusters in a pair with the combined population of 2.015 million to the province that has 1.33 million. Or the one that has 0.685 million.

Is that "proportional to the population size"?

The populations of the two Governorates were added together, and a random number between 0 and the combined population was drawn. If the number chosen was between 0 and the population of the first Governorate, all clusters previously assigned to both went to the first. Likewise, if the random number was higher than the first Governorate population estimate, the clusters for both were assigned to the second.

Seen that before, haven't you?

By Kevin Donoghue (not verified) on 08 Nov 2005 #permalink

Kevin,

I asked you a question. Answer it. I have 3 clusters. I have two provinces to give them to. I want to give the clusters away "proportional to the population size". Province A has 1.33 million people. Province B has 0.685 million people. Would I:

A. Give all 3 to A
B. Give all 3 to B
C. Give 2 to A and 1 to B
D. Give 2 to B and 1 to A

Please answer the question.

I want to give the clusters away "proportional to the population size".

Maybe you do, but the Lancet team wanted to allocate them with probability proportional to the population size.

That's what they set out to do, and yes, that's what they did. I think I have answered your question several times over the last few months. You know what they did, I'm telling you it was a perfectly proper thing to do in the circumstances, so what's your problem? Have you a bet on with some guy in the cyber-café to see how many times you can get me to answer the same trivial question?
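
The distinction Kevin is drawing can be made concrete with a small Python sketch, using the populations from Seixon's question. Allocating clusters proportional to population would split them by share; allocating with probability proportional to size gives them all to one province, recovering the shares only in expectation:

```python
import random

pop_a, pop_b, clusters = 1.33, 0.685, 3
share_a = pop_a / (pop_a + pop_b)

# Proportional allocation (the reading behind Seixon's question):
# split the 3 clusters by population share.
det_a = round(clusters * share_a)   # 2 clusters to province A
det_b = clusters - det_a            # 1 cluster to province B

# Probability proportional to size (the procedure the study describes):
# one draw sends all 3 clusters to a single province.
draws = 100_000
wins_a = sum(random.random() < share_a for _ in range(draws))
print("proportional split:", det_a, det_b)
print("PPS expected clusters for A:", clusters * wins_a / draws)  # ~1.98
```

The two schemes agree on average but differ in every single realization, which is precisely where the two sides of this argument keep talking past each other.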

By Kevin Donoghue (not verified) on 08 Nov 2005 #permalink

Kevin,

You know what they did, I'm telling you it was a perfectly proper thing to do in the circumstances, so what's your problem?

Proper to do, in the circumstances, yes. Otherwise? No.

You couldn't find any other study that did this, and it violates the principle of random sampling. It produces an unrepresentative and nonproportional sample every single time. Right?

It produces a sampling error that cannot be computed. Right?

The confidence intervals given in the study do not reflect this sampling error? Right?

Thanks for playing.

Your questions answered:

1) Wrong, unless you have in mind some trivial notion, like: there's a greater percentage of bald guys in the sample than there is in the population as a whole. Every sample is "unrepresentative" in some sense. The relevant question is whether we have any reason to believe that the mortality experience in the sample was significantly different from that in Iraq as a whole. I don't know of any good reason to suppose that it was.

2) There are various ways of estimating sampling error. AFAIK the one they used was state-of-the-art. The problem doesn't strike me as being any greater in this case than in many others. Statistical theory does not depend on a sample being random in the very strictest sense; if it did no survey using anything other than SRS could be carried out. The Lancet sample is random in the same sense that any cluster sample is. Your "argument" to the contrary doesn't hold water. We've spent enough time on that.

3) As you know, Tim Lambert has already said he is open to argument on this question. I don't think he is expecting anything convincing from you, nor am I, but surprise us if you can. Ragout thinks there could be a problem but he didn't find your argument persuasive. I have given my own views, for what they are worth, in another thread. If anyone else has volunteered an opinion I haven't seen it.

By Kevin Donoghue (not verified) on 08 Nov 2005 #permalink

"Sampling error and sampling bias are basically the same thing, although I think bias tends to mean that the result is skewed in a specific manner, whereas in this case, the error is unknown."

Grk. Snort. Brzk. Flzrb. Snrg. I'm having trouble coming up with a reply... is it possible that this means something different in another language?

Kevin,

1) What a crock of... When you give a province 3 or 0 clusters that would by any normal circumstance have 2, that is neither representative nor proportional. When you give a province 3 or 0 clusters that would by any normal circumstance have 1, that is neither representative nor proportional. It would be like creating a methodology that ensured that 1/3 of the states in the USA were oversampled by a factor of 1.5-3, and another 1/3 of the states were excluded entirely. I don't see how in the world you manage to fool yourself into thinking this would ever yield a representative or proportional sample of the USA.

2) They could not compute the sampling error they produced. It is impossible. You suggested the bootstrap method, but that doesn't do it. Listen to this guy, "state-of-the-art". Sure, they used that to figure out their numbers, but unfortunately, they could not figure out the error they produced. That's why they said, "it was likely" meaning "we have no idea what it was, but we are certain it happened".
The Lancet survey is not random in the sense any other cluster sample is. No other cluster sample clumps clusters as they did. I have hounded you for over a month to find a single other example of this methodology; you found none and pretended that they had invented a new way of doing samples. I guess Michael Brown of FEMA invented a new way of doing hurricane relief as well....

3) Is he? He sure doesn't say much about it. There really is nothing to talk about, because the sampling error was introduced by the methodology, and it would be impossible to calculate or alleviate the error since those 6 provinces were excluded from the sample. There is nothing more to "persuade" you to think about. I'm not sure what else you are looking for. It's like you're asking me for proof that the dead cat in front of you is dead. It's dead! Look at it! There's really not much more to say than that.

z,

So sample bias is not an introduced error in the sample? Zzzzz....

z asks: is it possible that this means something different in another language?

Who knows? But I reckon it may be Seixon's best yet.

By Kevin Donoghue (not verified) on 08 Nov 2005 #permalink

We have now come full circle. Tim Lambert wrote (22 Sept): "Seixon could just as well argue that all surveys are biased because after the sample has been randomly selected each person outside the sample has a 0% chance of being selected."

Indeed. Read the whole thing; all three threads on the Seixon Critique. But only if you have a strong stomach. If not, that's a pretty good summary.

By Kevin Donoghue (not verified) on 08 Nov 2005 #permalink

Kevin,

That has absolutely nothing to do with what I just said, or anything I have ever said. Instead of actually making an argument, you just resurface old and discredited strawmen.

Anyone can read point #1 in my previous response to you and see that your quotation of Lambert's BS just now is completely delusional.

Kevin, do you think a sample of the USA would be representative by purposely, not by chance, but purposely excluding 1/3 of the states?

You have just proven that you guys are not interested in a spirited debate, but shoveling more BS on your own partisan flame.

Utterly disgraceful.

So apparently a sample bias is not an error introduced into the sample. This is like some kind of parallel universe. A flawed study is "robust". Sampling with unsupported methodology is akin to "innovation". Regular statistics software is "state-of-the-art". A completely unrepresentative sample isn't unrepresentative because "I don't know of any good reason to suppose that it was". Proportional to population size means not proportional to population size. A cluster sample that doesn't resemble any other documented cluster sample "is random in the same sense that any cluster sample is".

Amazing. No wonder the Lancet study is bulletproof - it has so many holes in it, all the bullets pass right through it.

"Kevin, do you think a sample of the USA would be representative by purposely, not by chance, but purposely excluding 1/3 of the states"

No; that's why they did it by chance.

You seem not to understand that samples are not required, nor expected, to be 100% proportionately representative of every subgroup of characteristics of the population. That is why you have sampling error measurements. If the sample were 100% representative of the population, there would be 0 sampling error. That would be because you had examined every member of the population, a 100% sample. Anything else, and you would be excluding some part of the population.

"So sample bias is not an introduced error in the sample? "
No, it's a bias. Error is random, bias is systematic.

Meanwhile, back at the metaphor... like most folks in the US, I have a vague idea that Perth is somewhere in Australia. Very large confidence interval. Your position would seem to be
1) this is not worth having, I would be just as well or better informed if I did not know what country Perth was in
and
2) the nature of my vagueness about the precise location of Perth is similar to having a belief that Perth is in New Zealand.

I reject both these propositions.

Lambert,

Well geez, Kevin said you were open to that being true, and z has said he agrees, as have others. So that's a gem of mine, because it is true? Zzzz...

z,

Again, you are not being very honest here. There's a difference between not being 100% representative, and purposely designing your sample to exclude 1/3 of the states in the USA. No sample will be 100% representative, but a sample that is purposely designed not to be representative is a whole different story.

They did not "do it by chance". Look, in every single type of sampling, 1/3 of the states in the USA would never be excluded purposely. Not with stratified sampling, not with SESS, not with cluster sampling, not with any type of sampling.

You guys continue to pretend that "whoops, gee we got a bad sample completely by accident" and it is just incredibly ridiculous.

When doing a study and taking a sample, the entire idea is to be as representative as possible of the population you are sampling. Purposely making sure that a large portion of that population has no chance of being in the sample doesn't go along with that principle. There will always be 1/3 of the states that aren't in the sample. Always. Every single time.

Zogby, Gallup, or any other polling firm would get laughed out of business if they did things in the manner you people seem to indicate is completely OK.

The fact is that it isn't OK; you are just too weak to admit it. Lambert has his credibility tied up in knots with this study, as he has defended it from day one in over 60 posts, so I don't exactly expect him to admit anything. The rest of you, though, I cannot fathom why you would leech onto his side against truth. I've only done 3 posts on the Lancet, and I have admitted that some of the things I said in them were in error. Lambert hasn't admitted a single error in all of the stuff I have proven was wrong or misleading.

But that's fine. It actually makes me laugh that persons with any respect for themselves would actually defend a sampling technique that purposely seeks to be unrepresentative of the population being sampled.

Here we have a nice example of how Seixon, stout campaigner for truth, furthers his cause.

Seixon asked:

The confidence intervals given in the study do not reflect this sampling error? Right?

My reply began:

As you know, Tim Lambert has already said he is open to argument on this question.

I was referring to this comment in an earlier thread:

I'm open to persuasion on the question of whether they accounted for the pairing when they calculated the CIs. It certainly seems unlikely to make much difference if they treated it as a one stage design when boot-strapping.

This apparently entitles Seixon to say:

I have already silently gotten consensus at Lambert's blog that the Lancet study introduced a sampling error into their study that cannot be computed.

You have the right to remain silent. But if you do choose to make a statement, anything you say may be taken down, grossly distorted, and used in evidence against you.

By Kevin Donoghue (not verified) on 08 Nov 2005 #permalink

Kevin,

I said Lambert's blog, not just Lambert himself. Obviously Lambert's comment was like many of his others: he says he is open to persuasion when he is really not. I have already shown that the sampling error could not be computed, and that bootstrapping would not help with this. So what is left to be persuaded by? Lambert hasn't said that I have been wrong on these two points. In fact, he has been incredibly silent.

I know why he is, because he doesn't want to admit I am right, or figure out some way of showing that I am wrong.

Now, aside from Lambert, z and some others, including you, have even admitted that the sampling error exists, and some are even brave enough to admit that the Roberts team are the ones who introduced this error to the sample.

You shouldn't be throwing concussion grenades in that glass house of yours, as your highly dishonest half-quotation of me on the other blog shows.

No, Seixon, you wrote "Kevin said you were open for that being true" where "that" refers to a statement which Tim Lambert never made and which I never attributed to him. As for the "half quotation", I posted the funny part; the rest didn't add anything except to make it even more clear that you don't know what "sampling error" means. But the full quotation has now been posted chez Lenin.

I see that you are again interpreting silence on various points to mean that people suspect you are right. By this time it is far more likely that they consider you a hopeless case.

By Kevin Donoghue (not verified) on 09 Nov 2005 #permalink

Kevin,

So is z also wrong then? This is what he wrote:

Error is random, bias is systematic.

This is what I wrote:

Sampling error and sampling bias are basically the same thing, although I think bias tends to mean that the result is skewed in a specific manner, whereas in this case, the error is unknown

The exact same sentiment as z. So is z also wrong? Or are you just willfully not wanting to comprehend what I am writing?

Kevin, you didn't answer anything I wrote in the previous comment and are just trolling now.

Obviously a methodology that ensures an unrepresentative sample is not "sound methodology" but instead of just admitting it, you (and z) go into "well no sample is 100% representative, bla bla" when I have never even made any such qualification.

Strawmen upon strawmen, trolling, misquotations. This is really starting to get out of hand Kevin.

Seixon asks: So is z also wrong then?

Well, z has already commented on what you wrote (no. 87):

Grk. Snort. Brzk. Flzrb. Snrg. I'm having trouble coming up with a reply... is it possible that this means something different in another language?

I certainly wouldn't describe that response as wrong. Does it strike you as a ringing endorsement of your position? If it was intended as such, z can say so.

This is really starting to get out of hand Kevin.

Now on that we can agree. In the interests of restoring some semblance of order to the discussion, here is a statement of the true state of affairs as I see it:

Any random sample, by its very nature, excludes the vast majority of the population. Nonetheless, statistical theory and practical experience have shown that it is possible to use a random sample to obtain an unbiased estimate of a population parameter. It is also possible to compute a valid measure of the likelihood that the true value of the parameter differs greatly from the estimate. Although elementary textbooks focus on the use of simple random samples, in the real world modified versions of SRS are often used and give good results. They are also called random samples because they are, well, random. The Lancet team used one such method, with a modification which you find objectionable and which I don't. The fact that a large segment of the population got left out of the sample is not a valid criticism. But it seems to be all you've got and you are evidently determined to bang on about it forever, saying it's "unrepresentative" just because people who had as good a chance of being included as anyone else were randomly excluded.

By Kevin Donoghue (not verified) on 09 Nov 2005 #permalink

Seixon, it's a gem because it is off-the-planet class delusional. You are wrong on those two points because you have no clue how random sampling works.

Please note that if I don't bother to comment on something you write it doesn't mean I agree with you. More likely it's because what you wrote is so obviously wrong that I don't think any refutation is required.

"There will always be 1/3 states that aren't in the sample. Always. Every single time"

Not the same states every time, but what the heck. Similar to sampling 10,000 human beings to get an average weight. How valid can that be, when there will always be 6 billion humans who aren't in the sample. Every single time. On purpose!

"Now, aside from Lambert, z, and some others, including you, have even admitted that the sampling error exists and some are even brave enough to admit that the Roberts team are the one who introduced this error to the sample."

Of course, they do state this in the paper. Giving credit where credit is due, I had not really given this any consideration until you brought it to my attention, and that's probably true of most here. On the other hand, you were suggesting it was bias; I believe my initial post suggested that it was not bias, but it did increase the error.

While we're on the subject, just to clear the air of mysteries, you are correct that the initial Lancet press release regarding the study was incorrect. On the other hand, you attribute this to malice aforethought; I'm more inclined to hypothesize human error on the part of dumbass deskjockey drones without a rigorous background manning the PR desk. Of course, that's because as a pathetic liberal wimp I am required by law to avoid assuming evil motives where stupidity would suffice, whereas as an upstanding conservative you would only be required to do so in cases of invading another country on the basis of false information regarding WMD.

"Error is random, bias is systematic.

This is what I wrote:

Sampling error and sampling bias are basically the same thing, although I think bias tends to mean that the result is skewed in a specific manner, whereas in this case, the error is unknown

The exact same sentiment as z."

Grrk. Snzppf. Sptng.

Various obfuscators,

A sample bias is an error in the sample, a systematic error. You guys know this, that is what I was saying, and you keep pretending that I don't know what I'm talking about since you have to always take what I say out of context. Zzzz....

Not the same states every time, but what the heck. Similar to sampling 10,000 human beings to get an average weight. How valid can that be, when there will always be 6 billion humans who aren't in the sample. Every single time. On purpose!

Once again, you are in full denial mode. I'm talking about excluding entire segments of the population, not x amount of people from all over. There's a difference between sampling 20 people from Basra and 10 people from Missan as opposed to 30 people from Basra and 0 from Missan.

In other words, going along with your example of doing a sample to find the average weight in the world, the Lancet would have purposefully excluded 1/3 of the countries in the world. Now tell me, if one of those happens to be China or Japan, what do you think happens to the sample and the result? Will that affect the result of the study or not?

Zzzzzzz....

I am not a conservative, jesus friggin christ. I am probably more liberal than you are. The "erroneous" headline on the Lancet website was there for over a week. You'd think that if it was a mistake, it would have eventually been fixed. You know, the editor of the Lancet should have taken a look at some point in time. You'd think that Roberts himself actually took a look at the damn thing. But nope, the "error" stayed up there.

In other words, the publication that supposedly peer-reviewed the damn thing can't even get a headline about the study correct. Excuse me while I ponder whether that is due to incompetence or malice.....

I may have made a mistake in talking about "bias" even though I think most people here take that to mean fraud even though it doesn't have to mean that at all.

In the example I gave above, the sample would be biased because it would negate the weights of a large segment of Asian people who have a lower average weight compared to much of the rest of the world. Thus the sample would be biased towards non-Asians and thus towards a higher average weight.

Right?

That is an effect created by the Lancet methodology, although the bias will not be the same every time. Depending on the outcome between all 6 pairs, the biases may cancel each other out, or there may be no difference between the pairs, in which case no bias would exist.

The problem is, we just don't know because those provinces were not sampled, and that is poor methodology.

I find it hilarious that Kevin and Lambert (especially) would defend a poll of the United States population that would systematically exclude 1/3 of the STATES.

z, you like to pretend that the exclusion of x amount of people from all over is the same as the exclusion of entire population segments. It's not, and you know it's not.

Clumping of clusters is also not a valid statistical methodology. I have asked you to find ONE SINGLE EXAMPLE of it other than the Lancet study, and I have received nothing.

I find it hilarious that you can sit and defend a methodology that changes the cluster size three times during the process of sampling. From 30 households, to 90 households, and then back to 30 households.

Lambert said that clustering of clusters was multistage clustering - it is not. Multistage clustering is distributing clusters (consistent size throughout the process) in different stages.

The Lancet methodology would have constituted multistage clustering if they didn't clump clusters in the pairing process.

The error was increased by this methodology in a way that cannot be calculated, and the confidence intervals do not reflect it either.

Usually when a methodology purposefully introduces an error, that is frowned upon.

But not here! Anything goes! Cut 1/3 of the states? SURE! We don't need California in a sample of the USA! Or Texas. Or the entire midwest. Nope, excluding them will certainly not affect the sample. Nothing will! Yeay!

Breaking statistics principles? SURE! It's "innovation"! Breaking windows is a new way of cleaning them!

And you say I am off-the-planet??

You've got to be kidding me.

I should have a chat with John Zogby, he would be laughing his ass off hearing what you guys are saying about sampling.

"That is an effect created by the Lancet methodology, although the bias will not be the same every time. "

Then it's not bias. It's error. That is why the confidence limits are very large.

"I may have made a mistake in talking about "bias" even though I think most people here take that to mean fraud even though it doesn't have to mean that at all."

I think most people here take it to mean systematic nonrandom error such that the mean of the distribution of estimates in multiple samples is significantly different than the population mean. Given an alpha, say the usual .05, the confidence interval of a biased sample will not contain the true mean more than alpha of the time; i.e. >5%, with it falling on one side more often. With that in mind, it's easy to determine whether a sampling procedure is biased if we compare a large number of samples to the true population mean. Aside from being self-evident, the math is pretty well defined. If the true mean falls outside the confidence interval >5% of the time but equally on both sides, then it is not biased; it just has a higher error rate than a Gaussian distribution. Repeat: Not Biased. And none of those multiple samples, which gave individual estimates of the true mean which were not exactly correct, was biased either.
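
The test z describes here can be written down directly. A minimal sketch in Python with synthetic data: draw many samples, build a 95% interval for each, and count on which side the true mean gets missed.

```python
import random
import statistics

random.seed(1)
population = [random.gauss(50, 10) for _ in range(100_000)]
true_mean = statistics.mean(population)

n, trials, z95 = 100, 2_000, 1.96
low = high = 0
for _ in range(trials):
    sample = random.sample(population, n)
    m = statistics.mean(sample)
    half = z95 * statistics.stdev(sample) / n ** 0.5
    if m - half > true_mean:
        low += 1    # interval sits entirely above the true mean
    elif m + half < true_mean:
        high += 1   # interval sits entirely below the true mean

# An unbiased procedure misses ~5% of the time, split evenly between sides.
print("missed above:", low / trials, "missed below:", high / trials)
```

A systematic excess of misses on one side would indicate bias; roughly symmetric misses are just sampling error.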

It's an old metaphor, but maybe needs trotting out again: a gun which shoots in a foot-wide circle centered around the bullseye is accurate, but not precise. A gun which shoots a tight little circle a foot away from the bullseye is precise, but not accurate. In neither case would you expect your first shot to hit the bullseye; but in either case, it would still be your best indicator of where the bullseye was, if you didn't already know. From the results of just this one shot, however, you cannot say whether the gun is imprecise, inaccurate, or both, even if you knew where the bullseye is in relation to the bullet hole. You can inspect the mechanism of the gun, however, and determine what features may be causing a reduction in precision and/or accuracy.

You are telling us that this survey is not accurate, based on the fact that you have found a problem with its precision. Does not follow logically, they are distinct concepts.
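
The gun metaphor maps directly onto bias (accuracy) and variance (precision), and a short sketch with invented offsets and spreads shows why a single shot cannot tell the two apart:

```python
import random
import statistics

random.seed(7)

def shots(offset: float, spread: float, n: int = 10_000):
    """n shots at a bullseye at 0: offset is the bias, spread the imprecision."""
    return [random.gauss(offset, spread) for _ in range(n)]

accurate_imprecise = shots(offset=0.0, spread=1.0)  # wide circle on target
precise_inaccurate = shots(offset=1.0, spread=0.1)  # tight circle, off target

for label, s in (("accurate/imprecise", accurate_imprecise),
                 ("precise/inaccurate", precise_inaccurate)):
    print(label,
          "mean offset:", round(statistics.mean(s), 3),
          "spread:", round(statistics.stdev(s), 3))

# Any single shot from either gun could land at, say, 0.9. The shot alone
# cannot tell you whether the gun is biased or merely imprecise.
```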

"The Lancet methodology would have constituted multistage clustering if they didn't clump clusters in the pairing process."

Why? The clusters represent provinces. Why do you assume that the death rate is uniform in any province, so that randomly leaving out whole chunks of a province by including only certain clusters from that province is biased or whatever you are calling it, but randomly leaving out those few clusters is not biased?

To use your analogy, tossing the dice to decide whether to leave out Texas or California is an absolute nono, but tossing the dice to decide whether Los Angeles or Eureka represents California; no problem.

z,

Then it's not bias. It's error. That is why the confidence limits are very large.

Each different sample from this methodology has the possibility of producing a bias. Think about it like this: if the provinces excluded all had very low death rates, then the sample would be biased upwards. Yet as I have said, and you repeat, this will not be the same every time, but there is still potential for a biased sample. Biased does not have to mean fraudulent, you know. Yet there is also the possibility of it not being biased at all. So to generalize, as I have already said, we'll call it sampling error.

The methodology in itself is not biased, but it has the potential to produce biased samples.

The confidence intervals do not reflect the sampling error produced by this effect, as the text in the study hints at and tries to gloss over.

You are telling us that this survey is not accurate, based on the fact that you have found a problem with its precision. Does not follow logically, they are distinct concepts.

Not at all, you are ascribing to me the opposite of the Lambert & Co dogma.

My position is that the procedure is not precise, and we cannot know how accurate it is. I have said time and time again that the study might very well be accurate despite its woeful precision, but due to the precision woes, we just don't know how accurate it is. We can have no confidence in its accuracy. We can't say it is accurate, or that it isn't. We don't know, and can't know.

The Lambert & Co position is that it isn't precise, but that it is accurate.

Tell me which position is logically corrupt.

Why? The clusters represent provinces.

Ehm, no they don't. Not sure what you mean by that, but each cluster represents 1/33 of Iraq's population, which does not correspond with any province.

Why do you assume that the death rate is uniform in any province, so that randomly leaving out whole chunks of a province by including only certain clusters from that province is biased or whatever you are calling it, but randomly leaving out those few clusters is not biased?

Have you even read the methodology? Your comments seem to indicate that you have absolutely no clue about the methodology at all. It isn't a matter of leaving out "chunks of a province". It's a matter of leaving out entire provinces. If you haven't even realized this by now, no wonder we are arguing in circles.

To use your analogy, tossing the dice to decide whether to leave out Texas or California is an absolute nono, but tossing the dice to decide whether Los Angeles or Eureka represents California; no problem.

Except that's not what the Lancet study does.... The Lancet study leaves out Texas or California. I think you'll agree that Texas and California are larger areas than Los Angeles and Eureka, if I am not understanding what you are trying to get at....

"The methodology in itself is not biased, but it has the potential to produce biased samples."

Well, to beat the metaphor to death, that is the equivalent of saying that with an accurate but imprecise gun, each shot is inaccurate. OK, but it's still your best shot, so to speak.

To beat the now dead metaphor to a pulp, your POV is that you have examined the gun thoroughly and found it's precision not up to your standards; which the manufacturer does note in the small print in the technical manual, although the PR literature makes false statements. However, upon examining the gun, you have found no sign of anything that would cause it to be systematically inaccurate, other than the fact that each shot will have a random component that would make it inaccurate by a random amount, averaging zero. Nevertheless, you state that without having a bunch of shots you can't tell that the imprecision is covering up an inherent inaccuracy that can't be determined by thoroughly examining the mechanism; and therefore, you are assuming that the actual bullseye is somewhere below the location of the bullet hole. To my mind, that starts off well, gets less grounded, and then veers off totally at the end. The rest of us accept that the bullseye probably does not fall dead center on the bullet hole, but think the relationship between the two is random and unpredictable, so right now "near the bullet hole" is all we have to go on for the location of the bullseye, but it's better than the previous estimate, which was "somewhere above the ground".

"Seixon has been banned for 24 hours for repeated violation of my comment policy. I've deleted his comments and all the ones criticizing him from this thread. Any discussion of Seixon should go in the Seixon thread, not here."

Oops. In that case, I'll refrain from posting anything else, lest I be suspected of beating him with a dead metaphor while he cannot defend himself.

"it's precision"
But I will say that I meant "its".

Is this the Seixon statistical criticism thread then? I assume it is, but just checking to be sure.

By Donald Johnson (not verified) on 10 Nov 2005 #permalink

However, upon examining the gun, you have found no sign of anything that would cause it to be systematically inaccurate, other than the fact that each shot will have a random component that would make it inaccurate by a random amount, averaging zero.

You want to explain why it would average zero? Or did you just pull that out of a hat? The gun is systematically inaccurate, only it doesn't pull to any one direction systematically. ;)

Nevertheless, you state that without having a bunch of shots you can't tell that the imprecision is covering up an inherent inaccuracy that can't be determined by thoroughly examining the mechanism; and therefore, you are assuming that the actual bullseye is somewhere below the location of the bullet hole.

Once again you are misrepresenting my position, again trying to juxtapose me against the position that people like Lambert and you are taking: that the study is accurate. I'm not claiming it isn't accurate; I'm claiming that the study is so imprecise and unaccountable that there is no reason to put any weight behind its conclusions. It might be accurate, might not be.

To use your metaphor against you: you know you have an imprecise gun, and you have taken too few shots to be able to draw any meaningful conclusion about the accuracy of the gun, yet you claim the gun is accurate anyway.

The rest of us accept that the bullseye probably does not fall dead center on the bullet hole, but think the relationship between the two is random and unpredictable, so right now "near the bullet hole" is all we have to go on for the location of the bullseye, but it's better than the previous estimate, which was "somewhere above the ground".

Well, it looks like my characterization of you was accurate. How do you know that it is "near the bullet hole"? Your gun is so imprecise, how can you even claim it to be near? In fact, with the shots you have taken, all you can claim is that the bullseye is somewhere on the board.

Sure, something is better than nothing, but that doesn't justify walking around claiming that this something is accurate, robust, and the damn truth.

I also note that you stayed far away from talking about the methodology once you found out you didn't know what you were talking about regarding clusters and provinces... So in other words, I can see why you were thinking the way you did - you just didn't have your facts straight on the methodology.

"You want to explain why it would average zero? Or did you just pull that out of a hat? The gun is systematically inaccurate, only it doesn't pull to any one direction systematically. ;)"

Because you haven't been able to show any reason why it would systematically go to one direction or another; i.e., bias. Or is this part of your position that it is not up to critics to come up with a better estimate, it is up to non-critics to come up with an estimate that proves themselves wrong?

Because you haven't been able to show any reason why it would systematically go to one direction or another; i.e., bias. Or is this part of your position that it is not up to critics to come up with a better estimate, it is up to non-critics to come up with an estimate that proves themselves wrong?

You sound like Mary Mapes. "I don't have to prove that my estimate is authentic. I don't think that's the standard." LOL.

No, it doesn't systematically go in any certain direction, namely because it is an unquantifiable error. That still doesn't mean that it "averages zero", and even if it did "average zero", that would have no relevance since you are only conducting the sample one time! The only way that would be relevant is if you did the sample 10 times and then combined those samples to publish a result.

You are hopping and skipping away from the fact that:

1) They cut the corners on the methodology
2) This resulted in a sampling error
3) This sampling error is not computable
4) Their confidence interval, result, and DE do not account for this error
5) That matters because THEY introduced this error, and not an external factor

I guess I will just have to post this here since Lambert can't be bothered with defending his debunked arguments:

The ILCS does not agree well with the Lancet survey. The infant mortality rates are quite different, and the ILCS has much too vague questions about mortality for Lambert to be making the assertions he has made here and in the past. The ILCS asked respondents about "war-related" deaths, which would leave it up to the respondent to decide whether they would note it as that or disease, accident, pregnancy-related, or other. Lambert has excluded the possibility of a disease, accident, pregnancy or criminal death being "war-related" and the respondent assessing the question in this manner. As usual, Lambert eliminates all possibilities that do not mesh with his conclusions.

"The ILCS asked respondents about "war-related" deaths, which would leave it up to the respondent to decide whether they would note it as that or disease, accident, pregnancy-related, or other."

In other words, the ILCS can be considered to represent an underestimate of the total deaths.

"The only way that would be relevant is if you did the sample 10 times and then combined those samples to publish a result."

And statistical theory goes out the window.

"1) They cut the corners on the methodology "

No study can ever be "perfect" (particularly by your definition). All studies are hampered by real world constraints. Medical trials, for example, are usually badly underpowered due to the cost of getting sufficient subjects. Nevertheless, they deliver information. One learns to scale the validity of such information on the basis of "what they actually did", unlike the media, which immediately sells newspapers on the basis of "cure found for cancer!". Certainly, you are entitled to your estimate of the validity of this survey; we all disagree. You've enlightened (not sarcastic) us to the fact that the precision of the survey is not great, not even as great as their confidence interval would indicate. OK, thanks, but of course that means that the true number is as likely to be > 98000 as it is to be less than that.

As with any study, the "so what?" question comes up. Nobody here was particularly vested in the number being precisely 98,000, I believe, as would life insurers, coffin manufacturers, etc. Nor the researchers involved, nor the Lancet editors. It's not really a linear relationship; 50,000 excess deaths would be just as shocking and horrifying as 98,000. or 20,000. If I had to put the point of the survey into one sentence it would be that this war to save the Iraqi people from the ravages of Saddam has resulted in a large increase in their death rate, which is indicative of a certain lack of achieving its goal. Normally, this would cause one to rethink current and future strategies, with an eye towards getting back on track.

And the survey has indeed proved valuable and widely accepted, in that beforehand the war supporters were pooh-poohing the Iraqi BodyCount numbers as ridiculously high and now that has become their fallback position, despite being an obvious underestimate. Thus does information weasel its way into the general wisdom, even if the source of such information did not perfectly achieve its ends.

In other words, the ILCS can be considered to represent an underestimate of the total deaths.

Yes. And? So my criticism of Lambert's cherry-picking stands then?

And statistical theory goes out the window.

Surely you're not confusing the rest of statistics with sampling, now are you? Like how someone recently started talking about coin-tosses as if that were taking a "sample"... lol.

No study can ever be "perfect" (particularly by your definition). All studies are hampered by real world constraints.

Of course, but what this study does is attempt to seem more precise than it actually is. Also, they could have just extrapolated the results for those parts of Iraq they went to, and not for all of Iraq. That would have been an honest way of going about it. In other words, they should have extrapolated the result only to the 75% of Iraqis they sampled, and not to the 97% (because of Fallujah) that they ended up covering.

I'm not saying that things have to be "perfect", as no survey ever is, but with this one things went wrong beyond the normal kind of constraints. "Clumping of clusters" is something that I cannot find in any other survey, which means that they sliced and diced this methodology a bit too much. The fact that this slicing and dicing doesn't and can't show up in the final estimates is also very troubling.

OK, thanks, but of course that means that the true number is as likely to be > 98000 as it is to be less than that.

Again, how do you know that? It depends entirely on those unsampled provinces.

Nobody here was particularly vested in the number being precisely 98,000, I believe, as would life insurers, coffin manufacturers, etc.

Oh really? You might want to check with Lambert on this one... And you know, all the anti-war people who keep using the 100,000 as if it were a fact.

It's not really a linear relationship; 50,000 excess deaths would be just as shocking and horrifying as 98,000.

So why was 98,000 so important? Why was it rounded up to 100,000? Why was the "civilians" connotation added?

If i had to put the point of the survey into one sentence it would be that this war to save the Iraqi people from the ravages of Saddam has resulted in a large increase in their death rate, which is indicative of a certain lack of achieving its goal.

And as a response, I would say that anyone would be out of their mind to believe that the death rate would not go up in a time of war, and you are also leaving out the fact that the insurgents have killed over 10,000 innocent Iraqis in the last 2 years. You are also looking at this quite short-sightedly. Saddam's regime lasted for around 30 years. Yet you are ready to claim that just 2 years after his removal, things are already going to be worse for the next 28? Hmmm.

Using that kind of logic would get you pretty screwed if you took a trip back to 1945, for example...

And the survey has indeed proved valuable and widely accepted, in that beforehand the war supporters were pooh-poohing the Iraqi BodyCount numbers as ridiculously high and now that has become their fallback position, despite being an obvious underestimate.

I did not accept the IBC numbers during the first months of the war. That is because the counts seemed too high - at that time. A long time has passed since then, and the IBC numbers now seem very realistic and are in fact most likely an underestimate. So I think you are leaving out the time frame here, in my case anyway. There are of course partisan pro-war people who acted as you just said, but there are people on the other side who do the same thing.

So I guess that means that at least one person grudgingly admits that there are problems with the methodology, but accepts the problems due to constraints.

As long as you don't go around citing the Lancet study as some sort of fact, you are good on my list. It is those such as Lambert who continually keep up the farce that the Lancet study is "robust" and immune from all criticism that I find very hard to stomach. Not to mention the reliance on the results as if they were written in stone...

With such imprecise methodology and imprecise results, only the most rabid of partisans will claim that the results are proof of anything. Especially when they don't acknowledge that additional imprecision was introduced by the pollsters that goes beyond the normal errors one typically finds in a survey.