One last word on the Geiers: So good it shouldn't be buried in the comments

I was going to give this a rest for a while, but this is too good not to post a brief note about.

Posted in the comments of my piece debunking the Geiers' pseudoscience and their laughable "scientific" article claiming to show a decrease in the rate of new cases of autism since late 2002, when thimerosal was removed from vaccines completely (other than some flu vaccines), was this gem of a comment by one MarkCC, which stated the essence of what was wrong with the Geiers' so-called "statistical analysis" of the VAERS database:

Here's the key, fundamental issue: when you're doing statistical analysis, you don't get to look at the data and choose a split point. What the Geiers did is to look at the data, and find the best point for splitting the dataset to create the result they wanted. There is no justification for choosing that point except that it's the point that produces the result that they, a priori, decided they wanted to produce.

Time trend analysis is extremely tricky to do - but the most important thing in getting it right is doing it in a way that eliminates the ability of the analysis to be biased in the direction of a particular a priori conclusion. (In general, you do that not to screen out cheaters, but to ensure that whatever correlation you're demonstrating is real, not just an accidental correlation created by the human ability to notice patterns. It's very easy for a human being to see patterns, even where there aren't any.)

Redo the Geiers' analysis using any decent time-trend analysis technique - even a trivial one like doing multiple overlapping three-year regressions (i.e., plot the data from '92 to '95, '93 to '96, '94 to '97, etc.) - and you'll find that that nice clean break point in the data doesn't really exist - you'll get a series of trend lines with different slopes, without any clear break in slope or correlation.

So - to sum up the problem in one brief sentence: in statistical time analysis, you do not get to pick break points in the time sequence by looking at the data and choosing the break point that is most favorable to your desired conclusion.

Exactly! Unfortunately, that's exactly what the Geiers did.
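
To make MarkCC's suggestion concrete, here's a minimal sketch in Python of those overlapping-window regressions. The yearly counts below are invented for illustration (they are not actual VAERS or CDDS numbers); the point is simply that, on data with a steady trend, the window-by-window slopes drift gradually rather than snapping at one magic year.

```python
# A minimal sketch (not the Geiers' code) of the overlapping-window
# regressions MarkCC describes: fit a least-squares line to each
# '92-'95-style window and watch how the slope changes. Counts are invented.
import numpy as np

years = np.arange(1992, 2006)                      # hypothetical report years
counts = np.array([110, 125, 138, 150, 170, 185,   # invented yearly counts
                   200, 220, 235, 255, 270, 280,
                   290, 300], dtype=float)

window = 4  # '92-'95, '93-'96, ... (four calendar years per window)
for start in range(len(years) - window + 1):
    x = years[start:start + window]
    y = counts[start:start + window]
    slope, intercept = np.polyfit(x, y, 1)         # ordinary least squares
    print(f"{x[0]}-{x[-1]}: slope = {slope:.1f} cases/year")
```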

A proper statistical analysis of such data, looking for time points at which a rate of change in a variable changes, is designed such that there is no bias in selecting a time point at which a significant change in slope is observed. As much as the Geiers might want to believe that there is a marked change in the slope of the curve beginning around late 2002 to early 2003, they can't assume that there is such a breakpoint before doing the analysis.
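
For readers who want to see what an unbiased breakpoint search can look like, here's a minimal sketch of one standard approach (a sup-type changepoint test with a parametric bootstrap; the data are invented, and this is not what the Geiers did): every candidate break is searched, but the p-value comes from a null distribution in which the same search was run on break-free data, so the luckiest-looking split no longer counts as evidence by itself.

```python
# Sketch of a changepoint test that does NOT hand-pick the break: search every
# candidate split, take the best (sup-style) statistic, then judge it against a
# "no break anywhere" null in which the same search was performed.
import numpy as np

rng = np.random.default_rng(0)

def rss(x, y):
    """Residual sum of squares from a single least-squares line."""
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (slope * x + intercept)
    return float(resid @ resid)

def best_break_stat(x, y, min_seg=3):
    """Largest drop in RSS from splitting into two lines, over all candidate breaks."""
    full = rss(x, y)
    best = 0.0
    for k in range(min_seg, len(x) - min_seg + 1):
        split = rss(x[:k], y[:k]) + rss(x[k:], y[k:])
        best = max(best, full - split)
    return best

# Invented yearly data with a steady (break-free) trend plus noise.
x = np.arange(1992, 2006, dtype=float)
y = 10.0 * (x - 1992) + rng.normal(0, 5, size=len(x))

observed = best_break_stat(x, y)

# Parametric bootstrap under the no-break null: simulate from the single-line fit.
slope, intercept = np.polyfit(x, y, 1)
sigma = np.sqrt(rss(x, y) / (len(x) - 2))
null_stats = [best_break_stat(x, slope * x + intercept +
                              rng.normal(0, sigma, size=len(x)))
              for _ in range(2000)]
p_value = np.mean([s >= observed for s in null_stats])
print(f"best-break statistic = {observed:.1f}, p = {p_value:.3f}")
```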

Once again, what pseudoscientists like the Geiers never seem to understand is that all those precautions we scientists take with control groups and statistical analyses designed to minimize investigator bias exist because we realize how easy it is for a scientist, particularly a medical scientist who is invested in finding a cure for a particular disease or condition, to be seduced into believing something that is not supported by data. (If they did understand, they wouldn't use such simplistic and easily debunked "scientific" methodology.) It's a very human tendency, and the scientific method is designed to minimize that tendency. That's why it takes so much training to overcome.

Some scientists never do overcome this tendency, and if they fall deeply enough into belief over evidence they become pseudoscientists.

Like the Geiers.

Thanks, MarkCC.

In fact, what the Geiers did was a textbook case of the "Texas sharpshooter fallacy," so named from a possibly apocryphal story about a Texan who brags about his target-shooting ability. He stands way back from the side of a barn, fires wildly, hitting it all over the place, and then draws a target around the places he hit.

The point is that in statistics, you can't use the same set of data to generate a hypothesis and then test that hypothesis; if you do so, you're reasoning in a circle. Introductory stats courses do a really bad job of explaining this, just saying "don't look at the data before testing it," which often leads students to something similar to the New Age "interpretation" of quantum mechanics.

Exactly, ebohlman - that's why in microarray experiments, or any other sort of biomarker study, we first have a 'training set' and then test the hypothesis in a completely different experimental set of subjects.
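
For what it's worth, here's a toy sketch of that training/test discipline. Everything in it is invented random data, so the "best" marker found in the training subjects should generally fail to hold up in the held-out subjects, which is exactly the safeguard the split provides.

```python
# Toy sketch of the training/test discipline: any pattern "discovered" in the
# training subjects must be confirmed in subjects never used to form the
# hypothesis. Data are pure noise, so a promising training-set marker should
# usually evaporate on the held-out set.
import numpy as np

rng = np.random.default_rng(1)
n_subjects, n_markers = 40, 500
expression = rng.normal(size=(n_subjects, n_markers))   # invented "biomarker" data
disease = rng.integers(0, 2, size=n_subjects)           # invented case/control labels

train, test = slice(0, 20), slice(20, 40)

# "Generate the hypothesis": pick the marker most associated with disease in training.
train_diff = (expression[train][disease[train] == 1].mean(axis=0)
              - expression[train][disease[train] == 0].mean(axis=0))
candidate = int(np.abs(train_diff).argmax())

# "Test the hypothesis": check the same marker in the independent test subjects.
test_diff = (expression[test][disease[test] == 1][:, candidate].mean()
             - expression[test][disease[test] == 0][:, candidate].mean())
print(f"marker {candidate}: training difference = {train_diff[candidate]:.2f}, "
      f"test difference = {test_diff:.2f}")
```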

While I wholeheartedly agree that the abuse of statistics through reading data before positing a hypothesis is dead wrong, there exists a whole field - data analysis - that superficially does just that. Under these circumstances, a respectable scientist has to draw attention to any pitfalls that he/she may perceive when applying inferential procedures and reaching conclusions 'a posteriori'.

By John Morgan (not verified) on 08 Mar 2006

While I wholeheartedly agree that the abuse of statistics through reading data before positing a hypothesis is dead wrong, there exists a whole field - data analysis - that superficially does just that. Under these circumstances, a respectable scientist has to draw attention to any pitfalls that he/she may perceive when applying inferential procedures and reaching conclusions 'a posteriori'.

That's a key distinction, isn't it? Real scientists are very careful to qualify the limitations of their methodology, particularly when doing retrospective analyses, which by their very nature are much more prone to bias and incorrect conclusions than prospective studies--even when the data used isn't as questionable as what is contained in the VAERS database. Pseudoscientists don't bother to list the limitations of their analysis or only do so in a very perfunctory fashion, mainly because they don't want to weaken their conclusion, which was usually reached before they ever looked at the data.

The bottom line is that correlation does not equal causation, and the Geiers haven't even been able to demonstrate correlation convincingly.

Here's the key, fundamental issue: when you're doing statistical analysis, you don't get to look at the data and choose a split point.

While I agree with the sentiment of the above statement, I do think a few things ought to be clarified. First, you can choose a split point a priori, e.g., the stock market crashed on March 4; let's collect data on stock prices and see whether the March 4 crash affected them. I didn't see where the Geiers spelled out whether they chose their point before seeing the data or not. Given that VAERS and CDDS data are public, I conservatively assume not. I would expect to see some formal hypothesis test that directly addresses the split point. The so-called interrupted time series methods are one good class of such tests, although I looked at their CDDS data and decided it didn't need time series analysis after all (no autocorrelation or partial autocorrelation). You can also use regression model-building techniques to address this hypothesis test. Of course, the Geiers tried to justify their change point by looking at the slopes of two lines and comparing them, and, well, that was rather bizarre.
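
As an illustration of what a formal, pre-specified split-point test can look like, here's a minimal sketch of a Chow-style F test on invented quarterly data, with the break date fixed before looking at the numbers. Interrupted time series and model-selection approaches are other legitimate options; the key feature they share is that the candidate break is specified first and the data are consulted second.

```python
# Sketch of a Chow-style F test for a break at a *pre-specified* date (the
# legitimate version of the exercise: the date is fixed before seeing the data).
# Quarterly counts below are invented; the candidate break is Q1 2003.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
t = np.arange(1999.0, 2006.0, 0.25)                  # invented quarterly time index
y = 50 + 8 * (t - 1999) + rng.normal(0, 4, len(t))   # invented counts, no real break

break_at = 2003.0                                    # fixed a priori, not data-driven

def rss(x, yy):
    slope, intercept = np.polyfit(x, yy, 1)
    r = yy - (slope * x + intercept)
    return float(r @ r)

pre, post = t < break_at, t >= break_at
rss_pooled = rss(t, y)
rss_split = rss(t[pre], y[pre]) + rss(t[post], y[post])

k = 2                                   # parameters per line (slope, intercept)
n = len(t)
f_stat = ((rss_pooled - rss_split) / k) / (rss_split / (n - 2 * k))
p_value = stats.f.sf(f_stat, k, n - 2 * k)
print(f"Chow F = {f_stat:.2f}, p = {p_value:.3f}")
```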

The other thing to note is that the Geiers did not really perform a changepoint analysis. Take a look again. They overlap their two regression lines by a year. I certainly haven't seen that in the 15 years that I have studied and done statistics.

This paper was certainly not a red-letter day for statistics.