One of the most common sleazy tricks used by various sorts of denialists comes back to statistics – invalid and deceptive sampling methods. In fact, the very first real post on the original version of this blog was a shredding of a paper by Mark and David Geier that did this.

Proper statistical analysis relies on a kind of blindness. Many of the things that you look for, you need to look for in a way that doesn’t rely on any a priori knowledge of the data. If you look at the data, and find what appears to be an interesting property of it, you have to be very careful to show that it’s a real phenomenon – and you do that by performing blind analyses that demonstrate its reality.

The reason that I bring this up is because one of my fellow SBers, Tim Lambert, posted something over at his blog, Deltoid, about a particularly sleazy example of this by Michael Duffy, a global warming denialist. The situation is that Duffy claims that global warming stopped in 2002. It didn’t. But he makes it *look* like it did by using a deliberately dishonest way of sampling the data.

When you’re looking at something like climate, one way of studying trends is to take periodic trending samples. That is, take every two-year interval, and compute the difference between its endpoints. (So, for example, to look at two-year trends since 2000, you’d look at 2000–2002, 2001–2003, 2002–2004, 2003–2005, and so on.) To look for strong trends in this way, you need to be sure that you’re capturing the right phenomenon – because climate is chaotic, if you look at a period of time that’s too short, you can see a lot of noise. So, for example, you might look at every 2-year trend, every 4-year trend, every 6-year trend, every 8-year trend, and every 10-year trend.
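This kind of trend sampling is easy to sketch in code. The function and the numbers below are purely illustrative – made-up yearly values, not real climate data:

```python
# Illustrative sketch of periodic trending samples: for a window of n
# years, compute the change across every n-year interval in the record.

def n_year_trends(temps, start_year, n):
    """Map each interval's starting year to the change over n years."""
    return {
        start_year + i: temps[i + n] - temps[i]
        for i in range(len(temps) - n)
    }

# Hypothetical yearly temperature anomalies (degrees F), 2000 onward.
temps = [0.0, 0.3, 0.1, 0.4, 0.2, 0.5, 0.6, 0.4, 0.7]

two_year = n_year_trends(temps, 2000, 2)
# two_year[2000] is the 2000-2002 change, two_year[2001] the
# 2001-2003 change, and so on.
```

Running the same function with n = 4, 6, 8, … gives the longer-window trends described above.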

Let me take a moment to explain one very important word in the discussion above: *chaotic*. In mathematics, chaos has a very specific meaning. It doesn’t mean random and without pattern. It means that there’s a high sensitivity to initial conditions, and a particular kind of stochastic self-similarity. The canonical example of this is Brownian motion. Take a cup of tea, and float a grain of pepper on it. Now, every second, plot its position in the tea. It’s going to float around in seemingly random ways. But there’s a pattern to its motion. You’ll see it make some large moves, but they’ll be rare in comparison to the average move.
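That “large moves are rare” claim is easy to check with a toy model. Here’s a sketch that uses a two-dimensional random walk with Gaussian steps as a simple stand-in for the pepper grain’s motion – every number in it is made up for illustration:

```python
# Toy model of the pepper grain: a 2-D random walk with Gaussian step
# components. We measure how often a step is much larger than average.
import math
import random

random.seed(0)  # fixed seed so the sketch is reproducible

def step_sizes(n):
    """Sizes of n random-walk steps with Gaussian x/y components."""
    return [math.hypot(random.gauss(0, 1), random.gauss(0, 1))
            for _ in range(n)]

sizes = step_sizes(10_000)
avg = sum(sizes) / len(sizes)
big_fraction = sum(s > 2 * avg for s in sizes) / len(sizes)
# Most steps cluster near the average; only a few percent are "large".
```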

There’s a lot more to mathematical chaos than that, and I’ll probably write about it at some point. But the thing that’s important here is that the chaotic behavior of things like Brownian motion can mask trends. If you stir the tea in the teacup, you’ll find the pepper jumping around in a chaotic fashion – but there’ll be an underlying trend for it to move in a circle. If you drop a ping-pong ball into a river, it’ll move all over the place – it will sometimes even get caught in an eddy, and move backwards. But overall, there’ll be a strong trend for it to move downriver.

If you did a trend analysis of the motion of the ping-pong ball, you’d be looking at “How far did it move downriver in a given period of time?” – so you’d record its position every second, and then look at the difference in its position over 1-second intervals, 5-second intervals, 10-second intervals, and so on.

If you wanted to argue that the ping-pong ball had completely stopped moving downriver, you couldn’t just show that in three consecutive two-second intervals, its position didn’t move downriver. The chaotic nature of its motion means that you’d *expect* intervals of that length where it didn’t move downriver.

To get back to the climate issue: climate is chaotic, so there are a lot of bumps in it. If you look at short trends, you see a huge amount of noise. But if you look at slightly longer trends, a very strong pattern starts to appear. Even that has its bumps, but you can see a very compelling pattern in the data.

So, our denialist friend did trending – up to six-year trends. And that’s what he focuses his discussion on: six-year trends. Why, you might ask, would he look specifically at six-year trends? That’s easy. Because six-year trends are the longest ones that produce the results he wants. Plot seven-year or eight-year trends, and suddenly, you can see the warming trend again. In fact, it’s an extremely obvious thing. Just look at the graph (taken from RealClimate).

What’s going on mathematically is that there is an upward trend in the data. Most estimates put that warming trend at around 5 degrees F per century – or about 1/20th of a degree per year. But yearly variation – the chaotic component – is plus or minus a couple of degrees. So over short periods of time, that yearly variation drowns out the trend. But if you look at longer trends – which damp out the random yearly variation, while allowing the trend to accumulate – then the overall warming trend becomes visible.
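You can see this damping effect with a quick simulation. The sketch below uses the rough numbers from the paragraph above – a trend of about 1/20th of a degree per year, plus a couple of degrees of random yearly variation – and checks how often an N-year interval shows warming. The data is simulated, purely to illustrate the point:

```python
# Simulated series: a small upward trend buried in large yearly noise.
# Short intervals are near a coin flip; long intervals show the trend.
import random

random.seed(1)  # fixed seed so the sketch is reproducible

def simulate(years, trend=0.05, noise=2.0):
    """Yearly values: 0.05 deg/year trend plus +/- 2 degrees of noise."""
    return [trend * y + random.uniform(-noise, noise) for y in range(years)]

def frac_warming(temps, n):
    """Fraction of n-year intervals whose net change is positive."""
    diffs = [temps[i + n] - temps[i] for i in range(len(temps) - n)]
    return sum(d > 0 for d in diffs) / len(diffs)

temps = simulate(200)
short = frac_warming(temps, 2)   # noise dominates: roughly half warming
long = frac_warming(temps, 50)   # trend accumulates: almost all warming
```

Over 2 years the trend contributes only about 0.1 degrees, swamped by the noise; over 50 years it contributes about 2.5 degrees, which is larger than the typical noise difference.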

What Duffy did is look at his data, and try to find a way of presenting it that appeared to support his pre-selected conclusion. And he managed to find one. He didn’t show a complete analysis – he couldn’t, because a complete analysis would have refuted his argument. So he selectively chose a way of analyzing the data that would produce the desired results: he looked at the data to find the longest period where trend analysis would show what he wanted – and he stopped there.
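In code, the cherry-pick itself is embarrassingly simple. This sketch (again with hypothetical data and an illustrative helper of my own naming) does exactly what’s described: scan the window lengths and keep only the longest one that hides the warming:

```python
# The cherry-pick as an algorithm: among all window lengths, find the
# longest whose most recent interval shows no warming, and report only
# that one -- ignoring every longer window that shows the trend.

def longest_flat_window(temps):
    """Longest n such that the final n-year change is <= 0 (None if none)."""
    best = None
    for n in range(1, len(temps)):
        if temps[-1] - temps[-1 - n] <= 0:
            best = n
    return best

# Hypothetical anomalies with a clear upward trend but a recent dip.
temps = [0.0, 0.1, 0.3, 0.2, 0.5, 0.7, 0.6, 0.5, 0.6]

n = longest_flat_window(temps)
# A denialist reports only the n-year trend; every longer window
# (which shows warming) never makes it into the argument.
```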

As sleazy tactics go, this is pretty extreme. As I said earlier, the very first post on this blog was a takedown of an autism crank paper. This is *far* worse than the autism paper – which was pretty bad. In the case of the autism paper, they wanted to find an inflection point in the data, so they looked at the data, and picked something that would produce the result they wanted. Arguably, you could just be clueless about proper statistical methods, and do that by mistake. In the case of this global warming claim, there is no possibility that it was caused by clueless error. This was deliberate deception by cherry-picking data to produce a desired result.