Why you need to correct for clustering

Suppose you had a pair of dice and were wondering if they were fair. The average number you will get on a pair of fair dice is seven, so one way you could check your dice is to roll them a few times and look at the average of the results. Trouble is, you aren't likely to get an average of exactly seven. Suppose you get an average of 9. Are the dice fair? Well, that depends on how many times you rolled them.

[Figure: averages of two rolls of a pair of dice (4dice.png)]
I rolled\* a pair of dice twice and averaged the results. I repeated the experiment 1000 times and plotted all the averages. 95% of the averages lie between the two horizontal lines, at 4.5 and 9.5. Getting an average of 9 is not particularly unusual, so we cannot conclude that the dice are unfair.
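If you want to reproduce this, here is a minimal sketch of the simulation in Python (the NumPy approach and the name average_of_rolls are mine, not part of the original experiment):

```python
# Average n_rolls rolls of a pair of fair dice, repeated many times.
import numpy as np

rng = np.random.default_rng(0)

def average_of_rolls(n_rolls, n_experiments=1000):
    # Each observation is the sum of two fair dice (expected value 7).
    rolls = rng.integers(1, 7, size=(n_experiments, n_rolls, 2)).sum(axis=2)
    return rolls.mean(axis=1)  # one average per experiment

averages = average_of_rolls(n_rolls=2)
lo, hi = np.percentile(averages, [2.5, 97.5])
print(f"95% of the averages lie between {lo:.1f} and {hi:.1f}")
```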

[Figure: averages of 50 rolls of a pair of dice (100dice.png)]
This graph shows what happens if you average 50 rolls instead of two. Notice that the averages lie much closer to 7: now 95% of them fall between the two horizontal lines, at 6.5 and 7.5. This time, getting an average of 9 is very unusual, and we can conclude that the dice are almost certainly unfair.
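Re-running the sketch above with 50 rolls instead of two shows the tightening directly:

```python
# Same simulation as before, but averaging 50 rolls per experiment.
averages_50 = average_of_rolls(n_rolls=50)
lo, hi = np.percentile(averages_50, [2.5, 97.5])
print(f"95% of the averages lie between {lo:.1f} and {hi:.1f}")
# The interval is far narrower than in the two-roll case.
```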

The important difference between averaging two rolls and averaging 50 rolls is the distance between the two horizontal lines. For two rolls it is 9.5-4.5=5, while for 50 rolls it is 7.5-6.5=1. Rather than use this distance directly, statisticians use something called the standard error, which is about 1/4 of the distance: 1.25 and 0.25 in our two examples. (The 95% band spans about two standard errors on either side of the expected value, hence four standard errors in total.) The usual convention is that a result is statistically significant if it is more than two standard errors from the expected value. So in the two-roll case 9 is not statistically significant, since it is (9-7)/1.25=1.6 standard errors away from 7, while in the 50-roll case 9 is statistically significant, since it is (9-7)/0.25=8 standard errors away.
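Here is the same arithmetic spelled out, using the standard errors read off the two plots:

```python
# Two-standard-error rule applied to an observed average of 9.
for n_rolls, se in [(2, 1.25), (50, 0.25)]:
    z = (9 - 7) / se
    verdict = "significant" if z > 2 else "not significant"
    print(f"{n_rolls} rolls: {z:.1f} standard errors from 7, {verdict}")
```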

Notice that if you think you have more dice rolls than you really do, you will compute a standard error that is smaller than the true one, and you may declare a result statistically significant when it isn't.

[Figure: averages of 50 totals sharing a single red-die roll (51dice.png)]
You might think that counting the number of rolls is easy, but it is easy to go astray. This time I had a red die and a green die. While I rolled the green die 50 times, I rolled the red die only once. I added the red die's score to each green roll to get 50 different totals, and plotted the averages as before. Even though I took the average of 50 different totals, the results look much more like the first example, where I averaged only two rolls, than like the second example with 50 rolls. The reason is that the red die is shared by all the totals, so you aren't really rolling as many dice as you think. The technical term for this is clustering, and adjusting the standard errors to allow for it is called the clustering correction. In this case, the clustering correction would increase the standard error from 0.25 to 1.25.
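A sketch of this clustered experiment, in the same hypothetical Python setup as before:

```python
# One shared red die plus 50 independent green dice per experiment.
import numpy as np

rng = np.random.default_rng(0)

def clustered_average(n_green=50, n_experiments=1000):
    red = rng.integers(1, 7, size=(n_experiments, 1))          # rolled once, shared
    green = rng.integers(1, 7, size=(n_experiments, n_green))  # rolled 50 times
    totals = red + green  # broadcasting: every total shares its experiment's red roll
    return totals.mean(axis=1)

print("spread of the averages:", clustered_average().std(ddof=1))
# The spread is close to the two-roll case, not the 50-roll case,
# because the shared red die dominates the variation.
```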

What has all this to do with the "More Guns, Less Crime" data? Well, when you think of random changes in the crime rate in a particular county, some of the factors causing crime to change operate only within that county (those correspond to the green die above), while others operate statewide (those correspond to a red die shared by all the counties within a state). So it is necessary to make a clustering correction to the standard errors in the "More Guns, Less Crime" data.
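In a regression setting, the correction might look like the following sketch. This uses simulated data and statsmodels' cluster-robust covariance option; it illustrates the general technique, not the actual "More Guns, Less Crime" analysis:

```python
# Counties within states; the regressor (a state law) and part of the
# error (a state shock) are shared within each state, so ordinary
# standard errors are too small and need a clustering correction.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n_states, counties_per_state = 20, 30
state = np.repeat(np.arange(n_states), counties_per_state)

law = (rng.random(n_states) < 0.5).astype(float)[state]  # state-level regressor
state_shock = rng.normal(size=n_states)[state]           # the shared "red die"
crime = 0.5 * law + state_shock + rng.normal(size=state.size)

X = sm.add_constant(law)
naive = sm.OLS(crime, X).fit()
clustered = sm.OLS(crime, X).fit(cov_type="cluster", cov_kwds={"groups": state})
print("naive standard error:    ", naive.bse[1])
print("clustered standard error:", clustered.bse[1])
```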

\* OK, I didn't really roll dice, but simulated them on a computer.


That was the best damned description of clustering I've ever seen. If there is a silver lining in all this crap about Lott I think it will be how much more careful even casual users of econometric techniques will have to be in making policy prescriptions and recommendations.

Best,

Jeff

By Jeffrey Wenger (not verified) on 11 Feb 2004