The Lott calling the kettle black

Posts by d-squared and John Quiggin on data mining and Lott reminded me that Lott accused his critics of data mining in a response to Webster:

The Black and Nagin paper excludes Florida after they have already excluded the 86 percent of the counties with populations fewer than 100,000. Eliminating Florida as well as counties with fewer than 100,000 does eliminate the significance in the one particular type of specification that they report for a couple of crimes, but the vast majority of estimates were unaffected from this extreme data mining and they ignore that doing this actually strengthens some of the results.

and in a Reason interview:

I wanted all the data that were available....I didn't pick and choose, and when somebody drops out 86 percent of the counties along with Florida, you know they must have tried all sorts of combinations. This wasn't the first obvious combination that sprang to mind. And it's the only combination they report....If, after doing all these gymnastics, and recording only one type of specification, dealing with before-and-after averages that are biased against finding a benefit, they still find only benefits, and no cost, to me that strengthens the results.

So, how accurate is Lott's claim that Black and Nagin were doing "extreme data mining"?

Well, Lott's comment about dropping 86% of the counties is a red herring. Black and Nagin got similar results when they included the small counties and dropped only Florida. They wrote:

"Nor is this result a function of our use of the large-county sample. Without Florida in the sample, the estimation of Lott and Mustard's model, which is given by equation (1), for all counties provides no evidence of an impact of RTC laws on homicide and rape."

Even if Lott failed to notice this sentence, he must have known that dropping the small counties didn't matter, since he said he reran all the regressions without Florida. Lott's accusation of "extreme data mining" was deliberately misleading.

Lott's 86% figure is also misleading. Maltz and Targonski noted problems in the county crime data that Lott used, with about 13% of counties having significant under-reporting. In Lott's reply, he argued that the regressions were weighted by population, so the size of the problem was best measured by the percentage of the population in the problem counties, which was only 6.8%. And yet when he criticized Black and Nagin he used the percentage of counties that they dropped (86%), rather than the percentage of population in those counties (about 30%).
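To make the contrast concrete, here is a minimal sketch of the two ways of measuring how much data gets dropped. It assumes Python with pandas; the DataFrame and its column names are placeholders for illustration, not Lott's or Black and Nagin's actual data.

```python
import pandas as pd

def dropped_shares(counties: pd.DataFrame, threshold: int = 100_000):
    """Share of counties vs. share of population excluded when counties
    with population below `threshold` are dropped.

    Assumes a "population" column; everything here is illustrative.
    """
    dropped = counties[counties["population"] < threshold]
    county_share = len(dropped) / len(counties)
    population_share = dropped["population"].sum() / counties["population"].sum()
    return county_share, population_share

# In a population-weighted regression the second number is the one that
# matters. By Lott's own standard in his reply to Maltz and Targonski
# (6.8% of population rather than 13% of counties), the comparable figure
# for Black and Nagin's large-county sample is about 30% of population,
# not 86% of counties.
```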

Is it data mining to check whether the results depend on the inclusion of a particular state? No, that is a legitimate robustness check. Finding that the result hinges on Florida does not prove that the carry laws failed to reduce crime, but it strongly suggests that something is wrong with the model.
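For what it's worth, this kind of check is easy to describe in code. The sketch below is only illustrative: it assumes a county-level panel with placeholder column names ("crime_rate", "shall_issue", "state", "county", "year", "population"), and the formula is a stand-in, not Lott and Mustard's actual specification.

```python
import pandas as pd
import statsmodels.formula.api as smf

def estimate(data: pd.DataFrame):
    # Population-weighted least squares with county and year fixed effects
    # (a stand-in for the real specification).
    model = smf.wls(
        "crime_rate ~ shall_issue + C(county) + C(year)",
        data=data,
        weights=data["population"],
    )
    return model.fit()

def florida_check(df: pd.DataFrame):
    """Re-estimate with and without Florida and compare the key coefficient."""
    full = estimate(df)
    no_florida = estimate(df[df["state"] != "FL"])
    # If the shall-issue coefficient is significant in the full sample but
    # not once Florida is removed, the result rests on a single state.
    return {
        "full": (full.params["shall_issue"], full.pvalues["shall_issue"]),
        "no_florida": (no_florida.params["shall_issue"],
                       no_florida.pvalues["shall_issue"]),
    }
```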
