In article fkk@leland.stanford.edu writes:

In a recent post Pim cites Tim Lambert as support for his position on the

Florida data. I’m sorry but Lambert’s analysis is flawed at its core.

No it isn’t. It appears that you don’t understand what statistical

hypothesis testing is, or what it means.

Let’s see what he says:

First of all Rick is stating that it is supportable only in the ‘right’ time span. Yet he fails

to provide proof for such remarks. The time span is the same

data as used by Kleck. It is Kleck however who is

misleading the facts by carefully selecting his data. Yet

even this is not going to help him since the data show no

support for his thesis as well.

This is overstating things somewhat. The data *does* support his

hypothesis, in that the reported rape rate did fall. However, this

decrease was not statistically significant. This means that random

variations in the reported rape rate provide an equally good explanation.

A crime rate two standard deviations from the mean would be

statistically significant. However, the 67 rape rate was only 0.9

standard deviations less than the 58-66 mean, so this rate is not

statistically significant. However the 66 rate was 1.7 standard

deviations above the mean, so the change from 66 to 67 was 2.6

standard deviations (of the 58-66 rate). This is NOT significant

because the standard deviation of the changes in the rates does not

equal the standard deviation of the rates. For this data, the

standard deviation of the rates from 58-66 is 12, and the standard

deviation of the changes from one year to the next in the years 58-66

is 16. The change from 66 to 67 does NOT exceed 2 standard deviations

(of the 58-66 changes in the rate).
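
To make the arithmetic explicit, here is a minimal Python sketch that uses only the figures quoted above (the standard deviations of 12 and 16, and the offsets of 1.7 and 0.9). The variable names are mine and purely illustrative; the raw UCR rates are not reproduced here.

```python
# Figures quoted in this post; nothing here comes from the raw UCR tables.
sd_rates = 12.0      # standard deviation of the 58-66 rates
sd_changes = 16.0    # standard deviation of the 58-66 year-to-year changes
z_66 = 1.7           # 66 rate: standard deviations above the 58-66 mean
z_67 = -0.9          # 67 rate: standard deviations below the 58-66 mean

# Size of the 66 -> 67 drop, first in units of sd_rates, then in rate units.
drop_in_sd_rates = z_66 - z_67          # about 2.6
drop = drop_in_sd_rates * sd_rates      # about 31.2

# The same drop measured against the variability of the changes.
drop_in_sd_changes = drop / sd_changes  # about 1.95

print(round(drop_in_sd_rates, 2), round(drop, 1), round(drop_in_sd_changes, 2))
print("exceeds 2 sd of the changes:", drop_in_sd_changes > 2)
```

Measured with the yardstick that matches the quantity being tested (a change, not a rate), the drop comes out just under 2 standard deviations.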

In Kleck’s statistical analysis of the Orlando data, he wrote:

“It might be suggested that Orlando had experienced erratic ups and

downs in its rape trends before and that the 1967 experience just

happened to reflect one of the brief, sharp downward swings in the

rape rate, which it had experienced before, and that the downward

swing was therefore unrelated to the gun training program.

However, this suggestion too is unlikely, since Orlando had not

experienced so large a one-year change in rape rates in its recent

past, and the decrease exceeded two standard deviations, a measure

of the variability in rape rates over the 1958-1966 period (1958

was the first year the FBI reported rape data for the Orlando

SMSA). In other words, the rape decrease was considerably larger

than would be expected on the basis of variation in the rate in the

recent past.”

This is mathematically incorrect. Kleck has confused “standard

deviation of the rates” (a measure of the variability of the rates)

with “standard deviation of the CHANGES in the rates” (a measure of

the variability of the CHANGES). His conclusion about the decrease

being considerably larger than expected is simply incorrect.
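
The difference between the two quantities can be checked on any series. The sketch below uses a made-up series (hypothetical numbers, NOT the Orlando data) and Python's statistics.stdev, which computes the sample standard deviation. Whether the figures of 12 and 16 quoted above were sample or population standard deviations is not stated, so that choice is an assumption. The function name is mine.

```python
import statistics

def sd_of_rates_and_changes(rates):
    """Return (sd of the rates, sd of the year-to-year changes)."""
    changes = [b - a for a, b in zip(rates, rates[1:])]
    return statistics.stdev(rates), statistics.stdev(changes)

# Hypothetical series for illustration only; NOT the Orlando figures.
example_rates = [10, 30, 10, 30, 10, 30]

sd_rates, sd_changes = sd_of_rates_and_changes(example_rates)
print(f"sd of rates:   {sd_rates:.1f}")
print(f"sd of changes: {sd_changes:.1f}")
```

For independent observations with a common variance, the changes have a standard deviation about sqrt(2) times that of the rates. A yardstick built from the rates therefore makes a year-to-year change look more unusual than it is.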

The notion that you can accurately assign a standard deviation to the

data, as Tim Lambert and perhaps Kleck as well do, is fraught with hazards. The problem is that in this case there is no way to determine

an accurate mean for the data – there is therefore no way to accurately

determine a standard deviation. In this case – the rape rate in each year is actually an

INDEPENDENT and entirely different measurement – potentially

reflecting entirely different conditions. An estimate of a mean and

standard deviation when you really have only single data points to

support a particular view is invalid in this case. You may make

certain assumptions to arrive at a number but these assumptions can

easily include observer bias.

This is confused. Statistical hypothesis testing works by DISPROVING

hypotheses. In this case the null hypothesis is that the observations

are drawn from the same normally distributed population. Under this

hypothesis the sample mean and standard deviation are unbiased

estimators of the underlying mean and standard deviation. The

statistical test tells us that we cannot reject the null hypothesis.

You are correct in that it might be false, but statistics can never

prove it to be true.

Simply taking y number of years around

any given data point and trying to calculate a mean and standard

deviation is simply wrong without compelling reasons based in physical

reality.

One might make compelling reasons to select a particular time range but

there are so many variables that can affect the rape rate in any given year

it is easy to be wrong.

In this particular case 1958-66 was chosen by Kleck because it was all

the UCR data available about rape rates in Orlando before the gun

training.

Tim Lambert’s comments should be entirely discounted – his numbers

are meaningless – he could select the range of data any way he pleases

to get his numbers.

Utter nonsense. I didn’t select the range of data — Kleck did. I

merely corrected the error in his calculations.

I also note that by the same reasoning Frank could also have said that

Kleck’s numbers are meaningless, but he doesn’t. Is this the pro-gun

double standard at work again?