In article fkk@leland.stanford.edu writes:

In a recent post Pim cites Tim Lambert as support for his position on the
Florida data. I’m sorry but Lambert’s analysis is flawed at its core.

No it isn’t. It appears that you don’t understand what statistical
hypothesis testing is, or what it means.

Let’s see what he says:
First of all Rick is stating that

it is supportable only in the ‘right’ time span. Yet he fails
to provide proof for such remarks. The time span is the same
data as used by Kleck. It is Kleck however who is
misrepresenting the facts by carefully selecting his data. Yet
even this is not going to help him since the data show no
support for his thesis either.

This is overstating things somewhat. The data does support his
hypothesis, in that the reported rape rate did fall. However, this
decrease was not statistically significant. This means that random
variations in the reported rape rate provide an equally good explanation.

A crime rate two standard deviations from the mean would be
statistically significant. However, the 67 rape rate was only 0.9
standard deviations less than the 58-66 mean, so this rate is not
statistically significant. On the other hand, the 66 rate was 1.7 standard
deviations above the mean, so the change from 66 to 67 was 2.6
standard deviations (of the 58-66 rate). This is NOT significant
because the standard deviation of the changes in the rates does not
equal the standard deviation of the rates. For this data, the
standard deviation of the rates from 58-66 is 12, and the standard
deviation of the changes from one year to the next in the years 58-66
is 16. The change from 66 to 67 does NOT exceed 2 standard deviations
(of the 58-66 changes in the rate).
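
The arithmetic in the paragraph above can be checked with a short sketch. Every figure comes from the text (standard deviations of 12 and 16, and the 66 and 67 rates at +1.7 and -0.9 standard deviations from the 58-66 mean). The variable names are only illustrative, and the inputs are rounded, so the result is approximate.

```python
# Figures quoted in the post (rounded); names are illustrative only.
SD_RATES = 12.0     # standard deviation of the 58-66 rates
SD_CHANGES = 16.0   # standard deviation of the 58-66 year-to-year changes
Z_1966 = 1.7        # 66 rate, in SDs of the rates above the 58-66 mean
Z_1967 = -0.9       # 67 rate, in SDs of the rates below the 58-66 mean

# The 66-to-67 change measured with the wrong yardstick (SD of the rates).
change_in_rate_sds = Z_1967 - Z_1966                       # about -2.6

# Convert to the units of the rate, then rescale by the SD of the changes.
change_in_rate_units = change_in_rate_sds * SD_RATES       # about -31.2
change_in_change_sds = change_in_rate_units / SD_CHANGES   # about -1.95

print(f"change in SDs of the rates:   {change_in_rate_sds:.2f}")
print(f"change in SDs of the changes: {change_in_change_sds:.2f}")
print("exceeds 2 SDs of the changes:", abs(change_in_change_sds) > 2)
```

Measured against the right yardstick, the drop comes out just under 2 standard deviations rather than 2.6.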

In Kleck’s statistical analysis of the Orlando data, he wrote:

“It might be suggested that Orlando had experienced erratic ups and
downs in its rape trends before and that the 1967 experience just
happened to reflect one of the brief, sharp downward swings in the
rape rate, which it had experienced before, and that the downward
swing was therefore unrelated to the gun training program.
However, this suggestion too is unlikely, since Orlando had not
experienced so large a one-year change in rape rates in its recent
past, and the decrease exceeded two standard deviations, a measure
of the variability in rape rates over the 1958-1966 period (1958
was the first year the FBI reported rape data for the Orlando
SMSA). In other words, the rape decrease was considerably larger
than would be expected on the basis of variation in the rate in the
recent past.”

This is mathematically incorrect. Kleck has confused “standard
deviation of the rates” (a measure of the variability of the rates)
with “standard deviation of the CHANGES in the rates” (a measure of
the variability of the CHANGES). His conclusion about the decrease
being considerably larger than expected is simply incorrect.
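
To make the distinction concrete, here is a minimal sketch that computes both quantities for one series. The function name and the example series are made up for illustration; they are NOT the Orlando figures.

```python
import statistics

def sd_of_rates_and_changes(rates):
    """Return (SD of the rates, SD of the year-to-year changes)."""
    # Each change is one year's rate minus the previous year's rate.
    changes = [later - earlier for earlier, later in zip(rates, rates[1:])]
    return statistics.stdev(rates), statistics.stdev(changes)

# Hypothetical, deliberately erratic series (not real crime data).
example_rates = [10, 30, 12, 28, 11, 31, 9, 29, 10]

sd_rates, sd_changes = sd_of_rates_and_changes(example_rates)
print(f"SD of the rates:   {sd_rates:.1f}")
print(f"SD of the changes: {sd_changes:.1f}")
```

For a series that bounces around a fixed level, the changes vary more than the rates do. If the yearly rates were independent draws from a single population, the variance of a change would be twice the variance of a rate, so the standard deviation of the changes would be about 1.41 times that of the rates. With a standard deviation of 12 for the rates, that gives roughly 17, close to the 16 quoted above.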

The notion that you can accurately assign a standard deviation to the
data, as Tim Lambert and perhaps Kleck as well have done, is fraught with hazards.

The problem is that in this case there is no way to determine
an accurate mean for the data, and therefore no way to accurately
determine a standard deviation.

In this case the rape rate in each year is actually an
INDEPENDENT and entirely different measurement, potentially
reflecting entirely different conditions. An estimate of a mean and
standard deviation when you really have only single data points to
support a particular view is invalid in this case. You may make
certain assumptions to arrive at a number but these assumptions can
easily include observer bias.

This is confused. Statistical hypothesis testing works by DISPROVING
hypotheses. In this case the null hypothesis is that the observations
are drawn from the same normally distributed population. Under this
hypothesis the sample mean and sample variance are unbiased
estimators of the underlying mean and variance, so the sample standard
deviation is a reasonable estimate of the underlying one. The
statistical test tells us that we cannot reject the null hypothesis.
You are correct in that it might be false, but statistics can never
prove it to be true.
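
As a rough sketch of what "cannot reject" means here, the two-sided tail probability of a standard normal can be computed with Python's `math.erfc`. The z values are the 0.9 quoted earlier for the 67 rate and about 1.95 for the 66-to-67 change (2.6 x 12 / 16, from the quoted figures). This assumes the standard deviations are known exactly and the data are normal; with so few years of data a t-based test would be more appropriate and would demand an even larger deviation.

```python
import math

def two_sided_p_value(z):
    """P(|Z| >= |z|) for a standard normal Z."""
    return math.erfc(abs(z) / math.sqrt(2))

ALPHA = 0.05  # conventional level, roughly the "two standard deviations" rule

cases = [
    ("67 rate vs 58-66 mean", -0.9),
    ("66-to-67 change vs 58-66 changes", -1.95),
]
for label, z in cases:
    p = two_sided_p_value(z)
    verdict = "reject" if p < ALPHA else "cannot reject"
    print(f"{label}: z = {z}, p = {p:.3f}, {verdict} the null hypothesis")
```

Neither value falls below the 0.05 level, so in both cases random variation remains a viable explanation.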

Simply taking y number of years around
any given data point and trying to calculate a mean and standard
deviation is simply wrong without compelling reasons based in physical
reality.
One might find compelling reasons to select a particular time range, but
there are so many variables that can affect the rape rate in any given year
that it is easy to be wrong.

In this particular case 1958-66 was chosen by Kleck because it was all
the UCR data available about rape rates in Orlando before the gun
training.

Tim Lambert’s comments should be entirely discounted – his numbers
are meaningless – he could select the range of data any way he pleases
to get his numbers.

Utter nonsense. I didn’t select the range of data — Kleck did. I
merely corrected the error in his calculations.

I also note that by the same reasoning Frank could also have said that
Kleck’s numbers are meaningless, but he doesn’t. Is this the pro-gun
double standard at work again?