Charles Scripter writes:
BTW, I notice that your web page still seems to purport that your analysis was correct, even though your friends over in sci.stat.edu pointed out that it was not correct;
That's an interesting interpretation of the discussion.
Perhaps you'd like to correct this "oversight".
No problem, here's something from one of my friends in sci.stat.edu:
Barry McDonald writes:
THE ARGUMENT ABOUT AUTOCORRELATION IN THE NSW HOMICIDE STATISTICS

One complaint by Scripter about your data was to do with the evident autocorrelation in your data. It was this that was of interest to me. You see, sometimes autocorrelation is not real but just apparent, arising from having omitted a variable from the analysis. The Minitab output shows this is the case for your data.

First I should explain that a Durbin-Watson statistic shows significant autocorrelation if D < DL, where DL is a number from tables; clear non-significant autocorrelation if D > DU, where DU is also from tables; and inconclusive results if DL <= D <= DU.

Suppose we just fit a single line to the data:

    MTB > Regress 'homocide' 1 'year';
    SUBC>   Constant;
    SUBC>   DW.

    Regression Analysis

    The regression equation is
    homocide = 47.3 - 0.0236 year

    Predictor        Coef       Stdev    t-ratio        p
    Constant        47.30       16.08       2.94    0.006
    year        -0.023646    0.008381      -2.82    0.008

    s = 0.5666      R-sq = 18.1%     R-sq(adj) = 15.8%

    Durbin-Watson statistic = 1.23

Notice that if one just fits a straight line in terms of year then the _overall_ trend is downwards!! But the Durbin-Watson statistic indicates significant autocorrelation at the 5% level (D < DL = 1.41) and nearly at the 1% level (1.21). The autocorrelation is cleared up by fitting a more appropriate analysis (see after graphs).

    MTB > GStd.
    MTB > Plot 'homocide' 'year';
    SUBC>   Symbol 'x'.

    [Character plot: homocide (axis 0.80 to 3.20) against year (axis 1904.0 to 1932.0)]

    MTB > GPro.
    MTB > GStd.
    MTB > Plot 'FITS1' 'year';
    SUBC>   Symbol 'x'.
    [Character plot: FITS1 (axis 1.50 to 2.40) against year (axis 1904.0 to 1932.0)]

    MTB > GPro.

Allowing for a change in intercept level with the law change in 1920:

    MTB > Regress 'homocide' 2 'year' 'lawchnge';
    SUBC>   Constant;
    SUBC>   DW.

    Regression Analysis

    The regression equation is
    homocide = - 40.4 + 0.0223 year - 1.18 lawchnge

    Predictor        Coef       Stdev    t-ratio        p
    Constant       -40.37       26.94      -1.50    0.143
    year          0.02233     0.01410       1.58    0.122
    lawchnge      -1.1769      0.3111      -3.78    0.001

    s = 0.4841      R-sq = 41.9%     R-sq(adj) = 38.6%

    Durbin-Watson statistic = 1.63

There is clear evidence _not_ to reject the hypothesis of zero autocorrelation if D > DU = 1.52. This is indeed the case, so the addition of this extra variable has simultaneously given us a much more believable analysis (see two lines below: not going down this time!!) and removed the apparent autocorrelation. The choice of this cut point (1920) has had a very significant effect (p-value approx 0.001). Note however that the trend in year is not significantly different to zero.

    MTB > GStd.
    MTB > Plot 'FITS2' 'year';
    SUBC>   Symbol 'x'.

    [Character plot: FITS2 (axis ticks 1.60, 2.00, 2.40) against year (axis 1904.0 to 1932.0); the fitted values form two separate lines]

If we just assume constant rates of homicides before 1921 and constant after 1920 (i.e. zero slopes), then we retain a significant drop in the level of homicides, and the autocorrelation is still not significant (though I would be cautious). The conclusion from this test is exactly the one you were trying to reach with a two-sample t-test, except that in this analysis we have the added feature of checking whether the autocorrelation is significant.

    MTB > Regress 'homocide' 1 'lawchnge';
    SUBC>   Constant;
    SUBC>   DW.
    Regression Analysis

    The regression equation is
    homocide = 2.28 - 0.753 lawchnge

    Predictor        Coef       Stdev    t-ratio        p
    Constant       2.2762      0.1078      21.11    0.000
    lawchnge      -0.7527      0.1612      -4.67    0.000

    s = 0.4941      R-sq = 37.7%     R-sq(adj) = 36.0%

    Analysis of Variance

    SOURCE        DF          SS          MS         F        p
    Regression     1      5.3221      5.3221     21.80    0.000
    Error         36      8.7887      0.2441
    Total         37     14.1108

    Unusual Observations
    Obs.  lawchnge  homocide       Fit  Stdev.Fit  Residual  St.Resid
      4       0.00    1.0000    2.2762     0.1078   -1.2762    -2.65R

    R denotes an obs. with a large st. resid.

    Durbin-Watson statistic = 1.52

(D = DU = 1.52, so there is no proof of autocorrelation.)

Of course this analysis does not clear up Scripter's complaint that (in his eyes) the law change year is irrelevant to homicides and so an adjacent year could be used to give a similar significant result. That is a causal matter that I cannot comment on as a statistician, except that since a highly significant effect is apparent in the data, it really behooves him to come up with a better explanation than yours, especially as to why he postulates any other year as the change point.
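For anyone who wants to see the mechanism McDonald describes without a copy of Minitab, here is a rough sketch in Python. It is not his analysis and it does not use the NSW figures: the series is synthetic, invented purely for illustration (a flat level that drops after 1920, plus small random noise), and the helper names durbin_watson and ols_residuals are my own. It only aims to show how leaving out a step variable can produce apparent autocorrelation, and how the Durbin-Watson statistic moves back toward 2 once the variable is included. DL and DU still have to be looked up in tables.

```python
import numpy as np


def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences
    of the residuals divided by their sum of squares (range 0 to 4)."""
    e = np.asarray(residuals, dtype=float)
    return float(np.sum(np.diff(e) ** 2) / np.sum(e ** 2))


def ols_residuals(y, *predictors):
    """Least-squares fit of y on a constant plus the given predictors;
    returns (coefficients, residuals)."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, y - X @ coef


# SYNTHETIC data, invented for illustration only (not the NSW figures):
# a flat series that drops by 1.0 after 1920, plus small random noise.
rng = np.random.default_rng(0)
year = np.arange(1900, 1938, dtype=float)   # 38 hypothetical years
lawchnge = (year > 1920).astype(float)      # 0 up to 1920, 1 afterwards
y = 2.0 - 1.0 * lawchnge + rng.normal(0.0, 0.1, size=year.size)

_, resid_line = ols_residuals(y, year)            # straight line only
_, resid_step = ols_residuals(y, year, lawchnge)  # line plus step dummy

dw_line = durbin_watson(resid_line)
dw_step = durbin_watson(resid_step)
print(f"DW, year only:       {dw_line:.2f}")
print(f"DW, year + lawchnge: {dw_step:.2f}")
```

On this made-up series the year-only fit should give a Durbin-Watson value well below the one for the fit that includes the step variable, which is the same pattern as in the Minitab runs above. The exact numbers depend on the invented data and say nothing about the real homicide rates.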




