Which fits better? Gradual decline or abrupt decrease?

Frank Crary said:

In an effort to clear up this statistical game, I'm posting a detailed
comparison of Mr. Lambert's and my models of the crime rate in New South
Wales, between 1910 and 1930.

The data, taken from the graph he posted on the 15th of this month, is:

[Numbers deleted]

(Please correct me if I'm in error, Mr. Lambert's ascii graph reached me in
a slightly garbled form.)

Eeek! About half of those numbers are incorrect. I guess ASCII
graphs are not the most robust way to transmit information. I have
appended the correct numbers to the end of this posting, so that my
calculations can be checked.

[Excellent and detailed analysis showing a steady decline fits the
data better than an abrupt change deleted]

I'll repeat the calculation using the correct figures.

The first model is a constant rate from 1910-1920, and another
constant rate from 1921-1930.

The best such model will be obtained if we use the 1910-1920 mean
(2.33) for the 1910-1920 rate and the 1921-1930 mean (1.49) for the
1921-1930 rate.

Summing the squares of the deviations of the model from the actual
rates gives me 2.85.
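
A minimal sketch of this calculation, for anyone who wants to check it. It assumes the NSW figures for 1910-1930 from the table at the end of this posting; the names `sse_about_mean`, `before` and `after` are made up for illustration.

```python
# NSW rates for 1910-1930, copied from the table at the end of this posting.
years = list(range(1910, 1931))
nsw = [2.6, 2.7, 2.4, 2.1, 2.4, 2.6, 2.1, 1.9, 1.4, 2.7, 2.7,
       1.8, 1.6, 1.0, 0.9, 1.1, 1.6, 1.6, 1.9, 1.7, 1.7]

def mean(values):
    return sum(values) / len(values)

def sse_about_mean(values):
    """Sum of squared deviations of the values from their own mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

# Split the series at the 1920/1921 boundary used by the first model.
before = [r for y, r in zip(years, nsw) if y <= 1920]
after = [r for y, r in zip(years, nsw) if y >= 1921]

# For a constant-rate segment, the mean is the least-squares choice,
# so the model's total error is the sum of the two within-segment errors.
step_sse = sse_about_mean(before) + sse_about_mean(after)
print(round(mean(before), 2), round(mean(after), 2), round(step_sse, 2))
# prints: 2.33 1.49 2.85
```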

The second model is a gradual decline.

The best such model can be found by doing a linear regression. This
gives a starting value of 2.52, declining by 0.059 annually.

Summing the squares of the deviations of the model from the actual
rates gives me 3.80.
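
The regression can be checked the same way. This is a sketch using the ordinary least-squares formulas written out by hand, again assuming the NSW figures for 1910-1930 from the table at the end of this posting; measuring time as years since 1910 is my own choice, so that the intercept is the 1910 starting value.

```python
# NSW rates for 1910-1930, copied from the table at the end of this posting.
nsw = [2.6, 2.7, 2.4, 2.1, 2.4, 2.6, 2.1, 1.9, 1.4, 2.7, 2.7,
       1.8, 1.6, 1.0, 0.9, 1.1, 1.6, 1.6, 1.9, 1.7, 1.7]
xs = list(range(len(nsw)))  # years since 1910, so x = 0 is 1910

n = len(nsw)
x_bar = sum(xs) / n
y_bar = sum(nsw) / n

# Ordinary least-squares slope and intercept for a straight line.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, nsw))
s_xx = sum((x - x_bar) ** 2 for x in xs)
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# Sum of squared deviations of the fitted line from the actual rates.
line_sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, nsw))
print(round(intercept, 2), round(slope, 3), round(line_sse, 2))
# prints: 2.52 -0.059 3.8
```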

Since 2.85 is smaller than 3.80, we can conclude that the "abrupt change in
1920" model fits the data better than the "gradual change" model.

Incidentally, if we consider all "abrupt change" models, the one that
fits the data best is the one where the change occurs in 1920.
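
A sketch of that search, under two assumptions of mine: only the 1910-1930 window is considered, and "the change occurs in 1920" means 1920 is the last year at the old rate (matching the 1910-1920 / 1921-1930 split used above). The function names are made up for illustration.

```python
# NSW rates for 1910-1930, copied from the table at the end of this posting.
years = list(range(1910, 1931))
nsw = [2.6, 2.7, 2.4, 2.1, 2.4, 2.6, 2.1, 1.9, 1.4, 2.7, 2.7,
       1.8, 1.6, 1.0, 0.9, 1.1, 1.6, 1.6, 1.9, 1.7, 1.7]

def sse_about_mean(values):
    """Sum of squared deviations of the values from their own mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def step_sse(last_old_year):
    """Error of the two-level model whose old rate runs through last_old_year."""
    old = [r for y, r in zip(years, nsw) if y <= last_old_year]
    new = [r for y, r in zip(years, nsw) if y > last_old_year]
    return sse_about_mean(old) + sse_about_mean(new)

# Try every split that leaves at least one year on each side.
candidates = range(1910, 1930)
best_year = min(candidates, key=step_sse)
print(best_year, round(step_sse(best_year), 2))
# prints: 1920 2.85
```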

      NSW  Qld
1900  1.5  7.0
1901  1.4  9.0
1902  2.3  7.5
1903    1  7.1
1904    3  7.5
1905  1.8  6.6
1906  2.1  6.1
1907    3  7.4
1908    3  6.8
1909  3.1  4.4
1910  2.6  4.6
1911  2.7  4.0
1912  2.4  3.1
1913  2.1  4.4
1914  2.4  4.0
1915  2.6  3.9
1916  2.1  4.2
1917  1.9  4.9
1918  1.4  5.3
1919  2.7  3.8
1920  2.7  3.5
1921  1.8  3.8
1922  1.6  2.9
1923    1  3.6
1924  0.9  5.1
1925  1.1  3.7
1926  1.6  5.1
1927  1.6  6.1
1928  1.9  3.8
1929  1.7  3.7
1930  1.7  2.6
1931  1.6  2.4
1932  1.3  3.3
1933  1.5  2.5
1934  1.4  2.5
1935  1.8  2.4
1936  1.4  1.5
1937    2  2.5