Deltoid

Goodness of fit of abrupt change model

Profile

Tim Lambert (deltoidblog AT gmail.com) is a computer scientist at the University of New South Wales.


Category: NSW
Posted on: February 6, 1992 3:00 AM, by Tim Lambert

My model has two parameters (pre-1920 rate, post-1920 rate). Your model has four parameters (starting rate, first decrease, second decrease, year that the rate of decrease changed). The more parameters a model has, the easier it is to fit the data.
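As a rough illustration of what a two-parameter abrupt change model looks like, here is a minimal Python sketch. It assumes the two rates are estimated as plain averages of the years before and from the break year onward, which the post does not state. The function name `fit_step_model` and the sample rates are made up for illustration and are not the NSW data.

```python
def fit_step_model(years, rates, break_year=1920):
    """Fit a two-level model: one mean rate before break_year, one from it onward.

    Assumption: the break year itself is counted in the "post" period.
    """
    pre = [r for y, r in zip(years, rates) if y < break_year]
    post = [r for y, r in zip(years, rates) if y >= break_year]
    pre_rate = sum(pre) / len(pre)
    post_rate = sum(post) / len(post)
    # The model predicts the same value for every year within a period.
    predicted = [pre_rate if y < break_year else post_rate for y in years]
    return pre_rate, post_rate, predicted


# Made-up rates for illustration only; these are NOT the NSW figures.
years = [1917, 1918, 1919, 1920, 1921, 1922]
rates = [2.4, 2.6, 2.5, 1.9, 2.1, 2.0]
pre_rate, post_rate, predicted = fit_step_model(years, rates)
```

The only fitted quantities are `pre_rate` and `post_rate`; the break year is fixed in advance rather than estimated from the data.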

Frank Crary said:

However, no one is restricting the number of free parameters in your model, except yourself: You are (or were) using this data to support your assertions that: The homicide rate in New South Wales dropped suddenly after the introduction of gun control laws in 1920, and that there was no pre-existing trend toward lower rates. The only thing restricting the number of free parameters in your model is these assertions. If models based on this do not fit the data well, that would imply that these assertions are not accurate. If you can post a more accurate model, which is still consistent with your theory, please do.

A model with as many parameters as data points will fit the data perfectly. Are these the best models? No, of course not. We should prefer a model with as few parameters as possible. The only reason to consider a three-parameter model is if we can't find a two-parameter model that fits the data adequately.
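The claim that a model with as many parameters as data points fits perfectly can be checked with a small Python sketch. It uses Lagrange interpolation, where a polynomial through n points has n coefficients, as one example of such a model. The data values are made up, and `lagrange_predict` is a hypothetical helper name.

```python
def lagrange_predict(xs, ys, x):
    """Evaluate at x the unique polynomial of degree len(xs)-1 through all points."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        weight = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                weight *= (x - xj) / (xi - xj)
        total += yi * weight
    return total


# Made-up observations: five points, so the polynomial has five parameters.
xs = [0, 1, 2, 3, 4]
ys = [2.4, 2.6, 1.9, 2.1, 2.0]

# Residuals at the observed points are all zero: a "perfect" fit.
residuals = [y - lagrange_predict(xs, ys, x) for x, y in zip(xs, ys)]
```

A zero residual at every point says nothing about whether the model is any good, because the same thing would happen for any five values at all.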

To test goodness of fit, we need to find the chi-square value:

chisquare = sum( ((o[i]-p[i])/sd[i])^2 ) where i=1,2,...,n

o[i] is the observed value at i, p[i] is the value predicted by the model, and sd[i] is the standard deviation of o[i].
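The chi-square sum can be written as a short Python function. This is a sketch using the same names as the formula (`o`, `p`, `sd`); the function name `chi_square` and the sample numbers are illustrative only. Each residual is divided by its standard deviation before squaring.

```python
def chi_square(o, p, sd):
    """Sum of squared standardized residuals: sum(((o[i] - p[i]) / sd[i])**2)."""
    if not (len(o) == len(p) == len(sd)):
        raise ValueError("o, p and sd must have the same length")
    return sum(((oi - pi) / si) ** 2 for oi, pi, si in zip(o, p, sd))


# Made-up numbers: residuals of 1 and 2 with standard deviations 1 and 2,
# so each term contributes exactly 1.
example = chi_square([2.0, 4.0], [1.0, 2.0], [1.0, 2.0])
```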

Estimating sd[i] is the tricky part. I assume that homicides are Poisson distributed. This means that the variance is the same as the expected number of homicides. We still need to know the expected number of homicides. Using the model we are testing to tell us this would be naughty, so I just took the average over the period 1910-1930. The resulting standard deviations are at the end of this posting.
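One way to turn the Poisson assumption into a standard deviation for a rate is sketched below in Python. It assumes the rate is expressed per 100,000 population and that a population figure is available for each year. The post states neither of these, so treat both as assumptions, and `poisson_rate_sd` as a hypothetical helper name. If the count is Poisson with mean lambda, its standard deviation is sqrt(lambda), and dividing by the population scales that to a rate.

```python
import math


def poisson_rate_sd(mean_rate, population, per=100_000):
    """Standard deviation of a rate when the underlying count is Poisson.

    mean_rate  -- expected rate per `per` people (assumed unit: per 100,000)
    population -- population in that year (hypothetical input)
    """
    scale = population / per               # how many units of `per` people
    expected_count = mean_rate * scale     # Poisson mean, which equals its variance
    return math.sqrt(expected_count) / scale


# Hypothetical figures: a rate of 2.0 per 100,000 in a population of 2,000,000
# gives an expected count of 40, so the sd of the rate is sqrt(40) / 20.
sd_example = poisson_rate_sd(2.0, 2_000_000)
```

Under these assumptions the standard deviation of the rate shrinks as the population grows, even when the expected rate stays the same.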

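For my model, over the period 1910-1930, the resulting chi-square statistic is 24.6 with 19 degrees of freedom (21 data points less 2 fitted parameters). The probability of getting a chi-square value at least this large by chance, if the model is correct, is 0.17.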
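The tail probability for a chi-square statistic can be checked without a statistics package. The Python sketch below uses the standard series expansion of the regularized lower incomplete gamma function, which is one common way to compute it and not necessarily how the figure in the post was obtained. The function name `chi2_sf` is hypothetical.

```python
import math


def chi2_sf(x, dof):
    """Upper-tail probability P(X >= x) for a chi-square variable with dof degrees of freedom."""
    if x <= 0:
        return 1.0
    a = dof / 2.0
    z = x / 2.0
    # Series: sum over n of z**n / (a * (a + 1) * ... * (a + n))
    term = 1.0 / a
    total = term
    n = a
    for _ in range(1000):
        n += 1.0
        term *= z / n
        total += term
        if abs(term) < abs(total) * 1e-15:
            break
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return 1.0 - lower


# Statistic and degrees of freedom quoted in the post.
p_value = chi2_sf(24.6, 19)
```

With 24.6 and 19 degrees of freedom this comes out at about 0.17. A small value, conventionally below 0.05, would be grounds for rejecting the model.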

I conclude that my model gives a good fit to the data, and there is no reason to consider models with more parameters.

Standard deviations for NSW homicide rate

1910  0.34
1911  0.34
1912  0.33
1913  0.33
1914  0.32
1915  0.32
1916  0.32
1917  0.32
1918  0.32
1919  0.31
1920  0.31
1921  0.30
1922  0.30
1923  0.30
1924  0.29
1925  0.29
1926  0.29
1927  0.28
1928  0.28
1929  0.28
1930  0.28