Poisson Hockey

By evolgen on February 9, 2006.

If you like sports (specifically hockey) and you like statistics, two posts from Tom Benjamin's NHL Blog are must-reads (available here and here). With help from Dave Savit, a math professor at the University of Arizona, Tom describes how hockey can be modeled using a Poisson distribution. There are also Poisson Standings for the NHL season. Some have called this Moneyball for hockey. More stuff below the fold.

The idea of the model is that goals can be considered Poisson random variables. You can calculate the expected number of goals scored by a team in a single game by dividing the number of goals it scored over the entire season by the number of games played. The same can be done for individual players. We can also calculate the expected number of goals in any length of time (a single period, two periods, a five-minute stretch). The expected number of goals scored by a team in a single game can then be used to predict the winners of individual contests.
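Here is a minimal sketch of that arithmetic, assuming a constant scoring rate throughout a 60-minute game. The season totals are made up for illustration, and the function names are mine, not anything from Tom's posts.

```python
import math


def goals_per_game(season_goals, games_played):
    """Estimate the Poisson rate (expected goals per game) from season totals."""
    return season_goals / games_played


def scale_rate(rate_per_game, minutes, game_minutes=60.0):
    """Scale a per-game rate to any stretch of time (assumes a constant rate)."""
    return rate_per_game * minutes / game_minutes


def poisson_pmf(k, lam):
    """Probability of exactly k goals when the expected number of goals is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


# Hypothetical team: 246 goals in an 82-game season.
lam_game = goals_per_game(246, 82)      # expected goals per game
lam_period = scale_rate(lam_game, 20)   # expected goals per 20-minute period
lam_five = scale_rate(lam_game, 5)      # expected goals per five-minute stretch

print(lam_game, lam_period, lam_five)
print(poisson_pmf(0, lam_game))         # chance of being shut out in a game
```

The same functions work for a single player: plug in his season goal total and games played.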

Saying that the outcome of a game is random is a bit misleading. Each team has its own probability distribution for goals scored per game. The model predicts who will win a matchup between two teams. Sometimes the 'worse' team will win because of the variance around the expected number of goals scored. You can think of each game as a flip of a weighted coin -- the weighting depends on which team scores more goals per game. This is not very remarkable (the team that tends to score more goals per game tends to win more games), but it indicates that a team's performance is consistent from game to game. As Tom points out:
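To put a number on the weighted coin, here is a rough sketch that treats the two teams' goal totals as independent Poisson variables and adds up the probability of every possible score. The independence assumption, the scoring rates, and the function names are mine for illustration; the sketch also ignores defense and overtime.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k goals when the expected number of goals is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def matchup_probabilities(lam_a, lam_b, max_goals=30):
    """Return (P(A wins), P(tie), P(B wins)) after regulation.

    Assumes independent Poisson goal counts and ignores scores above
    max_goals, which are vanishingly unlikely at hockey scoring rates.
    """
    p_a = p_tie = p_b = 0.0
    for a in range(max_goals + 1):
        prob_a = poisson_pmf(a, lam_a)
        for b in range(max_goals + 1):
            p = prob_a * poisson_pmf(b, lam_b)
            if a > b:
                p_a += p
            elif a == b:
                p_tie += p
            else:
                p_b += p
    return p_a, p_tie, p_b


# Hypothetical rates: team A averages 3.2 goals per game, team B averages 2.6.
win_a, tie, win_b = matchup_probabilities(3.2, 2.6)
print(win_a, tie, win_b)
```

The 'worse' team's share of that probability is what shows up as upsets.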

"This assumption flies in the face of many hockey myths around things like clutch play and the apparent ability some players have to rise to the occasion. The fact is that somebody has to come through in the clutch and that somebody is randomly selected by the hockey gods. This idea makes us feel uncomfortable because it is disturbing to realize that so many things in life are beyond our control. It means hockey - indeed life - in the short run is about luck and probabilities. Skill only outs in the long run and even a season is a relatively short time period."

Other sports could probably be treated in a similar manner. I could definitely see soccer and possibly baseball fitting the Poisson model. Football (American style) and basketball would be trickier because the offensive output of a team depends so much on the defensive capabilities of its opponent (this may be the case in baseball as well).

What we must remember is that a single poor (or excellent) performance could be due to deviations from the mean, and that these deviations tend to balance out over a complete season. In sports like hockey, baseball and basketball, where teams play many games in a single season (82 in the NHL and NBA, 162 in Major League Baseball), the best teams will usually finish with the best records. This is because the more trials you run (each game is a trial), the lower the variance of a team's average result. In the NFL (16-game seasons) there is a greater chance that the best teams will not have the best records (assuming points are determined by a Poisson process). In single-game playoffs (the NCAA basketball tournament, for example) the better team has even less chance of coming out on top. The benefit of playing best-of-seven series in the major American sports is that we increase the probability that the better team will win.
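The series effect is easy to check with a short sketch. It assumes every game is an independent trial with the same win probability for the better team (no home ice, no injuries), and the 55% figure is just an illustrative guess.

```python
from math import comb


def series_win_probability(p_game, best_of=7):
    """Probability of winning a best-of-N series given a per-game win probability.

    Counts the outcomes in which the team wins a majority of N games. This
    equals the real series probability because playing out the remaining
    games after a series is decided cannot change who won it.
    """
    wins_needed = best_of // 2 + 1
    return sum(
        comb(best_of, k) * p_game ** k * (1 - p_game) ** (best_of - k)
        for k in range(wins_needed, best_of + 1)
    )


# Hypothetical better team that wins 55% of individual games.
for n in (1, 3, 5, 7):
    print(n, series_win_probability(0.55, best_of=n))
```

The longer the series, the further the better team's chances climb above its single-game odds.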

Keep this in mind as you watch the Winter Olympics the next couple of weeks or the World Cup this summer. It makes me wonder whether the Miracle was a great athletic achievement or an odd draw from a Poisson distribution.

(Via Can't Stop the Bleeding and The Sports Frog)

"Other sports could probably be treated in a similar manner. I could definitely see soccer and possibly baseball fitting the Poisson model. Football (American style) and basketball would be trickier."

Exactly. The Poisson is good at modeling rare events, which is how one could consider a goal in hockey or soccer (the probability of scoring each time down the field or ice is pretty low).

For sports where scoring is more common (basketball, football), you could use a binomial, but the fact that there are different ways to score (field goal vs. touchdown) would make it a bit more difficult. But pretty fun to think about in any case.

Interesting!

Has anybody extended this model so that it doesn't just model goals made, but also goals allowed (and, by implication, net goals)? If goals made are Poisson distributed, so should goals allowed, right?

It's already established that the Pythagorean Theorem of baseball is pretty accurate over a season. For any team, expected winning percentage = runs scored^2 / (runs scored^2 + runs allowed^2). Over a season, it does a pretty good job of coming within 4 games of the actual record.

See this Baseball Prospectus article at http://tinyurl.com/dletq (unfortunately missing the graphs) for much more info about the Pythagorean Theorem of baseball, finding the correct exponents, and the applicability of the Poisson distribution to baseball.
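For anyone who wants to try the Pythagorean formula above, here is a small sketch. The run totals are invented, and the exponent of 2 is the classic version; the linked article discusses better-fitting exponents.

```python
def pythagorean_expectation(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed."""
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return scored / (scored + allowed)


# Hypothetical team: 800 runs scored, 700 runs allowed over 162 games.
win_pct = pythagorean_expectation(800, 700)
expected_wins = win_pct * 162
print(win_pct, expected_wins)
```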

Ummm.. why do people keep making this error - Yes a Poisson random variable will follow a Poisson distribution, but that does not mean that something following a Poisson distribution discretely is a random variable. I used to see this all the time in statistical quality control where people would throw up their hands at a problem, then when some underlying cause was found, bingo the distribution changed radically. The quote above is actually funny:

"This assumption flies in the face of many hockey myths around things like clutch play and the apparent ability some players have to rise to the occasion. The fact is that somebody has to come through in the clutch and that somebody is randomly selected by the hockey gods."

Maybe they are better players? Some do score better than others - even in clutch situations. If a team got better their goals per game would go up, is that just the Hockey gods also? The fact that individual players generally score goals roughly the same over different time periods only shows that their talent and coaching is roughly the same over different time periods - Look at coaching changes where the winning percentage went up, what do you know the scoring went up also... they would still follow a Poisson, just a different one.

The fact that you want your best scorer to have the puck in a clutch situation doesn't mean they are better in the clutch than otherwise; it means they are better in the clutch than anyone else you've got. So a coach would be right to say they are better in the clutch...

All this is really saying is that in any given game you can't predict who will do well except with probabilities, and that evidently talent and coaching do matter because different teams score and defend differently. The randomness does not seem to be "real" because personnel and coaching changes do make a difference; it is just the effect of complexity.

Markk,

Sure, the wording is sloppy, but the point was that goal scoring does not differ from game to game, period to period, or shift to shift. The probability that a player scores a goal in the last few minutes of a game is, essentially, the same as at any other time during the game -- they're working to disprove the notion that there is such a thing as "clutch" performance.
