…you’ve got to ask yourself one question: ‘Do I feel lucky?’ Well, do ya, punk?
- Dirty Harry
The laws of probability, like most of the mathematical rules that govern the world, are a relatively recent discovery. Ancient people like the Romans loved to gamble as much as we do, and they had at least some idea of how certain kinds of odds worked, but they’d probably have been flummoxed by many of the mathematical tools we use today to study chance. But then again, how many people at your average casino understand how to calculate the probabilities that govern the flow of their money? Well, probably more than in the general population. But I bet it’s still not that many.
Because there’s so much money involved, problems involving gambling are very popular in introductory probability classes. Take, for instance, the Texas Lottery. The odds of winning are supposedly 1 in 25,827,165. Millions of people buy tickets for each drawing, so what is the probability of having n winners given that N tickets are purchased?
The exact answer is given by the binomial distribution. However, the binomial distribution involves taking factorials of its parameters, and when the parameters are numbers like 25,827,165, computing those factorials directly becomes pretty much impossible. But there’s a very good approximation that makes life a lot easier whenever a low-probability event is repeated many times. It’s the Poisson distribution, named after the early-19th-century French mathematician Siméon-Denis Poisson. It’s our Sunday Function:
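For reference, here is the standard textbook form of the Poisson distribution, written with the symbols explained in the next paragraph, alongside the binomial formula it approximates. The symbol p for the chance that a single ticket wins is an assumption introduced here for illustration; it does not appear in the original text.

```latex
% Exact binomial probability of n winners among N tickets,
% each winning independently with probability p (p is illustrative notation)
P_{\text{binomial}}(n) = \binom{N}{n}\, p^{n} (1 - p)^{N - n}

% Poisson approximation, good when N is large and p is small
P(n; \lambda) = \frac{\lambda^{n} e^{-\lambda}}{n!}, \qquad \lambda = N p
```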
Here n is the number of successes and λ (lambda) is the expected number of successes. The expected number of successes is just the number of trials N times the probability of any given trial being successful. Here “successes” is a term of art meaning just “the low-probability event happens”. Its connotation works well when we’re talking about winning the lottery, but not necessarily in other cases – you could just as well model something like train derailments as a Poisson process, after all.
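To get a feel for how good the approximation is, here is a rough sketch that computes both the exact binomial probability and the Poisson value. It sidesteps the giant factorials by working with logarithms of the gamma function, which is one common workaround and not necessarily how anyone would have done it by hand. The ticket count of 10 million is purely hypothetical, and the function names are my own.

```python
import math

def poisson_pmf(n, lam):
    """Poisson probability of exactly n successes when lam are expected."""
    # Work in log space so the factorial never has to be formed directly.
    return math.exp(n * math.log(lam) - lam - math.lgamma(n + 1))

def binomial_pmf(n, trials, p):
    """Exact binomial probability of n successes in `trials` independent tries."""
    # log of "trials choose n", via log-gamma instead of huge factorials
    log_choose = (math.lgamma(trials + 1) - math.lgamma(n + 1)
                  - math.lgamma(trials - n + 1))
    # log1p(-p) keeps precision when p is tiny
    return math.exp(log_choose + n * math.log(p)
                    + (trials - n) * math.log1p(-p))

ODDS = 25_827_165       # one winner in this many tickets (from the text)
TICKETS = 10_000_000    # hypothetical number of tickets sold (assumption)

p = 1 / ODDS            # chance that a single ticket wins
lam = TICKETS * p       # expected number of winners

for n in range(4):
    print(n, binomial_pmf(n, TICKETS, p), poisson_pmf(n, lam))
```

With numbers this lopsided, the two columns should agree to several decimal places, which is the whole appeal of the approximation.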
The population of Texas is about 24,782,302 according to Wikipedia. So let’s pretend every one of them buys a single ticket and they don’t collude in picking their numbers. Multiplying that by the probability of winning gives an expected number of wins: λ = 0.959544. Use this as our parameter and plot:
This plots the probability of having n winners, for n on the x axis.* As expected, it falls off rapidly. If more people played – say 40 million – the bump would be shifted over to the right as it became more likely there’d be more winners. As it is, under these conditions the probabilities of n winners are:
0 – 38.3%
1 – 36.8%
2 – 17.6%
3 – 5.6%
4 – 1.4%
5 – 0.3%
And so on, rapidly heading toward zero. In practice I’m sure far fewer people than the entire population are buying tickets, so for most drawings the prize ought to roll over or otherwise go unclaimed. As of now the jackpot is about $35 million, but I think I’ll be saving my money.
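The numbers above are easy to check. This is a minimal sketch using the population and odds quoted in the text, with the same one-ticket-per-person assumption; the variable and function names are my own.

```python
import math

POPULATION = 24_782_302   # Texas population quoted in the text
ODDS = 25_827_165         # quoted odds: 1 in this many

# Expected number of winners if every person buys exactly one ticket
lam = POPULATION / ODDS

def poisson_pmf(n, lam):
    """Poisson probability of exactly n winners when lam are expected."""
    return lam ** n * math.exp(-lam) / math.factorial(n)

table = [(n, poisson_pmf(n, lam)) for n in range(6)]

print(f"lambda = {lam:.6f}")
for n, prob in table:
    print(f"{n} - {prob:.1%}")
```

Changing POPULATION to something like 40 million shows the shift described earlier: the probability of zero winners drops and the larger counts become more likely.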
*As a commenter points out, while the function is defined for all real n, the Poisson distribution itself is only valid for non-negative integer n. Plugging in n = 3.14159… wouldn’t tell you anything meaningful since it’s not possible to have anything other than a whole number of winners. The reason I’ve plotted the function continuously instead of just at integral values is so that the overall behavior of the function itself – particularly the location of the maximum – is most clear. From there it’s no difficult thing to understand that in reality it’s usually just the integers we’re interested in.
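For the curious, here is one way to sketch that continuous version. It assumes the factorial is extended to non-integers with the gamma function, n! = Γ(n + 1), which is the usual convention but is my assumption about how the plot was drawn. The grid search is a crude stand-in for proper maximization.

```python
import math

LAM = 0.959544   # expected number of winners, from the text

def poisson_curve(x, lam=LAM):
    """Poisson formula with n! replaced by gamma(x + 1), so x can be any real > -1."""
    return math.exp(x * math.log(lam) - lam - math.lgamma(x + 1))

# Coarse grid search for the peak of the continuous curve on [0, 5]
xs = [i / 1000 for i in range(5001)]
x_peak = max(xs, key=poisson_curve)

print(f"continuous peak near x = {x_peak:.3f}")
print(f"curve at n = 0: {poisson_curve(0):.4f}")
print(f"curve at n = 1: {poisson_curve(1):.4f}")
```

The peak of the smooth curve lands between 0 and 1, but only the values at whole numbers are actual probabilities.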