Sunday Function

Head down to Box Office Mojo and pull up the list of the top-grossing films of the year thus far. Seven of the top ten have a dollar gross beginning with the number 1. Okay, that's not too weird. Big films tend to pull down somewhere between $100 million and $200 million, while only the real monsters gross more. So what if we look at the inflation-adjusted all-time list, which is less likely to be skewed by the coincidental size of the film-going public and ticket prices? Again, seven of the ten have grosses beginning with 1.

Well, maybe movies are just weird. What about cities? In the US, five of the top ten cities have a population figure which begins with a 1.

Maybe cities are just weird too. How about election results? If you rank the states in the 2008 US presidential election by Obama's vote total, none of the top ten have totals beginning with 1, but then again, all of the next ten do.

Why this preponderance of numbers that happen to start with 1? Is it just an artifact of the data sets I've picked, or something more interesting? Try a thought experiment:

Pick a number, say, one million. Write it out in decimal notation and it reads 1,000,000. Its first digit is the number 1. If you increase or decrease 1,000,000 by ten percent, you get 1,100,000 or 900,000, which start with 1 and 9 respectively. If you increase or decrease 1,000,000 by twenty percent, you get 1,200,000 or 800,000, which start with 1 and 8 respectively. If you increase or decrease 1,000,000 by thirty percent, you get 1,300,000 or 700,000, which start with 1 and 7 respectively.

Continue this exercise and the pattern holds. Every one of the million numbers from 1,000,000 to 1,999,999 starts with 1, but the million numbers below 1,000,000 can start with just about anything, including 1.
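
The thought experiment is easy to run mechanically. Here is a minimal Python sketch that carries it from ten percent up to ninety percent; the helper names `leading_digit` and `thought_experiment` are made up for illustration.

```python
def leading_digit(n: int) -> int:
    """Return the first decimal digit of a positive integer."""
    return int(str(n)[0])


def thought_experiment(base: int = 1_000_000, percents=range(10, 100, 10)):
    """For each percentage, raise and lower `base` by that much and
    record the leading digit of each result."""
    rows = []
    for p in percents:
        up = base * (100 + p) // 100    # e.g. 1,100,000 for p = 10
        down = base * (100 - p) // 100  # e.g.   900,000 for p = 10
        rows.append((p, leading_digit(up), leading_digit(down)))
    return rows


if __name__ == "__main__":
    for p, up_digit, down_digit in thought_experiment():
        print(f"+{p}% starts with {up_digit}, -{p}% starts with {down_digit}")
```

Every increase lands on a leading 1, while the decreases walk through 9, 8, 7 and so on down to 1.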

Obviously, had you started with (say) 3,000,000, the effect would be much less pronounced, but it would still be there. It's possible to rigorously analyze this sort of thing, and the result is Benford's Law, which gives the probability that a random number has leading digit d:

$$P(d) = \log_{10}\left(1 + \frac{1}{d}\right), \quad d = 1, 2, \ldots, 9$$

Plotted, this distribution falls steadily from about 30% for a leading 1 to under 5% for a leading 9.
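
A rough text rendering of the distribution can be sketched in Python, assuming the standard statement of Benford's Law, P(d) = log10(1 + 1/d) for leading digit d. The function name `benford` and the one-`#`-per-percent scaling are just illustrative choices.

```python
import math


def benford(d: int) -> float:
    """Benford probability that the leading decimal digit is d (1 through 9)."""
    return math.log10(1 + 1 / d)


if __name__ == "__main__":
    for d in range(1, 10):
        p = benford(d)
        # One '#' per percentage point, rounded to the nearest whole point.
        print(f"{d}  {p:6.1%}  {'#' * round(p * 100)}")
```

The nine probabilities sum to 1, since the sum telescopes to log10(10).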


From Benford's Law, you'd expect around 30% of leading digits to be the number 1. Not every set of randomly chosen integers satisfies the conditions required for Benford's Law and its odd preponderance of 1s, but lots of them do. In the financial industry, the law has even been used to search for fraud. Humans are generally terrible at making up random numbers that act anything like actual random numbers, and as a result the figures they make up when cooking the books don't tend to satisfy laws like Benford's.
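
To give a feel for how such a check might work, here is a minimal Python sketch that compares the leading digits in a data set against P(d) = log10(1 + 1/d). It is a toy, not a description of any real auditing procedure. Both data sets are stand-ins chosen for illustration: the powers of two are a classic sequence known to follow Benford's Law, and the flat run of integers from 100 to 999 plays the part of made-up figures whose leading digits are spread evenly. All function names are hypothetical.

```python
import math
from collections import Counter


def leading_digit(n: int) -> int:
    """Return the first decimal digit of a nonzero integer."""
    return int(str(abs(n))[0])


def first_digit_frequencies(values):
    """Fraction of the nonzero values that start with each digit 1 through 9."""
    counts = Counter(leading_digit(v) for v in values if v != 0)
    total = sum(counts.values())
    return {d: counts.get(d, 0) / total for d in range(1, 10)}


def max_deviation_from_benford(values) -> float:
    """Largest gap between an observed digit frequency and Benford's value."""
    observed = first_digit_frequencies(values)
    return max(abs(observed[d] - math.log10(1 + 1 / d)) for d in range(1, 10))


# Illustrative stand-in data, not real financial records.
powers_of_two = [2 ** n for n in range(1000)]
flat_sample = list(range(100, 1000))

if __name__ == "__main__":
    print(f"powers of two: {max_deviation_from_benford(powers_of_two):.3f}")
    print(f"flat sample:   {max_deviation_from_benford(flat_sample):.3f}")
```

The powers of two stay close to the Benford frequencies, while the flat sample gives every digit the same 11% share and so misses on the digit 1 by about 19 percentage points. A real analysis would use a proper statistical test rather than a single maximum gap.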

Unless you're angling to hang out with Bernie Madoff in Club Fed, you should probably use your math knowledge for good rather than evil. But if you're gonna cook your books, your recipe should probably include about 30% 1s as leading digits...

Benford's law does not work for binary; it gives '1' as the leading digit with frequency 1. The leading digit of zero is '0'.

If one wanted to apply Benford's Law-type reasoning in a binary context, one would presumably use a generalization to leading prefixes of length > 1.