DiMaggio's Streak

It's been a hotly debated scientific question for decades: was Joe DiMaggio's 56-game hitting streak a genuine statistical outlier, or was it an expected statistical aberration, given the long history of major league baseball? I'd optimistically assumed, based on the work of Harvard physicist Ed Purcell (as cited by Stephen Jay Gould), that DiMaggio was the real deal. Here's Gould:

Purcell calculated that to make it likely (probability greater than 50 percent) that a run of even fifty games will occur once in the history of baseball up to now (and fifty-six is a lot more than fifty in this kind of league), baseball's rosters would have to include either four lifetime .400 batters or fifty-two lifetime .350 batters over careers of one thousand games. In actuality, only three men have lifetime batting averages in excess of .350, and no one is anywhere near .400 (Ty Cobb at .367, Rogers Hornsby at .358, and Shoeless Joe Jackson at .356). DiMaggio's streak is the most extraordinary thing that ever happened in American sports.

But science, with its relentless pursuit of fact and abhorrence of anomalies, has apparently concluded that DiMaggio wasn't so special after all. In their latest excellent podcast, Radiolab interviews Steve Strogatz, a mathematician at Cornell, who worked with his student Sam Arbesman to simulate the history of MLB only to demonstrate that there was nothing statistically freakish about DiMaggio's hitting streak. Others, however, aren't quite so sure. The controversy continues.
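For a rough sense of what simulating a hitting streak involves, here is a minimal Python sketch. To be clear, this is not the Arbesman and Strogatz model, whose details aren't given here. It is a toy version resting on assumptions of my own: a hypothetical batter with a fixed batting average, exactly four at-bats in every game, every at-bat independent of the others, and a 154-game season. The function names are made up for illustration.

```python
import random


def game_hit_probability(batting_average, at_bats_per_game=4):
    """Chance of at least one hit in a game.

    Assumes (for illustration only) a fixed number of independent
    at-bats, each succeeding with probability `batting_average`.
    """
    return 1.0 - (1.0 - batting_average) ** at_bats_per_game


def longest_streak(num_games, p_hit_game, rng):
    """Simulate one season and return the longest run of games with a hit."""
    longest = 0
    current = 0
    for _ in range(num_games):
        if rng.random() < p_hit_game:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


def estimate_streak_probability(batting_average, streak_length,
                                num_games=154, trials=10000, seed=0):
    """Monte Carlo estimate of the chance that a single simulated season
    contains a hitting streak of at least `streak_length` games."""
    rng = random.Random(seed)  # seeded so repeated runs agree
    p_hit_game = game_hit_probability(batting_average)
    successes = sum(
        longest_streak(num_games, p_hit_game, rng) >= streak_length
        for _ in range(trials)
    )
    return successes / trials
```

Calling estimate_streak_probability(0.350, 56) estimates the chance for one such batter in one season. A 56-game streak is rare enough under these assumptions that a small number of trials will usually report zero, so a meaningful estimate needs far more trials, and a history-of-baseball simulation would have to repeat this across every player-season with realistic averages and at-bat counts.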


I don't understand what you mean - the streak is a remarkable achievement. It's strange that it is 12 games longer than the 2nd longest streak. Can't it also be something that one would reasonably expect to happen to someone in the long history of baseball?

As a comparison, you might take any number of scientific discoveries (say, evolution) and say that it was remarkable that scientist X made them. If Darwin and Wallace had never lived, someone else would have eventually come up with the idea of natural selection, because that's where the evidence pointed. They are still remarkable for recognizing it first.

I think there's a difference between saying that DiMaggio's streak was not remarkable and saying that it is not earth-shattering that someone managed to hit in 56 straight games at some point.

Also - in the passage you quote, it appears that Purcell calculated based on careers of 1000 games. This is a pretty large flaw as I see it, given that even with the old season length of 154 games, that's 6.5 seasons. Lots of players play more or less every day for 15 years.

Did you know that during the two months of DiMaggio's streak, Ted Williams actually hit for a higher average than DiMaggio? What was amazing about DiMaggio's streak wasn't his great hitting, but the distribution of it.