While I’m complaining about statisticulation in social media, I was puzzled by the graph in Kevin Drum’s recent post about college wage gaps, which is reproduced as the “featured image” above, and also copied below for those reading via RSS. I don’t dispute the general phenomenon this is describing– that the top 10% of college grads earn way more than the average, and the bottom 10% way less, and somewhat less than high school grads– but I’m baffled about what was done to generate this graph.

Specifically, I’m puzzled by the vertical axis, which is labeled “Real hourly wage (natural log).” That seems to imply that this is a log scale in disguise, so a particular vertical interval corresponds to a multiplication of the starting value, not an addition. But then the scale is completely wacky– a value of 100 for the natural log would imply that a college grad in the 90th percentile earns 10^{43} times the wage of a high-school grad, which is rather more than the entire economic production of the planet. That’s an understatement, by the way– 10^{43} is something like the number of atoms in an asteroid with a mass of 1,000,000,000,000,000 kg.
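For anyone who wants to check that arithmetic, here is a quick sketch in Python. It assumes the axis value of 100 really is the natural log of a wage ratio, which is exactly the reading I'm arguing can't be right.

```python
import math

log_value = 100.0                    # the value read off the vertical axis
ratio = math.exp(log_value)          # wage ratio if 100 really were a natural log
exponent = log_value / math.log(10)  # the same ratio expressed as a power of ten

print(f"e^100 is about {ratio:.2e}")
print(f"which is 10 to the power {exponent:.1f}")
```

So the ratio comes out a bit above 10^{43}, which is the number quoted above.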

Given that the figures for both men and women are very close to 100 at the modern end of the time series, I suspect that they divided college-grad income by high-school-grad income, took the log, and then scaled everything so the most recent data point has a value of 100. Which is the kind of thing economists like to do. I say “I suspect” because the figure is lifted almost directly from this 2010 report (PDF), which doesn’t explain the vertical axis in any detail, either.
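To make that guess concrete, here is a minimal sketch of the procedure as I imagine it. Everything in it is an assumption: the function name `rescaled_log_ratio` is mine, the wage numbers are invented for illustration, and the report never says this is what was done.

```python
import math

def rescaled_log_ratio(college, high_school):
    """My guess at the procedure: ratio, natural log, then scale so the last point is 100."""
    logs = [math.log(c / h) for c, h in zip(college, high_school)]
    scale = 100.0 / logs[-1]  # chosen so the most recent point lands on 100
    return [scale * x for x in logs]

# Invented hourly wages, oldest to newest. These are NOT from the report.
college_90th = [30.0, 34.0, 40.0]
high_school = [15.0, 15.5, 16.0]

curve = rescaled_log_ratio(college_90th, high_school)
print([round(v, 1) for v in curve])
```

By construction the last entry is 100, and the earlier entries are just the earlier log-ratios expressed as a percentage of the most recent one.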

(Why take the log at all? Good question. I suspect because the high income is a large-ish multiple of the small, and using a linear scale would put the lower lines too close together to see any variation. A log scale spreads things out, and as a bonus makes the smaller numbers negative. The re-scaling completely obliterates any ability to reconstruct the underlying data, though.)
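Here is a small sketch of why the re-scaling throws the underlying numbers away, again assuming the procedure is "take the log of the ratio, then scale so the last point is 100." The ratios are invented, and `rescale` is a hypothetical helper of my own. Squaring every ratio doubles every log, and the scaling step then divides that factor of two right back out.

```python
import math

def rescale(ratios):
    """Log of each wage ratio, scaled so the most recent point equals 100."""
    logs = [math.log(r) for r in ratios]
    scale = 100.0 / logs[-1]
    return [scale * x for x in logs]

ratios_a = [1.5, 1.8, 2.0]             # invented college / high-school wage ratios
ratios_b = [r ** 2 for r in ratios_a]  # very different ratios: 2.25, 3.24, 4.0

curve_a = rescale(ratios_a)
curve_b = rescale(ratios_b)

# The two plotted curves agree to within floating-point rounding.
same = all(math.isclose(a, b) for a, b in zip(curve_a, curve_b))
print(same)
```

A graduate earning twice the high-school wage and one earning four times as much would sit at the same spot on the graph, so there's no way to work backward from the plotted values to actual wages.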

Really, this doesn’t matter to anyone other than a giant nerd like me, because they don’t do anything remotely quantitative with the data in the figure. Basically, they just say “Look, high-earning college graduates make more than high school grads, and low-earning ones somewhat less,” and leave it at that. They could’ve left the puzzling numbers off entirely, and avoided distracting me, but they’re working for the liberal Center for American Progress, not the American Enterprise Institute, so they use numbers on graphs to signify that they weren’t just sketched on a cocktail napkin.

The underlying point– that college-graduate wages are spread over a wider range than most people realize– is a good one, worth thinking about. I also share Kevin’s skepticism about some of the interpretation of this, particularly when you consider that some fraction of those recent grads are going to be in graduate or professional school earning minimal wages for several years in hopes of a larger payoff down the road.

But the labels on that graph are really distracting, at least if you’re a giant nerd.