This morning, via Twitter, I ran across one of the most spectacular examples of deceptive data presentation that I’ve ever seen. The graph in question is reproduced in this blog post by Bryan Caplan, and comes from this econ paper about benefits of education. The plot looks like this:
This is one panel clipped out of a four-part graph, showing the percentage of survey respondents reporting that they are satisfied with their current job. The horizontal axis is the years of schooling for different categories of respondents.
So, I looked at that and said “Wow, people with more education are significantly happier with their jobs.” Then, in the post, Caplan talks about how small the effect is, and I said “What the hell?” and looked closely at the axis labels, which actually span a tiny, tiny range of responses. A totally honest version of the black bars in the plot would look like this:
That is, there’s very little difference between the four groups, with only a tiny shift up as you go to higher education. The fraction of people with post-graduate education who are satisfied with their jobs is only about 7 percentage points higher (roughly 9% in relative terms) than the fraction of those who never completed high school who are satisfied with their jobs.
By carefully choosing their vertical axis to start just barely below their minimum value, though, the authors have managed to create the impression that the post-graduate cohort is about 25 times more satisfied than the non-high-school cohort (based on counting pixels in the vertical bars). That is really impressive: even the Excel auto settings do a better job, starting the vertical axis at around 0.74. That still exaggerates the effect, but it isn’t anywhere near as far into “How to Lie With Statistics” territory as the published graph, which is a marvel of axis-limit deception.
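To put rough numbers on how much the axis choice matters, here is a minimal Python sketch of the arithmetic. The two satisfaction fractions are hypothetical, picked only to be consistent with the "about 7 percentage points" gap described above; they are not read off the paper, and `apparent_ratio` is just an illustrative helper name. The idea is that the drawn height of a bar is its value minus wherever the axis starts, so the visual ratio between two bars depends heavily on that starting point.

```python
def apparent_ratio(low, high, baseline=0.0):
    """Ratio of drawn bar heights when the vertical axis starts at `baseline`."""
    if baseline >= low:
        raise ValueError("baseline must be below the smallest value")
    # A bar's drawn height is its value minus the axis starting point.
    return (high - baseline) / (low - baseline)


# Hypothetical values (assumptions, not taken from the paper): the fraction
# satisfied in the least- and most-educated groups, about 7 points apart.
low, high = 0.78, 0.85

# Compare an honest zero baseline, an Excel-like auto baseline, and a
# baseline sitting just barely under the smallest value.
for baseline in (0.0, 0.74, 0.777):
    ratio = apparent_ratio(low, high, baseline)
    print(f"axis starts at {baseline:.3f}: tall bar looks {ratio:.1f}x the short bar")
```

With the axis starting at zero, the ratio of bar heights is just the real ratio of the two values. As the baseline creeps up toward the smallest value, the short bar's drawn height shrinks toward nothing and the ratio grows without limit, which is the whole trick.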
And economists wonder why they have a hard time getting physicists to take them seriously…