Dreadful Graphics and Health Care Costs

There was a bunch of discussion yesterday about a graph comparing the amount of money spent on veterinary expenses over the last twenty-odd years to the amount spent on human health care over that same span:

[Figure: the original graph of veterinary spending and human health care spending]

There were a lot of dumb things said about this, but really, the worst part of the whole thing is that it's amazingly badly done. You've got the two series represented by different types of plot, gridlines for one vertical scale but not the other, the year labels floating in space down at the bottom, not associated with the tick marks in any obvious way...

This is the work of a professional think-tank worker? What the hell are they teaching in right-wing public policy programs these days?

As a public service, below the fold is a cleaned-up version of the same graph:

[Figure: cleaned-up version of the medical spending graph]

I got rid of the distracting grid lines, provided data points with both curves so you can more easily compare them point by point, and pared down the horizontal axis labels, associating them with tick marks in a reasonable way. I also got rid of some of the excess clutter, removing the pointless title and making the legend a little smaller.

It doesn't change the content in any way (and the significance of this comparison remains somewhat dubious), but at least it doesn't look like it was made by a chimpanzee with Excel. Honestly, it's no surprise that our political discourse is so hopelessly superficial, given the godawful way the people conducting the debate stumble around presenting information.

(Of course, the two-axis plot is probably sub-optimal to begin with-- it would probably be better to normalize both data sets to their 1984 values (or whatever that year is-- I guessed), and plot them on a single set of axes. You'd still make the point that they've increased at similar rates, and readers wouldn't strain their eyes trying to figure out absolute numbers. But if you're going to do the two-axis plot, do it right.)
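For anyone who wants to try the single-axis version, here is a minimal sketch of the normalization step in Python. The numbers are placeholders made up for illustration, not the data behind the graph, and the helper name `normalize_to_base` is hypothetical.

```python
def normalize_to_base(values, base_index=0):
    """Divide every value by the value at base_index, so the base year reads 1.0."""
    base = values[base_index]
    if base == 0:
        raise ValueError("base value must be nonzero")
    return [v / base for v in values]


# Placeholder numbers for illustration only; NOT the data behind the graph.
years = [1984, 1994, 2004]
vet_spending = [2.0, 4.0, 8.0]            # hypothetical, $ billions
health_spending = [400.0, 900.0, 1600.0]  # hypothetical, $ billions

# Both series now start at 1.0 and can share a single vertical axis.
vet_index = normalize_to_base(vet_spending)
health_index = normalize_to_base(health_spending)

for year, v, h in zip(years, vet_index, health_index):
    print(year, v, h)
```

Once both series are expressed as multiples of their base-year values, similar growth rates show up as curves that track each other, with no second axis needed.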


So, um...for me? The graph you posted *begins* just to the left of the ad banner on the right side of the blog, which is then displayed *above* the graph. I can't view the last two-thirds of the graph without using the dread horizontal scroll-bar, either.

Presumably the people who make this sort of criticism (which I see every few months) are completely unfamiliar with Excel and its arguably counterproductive features, design quirks and defaults. To the rest of us, it seems petty to ridicule individual users for where the by far most widely used graphing package puts its tick mark labels.

it would probably be better to normalize both data sets to their 1984 values (or whatever that year is-- I guessed), and plot them on a single set of axes

Better yet, look at the corresponding figures per capita, where in the case of vet costs "per capita" refers to the number of pets. (I'm making the dubious assumption that vet costs are not dominated by something like race horses or polo ponies.) But that might not give the answer the chart preparer's boss wanted.

By Eric Lund (not verified) on 15 Jul 2009 #permalink

Skwid: Chad may have fixed that problem. The first time I read the post, I saw the same thing you did, but after I posted my comment the chart was where it should have been and sized properly to fit there.

By Eric Lund (not verified) on 15 Jul 2009 #permalink

Yeah, I fixed the problem. I had originally included the full-size version of the graphic, not the scaled-for-the-web version.

Presumably the people who make this sort of criticism (which I see every few months) are completely unfamiliar with Excel and its arguably counterproductive features, design quirks and defaults. To the rest of us, it seems petty to ridicule individual users for where the by far most widely used graphing package puts its tick mark labels.

If this were coming from a first-year college student, I would agree. This graph was produced by somebody at the American Enterprise Institute, which is ostensibly an intellectual operation staffed by people who ought to know something about how to deal with numerical data.

I don't think it's necessarily bad to put the data points (and labels) in between tick marks. If a data point represents an annual average for the year of 1990 (Jan 1 - Dec 31), then it can make sense for the time of the data point to be the average date, 1990.5, if 1990.0 is Jan 1. Then the label "1990", placed at the time 1990.5, represents the entire span of dates between 1990.0 and 1991.0, with the tick marks denoting year boundaries. I see this done sometimes in time series where seasonality is important.

By Ambitwistor (not verified) on 15 Jul 2009 #permalink

If this were coming from a first-year college student, I would agree. This graph was produced by somebody at the American Enterprise Institute, which is ostensibly an intellectual operation staffed by people who ought to know something about how to deal with numerical data.

Even so, when you're talking about unchangeable (or only changeable via an option three clicks deep) aspects of Excel, the appropriate criticism is a wholesale condemnation of Excel use, not making it sound like the graph maker made a conscious decision to do it that way.

Since the scales for both y axes start at zero (thank God), can't the normalization be done simply by picking the upper bound for one of the y axes (I would suggest the right-hand axis) so that the initial data points overlap? Just eyeballing it, this would require setting the upper bound on the right to about $3200 billion, which would of course lower the slope of the pet data.

By Chris Goedde (not verified) on 15 Jul 2009 #permalink

Oops, I can't read axis labels. Human data uses the scale on the right, pets on the left. So to do the normalization, you'd have to increase the y bound on the left to about $15 billion, which still lowers the slope of the pet data.

By Chris Goedde (not verified) on 15 Jul 2009 #permalink
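The axis-bound trick described in the two comments above can be sketched in a few lines of Python. This assumes both axes start at zero, as they do in the graph; the function name and all of the numbers are hypothetical and are not read off the actual chart.

```python
def matching_upper_bound(first_on_axis, first_on_other, other_upper_bound):
    """Upper bound for one zero-based axis so that its first data point sits
    at the same height as the first point plotted against the other axis.

    With both axes starting at zero, a point's height is value / upper_bound,
    so the heights match when the bounds are in the same ratio as the values.
    """
    if first_on_other == 0:
        raise ValueError("first value on the other axis must be nonzero")
    return other_upper_bound * first_on_axis / first_on_other


# Placeholder numbers for illustration only; NOT values from the graph.
pet_first = 2.0             # hypothetical first pet data point, $ billions
human_first = 400.0         # hypothetical first human data point, $ billions
human_axis_upper = 2000.0   # hypothetical upper bound of the human axis

pet_axis_upper = matching_upper_bound(pet_first, human_first, human_axis_upper)
print(pet_axis_upper)
```

This is equivalent to normalizing both series to their first values, since only the ratio between the two scales changes.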

I've got to say, I think critiquing the design of the graph is a little bit silly. Could he have made it more viewer-friendly? Sure. But since this is a guy with a graduate degree from Cambridge and a PhD from the London School of Economics (hardly right-wing public policy programs), I'm going to guess that he's more than capable of designing a better graph, but simply didn't want to waste the time wrestling with Microsoft software for something that was simply going into a blog post, rather than legitimate work product.

A more interesting question to me is whether this represents only pet owners, or all veterinary care. He simply identifies the data as veterinary care. Is this only people choosing to get MRIs for Lassie, or does it include commercial ventures like cattle farms? After all, the point is to compare behavior in making medical choices and how it affects costs, which only works if the incentives are similar. They wouldn't be if commercial animal owners are included. Parents hardly make choices regarding their family medical care with an eye towards selling off their children down the road. "Hey doc, can we get some steroids for little Timmy? I'll never be able to auction him off to USC if he doesn't gain some weight."

Um, why is nobody commenting on the different Y scales? All this graph tells me is that medical costs are going up regardless of the species. More importantly, the Y scale difference is telling me we spend about 200 times as much on people as on not-people. For a true comparison, use an identical Y scale, and the veterinary line becomes the floor.

By Gray Gaffer (not verified) on 15 Jul 2009 #permalink

It's amazing how many people in the health reform debate don't get, or deliberately confuse, the "total cost" that we pay for the system with the "net cost" compared to what we already pay ("cost" meaning costs and benefits together, to be precise). Really, it's not like paying one trillion or whatever *more* than we already do, but people griping about the taxes often act as if that's what it amounts to. You notice that? Am I right, or are they not that dumb or dishonest after all?!

You are a physicist and you connect discrete values with lines?

I'm a big fan of a logarithmic scale for this kind of thing - but Excel does a lousy job at it. Also, this data means nothing without constant dollars.
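As a rough sketch of both suggestions, the Python below converts nominal spending to constant dollars and then takes base-10 logs, which is what a logarithmic axis effectively plots. The price index and spending figures are invented placeholders, and `to_constant_dollars` is a hypothetical helper, not a function from any library.

```python
import math


def to_constant_dollars(nominal, price_index, base_index_value):
    """Rescale each nominal amount by base_index_value / price_index for its year."""
    return [n * base_index_value / p for n, p in zip(nominal, price_index)]


# Placeholder numbers for illustration only; NOT real spending or price data.
nominal_spending = [100.0, 300.0, 800.0]   # hypothetical, nominal dollars
price_index = [50.0, 100.0, 200.0]         # hypothetical price index by year
base_index_value = price_index[-1]         # express everything in final-year dollars

real_spending = to_constant_dollars(nominal_spending, price_index, base_index_value)

# On a log scale, equal ratios become equal vertical distances, so two
# series growing at the same rate run parallel whatever their absolute size.
log_real = [math.log10(v) for v in real_spending]
print(real_spending)
print(log_real)
```

In this made-up example the nominal series grows eightfold, but the constant-dollar series only doubles, which is the kind of difference the comment is pointing at.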

Would you mind posting the code that you used to create your graphic? It would be fun to play around with.

It's not code, it's a SigmaPlot file. I don't have it here with me, but I might be able to post it.

I had to resort to a kind of a kludge to get the two-axis thing (what you see is really two separate plots overlaid, one with the vertical axis labels on the left, the other with the labels on the right). There isn't a built-in function for plots with two different axes.