If the only tool you have is a hammer....

By tlambert on September 19, 2004.

Summary: Lott and Hassett have not analyzed their data correctly---it actually shows no evidence that headlines are biased against Republicans. (My previous posts are here and here.)

The essence of Lott and Hassett's case that newspapers are biased against Republicans is given in their presentation:

"In the case of unemployment, 44 percent of the headlines under the Clinton administration were positive while that same number was only 23 percent under Bush II. By comparison, the average unemployment rates were fairly similar, 5.2 percent under Clinton's eight years and 5.5 percent under Bush during the sample."

Their argument is that Clinton got more positive headlines and that this is not explained by differing economic conditions. To test this they do a whole pile of regressions, but these just obscure what is going on. To understand what is happening, all you have to do is look at some headlines of stories reporting the unemployment rate. Here are some typical examples:

Unemployment rate falls to 5.4%
US Economy Creates 144,000 Jobs; Unemployment Rate Drops Slightly
Missouri unemployment rate holds at 5.5 percent
Colorado unemployment rate steady

Almost all of the headlines for stories reporting on the latest unemployment rate told you whether the rate had increased, decreased or stayed the same. Lott and Hassett counted a headline about unemployment as positive if it reported that the unemployment rate had gone down. So if headlines just follow the pattern above, what percentage of positive headlines would you expect to see? You won't find the answer in their paper---they don't even include a control for whether unemployment increased.

I downloaded the relevant unemployment data from the Bureau of Labor Statistics and calculated what percentage of the time the unemployment rate decreased.¹ The results are in the third column of the table below. For example, it shows that under Bush I the rate decreased 21% of the time. The second column shows the percentage of positive stories that Lott and Hassett report in their paper. For Bush I, this is 20%, almost exactly the number you would expect if all headlines just reported whether the rate had gone up or down. Clinton doesn't do as well, with only 42% positive headlines even though unemployment decreased 48% of the time, a 6 point gap, while Bush II does even worse with an 11 point gap. Clearly there is no bias for or against Republicans here: the gap for Clinton is exactly halfway between the gaps for Bush I and Bush II.

President	Percent of stories that are Positive	Percent of months with monthly declines	Gap between Positive stories and monthly declines	Percent of months with quarterly declines	Percent of months with yearly declines
Bush I	20	21	-1	17	0
Clinton	42	48	-6	65	92
Bush II	22	33	-11	28	13
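
As a quick arithmetic check, the gap column can be recomputed from the other two columns. The sketch below uses only the figures printed in the table above; the names `table` and `gaps` are illustrative and do not come from the spreadsheet.

```python
# Figures copied from the table above:
# (percent of stories that are positive, percent of months with monthly declines)
table = {
    "Bush I": (20, 21),
    "Clinton": (42, 48),
    "Bush II": (22, 33),
}


def gaps(rows):
    """Return positive-story share minus monthly-decline share for each president."""
    return {name: positive - declines for name, (positive, declines) in rows.items()}


gap = gaps(table)
# Midpoint between the gaps for the two Republican presidents.
midpoint = (gap["Bush I"] + gap["Bush II"]) / 2

for name, value in gap.items():
    print(f"{name}: {value:+d}")
print("Midpoint of the Bush I and Bush II gaps:", midpoint)
```

The midpoint of the two Bush gaps equals the Clinton gap, which is the sense in which Clinton sits exactly halfway between them.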

Why is there a gap for Clinton and for Bush II? The correct way to answer this is to go back to the news stories and look at the ones that did not have positive headlines when the unemployment rate went down. I suspect that the ones that weren't positive included some negative news as well. For example, if unemployment fell, but jobs also fell, Lott and Hassett would count this as "mixed" rather than positive. The incorrect way to answer this is to run more regressions². The actual words in the headline tell you why it was positive or not. Lott/Hassett's methodology throws away this information and then tries to work out why the headline was positive by doing lots of regressions. This approach seems to be driven solely by the need to express the data in a form suitable for the application of linear regression.

The last two columns of the table show how unemployment had changed over the previous three months and over the previous year. This probably gives a better idea of trends in unemployment since it smooths out small fluctuations. These show that the focus on the short term in the headlines made the headlines better for Bush I and II and worse for Clinton. Clinton only got 42% positive headlines even though there was a one-year decline 92% of the time, while Bush I managed 20% positive headlines despite never having a one-year decline in unemployment.
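
The three "percent of months with declines" columns can be thought of as one calculation with different look-back periods. The following is a minimal sketch under that assumption; it is not the spreadsheet's actual formula, the function name `percent_with_decline` is made up, and the `example` series is hypothetical data chosen only to show the mechanics.

```python
def percent_with_decline(rates, lag):
    """Percent of months whose rate is below the rate `lag` months earlier.

    Months that have no observation `lag` months back are skipped.
    """
    comparisons = [(rates[i], rates[i - lag]) for i in range(lag, len(rates))]
    if not comparisons:
        return 0.0
    declines = sum(1 for now, before in comparisons if now < before)
    return 100.0 * declines / len(comparisons)


# HYPOTHETICAL monthly unemployment rates, not BLS data.
example = [5.6, 5.5, 5.6, 5.4, 5.4, 5.3, 5.4, 5.2]

monthly = percent_with_decline(example, lag=1)    # versus the previous month
quarterly = percent_with_decline(example, lag=3)  # versus three months earlier
print(f"monthly declines: {monthly:.0f}%, quarterly declines: {quarterly:.0f}%")
```

In this made-up series the rate zigzags downward, so the three-month comparison registers a decline more often than the month-to-month one. That is the smoothing effect described above. A yearly figure would use `lag=12` on a longer series.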

1. You can download the data and calculations in this spreadsheet.

2. The title of this post comes from the saying "If the only tool you have is a hammer, then everything looks like a nail." If the only tool you have is linear regression ... Update: D squared reminds me that they actually used tobit regression, which isn't even the right kind of regression for their data. So I'm going to make regression equal hammer in my analogy, and say that they didn't even use the right kind of hammer.

Update 2: I searched Factiva to see what the headlines were when a decrease in unemployment did not generate a positive headline. In those cases it seems that while unemployment had fallen, the number of jobs had also fallen. The headline then reported either both things, which they count as a "mixed" headline, or just the fall in jobs, which is a negative headline. I added data for the change in the number of jobs to my spreadsheet and worked out how many positive headlines you would get if unemployment had to fall and jobs had to increase to get a positive headline. The results are in the table below.

My new model predicts the results for Clinton and Bush II almost perfectly. For Bush I, the previous model fits better. It appears that after Clinton came to power headlines switched from just reporting the unemployment rate to reporting the unemployment rate and the number of new jobs. And once again we see no evidence of partisan bias in the headlines.

President	Percent of stories that are positive	Percent of months where you expect a positive story
Bush I	20	13
Clinton	42	45
Bush II	22	23
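
The comparison between the two models can be spelled out with the numbers from the two tables. This is a sketch only: `expect_positive` is a hypothetical encoding of the rule described in Update 2, and the dictionaries simply restate the published table values.

```python
def expect_positive(rate_change, jobs_change):
    """Rule from Update 2: positive only if unemployment fell and jobs rose."""
    return rate_change < 0 and jobs_change > 0


# Percentages copied from the two tables in this post.
observed = {"Bush I": 20, "Clinton": 42, "Bush II": 22}
rate_only = {"Bush I": 21, "Clinton": 48, "Bush II": 33}      # monthly declines
rate_and_jobs = {"Bush I": 13, "Clinton": 45, "Bush II": 23}  # Update 2 model


def abs_errors(predicted):
    """Absolute gap, in percentage points, between predicted and observed."""
    return {name: abs(predicted[name] - observed[name]) for name in observed}


errors_rate_only = abs_errors(rate_only)
errors_rate_and_jobs = abs_errors(rate_and_jobs)

for name in observed:
    print(name, errors_rate_only[name], errors_rate_and_jobs[name])
```

The rate-only model misses by 1, 6 and 11 points for Bush I, Clinton and Bush II, while the rate-and-jobs model misses by 7, 3 and 1 points. The first fits Bush I better and the second fits Clinton and Bush II better.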


Did you manage to get the dataset, Tim? Lott never replied to me even though I asked him fairly politely.

I don't have his dataset. In his AEI presentation he said he would provide it to anyone who asked, so it is a bit suspicious if he won't give it to you.

Do you understand that Hassett and Lott say that they controlled for the level and if the unemployment rate was going up or down? If they did do that, doesn't that say that your concern does not pan out? As to your claim about how best to do this work, "The correct way to answer this is to go back to the news stories and look at the ones that did not have positive headlines when the unemployment rate went down," that appears to me to be what their regressions really do. They regress the percent positive headlines on the level and change in some economic variable and then some dummy variable for party or president. Therefore when the unemployment rate is falling Hassett and Lott are seeing if there are an unusual number of negative stories and if that unusual number is occurring under different presidencies.

They did not have a variable that controlled for whether the unemployment rate was going up or down. They included the value of the change, which is not the same thing.

Lott's just got back to me and the offer of the dataset is genuine ... now I just need a way to read STATA files and I'll be able to see just how serious the effect of using the wrong regression models is.

Excellent. One interesting test would be to run their regression with the headlines predicted by my simple model. I suspect their regression will find a bias against Republicans.

"They did not have a variable that controlled for whether the unemployment rate was going up or down. They included the value of the change, which is not the same thing." Do you mean here the absolute value of the change? Otherwise I'm confused.

I think he's saying they didn't add a variable for up/down/unchanged. Given how the press likes a horse race, it would likely be a better predictor than the change itself.

On the overall subject, where's the Stata file? Last time I replicated a Lott regression of news headlines (the Rush Limbaugh defense piece) it smelled like cherrypicking to me, but I'm not quite expert enough to say for sure. Despite that, I'd like to try again, as before with a Poisson regression (count data is count data, right?).

Prof., it's double-truncated Tobit because they are doing percentages of stories negative, bound by zero and one. Linear regression might not be the right hammer -- you can test for that. What you really would have wanted is a probit for each newspaper (trichotomous negative-neutral-positive). If you've got the STATA, you might try this if you're of a mind...

Seems to me on pages 15-16 of their paper (on which the presentation is based, available on SSRN) that they have tried several specifications. You might want to read that if you haven't already -- you've only referenced the presentation slides on this post.

I owe you a post on my blog however, and I will give it to you anon.

But Tobit isn't for bounded data, it's for censored data. You must agree that it is the wrong model.

I have read their paper and looked at what they did. Their whole approach is misguided. Doing a whole bunch of regressions and throwing more and more variables into the mix isn't going to help you understand what is going on.

If I was going to set out to seriously analyze this I'd use something like C4.5 rather than doing a gazillion regressions. You might find out something about what makes for positive headlines, but the data just isn't there to answer the question of whether there is political bias.

To Tim:
I may be missing something, but it sure looks to me that L&H are picking up whether unemployment is going up or down. The change in unemployment, which is the difference between the unemployment rate now and what it was last month, should pick up the trend. Also, what is a C4.5?

To wcw:

On pages 15 and 16 L&H mention negative binomials and it is my understanding that that is a type of generalized Poisson. So I believe that they have done what you say.

You know Tim, after reading this blog for some time, I'm beginning to think that John Lott's pronouncements shouldn't be trusted...

True, should be truncated least squares regression. (See e.g. Greene, ch 21, sec. 1.) And if I understand LH correctly the (underlying) dependent variable is non-normal. I am not thrilled with using Tobit in this case, but as I said they tried other specifications. They report no difference. If you find one, call them on it.

Look, people, this is not a difficult model choice problem. The underlying decision that you're modelling is the editor's choice "Do we write a positive story, or a negative one?" This is a binary choice, conditional on the economic data. Therefore, it's a logit or probit model, with the parameter of interest being whether the coefficient on the (Dem/Rep) variable is zero or not. Although I must say that even in the context of this regression, I would suspect that any significant result would be quite likely the result of under-conditioning, given that the unemployment model seems to visibly underperform Tim's "naive" model of simply asking whether it went down or not.

In general, chucking loads of different regression models at a dataset is poor econometric practice, particularly when no reason is given for eg not trying a probit.

To dsquared:
How do you handle a logit or probit approach if there is more than one newspaper story for any given news announcement and if you get some mixture of positive and negative? What if you have some values that are .333, .5, and .6667 for example as they claim? Could you explain how you would use the approach you are describing? It would seem that all the estimates used by Hassett and Lott would be fine. What are we missing here?

That would be an "Ordered Probit" (or logit), but to be honest I would try to reparameterise the data to show a binary choice.

To dsquared:
Thanks for your answer. I can see this as a possible approach for the individual newspaper estimates that they look at. If you have two or more articles that come out for the news report, how would you reparameterise the data? Do you really have multiple observations for each news report for each paper? Anything that I can think of that would break it down further would seem to imply that or are you arguing that it either doesn't matter or can be fixed some other way?
