Who Will Win Control of Congress In November? Statisticians Make a Prediction

If you're not reading the Columbia University stats blog, Statistical Modeling, Causal Inference, and Social Science, you're missing a lot of great stuff. For example, today's post by Andrew Gelman discusses the paper "Forecasting House Seats from Generic Congressional Polls" by Bafumi, Erikson, and Wlezien. From the paper:

This paper is intended to provide some guidance for translating the results of generic congressional polls into the election outcome. Via computer simulation based on statistical analysis of historical data, we show how generic vote polls can be used to forecast the election outcome. We convert the results of generic vote polls into a projection of the actual national vote for Congress and ultimately into the partisan division of seats in the House of Representatives. Our model allows both a point forecast--our expectation of the seat division between Republicans and Democrats--and an estimate of the probability of partisan control. Based on current generic ballot polls, we forecast an expected Democratic gain of 32 seats with Democratic control (a gain of 18 seats or more) a near certainty. (Emphasis in original.)

To arrive at these predictions, they used the results of the last 15 midterm elections, and generic polling data from within 30 days of those elections. From this data, they produced a regression equation that predicts the vote-share for Democrats based on generic polling. Here's the equation:

Dem Vote Share = 24.38 + 0.51 * Dem Poll Share - 1.07 * Presidential Party [1]
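If you'd like to plug numbers into that equation yourself, here's a minimal sketch in Python. The three coefficients are the ones quoted above. Everything else is my assumption: the function name is made up, and I'm assuming Presidential Party is coded +1 when the president is a Democrat and -1 when the president is a Republican, which the post doesn't spell out. The poll share in the example call is a hypothetical input, not the figure the authors used.

```python
# Coefficients as quoted in the post for the fitted regression.
INTERCEPT = 24.38
POLL_COEF = 0.51
PRES_PARTY_COEF = -1.07


def predicted_dem_vote_share(dem_poll_share, presidential_party):
    """Return the predicted Democratic vote share, in percent.

    dem_poll_share: Democratic share of the generic poll, in percent.
    presidential_party: ASSUMED coding, +1 for a Democratic president
        and -1 for a Republican president (not confirmed by the post).
    """
    return (INTERCEPT
            + POLL_COEF * dem_poll_share
            + PRES_PARTY_COEF * presidential_party)


# Hypothetical inputs, chosen only for illustration.
print(round(predicted_dem_vote_share(58.0, -1), 2))
```

Note that because the poll coefficient is only 0.51, a one-point lead in the generic polls moves the predicted vote share by about half a point.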

They then stuck this year's polling data (from October 8 until today) into the equation, and got a figure of 55% of the vote share for Democrats (95% confidence interval = 51.3 to 58.7). Using simulations (described on p. 3 and in the appendix) that treat open and incumbent seats differently, they translated the vote share into the number of congressional seats and the probability of Democrats taking control of the House. The simulations yield the data summarized in this graph (p. 4):

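[Graph from p. 4 of the paper: simulated Democratic seats and probability of Democratic House control, plotted against Democratic vote share from 50 to 60%]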

As you can see, when you simulate the Democratic vote shares from within the confidence intervals (the graph goes from 50 to 60%, on the x-axis), the Democrats get at least 218 seats, and the probability that they'll have control of the House of Representatives goes up to almost .9. In his post, Andrew Gelman argues that they might be overstating the confidence of this prediction. I'll let the statisticians argue about that, but I still think the whole thing is really cool, especially since it predicts the outcome I want!
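To get a feel for how a confidence interval turns into a probability, here's a rough sketch. It is not the authors' simulation: theirs works seat by seat and treats open and incumbent seats differently, while this one only asks how often a simulated national vote share lands above some threshold. I'm assuming the 95% interval quoted above (51.3 to 58.7, centered on 55) comes from a normal distribution, so the standard error is the half-width divided by 1.96. The function name and the number of draws are my own choices.

```python
import random

# Point forecast and 95% confidence interval quoted in the post.
MEAN = 55.0
CI_LOW, CI_HIGH = 51.3, 58.7

# ASSUMPTION: the interval is a normal one, so half-width = 1.96 * SE.
STD_ERR = (CI_HIGH - CI_LOW) / 2 / 1.96


def share_of_draws_above(threshold, n_draws=100_000, seed=0):
    """Fraction of simulated vote shares that exceed `threshold` (percent)."""
    rng = random.Random(seed)  # fixed seed so reruns give the same answer
    hits = sum(rng.gauss(MEAN, STD_ERR) > threshold for _ in range(n_draws))
    return hits / n_draws


# How often do Democrats win a majority of the simulated national vote?
print(share_of_draws_above(50.0))
```

Keep in mind that a majority of the vote is not the same as a majority of the seats, which is exactly why the paper needs the more detailed simulation.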

[1] In case you're not familiar with regression, before the values are put in, the equation looks like this:

$V_j = \alpha + \beta_1 \,\text{Generic Poll}_{jt} + \beta_2 \,\text{Presidential Party}_j$

This general form comes from a paper by the same authors. The equation is basically that for a line (y = mx + b, where m is the slope and b is the intercept), so α (24.38 in the final version) is the intercept, and the two β's are the regression coefficients (0.51 and -1.07 in the final version), which take the place of m, or the slope, in the equation for a line. The only difference is that there are two predictors here, so there are two slopes instead of one.
