The Science of Error: how polling botched the 2016 election (Synopsis)

"Distinguishing the signal from the noise requires both scientific knowledge and self-knowledge." -Nate Silver

When you take a poll, you survey a number of people with an opinion about something in an attempt to predict the behavior of a much larger number of people. If you increase the number of people you poll, your poll uncertainty drops. This reduction in what we call a statistical error will mean your polls reflect the likely outcome better and better, given one assumption. You have to assume that data obtained from the people you’re polling are reflective of a random sample of future voters.
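To put rough numbers on that drop, here is a minimal sketch of the textbook margin-of-error formula for a polled proportion. It assumes a simple random sample and a 95% confidence level; the function name `margin_of_error` and its parameters are illustrative choices, not anything from a specific pollster's method.

```python
import math

def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate margin of error for a polled proportion.

    Assumes a simple random sample. z=1.96 is the usual normal
    quantile for a 95% confidence level, and proportion=0.5 is the
    worst case (largest uncertainty).
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    standard_error = math.sqrt(proportion * (1 - proportion) / sample_size)
    return z * standard_error

# Quadrupling the sample size only halves the statistical uncertainty.
for n in (100, 400, 1600, 6400):
    print(f"n={n:>5}: +/- {100 * margin_of_error(n):.2f} points")
```

The uncertainty shrinks like one over the square root of the sample size, which is why polls rarely go far beyond a few thousand respondents.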

A visualization of how your statistical uncertainty drops as your sample size increases. Image credit: Fadethree at English Wikipedia.

And that’s a big assumption! Any deviation from that, in turnout, in voter preference, in sampling bias, etc., will mean that there are additional sources of error that you have no way of accounting for. These systematic errors plague all observational and measurement sciences, and predicting an election’s outcome is no exception.
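A toy simulation can illustrate why a systematic error does not shrink the way a statistical one does. This is only a sketch under made-up assumptions: the true support of 48% and the 3-point `response_bias` are hypothetical numbers chosen for illustration, and `simulate_poll` is not a model of any real poll.

```python
import random

def simulate_poll(sample_size, true_support, response_bias=0.0, seed=0):
    """Return the polled share for a candidate.

    true_support: the candidate's actual share among voters.
    response_bias: hypothetical shift in that share among the people
        who actually answer the poll (a stand-in for any systematic error).
    """
    rng = random.Random(seed)
    observed_support = true_support + response_bias
    hits = sum(rng.random() < observed_support for _ in range(sample_size))
    return hits / sample_size

true_support = 0.48  # assumed value, for illustration only
for n in (1_000, 100_000):
    unbiased = simulate_poll(n, true_support)
    biased = simulate_poll(n, true_support, response_bias=0.03)
    print(f"n={n:>7}: unbiased={unbiased:.3f}  biased={biased:.3f}")
```

With a larger sample the unbiased poll settles near the true 48%, while the biased poll settles just as confidently near 51%. More respondents reduce the scatter, not the offset.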

Truman holding up a copy of the infamous Chicago Daily Tribune after the 1948 election. Image credit: flickr user A Meyers 91 of the Frank Cancellare original, via https://www.flickr.com/photos/85635025@N04/12894913705 under cc-by-2.0.

The successes of predictive models in determining the outcome in 2012 gave us an unwarranted confidence in 2016, which should serve as a rude awakening for us all.

I'm not sure in this case we can blame the polls too much. Silver had written the day or so before the election that Trump was within the margin of error behind Hillary. It's not the polls' fault that most people interpreted (I'm estimating here) "Hillary up by 3 points, +/-3 points" as a lock. That was our fault, not the numbers' fault, and probably had a lot to do with confirmation bias (i.e., we've been hearing for so long that she was ahead, we interpreted this really negligible coin-flip type of situation as confirming she was still ahead). And really, 71% is not a high confidence number. Consider all the other major life decisions that you would absolutely never want to risk on a mere 71% chance.

Overall I'm not sure we can say the polls systematically misled us. They aren't all that accurate, and that's something we need to remember. But the inaccuracy we saw between the days' polling and the final result is fairly adequately explained through random error. Perhaps a little systematic bias, sure, but a shirt-rending, mea culpa amount of polling bias is not needed to explain what happened.

When the entire purpose of the poll was to influence opinion (not inform it) to begin with, your statistical bias has already been baked in quite hard. Drizzle as much statistics as you like over bullshit, it's still bullshit.

With all the respect that is deserved, I think the 'experts' were completely accurate in reporting their OWN opinions, preferences, and hopes. I for one have learned that the media don't so much want to report what actually happened as much as be the ones to determine what happens next.

"Overall I’m not sure we can say the polls systemically mislead us. They aren’t all that accurate"

They have been amazingly accurate in the past. The difficulty in today's world is the changing modes of communication: there is still no good way to adapt traditional sampling methods to account for the huge numbers of people who don't have a fixed location or phone number, and the weighting measures usually used to account for under-represented groups haven't been adequately updated.

The notion that the large, reputable polling sources are willfully engaging in biased work simply to influence people is simply an asinine conspiracy. There are places and individuals that do that, but the larger problem is a sampling system that hasn't been adequately updated and can't respond to societal changes.

(The problems with current survey methods are almost as annoying as the lack of a "preview" button here.)

And, there is the issue of the overwhelmingly massive lack of basic numeracy among people. As eric points out, the significance (no pun here) of a margin of error in a poll completely escapes most people, so any difference of x percentage points is interpreted as a lead (or a behind (sic), depending on one's view). The resulting lack of concern about participation, whatever the level, that misunderstanding generates cannot be laid off on the pollsters. It should be put on the people who communicate those polls, but (IMO) it is highly unlikely that the general writer of political columns today has any greater understanding of basic statistics than the majority of the people reading the articles.

You can attempt to predict as much as you want; the turncoats will always throw the odds at the post.

I'd say the sampling bias would have been more significant than usual.
a/ Most in the media were clearly reluctant to listen to the opinions of anybody supporting Trump
b/ The anti-establishment "anger" or disaffection that resulted in Trump votes includes the Trump voters' attitudes towards the media, probably reducing the chance of "angry" people participating - or providing accurate data - in such polls.

By Craig Thomas (not verified) on 09 Nov 2016 #permalink

a/ Most in the media were clearly reluctant to listen to the opinions of anybody supporting Trump

How is that even relevant to polling? Pollsters use random draw methods to select whom they poll. If they get a skewed sample in terms of who replies, they may then weight the results to better represent the overall population and there are debates about how to best do this, but any "reluctance" of TV personalities to listen to Trump supporters has basically nothing to do with the random phone numbers that pollsters call.

Given the scads of free news time Trump received early in the campaign, the "nobody listened to him" line is just crap. It certainly does not have anything to do with polling.

There wasn't a reason for voting, except for the reactionaries on the right who see their entire worldview collapse, and whose vote was the only one ever courted by the USA's "left".

dean, remember too that the democrats didn't do anything to be visible whilst Bernie Sanders was trying to win the nomination. That gave, what, a year or more screentime to Trump without any visibility to Clinton.

Making sure Bernie didn't get the nomination was more important than winning the presidential election for the democratic party.

"a/ Most in the media were clearly reluctant to listen to the opinions of anybody supporting Trump"

Yeah, but there's a damn good reason for that: there wasn't any, and I mean ANY substance to their "opinions", they were just content free whinges and blaming everyone else for the problems, not their own blind partisanship for the clusterfudge that the country was in.

Like "Brexit", where there were lots of whinges about how bad it was, but fun all about what to do, other than "We gotta leave!". And it's only now, when it's clear what the hemp they will actually have to DO to exit, that everyone's backpedalling and, predictably, blaming everyone else for the yelling.

"You can attempt to predict as much as you want; the turncoats will always throw the odds at the post."

There are scads of people on the internet to whom "I won't be voting for either" was NOT an allowed option to pass by without berating the holder of that opinion for helping the nutbar Trump.

That will skew the results.

Or maybe polling is scientistic BS?

By Wesley Dodson (not verified) on 10 Nov 2016 #permalink

Maybe you can use words that have actual definitions, Wes.

PS How many polls happened in the world last year? And how many were wrong? Because if it's less than 5% wrong, then the polls are doing very well, being 95%+ reliable.

@14: Is my bathroom scale scientific BS? It varies by 0.5% every time I get on it. Is my bedroom clock scientific BS? It goes fast or slow by a few seconds every day. How about national weather reporting - do you consider that to be scientific BS?

The polls weren't as accurate as everyone wanted them to be. Very few tools or methods in science are - something that's true for the hard sciences just as much as the social sciences. But that's (IMO at least) a far cry from being BS.

Or maybe polling is scientistic BS?

Wesley's really been on a roll lately. (It gets worse.)

Wow says,
" there wasn’t any, and I mean ANY substance to their “opinions”, they were just content free whinges and blaming everyone else for the problems, not their own blind partisanship for the clusterfudge that the country was in."

Well, *that* is your opinion, and it was apparently the consensus view throughout the media, hence why I believe there was a reduced likelihood of those whose opinions were being derided in this way participating in these erroneous polls.

By Craig Thomas (not verified) on 10 Nov 2016 #permalink

Wow,
In any argument with you, there is ONLY your opinion. This is why you have no credibility to convince anyone of anything except that you think yourself incredibly clever...much like most of the experts who were wrong calling the election's outcome.

The people who decided to vote, and actually, by default, even those who decided to abstain from voting (both are choices with consequences), played a part in the outcome of the election.

When you surround yourself with people who all share your views, political or otherwise, you are not an informed person. You may know a great deal about what you yourself believe is true, but are actually quite ignorant of what lies beyond your little safe spaces and group-think enclaves.

Most people do things for reasons, not statistics.
Here are some actual reasons people didn't vote for Hillary besides being what you would call 'deplorable, racist, bigots, homophobic, Islamophobic etc...':

It is very rare for a party who has won the presidency for the previous two elections to win a third term, especially if things are not going well...such as:

A. Obamacare is an unmitigated disaster. It has done none of the things it was promised to do, and has ended up costing most of us more than we would have been paying otherwise, and forced many of us to lose our doctors and coverage we did like. The law is not workable and has only remained in force by executive orders which non-lawfully over-ride the actual law as it was written. If you want a god-damn single payer health care system, go through a bi-partisan congress honestly and lawfully, or get stuffed.

B. Security. Take your pick, foreign or domestic? It's a mess. Obama was not apparently gifted in actual tactics or diplomacy, and managed to offend our allies as he tried to win over our adversaries with his utter ignorance of culture or tone. Hillary was right in the middle of Obama's foreign policy, and she shot her credibility to hell with the ridiculous 'you-tube' excuse, all the while informing her daughter Chelsea of what she actually knew to be true. Hillary has always had a chronic problem with honesty.
WikiLeaks also revealed that Mrs. Clinton is not a fan of our national 'borders', and would like to pretty much do away with them. Most of the country (including the Latino community) did not agree apparently.

C. The economy... and 'you didn't build that!'. NO, it is not doing well outside of government jobs and federal expansion. Printing trillions of dollars has many consequences...none of them good for good investment, savings, growth, or industry. Living inside the DC area, I am quite aware of where most of this money is going: many more government employees living in $800,000 McMansions.

D. Racial relations are worse. This has not gone well for the 'great uniter' at all, and by default, Hillary. From the Beer Garden Summit (because Obama shot off his mouth before he even knew what he was talking about), to "if I had a son he'd look like Trayvon", to telling illegal aliens to vote in the last presidential election, our esteemed president seems to be far more suited to agitating mobs of snowflakes to riot than healing racial divides. Hillary herself, for all her bluster about glass ceilings, did not even pay her own women workers equally to her men workers...her own campaign manager knew it, read about it in WikiLeaks if you like.

E. The Democrat party is clearly a rigged party. More than just a few Bernie Sanders fans were not too peachy when they found out how the game was already determined ahead of time. The corruption and conflict of interest internally is driving many young democrats to third parties. More than just a few of these disgusted democrats decided not to support Hillary or outright began to tear her down from within her own party. I know more than a few of such said angry Bernie supporters, and they aren't sorry Hillary lost. Before you can remove the mote from your neighbor's eye, remove the beam from your own...and clean your damn house.

F. Hillary was a weak candidate. She has been in politics a looooong time since she graduated from law school. I was there when she sailed into town as First Lady, and even then she had more baggage than Trump ever did. She was also in trouble politically from the get go. Did you know she was pretty much fired from her first job in DC for dishonesty? As a young woman she was described by one of her first employers in Washington DC as “She was a liar. She was an unethical, dishonest lawyer. She conspired to violate the Constitution, the rules of the House, the rules of the committee and the rules of confidentiality.”…by Jerry Zeifman, chief counsel of the House Judiciary Committee during the Watergate inquiry into Nixon's misconduct....
Ok...it may not be scientific, but that is some serious Karma going on there... and I approve.

So there you go folks. A few reasons from outside your comfort-zone bubble of why Hillary lost the election besides flimsy kvetching about biased statistics and polls. Trump (if you are a legal United States Citizen) is your president now, and hopefully he will turn out fine. In any case, it will be different than the somewhat less than glorious last eight years.

"Or maybe polling is scientistic (sic) BS?"

No Wesley, you are just ignorant about statistics.

"If you believe, as I do, in universal intelligence..."

Good lord, you are just ignorant - it isn't limited to any discipline.

“Or maybe polling is scientistic (sic) BS?”

One might wonder whether Wesley couldn't bring himself to type "scientismic." Then again, I don't exactly pay attention to the usages in currency among the 'scientism' crowd.

I for one am thrilled about the botched polling. It demonstrates the wide difference between biased and manipulated calculation and reality.

The media is already beginning to admit they had confirmation bias big time, and were not asking the right questions because of their smug attitudes and assumptions.
RCP has a great article posted about this,
The Unbearable Smugness of the Press by Will Rahn CBS News.

With a sample of 127,431,868,
Hillary Clinton got 47.5 % of a ~52% turnout.

That's 24.7 %

Cue sample of 127 pundits demanding the overthrow of the Electoral College

You've also got to distinguish between polls and punditry.

The polling in 2016 was more accurate than the polling in 2012. The final collection of polls on RealClearPolitics in 2012 showed Obama beating Romney by 0.7 points. Obama won by 3.9 points, an error of 3.2 points. Hillary was shown as winning by 3.3 points and she'll likely have won the popular vote by ~1 point, an error of 2.3 points.
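The arithmetic behind that comparison can be checked in a few lines. This is just a sketch using the figures quoted in the comment; the 2016 result of about 1 point is the commenter's own estimate at the time, not a final tally, and `polling_error` is an illustrative helper name.

```python
def polling_error(predicted_margin, actual_margin):
    """Absolute gap, in points, between a polled lead and the actual lead."""
    return abs(actual_margin - predicted_margin)

# Leads in percentage points, as quoted in the comment.
# The 2016 actual margin (~1 point) is the commenter's estimate.
error_2012 = polling_error(predicted_margin=0.7, actual_margin=3.9)
error_2016 = polling_error(predicted_margin=3.3, actual_margin=1.0)

print(f"2012 error: {error_2012:.1f} points")
print(f"2016 error: {error_2016:.1f} points")
```

Note that the two misses point in opposite directions: the 2012 average understated the winner's lead, while the 2016 average overstated Clinton's.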

When you get down to the state level, Pennsylvania had Hillary up by 1.9 points. That was within the margin of error and trending towards Trump, but 20-vote Pennsylvania is so reliably Democrat that it makes up part of the "blue wall".

Late Night shows are heavily staffed by liberals, as are op-ed pages, and news rooms. Those men and women weren't reading the data as a scientist would. They sought out the data that reinforced their view and reported it. Blaming the data is just scapegoating for a false narrative.

Ironically the pervasive narrative of a Hillary landslide combined with footage of loooooong lines at polling places likely depressed her vote. Why stand in a line for hours when it is unnecessary? In becoming partisan participants to further what they viewed as the 'correct' path forward, the media at large helped it to not happen.

Late Night shows are heavily staffed by liberals, as are op-ed pages, and news rooms. Those men and women weren’t reading the data as a scientist would.

In the last few days I don't think anyone was reading the data "as a scientist would," even the scientists. The error was not limited to liberals, or talk show hosts, or intelligentsia; pretty much everyone (even Trump staffers) pegged Hillary as being ahead, because even though her lead was actually within the margin of polling error at the end, confirmation bias had everyone thinking that earlier polls were relevant data that should factor into the overall calculation of winning. But they shouldn't; an election result is not a time-average of your popularity across the last few weeks. Being ahead 10% in week 1 and in a dead heat in week 2 doesn't mean you're ahead by 5% at the end of two weeks, it means you're in a dead heat. I think the majority of people (conservatives or liberals) were at least subconsciously doing the 5% calculation, not throwing out all the past results and just paying attention to the latest results as they should have been doing.
Here is 538's analysis of the polling results.
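The two-week example above amounts to the difference between averaging a series and reading its last value. A minimal sketch, using the hypothetical numbers from the comment (a 10-point lead in week 1, a dead heat in week 2):

```python
# Lead in points for each week, from the hypothetical example above.
weekly_leads = [10.0, 0.0]

# What people were (perhaps subconsciously) computing:
time_average = sum(weekly_leads) / len(weekly_leads)

# What the comment argues matters on election day:
latest = weekly_leads[-1]

print(f"time-averaged lead: {time_average:.1f} points")
print(f"latest lead:        {latest:.1f} points")
```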

Ironically the pervasive narrative of a Hillary landslide combined with footage of loooooong lines at polling places likely depressed her vote.

True but ultimately irrelevant, as the drop in (blue) turnout occurred mostly in California and a number of other safe blue states. That drop will change the final popular vote, but it's doubtful it affected the electoral college outcome (and therefore, your claim that the media affected the outcome is likely untrue for the same reason; media coverage didn't depress the blue vote anywhere that mattered).

"I for one am thrilled about the botched polling. It demonstrates the wide difference between biased and manipulated calculation and reality."

Yeah, a whole 6% difference when they're not really showing the same values, in a dataset that theoretically should have a 3% error, is "botched" to the idiots who don't like what reality says, so have to find a way to ignore reality.

Following up on CFT's thoughtful analysis, which I don't entirely agree with, I have read two other analyses that I think contain some excellent observations:
John Schindler:
"The problem with pushing identity politics among minorities as a political weapon is that the majority eventually realizes they have an identity too."
(He goes on to say that the Republican Party is now a de-facto "white party").

Chad Orzel:
"There are a lot of people who feel like they’re being screwed by a system run for the benefit of people in big cities on the coasts who sneer at them as ignorant, racist hicks. Some of them positively relish the chance to vote for a vulgar buffoon who horrifies people from the coastal elites, even when they themselves would not behave a tenth as boorishly as Trump does."

By Craig Thomas (not verified) on 15 Nov 2016 #permalink

"of people in big cities on the coasts who sneer at them as ignorant, racist hicks."

Many of whom *ARE* ignorant racist hicks.

Probably as many as there are people in the big cities on the coast sneering at them.