… if Nate Silver’s analysis is correct.

If a person is asked to make up a bunch of numbers … random numbers … s/he will tend to make up non-random numbers instead. So, for instance, if I ask you to state random numbers that have two digits, and I plot the second digits on a histogram, and then I ask my computer to make up two-digit random numbers and plot the second digits on a histogram, my computer’s histogram will show roughly the same number of 0’s as 1’s as 2’s … as 9’s (especially with a large sample size) but you, being a silly you-man, will make up numbers with more of one than another in a non-random fashion. If you are a typical Westerner you will come up with more sevens, most likely.
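Here is a rough sketch of the computer's half of that experiment, in Python. It is only an illustration: the function name `trailing_digit_counts`, the sample size and the fixed seed are my own assumptions, not anything taken from Nate's analysis.

```python
import random
from collections import Counter


def trailing_digit_counts(n, seed=0):
    """Draw n random two-digit numbers and count each trailing digit.

    Hypothetical helper for illustration; the seed is fixed only so
    that repeated runs give the same counts.
    """
    rng = random.Random(seed)
    # randint(10, 99) is inclusive at both ends, so each trailing digit
    # 0-9 appears in exactly 9 of the 90 possible two-digit numbers.
    counts = Counter(rng.randint(10, 99) % 10 for _ in range(n))
    return [counts[d] for d in range(10)]


counts = trailing_digit_counts(100_000)

# Crude text histogram: one '#' per 500 occurrences of each digit.
for digit, count in enumerate(counts):
    print(digit, count, "#" * (count // 500))
```

With a sample this large, the ten bars should come out close to the same length. A list of numbers made up by a person would be fed through the same counting step, and the claim is that its bars would be visibly uneven.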

Nate Silver studied the polling data produced over a long period of time by the conservative publicity firm and polling company “Strategic Vision” and showed that the trailing digits in their polling data are not random, as they should be.

Maybe polling data is just not random? If you studied the trailing digits of, say, prices of gasoline to the tenth of a cent in the United States, you’d find that all gasoline costs something-something-point-nine (very non random). If you studied the trailing digits of retail prices in most areas, again, you’d find more 9’s than expected. But, when Nate Silver studies Five Thirty Eight’s polling data, he gets a pretty even distribution across the trailing digits, which is what you’d expect in a large sample.
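One common way to put a number on "pretty even" is a chi-square goodness-of-fit statistic against a uniform distribution. The sketch below is my own illustration and is not necessarily the test Nate used. Both sets of counts are invented for the example and are not real polling data from anyone. The critical value 16.919 is the standard 5% cutoff for 9 degrees of freedom.

```python
def chi_square_uniform(counts):
    """Chi-square statistic for observed counts vs. an even split."""
    total = sum(counts)
    expected = total / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


# Invented trailing-digit counts for digits 0-9 (1000 numbers each).
# These are hypothetical illustrations, NOT data from any pollster.
even = [100, 98, 103, 99, 101, 97, 102, 100, 99, 101]
lumpy = [60, 70, 80, 90, 100, 110, 120, 130, 125, 115]

# 5% critical value of the chi-square distribution, 9 degrees of freedom.
CRITICAL_5_PERCENT_DF9 = 16.919

for name, counts in [("even", even), ("lumpy", lumpy)]:
    stat = chi_square_uniform(counts)
    verdict = "looks uniform" if stat < CRITICAL_5_PERCENT_DF9 else "not uniform"
    print(name, round(stat, 2), verdict)
```

A statistic well under the cutoff is what you'd expect from digits that really are evenly spread. A statistic far above it says the unevenness is too large to blame on chance, though it says nothing about why, which is where the gasoline-price caveat comes in.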

Nate’s article is here.

It will be interesting to see how Strategic Vision reacts to this. They’ll probably have a poll that shows that most people think they are being honest.

Comments

  1. #1 Jason Thibeault
    September 27, 2009

    They’ve already reacted with bluster and insinuation of legal threats. As I noted over at my place, they’ve also been sanctioned by the AAPOR for not disclosing their methodologies, which I figure is because who wants to actually disclose that their methodology is “pick some numbers that are as close to reality as we can manage, then skew them heavily toward whoever paid us”.

  2. #2 Tony P
    September 27, 2009

    I did a quicky in Excel 2000 using the RND function. I ran it for 100 numbers.

    Interesting on pass 1 the most frequent numbers are 4, 5, and 8.

    Second pass gets 5, 6, 9.

    So if I combine the two the most frequent numbers are 4, 5, 6, 8, 9. 0, 2, and 3 seem to be missing.

    I know there’s a randomization bug in VBA so maybe that accounts for it.

  3. #3 D. C. Sessions
    September 27, 2009

    It will be interesting to see how Strategic Vision reacts to this.

    They’re threatening Nate with lawyers:

    Secondly in regards to Nate Silver’s statements, we categorically deny them and will refute them. We have a call into our attorney on this and fully intend to take action that will vindicate us. I wish Nate had contacted me directly yesterday when he began this tirade, I could have answered his questions fully to his satisfaction prior to damage being done to our reputation. Now that he has made these accusations and posted them online, I must and will defend our company’s reputation through all legal avenues available. The reason that we are going the legal route is he has attempted to do severe damage to our reputation and what is he going to do when we disprove him just say I am sorry. That isn’t enough at this point.

  4. #4 Caravelle
    September 27, 2009

    Maybe polling data is just not random? If you studied the trailing digits of, say, prices of gasoline to the tenth of a cent in the United States, you’d find that all gasoline costs something-something-point-nine (very non random). If you studied the trailing digits of retail prices in most areas, again, you’d find more 9’s than expected.

    He does say the numbers might not be random in a homogeneous sample (say, only McCain-Obama polls in the presidential election), but he’s using data from a wide variety of polls so it doesn’t really apply.

    Anyway, there’s a later post addressing that problem, where he compares Strategic Vision’s numbers to Quinnipiac polls, and although the comparison shows that polling data might not follow a completely uniform distribution, SV’s data is still way off.

    Also, it turns out that citizenship test that Oklahoma high-schoolers flunked so badly that 5% of them thought Obama was our first president? Strategic Vision.
    That would explain a lot…

  5. #5 The Science Pundit
    September 27, 2009

    They’re threatening Silver with legal action yet still refuse to disclose their polling methodologies? That sounds like the polling firm’s equivalent to leading police on an ultra-slow chase through LA to me.

  6. #6 Oran Kelley
    September 27, 2009

    They won’t sue for the simple reason that Silver would have to get access to lots of their raw data & methodologies in order to defend himself, which they probably won’t be able to allow.

  7. #7 Deen
    September 28, 2009

    @TonyP: that’s not how you do statistics. There’s always a most common digit in any run of any size, or a set of three most common digits. Running two samples likely will give you two different sets of three most common digits. None of this gives you any information on whether some digits are less common than others in your RNG. It’s even possible that there might be a digit that is in the top three most common digits of the combined data set, even though it’s not in the top three of each data set separately.

  8. #8 wolfwalker
    September 28, 2009

    So what? Everybody juices their polls, one way or another. Skewed samples, loaded questions, finagled analyses … you can find something “questionable” about any poll done by any firm regardless of its political persuasion. That’s why I never trust any of them, and haven’t in many years.

  9. #9 Greg Laden
    September 28, 2009

    wolfwalker: well, no. And the specific accusation here is that they pulled the data out of their asses. That is not juicing the poll, it is totally making the thing up.

    I like the fact that Nate has asked for people to contact him if they were polled by this group and have some story. At first I thought this was dumb because the ‘story’ would be pretty useless. But then I realized his true motivation.

    Has anyone ever been polled by this group?

  10. #10 Virgil Samms
    September 28, 2009

    D.C. Sessions: We have a call into our attorney on this…

    *subdued cackle* – they “have a call into”? So their attorney doesn’t take their calls straight away? Maybe a receptionist makes a note, and later in the afternoon, when the serious work for the day is done, their attorney will look through the notes and decide whether to call them back – or whether to put in a quick nine holes on the local course?

  11. #11 wolfwalker
    September 28, 2009

    And the specific accusation here is that they pulled the data out of their asses.

    Again: so what? What’s the difference between data “pulled out of their asses” and data slanted using a skewed party affiliation, or selective sampling, or loaded questions, or any of the other ways that pollsters have learned to use over the years? It’s all fraud, and all sides do it. If you get hot and bothered over this case and not over others, you’re being hypocritical.

  12. #12 D. C. Sessions
    September 28, 2009

    Has anyone ever been polled by this group?

    That question came up at dinner last night, and the answer is best described here. Given the degree of ingenuity displayed (always assuming Nate is right) I wouldn’t rule out other demonstrations of brilliance. Still, the “thumb on the scales” hypothesis can’t be ruled out. Ockham’s Razor rather than Hanlon’s.

    Either way, discovery will tell. Which is a good reason to expect that this will never actually go to court — too much other dirty laundry (e.g. the objectives requested by the sponsors) would come out.

  13. #13 travc
    September 28, 2009

    wolfwalker, the “pox on both your houses” BS is getting really old. Many polling orgs put a lot of effort into actually measuring the opinions of the public and generally do a pretty good job of it.

    There are plenty which make serious but honest mistakes. Certainly, some polling orgs get clever with sampling, ordering, and wording to get their desired results. However, these dishonest manipulations are far more often designed to produce “newsworthy” results than a particular partisan bias. Subtle changes in wording which result in dramatic swings in polls which follow the existing media narratives are a classic example.

    Several of the polling organizations are actually academic endeavors, where getting the methodology right is actually the central goal.

    Nate Silver is popular (and respected) because he is extremely competent at statistical measurement. He has also quickly gained real expertise in the subtleties of polling and the use of surveys and polls to actually make real measurements.
    It is notable that he started off in the world of predictive baseball statistics, where being right is all that counts.

    PS: Nate had a post a while back on
    How to Poll on the Public Option.
    Beyond being interesting, I think it also illustrates Nate’s essential geekiness and concern for ‘doing it right’.

  14. #14 wolfwalker
    September 28, 2009

    wolfwalker, the “pox on both your houses” BS is getting really old.

    [shrug] It’s what I think of polling and pollsters. I have no doubt that at least some pollsters try to get it right. I remain unconvinced that they can succeed reliably. I also remain unconvinced that there is anything honest about political polling — by anyone, on any side, of any issue whatever. Some of them bend the results on purpose, others do it unintentionally — but one way or another, they all do it.

    I don’t trust any political poll, and I don’t recommend anyone else do so either.

  15. #15 Carl Nyberg
    September 28, 2009

    wolfwalker, are you claiming that you make no moral distinction between pollsters making erroneous assumptions and fabricating the data?

    I can see why it was so easy to con the sheeple on the “conservative” side of U.S. politics. They don’t care what the truth is.

  16. #16 Ben Zvan
    September 29, 2009

    Doesn’t this just mean that 538 uses a computer to fake their polls?
