State of the Statistics: A Nonlinear Non-Diebold Effect?

By developinginte… on January 19, 2008.

UPDATE: Diebold effect explained?

Marc has an excellent summary of a flurry of Diebold-related discussions between me, "T", Marc, and Sean.

Sean also has a network model of the apparent Diebold effect.

I think we'll soon hear from Brian Mingus (who's running a meta-classifier) and Steve Freeman (an expert on machine-effects in elections) as well.

At bottom is a disagreement over how to infer causality in observational data, and how to diagnose the functional form of a data set.

The good news is two-fold: there may not be a large "Diebold effect" when nonlinear methods are used, and reason suggests that the apparent Diebold effect will be explained through demographics.

The "bad news" is also two-fold: not everyone agrees those nonlinear methods are appropriate, and there's an alarmingly persistent, consistent, and large Diebold effect when simple - but traditional - inferential statistics are used.

It's still not clear exactly which demographic feature produces such discrepant results between the nonlinear and linear models. (Edit 1/21: An important but previously unconsidered variable is how each precinct voted in the 2004 Democratic primaries.)
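To make the linear-versus-nonlinear disagreement concrete, here is a minimal synthetic sketch. It does not use the New Hampshire data, and it is not any of the models discussed above. Everything in it is an assumption for illustration: the variable `size` stands in for some demographic feature, machine use is assigned purely by `size`, and the outcome depends on `size` squared with no machine effect at all. Under those assumptions, a model that controls for `size` only linearly reports a sizeable "machine" coefficient, while a model that allows curvature reports none.

```python
# Synthetic illustration only: none of these numbers come from the NH data.

def solve(a, b):
    """Solve the square system a @ x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(columns, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(columns)
    xtx = [[sum(p * q for p, q in zip(columns[i], columns[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(p * v for p, v in zip(columns[i], y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical towns: `size` stands in for any demographic variable.
size = [float(s) for s in range(10)]
# Larger towns use machines; smaller towns count by hand.
machine = [1.0 if s >= 7 else 0.0 for s in size]
# The outcome depends on size alone, but nonlinearly. Machines do nothing.
outcome = [s ** 2 for s in size]
ones = [1.0] * len(size)

# Linear control for size: the machine dummy soaks up the curvature.
linear = ols([ones, size, machine], outcome)
# Add a squared term: the apparent machine effect vanishes.
quadratic = ols([ones, size, [s ** 2 for s in size], machine], outcome)

linear_machine_effect = linear[2]
quadratic_machine_effect = quadratic[3]
print(f"machine coefficient, linear control:    {linear_machine_effect:.2f}")
print(f"machine coefficient, quadratic control: {quadratic_machine_effect:.2f}")
```

This shows only that such a discrepancy is possible when machine use tracks a demographic variable with a nonlinear effect. Whether that is what happened in New Hampshire is exactly the open question.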

I ran a few dozen experiments with a variety of classifiers and meta-classifiers and found that J48, which generates a C4.5 decision tree, and AdaBoost achieve the best results using my feature set. RandomForests perform poorly over a variety of parameters. I didn't experiment much with feature selection (such as by PCA, which takes too long) and I'm not sure which features others are using. Send me your datasets and I'll run them.

Features used
Town,Sqmiles,Votes,Municipalwater,Municipalsewer,
Totalhousingunits,Singlefamilyhomes,Multifamilyunits,
Manufacturedhomes,Totalpopulation,Medianage,Percenthighschoolgraduates,
Percentholdingbachelorsdegree,Totallaborforce,Totalemployed,
Totalunemployed,Percapitaincome,Medianhouseholdincome,Age5andunder,
Age5to19,Age20to34,Age35to54,Age55to64,Age65andup,Employeesinlargestbusiness,
MarkObama,MarkClinton,IncomeDev,EducationDev,Contested,PopDensity

AdaBoostM1
Relation: diebold

Correctly Classified Instances         204               88.3117 %
Incorrectly Classified Instances        27               11.6883 %
Kappa statistic                          0.7647
Mean absolute error                      0.1565
Root mean squared error                  0.3046
Relative absolute error                 31.5858 %
Root relative squared error             61.2877 %
Total Number of Instances              231

=== Detailed Accuracy By Class ===

TP Rate   FP Rate   Precision   Recall   F-Measure   Class
0.868     0.098     0.918       0.868    0.892       Hand
0.902     0.132     0.844       0.902    0.872       Diebold

=== Confusion Matrix ===

   a   b   <-- classified as
 112  17 |   a = Hand
  10  92 |   b = Diebold

J48
Relation: diebold

Correctly Classified Instances         206               89.1775 %
Incorrectly Classified Instances        25               10.8225 %
Kappa statistic                          0.7812
Mean absolute error                      0.1804
Root mean squared error                  0.3088
Relative absolute error                 36.4002 %
Root relative squared error             62.1423 %
Total Number of Instances              231

=== Detailed Accuracy By Class ===

TP Rate   FP Rate   Precision   Recall   F-Measure   Class
0.891     0.108     0.913       0.891    0.902       Hand
0.892     0.109     0.867       0.892    0.879       Diebold

=== Confusion Matrix ===

   a   b   <-- classified as
 115  14 |   a = Hand
  11  91 |   b = Diebold
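
For anyone who wants to check how the summary figures follow from the confusion matrices, here is a small sketch that recomputes them. The function name `summarize` and the dictionary keys are my own choices for illustration, not part of Weka. It assumes rows are the actual classes and columns the predicted classes, as in the output above.

```python
def summarize(matrix, labels):
    """Recompute accuracy, Cohen's kappa, and per-class rates from a
    confusion matrix whose rows are actual classes and columns predictions."""
    n = sum(sum(row) for row in matrix)
    k = len(labels)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[r][c] for r in range(k)) for c in range(k)]
    # Observed agreement is plain accuracy; expected agreement is what the
    # row and column totals alone would give by chance.
    observed = sum(matrix[i][i] for i in range(k)) / n
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n ** 2
    result = {"accuracy": observed,
              "kappa": (observed - expected) / (1 - expected)}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        fp = col_totals[i] - tp          # predicted as this class, but wrong
        recall = tp / row_totals[i]      # same quantity as the TP rate
        precision = tp / col_totals[i]
        result[label] = {
            "tp_rate": recall,
            "fp_rate": fp / (n - row_totals[i]),
            "precision": precision,
            "f_measure": 2 * precision * recall / (precision + recall),
        }
    return result


labels = ["Hand", "Diebold"]
adaboost = summarize([[112, 17], [10, 92]], labels)
j48 = summarize([[115, 14], [11, 91]], labels)

for name, stats in (("AdaBoostM1", adaboost), ("J48", j48)):
    print(f"{name}: accuracy={stats['accuracy']:.4%} kappa={stats['kappa']:.4f}")
```

The recomputed accuracy, kappa, and per-class rates agree with the reported values to the printed precision. The error measures (mean absolute error and so on) depend on the predicted class probabilities and cannot be recovered from the confusion matrix alone.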

I hate to add to all this noise, and I have neither fancy statistics nor the time and energy to produce any, but can't resist making two quick points:

1. This may be a good example of why Killeen (e.g. Killeen et al. An alternative to null-hypothesis significance tests. Psychological science : a journal of the American Psychological Society / APS (2005) vol. 16 (5) pp. 345-53) has suggested that null-hypothesis significance tests can be very misleading. If you don't know the prior probability distributions under the null hypothesis (which you usually can't, and definitely not in this case), then, as Fisher himself said, "Such a test of significance does not authorize us to make any statement about the hypothesis in question in terms of mathematical probability" (Fisher, 1959, p. 35). This is a problem we have all tried really hard to ignore because it calls into question much of what we do with statistics, but it seems like a particularly big problem for complex and uncontrolled (in the sense that people were not randomly assigned to conditions, not in the sense that you have not tried to control for confounding variables) correlational studies like this, where we really can't know the priors. Killeen's proposed solution is to use the probability of replication (p-rep) instead, but since this is currently just a mathematical transformation of p, I don't think it really solves the priors problem here (although it seems to have other advantages).
2. This (http://www.nytimes.com/2008/01/06/magazine/06Vote-t.html?ref=magazine) is a really interesting article in the NY Times about voting machines-- basically, everyone worries about them, but the public and the experts worry for very different reasons. The public tends to believe in deliberate fraud, while the experts seem to all agree that the problem is random error and lost votes due to crashes, which could cause very tight races to be cast into doubt. Optical scan of paper ballots is actually the preferred solution, since it enables hand re-counts.
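
On point 1, a short sketch of why p-rep carries no information beyond p. It assumes the commonly cited form of Killeen's statistic, p_rep = Φ(z / √2), where z is the normal deviate for a one-tailed p. That formula is my recollection of the 2005 paper rather than something quoted above, so treat it as an assumption.

```python
from math import sqrt
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def p_rep(p_one_tailed):
    """Probability of replication, assuming p_rep = Phi(z / sqrt(2)),
    where z is the standard normal deviate for a one-tailed p-value."""
    z = STANDARD_NORMAL.inv_cdf(1.0 - p_one_tailed)
    return STANDARD_NORMAL.cdf(z / sqrt(2.0))


# p_rep is a strictly decreasing function of p, so it ranks results
# in exactly the same order that p does.
for p in (0.2, 0.05, 0.01, 0.001):
    print(f"p = {p:<6} p_rep = {p_rep(p):.3f}")
```

Because the mapping is one-to-one, any two analyses that disagree in p will disagree in p-rep in the same direction, and nothing about the priors enters the calculation.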

I was about ready for the stats to show no problem. But then the Nashua Ward 5 result came out.

A systematic (human) error in that ward is somehow supposed to explain the scaling up of every candidate BUT Obama by 9%.

Hillary, Kucinich, Edwards, and Richardson all lost around 9% in the recount; Obama went up, but by well under 1%.

Question: to what extent were the models looking for a Hillary effect? I wonder whether comparing the Obama vote against the others' vote, or something similar, might find a stronger signal if it's there.

I have faith in the stats to pick up a problem, and they seemed to, but now they don't. Still, this pattern just doesn't seem reasonable given the explanation being provided:
1030 x 0.93 [HILLARY]
405 x 0.93 [EDWARDS]
9 x 0.88 [BIDEN]
72 x 0.96 [RICHARDSON]
673 x 1.01 [OBAMA]
