Nice analysis of why the Iranian election is probably fraudulent

There's an interesting article in the Washington Post today exploring one line of reasoning suggesting that the Iranian election is fraudulent. Basically, it comes down to this: the results aren't random enough. In a fair election, you'd expect each digit, from 0 to 9, to be the final digit of the results in each region roughly ten percent of the time: you'd see a vote count like 12,437 just as often as 12,435. But in fact certain digits come up more often:

The numbers look suspicious. We find too many 7s and not enough 5s in the last digit. We expect each digit (0, 1, 2, and so on) to appear at the end of 10 percent of the vote counts. But in Iran's provincial results, the digit 7 appears 17 percent of the time, and only 4 percent of the results end in the number 5. Two such departures from the average -- a spike of 17 percent or more in one digit and a drop to 4 percent or less in another -- are extremely unlikely. Fewer than four in a hundred non-fraudulent elections would produce such numbers.

You can't expect the first digits in a result to be random, because they represent tens of thousands of voters, and in any given region, one candidate probably is supported by more voters than the other candidates. But the final digits should be random in a fair election.
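
One way to get a feel for the "fewer than four in a hundred" figure is to simulate it. The sketch below is not Beber and Scacco's calculation. It assumes 116 vote counts (a number not given in this post), treats every last digit as equally likely, and uses the rounded 17 percent and 4 percent thresholds from the quote. The function name and its parameters are made up for illustration.

```python
import random
from collections import Counter


def spike_and_dip_rate(n_counts=116, high=0.17, low=0.04, trials=20_000, seed=0):
    """Estimate how often uniform last digits show one digit at or above
    `high` and another at or below `low` among `n_counts` vote counts."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Draw one last digit per vote count, each digit equally likely.
        tally = Counter(rng.choices(range(10), k=n_counts))
        shares = [tally.get(d, 0) / n_counts for d in range(10)]
        if max(shares) >= high and min(shares) <= low:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # n_counts=116 is an assumption, not a figure from the post.
    print(f"Estimated rate: {spike_and_dip_rate():.3f}")
```

Changing `n_counts` shows how sensitive the estimate is to the number of results being examined: with fewer counts, large swings in digit shares become much more common.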

For some reason, people seem to pick numbers ending in "7" as more "random" than other numbers. When we asked our readers to generate random numbers from 1 to 20, 7 and 17 were the most common answers, appearing almost three times as often as you'd expect if the numbers were truly randomly generated. Meanwhile numbers ending in 5 only came up about half as often as they should have. In fact, our results were quite similar to the Iran election results for those digits:

[Figure: iran2.jpg]

Beber and Scacco also found that the patterns in the last two digits of each number are not random. They calculate the chance of these two anomalous results in the elections occurring due to chance as less than 1 in 200.


Like many people I'm not convinced that the Iran elections were on the "up and up." That said...
So the phenomenon noted occurs 4% of the time in legit elections? Were there more than 25 elections in the world last year? If so, we'd expect at least one of those to fail this test. For me this isn't enough of a long shot to rally behind as evidence.

As for the additional post-hoc statistics, we're starting to get into numerology territory here. In my intro to stats class back in my undergrad days we talked about post-hoc tests when comparing two samples. Essentially, if you are looking for significance levels of 0.05 (at which many scientific claims are made) and you can devise at least 20 random tests between identical populations, you are likely to find a "significant difference" which doesn't truly exist.
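
The arithmetic behind both of these points is short enough to write out. This is a minimal sketch that assumes the tests (or elections) are independent, which real ones may not be; the function name is made up for illustration.

```python
def chance_of_false_alarm(per_test_rate, n_tests):
    """Probability that at least one of n independent tests flags by chance."""
    return 1.0 - (1.0 - per_test_rate) ** n_tests


if __name__ == "__main__":
    # 20 tests between identical populations at the 0.05 level.
    print(round(chance_of_false_alarm(0.05, 20), 3))
    # 25 fair elections, each with a 4% chance of looking suspicious.
    print(round(chance_of_false_alarm(0.04, 25), 3))
```

Both cases come out near 0.64, so "likely" is a fair description, but a false alarm is not guaranteed.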

Andrew/Ray:

Your points are well taken, but remember, these are two separate statistical tests. When they are combined, the researchers say the likelihood of both of them occurring is less than 1 in 200. I'm not great with stats, but my thumbnail calculation actually puts the probability at less than 1 in 500. So no, that's not impossible, but it's highly unlikely.

Also remember that it's not exactly a post-hoc test. The particular digit that's seen too often -- 7 -- is precisely the one that shows up in other tests where humans try to pick random numbers. The chances of a 7 showing up too much and a 5 too little in a sample are much lower than the chances of *any* digit showing up too much and *any* digit showing up too little.

Beber & Scacco developed the tests while studying other elections. For Iran, the two tests are merely suggestive individually, but they strongly point to fraud when combined.

The authors made an error in determining the combined probability though: the probability is not 1 in 200, but 1 in 700 (0.14%); i.e. the evidence is stronger than they suggest. I pointed this out to them and they concurred; there is a report about this online at Discover Magazine:
http://blogs.discovermagazine.com/discoblog/2009/06/22/update-irans-num…

Interesting, I was just reading about 'Benford's Law', which talks about this (at least I think it applies.. I'm no mathematician). Basically the early digits in statistically-gathered numbers occur in a logarithmic distribution, while later digits should be closer to random if I understand correctly:

http://www.mathpages.com/home/kmath302/kmath302.htm

Russ: It does apply, but only with small numbers. Once you're talking about 3 digits or more, the last digit is very nearly completely random. Since we're typically talking about 5- or more digit numbers, I think we can safely ignore it.

The probability for each last digit isn't (quite) 1 in 10 due to potential issues with Benford's Law (assuming it applies in this case, which it doesn't necessarily), although since the effect diminishes rapidly towards a uniform distribution, the effect would probably not be significant on numbers of this size. And even if it were an issue, it'd make 5's more likely than 7's, not the reverse.

@Miko - the authors' previous work demonstrates the (empirically obvious) fact that the last digits are uniformly distributed for any distribution you'd expect in an election. Benford's law applies to the first digit (and to a much lesser degree the second and third digits).
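
For anyone who wants to check how quickly Benford's law flattens out, here is a small sketch using the standard generalization of the law to later digits. It only describes numbers that actually follow Benford's law, which election counts may not, and the function name is made up for illustration.

```python
import math


def benford_digit_probs(position):
    """Return [P(digit = 0), ..., P(digit = 9)] for the digit at `position`
    (1 = leading digit) under Benford's law."""
    if position < 1:
        raise ValueError("position must be >= 1")
    if position == 1:
        # A leading digit is never 0.
        return [0.0] + [math.log10(1 + 1 / d) for d in range(1, 10)]
    # Sum over every possible prefix of (position - 1) digits.
    start, stop = 10 ** (position - 2), 10 ** (position - 1)
    return [
        sum(math.log10(1 + 1 / (10 * prefix + d)) for prefix in range(start, stop))
        for d in range(10)
    ]


if __name__ == "__main__":
    for pos in (1, 2, 3, 5):
        probs = benford_digit_probs(pos)
        # Gap between the most and least likely digits at this position.
        spread = max(probs) - min(p for p in probs if p > 0)
        print(pos, f"{spread:.6f}")
```

The gap between the most and least likely digits shrinks by roughly a factor of ten with each position, and at every position a 5 is slightly more likely than a 7, never the other way around.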

@Dave Munger - as I pointed out on another Discovery blog it's misleading to state that there's a 1 in 200 (or 1 in 600 as actually is the case) chance of this happening randomly. That's true, but there are hundreds of other equivalent events with a 1 in 200 chance of happening. There's no evidence that the authors established this single "one number too frequent and one too infrequent" test before looking at the data. In their previous work on Nigerian elections, they draw the same conclusions from a much different result.

I applied the correct test (I think, I'm not an expert in this) and there's no reason to doubt that the numbers are random - http://alchemytoday.com/2009/06/24/is-the-devil-in-the-digits/

Zach,

I don't buy your assertion that there's no established pattern with 5 or 7. Our little "study" confirms it, and I've seen it elsewhere. Got a link?

My professional assessment: Post-hoc drivel.

@Dave Munger

Looking at your study, I see a periodicity in the false random results with people trying to predict numbers within a given range like that -- the 17 result is interesting, but you'd also have an autocorrelation peak at a lag of 6 or so, with the exception of avoiding the first number. I suspect if you randomly ordered the numbers in the poll you wouldn't get that result, and that it's possibly an artifact of showing the numbers in that order. In previous studies, groups were asked to write a random string of digits long enough that you wouldn't see that effect (generating a random number between 0 and 10^125 or something like that). Saying "pick a random number between 0 and 9" will give you different results than looking at the last digit of "pick a random number between 0 and 10,000."

I won't link a PDF here, but you can find Beber & Scacco's Nigerian work here - http://www.columbia.edu/~bhb2102/research.htm

I quote one of the relevant points on my site:

We showed that we can expect the last digits of electoral results to occur with equal frequency given a wide range of distributional assumptions, and we then emphasized the fact that humans tend to be biased in the production of random numbers: They tend to select small digits, avoid repetition, and favor adjacent numerals.

If you read through that article, you can see what specific numbers they're talking about. They aren't 7 and 5, I believe.

Hoping that all of the text after the link doesn't stay blue after I click post... it's like that in the preview :)

Good points, Zach. The digits Beber & Scacco are talking about are a different scenario too, though -- low digits are chosen in long sequences of numbers, not single "random" numbers like the fabricated final digits of a 5- or 6-digit number.

It's possible that this result really is a product of chance, but when combined with all the other indications that foul play is involved in the Iranian election, I still think it's an important piece of evidence.

After reading this I decided to test it out. The random number chosen was 17. It was pretty amazing. Of course there is a 1/20 chance that it was pure coincidence, and it does not prove the theory, but there might be something behind that 7 thing.
This is of course no irrefutable proof, but considering all the signs I'd say it is rather likely that the elections were rigged.

By Liudvikas (not verified) on 25 Jun 2009

I definitely don't dispute that foul play was involved to some degree - see Mousavi's complaints on election day about poll monitors, intimidation, etc. More importantly, the whole Iranian elections system is flawed. I think the best solution is a new, monitored election - I suspected that Iran was moving in that direction a day or two ago but now it looks doubtful. The only thing I dispute is that there's solid evidence that the numbers we've seen so far don't reflect a realistic distribution.

Prof. Mebane's work looking at the 2nd digit at various levels of coarse-grained results is more interesting. His methods seem too empirical for me to be comfortable with them, but I haven't read enough to judge. The thing is, there's a perfectly fine experiment to run here -- take the reported result, randomly shift around a few hundred votes in each province (or county, or ballot box), and then see how random the data looks. This will tell you how likely it is that an election with a similar result will pass the various tests that are being cooked up.
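
A rough sketch of that experiment might look like the following. Everything here is an assumption made for illustration: the function names, the 300-vote cap on shifts, the spike-and-dip check (using the 17 percent and 4 percent figures quoted in the post), and above all the data, which is randomly generated placeholder numbers and not the reported Iranian results.

```python
import random
from collections import Counter


def shift_votes(province, max_shift, rng):
    """Move up to `max_shift` votes away from each candidate to a random
    rival, keeping the province total unchanged."""
    counts = list(province)
    for src in range(len(counts)):
        dst = rng.choice([i for i in range(len(counts)) if i != src])
        moved = rng.randint(0, min(max_shift, counts[src]))
        counts[src] -= moved
        counts[dst] += moved
    return counts


def looks_suspicious(counts, high=0.17, low=0.04):
    """Flag a spike in one last digit together with a dip in another."""
    tally = Counter(c % 10 for c in counts)
    shares = [tally[d] / len(counts) for d in range(10)]
    return max(shares) >= high and min(shares) <= low


def flag_rate(provinces, max_shift=300, trials=5_000, seed=0):
    """Share of perturbed copies of the results that the check flags."""
    rng = random.Random(seed)
    flagged = 0
    for _ in range(trials):
        perturbed = [shift_votes(p, max_shift, rng) for p in provinces]
        flat = [c for p in perturbed for c in p]
        flagged += looks_suspicious(flat)
    return flagged / trials


if __name__ == "__main__":
    # Placeholder data only: 29 made-up provinces with 4 made-up candidates.
    demo_rng = random.Random(42)
    fake_results = [[demo_rng.randint(1_000, 900_000) for _ in range(4)]
                    for _ in range(29)]
    print(f"Flag rate on perturbed placeholder data: {flag_rate(fake_results):.3f}")
```

To run it for real you would replace `fake_results` with the reported counts and swap `looks_suspicious` for whichever test is being evaluated.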

Alternatively, contact the local ballot counters (ballots are counted locally with results electronically transmitted to Tehran) and see if their memory squares with the reported result. The actors here are all powerful within the Iranian government and undoubtedly a large number of people involved in vote counting are in Mousavi's camp.

As far as these numbers being different, they're similar to the Nigerian numbers, and Beber/Scacco identified excess low numbers as evidence of fraud there.

Get real.

Maybe there was something odd, maybe not. We'd like to believe there was something odd, because we don't like the result.

While there may have been some corruption, the facts surrounding the way the Iranian electorate tends to vote continue to point to Ahmadinejad winning:
- Most who voted for M are liberals who live in major metro areas.
- Those who lived outside of those areas tended to vote for the guy we don't like - this is where the majority of the Iranian population resides, and they tend to not be very liberal minded...
- Regardless of what we think, Ahmadinejad remains popular because he represents values of the masses - piety, anti-corruption (against the religious aristocracy), and he's very strong on Iranian security. These are very compelling to many Iranians, and traditional weak suits of liberal politicians.
- Most of the news coming out of Iran is skewed - it's either by Westerners, or West-leaning Iranian liberals (they know how to use the technology and have access to release their information).

So, while I hope that somehow Ahmadinejad ends up looking for a new job, corruption or not I don't hold out high hopes.

By faceinthecrowd (not verified) on 25 Jun 2009

@Nick Gogerty

It's questionable to apply Benford's law to election results in general, and it's definitely not relevant when it comes to the last two digits of election returns with four or more digits. Dr. Mebane's work specifically looks at Benford's law in the 2nd digit because of empirical observations that it doesn't necessarily hold for the 1st in fair elections. The authors here are right to expect uniformly distributed last and second-to-last digits; they're just wrong to conclude that that's not what we're seeing in Iran.

Yes, very convincing statistics. It's as if Ahmadinejad is holding a smoking gun.

For the rest what faceinthecrowd said.

Hi. I wish you could stay focused on the USA's internal affairs the way you stay focused on Iran's internal affairs. That way we would have a better U.S.A. Do the same job on the info released on homeless men and women and on institutionalized racism.

@faceinthecrowd:
An Ahmadinejad win was projected, but not in one round and not with the overwhelming majority he received, certainly not with the high turnout reported, because a high turnout means that a lot of young, liberal voters voted.

Technically, these results show that some kind of "bias" may exist in the election results. However, it does not show that the elections were _fraudulent_.

Addition of a simple constant number of fake votes certainly won't create this bias. Rounding errors (Ahmadinejad got 234 votes, let's round that to 300) might introduce a bias, but it is hard to imagine a rounding method that will create extra 7s. One would think rounding up would cause a bias towards more 5s and 0s.

Further, expectation of randomness is not sufficient when evaluating a result that can be affected by so many parameters. It is important to establish what the histogram of such a complex process looks like, and standard deviations will determine any bias. Unfortunately, this experiment doesn't seem reproducible with any kind of fidelity - and the point is moot.
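
For the simplest baseline, where every last digit is equally likely, the histogram and its standard deviations can be written down directly. The sketch below assumes that uniform baseline, which is exactly the assumption being questioned here, and the sample numbers are made up and not real election data; the function name is also just for illustration.

```python
import math
from collections import Counter


def last_digit_z_scores(vote_counts):
    """Return {digit: z-score} comparing last-digit counts with a
    uniform expectation (each digit with probability 1/10)."""
    n = len(vote_counts)
    if n == 0:
        raise ValueError("need at least one vote count")
    expected = n * 0.1
    std_dev = math.sqrt(n * 0.1 * 0.9)  # binomial standard deviation
    tally = Counter(c % 10 for c in vote_counts)
    return {d: (tally[d] - expected) / std_dev for d in range(10)}


if __name__ == "__main__":
    # Made-up counts purely for illustration, not real election data.
    sample = [12437, 12435, 80317, 4507, 99127, 3021, 55558, 7777, 61004, 23189]
    for digit, z in last_digit_z_scores(sample).items():
        print(digit, f"{z:+.2f}")
```

Because ten digits are examined at once, a single z-score around 2 is not surprising by itself; the more complex the real process, the more this baseline would need to be replaced with a simulated one.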

It's an interesting observation, though not one that you can draw conclusions of foul play from.

Reading thru these posts it occurs to me that those who did not understand the analysis are unable to appreciate the math. Those who can't appreciate math, usually have little capacity for logic, as the two are much alike. While the math is indisputable, the first clue to me that the results are fraudulent was the fact that the winner was announced before all the votes were counted. Now that's a clue....

By Paul Apollonio (not verified) on 27 Jun 2009

This article reminded me of the movie "The Jerk", specifically the scene where the Jerk escapes from a shooter into a carnival lot, where there is a sign that says "carnival personnel only". When Steve Martin (the Jerk) drives into the carnival area in order to save his backside, the shooter says "Hey, you are not carnival personnel".
I think we all sound like the shooter guy when we shout "Iranian election is fraud".
What do we expect from Iran? Really! What do we think we are dealing with? Democracy? In today's Iran? They would disappear from the face of the earth and from history if they implemented democracy at this point. They have what they deserve and what they need in order to survive.
I think we are making a big mistake in evaluating Iran by looking at the Iranian people living in the USA. They are one of the smartest, most educated, most creative, most successful and intelligent groups of people I have met, of any ethnicity in the USA. And they don't have the same problems as the people living in Iran. I don't know how much they represent Iran.
We should do more homework and learn more about Iran and Iranians before we can develop expectations for their next president. Presently, both candidates are the same anyway. I guess what made this election important was the fact that it was an experiment for us to see how much and how we can influence the Iranian people in the future. And I think the results are very encouraging. There is hope. Not just hope. Freedom will win.

Great work!

Now, why don't you use this method on the results of the Iranian elections in the twenty-five years between the CIA-orchestrated overthrow of the Iranian leader Mossadegh in 1953 and the fall of the Shah in 1979. We can be quite confident that these elections had problems since President Obama has just acknowledged the US role in overthrowing democracy in that country!

Please do this quickly -- or people will think that rather than being scientists, you are just opportunistic stooges of the neoconservative axis of evil.

By Tony Lawless (not verified) on 28 Jun 2009

I warn you, I have nothing to say about the Iranian election. I know nothing, though everyone in the blogosphere seems to suddenly be a pundit and I feel very left out.

The 7s, on the other hand, are fascinating. Personally, if asked to choose a random number, I always choose one ending in 7. I didn't think of it as a pattern until now, but it's true. More interesting than my number thoughts is the fact that in the type of obsessive compulsive disorder where people do things in sets of numbers, or have to do everything a set number of times, 7s are one of the most common obsessions (along with 2 and 4).

I don't remember where I read that, and now I can't find it. This is the most annoying thing ever, because I hate to not cite. I will keep looking.

Funny how the narrative about "elections" continues here, replicating the media meme. If the American presidential election were between Bush, Palin and McCain, with the others shut out by the Supreme Court (or, if you prefer, pick only the Democratic side), would it still be an election?

-------------------------

I think for this conclusion to have more value, there is a need for research into how frauds are actually carried out, so that it can be fed into the testing.

By lucklucky (not verified) on 11 Jul 2009