This week’s article on the “most random” number was the most popular post ever on Cognitive Daily. The stats aren’t all in yet, but so far the post has been viewed at least 40,000 times. It wasn’t long ago that 40,000 was a good *month* for Cognitive Daily! Since comments and questions about the project were spread over at least four different threads, as well as at least a dozen posts on other blogs, I thought I’d sum up some of the questions about our poll and the results in one place.

We polled 347 CogDaily readers, asking them to simply “think of a random number between 1 and 20,” and found that the number 17 was chosen significantly more frequently than it would be in a truly random sample.

Some of the best responses:

- Shouldn’t you have called 17 the *least* random number, since it was the number picked most frequently?
- How likely is it that this result is due to chance? “62 people out of 347 replied 17. To give a flavor, the chance of more than 30 people saying 17 is 1 in a thousand. The chance for 40 or more is 1 in a million. The chance for 62 or more choosing 17 is practically nil (unless 17 is really the most popularly chosen random number).” [*note: I haven't checked these calculations --DM*]
- “17 is a prime example of a random number. *math pun*”
- Shouldn’t you have asked people to pick a number *from* 1 to 20? (yes, we should have!)
- Dilbert on random numbers
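For anyone who wants to check the commenter's figures, here is a minimal sketch of the calculation. It assumes each respondent independently has a 1-in-20 chance of picking 17, so the number of 17s follows a binomial distribution. The function name `binomial_tail` and the constants are illustrative choices, not anything from the original poll analysis.

```python
from math import comb

def binomial_tail(n, p, k):
    """Chance of k or more successes in n independent trials with success chance p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

N_RESPONSES = 347      # poll size reported in the post
P_SEVENTEEN = 1 / 20   # assumption: a truly random pick is 17 five percent of the time

# "More than 30" means 31 or more; the other thresholds are "or more" as quoted.
for threshold in (31, 40, 62):
    chance = binomial_tail(N_RESPONSES, P_SEVENTEEN, threshold)
    print(f"{threshold} or more 17s: {chance:.3g}")
```

Running it prints the exact tail probabilities under those assumptions, which can then be compared with the quoted "1 in a thousand" and "1 in a million" figures.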

Some of the best questions (with my responses):

Why make the computer calculate only 347 points? If you want to find the distribution you should take much more points than that (it would only take some seconds to get a perfectly uniform distribution).

Because there were 347 responses to the poll. I wanted to see if computer-generated random numbers showed as much variance from the theoretical distribution as humans do.

It’s silly to even plot the computer output. If you did more trials, and had a proper random number generator then the distribution would be even along all the numbers. Always.

You’re misunderstanding “random.” The reason I plotted the computer’s random numbers is to show how much variance there is in a set of random numbers the same size as our poll’s sample. The dotted line on each graph shows the theoretical limit that a random sample approaches, but with a small sample, there will always be variance in a set of random numbers. As I said in the original post, if you roll a die six times, it’s extremely unlikely that you’ll get one of each number.

Different algorithms may output different numbers, so those “computer” numbers shows nothing relevant.

They are a reasonable set of random numbers. I used the generator at random.org to come up with my list of random numbers.

But that’s beside the point. Think of it this way: I could have just compared the percentage of “17” guesses with the theoretical number of times 17 should appear in a random sample, but I wanted to assess our poll using a stricter standard. The point is this: it’s possible that there were more 17s than any other number just due to random chance. If you roll a die 6 times, and the number “2” comes up twice, you don’t have enough evidence to show that the die is biased.

Yes, “17” was chosen significantly more frequently than the 5 percent of the time we would expect, but even in a truly random sample of finite size, we would expect some numbers to come up more than 5 percent of the time and some to come up less. By comparing the “17” count to the most common number from the computer-generated set of numbers, we can see if the large number of “17” responses the humans came up with reflected a true bias, or if they might have been due simply to chance.
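The comparison described above can be repeated many times in a quick simulation. The sketch below is an illustration, not the procedure used for the post: it uses Python's built-in pseudo-random generator as a stand-in for random.org, and the seed, the number of trials, and the name `most_common_count` are arbitrary choices.

```python
import random
from collections import Counter

def most_common_count(rng, n_responses=347, low=1, high=20):
    """Draw n_responses uniform picks from low..high; return the top number and its count."""
    picks = [rng.randint(low, high) for _ in range(n_responses)]
    number, count = Counter(picks).most_common(1)[0]
    return number, count

rng = random.Random(17)   # arbitrary seed so the run is repeatable
trials = 2000             # arbitrary number of simulated "polls"

# For each simulated poll, record how often its most popular number came up.
peak_counts = [most_common_count(rng)[1] for _ in range(trials)]
share_reaching_62 = sum(count >= 62 for count in peak_counts) / trials

print("largest peak count seen:", max(peak_counts))
print("share of simulated polls whose top number reached 62:", share_reaching_62)
```

Every simulated poll has some number that comes up more than 5 percent of the time, which is the point of the stricter standard: the question is whether a random winner ever gets anywhere near 62 of 347.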

There is certainly a way to do this using a statistical calculation (in fact the 90 percent confidence interval around the theoretical 5 percent value would probably do it), but I wanted to demonstrate it in a way that was clearer to people who do not have a background in statistics.
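As a rough sketch of that statistical calculation, the interval can be approximated with the normal approximation to the binomial. This assumes a two-sided 90 percent interval around the theoretical 5 percent value for a sample of 347; the function name `normal_interval` is made up for this example, and an exact binomial interval would differ slightly.

```python
from math import sqrt
from statistics import NormalDist

def normal_interval(p, n, confidence=0.90):
    """Two-sided interval for a sample proportion, using the normal approximation."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.645 for 90 percent
    margin = z * sqrt(p * (1 - p) / n)
    return p - margin, p + margin

low, high = normal_interval(0.05, 347)
observed = 62 / 347  # share of respondents who picked 17, per the quoted comment

print(f"90 percent interval around 5 percent: {low:.3f} to {high:.3f}")
print(f"observed share of 17s: {observed:.3f}")
```

Under these assumptions the observed share of 17s falls well above the upper end of the interval.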

The only problem with this is that it’s highly likely many of those who took the survey had read the Pharyngula and/or Cosmic Variance post, compromising the results.

I would think if the results were compromised, they would have been compromised in the *opposite* direction; that is, people would have tried to think of some number *other* than 17. I also think nearly all CogDaily readers are interested in good research, and so were participating in good faith — in nearly all the “casual” studies we’ve conducted, participants have been exceptionally careful to be sure they aren’t polluting our results.

Finally, some misconceptions expressed by commenters:

- My wife said “17” so it must be true
- You only surveyed 347 people, so your results don’t apply to the general population
- I really like the number 17, so that’s why it’s the most random
- 17 has the most syllables (*17, meet 11*)
- Random numbers are perfectly distributed
- The question never asked the reader to respond in a random manner (*yes, it did*)
- The study author didn’t take statistical significance into account (*yes, he did*)
- It’s interesting that the computer came up with 19 as the most random number (*no, it’s random*)

And speaking of “random,” how random are the British when they pick their computer passwords? (Liverpool and arsenal seemed pretty random to me until I realized they were both Premier League teams). I’d love to see a similar U.S. list (redsox, anyone?).