The results of the estimation contest are in. There were 164 serious entries (I excluded the $12,000 and $1,000,000 "guesses" from the final data). The mean value guessed by commenters was $83.30, and the median was not far off, at $77.12. The standard deviation was high ($43.10), but as you would expect with a large sample, the standard error (or standard deviation of the mean) was small, $3.37.
Or, in convenient graphical form:
That's a histogram with $20 wide bins showing the number of guesses in a given range. A pretty nice distribution, on the whole.
The red line indicates the actual total value of the change in the box, $165.26 (not counting $1.10 in Canadian coins that the bank wouldn't take). Congratulations to Michael Day at comment #48, who wins the contest with his guess of $166.32, just $1.06 off from the actual total. Send me a mailing address, and I'll send you a galley proof of How to Teach Physics to Your Dog.
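For anyone who wants to check the arithmetic, here is a minimal sketch of how the standard error follows from the summary numbers quoted in this post, and how far the mean guess sits from the actual total when measured in standard errors. The variable names are only for illustration; the inputs are the figures given above, not the raw guesses.

```python
import math

# Summary figures quoted in the post (not the raw contest data).
n_entries = 164
mean_guess = 83.30
std_dev = 43.10
actual_total = 165.26

# Standard error of the mean: standard deviation divided by sqrt(sample size).
std_err = std_dev / math.sqrt(n_entries)

# How far below the actual total the mean guess landed.
gap = actual_total - mean_guess
gap_in_std_errs = gap / std_err

print(f"standard error of the mean: ${std_err:.2f}")
print(f"mean guess is ${gap:.2f} low, about {gap_in_std_errs:.0f} standard errors")
```

A gap that large compared to the standard error suggests the miss is systematic rather than sampling noise, which is what the rest of the post tries to explain.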
Now, why were the rest of the guesses so far off? Isn't the "wisdom of crowds" effect supposed to make an average of a large number of estimates better than any individual guess?
Well, for one thing, it's a little difficult to make an accurate estimate of the size of the box and the depth of the coins from the image I provided. The angle of the shot makes it particularly tricky to gauge the depth.
Another factor is that my behavior with regard to coins is a little different from most people's. Whenever possible, I discard pennies in those "take-a-penny, leave-a-penny" cups or boxes. As a result, quarters, nickels, and dimes are overrepresented in my change box relative to those of people who keep pennies. People trying to estimate based on past experience were almost inevitably going to come in low.
Another likely factor would be the "winner's curse" effect discussed by the Albany Math Circle blog. People seem to systematically underestimate the value of large collections of coins. Though, admittedly, a factor of more than 2 seems a bit much.
Finally, I think there was probably a collective effect, in that the first batch of guesses were way low, and later guessers took those as some sort of consensus value and looked in the same range. The median separation between guesses was $0.67, so they're really very tightly bunched in the middle, which I think says something about the psychology of people entering the contest.
Other guesses? Comments, questions, complaints about the contest?
(For the record, I would've guessed $130, based on assuming the box was about half full, and the sample was entirely quarters with a volume filling factor of 50%. So I would've lost my own contest...)
One thing was that I think we were mostly underestimating just how deep your box was. It was hard to tell from the picture whether it was as little as 2" deep, or as much as 6" deep. I ended up putting in a second guess based on the box being almost twice as deep as I thought it was, and the average of the two guesses was just about spot on.
For your next experiment, I suggest that you give a round of feedback to the guessers. Give the graph as above, but without showing the answer. They get a 2nd guess, now knowing the distribution of other guessers.
In my experience (with the ARG community mainly) crowdsourcing doesn't work when averaging the crowd's guesses. Sadly, that approach doesn't seem to do terribly well; instead you just get a poll of people's expectations, which is vulnerable to a ton of biases.
It works by having a bunch of ideas compete in public: the crowd acts as a huge parallel processing network that tries multiple approaches simultaneously (while "showing your work") and sees which approach arrives at the most accurate/useful/predictive/etc. conclusion. The idea isn't to run a popularity contest. Rather, of many different approaches taken at once, one is going to be better than the rest, and the larger the pool of approaches, the closer the crowd's "best" solution will be to an optimal one.
Crowdsourcing is also very good at digging out obscure information, because (in my personal experience) in a big enough crowd you're going to find an antiquities enthusiast with enough physics literacy to recognise the connection between bowdlerised Babylonian mythology and M-theory necessary to solve the puzzle.
-- Steve
Because the later guessers knew the guesses of those who came earlier. It's a systematic and well-known bias. I really do recommend you read Surowiecki's book "The Wisdom of Crowds," review here, which discusses the conditions under which a crowd is indeed "wise" very nicely.
Next time, have them send you the guesses or something similar. I'd be interested to see if it improves the result!
It's primarily the depth effect, I suspect. The only other possibility is the penny density. You'd have to actually tell me the depth to know.
My guess was made based on my results from the previous weekend dumping out my own mug of change into the CoinStar machine, in exchange for an Amazon gift certificate. Since I know the dimensions of my mug (I measured it) and I had the receipt from CoinStar, the math was elementary.
Because I did that, I cheerfully disregarded everyone else's guess and certainly didn't rely on naive intuition.
Well, for one thing, it's a little difficult to make an accurate estimate of the size of the box and the depth of the coins from the image I provided. The angle of the shot makes it particularly tricky to gauge the depth.
I'd noticed that. One of the guessers ahead of me provided his estimate of the dimensions of the box. I thought his guess was low (I'd eyeballed it as 10x6 inches filled to a depth of 1.5 inches).
Whenever possible, I discard pennies in those "take-a-penny, leave-a-penny" cups or boxes.
Where I obviously got it wrong was the density of money. I don't have much experience here (I try to spend my change, especially pennies), but it seemed to me that you had more quarters and pennies than dimes and nickels. Which is exactly what you expect: in any given transaction you might acquire up to four pennies and three quarters, but in most cases you would not get more than two dimes without a nickel, or one dime with a nickel. So I assumed that each cubic inch would have 3 quarters, one dime, one nickel, and four pennies for a total of $0.94. But if you were leaving your pennies behind, that would skew the ratio. Also, money probably packs a bit more densely than I naively assumed.
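To make the arithmetic in this comment concrete, here is a rough sketch that multiplies the eyeballed box dimensions (10 x 6 inches, filled to 1.5 inches) by the assumed coin mix per cubic inch. The variable names are made up for illustration, and the comment doesn't say what guess was actually submitted, so the result is only what these stated assumptions imply.

```python
# Coin values in cents, and the commenter's assumed count of each per cubic inch.
coin_value_cents = {"quarter": 25, "dime": 10, "nickel": 5, "penny": 1}
coins_per_cubic_inch = {"quarter": 3, "dime": 1, "nickel": 1, "penny": 4}

# Value packed into one cubic inch under that assumed mix.
cents_per_cubic_inch = sum(
    coin_value_cents[coin] * count for coin, count in coins_per_cubic_inch.items()
)

# Eyeballed box: 10 in x 6 in, filled to a depth of 1.5 in.
volume_cubic_inches = 10 * 6 * 1.5

estimate_dollars = cents_per_cubic_inch * volume_cubic_inches / 100

# Density the same volume would need in order to hold the actual total of $165.26.
required_dollars_per_cubic_inch = 165.26 / volume_cubic_inches

print(f"assumed density: ${cents_per_cubic_inch / 100:.2f} per cubic inch")
print(f"implied estimate: ${estimate_dollars:.2f}")
print(f"density needed for the actual total: ${required_dollars_per_cubic_inch:.2f} per cubic inch")
```

If the volume guess were right, the coins would have to be worth roughly twice as much per cubic inch as assumed, so some mix of a deeper box, fewer pennies, and tighter packing has to make up the difference.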
Coming in fairly late, I deliberately guessed toward the center of the distribution. My actual gut feeling was that the amount was more like $50. So I was way off.
Bee beat me to it. The later guessers' guesses were not independent of the earlier guessers' guesses. It would be interesting to repeat the experiment with the guesses being kept confidential.
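The independence point can be illustrated with a toy simulation. This is only a sketch under made-up assumptions, not a model of the actual contest: every guesser has an unbiased but noisy private estimate, and a "herded" guesser blends that estimate with the running average of the guesses already posted. The true value, noise level, and blending weight are all hypothetical.

```python
import random
import statistics

def crowd_mean(n_guessers, truth, noise_sd, herd_weight, rng):
    """Average guess when each guesser mixes a private estimate
    with the running mean of all earlier guesses."""
    total = 0.0
    for k in range(n_guessers):
        private = rng.gauss(truth, noise_sd)  # unbiased, noisy private estimate
        if k == 0:
            guess = private  # the first guesser has nobody to follow
        else:
            guess = herd_weight * (total / k) + (1 - herd_weight) * private
        total += guess
    return total / n_guessers

def spread_of_crowd_mean(herd_weight, trials=500, seed=0):
    """Standard deviation of the crowd's mean across many repeated contests."""
    rng = random.Random(seed)
    means = [crowd_mean(164, 100.0, 40.0, herd_weight, rng) for _ in range(trials)]
    return statistics.stdev(means)

independent = spread_of_crowd_mean(herd_weight=0.0)
herded = spread_of_crowd_mean(herd_weight=0.7)

print(f"spread of crowd mean, independent guesses: {independent:.2f}")
print(f"spread of crowd mean, herded guesses:      {herded:.2f}")
```

In this toy model the herded crowd's average is much less reliable from one contest to the next, because the first few guesses end up carrying a large share of the weight. It does not by itself explain why the guesses here came in low; it only shows that averaging stops helping much once guesses depend on each other.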
Wow, I actually won a contest! My strategy, similar to the approach of at least one other commenter I read, was one that worked for me once when I was guessing the number of jelly beans in a jar. I took a long time counting the coins in the top "layer" of your box (noticing that there were very few pennies), and assumed that the coins in that top layer were representative of the whole box. Then I took a (somewhat wild) guess at the number of "layers" of coins in the box. So, one bit of absolute counting, one assumption, and one guess! I have sent my address to you in an email. Thanks for the blog and I look forward to the galley proof!
Don't worry, you can still use the Canadian coins in a vending machine to buy a snack. Except maybe if one of them is a loonie.
Ah, well. Maybe next time!
Bee's got the main reason, I think. The wisdom of the crowds only works when guesses are independent. I can't remember where, but I've read articles about studies finding that effect.
--
I don't think the Albany Math Circle thing has much to do with anything, and I think you've taken the wrong conclusion. I don't think the story shows evidence that people underestimate the value of collections of coins.
The goal of that contest was very different from the goal of your contest. There, the goal was to get close to the value *without going over*. Not only that, but you would actually be penalized for going over if you won, so you were better off shooting what you expected to be low.
In the contest here, there's no penalty for going over, so the goal is different.
@MRW: I don't think the Albany Math Circle thing has much to do with anything, and I think you've taken the wrong conclusion. I don't think the story shows evidence that people underestimate the value of collections of coins.
The description suggests that the participants in the game were asked to both make bids and report estimates. The winner's curse is a reason for them to have bid lower than their estimates but it seems that the estimates they reported were low too.
Another factor is that my behavior with regard to coins is a little different than most people. Whenever possible, I discard pennies in those "take-a-penny, leave-a-penny" cups or boxes. As a result, quarters, nickels, and dimes are overrepresented in my change box relative to those of people who keep pennies.
Yeah, this is a big part of it, I suspect. I tend to accumulate mostly pennies. It was clear from the photograph that the top layer of your box had a much higher density of quarters than change that I accumulate, but I don't think I took it into account when guessing.
@Ian - Whoops, I missed that
I agree with those above who say it's not a winner's curse phenomenon. If there'd been an auction of the jar, it's most likely that the winning bidder would be one of the few people who overestimated its value, and that person could well have suffered winner's curse, depending on how close his last bid came to his estimate. (Sophisticated bidders will take winner's curse into account, and will not bid as high as their estimate.)
But there's no auction, and hence no winner's curse here.
There is, however, apparently a systematic tendency for people to err on the low side in guessing coin values. (That was also evident in the MBA experiments. Their estimates--not their bids, but their estimates--averaged 65% of the true value.) So the MBAs actually did better than the Uncertain Principles readers in their estimates.
However, they also got considerably more information than the Uncertain Principles readers. They had a clear glass coin jar, which they were able to physically examine up close.
I also agree that there are problems of "information cascades" and "herd instinct" behavior here due to the fact that people were taking each other's guesses into account. This same sort of thing contributes to bubbles and panics in financial markets.
I discuss these phenomena and my own strategy at greater length here:
http://albanyareamathcircle.blogspot.com/2009/08/information-cascades-a…
I rarely notice crowds showing any wisdom. I suppose there are some things that crowds can predict, for example, crowd behavior, but otherwise it is just a consensus. There is wisdom in individuals in a crowd. If you are standing on a street corner in Manhattan and wonder aloud, "What is the neutron capture cross section of a uranium atom?", odds are someone in the crowd will have the answer.
I guessed based on my experience as a regular change-collector. I think that your considerations (hard to tell the real volume of the box; skew due to lack of pennies) are robust.
The "fewer pennies" thing in particular seems a likely factor. When I roll my coins (yep, I roll 'em! it's like knitting--gives me something to do with my hands while I watch TV), the distribution is consistently mostly quarters, next most pennies, and dimes and nickels are similar in quantity.
In section 2 of this statistics pedagogy paper, there is a neat example of a coin-guessing decision theory problem. In this game, if you guess the exact value of the coins, you win all of them. (You win nothing otherwise.) What should you guess? It turns out that it's optimal to guess higher than what you think the most likely value actually is, since that will maximize your expected winnings. You can work out a formula for what you should guess assuming your uncertainty is described by a normal distribution.
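Here is one way to sketch that calculation. This is my own reconstruction rather than the paper's formula, and it assumes the chance of hitting the exact value is proportional to the normal density at the guess. Expected winnings are then proportional to g * exp(-(g - mu)^2 / (2 sigma^2)), and setting the derivative to zero gives g^2 - mu*g - sigma^2 = 0, so g = (mu + sqrt(mu^2 + 4 sigma^2)) / 2. The values of mu and sigma below are hypothetical.

```python
import math

def expected_winnings(guess, mu, sigma):
    """Proportional to guess * P(exact hit), with P taken as the normal density (assumption)."""
    return guess * math.exp(-((guess - mu) ** 2) / (2 * sigma ** 2))

def optimal_guess(mu, sigma):
    """Positive root of g^2 - mu*g - sigma^2 = 0, from setting the derivative to zero."""
    return (mu + math.sqrt(mu ** 2 + 4 * sigma ** 2)) / 2

# Hypothetical belief: most likely value $100, uncertainty of $20.
mu, sigma = 100.0, 20.0
best = optimal_guess(mu, sigma)

# Brute-force check over a grid of one-cent steps from $0.01 to $300.00.
grid = [cents / 100 for cents in range(1, 30001)]
grid_best = max(grid, key=lambda g: expected_winnings(g, mu, sigma))

print(f"most likely value: ${mu:.2f}")
print(f"closed-form optimal guess: ${best:.2f}")
print(f"grid-search optimal guess: ${grid_best:.2f}")
```

The optimal guess always lands above mu, and the gap grows with sigma: the more uncertain you are, the more it pays to shade your guess upward, since a correct high guess wins more money.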