Sorry for the sporadic blogging. For the past week I’ve been working on the Progressive Monty Hall problem, and it has proven to be considerably more complicated than I at first realized. I had expected to polish it off with a few hours work. Instead I have thought about little else for the past several days, and have grown very unhappy with my inability to prove certain statements that are obviously true. Adding to my frustration is the nagging feeling that I am overlooking something simple and conclusive. So, I figured, why not turn it over to my readers?
I blogged about this previously. But I will go into far more detail now.
Here’s the problem. You are a game show contestant confronted with n identical doors. Behind one of them is a car, the others conceal goats. You choose one of the doors, but do not open it. Monty (the host of the show) now opens a door other than the one you have chosen most recently, always choosing one he knows to conceal a goat. He now gives you the options of sticking with your original door or switching to some other door. After making your decision, Monty opens another goat-concealing door. Again you are given the options of sticking or switching. This continues until only two doors remain. At this point you make your final decision, and then you receive whatever is behind your door. Throughout we assume that Monty chooses randomly from among the goat concealing doors when there is more than one such door. What strategy should you follow to maximize your chances of winning the car?
So there it is. The optimal strategy is actually pretty obvious. You stick with your initial choice until only two doors remain. You will win with probability n-1/n by following this strategy. Basically, your initial choice is correct with probability 1/n This probability does not change when Monty opens goat-concealing doors. Therefore, it will still have probability 1/n when only two doors remain. It follows that the other door will have probability n-1/n, as claimed.
This can all be proven rigorously using Bayes’ Theorem, but I’ll spare you the annoying details.
The intuition behind this strategy is that every time Monty opens a door he is giving you information about the location of the car. The optimal strategy will be the one that forces Monty to give you the maximum amount of information at each step. You receive more information from learning that an improbable event occurred than from learning that a likely event occurred. If you switch from your initial choice, you are allowing Monty to open a door that you already knew was very unlikely to be correct. By sticking with that door, you force Monty to open one of the other doors, and these doors are far more likley to conceal the car. So you should stick with your initial choice for as long as possible.
I am trying to prove that this strategy is not only optimal, but that it is uniquely optimal. That is, any other strategy gives you a probability of winning that is strictly less than n-1/n.
Now for the complications. Let D_i denote the event that the car is behind door i. Let M_j denote the event that Monty opens door j. Every time Monty opens a door, we use our new information to update the probabilities. That is, we try to evaluate P(D_i|M_j), the probability that door i conceals the prize given that we just saw Monty open door j. This, as always, is accomplished via Bayes’ Theorem.
Let’s do a warm-up exercise. Suppose there are five doors. We choose door one, and then watch as Monty opens door two. If what I said earlier is correct, then this revelation should not alter the probability of door one. According to Bayes’ Theorem we have:
P(D_1|M_2) = P(D_1) P(M_2|D_1) / P(M_2).
We clearly have P(D_1) = 1/5. Since Monty chooses randomly from among the goat concealing doors, we have P(M_2|D_1 )= 1/4. (If the prize is behind door one then Monty will choose randomly from among the other four doors).
The term on the bottom is a bit more complicated. The probability that Monty opens door 2 depends on the location of the car. So we write:
P(M_2) = P(M_2|D_1) P(D_1) + P(M_2|D_2) P(D_2) + P(M_2|not (D_1,D_2)) P(not (D_1,D_2)).
Now, P(M_2|D_2) = 0. If the car is behind door 2 then Monty will not open that door. We also have P(M_2|not(D_1,D_2)) = 1/3 and P(not(D_1,D_2)) = 3/5. (Note that doors one and two each have a probability of 1/5. So the probability that the car is not behind either of those doors is given by 1-(2/5).)
If you now plug all this in you get:
P(D_1|M_2) = (1/5)(1/4) / (1/5)(1/4) + (1/3)(3/5) = 1/5,
precisely as expected.
So far so good. Let’s push this a little farther. Suppose at this point you switch to door 3. Since door 1 has probability 1/5, door 2 has been eliminated, and doors three, four and five are identical, they all must have probability 4/15. What happens if Monty now opens door 1?
Using the same Bayes’ Theorem argument as before, we would now evaluate: P(M_1|D_3) = 1/3, P(M_1| not (D_1,D_3)) = 1/2. Plugging in now gives
P(D_3|M_1) = (4/15)(1/3) / (4/15)(1/3) + (1/2)(8/15) = 1/4.
Did you catch that? Did you see what just happened? Initially we assigned a probability of 4/15 to D_3. But after Monty opened door 1 we assigned a probability of 1/4 to D_3. That means P(D_3) went from 4/15 to 1/4 as the result of Monty opening door one. So P(D_3) actually went down!
And that’s the first question. Can anyone provide a simple, intuitive reason for why this is so? Why is it unsurprising that door 3 seems less likely after Monty opened door 1 than it did before Monty opened that door?
The message of our calculation is clear. After door two is eliminated there are four doors remaining. If we pretend for the moment that all are equally likely , then we would assign a probabiltiy of 1/4 to each of them. It’s as if we were playing the game from scratch with only four doors instead of five, and then watched as Monty eliminated one of the doors. So the effect of eliminating door one from the game is to erase all traces of history. But it’s not clear to me why eliminating door 1 has that effect.
Let’s move on. The fact that it is actually possible for the probability of a door to go down as Monty opens doors makes it quite difficult to prove that the strategy of switching at the last minute really is optimal. Here’s the problem: Regardless of the strategy we follow, there comes a time when only two doors remain. We want one of those doors to have a very low probability, so that we can win with high probability by switching to the other door. Outperforming the switch at the last minute strategy requires that one of those doors have probability smaller than 1/n. Now, every door starts with that probability. But how do we know that we can not drive the probability down below that level with some clever sequence of switches during the game?
That’s the problem. Here’s an attempted solution. While it is true that the probability of a door can go down in the course of the game, I think there is a definite statement that can be made regarding how far it can go down. If you choose a door at a time when x doors have been eliminated, then no matter what happens in the future, regardless of how many times you switch or which doors Monty opens, that door will always have a probability of at least 1/(n-x) for as long as it remains in play.
In the example we worked out before, we first chose door 3 at a time when four doors remained in play. We saw that we could then drive the probability of this door down to 1/4. But if I’m right, then no subsequent events could possibly drive this probability down any lower.
The situation here is the reverse of the question I asked a moment ago. I regard this as intuitively obvious, but I lack a rigorous proof. I regard it as obvious for the following reason: Regardless of which doors you have chosen in the past, after x doors have been eliminated there are n-x doors still in play. If all of those doors were identical you would assign a probability of 1/(n-x) to each of them. That probability represents the most ignorant you can be about the location of the car when only n-x doors remain. The fact that you might have chosen other doors earlier in the game only drives up the probability of the never chosen doors. That is, knowing that some of the n-x doors have a lower probability because they were chosen at some earlier point in the game only makes the never chosen doors seem more likely to be correct.
No matter what happens, Monty can not take away from you the fact that you know the door was chosen from a selection of (n-x) ostensibly identical doors. If he dutifully eliminates all of the previously chosen doors then that is all you will know about the probability of that door, and you will assign it a probability of 1/(n-x). But you will never have a reason to think the door is less likely than that.
That seems convincing to me. But a proof would be nice.
Considerably more than this can be said, but if you grant me this then it is straightforward to show that the switch at the last minute strategy is uniquely optimal.
Among the other things that can be said is the fact that the only way to drive down the probability of a door is by having Monty open previously chosen doors. That is, no door has its probability go down as a result of Monty revealing there is a goat behind a door that you had never previously chosen at any point during the game.
The Bayes’ theorem arguments I gave above can be readily adapted to more general situations. Instead of five doors, we consider n doors at the start of the game, and we suppose that we are considering the moment when x doors have been eliminated. You can then work out all sorts of inequalities about what it would take for some probability to dip below 1/n as the result of Monty’s door opening. I have dozens of pages of scrap paper strewn about my office working out these sorts of calculations for different starting assumptions. And I have an awful lot of very suggestive stuff. But no slam dunk argument. Yet.
So I haven’t just been blowing you guys off frivolously. It’s just that it’s part of the mathematical temperment that when you latch on to an interesting problem it’s basically impossible to think about anything else, no matter how miserable the problem is making you. Over the past week I have tried at various times to pick up any of the four or five books I’m in the middle of, only to find my mind drifiting back to this problem after just a few paragraphs.
Whatever. The problem will yield. Oh yes. It will yield…