If I say that X has probability p, what does that mean? What sort of thing is X, and what does the number p represent?

Philosophers have spilled a lot of ink on this question, with no clear answer emerging. Instead there are a handful of major schools of thought on the issue. Each school captures an important aspect of what we mean when we talk about probability, but none seems to provide a comprehensive account.

One possibility is the so-called classical interpretation. It is classical because it shows up in the earliest formal treatments of probability, for example in the work of Pascal. Laplace was probably the most famous adherent of this view. We imagine an exhaustive, finite list of all possible outcomes from a particular experimental set-up. In the absence of specific evidence to the contrary, we assign each of these elementary outcomes the same probability. The probability of an event is then defined as the number of “favorable outcomes” divided by the number of possible outcomes.

This interpretation is well-suited to simple examples drawn from gambling. The probability of rolling a perfect square with one roll of a fair die is 2/6 since there are six, equally likely, possible outcomes and only two of them are perfect squares. The probability of drawing a heart out of a well-shuffled deck of cards is 13/52, because each of the 52 cards is as likely as any other and 13 of them are hearts. This is the interpretation usually presented in elementary textbooks on probability.
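
To make the counting concrete, here is a minimal Python sketch of the classical recipe applied to the two examples above. The helper name `classical_probability` and the way the deck is built are illustrative choices of mine, and the whole thing rests on the assumption that every listed outcome is equally likely.

```python
from fractions import Fraction
from math import isqrt

def classical_probability(outcomes, is_favorable):
    """Favorable outcomes divided by possible outcomes (all assumed equally likely)."""
    outcomes = list(outcomes)
    favorable = [o for o in outcomes if is_favorable(o)]
    return Fraction(len(favorable), len(outcomes))

# One roll of a fair die: the perfect squares among 1..6 are 1 and 4.
die = range(1, 7)
p_square = classical_probability(die, lambda n: isqrt(n) ** 2 == n)

# A 52-card deck represented as (rank, suit) pairs.
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = [(rank, suit) for rank in ranks for suit in suits]
p_heart = classical_probability(deck, lambda card: card[1] == "hearts")

print(p_square)  # 1/3, i.e. 2/6 in lowest terms
print(p_heart)   # 1/4, i.e. 13/52 in lowest terms
```

Notice that the code never asks whether the die is fair or the deck well shuffled. Equal likelihood is an input to the calculation, not something it can check.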

Thus, in this view X is an event in a well-defined sample space, and p records an objectively measured ratio.

For these simple examples the classical interpretation works very well. Sadly, it is far too limited to serve as a general account of probability. As it stands it is evidently inappropriate for infinite sample spaces, for example. It also raises the question of how we know equally probable events when we see them. More complex situations are not so readily modelled in terms of finite lists of equally probable events.

An alternative is the frequentist interpretation. The basic insight here is that probabilities are things you measure from long series of trials of repeatable experiments. As with the classical interpretation, this is an objective view of probability. That is, probabilities exist out there, in the real world, and not just in the human mind.

The actual definition of probability in this interpretation will depend on the specific sort of frequentism you are discussing. In finite frequentism we assume that we have the results of a sequence of experimental trials in front of us. Then the probability of any particular outcome is the measured frequency of that outcome among all of the trials. A somewhat more complex version argues that probabilities should be seen as limiting frequencies obtained in an imagined infinite sequence of trials. Regardless, the law of large numbers is crucial here. Without trying to state it as a formal theorem, we can say that in essence it gives us reason to believe that measured frequencies will fluctuate less and less as the number of trials gets larger and larger.
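
A small simulation can illustrate the finite frequentist idea and the stabilizing behavior the law of large numbers leads us to expect. This is only a sketch: the function name `running_frequency`, the fixed seed, and the use of a pseudorandom generator as a stand-in for a physical coin are all assumptions made for illustration.

```python
import random

def running_frequency(num_trials, p_heads=0.5, seed=0):
    """Simulate coin tosses and return the measured frequency of heads after each toss."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    heads = 0
    frequencies = []
    for n in range(1, num_trials + 1):
        if rng.random() < p_heads:
            heads += 1
        frequencies.append(heads / n)  # finite frequentist "probability" after n trials
    return frequencies

freqs = running_frequency(100_000)
for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"after {n:>7} tosses: frequency of heads = {freqs[n - 1]:.4f}")
```

Each entry is a ratio of two whole numbers, heads over tosses so far. Running it with different seeds should show the early values jumping around and the later ones settling near one half.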

In this view X is an outcome of a repeatable experiment, and p is a quantity you measure based on actual data.

The benefits of this interpretation are its concreteness and its objectivity. It is similar in certain ways to the classical interpretation, in that it assigns an equal weight to all of the events in a certain set. The main difference between them is that the classical view begins with an enumeration of hypothetically possible outcomes, while the frequentist view only examines actual outcomes obtained in an actual data set.

That there is a relationship between measured frequencies and probabilities is obvious to any gambler. That notwithstanding, the frequentist view of things suffers from certain difficulties that rule it out as a comprehensive view of probability. One obvious problem is that it commits us to the idea that probabilities are rational numbers (in its finite form, at any rate), which seems inadequate for certain problems in modern physics. There is also the problem of knowing how many trials you need to carry out before you can be confident that your measured frequency is telling you something about the world, and is not telling you simply that something really improbable has happened. There are also conceptual problems inherent in the very idea of talking about an infinite sequence of trials.

A third major school of thought is the Bayesian interpretation. This is really a family of interpretations united by the idea that probabilities are subjective. A probability is the degree of belief held by a person with regard to a specific proposition. I might assess the probability of getting heads in one toss of a fair coin to be one half based on my general knowledge of coins and my lack of knowledge concerning the possibility that the coin in front of me is loaded in some way. If a sequence of coin tosses is now carried out, I would view the data produced as new information with which I would update my prior beliefs regarding the fairness of the coin.

As the name suggests, Bayes’ Theorem plays a major role here. The main idea is that there are only subjective assessments of probability that get updated as new information comes in. Bayes’ Theorem is then the primary tool for updating things rationally.
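
Here is a minimal sketch of what such an update might look like for the coin example. Everything specific in it is made up for illustration: the two competing hypotheses, the 3/4 bias of the "loaded" coin, the 9/10 prior in favor of fairness, and the sequence of tosses.

```python
from fractions import Fraction

def bayes_update(prior, likelihood):
    """Bayes' Theorem for a finite set of hypotheses: prior times likelihood, renormalized."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    return {h: weight / total for h, weight in unnormalized.items()}

# Hypothetical setup: the coin is either fair or loaded toward heads.
p_heads = {"fair": Fraction(1, 2), "loaded": Fraction(3, 4)}
belief = {"fair": Fraction(9, 10), "loaded": Fraction(1, 10)}  # prior degrees of belief

# Update after each toss in a made-up sequence (H = heads, T = tails).
for toss in "HHTHHHHH":
    likelihood = {h: (p if toss == "H" else 1 - p) for h, p in p_heads.items()}
    belief = bayes_update(belief, likelihood)

print(belief["fair"], belief["loaded"])  # 256/499 243/499
```

Seven heads in eight tosses drags a strong initial confidence in the coin's fairness down to roughly even odds. A different prior would give a different posterior from the same data, which is the sense in which the assessment is subjective.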

Bayesianism is especially well-suited to problems in decision theory. A person is confronted with a decision that must be made in the absence of certain relevant information. Part of thinking rationally about such situations is assigning probabilities to outcomes based on the information you have. A simple example is a juror deciding whether to convict a defendant. He might say, I am ninety percent certain the suspect is guilty. Plainly he is talking about his degree of belief based on the evidence presented, and not on the percentage of times the suspect is guilty in a long run of something or other.

Probability as degree of belief is certainly an important notion, but Bayesianism has problems as well. First, how ought we assign numbers to degrees of belief? In this interpretation, X represents a proposition. But what is p? The usual idea is that you assign probabilities based on your willingness to bet a certain amount of money on the outcome. You would assign a probability of one half to an event if you would be willing to pay fifty cents for a bet that pays you one dollar if the event occurs, and nothing if it doesn’t. But this makes probability dependent not just on the information a person has, but also on his wants and desires. That doesn’t seem right.

Furthermore, it just seems wrong to propose, as a universal truth, that all probabilities are subjective levels of confidence in propositions. The tendency of various randomizing devices to produce stable long term frequencies is as much a fact about physical reality as anything else scientists study. Probability language seems like the natural device for discussing such realities. An interpretation that leaves no room for such ideas seems a bit impoverished, to put it kindly.

What is going on here? Chance and randomness are ubiquitous in everyday life. Probability theory has been applied with great success in virtually every major branch of science. Can its foundations really be as shaky as my very brief overview suggests? For that matter, what does it even mean to “interpret” probability?

From the viewpoint of pure mathematics there is nothing to interpret. Things like probability spaces and probability measures are just abstract constructs, defined entirely by their axioms. Kolmogorov, writing in the 1930s, provided the axiomatization of probability that is nearly universally accepted today. As long as the things you are studying satisfy his axioms, you can plausibly claim to be doing probability. Things like Bayes’ Theorem or the Central Limit Theorem are just statements that follow as a matter of logic from the axioms.

The price you pay for viewing things in this manner is that your objects of study are entirely divorced from everyday reality. By an interpretation of probability we therefore mean some way of assigning real-world counterparts to the undefined terms in the axioms. For the interpretation to be successful we want our assignments of real-world meaning to undefined terms to be done in such a way that the resulting theorems of the mathematical theory are turned into true statements about physical reality. Ideally, we would come up with one interpretation that covers all of our intuitive notions of probability, and that is applicable to various scientific problems of interest.

The trouble is that probability talk gets used in a large variety of seemingly different contexts. Indeed, I was led to think about this issue in the course of writing about the Monty Hall problem. I noticed that in the space of seventy pages or so I had casually used three different notions of probability, without even realizing immediately that I had done so.

The solution to the basic Monty Hall problem typically proceeds by enumerating the cases and finding the ratio of wins by switching to all plays of the game. This is essentially the classical view of things at work. (It’s not precisely the classical view, since in solving the Monty Hall problem you need to realize that not all scenarios have an equal probability, but it is close enough for this discussion.)
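
The enumeration can be sketched in a few lines of Python. The sketch assumes the standard rules: the car is placed uniformly at random, Monty always opens a door hiding a goat that the player did not pick, and he chooses at random when two such doors are available. The weights show where the scenarios stop being equally likely.

```python
from fractions import Fraction
from itertools import product

doors = (1, 2, 3)
p_switch_wins = Fraction(0)
p_stay_wins = Fraction(0)

# Each (car location, initial pick) pair has probability 1/9.
for car, pick in product(doors, repeat=2):
    # Monty may open any door that is neither the player's pick nor the car.
    host_options = [d for d in doors if d != pick and d != car]
    for opened in host_options:
        # If Monty has two options, each scenario carries only half the weight.
        weight = Fraction(1, 9) / len(host_options)
        switched = next(d for d in doors if d != pick and d != opened)
        if switched == car:
            p_switch_wins += weight
        if pick == car:
            p_stay_wins += weight

print(p_switch_wins, p_stay_wins)  # 2/3 1/3
```

There are twelve scenarios in all, six with weight 1/9 and six with weight 1/18. Counting them as equally likely would wrongly suggest that switching and sticking are equally good.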

I then described the results of Monte Carlo simulations of the Monty Hall problem. This is the method that convinces pretty much everyone that switching is the way to go. The famous mathematician Paul Erdős refused to accept that there was an advantage to be gained from switching. He changed his mind immediately upon seeing the results of a simulation (though he still felt dissatisfied with the result). But the idea that a Monte Carlo simulation is telling you something important about the probability of winning by switching is based on a frequentist view of the situation.
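
For readers who want to try this themselves, a simulation along these lines might look like the following. It is a sketch under the standard assumptions (Monty always opens a goat door other than the player's pick, choosing at random when he can), and the names `play` and `win_rate`, the trial count, and the seed are arbitrary choices made for illustration.

```python
import random

def play(switch, rng):
    """Play one game and return True if the player ends up with the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # Monty opens a door that is neither the player's pick nor the car.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the one remaining closed door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

def win_rate(switch, trials=100_000, seed=0):
    """Measured frequency of wins over many independent games."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials

print("switch:", win_rate(True))
print("stick: ", win_rate(False))
```

With this many trials the switching frequency should land close to two thirds and the sticking frequency close to one third, though the exact figures depend on the seed.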

Moving on, in other places it was natural to present the Monty Hall problem as a question in decision theory. This way of proceeding is especially natural in variations with many doors or exotic host behaviors. Suddenly I found myself writing about how the player ought to assess his chances of winning under various scenarios given the information at his disposal.

Just to be clear, I am not saying that you are logically compelled to adopt one school of thought over another in pondering different aspects of the Monty Hall problem. Only that different interpretations seem more natural depending on the aspect of the problem you are discussing.

Three different interpretations all rolled up into one problem. What hope, then, of finding a comprehensive interpretation of probability?

So which side am I on? Well, as a pure mathematician my attitude is that probability was doing just fine as an abstract theory, and if you find yourself running into conceptual difficulties when trying to understand it in real-world terms that just serves you right for trying to apply it to anything.

Kidding aside, I tend to follow the example of most mathematicians and take an ecumenical view of the whole thing. If different interpretations are useful in different contexts, then you go right ahead and use whichever interpretation you find most useful. Indeed, why is it even reasonable to expect that a single interpretation of probability will cover all of our intuitions about the subject?

As with so many of the things about which philosophers write, this debate strikes me as much ado about nothing. Very little, anyway. Having now read some of the literature on the subject, I can say that the arguments adduced for and against the various schools of thought are subtle, ingenious, and ultimately just not very important. Or so it seems to me.

The subject of statistical hypothesis testing does offer some instances where frequentists and Bayesians prescribe different ways of proceeding, so this cannot be entirely dismissed as just an academic dispute. We’ll save that for a different post, however.

Comments

  1. #1 Russell
    December 26, 2007

    I was working at a company specializing in simulation software when the Monty Hall puzzle was first making the rounds. Most of the engineers there would get the right answer.

    I don’t see any more problem with a formalist view of probability theory, than with a formalist view of mathematics generally. The application of the math to a problem domain always carries a variety of assumptions that go into that modeling. That’s true whatever math one is using (e.g., calculus, Hilbert spaces, probability spaces) and to whatever problem domain they are applied (e.g., building a dam, calculating the probability of atomic decay, or deciding how to bet when playing Monty Hall.)

  2. #2 g
    December 26, 2007

    I think the Bayesian (meta-?) approach is in better shape than you suggest (“But this makes probability dependent not just on the information a person has, but also on his wants and desires”; “it just seems wrong to propose, as a universal truth, that all probabilities are subjective levels of confidence in propositions”).

    Regarding the first: There are theorems that say, in effect, “If you choose to represent degrees of belief by numbers, and if you want what you do to satisfy the following consistency conditions [...], then up to isomorphism those numbers *have* to be values between 0 and 1 that obey the usual rules of probability”. So, no need for dependence on anyone’s particular wants and desires.

    Regarding the second: a Bayesian can claim that probabilities don’t describe only individuals’ subjective confidence levels, but also (indeed, more fundamentally) the confidence levels of an idealized reasoner. This point of view encompasses classical and frequentist probability as special cases: if something has N indistinguishable outcomes then of course an idealized reasoner will assign them all equal probability, etc. So, from this perspective, it’s perfectly OK for there to be objective probabilities out there (and when there are and our hypothetical ideal reasoner knows about them, its probability assignments will match the objective ones), but the same formalism extends “continuously” to situations where subjectivity is unavoidable.

  3. #3 qetzal
    December 26, 2007

    A simple example is a juror deciding whether to convict a defendant. He might say, I am ninety percent certain the suspect is guilty. Plainly he is talking about his degree of belief based on the evidence presented, and not on the percentage of times the suspect is guilty in a long run of something or other.

    But what does “degree of belief” really mean? I think it generally is something like the juror’s guess at percentage of times similar suspects tried under similar sets of evidence would actually be guilty. Obviously, there’s a huge question as to how a juror could legitimately arrive at 90%, since s/he almost certainly has no prior experience with similar cases, but that’s a separate issue.

    To be frank, I’m not sure how else to interpret “90% certain.”

  4. #4 ctw
    December 27, 2007

    “I’m not sure how else to interpret ‘90% certain.’”

    Not as a probability, for just the reason you give – the absence of multiple independent, identically distributed (IID) “trials” (both senses) either past or future. Without them, interpreting it as a probability has neither meaning nor utility, respectively.

    I agree with Jason that typically “X% certain” is merely a qualitative measure of one’s relative confidence in an opinion based on available evidence. Ie, it’s just another way of saying “on a scale from 1 to 10, I’d give it a 9”.

    – Charles

  5. #5 Enigman
    December 27, 2007

    Hi Jason, I’m just wondering why you go from the classical to the frequentist interpretations without mentioning the propensity interpretation, which seems (to me) to capture realistically the physical probabilities of quantum mechanics, in all their infinite complexity.

  6. #6 ctw
    December 27, 2007

    re. “the propensity interpretation”

    When I was typing my “repeated trials” comment above, it occurred to me that not only were there no repeated “trials” in the case of a jury “trial”, neither were there in the Monty Hall problem. For a given contestant, the “die has been cast” and that’s the whole game. It seemed like each contestant should switch nonetheless, but it wasn’t clear what that buys an individual contestant. Once the game has commenced, the situation is binary – whichever you decide to do you either win or lose.

    The “propensity” view seems to sort of address this (I haven’t yet read carefully any of the material to which you point, though I intend to do so), but my immediate inclination is to argue that probability theory, like other math disciplines, essentially creates models and analyzes behaviors, and the results then can be used in situations where they apply to describe the behavior of real systems. Because my contact with probability theory has been in the realm of info/comm theory, repeated trials are the essence and probability models apply handsomely. In some situations, they presumably don’t. So although I see that in the MH game there may be a “propensity” for the other door to be more probable, it isn’t clear what the practical consequence of that insight might be.

    And it occurs to me that in some situations, whether probability models apply or not depends on perspective. Take Las Vegas. From the house POV, relative frequency models do – the house has the odds and they have repeated trials galore. To a lesser degree, so do sophisticated players (eg, card counters) with lots of time and resources so they can also benefit from many repeated trials. But an individual one-time sap can’t – you either win or lose those “few” specific trials – which is presumably why people keep going back. Earl and Millie go to Vegas on their four day bargain weekend and occasionally actually do win, the fact that the odds are against them notwithstanding.

    Anyway, thanks for the pointer to a different perspective.

    – Charles

  7. #7 Jason Rosenhouse
    December 27, 2007

    g-

    Thanks for the comment. My point about wants and desires was simply to illustrate that it is not obvious what it means to assign a number to a degree of belief. I was not offering it as a fatal objection to Bayesianism. The idea of relating this sort of subjective probability to betting odds was first proposed by, I believe, de Finetti, who was an early advocate of the idea of subjective probability. That such an influential thinker on this subject offered an account that was not adequate shows that there is a real issue here that needs to be addressed. Howson and Urbach offer a much more complex discussion of what it means to assign a number to a subjective degree of belief. Again, this shows simply that there is a difficult issue here, not that it is necessarily insurmountable.

    I will have to think about your second point. Wondering about what an ideal reasoner would think in a particular situation seems a bit abstract to me. The treatments of Bayesianism I have read to date all emphasize that its distinguishing feature is that it rejects the idea that probabilities are things that attach to physical objects in the real world. If there are situations where we can talk about objective probabilities, then I would think that the classical or frequentist approaches are better suited to handle them. It’s not that you can’t fit them into a Bayesian viewpoint, but just that it’s needlessly complicated to do so. That is why I endorsed an ecumenical view of things at the end of the essay.

    qetzal-

    That it’s not clear what is meant in assigning a number to a degree of belief was one of my main points. I suspect that when people say they are ninety percent certain of something they mean simply that they would be very surprised if it turned out to be false. Given what they know it seems likely that a certain statement is true, but they can imagine scenarios where it is false nevertheless. This applies, I think, to the example of the juror. He’s saying simply that he’s pretty sure, but the number 90 is not to be taken too literally.

    Enigman-

    I had a few very good reasons for not discussing propensity interpretations. First, the essay was already quite long. Second, the interpretations I did discuss strike me as the more famous and influential interpretations. And third, I don’t feel I understand propensity interpretations very well.

    Charles-

    You raise an interesting point about whether thinking in terms of multiple trials is helpful to a player of the Monty Hall game who will only get to play once. My files contain a number of papers from philosophy journals that discuss this very point. There will be a section in the book discussing this issue (assuming I can figure out what all these folks are talking about, which is never a sure thing when you’re dealing with academic philosophy.)

  8. #8 ekzept
    December 27, 2007

    It’s not just about probability in the abstract, or about neat little problem frames with no connection to the rest of the world. In the former case, especially when making decisions, probability really can’t be divorced from loss functions or their flip, utility. This even extends to data analysis and experiments. After all, equipment costs.

    And it’s not just about Monty Hall-type frames, either. Even from a Bayesian perspective there is a world of observation and data out there with vastly high likelihoods of being correct. Sure, we limit the frame so we can solve problems, extending it to more and more. But “subjectivity” in probability often is simply a measure of ignorance, some of it adopted to make making a decision or a model simpler. Depending upon the loss function, embracing that ignorance might not harm the quality of the model or decision much, so it’s a wise thing to do.

    I think this also suggests that the effort to calculate also costs something, and needs to be considered as another aspect.

  9. #9 Kevin
    December 27, 2007

    ok I got the link

    http://www.solstice.zxq.net/

    by my calculations the days ARE getting longer! this means that the party was a success.

  10. #10 ctw
    December 28, 2007

    “it is not obvious what it means to assign a number to a degree of belief”

    I have coincidentally been thinking about what it means to “believe” something and my first cut actually addresses this. Here it is:

    Let G be some benefit, C be the cost associated with an action A intended to achieve G, and E be a set {Ei,Wi} of evidentiary elements and credibility weights relevant to the likelihood that A will in fact result in achieving G. Then a belief B is a function of A, G, and E the value of which is essentially a confidence measure that A will result in achieving G. Symbolically, a belief is a function B(A,G,E) with the range [0,1]. By establishing a threshold D(C), B becomes a decision function:

    Take A iff B(A,G,E) is greater than or equal to D(C).

    Clearly, this formulation has no objective content – the “belief” and the threshold are set subjectively, not by actually doing any numerical calculations. But I think some interesting things can be said about it nonetheless.

    – It makes explicit the significance of action associated with belief, consistent with what I assume James intended in “Will to Believe” by distinguishing between “live” and “dead” beliefs. If there is no action and therefore neither cost to be incurred nor decision to be made, the confidence measure cannot be “computed”. Ie, B is a “dead” belief.

    – Let A be “behaving as if God exists”. Pascal’s wager is (I think) essentially the argument that since G is extremely large, D(C) should be set so low that for almost any C,E, and B, A should be taken. (Some readers of this blog might object – as I do – to the implicit assumption that violating one’s intellectual integrity, a component of C, is negligible.)

    My initial motivation for thinking about the nature of “belief” was the popular statement that “faith is belief without evidence”. Despite assurances from both the religious and non-religious that I am wrong, I don’t accept that. I think that logically, one can only “believe” something based on some inputs, and I choose to call whatever those may be “evidence”. Then the difference between one person’s “faith” (A) and another’s “disbelief” is a matter of the values a person (implicitly) assigns to the various parameters. A religious believer typically has weighted the credibility of scripture, testimonials, etc, high and set D relatively low, possibly somewhat independently of C since G is so large, and thus decides for A. A non-believer presumably weights the same evidence low, and – probably being by nature a skeptic – typically sets D high for most beliefs and consequently decides against A.

    – Let A be “vote to convict”. Then G is punishing a guilty defendant and C is the risk of punishing an innocent one. The juror must assess B and then set D. So “90% certain” might be construed as meaning something like “I am confident that in similar cases with evidence equivalent to E, G will be achieved by a conviction at least 90% of the time”. The juror’s choice of D(C) will depend on the type of action (preponderance, shadow of a doubt, etc).

    – Let A=“behaving as if the case for the theory of evolution is sound”. I submit that most of the time for most non-religious people, B is a “dead belief” because there is no substantive content to A (mere recital of “I believe X” is nonsubstantive) and E is essentially null. (In the context of my definition, E is relative to an individual, and most individuals know almost nothing about evolution.) But when the ID crowd invades a school district, that situation changes. At a minimum, A includes supporting a school board’s position, and to make a knowledgeable decision, one needs to “beef up” E. With this perspective, IMO we should be grateful to the ID crowd for improving the level of knowledge in the general public. I, for one, am a beneficiary of the process described.

    – Finally, to Monty Hall. A={switching}, G is the car, and C is the risk of having chosen the right door. E is that entity peculiar to axiomatic systems – the “conclusive proof”. Hence, for one possessed of E, B=2/3; D=1/2, so switch.

    Any comments, guffaws, or pointers to sources where something like this has already been addressed by people who actually know what they’re doing will be appreciated.

    – Charles

  11. #11 g
    December 28, 2007

    Jason, see E T Jaynes’s “Probability theory: the logic of science” (warning: great thick brick of a book, and Jaynes never finished it before he died and its posthumous editor has mostly left the gaps as gaps) for the nearest thing there is to a definitive statement of the view that probability theory describes how an ideal reasoner would deal with uncertainty. That book also has a pretty good account of the theorem that (kinda) justifies that view.

  12. #12 dk
    December 28, 2007

    Jason, nice essay. How we define probability may be more than an academic exercise though. I recall an argument from Nassim Taleb in his book The Black Swan: The Impact of the Highly Improbable. Often people are prone to the “ludic fallacy.” This is the idea that people apply the Gaussian model appropriate for games of chance to “real life” where there are many more unknown variables. Consequently, probability estimates are often wrong.

  13. #13 Kevin
    December 29, 2007

    what? no one liked the link? Come on guys, I’m claiming that it was me me me that caused the days to get longer.

    there is no scientific evidence that it wasn’t

  14. #14 Eric Thomson
    December 30, 2007

    An excellent summary. Thank you.

    I really like your mathematician’s take on it, focusing on the abstract system that implicitly defines probability theory, and the interpretive stuff is not really part of the mathematics, doesn’t influence the mathematics (e.g., the axioms are the same for the frequentist and Bayesian), and is a heuristic overlay of the applied mathematician rather than part of the theory itself.

    I agree that adding a bit about the propensity interpretation would fill things out nicely: I remember thinking it seemed the most reasonable in the bunch when I studied this stuff a few years ago.

    I never really understood the allure of the Bayesian view, especially the species that involve quantifying over beliefs.

  15. #15 Enigman
    December 30, 2007

    A very nice post; it’s odd how thinking about any of the probabilistic puzzles leads one to wonder about the meaning of “probability” in general. The propensity interpretation is largely irrelevant to the Monty Hall problem, as there’s no objective indeterminism, just subjective ignorance about where the prize is; but there’s always the old chestnut about whether or not a reasoning subject (with epistemic responsibilities) could exist in an objectively deterministic world, or whether only mechanical simulations of certain aspects of reasoning could (apologies for going even more off-topic:)

  16. #16 jj mollo
    December 31, 2007

    The juror states that he is 90% sure of something. The best case that this is a probability statement is that the juror knows his psychological condition and is aware of what his internal state was on previous occasions. In situations where he had the same “feeling” that he now has, he was right on the order of 9 times out of 10. We stipulate that it is approximate, but it is based on a frequentist model under a specified condition. The juror does not have to know how often the specified event occurs. He only needs an internal regression model to correlate “feeling” with the accuracy of his predictions across a broad range of experience. Perhaps this would also be a good way to describe Bayesian prior estimates.

  17. #17 ctw
    December 31, 2007

    “In situations where he had the same ‘feeling’ …”

    I submit that this is only one component of my evidence set E. Ei might be this “feeling” based on past personal experiences, Ej might be recalling similar situations in TV courtroom dramas, Ek conversations about friends’ jury experiences, etc.

    And my larger point (re faith in the previous comment) was that we all have different E-sets and hence reach different conclusions despite what superficially seems to be essentially equivalent evidence.

    I agree that underlying this process is an implicit frequentist interpretation because that’s the only applicable one a juror (typically a lay person) will have any sense of.

    – Charles

  18. #18 Eric Thomson
    January 5, 2008

    My problem with the Bayesian view is that I think probabilities are objective in some sense. For example, when I present a stimulus to an animal and record from some neuron, I typically describe the neuronal response probabilistically. I am not describing my degree of belief that the neuron will fire five spikes, but an objective feature of the nervous system. That is, given the way the nervous system is wired up, the probability of this neuron firing N spikes is P(N). No need to appeal to my degree of belief, or the animal’s belief.

    On the face of it, it seems unscientific to try to incorporate subjective elements into the equation (unless I am doing psychology, which is the study of the subjective, but even then I can objectively describe psychological states, that would exist even if I were not there to observe them). Even if there were no creatures with beliefs around, there would still be probabilistic relations among events in the world, informational relationships (which depend on probabilities), etc..

    I have yet to see an example that a Bayesian proposes that I can’t recast in terms of more objective factors.

  21. #21 Pete B
    January 19, 2008

    Jason, you say:

    “I then described the results of Monte Carlo simulations of the Monty Hall problem. This is the method that convinces pretty much everyone that switching is the way to go.”

    And it certainly convinced me. Don’t know if the following would help anyone else visualise what’s happening but not being a statistician by trade (and, to be honest, not being like 110% convinced) I tried approaching this as a programming problem – to model the description of Monty Hall’s game show. I produced an algorithm from the problem description and then coded it (in Python – I barely know the language and I’m sure it shows, but if you’ve got a Monty problem … ) and ran a few tests.

    The following is the original problem description with my algorithm steps interspersed as lines beginning ‘#’. Words with the ‘*’ prefix are variables, and there are lots of interim displays (‘show’s) of values to demonstrate what’s happening at each stage.

    I produced 2 programs, one for the ‘sticker’ case and one for the ‘switcher’ case, but because both algorithms are identical apart from one step, and to save space here, I’ve combined them (into what we might therefore think of as ‘The Full Monty’). I built the two versions of the following algorithm into Python programs with a loop that executes 100 times. The algorithm lines (beginning ‘#’, remember) become comments in the code, followed by their Python implementation, so if anyone can be bothered, they can trace through from the description of the problem to the code and back again.

    Results for three runs of each are below.

    Here’s the algorithm being developed:

    In the Monty Hall problem, you are confronted with three identical doors, one of which conceals a car while the other two conceal goats.

    # randomly position one car and two goats in an array *doors
    # show *doors

    You choose a door at random, number one say, but do not open it.

    # set *choice to a random number 0 .. 2 [NOTE Computer numbering 0,1,2 are the three positions]
    # show *choice

    Monty now opens a door he knows to conceal a goat. [ the description doesn't actually say
    that Monty won't open the door chosen by the contestant even if it contains a goat, but we assume this]

    # set *open to the position of one of the two goats – selected at random, but NOT = *choice
    # show *open

    He then gives you the option of sticking or switching.

    # add 1 to *tries

    # (stick case) if the car is at the position *choice, show ‘sticker won’ and add 1 to *stickerwins
    # else add 1 to *stickerloses

    # (switch case) show *choice ‘before’, then reset *choice to the position that is NOT *open and
    # NOT *choice
    # if the car is at the (switched) position *choice, show ‘switcher won’ and add 1 to *switcherwins
    # else add 1 to *switcherloses

    Output looks like:

    91 Doors contain
    ['goat', 'car', 'goat']
    91 choice is: 2
    91 door 0 opened by Monty.
    91 switches from 2 …
    to 1
    car
    CAR! CAR! I switched and won a car.

    92 Doors contain
    ['goat', 'car', 'goat']
    92 choice is: 1
    92 door 0 opened by Monty.
    92 switches from 1 …
    to 2
    goat
    oh no …… what a bummer ……

    93 Doors contain
    ['car', 'goat', 'goat']
    93 choice is: 2
    93 door 1 opened by Monty.
    93 switches from 2 …
    to 0
    car
    CAR! CAR! I switched and won a car.

    Results:

    Out of 100 tries, sticker won 30 times.
    (and lost 70 times.)

    Out of 100 tries, switcher won 63 times.
    (and lost 37 times.)

    Out of 100 tries, switcher won 61 times.
    (and lost 39 times.)

    Out of 100 tries, sticker won 33 times.
    (and lost 67 times.)

    Out of 100 tries, sticker won 32 times.
    (and lost 68 times.)

    Out of 100 tries, switcher won 66 times.
    (and lost 34 times.)
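
    The tallies above hover around 1/3 for the sticker and 2/3 for the switcher. As a cross-check, here is a sketch that enumerates every case exactly instead of sampling. It assumes the same rules as the algorithm above (car placed uniformly at random, first pick uniform, Monty opening a goat door that is not the chosen door, picking at random when he has two options); the function name `exact_win_probabilities` is made up for this example.

```python
from fractions import Fraction
from itertools import product


def exact_win_probabilities():
    """Return (P(sticker wins), P(switcher wins)) by enumerating all cases."""
    stick = Fraction(0)
    switch = Fraction(0)
    # Nine equally likely (car position, first choice) pairs.
    for car, choice in product(range(3), repeat=2):
        weight = Fraction(1, 9)
        # Doors Monty is allowed to open: a goat, and not the contestant's door.
        options = [d for d in range(3) if d != car and d != choice]
        for opened in options:
            w = weight / len(options)  # Monty picks uniformly among his options
            if choice == car:
                stick += w
            # The switcher moves to the one door that is neither chosen nor opened.
            remaining = next(d for d in range(3) if d not in (choice, opened))
            if remaining == car:
                switch += w
    return stick, switch


stick, switch = exact_win_probabilities()
print(stick, switch)  # 1/3 2/3
```

    This is the classical "count the equally likely cases" calculation, so it agrees with the simulation only to the extent that the assumptions about Monty's behaviour hold.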

    ***************************************************************


    Program listings:

    # The Monty Hall problem Sticker version
    # Pete Berry Jan 2008 v 0.1
    # (print is written as print(...) with one argument so it runs under Python 2 and 3)

    import random

    doors = ['car', 'goat', 'goat']
    tries = 0
    stickerwins = 0
    stickerloses = 0

    for i in range(1, 101):
        # randomly position one car and two goats in an array *doors
        # show *doors
        random.shuffle(doors)
        print(str(i) + ' Doors contain ')
        print(doors)

        # set *choice to a random number 0 .. 2
        # show *choice
        choice = random.randint(0, 2)
        print(str(i) + ' choice is: ' + str(choice))

        # set *open to the position of one of the two goats - selected at random BUT
        # NOT = *choice
        # show *open
        # (note: the name 'open' follows the algorithm text but shadows Python's built-in open)
        found = 0
        while found == 0:
            open = random.randint(0, 2)
            if (doors[open] == 'goat'):
                if (open != choice):
                    found = 1

        print(str(i) + ' door ' + str(open) + ' opened by Monty.')

        # add 1 to *tries
        tries += 1
        print(str(i) + ' chosen door is ' + str(choice) + ' which hides a ' + doors[choice])

        # (stick case) if the car is at the position *choice show 'sticker won' add 1 to *stickerwins
        if doors[choice] == 'car':
            print(' CAR! CAR! I stuck and won a car. ')
            stickerwins += 1
        else:
            print('Oh No! ...... what a bummer ......and I stuck')
            stickerloses += 1

        print(' ')

    print('Out of ' + str(tries) + ' tries, sticker won ' + str(stickerwins) + ' times.')
    print('(and lost ' + str(stickerloses) + ' times.)')

    ************************************************************


    # Monty Hall problem switcher version
    # Pete Berry Jan 2008 v 0.1
    # (print is written as print(...) with one argument so it runs under Python 2 and 3)
    import random

    doors = ['car', 'goat', 'goat']
    tries = 0
    switcherwins = 0
    switcherloses = 0

    for i in range(1, 101):
        # randomly position one car and two goats in an array *doors
        # show *doors
        random.shuffle(doors)
        print(str(i) + ' Doors contain ')
        print(doors)

        # set *choice to a random number 0 .. 2
        # show *choice
        choice = random.randint(0, 2)
        print(str(i) + ' choice is: ' + str(choice))

        # set *open to the position of one of the two goats - selected at random BUT
        # NOT = *choice
        # show *open
        # (note: the name 'open' follows the algorithm text but shadows Python's built-in open)
        found = 0
        while found == 0:
            open = random.randint(0, 2)
            if (doors[open] == 'goat'):
                if (open != choice):
                    found = 1

        print(str(i) + ' door ' + str(open) + ' opened by Monty.')

        # add 1 to *tries
        tries += 1

        # (switch case) show *choice 'before' then reset *choice to the position that is NOT *open and
        # NOT *choice
        print(str(i) + ' switches from ' + str(choice) + ' ... ')
        found = 0
        temp = 0
        # step through 0, 1, 2 until we reach the door that is neither chosen nor opened
        while found == 0:
            if (temp == choice) or (temp == open):
                temp += 1
            else:
                found = 1

        choice = temp
        print(' ' + 'to ' + str(choice))
        print(doors[choice])

        # if the car is at the (switched) position *choice show 'switcher won' add 1 to *switcherwins
        if doors[choice] == 'car':
            print(' CAR! CAR! I switched and won a car. ')
            switcherwins += 1
        else:
            print('oh no ...... what a bummer ......')
            switcherloses += 1

        print(' ')

    print('Out of ' + str(tries) + ' tries, switcher won ' + str(switcherwins) + ' times.')
    print('(and lost ' + str(switcherloses) + ' times.)')
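
    Since the two listings differ in only one step, the combined algorithm (‘The Full Monty’) can also be sketched as a single program that scores the sticker and the switcher on the same rounds. This is a compact Python 3 sketch, not the programs that produced the results above: the names `play_round` and `simulate`, the optional `seed` parameter, and the 10,000-trial run are all choices made for this example.

```python
import random


def play_round(rng):
    """Play one round; return (sticker_won, switcher_won) as booleans."""
    doors = ['car', 'goat', 'goat']
    rng.shuffle(doors)                      # randomly position one car and two goats
    choice = rng.randrange(3)               # contestant picks door 0, 1 or 2
    # Monty opens a goat door that is not the contestant's door.
    opened = rng.choice([d for d in range(3)
                         if d != choice and doors[d] == 'goat'])
    # The switcher takes the one door that is neither chosen nor opened.
    switched = next(d for d in range(3) if d not in (choice, opened))
    return doors[choice] == 'car', doors[switched] == 'car'


def simulate(trials, seed=None):
    """Return (sticker wins, switcher wins) over the given number of rounds."""
    rng = random.Random(seed)
    stick_wins = switch_wins = 0
    for _ in range(trials):
        stick, switch = play_round(rng)
        stick_wins += stick
        switch_wins += switch
    return stick_wins, switch_wins


trials = 10000
stick_wins, switch_wins = simulate(trials)
print('Out of %d tries, sticker won %d times.' % (trials, stick_wins))
print('Out of %d tries, switcher won %d times.' % (trials, switch_wins))
```

    Because exactly one of the two unopened doors hides the car, the two win counts always add up to the number of tries; running both strategies on the same rounds makes that visible directly.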


  22. #22 ?apka
    July 24, 2008

    I was working at a company specializing in simulation software when the Monty Hall puzzle was first making the rounds. Most of the engineers there would get the right answer. I don’t see any more problem with a formalist view of probability theory than with a formalist view of mathematics generally. The application of the math to a problem domain always carries a variety of assumptions that go into that modeling. That’s true whatever math one is using (e.g., calculus, Hilbert spaces, probability spaces) and to whatever problem domain it is applied (e.g., building a dam, calculating the probability of atomic decay, or deciding how to bet when playing Monty Hall).
