Randomness Versus Stochasticity

In a recent post, I mentioned the common confusion between randomness and stochasticity. A couple of commenters brought this issue up, so I'll discuss it further (I really do read your comments...). Needless to say, with mathematicians and philosophers lurking around these ScienceBlogs, I'm giving one biologist's amateur perspective on what these terms mean.

Let's start with randomness. I don't mean random in an existential 'does life have any meaning?' sense (Yawn. You bore me. Stop worrying about that and go do something meaningful. Or have an ice cream cone). By random, I mean unpredictable. Let me give an example. In population genetics, if you have two alleles (variants of a gene) that do not differ in their fitness effects--that is, the variants do not alter the survival or reproduction of the organism that carries either allele--then one allele will eventually disappear from the population by a process known as genetic drift. (Genetic drift has also been called a random walk). The point is that it is utterly unknowable which allele will disappear (although we do know that one will disappear). This is an instance of randomness.

Stochastic is best explained by its converse, deterministic. Flashback to algebra class, and imagine the equation Y = 2X. Suppose X = 50. Every time you solve the equation for Y, Y will equal 100. Not 99 or 101, but 100. It doesn't matter when you solve this equation, or what equipment you use (unless it's broken), Y will always equal 100. That's a deterministic equation.

Now imagine the equation Y = 2X ± e, where e is some random small number. The 'e term' incorporates stochasticity into the equation. Depending on what random number is used to generate e, the value of Y will differ. Running the equation again will not necessarily generate the same value for Y (although it could). However, while we cannot determine the precise value of Y, and thus this equation is not deterministic, we have a general idea of what that number will be. For example, if X = 50 and e is a random number between 1 and 5, we know that Y will fall within 100 ± 5, that is, somewhere between 95 and 105. This isn't random in that we can approximate Y a priori, even if we can't predict its precise value.
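
For readers who like to see this run, here is a minimal Python sketch of the two equations. It is only an illustration under assumptions of mine: the post does not say how e is distributed, so I assume it is drawn uniformly between 1 and 5 and that the sign of the ± is a fair coin flip. The function names are hypothetical.

```python
import random

def deterministic_y(x):
    """Y = 2X: the same input always gives the same output."""
    return 2 * x

def stochastic_y(x, rng):
    """Y = 2X ± e, with e assumed uniform on [1, 5] and the sign a fair coin flip."""
    e = rng.uniform(1, 5)        # assumption: uniform distribution for e
    sign = rng.choice((-1, 1))   # assumption: + and - equally likely
    return 2 * x + sign * e

rng = random.Random(0)  # seeded only so the sketch is repeatable
samples = [stochastic_y(50, rng) for _ in range(10_000)]

# The exact value of Y differs from draw to draw,
# but every draw stays inside the 100 ± 5 envelope.
lowest, highest = min(samples), max(samples)
```

Each individual draw is unpredictable, yet `lowest` and `highest` never leave the range 95 to 105, which is the sense in which Y can be approximated in advance.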

This is relevant to biology (and all the sciences, for that matter) because, while it can be difficult to determine the exact outcome, we typically can incorporate stochasticity into our predictions--this is why technical papers on global warming, for example, incorporate confidence intervals. In the War on Science, those who oppose a given result often try to conflate randomness with stochasticity to undermine the integrity of science: the eggheads can't give an exact result, so they must not know what they're doing. In the context of evolution, the two terms are often confused by morons like Dembski who argue that complex structures could not evolve by "random chance." However, evolutionary biology is a historical science. Given that there are pre-existing starting points, we can often approximate what will happen, even if we can't be exact.

e is a random variable drawn from some distribution to model a stochastic process... anyway, my main gripe is with this:

The point is that it is utterly unknowable which allele will disappear (although we do know that one will disappear).

We can assign a probability that a given neutral allele will fix or be lost. The probability of fixation is just the frequency of that allele, so I wouldn't say it's "unknowable", just uncertain.
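
The claim that a neutral allele fixes with probability equal to its current frequency can be checked with a small simulation. The sketch below assumes a Wright-Fisher style model (each generation, every gene copy is resampled independently from the previous generation's allele frequency); that model, the population size, and the function names are my own illustrative choices, not something stated in the post or the comment.

```python
import random

def drift_until_fixed_or_lost(start_count, n_copies, rng):
    """Resample a neutral allele each generation until it is fixed or lost.

    Returns True if the allele fixed, False if it was lost.
    """
    count = start_count
    while 0 < count < n_copies:
        freq = count / n_copies
        # Binomial sampling: each of the n_copies gene copies in the next
        # generation carries the allele with probability freq.
        count = sum(rng.random() < freq for _ in range(n_copies))
    return count == n_copies

def estimate_fixation_probability(start_count, n_copies, replicates, seed=0):
    """Fraction of replicate populations in which the allele fixed."""
    rng = random.Random(seed)
    fixed = sum(
        drift_until_fixed_or_lost(start_count, n_copies, rng)
        for _ in range(replicates)
    )
    return fixed / replicates

# Hypothetical example: 6 of 20 gene copies carry the allele (frequency 0.3).
estimate = estimate_fixation_probability(start_count=6, n_copies=20, replicates=2000)
```

Under these assumptions `estimate` should land close to the starting frequency of 0.3, even though the fate of the allele in any single replicate cannot be called in advance.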

In a general sense, randomness and stochasticity are in fact synonyms, so there can't really be confusion between them.

In a technical sense, a stochastic process is defined as a collection of random variables. Because typical examples are random walk-like, one could get the impression stochasticity necessarily implies time-dependence, but that is not quite correct: rolling dice over and over again is just as stochastic as a random walk. Evolution is in fact random walk-like, in the sense that the distribution of a genetic configuration is extremely dependent on the previous configuration, but to say it is stochastic and not random is not really correct (although it is obvious what you're trying to say).
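
A short Python sketch may help picture the two cases. Both functions below produce a collection of random variables; the only difference is whether each value depends on the one before it. The step size of ±1 for the walk and the function names are assumptions made for illustration.

```python
import random
from collections import Counter

def dice_rolls(n, rng):
    """n independent rolls: each outcome ignores all previous outcomes."""
    return [rng.randint(1, 6) for _ in range(n)]

def random_walk(n, rng):
    """n steps of a simple walk: each position is the previous one plus or minus 1."""
    position = 0
    path = []
    for _ in range(n):
        position += rng.choice((-1, 1))  # assumption: unit steps, equal odds
        path.append(position)
    return path

rng = random.Random(1)  # seeded only so the sketch is repeatable
rolls = dice_rolls(6000, rng)
walk = random_walk(6000, rng)

# Tally of faces: roughly flat for a fair die.
roll_counts = Counter(rolls)
```

Both `rolls` and `walk` are stochastic processes in the textbook sense; the walk simply carries its history along while the dice do not.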

I think the distinction you are trying to make is between unconditioned and conditioned probability. Rolling a die is unconditioned since any one outcome is independent of previous outcomes. Biological mutations on the other hand are highly conditioned on previous history, so probability distributions are highly asymmetric. The connotation of rolling dice is what creationists are trying to create when they say "random", but the whole point of natural selection is the conditioning of probabilities according to genetic history and the physical environment.

I just wanted to reinforce the comment above in case there was any confusion - stochastic and random are in fact synonyms. In any situation you can use stochastic or random interchangeably - now, one or the other may be more common (we tend to say "stochastic processes" vs. "random processes" and "random numbers" vs. "stochastic numbers") BUT - you can almost always find literature where they use the other form. The distinctions you make above are valid in terms of understanding conditioned vs. unconditioned probability, or the uniform distribution vs. a non-uniform distribution (depending on the specific example). So, there is nothing "special" about the use of the word "random" vs. "stochastic". I think sometimes biologists get hung up on terminology because it is so important to describe the details that are the subject matter of biology - whereas in math and physics, everything comes back to fundamentals, and the words are just jargon for how you describe the math - the math is what really matters, the structure of the equations. The words are mostly irrelevant, and they change based on the age of the literature or the cultural background of the authors.

I think you nailed it. Random means not capable of being predicted, whereas stochastic means ineliminably probabilistic (where the probabilities involved may be exactly determinable).

Terminology IS important. In most cases, we do use the terms "stochastic" and "random" interchangeably, but when we are dealing with technical questions, distinctions of the sort Mike is drawing are meaningful. The usual place one sees this distinction is in quantum mechanics. Take a system comprising a single particle in a superposed state for some observable with two possible values (electrons and spin are the standard example). When we observe the particle, the wave function collapses and the particle will always be observed in one of the two states. The formalism of the theory is absolutely deterministic in its ability to posit the distribution of observations, say 75% spin up and 25% spin down. If you were to conduct this experiment many, many times, the formalism is always spot on. But while it is deterministic in giving us the distribution, for any given observation the result is random. We cannot determine for any given electron whether it will be spin up or spin down -- it is random, but for a large collection we can determine the distribution -- it is stochastic, that is, there is a fundamental probabilistic aspect that is determinable.

Of course, I'm just a humanist...

> Random means not capable of being predicted, whereas stochastic means ineliminably probabilistic.

Umm no. Again, look at any textbook definition of stochastic process and it says "a collection of random variables". In the most technical sense possible, stochasticity is defined in terms of randomness. I think it's the definition of "random" that is causing the confusion: it's broader than you think. A random variable is any variable on a probability space. You can "determine" a distribution for it (since a probability space necessarily requires a distribution function).

The whole QM example is superfluous (and in fact the word stochastic is never used in quantum theory). You could just as easily specify a formalism for dice. It determines a distribution of (1/6, 1/6, 1/6, 1/6, 1/6, 1/6). There is nothing fundamentally different between that and the Copenhagen interpretation: a typical Stern-Gerlach experiment is no different than flipping a coin.

Thanks for the clarification; I see the technical distinction you are trying to make, although I still think the two *words* are synonymous. But now I'm more confused by your original post. Although creationists do abuse the word "random" constantly, I don't see them conflating it with "stochasticity." They're usually just exploiting the common connotations of "random" to make evolution seem highly unlikely, much as the common (mis)understanding of the word "theory" implies doubt and dissent. How can fudging the subtle distinction between "random" and "stochastic" be, to you, one of the more painful examples of scientific illiteracy? I don't see how for the average person this idea is important - especially compared with why it's OK that evolution is a theory, or how antibiotic-resistant bacteria evolve.

Perhaps I'm setting my threshold for scientific literacy too low. But I've been exposed to some remarkable illiteracy while teaching - such as the time I tried to explain to my college science class that due to its slow rotation, we can't directly view the moon's back side. I couldn't get a single one of my thirty students to buy it. Getting them to grasp a difference as technical as the one you describe in this post wouldn't be one of my priorities.

The word "stochastic" is always used in discussing interpretations of QM for exactly the sort of reason Mike is concerned with here -- there is an inherent difference between ineliminably stochastic processes and classical examples like dice or coins. If one were to take systems amply modeled by classical theory like coins and specify all of the values for all of the operative variables, then the whole thing becomes completely deterministic. We simply don't have access to the variables and this is why the captains call heads or tails after shaking hands. In the case of an EPR/Stern-Gerlach experiment, however, there is an ineliminable randomness attached to each event, but that randomness is constrained by the probabilities attached to the system. A quantum system is stochastic in that it has absolutely determinable probabilities attached to it, yet any given observation is random because complete specification of all operative variables does not allow you to determine the outcome of the measurement.

> The word "stochastic" is always used in discussing interpretations of QM for exactly the sort of reason Mike is concerned with here -- there is an inherent difference between ineliminably stochastic processes and classical examples like dice or coins.

Argghh...now you're talking about the distinction between quantum and classical ignorance, and that's completely off-topic. Not only are you abusing terminology, you're throwing around examples from QM to confuse all the non-physicists. Biological mutations are classical processes, but that hardly means they're not stochastic. The source of ignorance in a Stern-Gerlach experiment and coin-flipping may be different, but the outcome of each is a random variable, and strings of outcomes can be viewed as stochastic processes. And they both have the same distribution functions.

Once again, there is NO distinction between randomness and stochasticity. A random variable is a stochastic process with n=1. Honestly, read an undergrad text on probability (e.g. Sheldon Ross).