Misrepresenting Simulations

Yet another reader forwarded me a link to a rather dreadful article. This one seems to be by
someone who knows better, but prefers to stick with his political beliefs rather than an honest
exploration of the facts.

He's trying to help provide cover for the anti-global warming cranks. Now, in light of all of the
data that we've gathered, and all of the different kinds of analyses that have been used
on that data, for anyone in the real world, it's pretty undeniable that global warming is
a real phenomenon, and that at least part of it is due to humanity.

One of the standard arguments from the supposed skeptics about global warming is
the fact that much of our understanding of it is generated using simulations. That's exactly the tack that this article takes:

Think of a (mathematical) function. The function takes values as input, processes the values, and spits out a value. A simulation is a lot like a function, with three crucial exceptions. First, a simulation has no real input values, but uses instead simulated input. Second, because the input is simulated, instead of running the simulation only once (as you would with a function), you have to run a simulation many times (iterations), then statistically analyze the output -- the simulated output, calculated from the simulated input. Third, because the output of the simulation is simulated output and there are multiple outputs (because the simulation must be run many times), it must be statistically analyzed for reliability and the multiple results must be analyzed statistically.

A function is certain. A simulation is uncertain.

This is utter bullshit. No nice way to put it; it's utter crap. It is not a fact that
simulations don't take real input data. It is also not a fact that all simulations
need to be run multiple times. This is a pure strawman; some simulations are run with
simulated input; some simulations are run multiple times with varying inputs to get a sense
of trends. It is manifestly not the case that all simulations work this way; and
many of the simulations used for analyzing global warming trends were fully deterministic simulations run using real measurements.

Let's take a moment and ask the fundamental question: What is a simulation?

For our purposes: a simulation is a computer program which implements a mathematical model
of some phenomenon. The input to a simulation is the data needed to describe an initial state
of that phenomenon. The simulation is run against the input data, and produces a description of
what its mathematical model predicts about the state of the phenomenon at subsequent points in time.

Given a simulation model, it can be run in a number of different ways:

  1. Verification: given a set of data for some point in the past, the simulation can be run
    to see if its results match the present.
  2. Prediction: given a set of data for the present, the simulation can be run to
    generate predictions about the future.
  3. Exploration: given a manufactured set of data, the simulation can be run
    to explore what could happen in a given situation, or to study how the model works
    in various cases.

The author of the original article pretends that everything is case 3 - the exploratory case,
where the data being input is manufactured, rather than being a representation of real measurements.
He also assumes that the simulation is stochastic: that is, that it is using some sort of
randomness in its input, so that starting from the same input data, the simulation will
generate different results. In my experience, that's incredibly rare in computer simulations
- in fact, I can't recall ever seeing a single simulation that wasn't fully deterministic -
meaning that it always generates the same result for the same input.
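To make the three modes concrete, here's a minimal sketch using a toy model of my own invention, not anything taken from real climate code. The hypothetical `simulate_cooling` function steps Newton's law of cooling forward with a fixed-step Euler update; the rate constant and the "measurements" are made up purely for illustration.

```python
def simulate_cooling(initial_temp, ambient_temp, rate, minutes, dt=0.1):
    """Toy deterministic simulation: Newton's law of cooling, Euler steps.

    Every name and constant here is hypothetical, chosen only to illustrate
    the three ways of running a model described above.
    """
    temp = initial_temp
    steps = int(round(minutes / dt))
    for _ in range(steps):
        # dT/dt = -rate * (T - ambient)
        temp += -rate * (temp - ambient_temp) * dt
    return temp


# 1. Verification: start from a (made-up) past measurement and compare the
#    result with a (made-up) measurement of the present.
past_measurement = 90.0      # degrees C, ten minutes ago
present_measurement = 63.0   # degrees C, now
hindcast = simulate_cooling(past_measurement, 20.0, rate=0.05, minutes=10)
verification_error = abs(hindcast - present_measurement)

# 2. Prediction: start from the present measurement and run forward.
forecast = simulate_cooling(present_measurement, 20.0, rate=0.05, minutes=10)

# 3. Exploration: feed in manufactured input to see how the model behaves.
what_if = simulate_cooling(150.0, 20.0, rate=0.05, minutes=10)

# Determinism: the same input always produces the same output.
rerun = simulate_cooling(past_measurement, 20.0, rate=0.05, minutes=10)
assert rerun == hindcast
```

Nothing in the code changes between the three modes; only the source of the input does. There's also no random number generator anywhere in it, so there's nothing to "run many times and analyze statistically".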

There's one thing he got right: functions are certain. Given a function, you know what
its result is, and you can check whether or not the result is correct. For a simulation, the result
is more fuzzy - the simulation may be generating a "correct" result in the sense that it doesn't have
any bugs, and it generates the result that its model says it should; and at the same time be
completely wrong because the mathematical model that it's using doesn't accurately
represent reality. And there's one other thing that's true, although he doesn't explicitly mention it: many simulations are probabilistic, in the sense that instead of generating exactly one result, they compute multiple possibilities: when there are choices about what to do in the model, they
run all of them, generating a probability for each of the branches.

With that out of the way, let's continue on to the next interesting part of the article.

So how does a simulation simulate input values? Usually, by taking real data, analyzing it statistically to determine its frequency and distribution, then using statistics to generate input values using the same frequency and distribution. Note that in order for this to work, one must assume that the data are stable, that is, that the frequency and distribution of the data will not change over time.

The part of the simulation that corresponds to a function is known as the model. Obviously, only an accurate model can produce reliable results (output), and the more accurate the model, the more reliable the results.

Simulations can be powerful tools for making predictions, and are used in many fields, including
business. However, because simulations use simulated (that is, not real) input and result in
simulated (that is, not real) output, they have no evidentiary power -- that is, you cannot use a
simulation as evidence for anything, nor can you call the output of a simulation (real) data.

The first paragraph is not nonsense, but it's not exactly honest, either. He wants to make
it look like the models are as uncertain as possible, so he stresses the idea of a stochastic model.
A stochastic model is entirely probabilistic - its model is based on nothing but statistics about how
things have behaved before. Stochastic models are relatively weak, as models go; they're generally
used in cases where we don't have a good behavioral model that accurately models the real phenomenon
that's being simulated. In any situation where we use simulations, we strongly prefer
physical models - that is, instead of using stochastic models, we prefer a simulation that's based on
really simulating physical behaviors, rather than just playing with probabilities. What he's saying is misleading - because he's pretending that all models are stochastic.

He also deliberately overstates the case about the evidentiary value of simulations. Simulations are never considered the equivalent of real-world evidence in terms of quality; but they
most emphatically are frequently used as supporting evidence for various theories. The quality of a simulation as evidence is generally based on how well it performs in verification runs - a simulation that can be demonstrated to generate accurate results in a wide variety of situations is considered a good piece of supporting evidence when run on data similar to the data used for the verifications.

For example, the US Army's ordnance testing facility in Maryland now uses simulations
for most of its tests. It uses tests for information gathering, and it periodically
performs tests to validate the simulations; but the quality of the simulations has gotten high enough that for many purposes, they no longer consider it necessary to run real tests of dropping explosives from an airplane. They quite definitely consider the results of those simulations to be evidence!

And now, finally, we get to the sleaziest part.

You can use estimated data in your model, but doing so inserts another layer of uncertainty into the simulation results. Let me explain.

Since the data in our model are estimated, we must use statistics to determine their validity. Since Mr. Lewis uses dice, I will as well, sticking for the sake of simplicity and clarity to rolling one die.

If you roll a die, the probability that you will roll, say, a 1 is 1/6. If you roll the die a second time, the probability that you will roll, say, a second 1 is 1/6 * 1/6, or 1/36. If you roll the die a third time, the probability that you will roll, say, a third 1 is 1/6 * 1/6 * 1/6, or 1/216, and so forth.

Estimated variables in a simulation model must be treated in the same way as rolling a die, because each is uncertain, and each involves probability. Assuming that our climatologist is ethical, each of the estimated variables in his model should fall within the statistical norm of reliability, or be 95% reliable. Given the complexity of climatological models, hundreds of such estimated variables would be necessary, but for clarity's sake, we will say the model includes only 50 such variables.

That means that the reliablility of the simulation model is 0.95^50 (0.95 raised to the fiftieth power), or 0.0769, or 7.7%. So even if we didn't have the uncertainty of the simulated input (not to mention the additional uncertainty of assuming that the data are stable), even if there were no uncertainty in our ouputs themselves, our simulation results would only 7.7% reliable.

This is a thoroughly dishonest bunch of babble, which is in no way an accurate description of anything.

Even if we accept what he says at face value: that there are multiple variables
in a simulation which need to be considered separately in terms of probability - he's quite
deliberately ignoring the correct way of combining those probabilities. In fact, he's really just trying to play the inverse of a classic "big numbers" game - he wants to artificially combine things to make the probability look as untrustworthy as possible. The trick is in pretending
that all 50 (or whatever) variables in the simulation are independent. In real
climate simulations, the kinds of things that become variables are not independent. To give a couple of examples, real climatological simulations will include a parameter to describe the
humidity of airmasses based on temperature; and a parameter to describe the viscosity of airmasses based on temperature and humidity. Those are not independent - the viscosity of the airmass is determined in part by its humidity; the ability of the airmass to pick up more moisture while over the ocean is determined in part by its viscosity. The probabilities of these things being correct are not independent - if one is right, the other is almost certainly right; and if one is wrong, the other is almost certainly wrong - because each depends on the correctness of the other. Dependent variables get treated very differently in a probability calculation than independent variables - that's what Bayes' theorem is all about.
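To see how much the independence assumption matters, here's a rough sketch. The independent case is exactly his 0.95^50 calculation. The dependent case uses a toy "common cause" model that I'm making up for illustration (it is not a formula from any climate model): with probability `shared`, all 50 variables stand or fall together on one underlying cause; otherwise they behave independently. Each variable on its own is still 95% reliable in every case.

```python
def joint_reliability_independent(p, n):
    """P(all n variables are right) if every variable is independent."""
    return p ** n


def joint_reliability_common_cause(p, n, shared):
    """Toy dependence model (an assumption, not a climate-model formula).

    With probability `shared` all n variables stand or fall together on a
    single underlying cause; otherwise they behave independently.
    """
    return shared * p + (1.0 - shared) * p ** n


p, n = 0.95, 50
independent = joint_reliability_independent(p, n)
fully_dependent = joint_reliability_common_cause(p, n, shared=1.0)
mostly_dependent = joint_reliability_common_cause(p, n, shared=0.9)

print(f"independent:      {independent:.4f}")
print(f"mostly dependent: {mostly_dependent:.4f}")
print(f"fully dependent:  {fully_dependent:.4f}")
```

The 7.7% figure only appears at the fully independent extreme. The same 50 variables, each still 95% reliable, give a joint reliability anywhere up to 95% once they depend on one another.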

But it's much worse than just making a misleading probability argument. He's very deliberately
mischaracterizing how we model the accuracy of a simulation. The accuracy of a simulation is based on
its performance and the known accuracy of the fundamental model which it's based on. So, for example,
most airplane manufacturers no longer use wind tunnels - computational fluid dynamics simulations
generate better results than the wind tunnel (Do a websearch on "Boeing" and "Tranair"). The
reliability of the simulation is based on two things. One is a long history of measuring things on
instrumented aircraft, and comparing the measurements to the predictions from the simulations; the
other is the known accuracy of the Navier-Stokes equations, and the computational methods used to
implement NS systems. On the basis of those two, we come up with results about how accurate we
believe the models to be.

And further - we look at simulations based on multiple models. If 20 different models, generated in 20 different ways, all of which have strong track records for accuracy - if all 20 of them have been implemented by simulations whose quality has been demonstrated - and all of them generate nearly the same result, and no system/model with a proven track record disagrees, then we consider the results of those simulations to be very strong evidence.

Of course, he saves the worst for last.

Climatological simulations cannot be taken very seriously. They can certainly never be taken as evidence or proof, as no simulation can be, because they are simulations. They aren't real.

What disturbs me about all this global warming warfare is that the climatologists know this. They know that their models have no evidentiary power. Yet, they disingenuously claim the reverse. This isn't science. It's politics. It's dishonest. And it's a breach of professional ethics and integrity.

So says the man who just threw together a bundle of lies and misrepresentations to try to
support a pre-determined opinion without regard for the facts. And he accuses others
of dishonesty and breaches of ethics and integrity.

I found his first paragraph pretty amusing. Never have I seen the word simulation used so frequently, two or three per sentence. He clearly hopes that the reader will conclude that simulation**(a high power) is meaningless, therefore GW is a manufactured result!

There is a lot of dishonesty on both sides.

For example back testing. Here you take a period, say 20 years. You run your models and extract out the parameters that give a model that is good at explaining the data.

Now you test, say against the next five years. If the model is still good, you use that model to make a prediction.

Is this a good approach? It seems at first sight that it is. The model has predictive value for the 5 years it predicted blind.

However, there is a strong bias involved. You are only going to go forward with models that did predict the 5 year period blind and you will have selected out the models that didn't. It has introduced a bias.

Happens all the time in the finance world, and the results are pretty much the same. Money down the drain.

Next issue concerns parameters. If you have to fudge a physical constant, or a parameter that is measurable in the lab to make the model work, it's a failed model.

If your parameters are uncertain, then you have to perform a set of simulations across the range of values. If that leads to wildly different answers, then your model or simulation doesn't have predictive value.

A model is only good if it makes predictions that are testable.

If there are lots of models, and they have a prediction that is good at the 5% level, 1 in twenty is going to be wrong by chance.

If you believe in GW, run lots more models. Some of them will be right by chance.

Nick

I don't know why he even bothered to write that long. All he says and had to say is that he didn't like the climatology data. The key paragraph is:

...because climatology is a young science, and nobody is certain how, say, CO2 or water vapor affects the climate. A climatologist, given enough data, can make an educated guess. But it's still estimated.

What he needs to do is to argue with a climatologist about the quality of their data and their understanding. He doesn't do that, so he shouldn't expect to convince anybody.

However, there is a strong bias involved. You are only going to go forward with models that did predict the 5 year period blind and you will have selected out the models that didn't. It has introduced a bias.

Wouldn't this just bias you in favor of the models which worked?

Time is asymmetrical. The models that do best at prediction often do worst at retrodiction, and vice versa. There are deep theoretical reasons for this. But it tends to make both sides of the Anthropic Global Warming debate look like idiots, when one gets foundational.

Now, I don't know too much about this subject, but I'm confused by Nick's comments. Surely you want to be biased against models that don't work? And you point out ways to invalidate models, but that just means that they don't use those models to run predictions. I don't see how it would indicate dishonesty.

A little more explanation, please?

By CaptainBooshi (not verified) on 21 Jan 2007 #permalink

This guy doesn't like statistics, does he?

I love the full power of this:

You can use estimated data in your model, but doing so inserts another layer of uncertainty into the simulation results.

Translation: if we try to base our simulations on the real world, things are even worse.

Bob

Nick, climate science is not finance. They build on really good models, known as "physics" and "chemistry", rather than the weak models you see in finance.

You misrepresent uncertainty as certainty, claim that simulated data can be evidentiary (whether people use it as evidence is another issue altogether), and then call into question my ethics?

The problem here is that science doesn't concern you; groupthink and politics do. I am a global warming agnostic. I have no position either way. I'm not the one with the political agenda.

You are.

Uncertainty is uncertainty. Period.

I am a global warming agnostic. I have no position either way.

Climatological simulations cannot be taken very seriously. They can certainly never be taken as evidence or proof, as no simulation can be, because they are simulations. They aren't real... the climatologists know this. They know that their models have no evidentiary power. Yet, they disingenuously claim the reverse. This isn't science. It's politics. It's dishonest. And it's a breach of professional ethics and integrity.

So it's not anthropogenic global warming you're opposed to-- just climatology itself?

I'm not the one with the political agenda.

The handle "rightwingprof" and blog named "right wing nation" are just coincidences, I'm sure.

rightwingprof

You are new here so I will try to be nice.
The blogger and most of the readers on this blog understand math and stats much better than you (we didn't teach you statistics, we taught measure theory to the statistics Prof. who taught you statistics).

Just a list of some questionable "simulations"
p = mv
k_e=1/2mv^2
E=mc^2
F=ma
F =GMm/r^2

The problem is that all of the hard sciences consist entirely of "simulations" (or models as we like to call them). Maybe it isn't equations that are bothering you, maybe it is the fact that differential equations are being solved numerically.
I don't know what the issue is, but the best you can hope for here is a spanking.

rightwingprof: For anyone who reads this blog, the allegation that "science doesn't concern" Mark CC is ludicrous. That dog won't hunt. If you'd like to have a serious discussion of exactly what is wrong from a mathematical/scientific viewpoint with the numerous climate simulations showing global warming, I'm sure Mark CC and the folks at Real Climate would be glad to either engage you, or point you to FAQs if the particular objections you raise have already been answered.

Nick: If you care to make accusations that predictions of global warming stem from running lots of simulations, some of which show global warming "by chance," and that "there is a lot of dishonesty on both sides," I think you'll have to be prepared to show actual examples to back up what you're saying. Got any?

If you roll a die, the probability that you will roll, say, a 1 is 1/6. If you roll the die a second time, the probability that you will roll, say, a second 1 is 1/6 * 1/6, or 1/36. If you roll the die a third time, the probability that you will roll, say, a third 1 is 1/6 * 1/6 * 1/6, or 1/216, and so forth.

A variant on the old "The last 9 times I flipped this coin it fell on heads. What's the probability it will land on heads next time?" debate. If you know your probability, the answer is a half. If you don't, like the article's author, you'll suggest something like "But it can't come up 10 times in a row!!"

People who live in illiterate houses shouldn't throw numbers around.

By ObsessiveMathsFreak (not verified) on 22 Jan 2007 #permalink
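The distinction drawn in the comment above, between the probability of a whole run and the probability of the next roll, can be checked exactly by brute-force enumeration. This is only a small sketch for a fair six-sided die; the variable names are mine.

```python
from fractions import Fraction
from itertools import product

FACES = range(1, 7)

# Enumerate every equally likely sequence of three rolls of a fair die.
rolls = list(product(FACES, repeat=3))

three_ones = [r for r in rolls if r == (1, 1, 1)]
first_two_ones = [r for r in rolls if r[:2] == (1, 1)]

# Probability of the whole run 1, 1, 1 before any die is rolled.
p_run = Fraction(len(three_ones), len(rolls))

# Probability that the third roll is a 1, given the first two already were.
p_next = Fraction(len(three_ones), len(first_two_ones))

print(p_run)   # 1/216
print(p_next)  # 1/6
```

The 1/216 figure describes the run as a whole, judged in advance. Once two 1s are already on the table, the third roll is back to 1/6, exactly as with the coin.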

(1) Mark is right to deconstruct such dishonest pseudo-Math.

(2) The underlying issue is more about subjectivity and dishonesty than the Math being abused.

(3) Anthropic Global Warming is one of these issues that fire people up or leave them cold; a hot-button issue which is very polarizing, such as Abortion or Intelligent Design. Most people don't see the need for objectivity, having taken sides for religious or political ideology further contaminated with ad hominem bias.

(4) Time is asymmetrical. The models that do best at prediction often do worst at retrodiction, and vice versa. There are deep theoretical reasons for this. But it tends to make both sides of the Anthropic Global Warming debate look like idiots, when one gets foundational.

(5) What follows is somewhat rambling, but tries to address the issues of subjectivity, nationalism, and why some people get excited over things of vast indifference to others.

For whatever reason, something woke me up around 2:30 a.m. this morning, and I'm still awake -- freezing cold, but awake.

Last night, with my wife, son, and son's girlfriend, we enjoyed a Burns' Night event, commemorating Robbie Burns' 248th birthday. My Physics professor wife, born and brought up in Edinburgh, can better speak than I to the Scots culture, and from the viewpoint of a blood relative of the other of the two greatest poets of Scotland. Other features of the festivity are scotch, poetry, bagpipes, and haggis. People are either passionate or indifferent to all 4.

As to Haggis, I've always liked it. I've never understood why people are creeped out by it, unless there is some bias in describing how sausage is made to people who don't want to know. If so, that's some sort of analogue to people who can't enjoy a meal whose food is artificially colored with food dyes to unusual shades.

The haggis at the restaurant we were at, Beckham Place in Pasadena, was okay, neither as good as the good haggis I've had from good butchers in Scotland, nor as mediocre as at some fast-food places.

It is more oatmealy than conventional meat dishes, with a distinctive set of spices -- not so spicy though as Mexican chorizo. The meat flavor is more interesting than sides of beef or lamb, which are mostly muscle, as it is organ meats of various kinds. I happen to like liver, kidneys, and -- though I won't eat them any more in a world with too many slow viruses and pathologically misfolding proteins (prions) -- brains. I recommend, for instance, the chicken liver and waffles lunch at Roscoe's Chicken & Waffles on Lake Avenue, in Pasadena.

I also don't know why some people dislike bagpipes. To me, the sound is stirring, distinctive, and haunting. It is also very good for going into battle. I admire the pipers who march into battle as, for instance, on D-Day. Imagine plunging into combat with no armament, but that which stirs one's own compatriots. More visceral than a flag-waver, but somehow transcending symbolism.

I also enjoyed an unusual comic -- Joel "Joey" Friedman -- whose comedy routine I saw at a local venue, the Coffee Gallery. He plays a quintessential Geek, mentions his MIT degree and software engineering career in some depth, as well as his wife being a former US Women's Chess Champion.

There were people on stage (10 people auditioned) who have been doing standup for many years. Some of them were more comfortable onstage, or more practiced in microphone technique. But none had better or more original material, I thought. I hope that Joel perfects and enriches the Geek character, rather than telling conventional jokes. I want Joel to "kill" at the Googleplex, at an Apple convention, at Microsoft Research. He could be a very big hit in that social niche, which is, after all, a trillion dollar technology industry. And normal people now get his jokes (i.e. waiter at restaurant slips and falls; Joel calls tech support because his server is down). Google for his web site.

I'm also now in the interesting position of writing a technical scientific paper which evolved from a response to a particularly irritating Intelligent Design troll on this very fine science blog. He asked, I thought, a very good question. As a scientist, and as an artist, I respect a good question, regardless of ad hominem feelings about its author.

I was disappointed by the result of the much-hyped Indianapolis (having snuck away years ago from Baltimore so egregiously that science fiction bestselling author Harry Turtledove complained in the LA Times Sport section this past week) Colts versus New England Patriots rivalry, where (from my biased view) the wrong team won, and is going to the Super Bowl. Yet I acknowledge the game as exciting, a victory for the Colts' quarterback who is famous for never winning the big game, and that for the first time the Super Bowl pits two African American coaches against each other. Perhaps few of you reading this comment care about football. It isn't even the big game to me, who grew up with baseball as THE American game, and football being dominated by college football, even though the NFL goes back to the early 1930s. TV changed everything in sports, not necessarily for the better.

Had you heard that Caltech's basketball team won about a week ago? No joke. They snapped a 207-game losing streak.

For perhaps a majority of people in the world, sports substitutes for what we get from science and literature. Style, excitement, a kind of certainty commingled with uncertainty.

More about subjectivity than usual in this comment. If you're still reading, thanks for letting me ramble about a tricky topic, or vague area not quite in any thread.

Now, in light of all of the data that we've gathered, and all of the different kinds of analyses that have been used on that data, for anyone in the real world, it's pretty undeniable that global warming is a real phenomena, and that at least part of it is due to humanity.

It is interesting to see the different kinds of reasonings here. (And my own analysis will no doubt be incredibly naive, since I haven't studied this much.)

First, the kind of theory and evidence. Global warming is obviously not a hypothesis or a first order theory that one can test against one simple prediction. It is not a climate theory that explains why we can measure a temperature for example, but it is telling us to look for certain climate effects on top of default climate. It seems more analogous to general relativity explaining deviations from classical gravitation.

So we look at several predictions (warming, energy to drive weather system, ice coverage, migration of species), and each is currently giving us small and perhaps overlapping changes in distributions. It is quite like in medicine where you may not see separated distributions between healthy and sick specimens.

For example, in some cases of glaucoma, eye pressures overlap. Apparently one gets false positives and negatives if the individual's pressure wasn't recorded in the healthy situation. I imagine that some people with other eye problems may get treated just in case.

But on this matter I've seen physicists complain in earnest that you can't get 3 sigma certainty for one GW (or medical!) prediction, since this is what physicists typically use when testing theories.

Second, the situation. If GW was a separate phenomenon, it may be argued whether we should take action before estimating full costs and risks. But it is linked to extinction rates, and in this case we are certain that we live in one of the great extinctions, and that it will cost us in diversity and potential economy.

And market economy is like evolution, pretty intelligent and resource effective in the short perspective, but not good at looking ahead and arriving at the most efficient global solution. (Think vagus nerve, or vertebrate eye.) It should be prudent, at some point, to offer a little bit of immediate returns even if there are remaining uncertainties, even if the market expects typically exponential returns over time. (But don't ask me how to decide that break point. :-)

rightwingprof:

science doesn't concern you

You have missed one of the points of this blog; you have also failed to note Mark's occupation.

So, does this opinion of yours make your claims more credible? Think about it - and please don't chalk that answer up to uncertainty.

By Torbjörn Larsson (not verified) on 22 Jan 2007 #permalink

What really annoys me about the people complaining about simulations is that they've got some bold text written between the lines:

"Humans can't predict ANYTHING! Therefore, science does not exist."

If you want to argue about simulations, don't take such a foolish stance. Argue about what really matters: Does the simulation in question accurately model the relevant laws of nature and known facts?

Mark, a few comments:

I would have pointed out that a simulation is a mathematical function, whose value is calculated on a computer. The dichotomy between a "function" and a "simulation" is false to begin with.

Regarding stochastic simulation, there are simulations which are inherently stochastic: running them twice with the same input values will produce different outputs. There are times when this is important to do, e.g. in the case of stochastic resonance, when noise internal to the system can fluctuate at a frequency proportional to some natural frequency of the system, and drive it into a different basin of attraction. For example, see the theory of stochastic resonance in ice age cycles.

By Ambitwistor (not verified) on 22 Jan 2007 #permalink

Nick:

Like others, I am confused by the "bias" involved in discarding models that don't work. Some would call that "science".

Regarding parameters, are there any particular climate studies you are accusing of altering known physical constants to fudge the results?

Finally, regarding uncertainty in parameters: yes, if you get a range of results out of a sensitivity analysis of the model, then that means the model is not very predictive. That's precisely why such sensitivity analyses are performed, to judge a model's utility. Note that this isn't necessarily a problem with the model; even a perfect model can make wildly different predictions for slight variations in input parameters if the physics of the modeled system is chaotic.

By Ambitwistor (not verified) on 22 Jan 2007 #permalink
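The last point in the comment above, that even a perfect deterministic model can be wildly sensitive to its inputs when the underlying system is chaotic, is easy to demonstrate with a standard textbook toy. The sketch below uses the logistic map with r = 4, which has nothing to do with any actual climate model; it is just a well-known chaotic system, and the starting value and perturbation size are arbitrary choices.

```python
def logistic_trajectory(x0, steps, r=4.0):
    """Iterate the logistic map x -> r * x * (1 - x), a standard chaotic toy system."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


baseline = logistic_trajectory(0.2, 100)
perturbed = logistic_trajectory(0.2 + 1e-10, 100)  # tiny change in the input
repeat = logistic_trajectory(0.2, 100)             # identical input

# Gap between the two runs at each step.
gaps = [abs(a - b) for a, b in zip(baseline, perturbed)]
early_gap = max(gaps[:10])   # the runs still agree closely at first
largest_gap = max(gaps)      # later they bear no resemblance to each other

print(f"early gap {early_gap:.2e}, largest gap {largest_gap:.2f}")
print("identical input reproduces the run:", repeat == baseline)
```

The model here is exact and fully deterministic: the same input reproduces the same trajectory every time. The spread comes entirely from the physics being modeled, which is what a sensitivity analysis is meant to expose.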

rightwingprof:

I do believe that is the worst attempt at a rebuttal I've seen in some time, and on the Internet, that means something.

Not only was your comment entirely free of scientific content or argumentation, it didn't even address Mark's points. Mark never claimed that "uncertainty was certainty", he merely noted that uncertainty in models is quantifiable. Nor did he claim that "simulated data can be evidentiary". (I think you are rather confused as to what "simulated data" means in science, as opposed to "simulation output".) He did correctly claim that the predictions of models are used to support theories, since that is how all science is done, whether the models are calculated on a computer or by hand.

Your comment as a whole took the last recourse of the intellectually lazy: it contested nothing, ignored Mark's claims, made no new claims, and failed to substantiate any of the assertions it did make. Finally, you accused your opponent of bias and declared yourself the victor.

This is not impressive.

By Ambitwistor (not verified) on 22 Jan 2007 #permalink

Re stochastic simulations

Having been involved in the development, testing and application of simulation models of automobile traffic flow for evaluation of traffic control policies for 22 years, I would like to comment on the issue of stochastic simulations.

There are some applications of simulation which are inherently stochastic in nature. Simulation of automobile traffic is one of these. It is an unfortunate fact that driver behavior and vehicle performance characteristics are not deterministic. I provide two examples (there are many more) below:

1. The service rate of a queue of vehicles discharging at a traffic signal is variable, depending on driver aggressiveness and vehicle acceleration capabilities. Thus in simulating such a scenario, this variation must be accounted for; usually this is done by applying Monte Carlo techniques.

2. The speeds that drivers wish to attain, especially on freeways, are variable. Some drivers like to drive at 80 miles/hour, others at 70 miles/hour, still others at 65 miles/hour. Thus, in simulating such a scenario, this variability must be accounted for, again usually by applying Monte Carlo techniques.

A simulation model that does not account for such stochastic variability will not accurately simulate traffic flow, particularly in the most interesting region of congested traffic.
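
As a rough illustration of the first example, the sketch below draws a random headway for each driver and counts how many vehicles clear the signal in one green phase. The function name, the 30-second green, and the uniform 1.5 to 2.5 second headway range are all assumptions made up for this sketch, not field values or the method of any particular traffic package.

```python
import random

def vehicles_discharged(green_seconds, rng, min_headway=1.5, max_headway=2.5):
    """Count vehicles leaving a queue during one green phase.

    Each driver's headway (seconds behind the previous vehicle) is drawn at
    random, standing in for variation in aggressiveness and acceleration.
    The uniform range is an illustrative assumption, not measured data.
    """
    elapsed, count = 0.0, 0
    while True:
        elapsed += rng.uniform(min_headway, max_headway)
        if elapsed > green_seconds:
            return count
        count += 1

rng = random.Random(42)  # seeded so the experiment can be repeated exactly
counts = [vehicles_discharged(30.0, rng) for _ in range(10_000)]
mean = sum(counts) / len(counts)
print("fewest:", min(counts), "most:", max(counts), "mean:", round(mean, 2))
```

No single run is "the" answer; the quantity of interest is the distribution over many runs, which is why Monte Carlo models of this kind are run repeatedly.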

I wouldn't say it was the worst one out there, Ambitwistor, but I will say it's probably the worst one that tries to look like an actual argument. In short, the only way it'd get worse is if he stuck in gay and suicide "jokes."

I have put this in many forums, so I will include it here. When it comes to doubt about global warming we only need to look at two facts.

1) We know that in a system such as ours, if you add CO2 to it, it will be warmer. That is, we know that CO2 is a greenhouse gas, and we have known this for over 100 years.

2) We have polar ice cores that give us CO2 levels for the last 40,000 years. For the last 10,000 years the CO2 levels were relatively constant. Beginning 200 years ago, those levels began to increase. They have increased substantially. This coincides directly with the industrial revolution.

Climate modeling etc. are just going to validate that global mean temperature has increased in the last 200 years and will continue to do so. Anyone who says otherwise has a political ax to grind.

The debatable points for global climate change are the effects this increase in CO2 and warmer mean temperatures will have on weather, climate, etc.

Nick wrote:

However, there is a strong bias involved. You are only going to go forward with models that did predict the 5 year period blind and you will have selected out the models that didn't. It has introduced a bias.

Which provoked me to write:

Wouldn't this just bias you in favor of the models which worked?

As best as I can tell, this stirred Jonathan Vos Post to say:

Time is asymmetrical. The models that do best at prediction often do worst at retrodiction, and vice versa. There are deep theoretical reasons for this.

I am not clear on why the asymmetry of time enters the question. To make the problem more concrete, suppose we gather (from tree rings, ice cores and so forth) climate data for several thousand years, ending in 1990. We fine-tune our computer program so that, given CO2 concentration for a specified year, we can compute the average global temperature for that year, matching the temperature we determined experimentally. Then, with our code in hand, we run the simulation forward from 1990, feeding it only the CO2 data, and we compare the temperatures it outputs with those we have on record. If this comparison works out favorably, we can say we have a useful tool for predicting what global climate will do in 2007 and beyond.

Time is really only running forward, although during the debugging phase I suppose we could stop time and start it over again.

rightwingprof:

The problem here is that science doesn't concern you; groupthink and politics does. I am a global warming agnostic. I have no position either way. I'm not the one with the political agenda.

This implies that a "political agenda" is the only reason one could have a position on the question, which is patently absurd. The weight of the scientific evidence comes down on one side, and it takes a heavy political agenda to bring the lever back down to a superficially "agnostic" position. Someone who truly had no political beliefs, like Klaatu watching from his flying saucer, would be convinced by the scientific data.

Adding my two cents on stochastic simulations. My field is digital communications and we use them all the time, since the main obstacles to reliable communications are random; noise for example.

Under the right conditions (ergodicity and stationarity, for instance) some statistics of the simulation results will converge (for example, the measured reliability under certain amounts of noise); however, the noise used in each simulation run is random (pseudorandom actually :)
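
A minimal sketch of what that looks like, assuming the simplest textbook case: BPSK signalling over an additive white Gaussian noise channel. The function names and the 4 dB operating point are choices made for this illustration. The measured bit error rate from the random run is compared with the known closed-form value.

```python
import math
import random

def simulate_ber(snr_db, n_bits, rng):
    """Estimate the bit error rate of BPSK over an AWGN channel by simulation."""
    snr = 10 ** (snr_db / 10)          # Eb/N0 as a linear ratio
    sigma = math.sqrt(1 / (2 * snr))   # noise std dev for unit-energy symbols
    errors = 0
    for _ in range(n_bits):
        bit = rng.getrandbits(1)
        symbol = 1.0 if bit else -1.0
        received = symbol + rng.gauss(0.0, sigma)  # channel adds random noise
        decided = 1 if received > 0 else 0
        errors += decided != bit
    return errors / n_bits

def theoretical_ber(snr_db):
    """Closed-form BPSK error rate: 0.5 * erfc(sqrt(Eb/N0))."""
    return 0.5 * math.erfc(math.sqrt(10 ** (snr_db / 10)))

estimate = simulate_ber(4.0, 200_000, random.Random(2007))
print("simulated:", estimate, "theory:", theoretical_ber(4.0))
```

Different seeds give slightly different noise and slightly different error counts, but the estimates cluster around the theoretical value as the number of bits grows. Reusing a seed reproduces a run exactly, which is the "pseudorandom" part.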

Several people have mentioned that they don't understand why it is not a good idea to train your simulation on all the data up to the past five years and then run the simulation to compare its results with the actual measured results. In some cases this might actually be a good approach; it is an example of a broader technique called bootstrapping, but it's important to weigh the costs and benefits.

The first problem is that this can be seen as an inefficient mechanism for training on the whole data set. You are just specifying a fitness function that emphasizes predictive power in the last five years -- then just train that on the whole set. You actually aren't gaining anything through this approach and you may be sacrificing the ability of your algorithm to take that recent data into account.

The second problem, which is really a problem in all of these sorts of endeavors, is simply over-fitting your model to the data at hand. The result of this is that you can have very good predictive power over the training set but fail miserably on a novel input. Basically, any sufficiently complex model can be fit to any sufficiently complex problem; the trick is to know the difference between making small adjustments and just whacking it with a hammer.
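
Here is a small sketch of over-fitting under made-up assumptions: the "real" process is a straight line, and a fixed alternating error of plus or minus 0.5 stands in for measurement noise. The complex model is the degree-9 polynomial that passes through all ten training points; the simple model is a least-squares line.

```python
def lagrange_predict(xs, ys, x):
    """Evaluate the unique degree-(n-1) polynomial through all n points at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

def linear_fit(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

def truth(x):
    return 2.0 * x + 1.0  # the underlying process, assumed for illustration

train_x = [float(i) for i in range(10)]
# Fixed alternating +/-0.5 perturbation stands in for measurement noise.
train_y = [truth(x) + (0.5 if i % 2 == 0 else -0.5) for i, x in enumerate(train_x)]

slope, intercept = linear_fit(train_x, train_y)
new_x = 10.0  # a novel input just past the training range

complex_train_err = max(abs(lagrange_predict(train_x, train_y, x) - y)
                        for x, y in zip(train_x, train_y))
complex_new_err = abs(lagrange_predict(train_x, train_y, new_x) - truth(new_x))
simple_new_err = abs(slope * new_x + intercept - truth(new_x))
print(complex_train_err, complex_new_err, simple_new_err)
```

The polynomial reproduces the training set exactly, yet at the one new point it is off by several hundred, while the line that fits the training data imperfectly is off by well under one.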

"The accuracy of simulation is based on its performance and the known accuracy of the fundamental model which it's based on."

The reason I am skeptical of the arguments for global warming comes from just this point -- the fundamental models on which the simulations are based are not known to be accurate. We know the Navier-Stokes equations are an accurate model of fluid dynamics, not just because they agree with all the empirical data (though they do) but also because we can prove they are consequences of deeper physical laws. We know both why each term in the equations is included, and why there are no other terms. When it comes to climate, however, we don't have that level of knowledge; the simulators can no doubt justify each term in the equations a climate simulation is based on, but they can't say with any confidence that there are no other relevant terms. And, because we don't understand climate in this deep way, simulations of the climate have to be classed as speculations -- attempts to reach a deep understanding. (The proponents of global warming, however, usually speak as if they had the necessary understanding, and point to the simulations as proofs.)

Taking Blake Stacey's example, if the simulation he described did not fit the measurements taken after 1990, that would show the model underlying it is wrong. But the converse doesn't hold; a wrong model can fit a set of measurements just as well as, or better than, the right model does. Moreover, wrong models can often be adjusted to make them fit -- and an abundance of data makes it easier, in some ways, to force an agreement. (You can always say the data that just won't fit in is an outlier, for instance.) Using several thousand years of measurements to predict only 17 years' worth isn't a strong verification of a model.

MarkCC hasn't answered this class of objection, no doubt because the original article didn't raise it. To repeat, the dynamics of the climate are not well understood, in the sense that aerodynamics is, and so the accuracy of climate simulations cannot be proven.

By Michael Brazier (not verified) on 22 Jan 2007 #permalink

Blake:

Is Entropy Relevant to the Asymmetry between Retrodiction and Prediction?
Martin Barrett, Elliott Sober
The British Journal for the Philosophy of Science, Vol. 43, No. 2 (Jun., 1992), pp. 141-160

Rev. Mod. Phys. 27, 179 (1955): Watanabe - Symmetry of Physical ...

There's a literature on this. I'm not making it up, nor am I an expert on it. My friends who do postgraduate work in optimization theory assume that everyone knows this...

I shall probably accept your offer to host the evolution modeling on your blogger site. Would you like to prime the pump by copying postings that I made to Good Math Bad Math, either with or without the typos?

I can then add more fairly soon. I have also gotten off-line emails and phonecalls with interesting criticism. They say that my algebra is right, but question my assumptions. They raise interesting questions about the Fitness Landscape and entropy, and Hardy-Weinberg equilibria. After maybe 30 minutes in the phonecall, I finally realized that we differed on the granularity of the model. My Physicist friend was thinking of chromosomes as bitstrings, and genes as bits.

I said that I specifically did NOT model that way, but allowed crossover and inversion to occur inside genes, to make very different phenotypes. That is, microevolution versus macroevolution depends in part on how one defines genes. And one must walk a tightrope on where the complexity is in one's model: it should not all be on the fitness function, swept under the rug, nor, conversely, should one see the fitness function by low-dimensional intuition, or as a classical system with multiple parameters being knobs that can be twiddled. My PhD simulations were acknowledged by Holland to be deeper than those his own doctoral and postdoc students did, in that my fitness functions were not parameterized functions in a naive way, but the results of running the genotype through an interpreter. I wasn't evolving strings of numerical parameters. I was evolving character strings, which were then run through an APL interpreter, and the ones that were interpreted were run against a digital simulation of a metabolic dynamics model for goodness of fit to empirical data. Holland liked my GA searches of syntactic space and semantic space. Nobody "got it" when I presented at the first Artificial Life conferences in Santa Fe. Now many people "get it." But they are still a minority, while the old paradigm rolls on.

Do we know the syntax and semantics of models of the whole Sun, Earth's atmosphere, oceans, land, ecosystem? I think not, but how much do we need to make good predictions OR retrodictions? How nonlinear are global models? This was an entire track of the Sixth International Conference on Complex Systems (which you can google for, and read abstracts and some of the papers).

We also argued about discrete versus continuous probability theories. This is a deep question. He referred me to the work of Charles Adami, at the Keck Institute, with whose work I am not familiar, but which is said to be related.

One very important subject which Mr. Chu Carroll did not mention was calibration. Most simulation models are made up of submodels which include parameters which must be adjusted to fit local conditions (that is, they are not universal). These parameters must be measured separately in order for the model to accurately represent the physical situation being simulated. As an example, in a previous comment on this thread, I mentioned the service rate of vehicles in a discharging queue at a signalized intersection. This discharge rate is not universal for all intersection approaches. It is substantially lower for an intersection in downtown Washington, D. C. than it is for an intersection on an exurban arterial with a 55 mile/hour speed limit.

(struggles with TypeKey again, gives up and logs out)

Jonathan Vos Post wrote:

I shall probably accept your offer to host the evolution modeling on your blogger site. Would you like to prime the pump by copying postings that I made to Good Math Bad Math, either with or without the typos?

I sent you an e-mail invitation (at jvospost2(AT)yahoo.com). Accepting that invitation should give you the ability to post your own top-level comments. Alternatively, we could try taking the discussion to the NECSI Complex Systems Wiki, which I believe has better math support than Blogger.

The public seems deeply unaware of the Prediction/Retrodiction paradox, and mainstream journalists can't be expected to understand Prediction without knowing the Statistics basics that Mark is so nicely providing, let alone a definition of "Retrodiction."

Blake: It's rude of me to abuse Mark's blogular hospitality with overmuch dialogue between you and me. The NECSI wiki is an interesting idea, though, as I have chaired sessions at ICCS-2006, at least one of which you attended. I do want this to turn into a paper for NECSI, and have not yet informed Yaneer Bar-Yam nor his lieutenants. Let's continue this discussion by email (you got my email address right, and I have another one not for blog consumption) until we converge.

About my previous comment, Google "Christoph Adami" not "Charles Adami" and find arXiv papers that are indeed about information theory and simulated evolution of self-reproducing automata.

Schneider, T.D., Stormo, G.D., Gold, L., and Ehrenfeucht, A, "Information Content of Binding Sites on Nucleotide Sequences", J. Mol. Biol. 188(1986)415-431.

Adami says: "The ability to analyze the entropy of each site in the genome quantifies the loss of variability... This entropy analysis has been carried out in a biological context by [Schneider et al, above citation]..."

I'm puzzled by Mark's assertion on the rarity of stochastic simulations as well. It would seem that if it were just a matter of plugging in the values into equations, that would be deterministic but wouldn't really merit the term "simulation".

I don't know how the earth science people do it, but at least in evolutionary biology, a simulation generally involves lots of simulated individuals that replicate (and perhaps mate or compete for resources, etc.) according to probabilities; the point is to model cases in which it is too complicated to use equations in order to make predictions directly.

Apologies for being a grammar Nazi, but "phenomena" is the plural and "phenomenon" is the singular. I'm one of those dorks that get distracted by details like that.

Jonathan:

I was intending to talk about the rarity of stochastic simulations in physics, of the sort that are used for things like global warming.

For those situations, we do use deterministic simulations - for example, climatology is mostly fluid-flow simulations based on the Navier-Stokes formulas. NS describes how things behave, but they're a set of differential equations with no closed-form solution. So the way that we work out answers for the fluid flow problems is by running simulations to compute the solution. That's what's going on in most of the climatology simulations: they're using NS models of the atmosphere.
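
To illustrate what "running a simulation to compute the solution" means, here is a minimal sketch that steps a differential equation forward in time. It uses a deliberately trivial stand-in equation, dy/dt = -k * y, which does have a closed-form answer, precisely so the numerical result can be checked; it is not Navier-Stokes, and the function name and constants are chosen only for this example.

```python
import math

def euler_decay(y0, k, t_end, steps):
    """Integrate dy/dt = -k * y from t = 0 to t_end with fixed-step forward Euler."""
    dt = t_end / steps
    y = y0
    for _ in range(steps):
        y += dt * (-k * y)  # advance the state by one small time step
    return y

exact = math.exp(-0.5 * 4.0)                  # known solution y(4) for y0 = 1, k = 0.5
coarse = euler_decay(1.0, 0.5, 4.0, 10)       # few large steps
fine = euler_decay(1.0, 0.5, 4.0, 10_000)     # many small steps
print("exact:", exact, "coarse:", coarse, "fine:", fine)
```

There is nothing random here: the same inputs give the same output every time, and shrinking the step size brings the computed value closer to the true solution. A fluid-flow code does the same kind of stepping on a vastly larger system for which no formula exists to check against.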

In evolutionary biology, you're looking at a very different kind of simulation. The phenomena that you're simulating require a strong random element.

(In reality, I'm oversimplifying things. Climatology simulations do also have random elements. The model is deterministic, but it depends partially on unpredictable variable phenomena. For example, sunspot activity can have an effect on weather, and while we have an approximate model of sunspot activity, we don't have a precise one, so the effects need to be simulated randomly based on our understanding of the approximation. My point was that the idea that simulations, by definition, operate on "fake" data is nonsense.)

Okay, that makes sense -- I just couldn't figure out what they would simulate if they already had equations to handle it, but I think I understand it now.

We must stop the study of the climate! Climate simulations run on computers. Computers need a steady stream of electrons to perform the calculations. That steady stream of electrons comes from power stations that use fossil fuels to generate the electron stream. The burning of fossil fuels places CO2 into the atmosphere that helps to trap heat that contributes to global warming. Computers running the climate simulations produce heat that is trapped by the greenhouse gases that were produced to generate the electrons needed for the simulations. Eliminate the simulations and global warming can be reduced.

Next we can work on a cure for cancer and how to have all nations coexist in peace and harmony.

reboho:

Perhaps you have the right idea. But the real energy wasters and carbon producers in analysis are human brains - much less efficient than computers really.

So we should eliminate all humans, or at least all concentrated thought. That would solve a lot of problems - no GW, all peace, and no one would care about cancer.

By Torbjörn Larsson (not verified) on 22 Jan 2007 #permalink

I believe this Carl Sagan classic nearly applies here:

"Creationists make it sound like a 'theory' is something you dreamt up after being drunk all night."

How about we stop debating GW and concentrate on eliminating pollution, including CO2 emissions?

We can argue whose model was right later.

This is a fascinating debate and the comments are worth reading.
To just add my 2 cents, it could be worth pointing out that a number of theories are being used as fact to create these simulations.

The one that comes to mind is that CO2 is increasing, which is causing the temperature to rise. However, what if, as some scientists are noticing, the CO2 is increasing because the temperature is rising? It's almost a chicken and egg situation where the egg has been chosen as coming first.

Another point that is being missed is that it's taken that the initial temperature as recorded in the 1700s is the 'correct' temperature for the planet.

Neither of these points is directly related to the statistical analysis; however, they do seriously affect the conclusions of the final reports.

On an unrelated point, have you checked out www.numberwatch.co.uk? I would be interested in reading your review of the math on the site.

F0ul:

I do know that scientists have discussed greenhouse gases being released because of increased temperature, such as methane and CO2 released from melting permafrost, but this doesn't mean that they've interpreted an effect as a cause. It's known independently that certain gases have certain effects and approximately what levels do what. Instead, what this points to is "runaway greenhouse," a positive feedback loop increasing greenhouse gases and temperatures to higher levels and faster rates. This is much the same as polar melting. Heat melts the ice, so less sunlight is reflected out to space, which then warms the ice even more...

The idea that the temperature of the 1700s is the 'correct' temperature... well, this is the first time I've heard of such a thing. That period is completely in the little ice age! And it almost seems as if you're implying that we don't know what temperatures were before the 1700s. (Perhaps I'm reading too much between the lines.) We have temperature and greenhouse gas data going back millions of years! Besides, I don't think what actually concerns people is having the "correct" climate as much as an environment conducive to human life.

When I read rightwingprof's reply to his critics, viz.:

It's always nice when people show their ignorance, eh, undergrad?
Simulation. Not real. Simulated. I guess that's over both of your pointy little heads, isn't it?

an eerie feeling of deja vu came over me. Where have I seen this before? Oh, now I remember. In his petulant tone and refusal to learn anything at all about the subject under discussion, rightwingprof reminds me of the many engineers, lawyers, musicians, etc., who have learned just enough information theory to be sure that Kolmogorov complexity disproves evolution, no matter what biologists say. This is made the more exasperating when the commenter is a reasonably bright person in the usual sense (as we've seen, emotional intelligence is another matter). It takes only a confident familiarity with some of the physical sciences, and a dash of hubris, to dismiss the field of climate modeling as unreliable because "the fundamental models on which the simulations are based are not known to be accurate." Pace Michael B., rightwingprof and the rest of the critics, climate modelers are well aware of the limitations of their models.

In my view, it is not too much to ask that anyone who believes that climate models are inherently unreliable, or oversold as tools for prediction, at least learn a little bit about the subject before offering opinions on it. An excellent introduction for a general audience may be found here. And if you really want to plunge, you can download the same model NASA uses and run it yourself. When you know more about how climate modeling is actually done, you will have earned the right to tell us why it is unreliable. Until then, zip it!