Chaos and Initial Conditions

One thing that I wanted to do when writing about Chaos is take
a bit of time to really home in on each of the basic properties of
chaos, and take a more detailed look at what they mean.

To refresh your memory, for a dynamical system to be chaotic, it needs
to have three basic properties:

  1. Sensitivity to initial conditions,
  2. Dense periodic orbits, and
  3. topological mixing

The phrase "sensitivity to initial conditions" is actually a fairly poor
description of what we really want to say about chaotic systems. Lots of
things are sensitive to initial conditions, but are definitely not

Before I get into it, I want to explain why I'm obsessing
over this condition. It is, in many ways, the least important
condition of chaos! But here I am obsessing over it.

As I said in the first post in the series, it's the most widely known
property of chaos. But I hate the way that it's usually
described. It's just wrong. What chaos means by sensitivity
to initial conditions is really quite different from the more general
concept of sensitivity to initial conditions.

To illustrate, I need to get a bit formal, and really
define "sensitivity to initial conditions".

To start, we've got a dynamical system, which we'll call f.
To give us a way of talking about "differences", we'll establish a
measure on f. Without going into full detail, a measure is
a function M(x) which maps each point x in the phase space of f to a
real number, and which has the property that points that are close together
in f have measure values which are close together.

Given two points x and y in the phase space of f, the distance
between those points is the absolute value of the difference of
their measures, |M(x) - M(y)|.

So, we've got our dynamical system, with a measure over it
for defining distances. One more bit of notation, and we'll
be ready to get to the important part. When we start our
system f at an initial point x, we'll write it fx.

What sensitivity to initial conditions means is that no
matter how close together two initial points x and y are,
if you run the system for long enough starting at each point,
the results will be separated by as large a value as you want. Phrased
informally, that's actually confusing; but when you formalize it, it
actually gets simpler to understand:

Take the system f with measure M. Then f is sensitive to
initial conditions if and only if:

  • (∀ ε > 0 (∀ x,y : |M(x)-M(y)| < ε
    (For any two points x and y that are arbitrarily close together)
  • Let diff(t) = |M(fx(t)) - M(fy(t))|.
    (Let diff(t) be the distance between fx and fy
    at time t)
  • ∀G, &exists;T : diff(T) > G. (No matter what value you
    chose for G, at some point in time T, diff(T) will be larger than G.)

Now - reading that, a naive understanding would be that the diff(T)
increases monotonically as T increases - that is, that for any two
values ti and tj, if ti > tj then
diff(ti) > diff(tj). And in fact, in many of the newage
explanations of chaos, that's exactly what they assume. But that's
not the case. In fact, monotonically increasing systems aren't
chaotic. (There's that pesky "periodic orbits" requirement.) What makes
chaotic systems interesting is that the differences between different starting
points don't increase monotonically.

To get an idea of the difference, just compare two simple quadratic
recurrence based systems. For our chaotic system, we'll about the logistic map: f(t) =
k×f(t-1)(1-f(t-1)) with measure M(f(t)) = 1/f(t). And for our non-chaotic
system, we'll use g(t) = g(t-1)2, with M(g(t)) = g(t).
Think about arbitrarily small differences starting values. In the
quadratic equation, even if you start off with a miniscule difference -
starting at v0=1.00001 and v1=1.00002 - you'll
get a divergence. They'll start off very close together - after 10 steps,
they only differ by 0.1. But they rapidly start to diverge. After 15
steps, they differ by about 0.5. By 16 steps, they differ by about 1.8;
by 20 steps, they differ by about 1.2×109! That's
clearly a huge sensitivity to initial conditions - an initial difference
of 1×10-5, and in just 20 steps, their difference is measured
in billions. Pick any arbitrarily large number that you want, and
if you scan far enough out, you'll get a difference bigger than it. But
there's nothing chaotic about it - it's just an incredibly
rapidly growing curve!

In contrast, they logistic curve is amazing. Look far enough out, and you
can find a point in time where the difference in measure between starting at
0.00001 and 0.00002 is as large as you could possibly want; but also,
look far enough out past that divergence point, and you'll find a point in
time where the difference is as small as you could possible want!
The measure values of systems starting at x and y will sometimes be close together, and sometimes
far apart. They'll continually vary - sometimes getting closer together,
sometimes getting farther apart. At some point in time, they'll be arbitrarily
far apart. At other times, they'll be arbitrarily close together.

That's a major hallmark of chaos. It's not just that given
arbitrarily close together starting points, they'll eventually be far apart.
That's not chaotic. It's that they'll be far apart at some times, and close
together at other times.

Chaos encompasses the so-called butterfly effect: a butterfly flapping its
wings in the amazon could cause an ice age a thousand years later. But it also
encompasses the sterile elephant effect: a herd of a million rampaging giant
elephants crushing a forest could end up having virtually no effect at all
a thousand years later.

That's the fascination of chaotic systems. They're completely
deterministic, and yet completely unpredictable. What makes them
so amazing is how they're a combination of incredibly simplicity
and incredible complexity. How many systems can you think of that are
really much simpler to define that the logistic map? But how many have
outcomes that are harder to predict?


More like this

The last major property of a chaotic system is topological mixing. You can think of mixing as being, in some sense, the opposite of the dense periodic orbits property. Intuitively, the dense orbits tell you that things that are arbitrarily close together for arbitrarily long periods of time can…
It's been quite a while since my last chaos theory post. I've been caught up in other things, and I've needed to do some studying. Based on a recommendation from a commenter, I've gotten another book on Chaos theory, and it's frankly vastly better than the two I was using before. Anyway, I want…
Another one of the fundamental properties of a chaotic system isdense periodic orbits. It's a bit of an odd one: a chaotic system doesn't have to have periodic orbits at all. But if it does, then they have to be dense. The dense periodic orbit rule is, in many ways, very similar to the…
One mathematical topic that I find fascinating, but which I've never had a chance to study formally is chaos. I've been sort of non-motivated about blog-writing lately due to so many demands on my time, which has left me feeling somewhat guilty towards those of you who follow this blog. So I…

The question I have, is this: unpredictable how? Are we talking "computationally expensive beyond all reason" or "a function of the limits of formal systems"?

I usually assume the former, but it'd be really cool if it were the latter.

'Are we talking "computationally expensive beyond all reason" or "a function of the limits of formal systems"?'

Well it is both: even in its formal model for any realizable (classical) physical computer the distance function would be unpredictable. Why? All physical computers are equivalent to Finite State Automata at the bottom - they are all finite. So given the size of the largest number the computer can deal with, if the phase space is big enough you will eventually run over the edge, as it were. On the other hand a formal Turing machine has an infinite tape, so in a sense it could calculate the differences no matter how big they got.

'The question I have, is this: unpredictable how?'

In order to predict, we need to measure the initial conditions. But we can only measure them to some finite accuracy, which means that our measured value will differ from the actual value by some small value. As we try to compute future values, that minute error will become arbitrarily large, which means that our prediction will become worthless.

And now for a nitpicky question:
Suppose that my measure is bounded by [0,1]. Then do we only require diff(T)>G for all 0

I'm not sure where you're getting this "measure" thing. What you need is a metric to give a notion of distance between points. And most metrics have nothing to do with this "measure" function you're talking about.

Can you point me to your reference for you definition of sensitivity to initial conditions?
Yours seems strange, the "measure" you defined seems like what should be a metric (and you use it like one) but it doesn't have enough constraints on it to be (namely is is possible for |M(x)-M(y)|=0 even if x=/=y), also the condition you have written down seems overly strong, for all pairs x,y close is hard to do and even worse is if M is bounded the last condition isn't possible, and lastly the use of the word measure is just odd there is a highly related field (ergodic theory), meaning it is the same thing just in a measure space vs a topological/metric space, that uses the term measure in the usual sense which is dealing with measure spaces and measuring the size of sets.

I found this definition on wikipedia, which is using a metric and can easily be generalized to a topological space (instead of saying for all delta use any neighborhood of x) which seems more reasonable, and my book on ergodic theory agrees with. (Introduction to Dynamical Systems by Micheal Brin and Garrett Stuck)

Also can you put tex commands in posts?
Dang it beaten to the punch.

Ok so it may not be possible to generalize the definition I sited to topological spaces without some extra work, it is too easy to call your neighborhood the whole space, the same problem crops up with using a bounded metric and that definition as well. So the question becomes is chaos possible on a bounded metric space... I know ergodicity is possible on a bounded measure space (also can be called a probability space) actually that is always assumed in class and from what my teacher says that is just about all they work with (you get some really nice stuff for free).

"Sterile elephant effect" -- that's exactly the metaphor I was looking for earlier today, dammit, when I was talking about causality and prediction in historical analysis. Well, I'll remember it next time, for sure.


We're talking "No matter how good your bounds for error is at any given time, there is a point in time after which there is NO guarantees you can make on the divergence from error". You can't bound it above. You can't bound it below. You'll be, in fact, guaranteed that the behaviour of the point you try to estimate has nothing to do with the point you DO estimate it with.

What I find really spooky about chaotic systems is the dense periodic orbits. It's not enough that close starting points have divergent orbits, or that any neighborhood eventually covers the whole space with it's orbits---it's that no matter how crazy the behavior of points in a region is, there are points with predictable, periodic orbits right near by.

Re aebrosc No 6.

Chaos works fine on finite metric spaces.

Sensitivity to initial conditions doesn't require getting arbitrarily far apart, only that there is some beta > 0 such that all orbits, no matter how close initially, eventually are farther apart than beta at some time. Obviously, on a bounded metric space, there will be an upper limit on the size of beta.

Plenty of interesting functions are chaotic on, say, the interval [0,1], where the distance between two points is always bounded by 1.

A good example of something to play with is the discrete doubling function f(x) = { 2x, if 0 \leq x \leq 0.5 ; 2(x-0.5), if 0.5 < x < 1; 0, if x=1. It looks like two sawtooths. It is chaotic, but it will probably not show chaotic behavior if you simulate it on your computer (i.e. n+1 = f(n), iterated ad nauseum).

Hmm. f(t) = f(t-1)-1.1 with M(f(t)) = f(t) appears to be able to get arbitrarily large or small differences, but is clearly not chaotic!

Yea I misread the quantifiers. Thanks for pointing it out.

I thoroughly have to agree with John Armstrong & aebrosc on your misuse of the term measure. You are describing a metric function, not the measure of a set.

What you wrote is even contradictory with the notion of measure: "the property that points that are close together in f have measure values which are close together." If you consider the Cantor set, for example, every point is an accumulation point, which could be seen as "close together" (as that is not a formally defined notion), but the measure of the Cantor set is always 0.

(Sorry, I didn't really finish my comment...)

That's the measure of the standard Cantor set taken from [0,1] out of the reals- even though each point is an accumulation point (although the set is nowhere dense), measure is not equivalent to length in this case (which is what you really want), as the measure of the whole set is still 0 (regardless of the "length" of the interval).

Then is h(a) = x*sen(x+a) chaotic?

because if a,b are two different initials condition (a!=b+2*k*pi, k in N) then |h(a) - h(b)| will loop between 0 and 2*x


Is there a discrete math book that is good for novices? I'm an undergrad CS student and we're using Kenneth Rosen's book. The book seems geared towards more advanced students so I was wondering if you've come across one that seems easier to learn from. Any suggestions would be helpful.


Sorry, I did mean "metric" instead of "measure"; my mind was occupied by some other stuff, and used the wrong term. I didn't want to get into the full formality of metrics, so I was trying to use a very stripped down version of the definition that just gave a reader a basic sense of what I meant. Obviously, there's a lot more to a metric than the "closeness" thing, but I didn't want to get overcomplicated. Given my readership, I probably erred on the side of over-simplification.


If there is, I don't have it.

Seriously, back in my undergrad days, we used two different discrete math textbooks, and they were both absolutely awful. Fortunately, I had really good profs for those two classes, which more than made up for the lousy books. (The teacher of my first discrete math course was a guy named Eric Allender, who's one of the two best teachers I ever head.)

You really need to start a petition for ScienceBlogs to get MathML support à la Jacques Distler's Musings... it's somewhat surprising that it isn't available round this part of the intertubes.

Mark, I really think you could have used a proper definition of metric here without any trouble.

I encourage anyone who can to read the Definition in Wikipedia.

Mark's definition has several differences from that, the accepted definition:
1) most metrics are not definable as simply the absolute difference of a one-arg function evaluated at two spots. For example, the Euclidean distance metric in more than one dimension.
2) The unbounded nature of G, which is obviously a problem as people have pointed out on bounded metric spaces.
3) That Mark's definition begins with âε>0, â x,y, instead of with âε>0 âx ây.

In fact the sensitive dependence on initial conditions does not make a statement about any two points which are close enough together, but rather says that âx, and for all arbitrarily small ε, there is a y closer to x than ε such that the system started at x and at y diverge enough eventually.