Since my post a couple of weeks ago about NASA and the antenna evolution experiment,
I’ve been meaning to write a followup. In both comments and private emails, I’ve gotten
a number of interesting questions about the idea of fitness landscapes, and some of the things
I mentioned in brief throwaway comments. I’m finally finding the time to write about
it now.
Fitness Landscapes
First, let me review what a fitness landscape is.
Suppose you’re searching for something. There are N parameters that define the thing you’re searching for. Then we can say that you’re doing a search for a target within N dimensions. The N-dimensional space containing all of the possible values for all of the dimensions is called the search space.
You can model that search as an iterative function, F, which takes
as parameters a tuple of N values defining a current position in the search space,
and returns a new position in the search space.
So you start at some arbitrary position in your search space, and that’s
the first argument to F. Then each time you call F, it returns a new position in
the search space, which is hopefully closer to the target of the search.
How do you evaluate the quality of a position in the search space?
You define an evaluation function, E. E takes a position in the search space,
and returns a real number which describes the quality of the position in terms
of the particular search. The higher the value returned by E for a point, the
better the quality of the target described by the position in the search space. Expressed using the evaluation, the goal is to find the position p within the space that
maximizes E(p).
Now, suppose you want a way of visualizing the search. It’s easy: define
a new space with N+1 dimensions, where the N+1th has the value of
calling E on the first N dimensions. Call this new, N+1th dimension “height”. Now,
the surface produced by doing this is the evaluation landscape. If your search is
an evolutionary process, the evaluation landscape is called a fitness landscape
As usual, an example makes this easier. Suppose that you’re searching
in two dimensions. Then your position at any time is just a position in
a two-dimensional cartesian graph. To visualize the search space, you create
a three dimensional graph, where the quality of the search result at (x,y) is
the value of the Z coordinate.
For example, imagine that the fitness function is something nice and simple, with
a single obvious maximum: E(x,y) = sin(x) + cos(y). The maximum will naturally be 2, and
one of the places where it will occur is at (π/2,0). Then your fitness landscape ends up looking like the graph below.

We can also get some much wilder landscapes, like the one below, which is defined by E(x,y)=sin(xy)cos(xy) + (3-xy)/((xy)2 + 1). It’s still structured,
but it’s a lot more complicated.

And we can get totally unstructured landscapes, where each point is complely random, like this one:

Static Versus Dynamic Landscapes and Adding Dimensions
I’ve frequently criticized creationists for talking about a static fitness landscape. In a throwaway comment, I mentioned that you can use a static landscape by adding dimensions. This comment really confused a lot of people.
Suppose you’ve got a search of a two-dimensional space. Now, we’re going to make it more
complex. Instead of the search function always being the same, we’re going to actually have each
invocation of the search function return both a point in the search space, and a new
fitness function. So we start with an initial search function, f0 and some initial
point (x0, y0) in the search space. The first search step evaluates
f0(x0, y0), getting (x1,y1) and
f1. The second search step evaluates f1(x1, y1),
getting (x2,y2) and f2. So the search function is
changing over time.
Assuming we’re in the world of computable functions, every possible value for
fi can be represented by an integer, Fi. So we can re-model this as three-dimensional search, where instead of evaluating fi(xi, yi) getting fi+1 and (xi+1, yi+1), we’ll evaluate
a new three-dimensional search function f’(xi, yi, Fi)
getting back (xi+1, yi+1, Fi+1). So we’ve taken the
way that the search function itself can change, and incorporated it into the
search space, at the cost of making the search space itself more complex.
For any way that a search space or search process changes over time, we can
incorporate that change into the search space by adding dimensions. This can give us
a fixed fitness landscape – but it does so at the cost of adding lots of dimensions.
So now we’ve got a measure of the quality of the search result at any particular step. How
can we define the performance of a search over time? There are several ways of doing it; the easiest to understand is as the limit of a function P, such that at time T, P(T) = (σi=0,TE(xt, pt))/T; that is, the mean of
the evaluation result at each step, divided by the number of steps. So take P(T), and
compute its limit as T approaches infinity – that’s one reasonable way of describing the
performance of a search. What it’s really doing is computing the
average fitness result of the search. There are others measures that also take
the rate of convergence on a solution into account, but for simplicity, the simple measure is
adequate.
No Free Lunch
Now, to move on to a bit of NFL-related stuff.
The main fitness-landscape argument used by creationists against evolution is based
on Dembski’s No Free Lunch work. NFL is a family of theorems which (stated informally) says:
averaged over all possible fitness landscapes, no search algorithm can possibly do
better than a random walk.
Given a search function and a landscape, you can compute a performance measure. Given a set of multiple landscapes, you can compute the performance of your search function on each landscape separately. Then you can describe the average performance of your search on your set of landscapes. If the set of landscapes is enumerable, then you can (theoretically) compute the average performance of your search function over all possible landscapes. (Even if the the set of landscapes isn’t enumerable, you can still talk theoretically about the average performance over all landscapes; you just can’t compute it.)
So – given any reasonable performance metric, no search algorithm can possible do better than random guessing over the set of all possible fitness landscapes.
There are places where this theorem is actually interesting. But they’re not the places
that creationists like Dembski continually invoke it. (The main interesting application
if it is in game theory, where you can show things like “There’s no universal
game winning algorithm.”) In something like evolution, it’s silly for one simple reason: the “all possible landscapes” clause.
Evolution doen’t work over all possible landscapes, and no sane person would argue
that it could.
When we talk about fitness landscapes, we tend to think of nice, smooth, continuous
surfaces, with clearly defined maxima and minima (hilltops and valley bottoms). But all
possible landscapes doesn’t just include smooth landscapes. It includes random ones. In
fact, if you work out the set of all possible landscapes, there are a lot more
discontinuous random ones than there are structured, smooth ones.
So in the NFL scenario, you need to be able to work in a nice smooth landscape, like
this one:

It’s easy to see how a search function could find the maximum in that one-dimensional
seach space – it’s got a very clean fitness landscape. On the other hand, NFL also requires
you to be able to find the maximum in a totally random landscape – like the one below:

It should be obvious that there’s absolutely no meaningful way to search for the maximum value in that seach space. There’s nothing in the fitness landscape that can provide the slightest clue of how to identify a maximum.
Evolution works in landscapes with structure. Another way of putting that is that
evolution works in landscapes where the result of a search step provides feedback about the
structure of the landscape. But the key takeaway here is that NFL doesn’t provide any meaningful rebuttal to information, because we don’t expect evolutionary search to
work in all possible landscapes!
In terms of evolution, the average performance over all possible landscapes is a strange
idea. And it’s obvious on an intuitive level why there’s no way that a single search function
can possibly perform well on all possible landscapes. On some landscapes, it will do really
well. On some (the random ones, which make up the majority of all possible landscapes), it
will be effectively the same as a random walk. And on others, it will perform worse than
random, because the adversarial landscape is perfectly structured to make the search strategy
travel towards the worst possible places.
Smuggling Information
Lately, Dembski and friends have been taking a new tack, which involves talking about
“smuggling information”. They’ve been using the NFL argument for years, but they’ve
run into a rather serious problem: evolution works.
As evolutionary algorithms have become well known, and have had some astonishing
successes, it’s become difficult to make the argument that an evolutionary process
can’t possibly succeed. So it became necessary to come up with an excuse for why
evolutionary algorithms work so amazingly well, even though the creationists
mathematical argument shows that it should be impossible.
Their excuse is to basically accuse the experiments that use evolutionary
algorithms of cheating. They argue that the reason that the evolutionary mechanism can
succeed is because information was “smuggled” into the system – that, essentially, the
search algorithm encodes information about the landscape which allows it to succeed. They
argue that this is cheating – that the evolutionary process really has nothing do with the
success of the search, because the “solution” is really encoded into the structure
of the problem via the smuggled information.
There are two responses to this.
The first one is, in a word, “Duh!”. That is, of course there’s information
about the landscape in the system. As I discussed above, there’s no such thing as a search
algorithm that works on all landscapes, but for landscapes with particular properties, there
are search algorithms that are highly successful. If you look at it from an
information-theoretic viewpoint, any search algorithm which can successfully operate
in a particular search space encodes information about the space into its structure. From
the viewpoint of math, this is just totally, blindingly obvious.
And it’s not a problem for biological evolution. Biological evolution is based
on mutation, reproduction, and differential success. That is, it’s a process
where you have a population of reproducing individuals, where the children are
slightly different from the parents. Some of the children survive and reproduce,
and some don’t. This process clearly only works within a particular kind of search space;
that is, a search space where the survival to reproduction of a subset of the population
indicates that that subset of the population has a higher fitness value.
Evolution, modelled as a search, requires certain properties in its search space
for it to work. The information smuggling argument basically claims that that
means that it can’t work. But every successfull search algorithm
has certain requirements for he search-space in which it operates. By the
arguments of Demski and friends, there is no such thing as a successful search
that doesn’t cheat.
The second response to the argument is a bit more subtle.
If you look at the evolutionary process, it’s most like the iterative search process
described towards the beginning of this post. The “search function” isn’t really static over
time; it’s constantly changing. At any point in time, you can loosely think of the search
function for a given species as exploring some set of mutations, and selecting the ones that
allow them to survive. After a step, you’ve got a new population, which is going to have new
mutations to explore. So the search function itself is changing. And how is it changing?
Modelled mathematically, it’s changing by incorporating information about the
landscape.
There’s a really fascinating area of computer science called inductive learning machine
theory. It was invented by a guy named John Case, who’s a friend of mine; he was a member of
my PhD in grad school. John is interested in understanding how learning works on a
mechanistic level. What he does is model it as an inductive process.
You’ve got a target function, T. The goal of the learning machine is
to find a program for T. The way that it does that is by continually refining its
program. So you give it T(0), and it outputs a guess at a program for T. Then you give
it T(1), and it outputs a better guess (it should at least be correct for
T(0) and T(1)!). Then you give it T(2), and it outputs another guess. Each step, it
generates a new program guessing how to compute T.
According to Dembski, the process by which the learning machine improves its guesses is
cheating.
Evolution is really doing exactly the same thing as John’s inductive learning machines:
each step, it’s gaining information from the environment. You can say that the evolutionary
search at step N has incorporated the results of steps 0..N-1. Information about the
environment has naturally been incorporated. In fact, if you think about it, it’s
obvious that this is the case – and that this is exactly why, on a
mathematical level, evolution works.
Go back to the NASA experiment for a moment. The search space of possible antennas is
huge – but it’s structured and continuous. How does evolution find the best antenna? It tries
lots of options – and based on the results of those options, it discovers something about the
structure of the search space – it discovers that some set of solutions work better
than others, and it tries to search forward from those. By doing that, it’s pruned out huge
areas of the search space – the areas rooted at the least successful tests. Eliminating a
large part of the search space is a way of incorporating information about the search
space: it’s adding the information that “the solution isn’t over there”.
Pointing out that an evolutionary process works in a particular set of landscapes, and
that it’s got a way of acquiring and integrating information about the landscape that it’s
searching isn’t a critique of evolution – it’s an explanation of how it works. Far from being
a good argument against evolution, it’s an argument for evolution, because it shows
the mechanism by which evolution is capable of finding solutions.