Since my post a couple of weeks ago about NASA and the antenna evolution experiment,

I’ve been meaning to write a followup. In both comments and private emails, I’ve gotten

a number of interesting questions about the idea of fitness landscapes, and some of the things

I mentioned in brief throwaway comments. I’m finally finding the time to write about

it now.

### Fitness Landscapes

First, let me review what a fitness landscape is.

Suppose you’re searching for something. There are N parameters that define the thing you’re searching for. Then we can say that you’re doing a search for a target *within N dimensions*. The N-dimensional space containing all of the possible values for all of the dimensions is called the *search space*.

You can model that search as an iterative function, F, which takes

as parameters a tuple of N values defining a current position in the search space,

and returns a new position in the search space.

So you start at some arbitrary position in your search space, and that’s

the first argument to F. Then each time you call F, it returns a new position in

the search space, which is hopefully closer to the target of the search.

How do you evaluate the quality of a position in the search space?

You define an evaluation function, E. E takes a position in the search space,

and returns a real number which describes the quality of the position in terms

of the particular search. The higher the value returned by E for a point, the

better the quality of the target described by the position in the search space. Expressed using the evaluation, the goal is to find the position p within the space that

maximizes E(p).

Now, suppose you want a way of visualizing the search. It’s easy: define

a new space with N+1 dimensions, where the N+1th has the value of

calling E on the first N dimensions. Call this new, N+1th dimension “height”. Now,

the surface produced by doing this is the evaluation landscape. If your search is

an evolutionary process, the evaluation landscape is called a *fitness landscape*

As usual, an example makes this easier. Suppose that you’re searching

in two dimensions. Then your position at any time is just a position in

a two-dimensional cartesian graph. To visualize the search space, you create

a three dimensional graph, where the quality of the search result at (x,y) is

the value of the Z coordinate.

For example, imagine that the fitness function is something nice and simple, with

a single obvious maximum: E(x,y) = sin(x) + cos(y). The maximum will naturally be 2, and

one of the places where it will occur is at (π/2,0). Then your fitness landscape ends up looking like the graph below.

We can also get some much wilder landscapes, like the one below, which is defined by E(x,y)=sin(xy)cos(xy) + (3-xy)/((xy)^{2} + 1). It’s still structured,

but it’s a lot more complicated.

And we can get totally unstructured landscapes, where each point is complely random, like this one:

### Static Versus Dynamic Landscapes and Adding Dimensions

I’ve frequently criticized creationists for talking about a *static* fitness landscape. In a throwaway comment, I mentioned that you *can* use a static landscape by adding dimensions. This comment really confused a lot of people.

Suppose you’ve got a search of a two-dimensional space. Now, we’re going to make it more

complex. Instead of the search function always being the same, we’re going to actually have each

invocation of the search function return both a point in the search space, *and* a new

fitness function. So we start with an initial search function, f_{0} and some initial

point (x_{0}, y_{0}) in the search space. The first search step evaluates

f_{0}(x_{0}, y_{0}), getting (x_{1},y_{1}) and

f_{1}. The second search step evaluates f_{1}(x_{1}, y_{1}),

getting (x_{2},y_{2}) and f_{2}. So the search function is

*changing* over time.

Assuming we’re in the world of computable functions, every possible value for

f_{i} can be represented by an integer, F_{i}. So we can re-model this as three-dimensional search, where instead of evaluating f_{i}(x_{i}, y_{i}) getting f_{i+1} and (x_{i+1}, y_{i+1}), we’ll evaluate

a new three-dimensional search function f'(x_{i}, y_{i}, F_{i})

getting back (x_{i+1}, y_{i+1}, F_{i+1}). So we’ve taken the

way that the search function itself can change, and incorporated it into the

search space, at the cost of making the search space itself more complex.

For any way that a search space or search process changes over time, we can

incorporate that change into the search space by adding dimensions. This can give us

a fixed fitness landscape – but it does so at the cost of adding lots of dimensions.

So now we’ve got a measure of the quality of the search result at any particular step. How

can we define the *performance* of a search over time? There are several ways of doing it; the easiest to understand is as the limit of a function P, such that at time T, P(T) = (σ_{i=0,T}E(x_{t}, p_{t}))/T; that is, the mean of

the evaluation result at each step, divided by the number of steps. So take P(T), and

compute its limit as T approaches infinity – that’s one reasonable way of describing the

performance of a search. What it’s really doing is computing the

average fitness result of the search. There are others measures that also take

the rate of convergence on a solution into account, but for simplicity, the simple measure is

adequate.

### No Free Lunch

Now, to move on to a bit of NFL-related stuff.

The main fitness-landscape argument used by creationists against evolution is based

on Dembski’s No Free Lunch work. NFL is a family of theorems which (stated informally) says:

averaged over all possible fitness landscapes, no search algorithm can possibly do

better than a random walk.

Given a search function and a landscape, you can compute a performance measure. Given a set of multiple landscapes, you can compute the performance of your search function on each landscape separately. Then you can describe the average performance of your search on your set of landscapes. If the set of landscapes is enumerable, then you can (theoretically) compute the average performance of your search function over all possible landscapes. (Even if the the set of landscapes isn’t enumerable, you can still talk theoretically about the average performance over all landscapes; you just can’t compute it.)

So – given any reasonable performance metric, no search algorithm can possible do better than random guessing over the set of all possible fitness landscapes.

There are places where this theorem is actually interesting. But they’re not the places

that creationists like Dembski continually invoke it. (The main interesting application

if it is in game theory, where you can show things like “There’s no universal

game winning algorithm.”) In something like evolution, it’s silly for one simple reason: the “all possible landscapes” clause.

Evolution doen’t work over all possible landscapes, and no sane person would argue

that it could.

When we talk about fitness landscapes, we tend to think of nice, smooth, continuous

surfaces, with clearly defined maxima and minima (hilltops and valley bottoms). But all

possible landscapes doesn’t just include smooth landscapes. It includes random ones. In

fact, if you work out the set of all possible landscapes, there are a *lot* more

discontinuous random ones than there are structured, smooth ones.

So in the NFL scenario, you need to be able to work in a nice smooth landscape, like

this one:

It’s easy to see how a search function could find the maximum in that one-dimensional

seach space – it’s got a very clean fitness landscape. On the other hand, NFL also requires

you to be able to find the maximum in a totally random landscape – like the one below:

It should be obvious that there’s absolutely no meaningful way to search for the maximum value in that seach space. There’s nothing in the fitness landscape that can provide the slightest clue of how to identify a maximum.

Evolution works in landscapes with structure. Another way of putting that is that

evolution works in landscapes where the result of a search step provides feedback about the

structure of the landscape. But the key takeaway here is that NFL doesn’t provide any meaningful rebuttal to information, because we don’t expect evolutionary search to

work in all possible landscapes!

In terms of evolution, the average performance over all possible landscapes is a strange

idea. And it’s obvious on an intuitive level why there’s no way that a single search function

can possibly perform well on all possible landscapes. On some landscapes, it will do really

well. On some (the random ones, which make up the majority of all possible landscapes), it

will be effectively the same as a random walk. And on others, it will perform worse than

random, because the adversarial landscape is perfectly structured to make the search strategy

travel towards the worst possible places.

### Smuggling Information

Lately, Dembski and friends have been taking a new tack, which involves talking about

“smuggling information”. They’ve been using the NFL argument for years, but they’ve

run into a rather serious problem: evolution works.

As evolutionary algorithms have become well known, and have had some astonishing

successes, it’s become difficult to make the argument that an evolutionary process

can’t possibly succeed. So it became necessary to come up with an excuse for why

evolutionary algorithms work so amazingly well, even though the creationists

mathematical argument shows that it should be impossible.

Their excuse is to basically accuse the experiments that use evolutionary

algorithms of cheating. They argue that the reason that the evolutionary mechanism can

succeed is because information was “smuggled” into the system – that, essentially, the

search algorithm encodes information about the landscape which allows it to succeed. They

argue that this is cheating – that the evolutionary process really has nothing do with the

success of the search, because the “solution” is really encoded into the structure

of the problem via the smuggled information.

There are two responses to this.

The first one is, in a word, “Duh!”. That is, *of course* there’s information

about the landscape in the system. As I discussed above, there’s no such thing as a search

algorithm that works on all landscapes, but for landscapes with particular properties, there

are search algorithms that are highly successful. If you look at it from an

information-theoretic viewpoint, *any* search algorithm which can successfully operate

in a particular search space encodes information about the space into its structure. From

the viewpoint of math, this is just totally, blindingly obvious.

And it’s not a problem for biological evolution. Biological evolution is based

on mutation, reproduction, and differential success. That is, it’s a process

where you have a population of reproducing individuals, where the children are

slightly different from the parents. Some of the children survive and reproduce,

and some don’t. This process clearly only works within a particular kind of search space;

that is, a search space where the survival to reproduction of a subset of the population

indicates that that subset of the population has a higher fitness value.

Evolution, modelled as a search, requires certain properties in its search space

for it to work. The information smuggling argument basically claims that that

means that it can’t work. But *every* successfull search algorithm

has certain requirements for he search-space in which it operates. By the

arguments of Demski and friends, there is no such thing as a successful search

that doesn’t cheat.

The second response to the argument is a bit more subtle.

If you look at the evolutionary process, it’s most like the iterative search process

described towards the beginning of this post. The “search function” isn’t really static over

time; it’s constantly changing. At any point in time, you can loosely think of the search

function for a given species as exploring some set of mutations, and selecting the ones that

allow them to survive. After a step, you’ve got a new population, which is going to have new

mutations to explore. So the search function itself is changing. And how is it changing?

Modelled mathematically, it’s changing by *incorporating information about the
landscape*.

There’s a really fascinating area of computer science called inductive learning machine

theory. It was invented by a guy named John Case, who’s a friend of mine; he was a member of

my PhD in grad school. John is interested in understanding how learning works on a

mechanistic level. What he does is model it as an inductive process.

You’ve got a target function, T. The goal of the learning machine is

to find a program for T. The way that it does that is by continually refining its

program. So you give it T(0), and it outputs a guess at a program for T. Then you give

it T(1), and it outputs a better guess (it should *at least* be correct for

T(0) and T(1)!). Then you give it T(2), and it outputs another guess. Each step, it

generates a new program guessing how to compute T.

According to Dembski, the process by which the learning machine improves its guesses is

*cheating*.

Evolution is really doing exactly the same thing as John’s inductive learning machines:

each step, it’s gaining information from the environment. You can say that the evolutionary

search at step N has *incorporated the results* of steps 0..N-1. Information about the

environment has naturally been incorporated. In fact, if you think about it, it’s

*obvious* that this is the case – and that this is exactly *why*, on a

mathematical level, evolution works.

Go back to the NASA experiment for a moment. The search space of possible antennas is

huge – but it’s structured and continuous. How does evolution find the best antenna? It tries

lots of options – and based on the results of those options, it discovers something about the

structure of the search space – it discovers that *some set* of solutions work better

than others, and it tries to search forward from those. By doing that, it’s pruned out huge

areas of the search space – the areas rooted at the least successful tests. Eliminating a

large part of the search space *is a way of incorporating information about the search
space*: it’s adding the information that “the solution isn’t over there”.

Pointing out that an evolutionary process works in a particular set of landscapes, and

that it’s got a way of acquiring and integrating information about the landscape that it’s

searching isn’t a critique of evolution – it’s an explanation of how it works. Far from being

a good argument against evolution, it’s an argument *for* evolution, because it shows

the mechanism by which evolution is capable of finding solutions.