GP for the masses

My colleague Nic McPhee (with a couple of other people) is an author of a new book, A Field Guide to Genetic Programming — I think I'm going to have to read it.

Genetic programming (GP) is a systematic, domain-independent method for getting computers to solve problems automatically starting from a high-level statement of what needs to be done. Using ideas from natural evolution, GP starts from an ooze of random computer programs, and progressively refines them through processes of mutation and sexual recombination, until high-fitness solutions emerge. All this without the user having to know or specify the form or structure of solutions in advance. GP has generated a plethora of human-competitive results and applications, including novel scientific discoveries and patentable inventions.

See? It sounds cool!

Tags

More like this

I guess I'd better starting being nice to my computer.
Skynet cant be too far away.

hahahaha

It does indeed sound very cool.

By Andy James (not verified) on 27 Mar 2008 #permalink

At what point does an Intelligent Designer influence the program?

This will give a whole new meaning to 'debugging' programs.

Not that the code emitted by these systems can be debugged, if even read. About all one can do is select among the candidate programs using whatever test criteria seem appropriate. But don't expect reliable extrapolation or interpolation to new test conditions (new 'environments'). And definitely don't expect graceful failure modes. So, to borrow from a bit of boilerplate which used to accompany the release of consumer operating systems: "not suitable for use controlling nuclear reactors, aircraft, weapons systems,..." - or financial systems.

(Usual caveats for the sweeping generalization: most GP code isn't debuggable, although I know a few small examples have been after intensive effort. The GP codes are nothing like what a human author designs, use arcane tricks and are of course entirely uncommented.)

thwaite wrote: So, to borrow from a bit of boilerplate which used to accompany the release of consumer operating systems: "not suitable for use controlling nuclear reactors, aircraft, weapons systems,..." - or financial systems.

Actually, the instance I'm aware of where applications created via genetic programming are being used *is* in control of weapons systems, or more precisely, an anti-weapons system. Such applications are used because they cope far better with chaotic environments (e.g., momentarily changing atmospheric conditions) than anything purpose-built.

Last time we tried something like this, the thing abused every loophole and squeezed through every crack trying to get the best possibility out of what we had given it. We actually ended up with unusable output the first few times around because it exploited the data and rules so much that it's solutions just weren't practical.

Will look forward to reading this.

By Richard Eis (not verified) on 27 Mar 2008 #permalink

What's also cool is that the book is self-published and printed on Lulu, so you can get a printed copy for around $16 including shipping. Books like this published through traditional publishers (this would likely get published by Springer or one of the university presses) cost several times that.

For more information, look up John R. Koza on amazon. HE has also written a number of good books on the topic.

We're using genetic algorithms for scheduling systems. They work pretty good.

However, there's a particular story about genetic programming which I like: http://www.cogs.susx.ac.uk/users/adrianth/ascot/paper/paper.html

In short, the evolved circuit:
1) Worked flawlessly
2) Required far less elements than any designed circuit
3) Nobody found HOW it worked.
4) The evolved circuit contained elements, not connected electrically to the output, but still required for functioning.
5) The circuit worked only in the temperature range of about 10C.
6) Oh, and the circuit was "irreducibly complex" - it stopped working if even a single element was taken out.

I.e. the evolved circuit exhibited the same features as the real biologic evolution.

By Alex Besogonov (not verified) on 27 Mar 2008 #permalink

At what point does an Intelligent Designer influence the program?

Well, actually it does with the "ooze of random computer programs," since you really do need to have a starting point. I know what you mean, of course, I just don't want to leave a thread dangling for an IDiot to start pulling on it.

Perhaps most important with the entire situation, though, is that we need these evolutionary programs to deal with complexity. Recently the IDiots at UD were pointing to the "design" of an enzyme, and indeed, it is often best to use rationality to match solution to problem. But the interesting aspect of this particular "design strategy" is that intelligence dealt with the simple things (basic structure), while evolutionary strategies dealt with the complexity of the outer structure of the enzyme, That is, evolution was found to be best in adapting the details of the enzyme to the complexity of the environment and substrate, while rational design makes things that evolution itself would never be likely to make.

So yeah, rational intelligence would change a number of things about humans, like turning the eye right-side out. But no rational intelligence of which we know could make an integrated organized complexity like the human body is--at least not from first principles. Only evolution is up to that task (no, don't bother telling me that by defining "God" as "omniscient" that such a "God" could do it--by definition that's true. By evidence, such a "God" is not evidently in existence).

We're improving our capabilities by turning to evolutionary processes. Evolution has limitations which our intelligence can overcome, but intelligence has limitations which evolution can overcome.

It's a good thing that evolution can do what intelligence thus far cannot--give us complex multicellular life. It is equally a shame that intelligence was not involved, or we'd be much better off (at least if the intelligence were interested in our welfare, however unlikely that may be).

Glen D
http://tinyurl.com/2kxyc7

To find another cool example, google "NASA", "ST5", and "antenna". The x-band antenna for NASA's ST5 satellite mission was produced with an evolutionary algorithm. It consumed less power, took less man-hours to develop, and outperformed its conventionally-designed counterpart.

Sounds pretty spiffy! I'll have to take a look.

Here's another example of evolution-in-computing in action (neural nets, to be precise).

NERO

Genetic Algorithms are one of the best tools we have for solving global optimization problems in physical chemistry. Of course, the creationists will see this as evidence that evolution requires a grand Fortran programmer to first write and design the genetic algorithm.

By Christianjb (not verified) on 27 Mar 2008 #permalink

Genetic programming is an extremely interesting field.

I've written a random image generator, each image is generated from a string of numbers (20 digits correlates to a 'gene'), there are 'genes' that code colours, 'genes' for shapes even 'genes' that cause the reader to skip to different locations on the strand (back X, forward X or stop). I think the mutation rate might need tweaking however, there seem to be a larger number of 'stillborns' then I would have expected. For the current batch I've left the selection criteria very open, it's simply which of the two do you prefer. I'm currently writing an other version that will present 3 images, one to be rejected, the two winning images will then mate.

(http://www.alchemyx.com/board/random/compare.php)

Cool stuff. However, although these techniques can produce unexpected and more efficient results they are not necessarily time savers. They still require large amounts of computer power to run through the permutations and a very large amount of effort to ensure that your environment and pass/fail conditions are adequate to produce the outcome that you are hoping for. It's similar to training a neural net. I remember my AI instructor telling this story about a military project to teach a neural net to recognize a tank in a given picture. The showed it hundreds of pictures with and without tanks until it had a 100% success rate. Then they took it out to the field to test it and it completely failed. The reason why? All of the pictures with tanks were taken on a cloudy day and those without on a sunny day. All they really trained the net to do was recognize the difference between cloudy and sunny.
Same problem with genetic algorithms.
Also, one can not necessarily draw any correlations between an artificial genetic algorithm and real biology. Genetic algorithms are typically used in very simple environments that are carefully structured to produce very particular results. Then they use a brute force "survival of the fittest" method to drive the random permutations towards a set goal. Which, unless I miss-understand, is completely different from evolutionary biology...
It would be equally as dangerous for a evolutionist to point to these systems and say "See! It works" as it would be for a Creationist to say "See! It requires ID!".

I have written many GA's and am working on a new one as we speak. I am not saying this definition is inaccurate, but I don't know what they mean by "All this without the user having to know or specify the form or structure of solutions in advance." It seems to me you certainly do specify the structure of your solutions in advance, since you provide the genome and its representation. Also, sometimes ideas that aren't from natural evolution--such as more than two parents, are quite helpful.

I assume everyone knows the "checkers" GA story? Fogel wanted to train his Checkers playing GA on-line so he went to a game site. But he couldn't get enough games from the other players. That is, until he changed his username from something like checkerGuy to "Blondie." Then he got plenty of games and side chats.

GP starts from an ooze of random computer programs, and progressively refines them through processes of mutation and sexual recombination, until high-fitness solutions emerge.

Oozing computer programs having sex? Witness the birth of a new paraphilia.
Will someone please take these computer nerds to a titty bar or something?

By HPLC_Sean (not verified) on 27 Mar 2008 #permalink

Not that the code emitted by these systems can be debugged, if even read.

I wonder how hard it would be to add readability to the fitness criteria.

I remember my AI instructor telling this story about a military project to teach a neural net to recognize a tank in a given picture. The showed it hundreds of pictures with and without tanks until it had a 100% success rate. Then they took it out to the field to test it and it completely failed. The reason why? All of the pictures with tanks were taken on a cloudy day and those without on a sunny day. All they really trained the net to do was recognize the difference between cloudy and sunny.

This really has nothing to do with genetic algorithms or even AI at all, but with statistical sampling procedures and not guarding against confounding variables.

I recently took a "design of experiment" class and one of our exercises was to vary several parameters of a catapult and after measuring the distance thrown, develop a polynomial relating the input variables to the distance. However, there was a mistake in the design of the experiment that always varied two of the parameters in the same direction at the same time. (since it was a partial factorial design) This made it impossible to seperate the effect of those inputs on the distance thrown. Pretty much exactly what happened in your tank example.

I used a GA approach many years ago. I had objects with a certain internal state, subjected to a set of rules that could change the internal state. We wanted to know whether there existed some rules that could produce a particular shape of histogram of object states over time. If not, it meant we had to augment the internal state of the objects, something we tried to do as infrequently as possible, and the problem wasn't amenable to simple analysis. So, write up a genetic algorithm, go home. Come back after the weekend, and look at the histograms. They showed the shape we wanted, so I was able to dissect the genomes of the specimens in the ecology and figure out what properties of the genome were relevant to the desired behaviour. After that, it wasn't hard to write up new rules based on the increased understanding.

I keep thinking there must be a biological parallel to this anecdote. Studying a gene that does something useful, and using the imparted understanding to do new things in a non-genetic context.

genetic algorithms are way too cool...they are what finally converted me to atheism...before knowing about them, there was still the possibility in my mind that in spite of the evidence for evolutionary mechanisms, maybe they were not sufficient for the biological complexity we see today, and maybe some other minor tweaks through intelligent input (via some trans-dimensional intelligence) were needed..

after reading this talkorigins article, all such illusions were shattered...some of the emergent solutions given there literally blew my mind (for example, the one alex quotes in #11 is listed there)...anyone who reads that article in good faith and still insists on the impossibility of the self-emergence of irreducibly complex systems is being willfully obtuse...

A short review of genetic programming (see Bill above, for example)

Genetic programming is extremely cool, but nowhere near as cool as it sounds.

Gee, I seem to remember reading somewhere that thinking of genomes and computer programs as analogous is clearly the sort of stupidity that only an ID crackpot would do... where was that?

Oh, yeah, it was here:

http://scienceblogs.com/pharyngula/2008/02/the_genome_is_not_a_computer…

Doubtless the same horde of rabid commenters that argued that computer science had no place in studying the genome will assert that this is completely different from what the computer science folks were saying in the comments over there. No doubt because this is "framed" as the wise biologists educating the ignorant computer scientists on how to apply biological techniques to write computer programs, so it's "politically correct" in the intellectual hierarchy that runs from the apex of biology through computer science down to intelligent design.

/sarcasm

In some areas of computer science, genetic programming is frowned upon - precisely because it allows to solve a problem without really understanding it. Thus, if a conference committee is presented two papers which present similar performance gains on a given process, but one is GA (genetic algorithm) based and the other one is a first-principles model, the second stands a much better chance of publication.

That said, I concur - Genetic Algorithms are cool. Sometimes unpredictable, usually requiring careful data conditioning, but always interesting.

I once used GP to make a program to play rock, paper, scissors. I trained it against inputs from games I played against real people. It ended up beating me 75% of the time. I wasn't sure whether to be proud or ashamed.

Paul P: I wouldn't be proud of that. Since there's no winning strategy in rock/paper/scissors, what you wound up with was a simple program that understood your personal strategy (given what I played previously and my win record, what shall I play next?). It knew it well enough to predict your actions. I'm not sure I'd like to think that my actions could be modeled fairly reliably in what was likely a handful of bits.

I've gotta say I'm not *that* impressed by genetic programming, and I think this attitude was impressed upon me by my CS(particularly AI) teachers in college. GP is basically just a restricted sort of stochastic search by gradient descent. I think people often do themselves a disservice by getting caught up in the hype of the metaphor and making their algorithms less efficient by trying to tailor them to fit some sort of genetic conception of search, where it would be better just to apply the more general tools that computer science makes available for optimization problems.

Oddly this came up on another forum I frequent, here's a slightly edited version of what I wrote there:

The final product may be a model of efficiency, but the process of getting there is time and resource-intensive. By its nature, the vast majority of a genetic algorithm's output is garbage. It's a useful tool when you need to search for an optimal solution over a huge space of potential ones, but I don't think it's ever a superior alternative to conventional problem-solving when conventional problem-solving is applicable.

Practically speaking, a design that's less efficient but well-described and well-understood is usually better than one that's more efficient but nearly inscrutable - both will break eventually, and the former is a lot easier to fix. Mechanics are cheaper than surgeons, y'know?

Curse you PZ. How am I supposed to find the time to read yet another great book? The pile sitting on my desk is already curving space-time.

There's an old saying in AI and machine learning circles:

Neural networks are the second best way of doing anything.

And it's often followed by the corollary:

And genetic algorithms are the third best way.

Things like GAs are powerful methods for stochastic optimization for non-convex objective functions and they are useful optimization algorithms. But to make them work well in practice, you have to inject a *lot* of intelligence into designing your representation and coming up with a clever objective function for whatever it is that you're trying to achieve. GAs have their place, but often the best way of solving problems is not a GA. (Or a neural net, support vector machine, dynamic bayes net or whatever else is popular at the moment) You solve problems by achieve a deep understanding of the issues and then using the right tool for the job. That being said, yes, GAs are cool.

Gee, I seem to remember reading somewhere that thinking of genomes and computer programs as analogous is clearly the sort of stupidity that only an ID crackpot would do... where was that?

Try actually reading that article, the comments, and the braindead article it was a reply to before shooting off your mouth. This has nothing to do with "framing".

I want to second Buck @ 25.

Before I programmed a GA, I didn't really understand Darwinian evolution. By the end of the project, I'd gained a deep appreciation for the simplicity and effectiveness of this approach, and, unexpectedly, it helped to lift me out of the morass of Christian dogma I'd been mired in my whole life.

That I had to 'intelligently design' my program was irrelevant. Every system starts from somewhere.

Mark (Monty) Montague:

Yes, DNA is NOTHING like a common computer code. It's an example of what you GET if you run a genetic algorithm.

We can use _principles_ of evolution in our computer code.

By Alex Besogonov (not verified) on 27 Mar 2008 #permalink

I notice the book is free for download, which is nice, and shocking. When does anyone give something away for free? I couldn't find a place to give a little donation...did somebody see one?

GAs are dead useful for finding approximate/good/useful solutions to intractable algorithms (same sort of thing you'd use non-perfect heuristic functions on). Rather like ANNs, they've got little to do with their counterpart biological systems, but, they are neat and can be applied usefully (and are - frequently).

Fear any Computer Scientist that pulls a shallow biological/physical analogy :)

By Adam Robinson (not verified) on 27 Mar 2008 #permalink

I agree. GAs are cool. Especially when they do unexpected things.

And it seems to me the DNA comparison is reversed in this case. DNA isn't similar at all to regular computer code. GAs are an attempt to make computer code more like DNA, to varying degrees.

For those of you who have some programming curiosity, but arent hard core geeks ... go ahead and wade into this book. I helped proof the last drafts of the book and found it to be highly readable and accessible. A programmer I am not.

Excellent work by Phi and his colleagues on this effort.

By Desert Donkey (not verified) on 27 Mar 2008 #permalink

There is a warning for biologists in GA (or the related if not identical concept of evolving neural networks) that they don't want to hear. If a simple evolved network with just a couple dozen nodes and 2 levels of hidden units can defy attempts to explain "how it works," what chance do we have with cellular signalling? Which by the way is MUCH more complex than any neural network, most notably in that: information flow can be bidirectional; the physical location and functional properties of nodes are shifting; the networks are constantly excluding/including various nodes in ways that can subtley or substantially alter the network properties; and we can't watch it all work at once.

Dennis Bray pointed out that we still can't properly model the bacterial chemotaxis network. We know every protein involved and the kinetics of its interactions with every other one. We know where they are, how they are arranged, and what each one does. If we can't crack this one, what hope do we have with anything "hard?"

For now, there is enough to do just doing the stamp collecting work of measuring and cataloging the topology of networks. But someday we'll want to explain how they work. I think the problem is that lots of biologists think ther are explaining something when they draw a little summary arrow diagram of signal transduction components.

I'd like to concur with Desert Donkey. I'm definitely not a programmer, and the first 40 pages have been pretty enlightening.

I'd also like to thank the authors for making it available with the creative commons license.

I'm a little curious about these caveats and warnings about "shallow" analogies and such. I see them from time to time when people discuss the intersections of biology and evolutionary computing (EC).

It seems to me that many of the EC algorithms are exhibiting what can justifiably be called a process of "evolution". Sure, it's in a different medium, the raw materials aren't the same, the computing power and degrees of freedom are hugely dwarfed compared to a whole planet's worth of the actual laws of physics spread over 4.5 billion years, and the fitness is explicit instead of implicit (although not necessarily for ALife systems). I don't think anyone disputes this, and nobody is claiming that a GA or a GP system is a computer simulation of *biological* evolution. But it is evolution. Is it not? One can generalize beyond biology can't one?

I can find examples in the EC literature of systems exhibiting high-level evolutionary behaviors like the Baldwin effect, arms races, mimicry, parasitism, co-evolution, exaptation, speciation, convergent evolution, and more. Sure they're not in perfect one-to-one correspondence in every detail with their biological counterparts, but neither are the labels entirely unjustified. Do these things not betray an underlying commonality with biology?

Do my statements above constitute a "shallow analogy" or (as it seems to me) am I being fairly reasonable in this comparison and contrast of EC and biological evolution. What do you think?

Maybe someone has mentioned this, but has anyone heard of the guy in California who used a supercomputer and the principles of natural selection to generate a new design for a wide angle lens that reduces distortion? I think the story was in a Discover magazine article a few months back. He programmed a simulator that tested each lens design, kept the best and randomly created new ones for each generation, going through many thousands of generations, and got a really great result, a very wide angle lens with low distortion.
These sorts of things seem to me like proofs of principle for natural selection, but I guess creationists can still say, "But you didn't see animals evolve!" These are interesting examples regardless.

By David Reymond (not verified) on 27 Mar 2008 #permalink

> But it is evolution. Is it not?

GP is a specific instance of the class of all systems that evolve just as airplanes are a specific instance of things that can fly. There are underlying physical principles to flight that can be applied in different hardware (feathers vs. rivets). There are underlying mathematical principles underlying evolution that can be applied in different hardware (proteins, bytes). Airplanes are flight, but they are not biological flight.

Lee Graham wrote: Do my statements above constitute a "shallow analogy" or (as it seems to me) am I being fairly reasonable in this comparison and contrast of EC and biological evolution. What d$o you think?

This is a realm of pure opinion. On one end, Dennett has argued that only a couple of criteria are necessary for "Darwinian" evolution: replication with variation and selection.

Others consider evolution to be inextricably biological and historical... it is what has happened to living things on this planet over time, nothing more or less.

re: responses to my #29

#38: a. the /sarcasm was supposed imply that I was being sarcastic and b. of course I read it and the comments, I have a bunch of comments of my own. There was a very extensive (one might even say mind-numbingly so) discussion about this, to which I contributed a lot. Not defending the crackpot, but rather arguing that looking at the similarities between genetics and computers is a research area that can have some value... in fact, in #151 over there, you seem to agree with the points I was trying to make. A significant number of pharynguloids spent a good deal of time arguing that suggesting that there might be something to learn by analyzing the genome as a computer program was an offensively bad idea that could never be better than a waste of time and was more likely a gross misrepresentation of what the genome really is. Which pissed me off.

#40: the output of a genetic algorithm is generally a computer program, at least in the most interesting cases (you can use them to optimize parameters to fixed programs, but that's boring.) It's a computer program that doesn't look like it was written by an intelligent programmer (hence the "hard to debug" comments all over the place) but that doesn't make it not a program. It does make it a moronic argument to say "a program has to have a programmer therefore...blah blah God blah." however.

Lest I seem like a complete curmudgeon, there are many good comments in this thread:

#17 Karl Sims did something very similar to this (link to paper at the bottom): http://www.karlsims.com/genetic-images.html -- this is pretty much my favorite alife thing ever, but maybe just because it looks cool.

#26: Avida's good stuff. Where Sims' stuff is pretty, Avida gets deep. It is still a model, though, that's more like a traditional computer program than the genome, but it's close enough to rigorously test some of the hypotheses of evolutionary biology. Weirdly, the much-maligned Templeton Foundation actually gave Charles Ofria a grant to continue his Avida work, and he says they've never put any pressure on him to tie it to religion or renounce Darwinism or whatever. Go figure.

#34: this seems right for using GAs for optimizing parameters, but if you actually define a fitness function and evolve a program, the search can be somewhat different from gradient descent, or even stochastic gradient descent or simulated annealing, since the way you "mutate" the program is not the same sort of randomness, and your fitness criterion can be (and often is) a very indirect function on the "phenotype" program.

#35 all true, but sometimes a black box that works is fine for some applications. It's worrisome if it's in a critical role, though, since it's not clear how to validate if it could fail in some way that wasn't included in the environment it was evolved in, which leads to #37: I've heard neural nets described as "great for solving problems really, really fast as long as you don't mind if you occasionally get the wrong answer."

In neural nets, there is a problem with "overtraining" where the neural net learns, rather than "find a generalized solution to this problem" to just compute "is the input in the exact set of training examples I was given?" If the fitness function is chosen badly, GAs do similar things... in fact, it's pretty common to see things like "since we asked it to move its center of mass as far as it could, it grew really tall and fell over" when trying to evolve walking or crawling critters.

#47: **APPLAUSE** this was what I tried to say in the other thread, which is why I'm bitter and sarcastic in this one.

And I agree with the kudos for making the book freely available.

it's nice in a test environ where your daily chekins change the code and you want your self modifying code to keep up.

i cannot begin to fathom how this works but it sounds more fascinating than collective wisdom

By Paul Johnson (not verified) on 27 Mar 2008 #permalink

"Not that the code emitted by these systems can be debugged, if even read."

Of course, if the genome were the product of "Intelligent Design", it would be extensively commented. Maybe that explains some of the "junk DNA" ;-)

By Nick Gotts (not verified) on 28 Mar 2008 #permalink

For now, there is enough to do just doing the stamp collecting work of measuring and cataloging the topology of networks. But someday we'll want to explain how they work.

Besides the problem of explaining emergence, I find it fascinating that learning algorithms like the evolution process can circumvent the inverse problem by brute power (enough trials).

Of course, in biology it helps rather than hinders that the process don't know which functions it is selected for. So even dumb learning works, repeating errors and not "learning about learning" (at least not evolving evolvability on small time scales if at all AFAIU), which is kind of amazing IMO.

Btw, The Panda's Thumb has a fresh post which OT mentions EU's program Tackling complexity in Science. The program document reviews the history of the subject. It looks like the field of explaining complex systems has been more active vs biology since 2000, with a learning synergy between biology and software technology.

By Torbjörn Larsson, OM (not verified) on 28 Mar 2008 #permalink

In neural nets, there is a problem with "overtraining" where the neural net learns, rather than "find a generalized solution to this problem" to just compute "is the input in the exact set of training examples I was given?"

But not necessarily. When we learn by classification by symbols AFAIU we avoid this. (To get problems with false positives/false symbols instead.) Here is an example of a biologically inspired neural net model of a neocortex, where symbol like behavior spontaneously emerged and replaced overtraining with generalization:

FWIW, this neuroscientist claims [2006]:

Cognitive modeling with neural networks is sometimes criticized for failing to show generalization. That is, neural networks are thought to be extremely dependent on their training (which is particularly true if they are "overtrained" on the input training set). Furthermore, they do not explicitly perform any "symbolic" processing, which some believe to be very important for abstract thinking involved in reasoning, mathematics, and even language.

However, recent advances in neural network modeling have rendered this criticism largely obsolete.

By Torbjörn Larsson, OM (not verified) on 28 Mar 2008 #permalink

Oops, sorry about the formatting error and hasty reading at the basis of the preceding comment. Sure, there is a problem that generalizes to GAs. The paper confirms that (but also seem to point out that it is now known how it can be avoided for NNs).

By Torbjörn Larsson, OM (not verified) on 28 Mar 2008 #permalink

@53:
#38: a. the /sarcasm was supposed imply that I was being sarcastic and b. of course I read it and the comments, I have a bunch of comments of my own.

Yes I saw the tag and thought it was unwarranted, thus the harsh reply. The way I read your note was that PZ was being inconsistent at best or hypocritical at worst by first condemning the "DNA is a program" crackpot and then promoting genetic algorithms here. I do not see the inconsistency. I think PZ was well aware of GA in that previous post and simply argued that although there are some superficial similarities between computers and DNA, that is all it is; superficial. The crackpot to whom he was replying was not only claiming that DNA was exactly like a program but so exact that it could not possibly have evolved. Genetic algorithms was discussed at length there to refute that assertion.

You also seemed to be snarking at the "pharygulites" for political correctness. There was plenty of dissent in the other thread from both CS and Biologists about modeling DNA as a program and the cell as the computer.

Even so, the point is, yes, this is completely different than that thread. Here evolutionary processes are being modelled to create programs, the other was claiming that DNA was in fact a program and the cell the computer that runs it. The success of the former does not imply the latter.

im a big fan of genetic algorithms, i wish i could program them myself...but my programing understanding is kinda...poor...so i have a hard time grasping what is needed to make such a thing.

one thing id like to see, a GA used to create a kind of 3D biosphere for a game. but so many things would go into making them so the creatures look real (for instance there would be no selection for teeth unless you programmed the environment such that certain teeth were needed, same for claws or hands. so much would just go into making an environment whereby they could evolve a resymblance to actual creatures it would almost be easier just to make them yourself)

GA are of course limited by their environment and the fittness algorithm, they can very nicely simulate biological evolution...they just tend NOT to...because they are being used FOR something and not just for research purposes (though there are a number of research purposes which have used genetic algorithms to evolve creatures) so the fitness algorithm will of course have a purpose, where normal evolution has no such thing. but because the fitness function is different doesnt negate the algorithm itself.

personaly i love some of the things they have done with these things. i think GAs are a mixed basket when it comes to making AI, one thing is...we want our AI to be smarter than us, we want it to perfect and infallible, GAs, neural nets, and ga evolved neural nets give great results...but they are fallible, sometimes they are smarter, or more capable than us for some purpose (that one checkers program was in the expert rank, better than quite a few humans there) but usualy they have this tendancy to make non-nonsensical connections...as many times do humans.

i hold some questions myself if the perfect ideal of most AI will ever be attainable though...

one thing id like to see, a GA used to create a kind of 3D biosphere for a game.

Isn't that supposed to be Spore from Will Wright (SimCity, SimEarth, SimAnt, The Sims, etc. )

A couple of general points:

1. GA (genetic algorithms) and GP (genetic programming) are related, but different technologies. Both borrow ideas from natural selection and population dynamics to create solutions to problems. The main difference is that GP produces programs (or more usually, functions) that present a general solution to a problem while GAs are usually used to improve parameter selection. For example, a GA might be used to find a good combination of parameter values for a "fly by wire" process control program while GP might be used to create a de novo controller program.

This is a generalization, and there is more that you can do with GAs than simply improve parameter selection, but it gives you a sense of the difference: one produces a pattern that solves a problem, the other produces a function that solves a problem. Since a function is a specialized kind of pattern, it is obvious that you can use GAs to make functions, but it is harder to do than using GPs to do the same thing.

2. As a general rule, evolutionary algorithms such as GAs and GPs have had more impact in "real world" applications than many AI disciplines. Many of the cases cited above demonstrate that GP is having a very real impact. One has but to look at things such as antenna design (coming soon to a cell phone near you, as well as to NASA spacecraft), process control work (used by Dow Chemical), financial market modeling (including stock selection - see State Street Global Advisors and Investment Science Corporation), analog circuit design, molecular diagnostics, and others.

That said, they are machine learning approaches. As has been separately observed, it is only as good as the data it learns from. Brittle solutions are caused by using limited learning sets, no matter what machine learning approach is used (eg, lack of sun on images learned from). AI is more of a "first principles" approach than this.

And as Adrian Thompson has pointed out about the circuit that could work but only within a limited temp range and on a specific PLA, they produced a perfectly robust solution by changing the the environment the evolution took place in. It is actually exciting and informative that it exploits the environment - a characteristic of natural evolution - and useful as well as it is an easy way to "stress test" software.

At the same time, an interesting use of GP is to gain insight into a process that is not well understood. Dow used GP to produce approximate chemical reaction models that were then handed to 1st principles modelers. This reduced the time necessary to produce a reaction model from 15 months to 3 months because the 1st principles modelers could gain insight into the process from what the GP system discovered, particularly from the feature selection and combination. Perhaps this approach could be used in AI research.

3. As a general rule, I personally think EAs do nothing to prove or disprove the existence of dog. People wishing to disprove dog point to the development of functional novelty from a random starting point while people wishing to prove the existence of dog point to the fact that the "evolutionary engineer" provides initial starting values. To my mind, there is clearly nothing to be added to the existence question based on these algorithms.

4. That said, I believe (though PZ may have different opinions on this) that there is something to be learned about biology from these algorithms, and that this may make a small addition to the already overwhelming evidence for evolution and against IDiocy. I have been working with EAs for over a decade and one of my most surprising moments was, when visualizing the dynamic development of a population of programs, I watched a species "evolve" from an existing species of programs, then watched the two species "compete" for space in the artificial ecology of our problem. Part of this work was in the recently published book "Genetic Programming Theory and Practice V" in the chapter titled "Program Structure-Fitness Disconnect and Its Impact On Evolution In Genetic Programming." [FULL DISCLOSURE: As an editor, I get a v. small, but recognizable royalty (about enough to buy a good dinner) from this and the earlier GPTP books.]

[FOOTNOTE: At the GPTP workshop this book is taken from, we had the pleasure of having PZ as one of our keynote speakers. He was his usual graceful self and was very patient with a bunch of wild-eyed programmers' notions - thanks PZ! PS Squid (and squid ink) was involved in a dinner PZ was the guest at. :D]

I had a similar response when Melanie Mitchell visualized the evolution of a cellular automata where you could almost watch the system craft a tool necessary to solve the problem at hand.

In summary, evolutionary algorithms are an exciting area of computer science and engineering but they do not replace other areas of research, though they may have more applications. However, the relationship between EAs and evolutionary biology is interesting and hopefully will deepen as our implementations become more sophisticated.

Just to play spoiler- genetic algorithms are popular and attract a lot of interest but they are generally considered to be too slow for hardcore optimization problems. Here are some of my past comments, as well as a link to some commentary from people in the mainstream machine learning community:
http://simra.net/blog/node/465

People wishing to disprove dog point to the development of functional novelty from a random starting point while people wishing to prove the existence of dog point to the fact that the "evolutionary engineer" provides initial starting values.

Granted that I have never worked with GAs or GPs - but this doesn't seem to describe all the possibilities.

When people like Dawkins point to evolution of functions as a natural process they typically use the biological process. And at times point to GAs at simple models.

It is typically creationists that prefer to use "the genome is a program" incomplete simile to imply design. And since they imagine that the genome must degrade by evolution ("microevolution") at worst, they claim frontloading. Thus one can often see IDCers claim that the problem solution as somehow provided by the problem statement in GAs. (Well, everywhere. Isn't "godsdidit" fantastic? No problems in the creationist universe!)

Looking at a single 'new' selective pressure as starting at random is simplistic (no gene doubling and/or exaptation) but a reasonable analog I guess. One should be able to complicate the public relations problem for creationists by having a demonstration simulation randomly select between some functional targets, their parameters and starting points. They will always claim frontloading, but their imagined 'solution' will be one more step removed from the simplistic claim.

By Torbjörn Larsson, OM (not verified) on 28 Mar 2008 #permalink

"godsdidit"

I'm sorry for my blasphemous expression.

I meant "dogpoopedit" of course!

By Torbjörn Larsson, OM (not verified) on 28 Mar 2008 #permalink

Whatever next! Only a matter of time before someone makes the junk DNA bloat connection. Oh, I just did. Sigh.

Big thanks to PeeZed for mentioning the book, and thanks to everyone for the positive comments. I only just saw all this, having been off-line most of the last 24 hours.

It's really nice to see all the positive comments about the free PDF download. Our primary goal is to make the material as widely available (and hopefully useful) as possible. Thus we chose to self publish as inexpensively as possible (the printed copies are being sold at cost) and make the PDF freely available under a CC license.

Regarding Chris's comment (#41), there is currently no "Donate please" button. We've talked about it, and something like that might happen, using the money to support students to help manage projects like the GP Bibliography. For the moment, though, the best "donation" people can make is a little free press like Paul's post. There's no publisher out hawking our book, so it's down to our efforts and word of mouth. This makes the support of communities like this particularly important, and greatly appreciated.

Thanks to all!

Questions, suggestions, and ideas always welcome.

I tend to agree with #62. The problem with GP is that the *environment* is either too wide, or too narrow. Most times its 2D, even if its 3D, there are artificial constraints, like a) not needing food, b) not needing any way to "detect" things, c) being stuck on ground, etc. Avida showed that an environment that was either too narrow, or which had unlimited food (or inputs), and no direction, either failed to evolve at all, or evolve only *one* form. One needs something with scarcity, but not too much, expansiveness, but not infinite range, etc. In other words, you need a world with rules that allow for what you want to see, but neither the *expectation* of which one you will get, or conditions where its not optimal to *evolve* those traits in the first place. Avida does well in 2D with that, but it has a lot of limitations. You could never make a 2D game world, never mind a 3D one with it. And on reason is that a) there is no "physical" representation for what the GP "genes" look like in a creature, nor b) any way for form to develop, nor those physical differences to have an *effect* on the fitness.

We are, with our best GP, playing with single celled organisms, whose only *functional* characteristics is if they can sit on one place ans filter the data, and maybe do the equivalent of moving over a rock, to find a slightly better place to filter it. And, coral colonies don't exactly make astoundingly exciting computer games. lol

#61 yeah, I was in sort of a cranky mood, and I'm sorry if it offended you.

I found the other thread very frustrating in a "this is throwing out the baby with the bathwater" sort of way... obviously, the seed for it was some IDiot who deserves no respect whatsoever, but PZ, to some extent, and many of the commenters on that thread seemed to be taking the opportunity to argue that any comparison between the genome and a computer program is naive and misguided. I vehemently disagree with that position, and I find that the discussion here isn't even consistent with that: if genetic programming has any validity in studying anything to do with biology in more than an allegorical way, then the program created by the GP has to be analogous to the genome.

I think my actual issue is more that, while it's quite possible to find naive, incorrect, and misleading interpretations of the genome as a computer program, and even to make stupid creationist arguments using those bad interpretations, I also think that computational models of the genome can be a valid lens through which to study genetics, development, and evolution. But it seems like a very common response for people to respond like #40 "Yes, DNA is NOTHING like a common computer code" or several of PZ's and others' similar comments in the other thread.

I think part of the debate is just semantic, in that what I define as a computer seems to be fairly broad. I also disagree with the claim that any resemblance between the animal genome and a computer is "superficial," although I'm pretty much convinced that Pharyngula comments aren't the greatest forum to debate that... essentially, though, I think that "Sturgeon's Law" applies to say that 90% of comparisons between DNA and computer programs are crap, because 90% of everything is crap. I happen to find the other 10% interesting, compelling, and useful, though.

My biggest source of exasperation, though, is what I meant by "framing" (admittedly, a deliberately contrarian word choice): this "favorably framed" post on GP/GA stuff seems to be full of biologists loving the idea of computer models of genetics and evolution by evolving a computer program as a simulated genome, yet the other post, admittedly "negatively framed" in the seed post about some IDiot's computer analogy, seem to be full of (some of the same) people suggesting that it's offensively misleading to apply techniques for assessing a genome as a computer program. It just seems inconsistent, unless similarity is somehow being claimed to be intransitive in this case (evolved programs are like an evolved genome, but evolved genomes aren't like evolved programs).

Heh Lee, just wondering, ever consider making those things have different "surfaces", so that the type of motion they develop, as well as shapes, will depend on which they try? I.e., a snake like movement would work "better" if the "surface" only slide well in one direction. And, add pain type effects, where tall leaps or drops would "hurt", so it would learn to avoid them? Seem to me that while "some" of those things eventually found some really neat solutions to moving, most of them are inadequate for "real" worlds, because a) they only have to deal with flat surfaces, b) can't get "hurt" by doing something really dumb, and c) are not constrained by anything other than "I moved". In other words, your "scope" of inputs may not be limiting the evolution enough to promote results that would look or act like a real animal in a real world. Real creatures have to not just figure out how to move, but do so without killing themselves, and/or getting stuck, most of the time. Yours don't.

#72: If the physics engine allowed for the sorts of variable surfaces you describe, I wouldn't mind trying them out under genetic/evolutionary control. It would be interesting, for sure. Unfortunately, the physics engine doesn't have that level of fanciness and I don't really know enough to make the required modifications myself.

One thing I did consider was allowing some body parts to be "sticky". I put that in quotes because the stickiness simulation would just be a matter of establishing weak breakable "joints" between the body part(s) and any object(s) contacted. Creatures might be able to use body parts like that to, say, climb walls. That's a change I could make fairly easily.

As for your points about the lack of "pain" and so on, I think there's something to that. In fact, I think one of the main reasons I get a lot of jumping strategies from my creatures is that I don't bother to take into account the amount of energy a creature uses. Heck, if I as a human didn't have to worry about that, I'd probably do lots of jumping too :P I think if I penalized creatures for using too much energy they might evolve forms of locomotion that more closely resemble "walking" styles. That, of course, would require a proper (and lengthy) experiment to examine, but it's one I'd be interested in trying.

As for "pain", I think it would only be useful if I had creatures that did some form of learning during their time in the virtual world. At the moment, only my populations learn (by evolution) but not my individual creatures. However, damage/pain/wear from hard collisions with other objects could be tracked and factored into the fitness function. That might encourage creatures to be less reckless.

As for constraints and environments affecting the results, I do have a rough terrain environment option, but it's pretty limited. There is another 3D creature evolution project out there called "Evorunners", and it uses a more more constrained body plan scheme. The differences between Evorunners's creatures and my own is very clearly seen in the evorunners videos (see below). The Evorunners creatures find strategies that one could call "walking". My creatures almost never do. See, for example:
http://www.youtube.com/watch?v=uS57VNvq0o4

I'd definitely like to try modifications along the lines you suggest in future versions of the program. Thanks for the feedback :)

http://www.mitpressjournals.org/doi/abs/10.1162/106454601300328034?jour…

may be of interest for this. There was a paper, I think out of MIT, at SIGGRAPH in the early 90s that had some interesting results in evolving virtual swimming and walking creatures, too, but I can't find it just now, but I believe it described their physics engine in some detail.

Larry Yeager's Polyworld was very novel for its time, in that it included a sophisticated virtual environment involving food, perception, and so forth. I haven't looked at the "modern" polyworld in sourceforge he refers to here, though: http://www.beanblossom.in.us/larryy/

I don't remember if polyworld had an interesting physics model, but it certainly added some 3-d environmental factors that most other systems don't really try to model.

There's often sort of a tradeoff in these systems between making things complex enough to avoid being oversimplified and making things simple enough to notice if there are unrealistic assumptions that bias the simulation in some way... like if the "organism" evolves to exploit bugs in the physics simulator by vibrating in ways that confuse the numerical integration, for example.

like if the "organism" evolves to exploit bugs in the physics simulator by vibrating in ways that confuse the numerical integration, for example.

Well, I am sure you could find ways to constrain that, like constraining how much they move at all to be greater than the level of granularity possible to "cause" such numerical errors. Sort of a, "sorry, even though the sim uses 64-bit floating point, you creatures only get to use 32-bit integers." And really, if you want to create minds for creatures, like for a game, you just want a skeleton anyway, onto which you can add a "skin" for the final result. As long as it moves in a way that *fits* the skin, who cares. It would still produce unique skeleton designs, and thus potentially interesting creatures.

Though, I would love to see it get good enough that you could give is a basic "goal" for the shape, abilities, etc. it would have, then produce a close match, including the skinning.

Kagehi (#76) is right about the physics engine instability mentioned by Mark (Monty) Montague (#75). There are ways around it. Not perfect, but pretty good, such as setting speed limits on body parts and "killing" creatures that violate them. For anyone interested, there's some good discussion of that sort of problem, and other interesting problems in the papers:

Chaumont N., Egli R., and Adami C., Evolving Virtual Creatures and Catapults, Artificial Life, Volume 13, p. 139-157, 2007.

and

Taylor T., and Massey C., Recent developments in the evolution of morphologies and controllers for physically simulated creatures, Artificial Life, Volume 7, Number 1, p. 77-87, 2001.

The problem of creatures producing undesirable behaviors that still satisfy the fitness function ("cheaters") is a more interesting one (also discussed in those papers). An example would be a creature that begins coiled up and just unfolds to move its center of mass some distance horizontally instead of actually traveling. Another (mentioned by a commenter above, IIRC) would be a creature that is super tall, and covers horizontal ground just by falling over.

You have to make the fitness function "clever" to get around those things. Some might be tempted to say that this is a way to inject complexity and intelligence into the evolutionary process (what do they call it? "frontloading"?), but really it's just the usual thing one runs into when programming a computer. Whenever you're programming, the computer is going to do what you *tell* it to, and not what you *want* it to, and it always boils down to making sure you've phrased the former to match the latter. When you ask the evolutionary process to, say, maximize the net Euclidean distance reached by the creature's center of mass in the horizontal plane, that's what it's going to try to do. Tall creatures that fall over are one type of solution that might be found. The computer will not read your mind and see that that behavior is not what you meant. So it's not a matter of artificially exaggerating the power of evolution when extra effort goes into straightening out the fitness function. It's just a matter of precisely stating the problem you want solved.

It's like in those shows where you have a genie that's going to grant you a wish. You have to be very careful about how you phrase your request or the Genie will find a loophole and screw you over. Either way, once you've made the wish, the Genie is still going to cook up some pretty magical sh#t.

if genetic programming has any validity in studying anything to do with biology in more than an allegorical way, then the program created by the GP has to be analogous to the genome.

Wouldn't GPs describe whole "organisms", not the genome?

Anyway, a cell is duplicated by splitting, it isn't enough to duplicate the genome. In at least some animals there is more information added by maternal hormones, PZ has had posts on this.

I assume that this is what people mean when they observe that "the program", the functional trait, doesn't reside in the genome but is distributed over the cell and beyond, into the environment. (Perhaps this is what Dawkins means with "the extended phenotype"? I need to read up on that.)

A better analogy for the genome is perhaps that it is a collection of recipes, but not a full program description of how to cook and serve the food.

By Torbjörn Larsson, OM (not verified) on 29 Mar 2008 #permalink

Thank you for drawing attention to genetic programming and genetic algorithms in the context of the ID "debate." As a composer working with sonified genetic algorithms. I've been frustrated with the lack of attention GA and GP have been given in this context. The only testable hypothesis in ID is that all irreducibly complex systems require a designer. We do not have to show that our computer models exactly emulate what happened in nature. It is enough to achieve any irreducibly complex system without runtime interference from a designer.

Yes, with a computer program, there is an intelligent designer of the process itself. But, the religious can consistently think their god designed the process of evolution. It allows them to study evolution just fine without ID. IDers simply believe their god isn't capable of designing cool processes.

#79: Hi cristyn. Thanks for the link to the sonified GA. I'm definitely going to check that out. Sounds interesting.

I feel I have to comment on the statement that "The only testable hypothesis in ID is that all irreducibly complex systems require a designer." I don't think it makes ID testable at all. We could test whether or not intelligent designers can make something IC. We already know they can (human designers, at least). It has no bearing on the validity of ID, of course. We could test whether or not evolution can produce something IC. If the answer is yes, this only harms ID. If the answer is no it has no bearing on ID. The answer being no isn't a useful prediction from the hypothesis than an intelligent designer exists any more than if I claimed that "all snowdrifts are made in Belgium" and then tested it by seeing if any snowdrifts come from Ecuador - the answer being no doesn't help me. The IC business is only an attack on evolution, and even if it were successful it wouldn't help ID. ID doesn't automatically hold the "default" position of a theory that's supported by having others fail.

I completely agree with your finals paragraph. I've always felt that the complaint that "well, GAs require a designer, so there!" is something creationists should take up with the cosmologists, not the biologists ;)

In terms of IC in evolutionary computing, there's a bit out there already. I think if you look up "Lenski", "Adami", and "Ofria" (not sure on the spelling of the last one) you'll find something. Plus, I have a little applet that takes a stab at it myself:
http://www.stellaralchemy.com/ice

For now, there is enough to do just doing the stamp collecting work of measuring and cataloging the topology of networks. But someday we'll want to explain how they work.

Besides the problem of explaining emergence, I find it fascinating that learning algorithms like the evolution process can circumvent the inverse problem by brute power (enough trials).

Of course, in biology it helps rather than hinders that the process don't know which functions it is selected for. So even dumb learning works, repeating errors and not "learning about learning" (at least not evolving evolvability on small time scales if at all AFAIU), which is kind of amazing IMO.

Btw, The Panda's Thumb has a fresh post which OT mentions EU's program Tackling complexity in Science. The program document reviews the history of the subject. It looks like the field of explaining complex systems has been more active vs biology since 2000, with a learning synergy between biology and software technology.

By Torbjörn Larsson, OM (not verified) on 28 Mar 2008 #permalink

In neural nets, there is a problem with "overtraining" where the neural net learns, rather than "find a generalized solution to this problem" to just compute "is the input in the exact set of training examples I was given?"

But not necessarily. When we learn by classification by symbols AFAIU we avoid this. (To get problems with false positives/false symbols instead.) Here is an example of a biologically inspired neural net model of a neocortex, where symbol like behavior spontaneously emerged and replaced overtraining with generalization:

FWIW, this neuroscientist claims [2006]:

Cognitive modeling with neural networks is sometimes criticized for failing to show generalization. That is, neural networks are thought to be extremely dependent on their training (which is particularly true if they are "overtrained" on the input training set). Furthermore, they do not explicitly perform any "symbolic" processing, which some believe to be very important for abstract thinking involved in reasoning, mathematics, and even language.

However, recent advances in neural network modeling have rendered this criticism largely obsolete.

By Torbjörn Larsson, OM (not verified) on 28 Mar 2008 #permalink

Oops, sorry about the formatting error and hasty reading at the basis of the preceding comment. Sure, there is a problem that generalizes to GAs. The paper confirms that (but also seem to point out that it is now known how it can be avoided for NNs).

By Torbjörn Larsson, OM (not verified) on 28 Mar 2008 #permalink

People wishing to disprove dog point to the development of functional novelty from a random starting point while people wishing to prove the existence of dog point to the fact that the "evolutionary engineer" provides initial starting values.

Granted that I have never worked with GAs or GPs - but this doesn't seem to describe all the possibilities.

When people like Dawkins point to evolution of functions as a natural process they typically use the biological process. And at times point to GAs at simple models.

It is typically creationists that prefer to use "the genome is a program" incomplete simile to imply design. And since they imagine that the genome must degrade by evolution ("microevolution") at worst, they claim frontloading. Thus one can often see IDCers claim that the problem solution as somehow provided by the problem statement in GAs. (Well, everywhere. Isn't "godsdidit" fantastic? No problems in the creationist universe!)

Looking at a single 'new' selective pressure as starting at random is simplistic (no gene doubling and/or exaptation) but a reasonable analog I guess. One should be able to complicate the public relations problem for creationists by having a demonstration simulation randomly select between some functional targets, their parameters and starting points. They will always claim frontloading, but their imagined 'solution' will be one more step removed from the simplistic claim.

By Torbjörn Larsson, OM (not verified) on 28 Mar 2008 #permalink

"godsdidit"

I'm sorry for my blasphemous expression.

I meant "dogpoopedit" of course!

By Torbjörn Larsson, OM (not verified) on 28 Mar 2008 #permalink

if genetic programming has any validity in studying anything to do with biology in more than an allegorical way, then the program created by the GP has to be analogous to the genome.

Wouldn't GPs describe whole "organisms", not the genome?

Anyway, a cell is duplicated by splitting, it isn't enough to duplicate the genome. In at least some animals there is more information added by maternal hormones, PZ has had posts on this.

I assume that this is what people mean when they observe that "the program", the functional trait, doesn't reside in the genome but is distributed over the cell and beyond, into the environment. (Perhaps this is what Dawkins means with "the extended phenotype"? I need to read up on that.)

A better analogy for the genome is perhaps that it is a collection of recipes, but not a full program description of how to cook and serve the food.

By Torbjörn Larsson, OM (not verified) on 29 Mar 2008 #permalink