Stalking the Perfect Tree

By loom on October 26, 2003.

When Charles Darwin was thrashing out his theory of evolution, he would doodle sometimes in his notebooks. To explain how new species came into existence, he wrote down letters on a page and then connected them with branches. In the process, he created a simple tree. Across the top of the page, he wrote, "I think."

That single tree has given rise to the thousands of trees that are published in scientific journals these days. A particular tree may show that humans are more closely related to chimpanzees than to gorillas. It might show how the SARS virus in humans descends from viruses in other animals.

When you look at the picture of a tree in a scientific paper, it is easy to take it as an illustration of an unadorned fact. That is not, however, how science works. A tree represents a hypothesis that offers the best explanation of the data at hand. It shows the most likely pattern by which new species might have branched off one another, taking on new traits along the way, and giving rise to the range of species a scientist is studying.

These hypotheses are not simple to come by. Large-scale studies of phylogeny only became possible when computers began turning up on the desks of biologists. You need that computing power because even in a simple comparison of a dozen species, there are so many alternative trees to test. Say you have 3 species, A, B, and C. A and B might be more closely related, or maybe A and C, or B and C. Three choices. But as you add more species, the possibilities explode to millions and more. Sifting through those possibilities takes both gigaflops and smart statistics.
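To put rough numbers on that explosion, here is a minimal Python sketch. It assumes the trees in question are rooted and that every branch point splits in exactly two, in which case the standard count for n species is the product of the odd numbers up to 2n - 3. The function name is my own, chosen for illustration.

```python
def count_rooted_trees(n_species: int) -> int:
    """Count distinct rooted, strictly two-way-branching trees for labeled species.

    Uses the standard formula (2n - 3)!! = 3 * 5 * ... * (2n - 3).
    """
    if n_species < 1:
        raise ValueError("need at least one species")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 3 (an empty product for n <= 2).
    for k in range(3, 2 * n_species - 2, 2):
        count *= k
    return count


if __name__ == "__main__":
    for n in (3, 4, 10, 12):
        print(n, "species:", count_rooted_trees(n), "possible trees")
```

Under those assumptions, three species give the three choices described above, ten species already give more than 34 million trees, and a dozen give more than 13 billion.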

From life's long reign, we have relatively few pieces of information to figure out the shape of its tree. The first evolutionary biologists to draw trees could only compare features that they could see through a microscope or on a fossilized skeleton. These days, most trees are based on genes. Once scientists could sequence genes, they tapped into a far richer lode of information than previous generations could reach. What's more, genes offer a much crisper picture of the evolutionary process than, say, a horn or a petal. After all, mutations to genes lead to inherited changes in how the body develops. Whereas the change to the body may be hard to tease out, the mutation may be as simple as snipping out a few nucleotides in a gene sequence.

But gene trees are not unadorned facts, either. Some genes have evolved relatively quickly, so that if you compare them in different species that took millions and millions of years to diverge, they may offer a distorted picture of how those species are related. On the other hand, a gene that evolves too slowly may not be able to distinguish the fine details of a recent explosion (like the cichlids of Africa that I wrote about recently). In bacteria and other single-celled organisms, the picture gets even fuzzier when you consider the fact that they can trade genes with one another, rather than just inheriting them from ancestors. In some regions, the tree of life is more like a mangrove, with branches grafting together rather than splitting apart.

One convenient thing about building evolutionary trees is that you can get an idea of how much confidence you can have in them. One way is to pick out a random subset of your data to base a new tree on. In some cases, the switch may produce a tree with a different shape. Perhaps just one section of it changes. Or perhaps the tree barely changes at all. By repeatedly testing the evidence in different combinations, it's possible to estimate how likely it is that each branch point is authentic.
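The resampling idea can be sketched in a few lines of Python. This is a toy under loud assumptions, not anyone's published method: it uses the bootstrap variant of the idea (drawing sites at random with replacement rather than taking a plain subset), and it stands in for real tree-building with the crudest rule available, namely that the two species whose sequences differ at the fewest sites are each other's closest relatives. The function names and the made-up sequences are mine.

```python
import random
from collections import Counter
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of aligned sites at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def closest_pair(alignment: dict) -> tuple:
    """Toy stand-in for tree building: the pair of species with fewest differences.

    Ties go to whichever pair comes first in alphabetical order.
    """
    pairs = combinations(sorted(alignment), 2)
    return min(pairs, key=lambda p: hamming(alignment[p[0]], alignment[p[1]]))


def bootstrap_support(alignment: dict, replicates: int = 1000, seed: int = 0) -> dict:
    """Fraction of resampled data sets that favor each pair as closest relatives."""
    rng = random.Random(seed)
    n_sites = len(next(iter(alignment.values())))
    counts = Counter()
    for _ in range(replicates):
        # Draw site positions with replacement, the same positions for every species.
        cols = [rng.randrange(n_sites) for _ in range(n_sites)]
        resampled = {
            name: "".join(seq[i] for i in cols) for name, seq in alignment.items()
        }
        counts[closest_pair(resampled)] += 1
    return {pair: n / replicates for pair, n in counts.items()}


if __name__ == "__main__":
    # Invented sequences, for illustration only.
    toy = {
        "A": "ACGTACGTACGT",
        "B": "ACGTACGAACGT",
        "C": "ACCTTCGAACGA",
    }
    for pair, support in sorted(bootstrap_support(toy).items()):
        print(pair, support)
```

A pairing that turns up in nearly every resampled data set is one you can lean on. A pairing that wins only slightly more often than its rivals is a branch point to treat with suspicion.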

Gene trees have shed a lot of light on the history of life. Just to pick one case among many, several different studies have strongly supported the notion that hippos are the closest relatives to whales on land. But these studies are like telescopes for looking back in evolutionary time, and they are only as precise as their design allows. Studying one gene in all animals may give you a different picture of animal evolution than studying a different one. It's not as if one gene will point to snapdragons as the closest relative of fish, or link mushrooms and monkeys. But it can get hard to determine whether comb jellies are more closely related to jellyfish or to crustaceans, vertebrates, and other more complex animals. This may sound esoteric, but it's not really. If comb jellies are closer to us, scientists could find some important clues in them about our own evolution. If they're out on a more distant branch, they aren't so important to our own evolutionary story.

In recent years, some scientists have argued that the best way to bring the evolutionary telescope into tighter focus is to study a bunch of genes at once. Fortunately, in this age of genomics, we're swimming in genes. Scientists have just started running studies in which they compare dozens of genes in various species. The results have been promising. But until now, no one had looked systematically at how much help multiple genes could offer to unsolved mysteries in phylogeny.

All of this is a very long preamble to a fascinating study in Nature this week from Sean Carroll at the University of Wisconsin and some of his current and former students. They looked at seven species of yeast, all of whose genomes have been fully sequenced in recent years. They picked out 106 genes in all seven species, choosing them because they clearly show signs of being variations of each other, descended from a common ancestral gene that duplicated many times. Then they used each gene to come up with a tree showing how the yeast are related. Many of the genes produced different trees. Not surprising. What was surprising was what happened when they analyzed all 106 genes together. Suddenly, a single tree emerged as the most likely. And no matter how they tested the tree, they found 100% confidence at every node. As the authors note, this certainty is unprecedented, and they argue that they have established the evolutionary history of these seven species.

It seems that the annoying disagreements from individual genes fade away when a computer can crunch down on a lot of them. Carroll and his co-authors realized that they may have been indulging in overkill by using 106 genes, and so they narrowed down their data set to see how few genes they needed to get the same sort of overpowering results. They could get down to just 20 genes and still produce the same tree.
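The narrowing-down experiment has a simple general shape, which the Python sketch below tries to capture. It is not the authors' procedure, only my guess at a bare-bones version: pick a random handful of genes, join their sequences end to end for each species, build a tree from the combined data, and see how often it matches the tree built from all the genes. The tree-building step is left as a function you pass in (infer_tree, a hypothetical name), and everything else here is likewise named for illustration.

```python
import random


def concatenate(genes: dict, names: list) -> dict:
    """Join the chosen genes end to end, species by species.

    `genes` maps gene name -> {species name -> aligned sequence}.
    """
    species = genes[names[0]].keys()
    return {sp: "".join(genes[g][sp] for g in names) for sp in species}


def agreement_by_subset_size(genes, infer_tree, sizes, trials=100, seed=0):
    """For each subset size, the fraction of random gene subsets whose tree
    matches the tree inferred from all genes combined.

    `infer_tree` is any function that takes {species -> sequence} and returns
    a tree in a form that can be compared with ==.
    """
    rng = random.Random(seed)
    all_names = sorted(genes)
    reference = infer_tree(concatenate(genes, all_names))
    result = {}
    for k in sizes:
        hits = 0
        for _ in range(trials):
            subset = rng.sample(all_names, k)  # k distinct genes, chosen at random
            if infer_tree(concatenate(genes, subset)) == reference:
                hits += 1
        result[k] = hits / trials
    return result
```

If the agreement climbs to 1.0 at some subset size and stays there, that size is a rough estimate of how many genes the data set really needed.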

Carroll et al. haven't found the guaranteed method to figure out every evolutionary tree. Each group of species will have its own peculiarities to take into account. But their astonishing results offer a very sunny forecast for phylogenies in the next few years. The days of "I think" may be over.
