Phylogeny Friday - 28 April 2006

Over at my old site, I lamented the apparent death of distance-based tree-building algorithms. Just as all of life on earth can be divided into three domains, phylogenetic methods can be split into three groups: distance-based, maximum parsimony, and maximum likelihood. Distance- and parsimony-based approaches have been around for a while (and were in use before molecular data became available). The combination of molecular data and more powerful computers allowed large molecular datasets to be analyzed using parsimony methods. That computing power has also made it practical to apply maximum likelihood methods to inferring phylogenies. Bayesian likelihood algorithms are the en vogue tree-building methods, and they can be tuned to the specific parameters observed in your data. But, as I asked in the post, what about distance-based methods?

The discussion above is far from comprehensive, and I don't spend a lot of time building trees so I'm not qualified to judge which method is best. That said, the appropriate method definitely depends on your data, and it's always good to confirm your phylogeny using multiple methods. Despite being published nearly twenty years ago, the neighbor joining method remains one of the most popular tree building algorithms. The article has been cited an amazing 9,820 times (according to Google Scholar). That may be an underestimate, as ISI lists it as having 13,353 citations.

[Figure: neighbor joining phylogeny of Rana frog species (nei_join.JPG)]

The token phylogeny is shown in the figure above. This is the first ever neighbor joining phylogeny constructed using real data. The evolutionary distances between these frog species (from the genus Rana) were measured using allozyme loci and biochemical interactions. That's not exactly DNA sequences, but the original data were published in 1978. The numbers represent the evolutionary distance along each branch. DNA sequencing was still quite difficult in the 1980s, but technological advances made in the 1990s led to a rapid increase in the number of DNA sequences in public databases. The neighbor joining algorithm was used to construct many of the early phylogenies using molecular data (some of these may appear in Phylogeny Friday in the coming weeks).
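
For readers who want to see how a distance matrix turns into a tree with branch lengths, here is a minimal sketch of the neighbor joining procedure in Python. It is an illustration under stated assumptions, not the original authors' code: the function name `neighbor_joining`, the internal labels `node1`, `node2`, ... and the five-taxon matrix are all made up for this example (the matrix is not the Rana data). At each step it joins the pair of nodes with the smallest Q value, computes the two new branch lengths, and replaces the pair with a single internal node.

```python
from itertools import combinations


def neighbor_joining(names, matrix):
    """Return the unrooted tree as (node, node, branch_length) tuples.

    names  -- taxon labels (hypothetical; any unique strings)
    matrix -- symmetric square list of lists of pairwise distances
    """
    # Working copy of the distances, keyed by node label.
    d = {a: {b: matrix[i][j] for j, b in enumerate(names)}
         for i, a in enumerate(names)}
    nodes = list(names)
    edges = []
    next_id = 1

    while len(nodes) > 2:
        n = len(nodes)
        # Total distance from each node to all the others.
        r = {a: sum(d[a][b] for b in nodes if b != a) for a in nodes}
        # Join the pair that minimizes Q(i, j) = (n - 2) d(i, j) - r(i) - r(j).
        i, j = min(combinations(nodes, 2),
                   key=lambda p: (n - 2) * d[p[0]][p[1]] - r[p[0]] - r[p[1]])
        # Branch lengths from i and j to the new internal node u.
        li = d[i][j] / 2 + (r[i] - r[j]) / (2 * (n - 2))
        lj = d[i][j] - li
        u = f"node{next_id}"  # illustrative label for the internal node
        next_id += 1
        edges += [(u, i, li), (u, j, lj)]
        # Distances from u to every remaining node.
        rest = [k for k in nodes if k not in (i, j)]
        d[u] = {k: (d[i][k] + d[j][k] - d[i][j]) / 2 for k in rest}
        for k in rest:
            d[k][u] = d[u][k]
        nodes = rest + [u]

    # Two nodes remain; connect them with a single branch.
    a, b = nodes
    edges.append((a, b, d[a][b]))
    return edges


if __name__ == "__main__":
    # Toy distances for five hypothetical taxa (not real data).
    names = ["A", "B", "C", "D", "E"]
    matrix = [
        [0, 5, 9, 9, 8],
        [5, 0, 10, 10, 9],
        [9, 10, 0, 8, 7],
        [9, 10, 8, 0, 3],
        [8, 9, 7, 3, 0],
    ]
    for left, right, length in neighbor_joining(names, matrix):
        print(f"{left} -- {right}: {length}")
```

Each printed line is one branch of the unrooted tree, with the number playing the same role as the branch labels in the figure. Real analyses would use an established phylogenetics package rather than a sketch like this, which does no input checking and breaks ties between equally good pairs arbitrarily.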

Everybody has their preferences, but since Neighbor-Joining, parsimony and UPGMA have different assumptions, it's worth running all three on a dataset. It's technically not an obstacle, so why not explore the dataset? I agree that keeping the simplest analysis is the best, but the different views afforded by parsimony and UPGMA might turn up some interesting tidbits in the dataset.
Oh, and like flossing after every meal, don't forget to bootstrap.

By Bruce Thompson on 29 Apr 2006