One of the most important developments in evolutionary biology in the past few decades has come without much fanfare outside of a small circle of population geneticists. The early models of population genetics were limited when it came to analyzing the nucleotide sequence polymorphism data that began to appear in the 1980s. New statistical techniques were developed to analyze this data, and they all fell under the umbrella of coalescent theory. If you want to understand the evolution of populations, you’re missing a lot if you do not understand the coalescent.
When I wrote about the best biology experiments/discoveries, I mentioned in the comments that I should have included the coalescent in my list. One reason I place such importance on the coalescent is my bias toward molecular population genetics. But I also could not imagine a huge project like the HapMap being undertaken without coalescent theory. As DNA sequences become more and more common in studies of natural populations (supplanting microsatellites, which replaced allozymes), the importance of coalescent theory keeps growing.
Rasmus Nielsen has written a review of a new book on coalescent theory. In his review, Nielsen describes the importance of the coalescent for researchers interested in using molecular markers to study features of natural populations, such as structure and phylogeography. Nielsen claims that this is the first comprehensive treatment of the coalescent published since Richard Hudson’s review from 1990 (available here as a pdf), but John Wakeley’s book is also available. I have not read either, so I’ll refrain from judgment.
I have reproduced a couple of passages from Nielsen’s review — describing the importance of coalescent theory in population genetics — below the fold.
On the importance of coalescent theory:
Coalescent theory provides a bridge between population genetic models and molecular data. It describes how demography, recombination, and other factors affect the shape of gene trees and provides tools for making statistical inferences from molecular population genetic data. Coalescent theory is necessary in phylogenetics to understand why gene trees may differ from species trees, in conservation biology to understand the relationship between effective population size and census population size, and in molecular ecology to understand almost anything at all. Acquiring a basic knowledge of coalescent theory should be a great help to any evolutionary biologist, and it is a must for researchers and students in population genetics or molecular ecology.
On drawing conclusions from molecular data without applying the coalescent:
There are still too many papers published in the field of evolutionary biology in which an estimated gene tree (or gene network) is used to invent a detailed biological story with little appreciation of the complexity involved in making inferences on demography/population history from gene trees. For example, it is a common misconception that the superposition of an estimated gene tree on a geographical map provides information about the geographic ancestry of the individuals in the sample. Likewise, the use of geographical location as a cladistic character with an ancestral state that can be inferred using ancestral character reconstruction may easily lead to false inferences. The main problem is not only the uncertainty associated with the estimation of trees, but that the tree itself has a strong stochastic component. One of the important insights we have gained from coalescent theory is that the same population history may generate very different gene trees if repeated and that very different historic scenarios may sometimes generate gene trees that are surprisingly similar.
Nielsen believes that the role stochastic processes can play in shaping sequence polymorphism is underappreciated. Without a null model based on the coalescent, there is no way to statistically test hypotheses about things like population structure using DNA data.
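To give a feel for how large that stochastic component can be, here is a minimal sketch in Python. It is not taken from Nielsen's review or from the book. It assumes the simplest textbook case: the standard neutral coalescent for a single non-recombining locus in a constant-size, unstructured population, with time measured in coalescent units. The function names, the sample size of 10, and the number of replicates are my own choices for illustration. While k lineages remain, the waiting time to the next coalescence is exponential with rate k(k-1)/2, and summing those waiting times gives the time to the most recent common ancestor (TMRCA) of the sample.

```python
import random
import statistics


def simulate_tmrca(n_samples, rng):
    """Simulate one TMRCA for n_samples lineages, in coalescent time units.

    Assumes the standard neutral coalescent: constant population size,
    no structure, no recombination, no selection.
    """
    t = 0.0
    k = n_samples
    while k > 1:
        rate = k * (k - 1) / 2        # number of lineage pairs that can coalesce
        t += rng.expovariate(rate)    # exponential waiting time to next coalescence
        k -= 1                        # two lineages merge into one
    return t


def expected_tmrca(n_samples):
    """Theoretical mean TMRCA under the same model: 2 * (1 - 1/n)."""
    return 2.0 * (1.0 - 1.0 / n_samples)


rng = random.Random(42)  # fixed seed so the run is repeatable
replicates = [simulate_tmrca(10, rng) for _ in range(10000)]

print("theoretical mean TMRCA:", expected_tmrca(10))
print("simulated mean TMRCA:  ", statistics.mean(replicates))
print("simulated std dev:     ", statistics.stdev(replicates))
print("shortest / longest:    ", min(replicates), max(replicates))
```

Every replicate has exactly the same population history, yet the depth of the gene tree differs widely from one run to the next. That spread is what a coalescent-based null model has to account for before a single estimated tree is read as evidence for a particular demographic story.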