The philosophy of classification

The term "radical" is a very loose term. It basically means "something that differs wildly from the consensus" in ordinary usage. So I hope David Williams and Malte Ebach won't take offense if I say that they have a radical interpretation of the nature of classification. In a couple of recent posts - one on Adolf Naef, and one on Molecular Systematics - they have presented some views on classification that do, indeed, differ from the received consensus. So, I need to blather a bit...

The nature of classification is highly contested in biology, let alone the ancillary philosophical discussions that accompany it. The focus for this contest is, of course, the rise and revision of cladistics, or phylogenetic systematics, in the early 1970s and since. As David Hull once noted, the "sweet science" of taxonomy is highly charged with emotion and politics.

Now at the time cladistics arose, there was a developing split between those who thought that a cladistic phylogeny was ipso facto a historical tree, or could be made immediately into one, and those who thought that a cladistic phylogeny was merely a summary of shared characters, or synapomorphies, that could at best test a historical reconstruction. The latter view, presented by a group of dissenting taxonomists including Donn Rosen, Norman Platnick and Gareth Nelson, was dubbed by its opponents "pattern cladism", and the other view, which Mark Ridley declared successful in a book entitled Evolution and Classification: The Reformation of Cladism, used to teach several generations of biologists, has recently been dubbed "process cladism". The distinction is supposed to be between classification that shows only patterns and that which uncovers the process of evolution. Despite a philosophical work by Elliott Sober, Reconstructing the Past: Parsimony, Evolution, and Inference, that effectively accepted (without mentioning) the theses of pattern cladism, the process view is the consensus today.

Or really, it would be the consensus if many systematists thought much about what they are doing with cladistics. The majority tend just to use the tools, like PAUP*, MacClade and other software implementations, as "black boxes" through which they feed their data sets in order to give the cladograms required by editors in all fields associated with biology, from medical research to actual taxonomic monographs. But here we are discussing those who do have a philosophy about classification.

In the period immediately preceding the cladistic revolution, and it was a revolution at the time, there were two other competing schools of thought. One, misleadingly known as "evolutionary systematics", was really just a historicised version of prior systematic practice, in which specialists grouped organisms together on the basis of perceived similarities. The other was ushered in by the arrival of computers, and was called "numerical systematics", and is now known as "phenetics" (from the Greek phaineros, meaning "appearance", which is the same root as for "phenomenal" and "phantasm"). The major difference between these two schools was that the pheneticists held that prior theory ought not to bias the classifications, because they held to an extreme empiricism, and wanted classification to be the basis for further theorising.

The evolutionary systematists held that similarity could give you the homologies, and that if something was too different from other members of its branch of the evolutionary tree, it should be grouped separately. So birds, which are less like crocodilians than other reptiles are, would be classified as distinct from the group containing the other two. On cladistic principles, this makes the group "artificial", because what makes it a group is purely the use of the taxonomist's art, not nature. On phenetic principles, the tree is irrelevant, so long as the traits used as principal components show clustering. So the debate is a three-cornered affair, with pattern cladists being a minority in the other corner.
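To make the cladistic complaint concrete, here is a minimal sketch in Python. The tree is a toy of my own (four tips, with a topology chosen only for illustration), and the names TREE, leaves and is_clade are hypothetical helpers, not anything from PAUP* or MacClade. A group counts as natural in the cladistic sense only if some node of the tree has exactly that group as its tips.

```python
# Toy tree as nested tuples; the topology is an illustrative assumption.
TREE = ("mammals", ("lizards", ("crocodilians", "birds")))


def leaves(node):
    """Return the set of tip names found under a node."""
    if isinstance(node, str):
        return {node}
    result = set()
    for child in node:
        result |= leaves(child)
    return result


def is_clade(tree, group):
    """True if some node of the tree has exactly `group` as its tips."""
    group = set(group)
    if leaves(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    return any(is_clade(child, group) for child in tree)


# "Reptiles" in the traditional sense leave the birds out.
print(is_clade(TREE, {"lizards", "crocodilians"}))           # False
print(is_clade(TREE, {"crocodilians", "birds"}))             # True
print(is_clade(TREE, {"lizards", "crocodilians", "birds"}))  # True
```

On this toy tree the traditional group fails the test because no single node gathers lizards and crocodilians without also gathering birds.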

This mess has entertained and distracted taxonomists for over 40 years, and indeed the roots of many of these issues go back much deeper than just the last few decades. I won't bore the reader by doing what I usually do and going back to the Greeks (although I could! Just ask me!); it's enough to go back to the 18th century. Michel Adanson held that classification should be based on many characters, while Linnaeus used only one. Buffon used what was effectively a kind of evolutionary systematics, although he was not an evolutionist in the strict sense.

In the nineteenth century, Ernst Haeckel started to publish tree diagrams - literally diagrams of trees - that represented his attempt to reconstruct the history of evolution, but he did it largely on the basis of his prior theories about the progress of the evolutionary process. This started a tradition in Germanic paleontology - beginning, according to Malte and David, with Naef, and eventuating in Willi Hennig's work, which got cladistics going.

So why do they think molecular systematics is a form of phenetics? They say:

...if considered a method, we see that there is no notion of congruence at all as no other datasets are given consideration. Molecular systematics as a form of measuring similarity constitutes a system, not a method.

Ancestors and other mechanical explanations are not of any concern in the debate between artificial and natural classifications. One does not decide on homology in advance. It is either there or it is not. Homology, as we understand, is a relation. A similarity such as 11, or AA, is not a relation. Thus, all molecular systematic studies are phenetic as they ignore relationship, that is, homology.

I'm not sure I follow this. According to current usage, molecular systematics does rely on homologies: they have a number of special terms devoted to identifying them: paralogy, xenology and orthology. Of course, they often don't use homology properly. And to identify a homology in molecular biology you need to do some prior work; homology is an inference from sequence similarity (including eyeball alignment). In short, if I understand the argument, molecular systematics derives homology from similarity.

There's a problem with that... in traditional biology, homology is derived from comparative analysis of the development and structure of compared organisms. We know that the humerus is homologous in dogs, apes and birds because of their structure and developmental sequence, even though in dogs the humerus supports their weight, in apes it is used to support manipulation, and in birds it is used to support flight. But how do we know that in molecular systematics? Only sequence similarity is enough. So perhaps that is what Malte and David mean.

Homology is a matter of identity under all forms and functions; it is not similarity. Classification that derives from identity is natural. Classification that relies on similarity necessarily relies on the metric used, and hence on the observer rather than the things observed alone. And that metric depends on theory and convention, and occasionally the biases of the observer. So I think that might be why Ebach and Williams and the whole pattern cladist school tend to think of non-cladistic classifications as artificial. I expect they will come in here and tell me if I'm right or not.
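The point about the metric can be shown with a small sketch in Python. Everything in it is invented for illustration: three hypothetical organisms A, B and C, two made-up trait measurements each, and two standard distance measures. It is not any published phenetic method, just a demonstration that "most similar" can change when the metric does.

```python
import math

# Hypothetical trait measurements; the names and numbers are invented.
traits = {"A": (0.0, 0.0), "B": (3.0, 3.0), "C": (0.0, 5.0)}


def euclidean(p, q):
    """Straight-line distance between two trait vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def manhattan(p, q):
    """Sum of absolute differences, trait by trait."""
    return sum(abs(a - b) for a, b in zip(p, q))


def nearest(name, metric):
    """Return the other organism closest to `name` under `metric`."""
    others = (n for n in traits if n != name)
    return min(others, key=lambda n: metric(traits[name], traits[n]))


print(nearest("A", euclidean))  # B
print(nearest("A", manhattan))  # C
```

The same three sets of measurements pair A with B under one measure and with C under the other, so the grouping reflects the choice of metric as much as the organisms.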

Why is this radical? It has antecedents that go back three centuries, and ties in with philosophical debates of much older provenance. But classification has been left out of philosophy lately, except in the contexts of language and set theory. At one time it was a hot topic, but it has evolved in its own way to become debates over natural kinds and laws of science. It is time, I think, for philosophy to take a new look at classification.

Having lived through the cladistic revolution, I am still somewhat bemused. I suppose we make up trees and the derived classifications based on similarities which we hope to be homologies. Anytime I get too comfortable with this, I remind myself of something Steve Farris said, "A similarity is only a similarity, but a difference is really a difference." Doesn't keep me awake at night, however.

By Jim Thomerson (not verified) on 29 Nov 2007

Only sequence similarity is enough.

I wouldn't necessarily say that sequence similarity is enough, but that sequence similarity is usually all that's available. As methods that allow factors such as secondary and higher structure of molecular data-sources to be considered become more available, I would expect that homology-testing in molecular biology will become more stringent.

Creationists would say that evolutionists have an ad hoc philosophy, ever changing yet remaining in accordance with their preconceived notion of evolution in their flight from true belief!

By Morgan-LynnGri… (not verified) on 29 Nov 2007

John, your summary is a good one distinguishing between different approaches of classifying. What you have missed, alas, is most of what has happened since about 1990. More and more people infer trees, but fewer and fewer of them bother to make a classification out of them. When you need to draw conclusions about the evolution of interesting characters, you nowadays do not go to a classification, but you get ahold of a phylogeny and interpret its implications for your character. Examples abound in molecular evolution and in the use of comparative methods in morphology and behavior.

In this interesting new era, systematists are unfortunately trying to put the labels "cladistic" or "phenetic" on particular numerical or statistical methods (such as parsimony, distance methods, likelihood, or Bayesian inference). "Cladistic" and "phenetic" are labels best reserved for philosophies of classification, not methods for reconstructing phylogenies. In my view, this labelling is a mess, a disaster, and confusing. Ebach and Williams are doing this, but in an unusual way that labels almost everything other than their favorite approaches "phenetic". I don't think they are helping us understand classification or understand the reconstruction of phylogenies.

One of the things about being outside the scientific process as an observer is a constant tendency to take at face value the labels scientists apply to themselves, so I thank you for your input here.

However, I must demur on two points: one is that cladograms are a form of classification in my view. Sure, that is not what scientists - particularly in the medical sciences - often think they are doing when they generate a cladogram. For them, they are just drawing a tree. But it is a tree formed by homologies and synapomorphies. It remains a classification even though it is not called by that name. A synapomorphy scheme has no other justification than that it schematicises homologies, and if that isn't a classification, then I'll eat Malte's hat (I don't wear one apart from one I don't want to eat).

In traditional philosophy, classification is regarded as the forming of classes according to some property or set of properties, but in an even older tradition, one that includes people like Kant and the logicians, genealogy was also a kind of classification. There's no reason to think that a cladogram isn't a kind of classification. Whether the perpetrators know they are doing it or not.

I know you are of the It Doesn't Matter Very Much school (indeed, I believe you are the founder of that school) and I understand why you think that is best. But the use of molecular data, or of algorithms that sort data into schemes like cladograms, doesn't change the underlying epistemological task being done, which is to order data for the purposes of further inference and research. I think that it pays to define these different tasks, whether or not one wants to give them labels that arouse the ire of various participants.

I would disagree with that on one point: trees generated by distance methods, likelihood methods, or Bayesian methods do not result in a "synapomorphy scheme", just a tree (or a cloud of trees). Thus they do not identify homologies, except perhaps implicitly (and how would have to be specified). Ebach and Williams have some other method, and I don't know what it is. Their interesting historical discussion of Naef and other figures of many decades ago does not clear that up.