Oops, I missed the first part of this talk due to the distractions of lunch. I walked in as he was talking about tree vs. ladder thinking (people have a hard time conceptualizing trees) and about history as either a chronicle, a barebones description of events, or a narrative, in which events are linked by causal explanations.
It took a century for biologists to use systematics to make testable hypotheses about evolution. Darwin himself talked at length about all kinds of evidence for evolution, but strangely neglected fossils and dinosaurs altogether. Sereno blames this on rivalry with Richard Owen, who was the big dinosaur man of the day. One fossil Darwin was pleased with was Archaeopteryx, and Huxley in particular made the link between Archy and birds. Sereno brought in a fossil of Confuciusornis, which was very cool.
We have begun to separate out the chronology from the narrative; chronology is a limiting factor in our hypotheses. We are interested in the trajectory of change over time, and Sereno confesses to baldly exploiting that to get a publication in Nature on Raptorex. He carefully omitted any causal discussion from the paper, trusting readers to infer a narrative from the story, because that's what we do.
Deplores the thinness of work in the philosophy of phylogeny.
History: Darwin crystallized many of the pieces of an existing chronology into an evolutionary narrative. The next big breakthrough was Hennig (1950) who atomized morphological transformations and branching patterns, defining specific terms to describe phenomena important for understanding trees. Quantitative cladistics (1969) put it on a solid empirical foundation. Character states were coded as mathematical variables.
Problem: everyone has a different matrix for the analysis of characters for each phylogeny examined. The matrix is a black box. We are searching for a methodology that will link everything together. A modern comparative cladistics would open up the black box for universal analysis. We need to figure out what the characters are, and we need to be able to do comparative analysis. There is no global understanding of what a character or character state is. There is currently a movement to develop a universal character ontology.
He makes a strong case that we have a serious problem with different investigators studying the same phylogenies, but using different characters and even scoring them differently. We need to standardize to enable full comparisons of multiple data sets.
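The comparison problem can be made concrete with a few lines of Python. This is only a sketch under stated assumptions, not anything Sereno showed: the taxon names, character names, and scores are all made up, and `compare_matrices` is a hypothetical helper.

```python
# Hypothetical data: two investigators score the same two taxa, but with
# partly different character lists and one conflicting score.
matrix_a = {
    "taxon_A": {"char_1": 0, "char_2": 1, "char_3": 1},
    "taxon_B": {"char_1": 1, "char_2": 0, "char_3": 1},
}
matrix_b = {
    "taxon_A": {"char_1": 0, "char_2": 1, "char_4": 0},
    "taxon_B": {"char_1": 1, "char_2": 1, "char_4": 1},
}


def compare_matrices(a, b):
    """Return (shared characters, only in a, only in b, scoring conflicts)."""
    chars_a = {c for scores in a.values() for c in scores}
    chars_b = {c for scores in b.values() for c in scores}
    shared = chars_a & chars_b
    conflicts = []
    for taxon in sorted(set(a) & set(b)):
        for char in sorted(shared):
            score_a = a[taxon].get(char)
            score_b = b[taxon].get(char)
            # Only report cells that both investigators actually scored.
            if score_a is not None and score_b is not None and score_a != score_b:
                conflicts.append((taxon, char, score_a, score_b))
    return sorted(shared), sorted(chars_a - chars_b), sorted(chars_b - chars_a), conflicts


shared, only_a, only_b, conflicts = compare_matrices(matrix_a, matrix_b)
print(shared, only_a, only_b, conflicts)
```

Note that the comparison only works here because both matrices happen to use the same character names; without an agreed vocabulary for characters, even this simple check is not possible.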
Comments
Posted by: taranaki | October 30, 2009 2:20 PM
PZ - it turns out that the xians have been waiting and have their response ready for this conference. From the Christian NewsWire -
'The Mysterious Islands'--a New Film Shot on the Galapagos Islands which Challenges Darwin--Premiers Opposite Darwin/Chicago 2009
This is literally a twice in a century conference - will it live up to its hype?
Posted by: Glen Davidson | October 30, 2009 2:30 PM
I'd think that using different characters at first wouldn't be such a bad thing, allowing for cross-correlations and information for later standardization of characters and scores. Different scoring sounds worse, to me, as people at least have to be able to compare anything quantitative.
DNA helps so much in living lines. That's one reason the whole "Cambrian Explosion" nonsense the DI is promoting seems so stupid, as DNA so clearly indicates evolution without any Cambrian break in the crown groups.
Glen D
http://tinyurl.com/mxaa3p
Posted by: John S. Wilkins | October 30, 2009 2:51 PM
"Deplores the thinness of work in the philosophy of phylogeny"
As do we all... if only the biologists wouldn't keep trying to do it.
Posted by: Antiochus Epiphanes | October 30, 2009 3:45 PM
Deplores the thinness of work in the philosophy of phylogeny???
Maybe a little thin from the model-based side, but good God! Pick up an issue of Cladistics and you will soon grow tired of reading about the philosophy of phylogeny. Find papers by Farris, Sober, Felsenstein, Faith, de Queiroz, Kluge, etc.
I guess I don't have context, but it seems like a wild statement to me.
Posted by: Antiochus Epiphanes | October 30, 2009 3:50 PM
@John S. Wilkins: What do you mean?
Posted by: Jerry D. Harris | October 30, 2009 4:06 PM
No, this is exactly the opposite of what we need to do! If the same kind of analysis with different data sets (different matrices) provides the same results, then it strengthens the hypothesis that those results are the correct ones. If everyone just uses the same characters over and over again, without ever changing them -- or, worse, not changing them in light of new data demonstrating that they ought to be changed -- then it would force all analyses to be the same without there being any way of testing whether that sameness reflects reality in any way! I do agree that there should, perhaps, be some systematization of how characters are identified and measured, but the former is a matter of semantics and picking the best words to describe things; the latter is a perpetual problem that may need some work.
Posted by: David Marjanović, OM | October 30, 2009 9:27 PM
Sereno will give what seems to be the same lecture here in Paris on Tuesday. :-) :-) :-) (Or on Nov. 15th. The other will be about fossil finds from the Sahara. Forgot which is which.)
Yeah. Determining which characters should be ordered, which should be unordered, and which should get which stepmatrix is often a hairy issue. Determining which characters contain phylogenetic signal is not always trivial (though it usually is). Finding out which characters are correlated to each other and must therefore be treated as a single character is often very hard...
Total-evidence analyses are a good thing, though. (They're just a lot of work, as I can tell from experience.) Once all characters and taxa ever used on a problem (provided they're not correlated to each other etc. etc. etc....) are in the same matrix, all changes to the resulting topology must be due to added taxa or added characters; furthermore, this approach prevents accusations of cherry-picking (something Sereno himself has been accused of in the past, when his analyses had absurdly high consistency indices of 0.8 and higher, while everyone else's had and still have 0.2 to 0.3) – obviously, if your matrix contains only those characters that fit your hypothesis, your analysis will find your hypothesis, and with strong support at that.
Besides, bigger is better in phylogenetics.
Posted by: John Scanlon, FCD | October 31, 2009 12:25 AM
...or corrections of matrix entries found to be in error (observation and measurement errors, typos, brainfarts generally) or based on inadequate or atypical samples (this presumably all falls into the 'etc. etc...').
Absolutely agree on 'total evidence' (which seemed absurd terminology in 1989 but is getting less so in these days of whole-genome analyses etc.) and bigger-is-better. Excellent things are happening now with analyses combining the largest available morphological datasets (including fossils) and large numbers of gene sequences (including ancient DNA from recently extinct stuff) for large numbers of taxa. If you can get all that together and do a model-based (Bayesian) analysis, of course the results are an improvement on the 1970s-80s pick-and-choose approach (Gaffney's turtle stuff from that time is probably a more extreme example than Sereno's).
But the bigger the matrix, the more checking...
Posted by: mythusmage | October 31, 2009 4:51 AM
Proposition: When doing a study of any organism all features must be taken into account in order to get an accurate picture of the specimen, especially those one might be tempted to label as irrelevant. For insofar as a detail adds to the picture, none can be considered inconsequential.
Now should one be tempted to observe that this makes such work hard, let me point out that if it were easy everybody would be doing it.
Posted by: John Morales | October 31, 2009 5:49 AM
Antiochus @5, John Wilkins is a philosopher and has an interest in philosophy of biology.
(His name links to his blog.)
Posted by: David Marjanović, OM | October 31, 2009 9:43 AM
Well, yes. My latest two papers (or rather one paper and the supplementary information to another) are about this very subject. :-)
Phylogenetics obeys the law of "garbage in, garbage out". Mistakes in the matrix lead to mistakes in the tree – mistakes of unpredictable kinds and magnitudes.
Model-based analyses have the great advantage over simple parsimony that they're less susceptible to long-branch attraction. However, a recent Syst. Biol. paper shows that all effects anyone ever feared missing data could have, which all don't in fact apply to parsimony, hit hard in model-based analyses. Furthermore, they suck at data matrices where too many different characters evolve at too many different speeds (heterotachy). The usual number of rate categories in a model-based analysis is four, which is just laughable; increasing that number (which can of course be done in the available software) leads to drastically increased computation times...
Haven't looked at his matrices, but even at that time, everyone else got different results from him, so it could be. The jaw mechanics stuff in the recent Joyce and Sterli papers hints at Gaffney having strongly relied on scenarios to code some of his characters.
Indeed (pers. obs.).
It's much more complicated than that.
We aren't even trying to get an accurate picture of the specimen. We're trying to get an accurate picture of its phylogenetic history.
Many characters are phylogenetically uninformative, so there's no need to take them into account. Many others are correlated to each other. Having x correlated characters in a phylogenetic data matrix amounts to having a single character with a weight of x – one character that gets counted x times. That can skew a tree beyond recognizability. Often, finding out which characters are correlated is easy, but there are many cases where sophisticated biomechanical and/or development genetics analyses are required to figure it out. There are some fairly big surprises there, and certainly most of those are yet to come.
But, yes, many characters that one would intuitively consider irrelevant (and that have in many cases been considered irrelevant for decades) carry phylogenetic signal, are not correlated to others, and should be taken into account.
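To illustrate the point about correlated characters acting as a single over-weighted character, here is a minimal sketch using Fitch parsimony for unordered characters on two four-taxon trees. The taxa, the characters, and the function names are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def fitch_steps(tree, states):
    """Fitch pass for one unordered character on a rooted binary tree.

    `tree` is a nested tuple of taxon names; `states` maps taxon -> state.
    Returns (candidate state set at this node, number of changes below it).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_steps = fitch_steps(tree[0], states)
    right_set, right_steps = fitch_steps(tree[1], states)
    common = left_set & right_set
    if common:
        return common, left_steps + right_steps
    # No shared state: one extra change is needed at this node.
    return left_set | right_set, left_steps + right_steps + 1


def tree_length(tree, characters):
    """Total parsimony length of a tree over a list of characters."""
    return sum(fitch_steps(tree, character)[1] for character in characters)


tree_ab = (("A", "B"), ("C", "D"))
tree_ac = (("A", "C"), ("B", "D"))

supports_ab = {"A": 1, "B": 1, "C": 0, "D": 0}
supports_ac = {"A": 1, "B": 0, "C": 1, "D": 0}

# Two independent characters favour (A,B); one favours (A,C).
independent = [supports_ab, supports_ab, supports_ac]
# Same data, but the (A,C) character is entered three times, as if three
# correlated characters had each been scored separately.
inflated = [supports_ab, supports_ab, supports_ac, supports_ac, supports_ac]

for name, chars in (("independent", independent), ("inflated", inflated)):
    print(name, tree_length(tree_ab, chars), tree_length(tree_ac, chars))
```

With the independent characters the (A,B) tree is shorter (4 steps against 5); once the single (A,C) character is counted three times, the (A,C) tree becomes the shorter one (7 steps against 8), even though no new evidence was added.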
Posted by: David Marjanović, OM | October 31, 2009 10:01 AM
...except... if you're doing a model-based analysis. Then you need them, too, because that's what the model is estimated from. The question becomes how many you need.
Concerning "brainfarts generally", one common case is when two (arbitrary but almost universally observed) conventions conflict: the one of calling the ancestral state 0 and the derived state 1, and the one of calling "absent" 0, "present" 1. This is something I need to watch closely in my own work, and I've found a couple of probable examples in published papers.
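The clash between the two conventions can be checked mechanically. What follows is a minimal sketch, assuming each presence/absence character is stored with its state labels and a hypothesized ancestral state; the data structure, the character names, and the function `conflicting_conventions` are all hypothetical.

```python
def conflicting_conventions(characters):
    """Flag presence/absence characters where the two conventions disagree.

    The conventions are 'ancestral = 0, derived = 1' and
    'absent = 0, present = 1'. They can both hold only when the ancestral
    state is 'absent'. For each conflicting character, report which label
    is currently coded as 0 so the coding can be checked by hand.
    """
    flagged = []
    for name, info in characters.items():
        labels = info["labels"]  # maps state number -> state label
        if set(labels.values()) != {"absent", "present"}:
            continue  # not a presence/absence character
        if info["ancestral"] == "present":
            flagged.append((name, labels[0]))
    return flagged


# Hypothetical character list.
characters = {
    "char_1": {"labels": {0: "absent", 1: "present"}, "ancestral": "absent"},
    "char_2": {"labels": {0: "absent", 1: "present"}, "ancestral": "present"},
    "char_3": {"labels": {0: "present", 1: "absent"}, "ancestral": "present"},
    "char_4": {"labels": {0: "short", 1: "long"}, "ancestral": "short"},
}

flagged = conflicting_conventions(characters)
print(flagged)
```

Here `char_2` follows the absent = 0 convention and `char_3` follows the ancestral = 0 convention; both get flagged because a reader assuming the other convention would misread them.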
Posted by: John S. Wilkins | November 3, 2009 3:27 PM
@5: I mean that the reason philosophers engage in philosophical interpretations of phylogeny, both good and bad, is that the biologists do it first. They insist on being philosophical, so we philosophers really have to come in and do it better (or worse)...
Besides, I was tweaking Paul.