Print publication date: 2006

# The logic of the data matrix in phylogenetic analysis

Chapter 4
Parsimony, Phylogeny, and Genomics
Oxford University Press
The process of phylogenetic analysis inherently consists of two phases. First, a data matrix is assembled; then a phylogenetic tree is inferred from that matrix. There is obviously some feedback between these two phases, yet they remain logically distinct parts of the overall process. One could easily argue that the first phase is the more important of the two: the tree is basically just a re-representation of the data matrix with no value added. This is especially true from a parsimony viewpoint, which aims to maintain an isomorphism between a data matrix and a cladogram. Paradoxically, despite the logical preeminence of data matrix construction, by far the greatest effort in phylogenetic theory has been directed at the second phase, the question of how to turn a data matrix into a tree. This chapter deals with logical issues involving the elements of the data matrix in light of the nested and interrelated nature of terminal units (‘twigs’ of the tree) and characters. It is argued that if care is taken to construct an appropriate data matrix to address a particular question of relationships at a given level, then simple parsimony analysis is all that is needed to transform that matrix into a tree. Debates over more complicated models for tree-building may then be seen for what they are: attempts to compensate for marginal data.
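The two phases can be made concrete with a small sketch. The example below is an illustration only, not taken from the chapter: the four terminals, the four binary characters, and the function names `fitch_changes` and `tree_length` are all hypothetical. It scores each candidate tree by the minimum number of character-state changes it requires, using Fitch's small-parsimony procedure for unordered characters, and treats the shortest tree as the parsimony result.

```python
# Phase 1: a hypothetical toy data matrix.
# Rows are terminals; each column is one binary character (0 = absent, 1 = present).
MATRIX = {
    "A": "0000",  # assumed outgroup
    "B": "1001",
    "C": "1111",
    "D": "1110",
}

def fitch_changes(node, states):
    """Return (state set, minimum change count) for one character on a rooted binary tree.

    A tree is a nested 2-tuple; a terminal is its name as a string.
    """
    if isinstance(node, str):
        return {states[node]}, 0
    left_set, left_cost = fitch_changes(node[0], states)
    right_set, right_cost = fitch_changes(node[1], states)
    common = left_set & right_set
    if common:
        # The children agree on at least one state: no extra change here.
        return common, left_cost + right_cost
    # The children disagree: one change is required at this node.
    return left_set | right_set, left_cost + right_cost + 1

def tree_length(tree, matrix):
    """Sum the Fitch change counts over every column of the matrix."""
    n_chars = len(next(iter(matrix.values())))
    total = 0
    for i in range(n_chars):
        column = {taxon: row[i] for taxon, row in matrix.items()}
        total += fitch_changes(tree, column)[1]
    return total

# Phase 2: the three possible rooted trees with A as outgroup.
CANDIDATES = [
    ("A", ("B", ("C", "D"))),
    ("A", ("C", ("B", "D"))),
    ("A", ("D", ("B", "C"))),
]

lengths = {tree: tree_length(tree, MATRIX) for tree in CANDIDATES}
best = min(lengths, key=lengths.get)
```

With this matrix, the tree grouping C with D is the shortest because two characters support that group while only one conflicts with it. The tree adds nothing that is not already in the columns of the matrix, which is the sense in which it re-represents the data.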

