<aside> 📜
© 2026 Denis Jacob Machado. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
</aside>
The reconstruction of phylogenetic relationships from genomic data sits at the intersection of evolutionary biology, statistics, and computer science. Over the past three decades, the dominant analytical paradigm in phylogenetics has shifted from parsimony-based methods to probabilistic approaches — maximum likelihood (ML) and Bayesian inference (BI). This shift has been driven by the explosion of molecular sequence data, the development of increasingly sophisticated models of sequence evolution, and the availability of computational power that would have seemed fantastical to the systematists of the 1990s.
Yet, as Wheeler (2012, p. 269) reminds us, "There is no single answer to 'what is the best method?' in [the] same way there is no single answer to 'what is the best tree?' A criterion must be specified in both cases." The choice of a probabilistic framework is itself a hypothesis about how inference should proceed — and it is one that has been vigorously contested on theoretical, philosophical, and practical grounds.
This manuscript traces the arc of this debate, from the foundational methodological critique of Siddall & Kluge (1997) and its ontological deepening by Grant & Kluge (2004), through the theoretical framework laid out in Wheeler (2012), and into the contemporary landscape of phylogenomic inference. We examine the theoretical assumptions underlying probabilistic methods, assess their practical implementation at genomic scales, consider their computational demands, and evaluate the limitations that remain in interpreting their results. Our goal is to provide a conceptual guide for researchers navigating this complex methodological terrain.
In 1997, Siddall & Kluge published "Probabilism and Phylogenetic Inference" in Cladistics, a paper that remains one of the most philosophically forceful critiques of maximum likelihood in phylogenetics. Their central thesis was blunt: "The maximum likelihood approach to phylogenetics rests on frequency probability theory. This stands in stark contrast to the logical probability of corroboration-based cladistic parsimony" (Siddall & Kluge, 1997, p. 313).
The argument operated on several levels simultaneously. First, Siddall & Kluge insisted on a sharp distinction between frequency probability (the long-run behavior of repeatable events) and logical probability (the degree to which evidence supports a hypothesis). Phylogenetic events — speciation, character transformations along specific lineages — are historical singulars. They happened once, in a particular time and place. Treating them as draws from a parameterized probability distribution, as ML requires, conflates the unrepeatable particulars of history with the repeatable generalities of statistical estimation. As Siddall & Kluge argued, "History is particular and cannot be described in terms of universal statements about abstract generalities, the task of the historical sciences being one of explanation, not prediction" (Siddall & Kluge, 1997, p. 313).
Second, Siddall & Kluge raised the problem of model circularity. The models of sequence evolution used in ML (and later in Bayesian analysis) are themselves derived from phylogenetic inference. We infer trees, then use those trees to estimate model parameters, then use those parameters to infer new trees. Siddall & Kluge (1997) argued that this constitutes a logical fallacy: justifying a method with a theory derived from the results of that method.
Third, Siddall & Kluge attacked the independence assumptions inherent in ML. The likelihood of a dataset is computed as the product of individual site likelihoods, which is valid only if sites are independent and identically distributed (i.i.d.), that is, if each site evolves independently of the others and under the same process. Wheeler (2012, p. 216) acknowledges this directly: "An important assumption for the analysis of character data is that they are independent and identically distributed (i.i.d.). This allows the joint likelihood of several characters to be calculated as the product of their individual values. Certainly, for many character types, this is not reasonable." Siddall & Kluge argued that the requirements of frequency probability theory are "non-trivial and the maximum likelihood estimation of phylogeny can neither escape, nor satisfy the tenets of calculus independence (e.g. i.i.d.) inherent in the multiplicative relations of the method" (Siddall & Kluge, 1997, p. 313).
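To make the multiplicative structure concrete, the following is a minimal sketch of how a joint likelihood is assembled from per-site terms. It is an illustration only, not a method taken from Siddall & Kluge (1997) or Wheeler (2012). It assumes the Jukes-Cantor (JC69) substitution model, a two-sequence comparison, and a fixed distance `d`; the toy sequences and the function names `jc69_site_likelihood` and `log_likelihood` are hypothetical.

```python
import math


def jc69_site_likelihood(x: str, y: str, d: float) -> float:
    """Likelihood of observing the pair (x, y) at one site under JC69.

    d is the expected number of substitutions per site separating the
    two sequences. JC69 is an assumption made for this illustration.
    """
    decay = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 + 0.75 * decay  # P(y | x) when y == x
    p_diff = 0.25 - 0.25 * decay  # P(y | x) for one specific y != x
    # 0.25 is the stationary frequency of the starting base x under JC69.
    return 0.25 * (p_same if x == y else p_diff)


def log_likelihood(seq_a: str, seq_b: str, d: float) -> float:
    """Sum of per-site log-likelihoods.

    Summing logs is the same as multiplying site likelihoods, so this is
    the joint likelihood only if sites are treated as i.i.d.
    """
    return sum(
        math.log(jc69_site_likelihood(x, y, d)) for x, y in zip(seq_a, seq_b)
    )


# Hypothetical aligned sequences (10 sites, one difference).
seq_a = "ACGTACGTAC"
seq_b = "ACGTACGAAC"
d = 0.1  # assumed distance, not estimated from the data

site_likelihoods = [jc69_site_likelihood(x, y, d) for x, y in zip(seq_a, seq_b)]
joint_likelihood = math.prod(site_likelihoods)  # product over sites
joint_log_likelihood = log_likelihood(seq_a, seq_b, d)

# The two routes agree up to floating-point error.
print(joint_likelihood, math.exp(joint_log_likelihood))
```

The product on the `joint_likelihood` line is exactly the step the i.i.d. assumption licenses. If sites covary, for example paired positions in an RNA stem, that product is no longer the joint probability of the data, which is the point on which the critique turns.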
"An important assumption for the analysis of character data is that they are independent and identically distributed (i.i.d.). This allows the joint likelihood of several characters to be calculated as the product of their individual values. Certainly, for many character types, this is not reasonable."