The Numerical Program

As was just noted, the numerical idea (“mathematism”), which constitutes the core of the numerical research program in systematics, is a version of the episto-rational idea and program. It took shape in the 20th century and is usually associated with the positivist philosophy of science. However, its foundation has a deep historical and natural-philosophical background, dating back to early Antiquity and developing throughout subsequent history. At its beginning, there was the Pythagorean aphorism “everything is Number,” which affirms the subordination of Nature to the laws of harmony of numbers (on this basis, a numerological program was formed in systematics at the beginning of the 19th century, see Section 2.4.4). Then Galilei, in the 17th century, relying on the “book” metaphor of Aurelius Augustine, announced that “The Book of Nature is written in the language of mathematics.” Following this, I. Kant at the end of the 18th century, in his work with a very characteristic title, Metaphysical Foundations of Natural Science, expressed one of the key ideas of the physicalist conception of modern natural science: “in any special doctrine of nature there can be only as much proper science as there is mathematics therein” (cited after [Kant 2009: 235; italics in the original]).

In early systematics, the first attempt to describe organisms “in the language of mathematics” was made by J. Jung in the 17th century [Jung (1662) 1747]. The next to express one of the main points of the numerical program was probably H. Strickland in the middle of the 19th century: based on the “taxonomic map” metaphor (see Section 2.4.1), he compared the similarity between groups of organisms to the distance between territories on a geographical map [Strickland 1841]. At the beginning of the 20th century, the quantitative methods of biometrics contributed to the initial development of the numerical program: they were used to solve elementary tasks of pairwise comparison of populations and to assess their similarities and differences numerically [Fisher 1925; Simpson and Roe 1939]. At the same time, the zoologist Evgeny Smirnov expressed the key idea of the program in question as follows: it is necessary “to establish those rules and laws that determine the mutual arrangement of the phenomena under study. Expression of these regularities in the form of mathematical formulas is the highest goal towards which the taxonomist strives” [Smirnov 1923: 359]. Smirnov called this systematics “exact” [Smirnov 1924, 1969]; somewhat later, its supporters gave it the epithet “numerical” and finally, quite bluntly, “mathematical” [Jardine and Sibson 1971; Dunn and Everitt 1982]. Assimilation of the positivist philosophy of science by systematics played a key role in this movement: it stimulated the development of the phenetic idea and, with it, the idea that similarity between organisms and taxa, judged by many characters at once, can and even must be estimated quantitatively [Cain and Harrison 1958; Cain 1959a; Sneath 1961; Sokal and Sneath 1963]. A purposeful development of the quantitative methods of elaborating classifications began in the late 1950s and early 1960s [Michener and Sokal 1957; Sokal and Sneath 1963; Williams and Dale 1965]. This marked the actual rise of numerical systematics, which is sometimes called taxometry or taxonometry [Rogers 1963; Williams and Dale 1965; Abbot et al. 1985; Jensen 2009].

Major Features

The ideological “core” of the numerical research program can be represented by two main positions. Firstly, the relationships (similarity, affinity, etc.) between organisms and their aggregates can and should be evaluated (measured) quantitatively. Secondly, the structure of the relations so measured can and should be translated into a classification by means of quantitative methods. Taken together, these positions make it possible to present the classification procedure in an algorithmic form, which makes it strictly analytical, “transparent” for verification, and reproducible. According to the idea going back to Kant (see above) and strengthened by positivist philosophy, this allowed the numerical program to claim to be the only one in biological systematics deserving the status of a scientific program.

It should be emphasized that, within the framework of the numerical program, it is not the object of taxonomic research itself that is considered, but rather the research method in its specific “numerical” meaning. This means that numerical systematics, like the entire episto-rational systematics, does not have a subject area of its own: it belongs to the epistemic component of the cognitive situation. However, unlike self-sufficient “logical” systematics, it functions as a kind of “supplement” to ontic-based research programs (phylogenetic, biomorphic, phenetic, etc.). In this capacity, if it touches upon certain issues concerning the object itself, it does so only to “adapt” the object to the needs of the numerical program, that is, to make it suitable for the application of quantitative methods. Due to this, numerical systematics, together with classification phenetics, promotes the creation of a rather specific Umwelt, in which there is no living nature at all, but only various kinds of abstractions: formalized representations of both organisms and their characters, and formalized assessments of the relations (affinity). Accordingly, a specific language for describing such an Umwelt is developed: for example, there are no populations or species in it but the above-mentioned OTUs, to which various technical means of description and comparison are applied.

One of the most important abstractions of this kind, which became very widespread both in systematics and outside it (for example, in community ecology and biogeography), is the geometric interpretation of similarity relations (see next section for details). It is based on the phenetic idea of a phenetic hyperspace whose axes are the characters and in which the OTUs are distributed. “Numerists” successfully applied Euclid’s theorem of the right-angled triangle (the Pythagorean theorem) to this hyperspace and thus obtained a simple way to calculate the pairwise phenetic distances between these OTUs as analogs of real geographic distances. Taken together, these numerically expressed distances compose the distance matrix, to which quantitative methods are applied to transform the initial phenetic hyperspace filled with OTUs into some other distribution models, one of which is the desired classification.
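The geometric reading of similarity can be illustrated with a minimal sketch. It assumes three hypothetical OTUs scored on two hypothetical quantitative characters; the names `otus`, `euclidean`, and `distance_matrix` are introduced here for illustration only and do not come from any of the cited works. Each OTU is treated as a point in character space, and the pairwise distance is obtained from the Pythagorean theorem generalized to any number of axes.

```python
import math

# Hypothetical data: each OTU is a point whose coordinates are character values.
otus = {
    "A": (0.0, 0.0),
    "B": (3.0, 4.0),
    "C": (6.0, 8.0),
}

def euclidean(p, q):
    """Pythagorean distance between two points with the same number of axes."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

def distance_matrix(points):
    """Return a dict mapping every ordered pair of OTU names to their distance."""
    names = sorted(points)
    return {(a, b): euclidean(points[a], points[b]) for a in names for b in names}

d = distance_matrix(otus)
for (a, b), value in sorted(d.items()):
    print(a, b, value)
```

In this toy case the distance between A and B is 5.0 and between A and C is 10.0. A matrix of this kind is the usual input to the clustering or ordination step that yields a classification; which step is appropriate is not fixed by the sketch.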

As in scholastic systematics several hundred years ago, the main emphasis of the numerical program is transferred to the classification method as such, so that its main problem and task become the justification of this method in a specific “numerical” manner. On a rigorous understanding of the “systematic philosophy” under consideration, it is argued that numerical methods should be deduced from certain well-formulated mathematical theories [Jardine and Sibson 1971; Dunn and Everitt 1982; Semple and Steel 2003]. Accordingly, the consistency of a method thus substantiated yields the consistency of a classification elaborated with it. However, many of the numerical methods, borrowed from biometrics or developed by numerists, though intuitively understandable and therefore quite popular, turn out to lack a serious mathematical background and are sometimes criticized for this reason [Williams and Dale 1965].

Several methodologies are distinguished within numerical systematics, and they differ in some key assumptions. According to the content of the background knowledge, numerical phenetics and numerical phyletics are separated: no causes of taxonomic diversity are considered in the former, which makes it theory-neutral from an ontic perspective, while phylogeny is considered as such a cause in the latter, which makes its ontology burdened with metaphysics. At the level of analysis of characters, the proper numerical taxonomy [Sneath and Sokal 1963; Sokal and Sneath 1963], taximetry [Abbot et al. 1985], and taxonomic analysis [Smirnov 1969] can be distinguished: in the first, characters are generally introduced with equal weights, whereas the other two presume their differential weighting; Smirnov’s analysis includes elements of typology.

Algorithms and particular methods of numerical systematics are quite numerous, and they differ regarding certain principles of transition from raw data to the classifications. This variety is unavoidable, as it is largely due to the above-noted arbitrary nature and therefore potential multiplicity of the initial axiomatic systems used for substantiation of particular methods. Each of them may be good in itself within the framework of its respective axiomatics, but they are fundamentally irreducible to each other or to some general “supermethod” to the extent that they are based on different axiomatic systems. Therefore, generally speaking, there is no single numerical classification method that is equally applicable in all taxonomic research [Sneath 1995].

The areas of correct application of the numerical methods in systematics depend on two main factors: (a) to what extent the features of the organisms can be formalized by unit characters and (b) to what extent these characters are comparable (homologous) in different organisms. For obvious reasons, the effectiveness of the methods decreases with increasing complexity of organisms and the degree of differences between them. Therefore, the problems and technical difficulties caused by these factors are most relevant in the case of complex morphological macrostructures and are minimal when working with biochemical (molecular) characters. For the same reason, it is easier to compare species of the same genus than different orders of the same class. Both the universality and simplicity of molecular structures make numerical methods applicable almost equally effectively at all taxonomic levels up to the highest; due to this, the contemporary reconstruction of the global “Tree of Life” by means of numerical phyletics becomes possible [Cracraft and Donoghue 2004].

Many numerical methods allow study of the structure of both taxonomic and partonomic aspects of taxonomic reality; that is, they compare both taxa (organisms) and their characters on the same methodological basis. The areas of their application can be represented as a result of the decomposition of general phenetic hyperspace into the I- (organisms) and A- (characters) subspaces [Williams and Dale 1965]; they correspond to taxonomic and partonomic aspects of the taxonomic reality (see Section 4.2.1), respectively. Technical means for them are denoted as Q- and R-analyses [Sneath and Sokal 1973], respectively. Q-analysis involves a comparison of the OTUs by their characters and estimation of similarity relationships between them; R-analysis examines interrelations between characters as a degree of coincidence of their distributions over the set of OTUs.
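The difference between the two analyses can be shown on one small data matrix. The sketch below assumes hypothetical binary data (rows are OTUs, columns are characters) and uses the simple matching coefficient as the similarity measure; that choice, like the names `simple_matching` and `pairwise`, is an assumption made for illustration and not a method prescribed by the sources cited above. Q-analysis compares the rows, while R-analysis compares the columns of the same matrix.

```python
def simple_matching(u, v):
    """Share of positions at which two equal-length 0/1 sequences agree."""
    return sum(a == b for a, b in zip(u, v)) / len(u)

def pairwise(rows):
    """Square matrix of simple matching coefficients between all rows."""
    n = len(rows)
    return [[simple_matching(rows[i], rows[j]) for j in range(n)] for i in range(n)]

# Hypothetical data matrix: 3 OTUs (rows) scored on 4 binary characters (columns).
data = [
    [1, 1, 0, 0],  # OTU 1
    [1, 1, 0, 1],  # OTU 2
    [0, 0, 1, 1],  # OTU 3
]

# Q-analysis: similarity between OTUs, compared by their characters.
q_matrix = pairwise(data)

# R-analysis: similarity between characters, compared by their
# distribution over the OTUs (the same matrix, transposed).
r_matrix = pairwise([list(col) for col in zip(*data)])

print(q_matrix)
print(r_matrix)
```

Here the Q-matrix is 3 by 3 and the R-matrix is 4 by 4. The first two characters are distributed identically over the OTUs, so their R-similarity is 1.0, while OTUs 1 and 3 share no character states and have a Q-similarity of 0.0.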

With some reservations, the tasks of numerical systematics may be taken to include not only the elaboration of classifications but also the exploration of their structure as characterized by certain quantitative parameters. These tasks belong to the field of comparative systematics. Among them, of great interest is the analysis of the already-mentioned Zipf-Mandelbrot rank distribution, which describes an inverse relationship between the number and size of taxa [Fairthorne 1969; Orlov 1976]. Comparing different classifications by means of quantitative methods is another important task [Rohlf and Sokal 1981]. Besides, it is possible to study numerically the distribution of characters on the classification trees.

Apparently, the further development of the numerical program in systematics will be associated with a more active development of the sufficiently flexible probabilistic approaches (such as Bayesian), including those based on fuzzy logic [Amo et al. 1999; Scherer 2012]. This will make numerical methods more suitable for the research logic of biologists.
