Metric and Corrections for Bias
We calculated evolutionary distinctiveness using a topology-based metric, the Ws index of Posadas et al. (2001), which is derived from the Taxonomic Distinctness index conceived by Vane-Wright et al. (1991). We chose this metric for three reasons. (1) It assigns higher values to species with fewer and more distant relatives than to species with more and closer relatives, which allows better identification of areas with more phylogenetically divergent species (Redding et al. 2008). (2) Because it is topology-based, it is designed for combining phylogenetic information from different cladograms, independently of the kind of characters (morphological, molecular, etc.) or the reconstruction method. We were therefore able to integrate data from phylogenies of taxa as different as plants, reptiles, molluscs and arthropods to study the evolutionary distinctiveness of different areas in New Caledonia.
(3) Each phylogeny contributes the same amount of information, independently of the number of species it contains, as the Ws values for the species in any given phylogeny sum to one.
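The properties listed above can be illustrated with a minimal sketch. It assumes that Ws is obtained from the node-counting weights of Vane-Wright et al. (1991), rescaled so that they sum to one within a phylogeny; the exact formulation should be checked against Posadas et al. (2001). The nested-tuple tree encoding and the function names are hypothetical and serve only this illustration.

```python
def node_counts(tree, depth=0, out=None):
    """Map each terminal to the number of internal nodes between it and the root.

    A tree is a nested tuple of species names, e.g. ("A", ("B", ("C", "D"))).
    The root must be a tuple (a bare string is not a valid tree here).
    """
    if out is None:
        out = {}
    if isinstance(tree, str):
        out[tree] = depth
    else:
        for child in tree:
            node_counts(child, depth + 1, out)
    return out


def standardised_weights(tree):
    """Assumed Ws: weights proportional to 1 / node count, rescaled to sum to one."""
    counts = node_counts(tree)
    inverse = {species: 1.0 / n for species, n in counts.items()}
    total = sum(inverse.values())
    return {species: w / total for species, w in inverse.items()}


# Hypothetical four-species cladogram: A is sister to all other species.
example_tree = ("A", ("B", ("C", "D")))
ws = standardised_weights(example_tree)
```

In this toy cladogram, species A, which has the fewest and most distant relatives, receives the largest share, and the values sum to one regardless of how many species the tree contains.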
The traditional procedure is to sum the Ws of all species present in each area and to rank areas according to this sum (Posadas et al. 2001; Lehman 2006; McGoogan et al. 2007; López-Osorio and Miranda Esquivel 2010). However, this practice has three shortcomings. First, it often leads to strong correlations with species richness (see López-Osorio and Miranda Esquivel 2010), which can mask important evolutionary divergence in sites with fewer species or fewer phylogenies. Secondly, as Ws is bounded between 0 and 1 for a given phylogeny, it is sensitive to the number of species sampled in each phylogeny. Although this number is partly driven by species richness, it is also affected by the scope of the study selected by the investigator (e.g. family level or genus level). Thus the wider the phylogenetic breadth of a study (the more species included), the lower the overall maximum value for any one species. Thirdly, in the absence of exhaustive location-based sampling, the data available on the evolutionary diversity of a given site will simply reflect the taxa that happen to have been sampled for individual research projects. If this bias is not corrected for, the phylogenetic content will be hard to discern, as the number of phylogenies and the number of species in each site might drive the result.
To address these shortcomings, we designed a method to highlight sites containing the most divergent taxa from each of the phylogenies. We first calculated Ws for each species in each phylogeny and ordered the species from the highest to the lowest Ws value. We then awarded “points” to the most divergent species in each phylogeny and compared the resulting scores among sites. As we were interested in the 'front-runners' from each phylogeny, we initially took the top three species, i.e. the most 'basal' species from each phylogeny, assigning them scores of 1 (most basal), 0.67 (second place) and 0.33 (third place). However, we subsequently restricted this to scores for first and second place (1 and 0.67) to emphasise the most divergent species. In the case of ties for the most divergent species, the total score of 1.67 was divided by the number of species that tied. Where there was a unique first place but ties for second place, the 'second prize' of 0.67 was shared among the species that tied. The scores were then summed across all phylogenies at each site.
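The scoring rules described above, including the handling of ties, can be sketched as follows. This is an illustration under stated assumptions rather than the code used for the analysis: the data structures, the function names and the example taxa are hypothetical, and ties are detected with a floating-point tolerance.

```python
import math
from collections import defaultdict

FIRST_PRIZE = 1.0
SECOND_PRIZE = 0.67


def award_points(ws):
    """Return {species: points} for one phylogeny, given {species: Ws}."""
    ranked = sorted(ws.items(), key=lambda item: item[1], reverse=True)
    top_value = ranked[0][1]
    first = [sp for sp, w in ranked if math.isclose(w, top_value)]

    # Tie for first place: both prizes (1.67) are split among the tied species.
    if len(first) > 1:
        share = (FIRST_PRIZE + SECOND_PRIZE) / len(first)
        return {sp: share for sp in first}

    # Unique first place; the second prize is shared among any tied runners-up.
    points = {first[0]: FIRST_PRIZE}
    rest = ranked[1:]
    if rest:
        runner_up_value = rest[0][1]
        second = [sp for sp, w in rest if math.isclose(w, runner_up_value)]
        for sp in second:
            points[sp] = SECOND_PRIZE / len(second)
    return points


def site_scores(phylogenies, occurrences):
    """Sum the points of all phylogenies for every site.

    phylogenies: {phylogeny name: {species: Ws}}
    occurrences: {species: set of site names}
    """
    totals = defaultdict(float)
    for ws in phylogenies.values():
        for species, pts in award_points(ws).items():
            for site in occurrences.get(species, ()):
                totals[site] += pts
    return dict(totals)


# Hypothetical data: one phylogeny with a clear ranking, one with a tie for first.
phylogenies = {
    "geckos": {"g1": 0.5, "g2": 0.3, "g3": 0.2},
    "snails": {"s1": 0.4, "s2": 0.4, "s3": 0.2},
}
occurrences = {
    "g1": {"north"}, "g2": {"south"},
    "s1": {"north"}, "s2": {"south"}, "s3": {"south"},
}
scores = site_scores(phylogenies, occurrences)
```

Each phylogeny distributes the same total of 1.67 points whatever its size, provided it contains at least two species.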
This method ensures that each phylogeny contributes an equal total score, so that we are simply assessing, in each case, where the most divergent species occur. The downside of using only the first- and second-ranked species is that it discards information from all of the other species in each data set. To accommodate this, we also continue to report the (more conventional) sum of Ws values, also standardised by the number of phylogenies present at a given site.
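The standardised sum of Ws can be sketched in the same way. The sketch assumes that a phylogeny counts as "present" at a site when at least one of its species occurs there, and that standardisation means dividing the summed Ws by that count; both are our reading of the description above, and all names and data are hypothetical.

```python
def standardised_ws_sum(phylogenies, occurrences, site):
    """Sum of Ws at a site divided by the number of phylogenies present there.

    phylogenies: {phylogeny name: {species: Ws}}
    occurrences: {species: set of site names}
    """
    total = 0.0
    n_present = 0
    for ws in phylogenies.values():
        present = [w for sp, w in ws.items() if site in occurrences.get(sp, ())]
        if present:  # assumed meaning of "phylogeny present at the site"
            total += sum(present)
            n_present += 1
    return total / n_present if n_present else 0.0


# Hypothetical data for illustration only.
phylogenies = {
    "geckos": {"g1": 0.5, "g2": 0.3, "g3": 0.2},
    "snails": {"s1": 0.4, "s2": 0.4, "s3": 0.2},
}
occurrences = {
    "g1": {"north"}, "g2": {"south"}, "g3": {"south"},
    "s1": {"north"}, "s2": {"south"}, "s3": {"south", "east"},
}
by_site = {s: standardised_ws_sum(phylogenies, occurrences, s)
           for s in ("north", "south", "east")}
```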
Our data set is constrained by the number of phylogenies that were available. To assess whether our findings are sensitive to the composition of our sample of phylogenies, we designed two tests. First, we assessed the changes associated with the exclusion of a single phylogeny (single drops, i.e. jackknifing), to see whether the findings are driven by a single influential phylogeny. Secondly, we undertook a resampling (or rarefaction) procedure: we defined subsets of 1, 2, 3… 15 phylogenies in a site and then calculated the mean and standard deviation of the site's scores over all possible combinations of the phylogenies with species occurring in it. This was to establish whether the results are stable with respect to the number of phylogenies available.
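Both sensitivity tests can be sketched for a single site as follows. The input is assumed to be the points each phylogeny contributes to that site, restricted to phylogenies with species occurring in it; the function names and values are hypothetical. The population standard deviation is used because every combination is enumerated, which is our assumption, as the description above does not specify the estimator.

```python
from itertools import combinations
from statistics import mean, pstdev


def jackknife(per_phylogeny):
    """Site score after dropping each phylogeny in turn (single drops)."""
    total = sum(per_phylogeny.values())
    return {name: total - pts for name, pts in per_phylogeny.items()}


def subset_summary(per_phylogeny, max_k=None):
    """Mean and SD of the site score over all subsets of k phylogenies.

    Returns {k: (mean, sd)} for k = 1 .. min(max_k, number of phylogenies).
    """
    values = list(per_phylogeny.values())
    limit = len(values) if max_k is None else min(max_k, len(values))
    summary = {}
    for k in range(1, limit + 1):
        sums = [sum(combo) for combo in combinations(values, k)]
        summary[k] = (mean(sums), pstdev(sums))
    return summary


# Hypothetical contributions of three phylogenies to one site.
per_phylogeny = {"geckos": 1.0, "snails": 0.835, "beetles": 0.0}
dropped = jackknife(per_phylogeny)
summary = subset_summary(per_phylogeny)
```

The number of combinations grows quickly with the number of phylogenies, so for much larger samples random subsampling would be a practical substitute for full enumeration.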