The objective criterion for the evaluation of classifications
The next question is how to choose the optimal classification from this richness. In previous research, classifications were evaluated either against an a priori, manually created ‘gold standard’ (Schulte im Walde 2006) or intuitively a posteriori (Stefanowitsch and Gries 2010). In this study, we propose an entirely objective and data-driven criterion: the models are evaluated according to the power of the classes in predicting the use of doen or laten in the above-mentioned sample of 6,863 observations. The predictive power of a classification will be greater if the lexemes that belong to one and the same class also tend to be used in contexts with only one of the two auxiliaries. In other words, predictive power operationalises how successfully a classification discriminates between the lexemes that tend to be used with doen and those that are attracted to laten (cf. distinctive collexemes in the Distinctive Collexeme Analysis developed by Gries and Stefanowitsch 2004).
In this case study, we used several well-known statistical measures of predictive power: C, Somers’ Dxy, Nagelkerke’s R², Gamma and AIC (Hosmer and Lemeshow 2000; Baayen 2008). All statistical analyses were performed in R (R Development Core Team 2010). Since most of these measures displayed very similar behaviour, we limit the discussion of the results to the concordance index C, which is believed to be one of the most objective estimators (Hosmer and Lemeshow 2000: 160-164). This statistic usually falls in the range between 0.5 (random prediction) and 1 (perfect prediction). As a rule of thumb, if C < 0.7, discrimination is poor; if 0.7 ≤ C < 0.8, the prediction is acceptable; if 0.8 ≤ C < 0.9, the model has excellent discrimination; and if C ≥ 0.9, the prediction is outstanding. A good model combines a low number of classes with high predictive power. The results of our experiments are presented in the next section.
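To make the criterion concrete, the following minimal sketch shows how C could be computed for a single classification. It is an illustration in Python, not the R code used in this study, and it rests on two assumptions that are ours: the class label is the only predictor, so that the fitted probability of laten for an observation is simply the proportion of laten in its class, and ties between fitted probabilities count as half-concordant. The function names `concordance_index` and `somers_dxy` and the toy data are hypothetical.

```python
from collections import defaultdict


def concordance_index(classes, outcomes):
    """Concordance index C for a model whose only predictor is class membership.

    classes  -- one class label per observation (hypothetical labels)
    outcomes -- 1 if the observation contains laten, 0 if it contains doen
    """
    totals = defaultdict(int)     # observations per class
    positives = defaultdict(int)  # laten observations per class
    for label, y in zip(classes, outcomes):
        totals[label] += 1
        positives[label] += y

    n_pos = sum(positives.values())
    n_neg = sum(totals.values()) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both auxiliaries must occur in the sample")

    # Assumption: the fitted probability of laten equals its share in the class.
    prob = {label: positives[label] / totals[label] for label in totals}

    # Compare every (laten, doen) pair of observations, aggregated by class.
    score = 0.0
    for a in totals:
        for b in totals:
            pairs = positives[a] * (totals[b] - positives[b])
            if prob[a] > prob[b]:
                score += pairs          # concordant pairs
            elif prob[a] == prob[b]:
                score += 0.5 * pairs    # tied pairs count as one half
    return score / (n_pos * n_neg)


def somers_dxy(c):
    """Somers' Dxy as a rescaling of C to the range -1..1."""
    return 2 * (c - 0.5)


# Toy data: classes used with only one auxiliary give perfect discrimination,
# while a single undifferentiated class is no better than chance.
pure = concordance_index(["A", "A", "B", "B"], [1, 1, 0, 0])    # 1.0
lumped = concordance_index(["A", "A", "A", "A"], [1, 0, 1, 0])  # 0.5
```

The two toy cases mirror the reasoning above: the more consistently the lexemes of a class occur with one auxiliary, the closer C comes to 1.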