While Section 5 has covered a lot of ground, this should not detract from, but rather reinforce, the realization that the cline of co-occurrence complexity, entropy, and spikes in multidimensional space all point to the same conclusion with regard to corpus data in cognitive/usage-based linguistics: raw one-dimensional frequencies/percentages are too crude a tool to take us the long way we still have to go towards understanding the cognitive and statistical properties of language acquisition, processing, use, and change. No one has summarized it better than Ellis & Ferreira-Junior (2009: 194):

Raw frequency of occurrence is less important than the contingency between cue and interpretation. Distinctiveness [in multidimensional space, STG] or reliability of form-function mapping is a driving force of all associative learning, [...] Contingency, and its associated aspects of predictive value, information gain, and statistical association, have been at the core of learning theory ever since.

Once we add to this perspective truly multidimensional approaches and new developments in distributional learning that can be applied to such informationally rich contexts (cf. Baayen's 2011 paper on a naïve discriminative learning approach inspired by the very same approach by Rescorla and Wagner that Ellis and Ferreira-Junior's work discusses), then we stand a chance of developing better theories for our data. Dumbing down our methods and/or ignoring various kinds of converging evidence, on the other hand, will not help.


