The perceived lack of semantics and discriminatory power
A number of empirical studies support CA and undermine Bybee's arguments, yet she does not appear to have engaged with them. This is especially true of Gries, Hampe & Schönefeld (2005), even though it appears in her list of references. As mentioned above, Gries, Hampe & Schönefeld (2005) studied the as-predicative by means of a CA. They then ran a factorial sentence-completion experiment in which subjects were presented with sentence fragments ending in one of a set of verbs. These verbs came from eight groups that resulted from all combinations of three independent binary variables: COLLSTR (high vs. low), FREQCX (high vs. low), and VOICE (the voice of the sentence fragment: active vs. passive); a second re-analysis of the data also included FAITH (p(construction|verb)) as a covariate. ANOVAs of both analyses revealed highly significant effects of COLLSTR (also with the highest effect size) and insignificant and very weak effects of FREQCX. A follow-up study, Gries, Hampe & Schönefeld (2010, first presented 2004 and available online since 2006), revisited the as-predicative with a self-paced reading time study. Subjects' reading times on words after as were measured to determine whether the (dis)preference of a verb for the as-predicative would speed up/slow down reading processes when an as-predicative is encountered or not. While the result for COLLSTR very narrowly missed standard levels of significance (p = 0.0672, effect size = 0.014), it would have been significant in a justifiable one-tailed test, whereas FREQCX yielded insignificant/weak results (p = 0.293, effect size = 0.005).
Bybee also ignores other studies that, while not primarily devoted to similar comparisons, still speak to the issue:
- Gries & Wulff (2005, 2009) find strong correlations between collostruction strengths and experimentally-obtained sentence completions from advanced L2 learners of English;
- Ellis & Ferreira-Junior (2009) find that frequency of learner uptake is predicted by frequency of occurrence, but more so by pFYE and ΔP;
- both Gries (2005) and Szmrecsanyi (2006) find strong correlations between verbs' collostruction strengths and priming effects observed in different corpora and for different constructions.
In sum, Bybee systematically chooses to not mention results of even a single study with experimental and/or corpus-based data running counter to her claims, but even a cursory glance at the literature shows that the picture is the opposite of the one she painted or, at least, much more complicated.
Bybee's final point of critique regarding low-frequency collexemes is only too easy to counter. No one ever said low-frequency collexemes should be ignored or cannot be revealing. A CA is based on the very fact that all collexemes are included: the fact that most studies have focused on the top collexemes that are functionally most revealing does not mean weakly-attracted or repelled collexemes should not be studied, and the software that most CAs have used offers estimates of collostruction strengths for unattested words.
The absence of cognitive mechanisms underlying CA
Similarly straightforward to refute is the implication that CA does not come with a cognitive account of the data. First, given the strong (experimental and otherwise) support of collostruction strength in many studies that all adopt a cognitive-linguistic/usage-based framework, it is surprising there should be a special need for a cognitive underpinning in addition to what all these studies are based on anyway.
Second, the earliest studies make it very clear what their cognitive underpinning is. In Section 2.3 above, I already provided several quotes (from the studies Bybee refers to) to illustrate the CA position: Ultimately, collostruction strengths are based on (i) the conditional probabilities p(word|construction) and p(construction|word), which are related to notions of cue validity, cue reliability (cf. Goldberg 2006: Ch. 5-6 and Stefanowitsch to appear), associative learning measures such as ΔP, and prototype formation, and (ii) the frequencies that give rise to the probabilities, which are correlated with entrenchment. Put yet another way: "it is assumed [...] that the statistical associations found in the data are reflected in psychological associations in the mind of the language user" (Stefanowitsch 2006: 258).
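To make the relation between these quantities concrete, the following minimal sketch computes them from a single 2×2 co-occurrence table. It is an illustration under stated assumptions, not the implementation used in any of the studies cited: the function name, the dictionary keys, and the counts are hypothetical, the Fisher-Yates exact test is computed one-tailed (attraction only), and collostruction strength is expressed as the negative base-10 logarithm of that p-value, which is a common convention but is assumed here.

```python
from math import comb, log10


def association_measures(a: int, b: int, c: int, d: int) -> dict:
    """Association measures for a 2x2 word-by-construction table.

    a: word in the construction         b: word in other constructions
    c: other words in the construction  d: other words in other constructions
    Assumes all marginal totals are non-zero (no zero-division handling).
    """
    n = a + b + c + d
    word_total = a + b   # overall frequency of the word
    cx_total = a + c     # overall frequency of the construction

    # One-tailed Fisher-Yates exact p-value: probability of observing a or
    # more co-occurrences given the marginals (hypergeometric distribution).
    p_fye = sum(
        comb(word_total, k) * comb(n - word_total, cx_total - k)
        for k in range(a, min(word_total, cx_total) + 1)
    ) / comb(n, cx_total)

    return {
        "p(word|cx)": a / cx_total,
        "p(cx|word)": a / word_total,
        # Delta P with the word as cue and the construction as outcome
        "deltaP(cx|word)": a / word_total - c / (c + d),
        # Delta P with the construction as cue and the word as outcome
        "deltaP(word|cx)": a / cx_total - b / (b + d),
        "p_FYE": p_fye,
        # Assumed convention: strength as -log10 of the exact p-value
        "collstr": -log10(p_fye),
    }


# Hypothetical counts: a verb occurring 100 times, 40 of them in a
# construction that occurs 200 times in a 10,000-token sample.
measures = association_measures(a=40, b=60, c=160, d=9740)
for name, value in measures.items():
    print(f"{name}: {value:.4g}")
```

All six values derive from the same four cell frequencies, which is the sense in which both the conditional probabilities and the frequencies that give rise to them enter into collostruction strength.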
Note: A one-tailed test would have been justifiable for the COLLSTR reading-time result because the expectation was that high collostruction strengths to the as-predicative would result not just in different reading times, but in faster ones.
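The arithmetic behind this point can be sketched as follows. The sketch assumes a symmetric test statistic (for instance a t-test, or an F-test with one numerator degree of freedom), for which the one-tailed p-value is half the two-tailed one when the effect lies in the predicted direction. The function name and the 0.05 threshold are assumptions for illustration; the two p-values are the ones reported above.

```python
def one_tailed_p(two_tailed_p: float, in_predicted_direction: bool = True) -> float:
    """Convert a two-tailed p-value from a symmetric test to a one-tailed one."""
    half = two_tailed_p / 2
    # If the effect runs against the prediction, the one-tailed p is 1 - half.
    return half if in_predicted_direction else 1 - half


ALPHA = 0.05  # assumed conventional significance threshold

p_collstr = one_tailed_p(0.0672)  # COLLSTR, two-tailed p as reported
p_freqcx = one_tailed_p(0.293)    # FREQCX, two-tailed p as reported

print(f"COLLSTR one-tailed p = {p_collstr:.4f}, significant: {p_collstr < ALPHA}")
print(f"FREQCX  one-tailed p = {p_freqcx:.4f}, significant: {p_freqcx < ALPHA}")
```

Under these assumptions, COLLSTR falls below the threshold, while FREQCX stays above it even if its effect is granted to lie in a predicted direction.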