
Clarifications, repudiations, and responses

This section addresses Bybee's points of critique and other issues. I will show that Bybee's understanding, representation, and discussion of CA do not do the method justice, but the discussion will also bring together a few crucial notions, perspectives, and findings that are relevant to cognitive/usage-based linguists, irrespective of whether they work with CA or not.

Perspective 1: CA and its goals

There are three main arguments against this part of Bybee's critique. The first is very plain: As cited above, Stefanowitsch & Gries (2003: 217) explicitly state that any AM can be used: one based on a significance test (pFYE, chi-square, t, ...), one based on some other comparison of observed and expected frequencies (MI, MI2, ...), an effect size (Cramér's V/φ, log odds, ...), or some other measure (MinSem, ΔP, ...). For example, Gries (2011, available online since 2006) uses the odds ratio to compare data from corpus parts of different sizes. Any criticism of CA on these grounds misses its target.
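To make this range of options concrete, the following sketch computes a few such measures for a single 2x2 co-occurrence table. It is an illustration under stated assumptions, not the implementation used in any of the studies cited: the function name, the cell labels, and the counts are hypothetical, no cell may be zero, and the Fisher-Yates exact p-value is computed one-tailed (attraction only) from the hypergeometric distribution.

```python
from math import comb, log, log2


def association_measures(a, b, c, d):
    """Several AMs for one 2x2 table (all names and counts are illustrative).

    a: word w in the construction      b: word w elsewhere
    c: other words in the construction d: other words elsewhere
    Assumes that no cell is zero.
    """
    n = a + b + c + d
    w_total = a + b        # overall frequency of the word
    cx_total = a + c       # overall frequency of the construction
    expected = w_total * cx_total / n

    # One-tailed Fisher-Yates exact p: P(X >= a) under the hypergeometric
    # distribution with the marginal totals held fixed.
    upper = min(w_total, cx_total)
    tail = sum(
        comb(w_total, k) * comb(n - w_total, cx_total - k)
        for k in range(a, upper + 1)
    )
    p_fye = tail / comb(n, cx_total)

    return {
        "expected": expected,
        "p_fye": p_fye,
        # Pointwise MI: log2 of observed over expected frequency
        "mi": log2(a / expected),
        # Natural log of the odds ratio
        "log_odds": log((a * d) / (b * c)),
        # Delta P in both directions
        "delta_p_cx_given_w": a / w_total - c / (c + d),
        "delta_p_w_given_cx": a / cx_total - b / (b + d),
    }


# Hypothetical counts, chosen only for illustration.
measures = association_measures(a=15, b=35, c=85, d=865)
for name, value in measures.items():
    print(f"{name}: {value:.4g}")
```

All of these measures are derived from the same four cells; they differ only in how they relate the observed frequency to the rest of the table, which is why the choice of AM is a parameter of the method rather than part of its definition.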

A second, more general counterargument is that the whole point of AMs is to separate the wheat (frequent co-occurrence probably reflecting linguistically relevant functional patterns) from the chaff (co-occurrence at chance level revealing little to nothing functionally interesting). Consider an example on the level of lexical co-occurrence: Anyone who insisted on using raw frequencies in context alone would have to emphasize that most nouns co-occur with 'the' very frequently and that whatever makes 'the' occur in corpora is precisely the factor that makes it frequent around nouns. I do not find this particularly illuminating. As a more pertinent example, Bybee's logic would force us to say that the as-predicative, exemplified in (4) and discussed by Gries, Hampe & Schönefeld (2005), is most importantly characterized not by regard (the verb with the highest collostruction strength), but by see and describe, which occur more often in the as-predicative than regard (and maybe by know, which occurs nearly as often in the as-predicative as regard). Given the semantics of the as-predicative and the constructional promiscuity and semantic flexibility of especially see and know, this is an unintuitive result; cf. also below.

(4) a. V NP[Direct Object] as complement constituent

b. I never saw myself as a costume designer

c. Politicians regard themselves as being closer to actors

It is worth pointing out that the argument against 'testing against the null hypothesis of chance co-occurrence' is somewhat moot anyway. No researcher I know believes words occur in corpora randomly, just as no researcher analyzing experimental data believes subjects' responses are random. Of course they don't and aren't: if they did, what would be the point of any statistical analysis, with AMs or frequencies? With all due recognition of the criticisms of the null hypothesis significance testing paradigm, this framework has been, and will be for the foreseeable future, the predominant way of studying quantitative data; this does not mean the null hypothesis of chance distribution is always a serious contender. Plus, even if null hypothesis testing were abandoned, this would still not constitute an argument against AMs, because there are AMs not based on null hypothesis frequencies, and the most promising of these, ΔP, is in fact extremely strongly correlated with pFYE. Lastly, regardless of which AM is used to downgrade words that are frequent everywhere, all of them recognize it is useful to consider not just the raw observed frequency of word w in context c but also the wider range of w's uses. That is, users of AMs do not argue that the observed frequency of w in c is unimportant; they argue that it is important, as is w's behavior elsewhere. It is surprising that this position could even be criticized from a usage-/exemplar-based perspective, something to which I will return below.

The final counterargument is even more straightforward: Recall that CA involves a normalization of frequencies against corpus size (for CA) or constructional frequencies (for DCA). But sometimes one has to compare two or more constructions, as in Gries & Wulff (2009), who study to/ing-complementation (e.g., he began to smoke vs. he began smoking). They find that consider occurs 15 times in both constructions. Does that mean that consider is equally important to both? Of course not: the to-construction is six times as frequent as the ing-construction, which makes it important that consider 'managed to squeeze itself' into the far less frequent ing-construction as often as into the far more frequent to-construction. An account based on frequencies alone could miss that obvious fact; CA or other approaches perspectivizing the observed frequencies of w in c against those of w and/or c do not.
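The arithmetic behind this point can be sketched in a few lines. Only two figures below come from the discussion above: the 15 occurrences of consider in each construction and the 6:1 ratio between the constructions. The absolute construction totals are hypothetical placeholders chosen to respect that ratio, and the variable names are mine; the odds ratio merely stands in for whatever AM one prefers.

```python
# From the text: 'consider' occurs 15 times in each construction, and the
# to-construction is six times as frequent as the ing-construction.
consider_to = 15
consider_ing = 15

# Hypothetical totals; only their 6:1 ratio is taken from the text.
ing_total = 1_000
to_total = 6 * ing_total

# Raw frequencies are identical, but the shares of each construction
# that 'consider' accounts for are not.
share_to = consider_to / to_total
share_ing = consider_ing / ing_total

# 2x2 table: 'consider' vs. all other verbs, ing- vs. to-construction.
other_to = to_total - consider_to
other_ing = ing_total - consider_ing
odds_ratio_ing = (consider_ing * other_to) / (consider_to * other_ing)

print(f"raw frequencies equal: {consider_to == consider_ing}")
print(f"share of to-construction:  {share_to:.4f}")
print(f"share of ing-construction: {share_ing:.4f}")
print(f"odds ratio in favor of ing: {odds_ratio_ing:.2f}")
```

Under these assumptions, consider accounts for a share of the ing-construction roughly six times as large as its share of the to-construction, a difference that the two identical raw frequencies conceal.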
