DOP as a usage-based, constructionist model
DOP shares most of its core principles with the usage-based constructionist approach. In this section, we discuss the theoretical positions on which DOP and the usage-based constructionist approach converge.
An important effect of this modeling procedure is that the most likely parses will be the ones whose derivations consist of a few larger subtrees. Why is this? The probability of a derivation is generally higher if it consists of fewer subtrees, because there are fewer subtree probabilities (each smaller than one) to multiply. Hence, the model has a bias towards interpreting utterances using as few, and hence as large, fragments as possible. In that sense, the model tries to maximize analogy with previously processed utterances, and by doing so it adheres to the usage-based principle that grammatical productivity comes about through experience and a domain-general ability to form schemas (Tomasello 2003; Gentner 1983).
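The arithmetic behind this bias can be sketched in a few lines of Python. The subtree probabilities below are invented for illustration and are not drawn from any actual treebank:

```python
from functools import reduce
from operator import mul

def derivation_probability(subtree_probs):
    """A derivation's probability is the product of its subtree probabilities."""
    return reduce(mul, subtree_probs, 1.0)

# Two hypothetical derivations of the same parse:
one_large_fragment = [0.05]               # a single stored chunk
three_small_fragments = [0.3, 0.2, 0.25]  # composed from minimal rules

p_large = derivation_probability(one_large_fragment)     # 0.05
p_small = derivation_probability(three_small_fragments)  # 0.015

# Each extra subtree multiplies in another factor below one, so the
# derivation built from fewer, larger fragments comes out more probable.
assert p_large > p_small
```

Whatever the exact values, as long as every probability is below one, adding factors to the product can only lower a derivation's probability, which is all the bias requires.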
The reliance on experience is another aspect on which usage-based constructionist approaches and DOP converge. First of all, the hypothesis space of possible grammatical constructions emerges through experience with language, as does the conception that we have to understand language, at least to some extent, hierarchically (see Frank, Bod, and Christiansen 2012 for arguments why hierarchical processing is not a procedure applied all the time). Hierarchicality, then, does not have to be the a priori template with which a learner understands language. The learner may start with a number of possible data structures, some of which are hierarchical and some of which are not, and find out, in response to processing the data, that a hierarchical template for storing, processing and producing language may be an optimal cognitive strategy (Perfors, Tenenbaum, and Wonnacott 2010). For this paper, we assume that this property of language has been discovered.
Secondly, experience means that routinization and Gestalt-like effects take place (Bybee 2006). It is well known that frequency affects language use, at the very least by governing the choice among acceptable alternatives (Schuchardt 1885; Mehler and Carey 1968; Jurafsky 2003). DOP incorporates this insight by allowing larger fragments to be stored and used as Gestalts in linguistic processing. Moreover, the trade-off between computation (composing two fragments with the substitution operator) and storage (using one larger fragment) is driven by frequency as well: the more likely the parts are relative to the whole, the more likely a computed analysis (as opposed to a retrieved one) is. But most importantly: it does not have to be either/or. Because all derivations, whether directly retrieved as one chunk or composed of minimal bits, contribute to the probability of the analysis, DOP avoids the rule-list fallacy (Langacker 1989: chapter 1): language users maintain both, sometimes perhaps redundantly.
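This both/and character can be made concrete: in DOP, the probability of an analysis is the sum over all of its derivations, so a retrieved chunk and a composed derivation both contribute probability mass. A minimal sketch, again with invented probabilities:

```python
def parse_probability(derivations):
    """Sum the probabilities of all derivations of one and the same parse."""
    total = 0.0
    for subtree_probs in derivations:
        p = 1.0
        for q in subtree_probs:
            p *= q  # product of subtree probabilities within one derivation
        total += p
    return total

# Hypothetical derivations of a single parse:
derivations = [
    [0.05],            # retrieved whole, as one stored Gestalt
    [0.3, 0.2, 0.25],  # composed from three minimal subtrees
]

print(parse_probability(derivations))  # 0.05 + 0.015 = 0.065
```

Neither route is discarded in favour of the other: the stored whole and the compositional route jointly determine how probable the analysis is.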
Furthermore, DOP starts from the same maximalist conception of language as constructionist approaches do. This conception entails a couple of things. First, the basic building blocks are heterogeneous in size. This means that they can be small, like words or depth-one rules, or larger. And it means that they can be abstract, containing no lexical material, or highly concrete. An important insight following from this principle and the previous one is that rules and exemplars are not ontologically different entities, but are created out of the same matter, viz. processed experience. Every subtree in DOP, then, is a schema over processed experience that can be recombined with parts of other experiences to understand something novel. These ideas resonate with core properties of a constructivist, usage-based understanding of grammatical knowledge (Croft and Cruse 2004: chapters 10 and 11).
Finally, the inventory of basic building blocks may be redundant, as hinted at earlier. DOP gives the artificial learner fragments that it could, in principle, also build up out of other subtrees it has. The idiom What time is it? can of course be built up out of its components, but there is reason to believe that language users keep a representation of the whole in mind as well (Bybee 2006).
Although this position is not shared by all constructionist linguists (Construction Grammar (Fillmore and Kay 1996), for instance, tries to minimize redundancy), usage-based theorists seem to embrace this idea. Accepting redundancy as a core property of the linguistic system follows rather naturally from rejecting the idea that linguistic structure has to be stored either as a rule or as a list (i.e., the rule-list fallacy, cf. Langacker 1989).
In fact, the DOP framework has been used to address issues in language acquisition that relate to the issues of heterogeneity and redundancy. Given a hypothesis space of all possible subtrees, we can find out what set of subtrees was most likely used in deriving an utterance. Without going into the details, Borensztajn, Zuidema, and Bod (2008) did so for a syntactically annotated corpus of young children’s utterances. What they showed was that, in line with the usage-based perspective, the most likely subtrees behind the children’s utterances become more abstract with age. More examples of applying the DOP principle to language acquisition can be seen in the next sections.