General discussion and conclusions
In this paper, we have demonstrated how Semantic Vector Spaces, an approach from computational linguistics, can be transferred to Construction Grammar and used to model constructional semantics. Our approach offers the following epistemological advantages over introspective manual semantic analysis (in the extreme case, mere eyeballing of the contexts):
- It allows one to find the optimal semantic classification of constructional slot fillers, based on their usage;
- It takes into account the frequencies of semantic classes in near-synonymous constructions, which has a cognitive interpretation in terms of the cue validity of collexeme classes;
- It provides objective evidence about the optimal level of granularity of semantic descriptions;
- It allows one to detect semantic regularities that might otherwise go unnoticed;
- It allows one to explore larger amounts of data than would be possible manually and, in principle, can provide distribution-based classifications for all collexemes that occur in the corpus. As a result, the conclusions are robust from both the quantitative and qualitative perspectives.
In addition, the co-occurrence matrices that were used for this case study can be easily “recycled” for other construction-oriented case studies in Dutch.
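The general idea of a reusable co-occurrence matrix can be illustrated with a minimal sketch. It is not the procedure used in this study: the lemmas, the context features and their counts are invented, and raw counts with cosine similarity are assumptions made for brevity (a real model would typically apply some association weighting first).

```python
import math
from collections import Counter, defaultdict

# Toy (lemma, context feature) observations; all values are invented.
OBSERVATIONS = [
    ("minister", "subj:zeggen"), ("minister", "subj:zeggen"),
    ("minister", "subj:beslissen"),
    ("regering", "subj:beslissen"), ("regering", "subj:beslissen"),
    ("regering", "subj:zeggen"),
    ("regen", "subj:vallen"), ("regen", "subj:vallen"),
    ("regen", "mod:zwaar"),
]


def build_matrix(observations):
    """Count how often each lemma occurs with each context feature."""
    matrix = defaultdict(Counter)
    for lemma, feature in observations:
        matrix[lemma][feature] += 1
    return matrix


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[f] * v[f] for f in u if f in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


matrix = build_matrix(OBSERVATIONS)
sim_close = cosine(matrix["minister"], matrix["regering"])  # shared contexts
sim_far = cosine(matrix["minister"], matrix["regen"])       # no shared contexts
```

Because the matrix is indexed by lemma rather than by construction, the same rows can be looked up again for the slot fillers of any other construction attested in the corpus.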
Apart from the demonstration of a new method for studying constructional semantics, the results of the experiments have theoretical consequences. For instance, in section 5.4 we found that the effects of the semantic classes of the different slots are non-additive. In other words, the predictive power of the three slots combined is smaller than the summed predictive power of the slots taken separately. Why should this be the case? A possible explanation might be as follows. If a word is compatible with the meaning of the construction, as assumed in Construction Grammar, then there should be coherence among all slots. This statement is called the Principle of Semantic Coherence by Stefanowitsch and Gries (2005: 11). The coherence can be based on different types of knowledge, for instance, frame-semantic relationships, as in example (10). The information stored in one slot therefore interacts with the information from the other slots. This non-additivity can serve as evidence that the semantics of constructions is not reducible to the semantics of the constructional components.
Another interesting finding is that the most parsimonious classification of the Causers has fewer classes than that of the Effected Predicates. In general, noun classifications tend to be organised taxonomically as trees with long branches, for instance, ‘entity’ - ‘concrete object’ - ‘living being’ - ‘animal’ - ‘mammal’ - ‘carnivore’ - ‘canine’ - ‘dog’ - ‘bulldog’, whereas verbs constitute far less hierarchically structured ‘bushes’ (the hypernymy and hyponymy chains in WordNet provide a nice illustration). This difference can be explained by the complex semantic structure of verbs, which normally involves a configuration of participants, spatiotemporal characteristics of the event, image schemata, social frames and scenarios, and other information. Organising these heterogeneous abstract semantic structures in a consistent tree-like hierarchy with a high level of generalisation is problematic. It seems plausible that verbs are organised in a large number of small local clusters. We need very detailed contextual information (in our approach, many possible subcategorisation frames) to capture these small classes.
Perhaps the most intriguing finding concerns the type of distributional context. For both the Causers and the Effected Predicates, the maximally syntactic (constructional) models perform best (cf. Gries and Stefanowitsch 2010). One may wonder why this should be the case. We would like to propose the following explanation. The abstract syntactic context features highlight the syntactically (constructionally) relevant semantic properties of lexemes (e.g. animacy or inanimacy). These features may also be relevant for the prediction of other constructions besides the causatives with doen and laten. According to this hypothesis, one can expect the more syntactically enriched models to perform better than more lexically specific models in all sets of near-synonymous syntactic constructions. Future research will show whether this hypothesis is correct.
The approach also has to face a few challenges in the future. First of all, a more realistic approach would require word sense disambiguation. For instance, it would be necessary to treat denken in the sense ‘think (of)’ separately from denken as ‘think (that)’ and even ‘think (about)’. Another important problem is pronominal reference resolution, which would allow us to use all occurrences of the constructions in the testing set and get a more complete picture.
The results also suggest that a medium level of granularity is optimal for classification of the Effected Predicates. However, as we pointed out in section 2, the speaker’s knowledge about the constructions may have several interacting levels of generalisation. So far, we have pruned the clustering tree only at one level. In future experiments, we are planning to test different levels of granularity simultaneously, which may allow us to integrate both higher-level generalisations and specific exemplars of the constructions.
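The idea of inspecting several levels of granularity at once can be sketched as follows. This is only an illustration under stated assumptions: the five items and their two-dimensional vectors are invented, and average-linkage agglomerative clustering over Euclidean distances is an assumed choice rather than the clustering method used in this study. The function `partitions_by_level` is a hypothetical helper that records the partition obtained at every possible cut of the tree.

```python
import math

# Toy 2-D vectors for hypothetical slot fillers; all values are invented.
POINTS = {
    "a": (0.0, 0.0), "b": (0.0, 1.0),
    "c": (5.0, 0.0), "d": (5.0, 1.0),
    "e": (20.0, 0.0),
}


def average_linkage(c1, c2, points):
    """Mean Euclidean distance between all cross-cluster pairs."""
    dists = [math.dist(points[x], points[y]) for x in c1 for y in c2]
    return sum(dists) / len(dists)


def partitions_by_level(points):
    """Return {k: partition into k clusters} for every cut of the tree."""
    clusters = [frozenset([name]) for name in sorted(points)]
    levels = {len(clusters): set(clusters)}
    while len(clusters) > 1:
        # Pick the pair of clusters with the smallest average distance.
        pairs = (
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        i, j = min(
            pairs,
            key=lambda ij: average_linkage(
                clusters[ij[0]], clusters[ij[1]], points
            ),
        )
        merged = clusters[i] | clusters[j]
        clusters = [
            c for idx, c in enumerate(clusters) if idx not in (i, j)
        ] + [merged]
        levels[len(clusters)] = set(clusters)
    return levels


levels = partitions_by_level(POINTS)
coarse = levels[2]  # two broad classes
medium = levels[3]  # three classes
```

Keeping all cuts in one dictionary makes it possible to derive, for each item, its class membership at several levels at the same time, so that coarse and fine-grained classes could in principle be entered into a model together.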
From a broader interdisciplinary perspective, this study has demonstrated that neostructuralist distributional models perform quite well in finding relevant semantic similarities between constructional slot fillers. We expect that fully construction-based bottom-up models, free from any aprioristic assumptions about syntactic categories and relations, will further improve performance and make the approach even more radically data-driven. One possibility, for instance, would be to use unsupervised stochastic grammar induction from raw data on the basis of existing Data-Oriented Processing models (e.g. Beekhuizen and Bod, this volume). A practical implementation of this approach would bridge the gap between computational models of semantics and usage-based Construction Grammar.