Benchmarking n-grams, Topic Models and Recurrent Neural Networks by Cloze Completions, EEGs and Eye Movements
Previous neurocognitive approaches to word predictability from sentence context in electroencephalographic (EEG) and eye movement (EM) data relied on cloze completion probability (CCP) data effortfully collected from up to 100 human participants. Here, we test whether three well-established language models can predict these data. Together with the baseline predictors of word frequency and position in the sentence, n-gram language models and recurrent neural networks (RNNs), which capture syntactic and short-range semantic processes, performed about equally well when directly accounting for CCP, EEG and EM data. In contrast, the low amount of variance explained by a topic model suggests that long-range semantics have no strong impact on the CCP and the N400 component of EEG data, at least in our Potsdam Sentence Corpus dataset. For the single-fixation durations of the EM data, however, topic models accounted for more variance, suggesting that long-range semantics may play a greater role in this earlier neurocognitive process. Though the language models were not significantly inferior to CCP in accounting for these EEG and EM data, CCP always provided a descriptive increase in explained variance for the three corpora we used. Nevertheless, n-gram and RNN models can account for about half of the variance of the CCP-based predictability estimates, and for the largest part of the variance that CCPs explain in EEG and EM data. Thus, our approaches may help to generalize neurocognitive models to all possible novel word combinations, and we propose to use the same benchmarks for language models as for models of visual word recognition.
Chapter written by Markus J. Hofmann, Chris Biemann and Steffen Remus.