A Modelling Approach
Engbert and Kruegel (2010) described a successful Bayesian estimator model of saccade control that provided a parsimonious account of the previously mentioned saccadic range error in word targeting during the reading of English text. This is the phenomenon whereby, during progressive saccades, the eye tends to overshoot when launched from too near a word target and to undershoot when launched from further away (McConkie et al. 1988). Until now, this over- and undershooting has been attributed to noise in the neurons that control the muscles responsible for moving the eye. However, Engbert and Kruegel (2010) demonstrated that the actual landing sites were remarkably well described by the product of the prior (or default) landing site distribution and a likelihood distribution centred on the optimal position for identifying the word (i.e., the word centre). Bayesian estimator models are optimal when there is some degree of uncertainty about the target location. If this appears to be the case for spaced writing systems such as English or German, it is certainly true of unsegmented writing systems such as Chinese and Thai.
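The product of a prior and a likelihood can be illustrated with a minimal sketch. It assumes, for illustration only, that both distributions are Gaussian, in which case their normalised product is again Gaussian with a precision-weighted mean. The function names and the numerical values (a prior centred on a 7-character saccade, with standard deviations of 3 and 2 characters) are hypothetical and are not taken from Engbert and Kruegel (2010).

```python
def gaussian_posterior(prior_mean, prior_sd, like_mean, like_sd):
    """Mean and SD of the normalised product of two Gaussian densities."""
    prior_prec = 1.0 / prior_sd ** 2   # precision = inverse variance
    like_prec = 1.0 / like_sd ** 2
    post_var = 1.0 / (prior_prec + like_prec)
    post_mean = post_var * (prior_prec * prior_mean + like_prec * like_mean)
    return post_mean, post_var ** 0.5


# Hypothetical values, in characters of saccade length (assumptions).
PRIOR_MEAN, PRIOR_SD, LIKE_SD = 7.0, 3.0, 2.0


def landing_error(target_distance):
    """Signed error of the estimated saccade length for a word centre
    lying target_distance characters from the launch site."""
    estimate, _ = gaussian_posterior(PRIOR_MEAN, PRIOR_SD,
                                     target_distance, LIKE_SD)
    return estimate - target_distance


if __name__ == "__main__":
    for distance in (4.0, 7.0, 10.0):
        print(distance, round(landing_error(distance), 3))
```

Under these assumed values the estimate is pulled towards the prior mean, so a near target (4 characters) gives a positive error (overshoot) and a far target (10 characters) gives a negative error (undershoot), which is the qualitative pattern of the saccadic range error.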
The approach described in this paper involves the construction of a simple computational model of saccadic control for both Thai and Chinese that effectively estimates the Bayesian prior for each writing system. In Engbert and Kruegel’s (2010) analytic framework, the prior is equivalent to the default behaviour of the saccadic control system in the absence of reliable target information. In effect, it is the landing site distribution produced by executing saccades whose length and direction are randomly drawn from the empirical distribution of saccades for a given text. Taking the prior model as a baseline, we can determine the degree to which it accounts for the empirical landing site data in both writing systems. A reasonable assumption is that for saccades launched relatively far from the target, the prior model will give a good fit to the data, while those launched from near the target may or may not deviate from the model, depending on the availability and utilisation of targeting information. One hypothetical source of word-targeting information is statistical clues to word beginnings provided by characters that occur frequently in a word-initial position. As a first approximation, their role could be tested by determining the degree to which the empirical distributions deviate from the computationally estimated prior, specifically in the case of saccades launched from close to the target word.
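The estimation of the prior described above can be sketched as a small resampling procedure. This is a minimal illustration under stated assumptions, not the model used in this paper: the function name, the representation of a line of text as character indices, the pairing of each random saccade with a randomly chosen observed launch site, and the decision to discard saccades that leave the line are all hypothetical choices.

```python
import random
from bisect import bisect_right


def simulate_prior_landings(word_starts, text_length, launch_sites,
                            saccades, n_draws=1000, seed=0):
    """Landing sites produced by saccades drawn at random from an
    empirical pool, ignoring any word-targeting information.

    word_starts  : sorted character indices where words begin (first is 0)
    text_length  : number of characters in the line
    launch_sites : character indices of observed fixations
    saccades     : observed signed saccade lengths, in characters
    Returns a list of (word_index, position_in_word) pairs.
    """
    rng = random.Random(seed)
    landings = []
    for _ in range(n_draws):
        launch = rng.choice(launch_sites)
        landing = launch + rng.choice(saccades)  # random length and direction
        if not 0 <= landing < text_length:
            continue  # assumption: saccades leaving the line are discarded
        word = bisect_right(word_starts, landing) - 1
        landings.append((word, landing - word_starts[word]))
    return landings


if __name__ == "__main__":
    # Toy line of 20 characters containing four words (hypothetical data).
    sites = simulate_prior_landings(
        word_starts=[0, 4, 9, 15], text_length=20,
        launch_sites=[1, 5, 10], saccades=[3, 5, -2, 7])
    print(sites[:5])
```

Tabulating the returned within-word positions by launch distance would give a simulated baseline against which the empirical landing site distributions could be compared.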
As a preliminary exploration of these issues, it was decided to adopt a corpus-based approach and analyse two extant corpora of Thai and Chinese reading data.