Modelling Details
The Chinese and Thai computational models were designed to saccade over the same texts as used in the two experiments. In the case of the models, however, only the word lengths were of concern; features such as word frequency, character frequency and stroke complexity were all ignored. Saccades were generated on the basis of the empirical saccadic metrics calculated for each text. The metrics comprised the mean and standard deviation of progressive saccade length, the mean and standard deviation of regressive saccade length, and the probability of a regressive rather than a progressive saccade. Each model was run 20 times with different initial conditions, effectively generating 20 simulated subjects for each of the language materials. Table 1 gives details of the modelling parameters.
Table 1. Empirical parameters used in the simulation models of word targeting in Thai and Chinese
Parameter | Thai | Chinese
Avg. progressive saccade in characters | 5.40 | 2.46
Progressive saccade SD | 2.53 | 1.30
Avg. regressive saccade in characters | 3.50 | 3.54
Regressive saccade SD | 3.18 | 3.91
Regression probability | 0.13 | 0.28
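The following Python sketch illustrates how a length-only saccade model of this kind could be driven by the parameters in Table 1. It is a minimal sketch under stated assumptions, not the implementation used in the study: the text specifies only means, standard deviations and a regression probability, so the use of a folded Gaussian for saccade length, the clamping at the start of the text, the definition of launch distance (characters from the launch site to the first character of the target word) and all function names are assumptions made for illustration.

```python
import random

# Empirical parameters from Table 1 (units: characters).
PARAMS = {
    "thai":    {"prog_mean": 5.40, "prog_sd": 2.53,
                "reg_mean": 3.50, "reg_sd": 3.18, "p_reg": 0.13},
    "chinese": {"prog_mean": 2.46, "prog_sd": 1.30,
                "reg_mean": 3.54, "reg_sd": 3.91, "p_reg": 0.28},
}

def char_to_word(word_lengths):
    """For each character slot, return (word index, 1-based position in word)."""
    return [(w, k) for w, n in enumerate(word_lengths) for k in range(1, n + 1)]

def simulate_subject(word_lengths, params, seed, max_saccades=100_000):
    """Random walk over a text described only by its word lengths.

    Returns a list of (word_length, landing_position, launch_distance)
    tuples for rightward saccades that land on a new word.
    """
    rng = random.Random(seed)      # one seed = one simulated subject
    slots = char_to_word(word_lengths)
    pos = 0.0                      # current fixation, in characters
    landings = []
    for _ in range(max_saccades):  # cap guarantees termination
        if rng.random() < params["p_reg"]:
            # ASSUMPTION: folded Gaussian keeps the saccade leftward.
            step = -abs(rng.gauss(params["reg_mean"], params["reg_sd"]))
        else:
            # ASSUMPTION: folded Gaussian keeps the saccade rightward.
            step = abs(rng.gauss(params["prog_mean"], params["prog_sd"]))
        new_pos = max(0.0, pos + step)   # ASSUMPTION: clamp at text start
        if new_pos >= len(slots):        # ran off the end of the text
            break
        launch_word, _ = slots[int(pos)]
        target_word, k = slots[int(new_pos)]
        if step > 0 and target_word != launch_word:
            word_start = int(new_pos) - (k - 1)
            landings.append((word_lengths[target_word], k, word_start - pos))
        pos = new_pos
    return landings

def simulate_language(word_lengths, language, n_subjects=20):
    """Run the model once per simulated subject, each with its own seed."""
    return [simulate_subject(word_lengths, PARAMS[language], seed=s)
            for s in range(n_subjects)]
```

Calling `simulate_language(word_lengths, "thai")` with the word lengths of a text yields 20 lists of landing records. Landing site distributions for near and far launches can then be obtained by grouping the records on word length and on a launch-distance threshold, which is left to the caller because this section does not state the threshold used.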
Figure 5 shows the landing site distribution for near and far launches for the Thai simulation. The results are surprisingly similar to the empirical data, particularly for the far launches. In the case of the near launches, the empirical distributions are more peaked than their model counterparts, suggesting that readers are utilising some information to target the word centres more effectively. Nonetheless, the default strategy represented by the model does remarkably well at getting the eye to a near-optimal location on the word, as evidenced by the peaked near-launch landing site distributions.
Figure 6 illustrates the simulation results for the Chinese sentences. Again, we see that the far launches are very similar to the empirical data and the near launches are less peaked than their empirical counterparts.

Figure 5. The results of 20 independent repetitions of a simulation of saccadic eye movements using parameters derived from the Thai empirical data. The simulation generated random saccades over the Thai text as a function of the parameters given in Table 1. These data represent single fixations following rightward saccades.

Figure 6. The results of 20 independent repetitions of a simulation of saccadic eye movements using parameters derived from the Chinese empirical data. The simulation generated random saccades over the Chinese text as a function of the parameters given in Table 1. These data represent single fixations following rightward saccades.
So what information might the reader be using to make more accurate saccades? As discussed earlier in the case of Thai, one candidate might be the word-initial character frequency. Kasisopa et al. (2013) argued that readers might use character statistics as a clue to the location of word beginnings. In the case of Chinese, the frequency with which a character occurs as the first character of, say, a two-character word might provide similar clues for the targeting mechanism. To test this hypothesis, we partitioned the near launches into two groups, one consisting of words that begin with a character of high frequency in initial position for that word length, and the other composed of words with low-frequency initial characters. The hypothesis was that the presence of characters predictive of a word boundary should result in more accurate targeting of the word centre, thus giving rise to a more peaked landing site distribution. The corollary is that the flatter distributions associated with less predictive characters should resemble those from the computational model.
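The partition described above can be sketched in Python as follows. This is an illustrative sketch only: the record fields (`word_length`, `landing`, `launch`, `init_freq`), the near-launch cut-off `near_max`, the frequency threshold and the use of the standard deviation as a measure of peakedness are all assumptions introduced here, since the text does not specify how the high- and low-frequency groups were delimited.

```python
from statistics import pstdev

def landing_spread_by_group(fixations, word_length, near_max, freq_threshold):
    """Split near-launch landings on words of one length into high- and
    low-predictiveness groups and return the spread (SD) of each.

    Each fixation is a dict with hypothetical keys:
      word_length : length of the fixated word in characters
      landing     : landing position within the word (1-based)
      launch      : launch distance in characters
      init_freq   : frequency of the word-initial character in initial
                    position for words of this length
    """
    near = [f for f in fixations
            if f["word_length"] == word_length and f["launch"] <= near_max]
    high = [f["landing"] for f in near if f["init_freq"] >= freq_threshold]
    low = [f["landing"] for f in near if f["init_freq"] < freq_threshold]
    # A smaller SD corresponds to a more peaked landing site distribution.
    return pstdev(high), pstdev(low)

# Invented toy records, for illustration only; these are not data
# from the experiments reported here.
toy = (
    [{"word_length": 4, "landing": p, "launch": 2.0, "init_freq": 0.9}
     for p in (2, 2, 3, 2, 3, 2)]
    + [{"word_length": 4, "landing": p, "launch": 2.0, "init_freq": 0.1}
       for p in (1, 2, 3, 4, 2, 4)]
)
sd_high, sd_low = landing_spread_by_group(
    toy, word_length=4, near_max=3.0, freq_threshold=0.5)
```

Under the hypothesis, `sd_high` should be smaller than `sd_low` in the empirical data, and `sd_low` should be close to the spread of the corresponding distribution from the model.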

Figure 7. Partition of landing positions from the Thai data for near launches as a function of the word-boundary predictiveness of the initial character. For clarity, we just show the two highest frequency word lengths.
Figure 7 compares distributions from the Thai empirical data (near launches only) for words whose initial character is either poorly or highly predictive of a word boundary. Only data from the most frequent word lengths are shown here for the sake of clarity. While the peaks of the distributions are at the same location, the spread varies, with the more predictive characters giving rise to sharper distributions, as hypothesised. A similar, if less pronounced, effect can be seen for words of length 2 in the Chinese data (cf. Figure 8). In both the Thai and Chinese data, the effect of predictability on landing site proved statistically significant in a linear mixed-effects analysis (t = 3.25 for Thai; t = 2.96 for Chinese).

Figure 8. Partition of landing positions from the Chinese data for near launches for one- and two-character words as a function of the word-boundary predictiveness of the initial character.