Abstract
Given that there are no interword spaces marking word boundaries in Chinese text, it remains unclear how information about word length influences eye movement control during the reading of Chinese text. In this research, we set up strict controls for word frequency and other word properties to address this knowledge gap. In Experiment 1A and Experiment 1B, a between-subjects design was used. Forty-eight pairs of one- and two-character words were selected as target words in Experiment 1A, while the same number of two- and three-character word pairs were selected in Experiment 1B. In Experiment 2, a within-subjects design was used, and sixty sets of one-, two- and three-character words were selected as target words. The results showed that long words were skipped less often and fixated on more often than short words. Total time was shorter for shorter than for longer words, but first fixation durations were longer for one- than for two-character words. Most importantly, we did not find reliable evidence to support the view that word length could modulate initial landing position and incoming saccade length in the length-matched region analyses. These findings suggest that word length influences eye movement control during the reading of Chinese in a way that is slightly different from that in the reading of English.
One of the most important tasks in studying reading is to know what determines when and where to move the eyes (Rayner, 1998, 2009). In the reading of English text, the ‘when’ is primarily influenced by word frequency, predictability and length, while the ‘where’ is mainly determined by visual factors such as word length and interword spacing, as assumed by the E-Z Reader model (Reichle, Pollatsek, Fisher, & Rayner, 1998; Reichle, Rayner, & Pollatsek, 2003). In Chinese reading, it has been found that word frequency (Ma, Li, & Rayner, 2015; Yan, Tian, Bai, & Rayner, 2006) and predictability (Rayner, Li, Juhasz, & Yan, 2005) influence when to move the eyes in a way that is similar to that in English reading. However, since there are no spaces marking word boundaries in Chinese text, word length information is not easily acquired in parafoveal vision during reading (Li, Liu, & Rayner, 2011). It is still not well understood whether, and if so how, word length information influences when and where to move the eyes in reading Chinese texts. The present study was designed to address this gap.
In natural Chinese texts, there are no spaces to mark word boundaries, but the length of Chinese words usually varies from one to four characters. Most words (about 72%) are two characters in length, while about 6%, 12%, and 10% are one-, three-, and four-character words, respectively (Wei, Li, & Pollatsek, 2013). Note that the perceptual span (i.e., the area of effective vision during a fixation in reading) in reading Chinese includes one character to the left of the current fixation and two to three characters to its right (Chen & Tang, 1998; Inhoff & Liu, 1998). Therefore, in most cases, in reading Chinese, there should be multiple words located in the perceptual span. How word length influences eye movement control in the reading of Chinese remains an open question. Given that there are no visually salient features, such as spaces, to indicate word length, the use of word length may be different from that in reading spaced English texts.
In English reading, word length modulates both fixation times and saccade target selection. It has been reported that word length has little effect on first fixation duration, but as word length increases, both the probability of fixating on a word and first-pass reading times increase (Joseph, Liversedge, Blythe, White, & Rayner, 2009; Plummer & Rayner, 2012; Rayner, Slattery, Drieghe, & Liversedge, 2011; White, Rayner, & Liversedge, 2005). Moreover, word length influences the initial landing position on a word when it is not skipped (Plummer & Rayner, 2012). Readers prefer to make a saccade to the left of the centre of a word in English reading, a position that is called the preferred viewing location (PVL) (Rayner, 1979). The PVL is close to the optimal viewing position (OVP) wherein readers can recognise a word most efficiently in an isolated word recognition task (O’Regan & Jacobs, 1992; O’Regan, Lévy-Schoen, Pynte, & Brugaillère, 1984). The oculomotor system may use word length information to target the default position (i.e., PVL/OVP), but the accuracy is influenced by a random saccade error and the saccadic range effect (i.e., small distances are likely to be overestimated, while large distances are likely to be underestimated) (McConkie, Kerr, Reddix, & Zola, 1988).
In reading Chinese, a corpus analysis revealed that word length influenced gaze duration and the probability of skipping over words (Li, Bicknell, Liu, Wei, & Rayner, 2014). Long words are fixated on for a longer duration and skipped less often than short words (Li et al., 2014; Li et al., 2011; Li & Shen, 2013). Note, however, that word frequency is highly correlated with word length in Chinese texts, and that the word frequencies of long words are usually lower than those of short words. In the two previous studies on the word length effect during the reading of Chinese, word frequency was not well controlled (Li et al., 2011; Li & Shen, 2013). For instance, in Li and Shen’s (2013) study, the word frequencies of the target words in the two-character word condition (M = 64.23 occurrences per million words, standard deviation [SD] = 123.43) were significantly greater than those in the four-character word condition (M = 1.57 occurrences per million words, SD = 1.36). Therefore, it is unclear whether the increase in fixation times for long words is caused by the lower word frequency of longer than of shorter words.
Furthermore, it is controversial whether word length can influence saccade target selection while reading Chinese. To answer this question, researchers studied whether there was a PVL at the centre of a word when reading Chinese (Li et al., 2011; Yan, Kliegl, Richter, Nuthmann, & Shu, 2010). In Li et al.’s (2011) study, the authors presented either a two- or four-character word in the same sentence frame, and observed participants’ eye movements during reading. If Chinese readers preferred to saccade to the centre of a word, as English readers do, there should be a corresponding PVL at the centre of a word, resulting in differences in the initial landing position of target words between the two- and four-character word conditions. However, it has been consistently reported that there is no PVL when reading Chinese or, at least, the PVL curve does not peak at the centre of a word as it does when reading English (Li et al., 2011; Ma, Li, & Pollatsek, 2015; Zang, Liang, Bai, Yan, & Liversedge, 2013). When including all forward saccades to the matched four-character target region (i.e., the regions of interest [ROIs] include the four-character target word in the four-character word condition, and include the two-character target word and the following two characters in the two-character word condition), Li et al. (2011) found that neither initial landing position nor incoming saccade length was influenced by word length information.
The null effect of word length on initial landing position might be caused by the existence of extraneous variables such as word frequency. In this study, we set strict controls for word frequency and other extraneous variables to study the role of word length during the reading of Chinese. Considering the limit of the perceptual span in reading Chinese, we only used one-, two- and three-character words as target words. This study involves a between-subjects design in Experiments 1A and 1B and a within-subjects design in Experiment 2 (see Figure 1). Eye movements were recorded as participants read the sentences. If word length information modulates fixation time as it does while reading English, long words should be fixated with longer first-pass reading times and should also be skipped less often than short words. In addition, if word length indeed influences saccade target selection as it does in English reading, we would consistently observe differences in initial landing position and incoming saccade length between the long- and short-word conditions when including all forward saccades to the length-matched target region.

Materials used in the present study. The target words are in bold for the purpose of illustration (characters were not in bold in the experiment).
Experiment 1A
Method
Participants
Thirty native Chinese speakers from Shaanxi Normal University were paid to participate in the experiment. All of them had normal or corrected-to-normal vision and were not made aware of the purpose of the experiment.
Materials and design
We selected 48 pairs of one- and two-character words as target words, and each pair was embedded in the same sentence frame (see Figure 1). Word frequency was calculated using occurrences per million words as a unit based on a published lexicon database (Chinese Linguistic Data Consortium, 2003). Word frequency showed no significant difference between the one- and two-character word conditions (see Table 1). Character frequency and complexity of the first character for each pair of target words were matched to avoid their influence on eye movement control (Ma & Li, 2015; Ma et al., 2015). The average predictability of target words in the sentence was then assessed by 10 volunteers and no difference was found between the two conditions. The naturalness of all sentences was assessed by 12 volunteers on a 7-point scale (1 = very unnatural, 7 = very natural). The results did not show significant differences between the two conditions.
Properties of the stimuli for the one-, two-, and three-character word conditions in Experiments 1A and 1B.
Word frequency is measured by occurrences per million words. Standard errors are given in parentheses.
Apparatus
The stimuli were presented on a 24-inch LCD monitor (ASUS VG248QE) with a resolution of 1920 × 1080 pixels and a refresh rate of 144 Hz. Each sentence was displayed in Song 24-point font in black (RGB: 0, 0, 0) on a grey background (RGB: 128, 128, 128). Participants’ eyes were positioned approximately 62 cm away from the computer monitor. At this viewing distance, each character subtended a visual angle of about 0.8°. We used an EyeLink 1000 Plus (SR Research Ltd, Ontario, Canada) eye tracker with a sampling rate of 1000 Hz to track participants’ eye movements. A chin rest was used to minimise head movements. Participants read the sentences binocularly, but only the right eye was monitored.
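The relation between viewing distance and visual angle stated above can be checked with a short calculation. The following Python sketch assumes the standard formula θ = 2·atan(s / 2d). The function names are illustrative, and the physical character size is not reported in the text, so it is derived here from the stated 0.8° and 62 cm rather than measured.

```python
import math


def visual_angle_deg(size_cm: float, distance_cm: float) -> float:
    """Visual angle (degrees) subtended by an object of size_cm at distance_cm."""
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))


def size_for_angle_cm(angle_deg: float, distance_cm: float) -> float:
    """Physical size (cm) that subtends angle_deg at distance_cm."""
    return 2 * distance_cm * math.tan(math.radians(angle_deg) / 2)


# Values taken from the Apparatus description. The derived character size
# is an inference from those values, not a reported measurement.
VIEWING_DISTANCE_CM = 62.0
CHAR_ANGLE_DEG = 0.8

char_size_cm = size_for_angle_cm(CHAR_ANGLE_DEG, VIEWING_DISTANCE_CM)
```

Under these assumptions, one character would be a little under 0.9 cm wide on screen; feeding that size back into `visual_angle_deg` recovers the stated 0.8°.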
Procedure
When participants arrived, they were given instructions for the experiment and a brief description of the apparatus. The eye tracker was calibrated at the beginning of the experiment and recalibrated as necessary. Each participant read six practice sentences, followed by the formal experiment in which they read 48 experimental sentences and 24 filler sentences in a random order. Participants were asked to read silently and to answer comprehension questions following one third of the sentences. Each sentence appeared only after participants successfully fixated on a character-sized box at the location of the first character of the sentence. After reading a sentence or answering a comprehension question, the participants were asked to press a response button to start the next trial.
Analysis
Accuracy on the answers to comprehension questions was high (93%), indicating that all the participants understood the sentences well. About 3% of the trials were removed from further data analysis because participants blinked more than three times when reading the whole sentence or blinked once while fixating on the target word region.
We first report the following nine eye movement measures for the target word region: first fixation duration (the duration of the first fixation on the target region), gaze duration (the sum of all first-pass fixations on the target region before moving to another region), total time (the sum of all fixations on the target region), number of fixations (the number of fixations the target word received during first-pass reading, not counting instances of skipping), probability of skipping (the probability that the target word was skipped on first-pass reading), launch site (the distance between the last fixation on the pretarget region and the left side of the target region), initial landing position (the distance between the first fixation on the target region and the left side of the target region), incoming saccade length (the distance between the initial landing position and the launch site) and outgoing saccade length (the distance between the last fixation on the target region and the first fixation on the region to the right). Second, we report the distribution of landing positions, adopting the same method used in Li et al.’s (2011) study, to examine whether Chinese readers tend to prefer to fixate on the centre of a word. For this method, the length-matched ROIs include the two-character target word in the two-character word condition, and include the one-character target word and the following one character in the one-character word condition. The launch site, initial landing position and incoming saccade length for the length-matched ROIs are reported in order to compare our data with Li et al.’s (2011) findings. As in that study, one character was coded as one unit for saccade measures.
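The measure definitions above can be illustrated with a small sketch. The Python code below is a hypothetical illustration, not the authors' analysis pipeline: the `Fixation` class, the `target_measures` function and the example trial are assumptions introduced here. Positions are expressed in character units, a target region is the half-open interval from `left` to `right`, and a region counts as skipped if the first fixation at or beyond its left edge lands to its right. Outgoing saccade length is omitted for brevity.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Fixation:
    x: float         # horizontal position in character units
    duration: float  # fixation duration in ms


def target_measures(fixations: List[Fixation], left: float, right: float) -> Dict:
    """Compute first-pass and saccade measures for the region [left, right)."""

    def inside(f: Fixation) -> bool:
        return left <= f.x < right

    # Total time: every fixation on the region, first pass or not.
    total_time = sum(f.duration for f in fixations if inside(f))

    # First fixation at or beyond the left edge decides fixated vs. skipped.
    first_idx = next((i for i, f in enumerate(fixations) if f.x >= left), None)
    if first_idx is None or not inside(fixations[first_idx]):
        return {"skipped": True, "total_time": total_time}

    # First pass: consecutive fixations on the region before leaving it.
    end = first_idx
    while end < len(fixations) and inside(fixations[end]):
        end += 1
    first_pass = fixations[first_idx:end]

    measures = {
        "skipped": False,
        "first_fixation_duration": first_pass[0].duration,
        "gaze_duration": sum(f.duration for f in first_pass),
        "total_time": total_time,
        "n_first_pass_fixations": len(first_pass),
        "initial_landing_position": first_pass[0].x - left,
    }
    if first_idx > 0:
        # The preceding fixation is necessarily left of the region.
        launch_site = left - fixations[first_idx - 1].x
        measures["launch_site"] = launch_site
        measures["incoming_saccade_length"] = (
            launch_site + measures["initial_landing_position"]
        )
    return measures


# Hypothetical trial: a two-character target occupying positions 5 to 7.
trial = [Fixation(2.5, 210), Fixation(5.3, 250), Fixation(6.1, 180),
         Fixation(8.4, 220), Fixation(5.6, 150)]
result = target_measures(trial, left=5.0, right=7.0)
```

In this made-up trial the target receives two first-pass fixations and one later refixation, so gaze duration covers only the first two, while total time also includes the regression back into the region.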
We analysed the data using a linear mixed-effects model (LMM) for continuous variables and a generalised mixed-effects model for binary variables (Baayen, Davidson, & Bates, 2008; Jaeger, 2008). In order to avoid being anti-conservative (Barr, Levy, Scheepers, & Tily, 2013) and also to balance type I error and power in LMMs (Matuschek, Kliegl, Vasishth, Baayen, & Bates, 2017), we included not only crossed random intercepts for participants and items, but also random slopes for participants in our models unless they failed to converge. Because stroke number influences saccade target selection and fixation durations (Ma & Li, 2015), we included both word length and stroke number (i.e., the sum of stroke numbers for each word) as fixed factors in all LMMs. The lme4 package (version 1.1-12, Bates, Machler, Bolker, & Walker, 2015) was used in the R environment (R Core Team, 2016). Fixation durations were log-transformed to meet LMM assumptions (Kliegl, Masson, & Richter, 2010). Note that analyses of log-transformed durations yielded results similar to those obtained from the untransformed analyses. The p-values were estimated using the lmerTest package (version 2.0-33, Kuznetsova, Brockhoff, & Christensen, 2016).
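The model structure described in this paragraph can be written out explicitly. The following is a sketch rather than the authors' exact specification: it assumes that the by-participant random slope is for word length (the text does not say which slopes were retained), and all symbols are introduced here for illustration. Here $y_{ij}$ is a fixation duration for participant $i$ on item $j$.

```latex
\begin{aligned}
\log(y_{ij}) &= \beta_0 + \beta_1\,\mathrm{Length}_{ij} + \beta_2\,\mathrm{Strokes}_{ij}
              + u_{0i} + u_{1i}\,\mathrm{Length}_{ij} + w_{0j} + \varepsilon_{ij},\\
(u_{0i},\, u_{1i})^{\top} &\sim \mathcal{N}(\mathbf{0},\, \boldsymbol{\Sigma}_u),\qquad
w_{0j} \sim \mathcal{N}(0,\, \sigma_w^2),\qquad
\varepsilon_{ij} \sim \mathcal{N}(0,\, \sigma^2)
\end{aligned}
```

For binary measures such as skipping, the same linear predictor would sit inside a logit link, with no residual term. The fixed-effect estimate $\beta_1$ corresponds to the word length coefficients $b$ reported in the Results sections.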
Results and discussion
Target word region
Experiment 1A showed that increased stroke complexity led to longer first fixation durations, gaze durations and total times, and also a larger number of fixations and a smaller probability of skipping, ts > 1.970, ps < .051, but stroke number did not significantly affect launch site, initial landing position, incoming saccade length or outgoing saccade length, ts < 1.444, ps > .151. The present study mainly focuses on the effects of word length in Chinese reading, and detailed eye movement measures and statistical analyses for word length effects are shown in Table 2. First fixation durations were longer in the one- than in the two-character word condition. A similar finding has been observed previously in the reading of Chinese (Cui, Drieghe, Bai, Yan, & Liversedge, 2014). Cui et al. (2014) explained this finding on the basis of a trade-off, insofar as long words typically received more fixations than short words and, as a result, the first fixation on a short word was more often a single fixation than was the first fixation on a long word. As in English, single fixation durations tend to be longer than the first of two fixations (Rayner, Sereno, & Raney, 1996). However, our data suggest that the number of single fixations may not be the reason for longer first fixation durations on shorter words. For one-character target words, first fixation durations did not differ significantly between single fixations (M = 268 ms, standard error [SE] = 8) and the first of two fixations (M = 276 ms, SE = 11), t < 1. For two-character target words, first fixation durations were significantly shorter for single fixations (M = 234 ms, SE = 6) than for the first of two fixations (M = 257 ms, SE = 10), t = -2.302, p = .029 (see Yan et al., 2010 for a similar finding). Therefore, the longer first fixation duration for short words observed in the current study is not likely to be caused by more single fixations for shorter than for longer words (see the General discussion for potential mechanisms underlying this phenomenon).
Eye movement measures and data analyses for the target word region in Experiment 1A.
First fixation duration, gaze duration, and total time were measured in milliseconds. Standard errors are given in parentheses.
Further data analysis showed word length effects. Although gaze durations and total times did not show significant differences between the one- and two-character word conditions (see Table 2), long words received more fixations and were skipped less often than short words. The launch site did not show significant differences between the two conditions, but long words had further initial landing positions and longer incoming saccade lengths than short words. Consistent with the finding in the previous study (Li et al., 2011), outgoing saccade length was significantly longer for longer than for shorter words. Note that the target word regions were not matched in length between the one- and two-character word conditions. Thus, these saccade measures may not directly indicate that word length can modulate saccade target selection. Consistent with the logic used by Li et al. (2011), initial landing position included first fixations on both the first and second character regions in the two-character word condition, but it only included first fixations on the single character region in the one-character word condition. Therefore, much longer saccades were included for longer than for shorter target words, resulting in further initial landing positions and longer incoming saccade lengths. In addition, readers need to make a longer saccade to leave a longer ROI, resulting in longer outgoing saccade lengths for longer than for shorter words.
Length matched region
To compare this study with the previous findings of Li et al. (2011), we analysed the distribution of landing positions in the length-matched region, adopting the same method used in Li et al. (2011). The ROIs include the two-character target word in the two-character word condition, and include the one-character target word and the following one character in the one-character word condition. Figure 2 shows the proportion of initial fixations that landed on different landing zones in a length-matched two-character region. Consistent with Li et al.’s (2011) finding, the initial fixations were more likely to fall on the first character than on the second character, and the proportion dropped from left to right. However, this phenomenon does not mean that Chinese readers prefer to fixate at the beginning of a word or that a PVL curve exists in reading Chinese. As Li et al. (2011) explained, all of the forward fixations were counted when calculating the number of fixations for the first character, but only a subset of fixations (those resulting from long saccades) were counted for the other characters. As an extreme example, if the ROI included 10 characters or more, the proportion of initial fixations landing on the 10th character should be close to zero, because saccades close to 10 characters long do not occur in natural Chinese reading.
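The landing-zone proportions described above can be sketched as a simple binning step. The Python code below is illustrative only: the function name and the sample landing positions are assumptions, not data from the study. Positions are measured in character units from the left edge of the length-matched ROI, each zone is one character wide, and fixations landing outside the ROI are not counted.

```python
import math
from collections import Counter
from typing import Iterable, List


def landing_zone_proportions(landing_positions: Iterable[float],
                             n_zones: int) -> List[float]:
    """Proportion of initial fixations falling in each one-character zone."""
    counts: Counter = Counter()
    for pos in landing_positions:
        zone = math.floor(pos)       # zone 0 is the first character of the ROI
        if 0 <= zone < n_zones:      # ignore fixations outside the ROI
            counts[zone] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * n_zones
    return [counts[z] / total for z in range(n_zones)]


# Hypothetical initial landing positions for a two-character matched region.
positions = [0.2, 0.7, 0.4, 1.3, 0.9, 1.8, 0.1, 2.5]
proportions = landing_zone_proportions(positions, n_zones=2)
```

Because the same ROI width is used in both word length conditions, any difference between the resulting distributions cannot be attributed to one condition simply admitting longer saccades than the other.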

Proportion of initial fixations at different landing zones in the one- and two-character word conditions in Experiment 1A. Each zone is one character in size.
Our data showed that increased stroke complexity led to shorter incoming saccade length, b = -0.015, SE = 0.007, t = -2.177, p = .031, but stroke number did not significantly influence initial landing position in Experiment 1A, t < 1. Furthermore, the main eye movement measures and data analyses for saccade measures on the length-matched region are shown in Table 3. Initial landing position did not show significant differences between the one- and two-character word conditions, but launch site was slightly shorter in the one- than in the two-character word condition. Further analysis showed that incoming saccade length was shorter in the one- than in the two-character word condition. These findings might be modulated by the slightly higher initial character frequency in the two-character word condition or the marginally significant difference in launch site between the two conditions. When we included initial character frequency and launch site as control variables in the LMMs, initial landing position still did not show significant differences between the one- and two-character word conditions, b = 0.039, SE = 0.044, t = 0.908, p = .367, but incoming saccade length was only marginally shorter in the one- than in the two-character word condition, b = 0.075, SE = 0.041, t = 1.808, p = .071. We discuss these effects further in the General discussion.
Eye movement measures and data analyses for the length matched region in Experiment 1A.
Experiment 1B
Method
In Experiment 1B, all of the participants, apparatus, procedures and analysis methods were identical to those described in Experiment 1A. Forty-eight pairs of two- and three-character words were selected as target words. Their predictability in the sentence was assessed by 10 volunteers and no difference was found between the two conditions (see Table 1). Word frequency and other variables (initial character frequency, initial character complexity, word predictability and sentence naturalness) were also matched. One participant was excluded from data analysis because his accuracy on the comprehension questions was very low (75%). After excluding this participant, accuracy on the comprehension questions was high (93%). About 96% of the data were retained for further analysis after excluding trials based on the criteria used in Experiment 1A.
Results and discussion
Target word region
Experiment 1B also showed that increased stroke complexity led to longer first fixation durations and gaze durations, ts > 1.935, ps < .054, but stroke number did not significantly influence other eye movement measures for target word region, ts < 1.220, ps > .223. Eye movement measures and statistical analyses for word length effects are shown in Table 4. We found that all eye movement measures in Experiment 1B showed a similar pattern to Experiment 1A. First fixation durations were longer in the two- than in the three-character word conditions. Consistent with Experiment 1A, our data showed that first fixation durations did not show significant differences for two-character target words between single fixation duration (M = 265 ms, SE = 12) and the first of two fixations (M = 270 ms, SE = 11), t < 1, as well as for three-character target words between single fixation duration (M = 246 ms, SE = 9) and the first of two fixations (M = 253 ms, SE = 8), t < 1.
Eye movement measures and data analyses for the target word region in Experiment 1B.
First fixation duration, gaze duration, and total time were measured in milliseconds. Standard errors are given in parentheses.
Gaze durations did not show significant differences between the two- and three-character word conditions, but total times were significantly shorter in the two- than in the three-character word condition. Long words were fixated with longer gaze durations and total times than short words. Long words also received more fixations and were skipped less often than short words. All the saccade measures showed a similar pattern to Experiment 1A. Launch site did not show significant differences between the two- and three-character word conditions, but long words had further initial landing positions and longer incoming saccade lengths than short words. Furthermore, consistent with Experiment 1A and the findings in Li et al.’s (2011) study, outgoing saccade length was significantly longer for longer than for shorter words.
Length matched region
Figure 3 shows the proportion of initial fixations that landed on different landing zones in a length-matched three-character region. The ROIs include the three-character target word in the three-character word condition, and include the two-character target word and the following one character in the two-character word condition. Consistent with the findings in Experiment 1A, the initial fixations were more likely to fall on the first character than on the other characters, and the proportion dropped from left to right.

Proportion of initial fixations at different landing zones in the two- and three-character word conditions in Experiment 1B. Each zone is one character in size.
In Experiment 1B, stroke number did not significantly affect saccade measures for the length-matched region, ts < 1.021, ps > .308. Detailed eye movement measures and data analyses for word length effects on the length-matched region are shown in Table 5. Launch site did not show significant differences between the two- and three-character word conditions. However, the initial landing position was significantly further into the target region in the three- than in the two-character word condition, while incoming saccade length did not show significant differences between the two- and three-character word conditions. These findings were not modulated by initial character complexity. When we included initial character complexity as a control variable in the LMMs, initial landing position was still further into the target region in the three- than in the two-character word condition, b = 0.180, SE = 0.065, t = 2.791, p = .006, while incoming saccade length did not show significant differences between the two conditions, b = 0.094, SE = 0.076, t = 1.225, p = .224.
Eye movement measures and data analyses for the length matched region in Experiment 1B.
Experiment 2
In Experiments 1A and 1B, although character properties were controlled, some of them (i.e., initial character frequency in Experiment 1A and initial character complexity in Experiment 1B) still showed marginally significant differences between the short and long target words. In Experiment 2, we used a within-subjects design with better control of both word and character properties to study how word length information influences when and where to move the eyes while reading Chinese.
Method
Participants
Thirty native Chinese speakers from Shaanxi Normal University were paid to participate in Experiment 2. None of them participated in Experiment 1A or Experiment 1B.
Materials and design
Sixty sets of one-, two- and three-character words were selected as target words and each set was embedded in the same sentence frame. Similar to Experiments 1A and 1B, we matched word frequency and character properties of the first character for each set of three words (see Table 6). The average predictability of target words in the sentence was zero; that is, none of the 12 volunteers reported any of the target words. The naturalness of all sentences was assessed by another 12 volunteers and the results did not show significant differences among the three conditions.
Properties of the stimuli for the one-, two- and three-character word conditions in Experiment 2.
Apparatus, procedure and analysis
We used the same apparatus, procedure and analysis methods as those described in Experiment 1A. To compare the results of Experiment 2 with those of Experiments 1A and 1B, we performed a priori contrasts with the two-character word condition as the baseline for the one- and three-character word conditions in all LMMs. Accuracy on the answers to the comprehension questions was high (95%), indicating that the participants understood the sentences well. About 2% of the trials were removed from further data analysis according to the criteria used in Experiments 1A and 1B.
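One common way to set up contrasts against a baseline condition is treatment (dummy) coding, sketched below in Python. This is an assumption about how the contrasts could be coded, since the text does not give the exact contrast matrix, and the function name and level labels are hypothetical.

```python
from typing import Dict, List


def treatment_contrasts(levels: List[str], baseline: str) -> Dict[str, List[int]]:
    """Dummy-code each level against a baseline.

    Each non-baseline level gets its own indicator column, so the baseline
    level is coded as all zeros.
    """
    if baseline not in levels:
        raise ValueError(f"baseline {baseline!r} is not one of the levels")
    others = [level for level in levels if level != baseline]
    return {
        level: [1 if level == other else 0 for other in others]
        for level in levels
    }


# Two-character words as the baseline; the columns are (one vs. two, three vs. two).
codes = treatment_contrasts(["one", "two", "three"], baseline="two")
```

With this coding, the two fixed-effect coefficients for word length estimate the one- versus two-character and the three- versus two-character differences, which parallel the comparisons made in Experiments 1A and 1B respectively.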
Results and discussion
Target word region
Experiment 2 showed that increased stroke complexity led to marginally longer first fixation durations, gaze durations and total times, and also a marginally larger number of fixations and a smaller probability of skipping, ts > 1.694, ps < .090. In addition, increased stroke complexity led to shorter incoming saccade length and outgoing saccade length, ts > 2.190, ps < .029. These data suggest that stroke number plays an important role in determining eye movement control in Chinese reading. Furthermore, detailed eye movement measures and data analyses for word length effects are shown in Table 7. Consistent with Experiment 1A, first fixation durations were longer in the one- than in the two-character word condition. Again, this finding was not likely caused by more single fixations for short words in the reading of Chinese, because single fixation durations did not tend to be longer than the first of two fixations. First fixation durations showed no significant differences for one-character target words between single fixations (M = 266 ms, SE = 9) and the first of two fixations (M = 269 ms, SE = 15), t < 1, for two-character target words between single fixations (M = 248 ms, SE = 11) and the first of two fixations (M = 257 ms, SE = 10), t < 1, or for three-character target words between single fixations (M = 243 ms, SE = 9) and the first of two fixations (M = 243 ms, SE = 13), t < 1. However, slightly different from Experiment 1B, first fixation durations were only marginally longer in the two- than in the three-character word condition (see possible mechanisms in the discussion section). This was not because the initial character frequency in the two- and three-character word conditions in Experiment 2 (M = 46 occurrences per million) was much lower than that in Experiment 1B (M = 1978 occurrences per million). When we combined the data of Experiments 1B and 2 in an LMM, the interaction between word length and initial character frequency (character frequency was log-transformed to meet LMM assumptions) was not significant, b = 0.024, SE = 0.020, t = 1.188, p = .235.
Eye movement measures and data analyses for the target word region in Experiment 2.
First fixation duration, gaze duration, and total time were measured in milliseconds. Standard errors are given in parentheses.
For most of the other measures, the data showed similar patterns to those in Experiments 1A and 1B. Gaze durations did not show significant differences between short and long words, but total times were shorter for shorter words than for longer words. Long words also received more fixations and were skipped less often than short words. Launch site did not show significant differences among the one-, two- and three-character word conditions. Long words also had a further initial landing position, a longer incoming saccade length and a longer outgoing saccade length than short words.
Length matched region
Similar to Experiments 1A and 1B, we analysed the distribution of landing positions in the length-matched region. Figure 4 shows the proportion of initial fixations that landed on different landing zones in a length-matched three-character region. The ROIs include the three-character target word in the three-character word condition, the two-character target word and the one character immediately thereafter in the two-character word condition, and the one-character target word and the two characters immediately thereafter in the one-character word condition. Consistent with the findings in Experiments 1A and 1B, the initial fixations were more likely to fall on the first character than on the other characters, and the proportion dropped from left to right.

Figure 4. Proportion of initial fixations at different landing zones in the one-, two- and three-character word conditions in Experiment 2. Each zone is one character in size.
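The construction of the length-matched region and its one-character landing zones can be sketched as follows. This is only an illustration under assumed conventions: character positions are zero-based indices within the sentence, and the function names are hypothetical rather than taken from the article.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

REGION_LENGTH = 3  # characters; the three-character region used in Experiment 2


def length_matched_region(target_start: int, target_length: int,
                          region_length: int = REGION_LENGTH) -> Tuple[int, int]:
    """Return (start, end) of the region, with end exclusive.

    The region always spans region_length characters from the start of the
    target word, so shorter targets are padded with the characters that follow.
    """
    if not 1 <= target_length <= region_length:
        raise ValueError("target word must fit inside the region")
    return target_start, target_start + region_length


def landing_zone(fixation_char: int, region: Tuple[int, int]) -> Optional[int]:
    """Return the 1-based zone of a fixation, or None if it falls outside the region."""
    start, end = region
    if start <= fixation_char < end:
        return fixation_char - start + 1
    return None


def zone_proportions(initial_fixations: List[int],
                     region: Tuple[int, int]) -> Dict[int, float]:
    """Proportion of initial fixations in each zone, ignoring fixations outside the region."""
    zones = [z for z in (landing_zone(c, region) for c in initial_fixations) if z is not None]
    counts = Counter(zones)
    start, end = region
    return {zone: (counts[zone] / len(zones) if zones else 0.0)
            for zone in range(1, end - start + 1)}


# Made-up example: a one-character target at index 10 gives the region 10..12.
region = length_matched_region(target_start=10, target_length=1)
print(zone_proportions([10, 10, 11, 12, 20], region))
```

Because the region has the same size in every condition, landing positions can be compared across word lengths without the confound that longer target words simply offer more space to land on.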
Further analyses showed that increased stroke complexity led to shorter incoming and outgoing saccade lengths for the length-matched region, ts > 2.359, ps < .018. Eye movement measures and data analyses for word length effects on the length-matched region are shown in Table 8; the pattern was similar but not identical to that in Experiments 1A and 1B. Launch site did not differ significantly among the one-, two- and three-character word conditions. Initial landing position was further into the target region in the two- than in the one-character word condition, but did not differ significantly between the two- and three-character word conditions. Incoming saccade length was shorter in the one- than in the two-character word condition, but did not differ significantly between the two- and three-character word conditions. Since the results of the length-matched region analyses in Experiments 1A and 1B were not well replicated in Experiment 2 with better stimulus control, we cannot conclude that word length generally influences initial landing position and incoming saccade length during Chinese reading (see Bayes factor analyses for Experiments 1A, 1B and 2 in the discussion section).
Table 8. Eye movement measures and data analyses for the length-matched region in Experiment 2.
General discussion
This study was designed to investigate how word length influences eye movement control during the reading of Chinese text. We found that long words received longer first-pass reading times, received more fixations and were skipped less often than short words, but first fixation durations were longer for shorter than for longer words. Incoming and outgoing saccade lengths were longer for longer than for shorter target words, but word length information did not reliably influence the initial landing position and incoming saccade length in the length-matched region. These findings are of theoretical significance for our understanding of eye movement control in the reading of Chinese.
Our data show that long words received longer first-pass reading times than short words, which is consistent with findings in reading English (Joseph et al., 2009; Plummer & Rayner, 2012; Rayner et al., 2011). These results are easy to understand because longer words tend to receive more fixations than shorter words. However, our data also revealed that word length influenced first fixation duration in reading Chinese in a way that differs from reading English. Inconsistent with studies of reading English (Joseph et al., 2009; Plummer & Rayner, 2012; Rayner et al., 2011) as well as one previous study of reading Chinese (Li et al., 2011), we found that first fixation durations were longer for shorter than for longer words in Experiments 1A and 1B, and this phenomenon was replicated in Experiment 2. In Li et al.’s (2011) study, word frequency was higher in the two- than in the four-character word condition, which may have led to the null difference in first fixation duration between the two conditions. As mentioned previously, this phenomenon was not caused by more single fixations on shorter than on longer words. Although single fixation durations tend to be longer than the first of two fixations in English reading (Rayner et al., 1996), single fixation durations did not differ significantly from the first of two fixations in the current study.
The finding of longer first fixation durations for shorter than for longer words was replicated in Experiment 2 with better stimulus control; thus, it is not caused by the relatively poor control of initial character frequency in Experiment 1A. Considering the special characteristics of reading Chinese (Ma, Li, & Rayner, 2014), one potential cause is as follows. The distribution of initial landing positions showed that Chinese readers could not target the optimal viewing position (OVP) of long words (i.e., the centre of a word). Unlike in English text, there are no spaces between Chinese words marking word boundaries, so Chinese readers could not use parafoveal word length information to guide the initial landing position towards the centre of a word. The probability of an initial landing drops from the left to the right of a long word; thus, most initial fixations do not land on the OVP for recognising Chinese words (Liu & Li, 2013). Therefore, Chinese readers may quickly move their eyes to another viewing position within a long word, resulting in shorter first fixation durations for long words.
In terms of saccade measures, although we found that word length influenced skipping probability, initial landing position, incoming saccade length and outgoing saccade length on target words, these findings might not directly indicate that word length modulates saccade target selection during the reading of Chinese. The different skipping probabilities for words of different lengths do not indicate that word length affects saccade target selection either. Since the average saccade length is about 2.5 characters during Chinese reading (Li et al., 2014), a one-character word is necessarily more likely to be skipped than a two-character word. In other words, even if saccade lengths are constant, short words are more likely to be skipped than long words. Moreover, consistent with the logic used by Li et al. (2011), much longer saccades were included for longer than for shorter target words; thus, the initial landing position should be further into the target region, and incoming saccade length should be longer, for longer than for shorter words. Finally, as we discussed in Experiment 1A, readers need to make a longer saccade to leave a longer ROI, resulting in longer outgoing saccade lengths for longer than for shorter words. Therefore, we argue that the findings described in this paragraph should not be taken as evidence that saccadic targeting is affected by word length.
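The skipping argument can be made concrete with a small worked example. The sketch below rests on simplifying assumptions that are not part of the article: every saccade has exactly the same length, the position of the fixation preceding the word is uniformly distributed relative to the word's left edge, and there are no regressions. Under these assumptions a word of length L is skipped with probability max(0, 1 - L/S), where S is the saccade length. The function names are hypothetical.

```python
import random


def skip_probability(word_length: float, saccade_length: float = 2.5) -> float:
    """Analytic skipping probability for constant saccades and a uniform launch position."""
    return max(0.0, 1.0 - word_length / saccade_length)


def simulate_skip_rate(word_length: float, saccade_length: float = 2.5,
                       trials: int = 100_000, seed: int = 0) -> float:
    """Monte Carlo check of skip_probability under the same assumptions."""
    rng = random.Random(seed)
    skips = 0
    for _ in range(trials):
        # The word occupies [0, word_length). The last fixation before the word
        # lies somewhere within one saccade length to the left of its left edge.
        launch = rng.uniform(-saccade_length, 0.0)
        landing = launch + saccade_length
        if landing >= word_length:
            skips += 1  # the eyes landed beyond the word without fixating it
    return skips / trials


for length in (1, 2, 3):
    print(length, round(skip_probability(length), 3), round(simulate_skip_rate(length), 3))
```

With a constant saccade length of 2.5 characters, the analytic values are 0.6, 0.2 and 0 for one-, two- and three-character words, so skipping rates differ by word length even though saccade targeting never uses word length in this sketch.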
For the length-matched region, the study provided no strong evidence to support the view that word length information could modulate saccade target selection. First, consistent with previous studies of reading Chinese (Li et al., 2011; Ma et al., 2015), there was no default landing position at the centre of a word. The distribution of the initial landing position decreased from left to right, instead of peaking at the centre of the word. This kind of saccade target selection is different from that in the reading of English. In English reading, initial landing position is primarily determined by visual factors such as interword spacing, as assumed in popular reading models (Engbert, Nuthmann, Richter, & Kliegl, 2005; Reichle et al., 1998; Reilly & Radach, 2006). In contrast, there are no salient features marking word length or word boundary information in unspaced Chinese text. Therefore, Chinese readers cannot use word length information in the parafovea to guide the initial landing position towards the centre of a word.
Second, we did not obtain consistent results on initial landing position and incoming saccade length across Experiments 1A, 1B and 2. Only incoming saccade length showed a significant difference between the one- and two-character word conditions in Experiment 1A, which was replicated in Experiment 2. To quantify the relative evidence for the alternative hypotheses over the null hypotheses, we used the BayesFactor package (version 0.9.12-2; Morey, Rouder, & Jamil, 2015, available in R) to calculate Bayes factors (Rouder, Speckman, Sun, Morey, & Iverson, 2009). In Experiment 1A, the Bayes factors, that is, the ratios of evidence for the alternative hypotheses against the null hypotheses, were 0.13 and 0.29 for initial landing position and incoming saccade length, respectively. In Experiment 1B, the corresponding Bayes factors were 9.94 and 0.15. In Experiment 2, the corresponding Bayes factors were 3.71 and 0.08 for the comparison between the one- and two-character word conditions, and 0.18 and 0.11 for the comparison between the two- and three-character word conditions. As these results show, most of the Bayes factors were lower than one. Therefore, these results do not consistently support the view that Chinese readers can use parafoveal word length to guide initial landing position and incoming saccade length.
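To make the pattern of Bayes factors easier to read, the reported values can be tabulated and labelled as in the sketch below. The values are copied from the text above. The verbal labels use the conventional cut-offs of 3 and 10 (and their reciprocals), which are an assumption of this sketch and were not applied in the article. The sketch does not recompute the Bayes factors, which were obtained with the BayesFactor package in R.

```python
from typing import Dict, Tuple

# Reported Bayes factors for the alternative over the null hypothesis (BF10).
# Key: (experiment, comparison of word lengths in characters, measure).
REPORTED_BF10: Dict[Tuple[str, str, str], float] = {
    ("1A", "one vs two", "initial landing position"): 0.13,
    ("1A", "one vs two", "incoming saccade length"): 0.29,
    ("1B", "two vs three", "initial landing position"): 9.94,
    ("1B", "two vs three", "incoming saccade length"): 0.15,
    ("2", "one vs two", "initial landing position"): 3.71,
    ("2", "one vs two", "incoming saccade length"): 0.08,
    ("2", "two vs three", "initial landing position"): 0.18,
    ("2", "two vs three", "incoming saccade length"): 0.11,
}


def evidence_label(bf10: float) -> str:
    """Verbal label for a Bayes factor (cut-offs of 3 and 10 are assumed, not from the article)."""
    if bf10 >= 10:
        return "strong evidence for H1"
    if bf10 >= 3:
        return "moderate evidence for H1"
    if bf10 > 1 / 3:
        return "inconclusive"
    if bf10 > 1 / 10:
        return "moderate evidence for H0"
    return "strong evidence for H0"


for (experiment, comparison, measure), bf10 in REPORTED_BF10.items():
    # 1 / BF10 expresses the same evidence as support for the null hypothesis (BF01).
    print(f"Exp {experiment}, {comparison}, {measure}: "
          f"BF10 = {bf10:.2f}, BF01 = {1 / bf10:.2f}, {evidence_label(bf10)}")

below_one = sum(bf10 < 1 for bf10 in REPORTED_BF10.values())
print(f"{below_one} of {len(REPORTED_BF10)} Bayes factors are below one")
```

Six of the eight reported values fall below one, and the two that favour the alternative hypothesis both concern initial landing position, which matches the conclusion that the evidence is not consistent across experiments.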
What determines saccade target selection during the reading of Chinese text? Previous studies support the view that saccade target selection in reading Chinese is more likely to be determined by parafoveal linguistic processing (Li, Liu, & Rayner, 2015; Liu, Reichle, & Li, 2015, 2016). Liu et al. (2015) found that a high-frequency foveal word triggered a longer outgoing saccade only when parafoveal processing was not prevented. Liu et al. (2016) further reported that parafoveal word frequency could modulate saccade target selection in a way that differs from that in English reading (Rayner et al., 1996): Chinese readers made longer incoming saccades and landed closer to the centre of a word for higher- than for lower-frequency target words. However, in the current study, there was no strong evidence to support the view that word length influenced processing difficulty in the parafovea, since word length did not reliably modulate saccade target selection.
The present study can help to understand and improve modelling work on reading Chinese text. Our data show that word length functions differently in reading Chinese than in reading English. In the E-Z Reader model of English reading (Reichle et al., 1998), fixation times are primarily determined by word frequency and predictability, while word length influences later measures of fixation times by modulating the probability of refixating a word. In the Chinese version of the E-Z Reader model (Rayner, Li, & Pollatsek, 2007), a similar hypothesis was made, which makes it hard to simulate longer first fixation durations for shorter than for longer words. In addition, Chinese readers could not select the centre of a word as a default landing location, as readers of English do. Where to move one’s eyes seems to be determined more by parafoveal linguistic information, and word length in Chinese text could not provide such information. These findings should be integrated and tested in any model of eye movement control in reading Chinese.
To summarise, this study replicated word length effects in the reading of Chinese text: long words were skipped less often, received more fixations and had longer total times than short words. Even with better stimulus control, we did not find strong evidence to support the view that word length influences initial landing position and incoming saccade length for the length-matched region. In addition, we found evidence that first fixation durations were longer for shorter than for longer words in reading Chinese texts. These findings suggest that word length modulates eye movement control when reading Chinese in a slightly different way from reading English. Further research should test the interaction between word length and other word properties and integrate the corresponding findings into models of eye movement control during the reading of Chinese text.
Footnotes
Acknowledgements
We thank Xiangling Zhuang for helpful discussion regarding this work, and Eyal M. Reingold, Denis Drieghe and two anonymous reviewers for their helpful comments on an earlier version of this article.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
This study was supported by grants from the National Natural Science Foundation of China (31600877, 31571125) and the Fundamental Research Funds for the Central Universities (GK201603121, GK201803099).
