Abstract
In written Korean, spaces appear between phrasal units (“eojeols”). In Experiment 1, participants read sentences in which space information had been manipulated. Results indicated that removing spaces or replacing them with a symbol hindered reading, but this effect was not as disruptive as previously found in English. Experiment 2 presented sentences varying in the proportion of eojeols that ended with postpositional particles as well as the presence/absence of spaces. Results showed that space removal interfered with reading, but its effects were weaker when the sentence contained more postpositional particles. This suggests that postpositional particles provide an extra cue to word segmentation in Korean texts. These findings are discussed in relation to the unique characteristics of the Korean writing system and to the models of eye-movement control during reading in different languages.
Introduction
Many models of eye movements in reading postulate that for alphabetic languages, such as English, interword spaces facilitate word segmentation and identification, which in turn drives readers’ control of eye movements. For instance, the E-Z Reader model (Reichle et al., 1998, 2003) assumes that visual cues, such as interword spaces and word boundaries, guide identification of the orthographic form of the word and determine where to move the eyes next.
Many empirical studies have tested the effects of spaces on reading by comparing readers’ eye movements while they read normally spaced texts versus texts in which space information has been manipulated. These studies generally suggest that spaces have a facilitatory effect on reading Roman-script languages that use interword spacing in normal texts. In English, removal of spaces, or their replacement with other symbols or characters, negatively affects eye-movement measurements that are indicative of reading difficulty (Malt & Seamon, 1978; Rayner et al., 1998; Rayner & Pollatsek, 1996; Spragins et al., 1976; cf. Epelboim et al., 1994, for different results).
Rayner et al. (1998) compared the reading of texts with or without spaces and found that removal of spaces negatively affected global and local eye movements, reducing reading rates and forward saccade sizes while increasing fixation durations, proportion of regressions, first fixation durations, and gaze durations (the sum of the durations of all first-pass fixations on a word until the eyes have left the word in either direction). To investigate possible causes of the disruptive effects of space removal, their follow-up experiment included three new spacing conditions in addition to Normally spaced and Unspaced text: (1) Filled space, where spaces were replaced with the letter x, (2) Flanker, where spaces were preserved but each word was additionally flanked by x, and (3) Wide space, where words were separated by three spaces. The results showed that reading times and fixation durations were longer in the Filled space and No space conditions than in the Normal spacing and Wide space conditions, with the Flanker condition intermediate. Spacing also influenced eye landing positions: Eyes tended to land a bit left of the centre of the word when text was spaced (Normal spacing, Wide space, and Flanker), but at the beginning of the word when spaces were removed or replaced with x (No space and Filled space). Based on these findings, Rayner et al. argue that reading without spaces is difficult because space removal hinders both word identification and saccadic planning. Similar facilitative effects of spaces on reading have also been reported in German (Inhoff et al., 2000) and Spanish (Perea & Acha, 2009).
However, previous models of eye-movement control in reading are not easily applicable across languages, as not all languages adopt interword spaces in their writing systems. Chinese has a logographic writing system, in which one unit of writing roughly stands for a word or a morpheme, and sentences do not contain interword spacing in normal Chinese texts. Bai et al. (2008) examined whether adding word boundaries in Chinese texts, either through the use of spaces or by highlighting every other word in grey, facilitates reading. They reported that, regardless of the word demarcation method, the addition of word boundary information to Chinese texts did not cause significant differences in global eye-movement measures (fixation durations, saccade length, number of forward/regressive saccades, and total reading times) or in local eye-movement measures (first fixation duration, single fixation duration, and gaze duration). These results suggest that reading Chinese sentences with interword spacing is as easy as reading normal unspaced text. Despite the null effect of adding spaces in reading Chinese, studies on word learning in Chinese suggest that new vocabulary is read faster and learned better when presented in word-spaced format, both by young and adult Chinese readers and by Chinese second-language learners (Blythe et al., 2012; Shen et al., 2012).
Even though Chinese texts do not have interword spaces, Chinese readers somehow recognise words and determine where to move their eyes based on the properties of the fixated word (Li et al., 2009, 2011). To account for eye-movement control in reading unspaced Chinese texts, Li and Pollatsek (2020) proposed a unique version of an eye-movement control model for Chinese reading (Chinese reading model; CRM). The CRM assumes that word segmentation and identification occur simultaneously in Chinese reading. All words that are consistent with the characters in the perceptual span (i.e., a limited region surrounding the fixated position from which readers obtain useful information) are activated. When one of the activated words wins the competition, it is both segmented and identified as a word. The segmented word then has information on where it starts and ends, thus guiding the next eye movement.
Furthermore, Chinese characters themselves appear to provide word segmentation cues even in the absence of spaces. In an eye-tracking experiment, Yen et al. (2012) had participants read texts containing two-character target words, in which the probability of the second character being used as a word ending (compared to a word beginning) was either high or low. They found longer gaze durations on words that ended with low-probability than high-probability characters, supporting the view that Chinese readers use statistical cues of positional character frequency for word segmentation. Liang et al. (2015, 2017) also reported that incongruent positional character frequencies (i.e., when the initial character in a word has a high word-ending frequency or vice versa) increase reading times during pseudoword learning by adult and young Chinese readers. These findings together suggest that the positional frequency of Chinese characters guides word segmentation and identification, although its effect is independent of that of word spacing.
The writing system of Japanese is a combination of logographic (Kanji) and syllabic (Hiragana and Katakana) writing components and also does not use interword spacing. Given that typical Japanese texts consist of a mixture of these characters, a switch between the character types could signal word boundaries. Sainio et al. (2007) hypothesised that introducing space would facilitate reading pure Hiragana texts but be less effective in reading mixed Kanji-Hiragana texts. Results of their eye-movement study verified this hypothesis; they found faster reading speed, shorter gaze duration, and shorter total fixation time for spaced Hiragana texts than for unspaced Hiragana texts, but there was no indication of a spacing effect in mixed Kanji-Hiragana texts.
Thai has an alphabetic writing system, but uses spaces to delimit sentences, not words. Kohsom and Gobet (1997) used a reading-aloud task to examine whether the addition of interword spaces to Thai texts facilitates reading. Their results showed that the participants read faster and made fewer errors when they read word-spaced texts than when they read normal unspaced texts. Based on these findings, the authors postulate that spaces may play a language-universal role in reading. However, later studies on spaced Thai texts showed different results. In an eye-movement study, Winskel et al. (2009) found that interword spaces facilitated reading at the word level, yielding shorter gaze durations and total fixation durations, but not at the sentence level, increasing reading times and number of fixations. Moreover, target-word skipping rates were higher in unspaced texts than in spaced texts, which, according to Winskel et al., suggests that the contrasting results at the word and sentence levels possibly occur because words segmented by spaces are visually distinct targets that attract fixations (Inhoff & Radach, 2002).
In an earlier eye-movement study of Thai readers reading normal, unspaced Thai texts, Reilly et al. (2005) proposed that there is a small set of characters that tend to mark word boundaries in Thai and that the presence of such characters in words may guide readers’ eye movements to the preferred viewing location (the word centre). Based on the findings of Reilly et al. (2005) and Winskel et al. (2009), Kasisopa et al. (2013) hypothesised that position-specific character frequencies in Thai would help spatial eye-movement control (i.e., where to move the eyes) and that interword spacing would facilitate word identification (i.e., when to move the eyes). In an eye-tracking study, they found that characters that frequently occur at word boundaries directed eye movements to the centre of the word, supporting their hypothesis that position-specific frequency of characters affects where the eyes move. However, they also found that the presence of spaces did not affect reading-time measures, indicating no effect on when the eyes move.
Although many eye-tracking studies have investigated word recognition and sentence reading in Korean (e.g., Kwon et al., 2010; H. Lee & Choi, 2019; Y. Lee et al., 2007, 2009; Seong et al., 2020 among many others), no previous work, to our knowledge, has paid attention to the role of spacing in reading Korean texts. Korean and English scripts are similar in that both are alphabetic, with letters representing phonemes (consonants and vowels), and in using spaces to signal boundaries between sentence-internal constituents. However, the writing system of Korean differs from that of English in two critical ways. One distinctive property of the Korean writing system is that alphabetic letters combine into a character block to represent a syllable. For instance, three Korean alphabetic letters “ㅅ,” “ㅗ,” and “ㄴ,” which represent phonemes /s/, /o/, and /n/, respectively, can together constitute a character block “손,” which then represents a syllable /son/. In addition, Korean has a unit of spacing called an eojeol, a phrasal unit that contains and is larger than a word. An eojeol usually consists of one or more stem morphemes and a series of functional morphemes, such as postpositional case markers, conjugations, and sentence endings (Woo, 2017). For example, in (1), the word 손 /son/ “hand” combines with the accusative case marker 을 /ul/ to form an eojeol.
(1) 나는 Na-nun I- “I washed (my) hands.”
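The composition of letters into a character block can be illustrated with a short sketch. It relies on the arithmetic layout of the Unicode Hangul Syllables block (starting at U+AC00) rather than on anything described in the study itself; the function name `compose_block` and the three lookup tables are introduced here purely for illustration.

```python
# Compose a precomposed Hangul syllable block from compatibility jamo letters,
# using the arithmetic layout of the Unicode Hangul Syllables block (U+AC00..U+D7A3).
INITIALS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"  # 19 syllable-initial consonants
MEDIALS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"  # 21 vowels
# 28 syllable-final options; the empty string stands for "no final consonant".
FINALS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")


def compose_block(initial, medial, final=""):
    """Return the syllable block for an initial consonant, a vowel, and an optional final."""
    i = INITIALS.index(initial)
    m = MEDIALS.index(medial)
    f = FINALS.index(final)
    return chr(0xAC00 + (i * 21 + m) * 28 + f)


print(compose_block("ㅅ", "ㅗ", "ㄴ"))  # 손
print(compose_block("ㅇ", "ㅡ", "ㄹ"))  # 을
```

The two calls reproduce the word 손 and the accusative marker 을 discussed above, which together form the eojeol 손을.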
Considering these unique characteristics of the Korean writing system, it is uncertain whether spacing plays as crucial a role in Korean sentence reading as it does in English. One might expect that since Korean uses an alphabetic writing system, as English does, the facilitatory effect of spacing must also be present in Korean and that removing spaces would disrupt reading. Conversely, it is also possible that because the unit of spacing in Korean is larger than the word, the role of spacing as a visual cue to segmentation may not be as important as it is in English.
Experiment 1
The purpose of Experiment 1 was to investigate the effects of space removal and substitution on sentence reading in Korean. To this end, participants’ sentence-level eye movements were examined while they read normal and manipulated texts.
Method
Materials
Sixty declarative sentences were constructed. The sentences contained 9.8 words (eojeols) and 30 syllables (or characters) on average. One of the stimuli is given in (2) as an example.
(2) 도서관 내에서는 정숙하는
    Tosekwan nay-eyse-nun cengswukha-nun
    Library inside-loc-top stay.quiet-comp
    것이 대표적인 공중도덕이다.
    kes-i tayphyocekin kongcwungtotek-ita.
    to.be-nom common etiquette-decl
    “It is a common etiquette to stay quiet in the library.”
Five spacing versions were created for each stimulus sentence, as shown in Table 1. The Normal (N) text version contained spaces between eojeols, as in normal Korean texts. In the Space Deleted (SD) condition, all spaces were removed. In the Symbol Inserted (SI) condition, spaces were replaced with a “%” symbol. In the Same Character Inserted (SCI) condition, one Korean character “운,” standing for a syllable /un/, was used to substitute spaces. In the Random Character Inserted (RCI) condition, a random Korean character was used to substitute each space in the stimuli.
Table 1. Spacing conditions and example stimuli for Experiment 1.
Symbols and characters added to the texts are highlighted here.
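As a rough illustration of the five spacing conditions, the sketch below derives each version from a normally spaced sentence. The source does not state how the random characters in the RCI condition were chosen, so drawing them uniformly from the Unicode Hangul Syllables block is an assumption, as is the helper name `spacing_versions`.

```python
import random

HANGUL_FIRST, HANGUL_LAST = 0xAC00, 0xD7A3  # Unicode Hangul Syllables block


def spacing_versions(sentence, rng):
    """Return the five spacing versions (N, SD, SI, SCI, RCI) of one sentence."""
    eojeols = sentence.split(" ")
    # Assumption: each RCI filler is drawn uniformly from all precomposed syllables.
    fillers = [chr(rng.randint(HANGUL_FIRST, HANGUL_LAST)) for _ in eojeols[1:]]
    rci = eojeols[0] + "".join(f + e for f, e in zip(fillers, eojeols[1:]))
    return {
        "N": sentence,                # normal spacing
        "SD": "".join(eojeols),       # spaces deleted
        "SI": "%".join(eojeols),      # spaces replaced with the symbol %
        "SCI": "운".join(eojeols),    # spaces replaced with the same character
        "RCI": rci,                   # spaces replaced with random characters
    }


versions = spacing_versions(
    "도서관 내에서는 정숙하는 것이 대표적인 공중도덕이다.", random.Random(0)
)
for name, text in versions.items():
    print(name, text)
```

Every version except SD has the same length as the normal sentence, because each space is replaced by exactly one symbol or character.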
A total of 300 target stimuli (60 sentences × 5 spacing conditions) were divided into 5 lists of 60 sentences using a Latin Square design. Each list contained no more than one spacing version of each stimulus sentence. Participants were randomly divided into five groups, and each group was given one of the five lists of 60 sentences. As a result, each participant read 12 sentences per presentation condition (N, SD, SI, SCI and RCI). In addition to target sentences, the experiment also included 10 practice sentences.
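The list construction can be sketched as a simple rotation. The exact assignment of sentences to conditions is not reported, so the rule below (sentence index plus list index, modulo the number of conditions) is one possible Latin Square arrangement, not necessarily the one used; `latin_square_lists` is a hypothetical helper.

```python
CONDITIONS = ["N", "SD", "SI", "SCI", "RCI"]
N_SENTENCES = 60


def latin_square_lists(n_sentences, conditions):
    """Build one presentation list per condition by rotating conditions over sentences."""
    k = len(conditions)
    lists = []
    for list_idx in range(k):
        # Sentence s gets condition (s + list_idx) mod k, so every sentence
        # appears in a different condition on each list.
        lists.append(
            [(s, conditions[(s + list_idx) % k]) for s in range(n_sentences)]
        )
    return lists


lists = latin_square_lists(N_SENTENCES, CONDITIONS)
print(len(lists), len(lists[0]))  # 5 60
print(lists[0][:5])
```

With 60 sentences and five conditions, each list contains 12 sentences per condition, and each sentence is seen in all five conditions across the five lists.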
Participants
Forty native speakers of Korean with normal or corrected-to-normal vision participated in the study. They were all attending a university in South Korea at the time of participation (Age M = 21.5, SD = 1.7). The participants received 10,000 Won (~$8.00 USD) after finishing the session.
Procedure
Each experimental session started with initial calibration and validation on a 3×3 grid on the screen. At the beginning of each trial, a fixation point (+) was displayed near the left edge at the vertical centre of the monitor. Once the participant fixated on the point, a sentence appeared with its left boundary replacing the fixation point. The participants were instructed to read the sentence silently and to press a button when they had finished reading. After reading each stimulus sentence, the participants judged whether a follow-up sentence (e.g., 도서관 안에서는 조용해야 한다. “We should stay quiet in the library.”) was true or false as a comprehension check.
Participants’ eye movements during reading were recorded using the Eyelink 1000 Plus model (SR Research, Ontario, Canada) from their dominant eye at a sampling rate of 1,000 Hz. A headrest and a chinrest were used to minimise head movement. All sentences were presented in a 25-point black Malgun Gothic font on a white background, displayed on a monitor with the resolution of 1920 × 1080 pixels. The distance between the participant and the display monitor was 66 cm, and the distance between the participant and the camera was 51 cm. The number of characters within 1° visual angle was 1.6.
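The relation between viewing distance and visual angle can be worked through with the reported numbers. The character width computed below is only what the reported 66 cm distance and 1.6 characters per degree jointly imply, assuming characters of uniform width; it is not a value given in the source.

```python
import math

VIEWING_DISTANCE_CM = 66.0  # reported participant-to-monitor distance
CHARS_PER_DEGREE = 1.6      # reported number of characters within 1 degree


def extent_cm(angle_deg, distance_cm):
    """Width on the screen subtended by a visual angle centred on the line of sight."""
    return 2 * distance_cm * math.tan(math.radians(angle_deg) / 2)


one_degree_cm = extent_cm(1.0, VIEWING_DISTANCE_CM)
# Implied horizontal pitch of one character (assumption: uniform character width).
char_pitch_cm = one_degree_cm / CHARS_PER_DEGREE
print(round(one_degree_cm, 2), round(char_pitch_cm, 2))  # 1.15 0.72
```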
Analysis
Six trials with extremely long total sentence reading times were excluded from the analysis (0.25% of all trials; four trials by four different participants and two trials by one participant). Four sentence-level eye-movement measures were generated from the data: total sentence reading time, total number of fixations, mean fixation duration, and mean forward saccade amplitude per sentence. Total sentence reading time refers to the time it took the participants to read each sentence, including not only fixation durations but also saccadic durations. Total number of fixations indicates how many fixations were made during the reading of each sentence. Mean fixation duration is the average duration of all fixations made during the reading of a sentence. Forward saccade amplitude refers to the average amplitude of all forward saccades made during the reading of a sentence, and a greater amplitude indicates a longer forward saccade. These measures reflect the difficulty of sentence reading; as the difficulty of a sentence increases, so do sentence-reading times, number of fixations, and mean fixation durations, while forward saccades get shorter (Rayner, 1998, 2009). Interpretation of forward saccade amplitude data is complicated by our manipulation of spacing (i.e., spaces or inserted symbols/characters result in larger distances between word centres as compared to unspaced texts). Therefore, to better understand the effects of spaces on saccadic movements during reading, initial landing positions on 3-syllable or 4-syllable words were also analysed for normally spaced versus unspaced conditions.
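A minimal sketch of how the four sentence-level measures could be derived from a sequence of fixations is given below. The `Fixation` record, the toy trial, and the choice to measure reading time from the first fixation onset to the last fixation offset are all assumptions made for illustration; the source does not describe the exact computation.

```python
from dataclasses import dataclass


@dataclass
class Fixation:
    start_ms: int
    end_ms: int
    x_deg: float  # horizontal position in degrees of visual angle


def sentence_measures(fixations):
    """Compute the four sentence-level measures for one trial."""
    durations = [f.end_ms - f.start_ms for f in fixations]
    # Rightward jumps between consecutive fixations count as forward saccades.
    jumps = [b.x_deg - a.x_deg for a, b in zip(fixations, fixations[1:])]
    forward = [j for j in jumps if j > 0]
    return {
        # Spans fixations and the saccades between them (assumed definition).
        "total_reading_time_ms": fixations[-1].end_ms - fixations[0].start_ms,
        "n_fixations": len(fixations),
        "mean_fixation_duration_ms": sum(durations) / len(durations),
        "mean_forward_saccade_deg": sum(forward) / len(forward) if forward else 0.0,
    }


# Hypothetical trial, invented for illustration only.
trial = [
    Fixation(0, 200, 1.0),
    Fixation(230, 450, 3.5),
    Fixation(480, 650, 2.5),  # regression: leftward jump, excluded from forward saccades
    Fixation(680, 900, 5.5),
]
print(sentence_measures(trial))
```

In this toy trial the regression from 3.5° to 2.5° adds a fixation and reading time but does not enter the forward saccade average.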
Differences in these measurements across spacing conditions were analysed using mixed-effects regression models using the lmer() function from the lme4 package (Bates et al., 2015) in R, version 4.0.3 (R Core Team, 2020). Models were fit for each dependent variable with spacing condition as a fixed effect. The five levels of spacing conditions were treatment-coded with the Normal condition as the reference level. For the random effect structure, a full model was fit with by-participant, by-sentence, and by-group random intercepts as well as by-participant, by-sentence, and by-group random slopes for spacing condition. As the full model failed to converge, the random effect that captured the smallest variance was removed until the model fit reached convergence (Barr et al., 2013). The final models for all dependent variables included by-participant and by-sentence random intercepts and no random slopes. Statistical significance was computed using the lmerTest package (Kuznetsova et al., 2017).
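Treatment coding with the Normal condition as the reference level can be sketched as follows. This only illustrates the coding scheme in plain Python, not the lmer() models themselves; the helper `treatment_code` is hypothetical.

```python
LEVELS = ["N", "SD", "SI", "SCI", "RCI"]  # the first level is the reference


def treatment_code(condition, levels=LEVELS):
    """Return one 0/1 indicator per non-reference level."""
    if condition not in levels:
        raise ValueError(f"unknown condition: {condition}")
    return [1 if condition == level else 0 for level in levels[1:]]


for cond in LEVELS:
    print(cond, treatment_code(cond))
# N   [0, 0, 0, 0]
# SD  [1, 0, 0, 0]
# SI  [0, 1, 0, 0]
# SCI [0, 0, 1, 0]
# RCI [0, 0, 0, 1]
```

Under this coding, the intercept estimates the Normal condition and each of the four coefficients estimates the difference between one manipulated condition and Normal, which is why the results are reported as comparisons against the N condition.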
Results
Mean accuracy of participants’ responses to comprehension questions was 0.90 (SD = 0.29), and there was no difference in accuracy across conditions (F < 1, ns). This indicates that the participants were paying attention to reading and understanding the sentences during the experiment.
Figure 1 shows the total sentence reading time, total number of fixations per sentence, mean fixation durations, and forward saccade amplitude by spacing conditions. Table 2 summarises the results of mixed-effects regression models for each measure.

Figure 1. Mean total sentence reading time (ms), total number of fixations, mean fixation duration (ms), and forward saccade amplitude (°) by spacing conditions from Experiment 1. Error bars represent the standard error.
Table 2. Estimated effects of spacing conditions on total sentence reading time (ms), total number of fixations, mean fixation duration (ms), and forward saccade amplitude (°) from Experiment 1.
SE: standard error; SD: Space Deleted; SI: Symbol Inserted; SCI: Same Character Inserted; RCI: Random Character Inserted.
Total sentence reading times in the SD and the SI conditions did not differ from the reading time in the N condition, while those in the SCI and the RCI conditions were significantly longer than in the N condition. In other words, reading was not slowed down by removal of spaces or by their replacement with the symbol %. However, replacement of spaces with a character, either the same character or random characters, significantly increased reading times.
Total number of fixations in the SD and the SI conditions did not differ from that in the N condition, while those in the SCI and the RCI conditions were significantly greater than in the N condition. This result indicates that removal of spaces or replacement of spaces with the symbol % did not affect how many fixations the participants made per sentence. In contrast, participants made more fixations when reading texts in which spaces were replaced with characters, either the same or random.
Differences in mean fixation durations between the N condition and each of the other four conditions were significant. Mean fixation durations in the SD condition, the SCI condition, and the RCI condition were significantly longer than in the N condition, while mean fixation duration in the SI condition was shorter than in the N condition. This result indicates that when spaces were deleted or replaced either with the same or random characters, it took longer for the participants to recognise words. In contrast, when spaces were replaced with the symbol %, word recognition was even faster than when normal spaces were available.
Differences in forward saccade amplitude between the N condition and each of the other four conditions were significant, indicating that any spacing manipulation influenced forward saccade amplitude. The participants made significantly shorter forward saccades when space was either deleted or replaced with symbols or characters than in normal text reading.
Figure 2 shows the proportion of initial landings on each syllable within 3-syllable and 4-syllable words as a function of spacing condition. The proportions of initial landings on the first or second syllable were higher in the SD condition than in the N condition, but this pattern reversed for the third and fourth syllables. For 3-syllable words, the interaction of spacing condition and landing position fell short of significance, F(2,228) = 1.999, p = .138, though a paired t-test showed that the proportion of initial landings on the last syllable was significantly higher in the N condition than in the SD condition, t(39) = 2.81, p = .008. For 4-syllable words, the interaction of spacing and landing position was significant, F(3,304) = 4.680, p = .003, with a paired t-test again showing that the proportion of initial landings on the last syllable was significantly higher in the N condition than in the SD condition, t(39) = 2.645, p = .012. These results indicate that, compared with normally spaced sentences, initial landing positions in unspaced sentences were shifted towards the left boundary of the word.

Figure 2. Proportions of initial landing on each syllable in 3-syllable and 4-syllable words by spacing condition. Error bars represent the standard error.
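The landing-position proportions can be sketched as a simple tally of the syllable on which the first fixation in each word fell. The input values below are invented for illustration, and pooling all words in one list (rather than aggregating per participant) is a simplifying assumption.

```python
from collections import Counter


def landing_proportions(landing_syllables, word_length):
    """Proportion of initial landings on each syllable position (1-based)."""
    counts = Counter(landing_syllables)
    total = len(landing_syllables)
    return [counts.get(pos, 0) / total for pos in range(1, word_length + 1)]


# Hypothetical initial landing positions on ten 3-syllable words.
landings = [1, 2, 2, 1, 3, 2, 1, 2, 2, 3]
print(landing_proportions(landings, 3))  # [0.3, 0.5, 0.2]
```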
Discussion
Overall, the results demonstrate that changes in spacing influenced all of the sentence-level eye-movement measures examined. When spaces were replaced with either the same or random characters, there was an increase in total sentence reading time, mean fixation duration, and the total number of fixations, and a decrease in forward saccade amplitude. Slower reading, more and longer fixations, and smaller saccades all indicate that the participants had more difficulty when reading texts where spaces were replaced with a character than when reading normally spaced text. In other words, reading was significantly hindered by inserted Korean characters.
When spaces were deleted, mean fixation duration was significantly longer compared to reading normal text, and forward saccade amplitude smaller, providing a hint of a facilitative effect of spaces on reading Korean text. The decrease in forward saccade amplitude when spaces were deleted is associated with the fact that initial landing position in unspaced texts was closer to the left word boundary than in spaced texts, consistent with the findings of Rayner et al. (1998) in reading spaced versus unspaced English texts. However, there were no reliable differences in total sentence reading times and the number of fixations between reading normal texts and reading unspaced texts. This result contrasts with earlier studies showing that removing spaces from English texts results in longer reading times and more fixations (Malt & Seamon, 1978; Rayner et al., 1998; Rayner & Pollatsek, 1996; Spragins et al., 1976).
When spaces were replaced with the symbol %, total sentence reading time and total number of fixations did not differ from those for normal texts. However, forward saccade amplitude was significantly smaller, and mean fixation duration was significantly shorter, when reading texts with % substituting spaces than when reading normally spaced texts. These results indicate that the inserted symbol % played a role as a word delimiter and helped word recognition to some degree, although not as well as normal spaces do. The facilitative effect of inserted symbols contrasts with the disadvantageous effect of inserted characters. A possible reason is that the symbol % is visually distinct from the characters in words, whereas inserted characters are indistinguishable from characters that are actually part of the sentence and thus likely cause confusion about lexical boundaries by creating nonwords.
Experiment 2
As discussed in the “Introduction” section, one of the crucial differences between the writing systems of English and Korean is the unit of spacing. In English, every word constitutes one unit of spacing, so that spacing provides critical information for word recognition. In contrast, a spacing unit in Korean, or eojeol, often consists of a stem word plus optional functional morphemes such as a case marker and a sentence ending morpheme. As these postpositional morphemes make up a small set of syllables (e.g., -i/-ka for nominative case, -ul/-lul for accusative case, -un/-nun for topics), eojeol boundaries in Korean are not only marked by spaces but also, to some extent, by the presence of these morphemes. It is thus reasonable to predict that the characters representing postpositional morphemes could serve as an eojeol boundary cue and facilitate word recognition in Korean text reading, even in the absence of spaces.
The Korean language is comparable to Thai and Chinese in the sense that certain syllables appear near word boundaries with high frequency. In Thai, 10 characters account for over 50% of all word-initial characters, and 5 characters account for a similar percentage of word-final characters (Kasisopa et al., 2013). Earlier empirical findings showed that the presence of the characters that frequently occur near the beginning or end of a word directs readers’ eyes towards the word centre, playing a facilitative role in reading similar to that of interword spaces in Roman-script languages (Kasisopa et al., 2013; Reilly et al., 2005). Similarly, there is a small set of Chinese characters that occur more frequently at the beginning of a word than at the end, or vice versa (Yen et al., 2012). These positional frequencies of characters were found to influence word segmentation, and characters occurring in positions where they do not frequently occur disrupt word segmentation both in natural reading and in word learning contexts (Liang et al., 2015, 2017; Yen et al., 2012). If postpositional morphemes in Korean eojeols play a role similar to that of spaces and signal eojeol boundaries, sentences containing more postpositional morphemes should be easier to read without spaces than sentences containing fewer postpositional morphemes.
Experiment 2 aimed to examine the effects of postpositional morphemes on reading unspaced text in Korean. Participants were shown sentences varying in the use of space and the number of postpositional morphemes, and their eye movements during reading were analysed.
Method
Materials
Sixty declarative sentences consisting of eight eojeols were created. In one half of the sentences, two eojeols had a postpositional morpheme as their final syllable, as in (3). In the other half of the sentences, five eojeols contained a postpositional morpheme, as in (4). Postpositional morphemes used in this study, listed in Table 3, were mostly particles, which in the Korean language are grouped into three types: (1) case particles, which mark the syntactic or discourse relation of a noun with co-occurring words or with the sentence; (2) special particles, or delimiters, which delimit the meaning of the co-occurring words; and (3) conjunctive particles, which connect two or more clauses (Choi-Jonin, 2008; Sohn, 1999). We used particles from all three categories. In addition to particles, three instances of an adverbial derivational suffix -key were also included. However, considering the dominant number of particles, the postpositional morphemes used in the experiment are hereafter referred to simply as postpositional particles (PPs).
(3) 한국 종합 무술 대회의
    Hankuk conghap muswul tayhoy-uy
    Korean comprehensive martial.art competition-poss
    참가 기준을 강화해야 한다.
    chamka kicwun-ul kanghwahayya han-ta.
    participation criteria-
    “The participation criteria of the Korean comprehensive martial arts competition must be augmented.”
(4) 논문의 각주와 인용구 사용은
    Nonmun-uy kakcwu-wa inyongkwu sayong-un
    Paper-
    출처 참고에 도움을 준다.
    chwulche chamko-ey towum-ul cwun-ta.
    source reference-ben help-
    “Use of footnotes and citations in a paper helps referencing sources.”
The average syllable count per sentence was 23.5 for both 2PP and 5PP conditions, and a t-test found no significant difference between the two conditions (t = 0.07, p = .95). Two versions were created for each stimulus sentence, varying in the presence or absence of spaces, as shown in Table 4. A total of 120 target stimuli (30 sentences × 2 PP conditions [2 PP/5 PP] × 2 spacing conditions [Spaced/Unspaced]) were divided into 2 lists of 60 sentences using a Latin Square design. Each list contained only one of the spacing versions of each stimulus sentence. Participants were randomly divided into two groups, and each group was given one of the two lists of 60 sentences.
Table 3. List of postpositional morphemes used in Experiment 2 according to Choi-Jonin’s (2008) and Sohn’s (1999) categorisation.
Table 4. Sample sentences showing PP and spacing conditions for Experiment 2.
PP: postpositional particle.
Postpositional markers are highlighted here.
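The 2 PP versus 5 PP contrast in examples (3) and (4) can be checked with a rough surface-matching sketch. The particle set below is an illustrative subset taken from the final syllables in example (4), not the full inventory of Table 3, and the helper `count_pp_eojeols` is hypothetical. Matching on the final syllable alone is only a heuristic: a fuller set that included the nominative marker 가, for instance, would wrongly count the noun 참가 in (3), so real stimulus coding would require morphological analysis.

```python
# Illustrative subset of particle syllables, taken from example (4);
# this is NOT the full list of postpositional morphemes used in the study.
PARTICLE_SYLLABLES = {"의", "와", "은", "에", "을"}


def count_pp_eojeols(sentence):
    """Return (number of eojeols ending in a listed particle, number of eojeols)."""
    eojeols = sentence.rstrip(".").split()
    n_pp = sum(1 for e in eojeols if e[-1] in PARTICLE_SYLLABLES)
    return n_pp, len(eojeols)


print(count_pp_eojeols("한국 종합 무술 대회의 참가 기준을 강화해야 한다."))  # (2, 8)
print(count_pp_eojeols("논문의 각주와 인용구 사용은 출처 참고에 도움을 준다."))  # (5, 8)
```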
Participants
Sixty native speakers of Korean with normal or corrected-to-normal vision participated in the study. They were all attending a university at the time of participation (Age M = 22, SD = 2.4). The participants received 10,000 Won after finishing the session.
Procedure
The procedure for Experiment 2 was identical to that of Experiment 1.
Analysis
Data from Experiment 2 were analysed in the same way as in Experiment 1, except for the effect structure included in the mixed-effects regression models. Spacing (Spaced/Unspaced) and the number of PPs (2 PP/5 PP) as well as their interactions were included in the models as fixed effects. Two levels of both fixed factors were deviation-coded with the Spaced and 5PP conditions as −1 and the Unspaced and 2PP conditions as 1. After reducing the random effect structure until the models converged, the final models included by-participant and by-sentence random intercepts.
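The deviation coding described above can be written out for the four design cells. This sketch shows only the coded predictors, with the interaction term formed as the product of the two codes; the column names and the helper `design_row` are introduced here for illustration.

```python
SPACING_CODE = {"Spaced": -1, "Unspaced": 1}
PP_CODE = {"5PP": -1, "2PP": 1}


def design_row(spacing, pp):
    """Return the coded fixed-effect predictors for one design cell."""
    s, p = SPACING_CODE[spacing], PP_CODE[pp]
    return {"spacing": s, "pp": p, "spacing_x_pp": s * p}


rows = [design_row(s, p) for s in SPACING_CODE for p in PP_CODE]
for row in rows:
    print(row)
```

Each coded column sums to zero over the four cells. In a balanced design this means the intercept corresponds to the grand mean of the cells, and each main-effect coefficient corresponds to half the difference between the two levels of that factor.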
Results
Mean accuracy of participants’ responses to comprehension questions was 0.95 (SD = 0.22), and there was no difference in accuracy across conditions (F = 1.47, not significant). This indicates that the participants were paying attention to reading and understanding the sentences during the experiment.
Figure 3 shows the mean values of total sentence reading time, total number of fixations, mean fixation duration, and forward saccade amplitude per sentence by PP and spacing conditions. Table 5 summarises the results of mixed-effects regression models for each reading measure.

Figure 3. Mean total sentence reading time (ms), total number of fixations, mean fixation duration (ms), and forward saccade amplitude (°) by PP and spacing conditions from Experiment 2. Error bars represent the standard error.
Table 5. Estimated effects of spacing and PP conditions on total sentence reading time (ms), total number of fixations, mean fixation duration (ms), and forward saccade amplitude (°) from Experiment 2.
SE: standard error; PP: postpositional particle.
Total sentence reading time showed a significant main effect of spacing, with longer reading times in the Unspaced condition than in the Spaced condition. This difference was greater for 2 PP sentences than for 5 PP sentences, and the interaction between spacing and the number of PPs was significant. In other words, reading unspaced texts took longer than reading spaced texts, but the disruptive effect of space removal was mitigated when the sentence contained more PPs. The main effect of PP on total sentence reading time was not significant.
Total number of fixations showed a pattern similar to that of total sentence reading time: there were more fixations in the Unspaced than in the Spaced condition, indicating that readers made more fixations when space information was unavailable than when reading normally spaced text. This effect of spacing on the number of fixations was numerically stronger in 2 PP sentences than in 5 PP sentences, but the interaction was not statistically significant. The main effect of PP was not significant.
Mean fixation duration also showed a significant effect of spacing, with significantly longer fixations in the Unspaced condition than in the Spaced condition, suggesting that participants took longer to recognise words when reading unspaced texts than when reading normal texts. This difference across spacing conditions was greater for 2 PP sentences than for 5 PP sentences, and the interaction between spacing and PP was significant. The main effect of PP was not significant.
Forward saccade amplitude was smaller in the Unspaced condition than in the Spaced condition, and the interaction between spacing and PP was not significant. The main effect of PP was significant, with 5 PP sentences having longer saccades than 2 PP sentences. These results indicate that readers made longer forward saccades when spaces were available than when spaces were removed, regardless of the number of PPs in the sentence. In addition, sentences with five PPs were read with longer forward saccades than sentences with two PPs.
Discussion
The results of Experiment 2 indicated that the number of PPs in a sentence influences the magnitude of spacing effects on reading measures. Unspaced texts yielded longer reading times and longer fixation durations than spaced texts, and this difference was greater for sentences containing two PPs than sentences with five PPs. Reading unspaced texts involved more fixations than reading spaced texts, and this difference was weaker when the sentence contained more PPs, although this interaction was short of significance. An increase in the number of PPs also led to greater forward saccade amplitudes, though it did not interact with spacing. To summarise, PPs seem to facilitate reading and mitigate the hindering effect of space removal, supporting the hypothesis that characters representing PPs act as a word segmentation cue in unspaced Korean text. This result is in line with earlier findings that in Thai, which normally does not use interword spaces, frequencies of characters occurring at word boundaries serve as a word demarcation cue (Kasisopa et al., 2013).
However, the overall effect of the number of PPs in Korean texts on reading times, number of fixations, and fixation durations was not significant. That is, spaces appear to be the primary cue for word segmentation in Korean, and the role of PPs as word delimiters seems to emerge only when no space information is available. In contrast, Kasisopa et al. (2013) found that in Thai, reading spaced text resulted in shorter gaze durations than reading unspaced text, but this difference was not significant when character frequency was also taken into account. This suggests that in Thai texts, character frequency at word boundaries is a more important cue for word segmentation than spaces. The use of different information as the dominant cue for word segmentation in Korean and Thai likely stems from differences in the writing conventions of the two languages: Korean texts contain spaces, whereas Thai texts do not.
General discussion
Experiment 1 examined the importance of spaces in Korean sentence reading by scrutinising the effects of removing or modifying space information in Korean texts on readers’ eye movements. Replacing spaces with the same or random characters yielded disadvantages in all reading measures, including total sentence reading times, total number of fixations, mean fixation durations, and forward saccade amplitude. Removing spaces or replacing them with a % symbol yielded longer fixations and shorter forward saccades, but total sentence reading times and number of fixations did not differ from those for normally spaced texts. Experiment 2 investigated the role of PPs occurring at the end of eojeols in word segmentation by comparing sentences containing two or five PPs written with or without spaces. Spaced texts were advantaged over unspaced texts in all four reading measures, but the effects of spacing on total sentence reading time and fixation durations were weaker for sentences containing five PPs than for those containing two PPs. Together, these results suggest that reading spaced texts has an advantage over reading unspaced texts in fixation durations and forward saccade amplitude. The facilitative effects of spaces in reading Korean sentences are comparable to those in other languages that use alphabetic writing systems, such as English, in which spacing effects have been widely found. Despite the difference in the spacing units in the two languages (i.e., words in English and eojeols in Korean), spaces play a significant role in demarcating sentence-internal constituents, and removing spaces impedes reading in both languages.
In Experiment 1, total sentence reading times and number of fixations were not affected by the removal of spaces. This null effect of space removal contrasts with earlier studies on English reading, which in general report a decrease in reading rate of more than 40% when space information was manipulated (either deleted or filled with letters/digits) compared to normal text reading (Pollatsek & Rayner, 1982; Rayner et al., 1998; Spragins et al., 1976). Although a weaker effect of spaces compared to English was also found in languages that do not use word spacing by default, such as Chinese (Bai et al., 2008), Japanese (Sainio et al., 2007), and Thai (Kasisopa et al., 2013; Winskel et al., 2009), the Korean results in the current study are striking, considering that normal Korean texts use spaces to mark eojeol boundaries. One potentially relevant difference between the English and Korean writing systems is that each letter of the English alphabet roughly represents a phoneme, whereas each Korean character represents a syllable. Therefore, a lack of space information in English texts would interfere with letter grouping at both the syllable and the word levels. In contrast, space deletion in Korean texts would hinder character grouping only at the word level and not at the syllable level, thus having a weaker influence on reading in general.
However, compared to the results of Experiment 1, Experiment 2 showed a more reliable effect of spacing on sentence reading. All reading measures analysed indicated an advantage of spaced texts over unspaced texts. The contrasting results of Experiments 1 and 2 are likely due to differences in the designs of the two experiments. In Experiment 1, the participants read normal spaced and unspaced texts along with texts in which spaces were replaced with symbols and characters. The disruptive effects of the space manipulation were more prominent when the same or random characters were inserted in place of spaces than when spaces were simply deleted. A greater disruptive effect of inserted letters than of deleted spaces was also found in a similar experiment in English by Epelboim et al. (1997). They found that reading sentences in which spaces were replaced with random Latin letters yielded increased reading time, even longer than reading unspaced sentences. Epelboim et al. (1997) argue that in unspaced texts, “familiar, highly practiced words and words predictable through meaning conveyed by context may have ‘jumped out’ at the reader, despite the absence of spaces that delimit words in ordinary text” (p. 2913), and that such spontaneous word grouping is more likely to be incorrect when spaces are filled with random letters. Therefore, in our Experiment 1, the extreme difficulty of reading character-inserted sentences might have made unspaced texts seem relatively less difficult than they would have if the unspaced condition had been the only unconventional text format. In Experiment 2, the only condition in which space information was manipulated was the unspaced condition, and thus the difficulty of reading unspaced texts relative to normal texts may have been more salient to the participants, producing a clearer difference in reading times between the spaced and unspaced conditions.
Another distinctive property of the Korean writing system is the unit of spacing, the eojeol. As an eojeol often ends with a PP, the presence of a PP may provide a cue for eojeol boundaries even in the absence of spaces. Indeed, the results of Experiment 2 showed that the hindering effect of space removal was stronger in sentences containing two PPs than in sentences containing five PPs, suggesting that the spacing effect is modulated by the presence of PPs in Korean texts. This finding is in line with earlier studies on the effects of position-specific character frequencies on word segmentation in Thai and Chinese. Neither Thai nor Chinese uses word spacing in its writing system, but readers of these languages are nevertheless able to segment and recognise words during reading. Studies have shown that characters in both languages vary in their probabilities of appearing in word-initial or word-final position, and that readers use this statistical information on positional frequencies of characters to group characters into words and control their eye movements (Kasisopa et al., 2013; Liang et al., 2015, 2017; Reilly et al., 2005; Yen et al., 2012). Thus, the weaker role of spaces in Korean texts than in English texts appears to be at least partially due to the cue for word segmentation provided by PPs.
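The notion of a position-specific character frequency can be illustrated with a small Python sketch that estimates, from spaced text, how often each syllable character occurs at the end of an eojeol. This is not the procedure used in the present study or in the cited work; the function name and the three-sentence toy corpus are hypothetical, and a realistic estimate would require a large corpus.

```python
from collections import Counter


def eojeol_final_probability(sentences):
    """Estimate P(eojeol-final | character) from space-delimited sentences.

    Each Hangul syllable block is treated as one character (the strings
    are assumed to be in precomposed form).
    """
    total = Counter()   # occurrences of each character anywhere
    final = Counter()   # occurrences of each character in eojeol-final position
    for sentence in sentences:
        for eojeol in sentence.split():
            total.update(eojeol)       # count every character in the eojeol
            final[eojeol[-1]] += 1     # count the last character separately
    return {ch: final[ch] / total[ch] for ch in total}


# Toy corpus (illustrative only, far too small for real estimates).
toy_corpus = [
    "나는 학교에 갔다",
    "친구는 집에 있다",
    "나는 친구를 만났다",
]

probabilities = eojeol_final_probability(toy_corpus)
for ch in ("는", "에", "를", "나", "구"):
    print(ch, probabilities[ch])
```

In this toy corpus, particle characters such as 는 and 에 occur only in eojeol-final position, whereas stem characters such as 나 and 구 never do, which is the kind of asymmetry that could in principle serve as a boundary cue in unspaced text.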
While PPs in unspaced Korean texts mitigate the negative impact of space removal, characters inserted in place of spaces, either the same character or random characters, failed to show a similar modulating effect in Experiment 1. Rather, replacing spaces with the same or random characters (the SCI and RCI conditions) interfered with reading, and this disruptive effect was even stronger than the effect of deleting spaces. As PPs in Korean are represented by a small set of syllable characters, it seems to be the occurrence of this set of characters representing PP, rather than any character in place of spaces, that facilitates reading. Thus, frequencies of characters at the end of eojeols may be an important factor contributing to readers’ word segmentation, just as shown in Thai and Chinese. In a similar vein, Y. Lee et al. (2009) found that words whose last syllable is ambiguous as to whether it is part of the stem word or a case marker (e.g., cengchi
As mentioned in the “Introduction” section, the E-Z Reader model (Reichle et al., 1998) posits that interword spaces serve as a critical cue to guide eye movements during reading, and it explains well the disruptive effect of space removal in English reading (e.g., Rayner et al., 1998). The findings reported here can also be broadly explained by this model. When interword spaces were removed from Korean sentences, reading was generally impaired. However, the negative effects of space removal in Korean were not as severe as those in English. In Experiment 2 of Rayner et al. (1998), deleting spaces resulted in a decrease in reading rate by 55% (279 words per minute in spaced texts vs 118 words per minute in unspaced texts) and an increase in fixation duration by 36% (250 ms in spaced texts vs 340 ms in unspaced texts). In contrast, Experiment 1 of the present study, in which the properties of the stimuli were similar to those in Rayner et al., yielded an increase in total sentence reading times by 5% (3,022 ms in spaced texts vs 3,198 ms in unspaced texts) and in fixation duration by 6% (210 ms in spaced texts vs 223 ms in unspaced texts). In Experiment 2, total sentence reading times increased by 18% (2,149 ms in spaced texts vs 2,541 ms in unspaced texts) and fixation duration by 8% (211 ms in spaced texts vs 228 ms in unspaced texts). This indicates that an alternative cue can be exploited in the unusual situation in which interword spaces are removed from texts. As mentioned earlier, we argue that the PPs of eojeols are used as a segmentation cue in reading unspaced Korean. Interword spaces cannot be used as a segmentation cue in languages like Chinese, in which interword spaces are not used in normal texts. According to the Chinese Reading Model (CRM; Li & Pollatsek, 2020), a unified eye-movement control model for Chinese reading, word segmentation and identification occur simultaneously in Chinese reading, which means that linguistic features of each character can be used as segmentation cues as well as for word recognition.
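The relative changes quoted above for the spaced versus unspaced comparisons can be recomputed from the reported condition means with a few lines of Python. This is only an arithmetic sketch: the helper name is hypothetical, and because the inputs are the rounded means given in the text, the recomputed values may differ somewhat from the quoted percentages, which were presumably derived from unrounded data.

```python
def percent_change(spaced, unspaced):
    """Percentage change from the spaced to the unspaced condition."""
    return (unspaced - spaced) / spaced * 100.0


# (spaced mean, unspaced mean) as quoted in the text.
reported_means = {
    "Rayner et al. (1998), Exp. 2, reading rate (wpm)": (279, 118),
    "Rayner et al. (1998), Exp. 2, fixation duration (ms)": (250, 340),
    "Present study, Exp. 1, sentence reading time (ms)": (3022, 3198),
    "Present study, Exp. 1, fixation duration (ms)": (210, 223),
    "Present study, Exp. 2, sentence reading time (ms)": (2149, 2541),
    "Present study, Exp. 2, fixation duration (ms)": (211, 228),
}

for label, (spaced, unspaced) in reported_means.items():
    print(f"{label}: {percent_change(spaced, unspaced):+.1f}%")
```

A negative value indicates a decrease (as for reading rate), and a positive value an increase (as for reading times and fixation durations).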
Taken together, the results of the present study add to the models of eye-movement control in reading alphabetic languages such as English and shed light on the possibility of extending such models to account for reading in non-Roman script languages such as Korean. Many of the recent models agree on the crucial role of interword spaces in readers’ lexical processing and in their eye movements. While spaces serve as a salient visual cue to word boundaries in Korean as well, our findings demonstrate (1) that the disruptive effects of space removal are less severe in Korean than in English, (2) that spaces are not the only cue to word (or eojeol) segmentation, and (3) that characters representing PPs also guide readers’ eye-movement control to some extent. A model of eye-movement control in Korean sentence reading should therefore incorporate both word spacing and PP characters as contributors to eye-movement control during reading.
Acknowledgements
We are grateful to the LCBL members for creating stimuli and collecting data.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported in part by a grant from the National Research Foundation of Korea (NRF-2020S1A3A2A02103899).
