Abstract
Understanding reading is a central issue for psychology, with major societal implications. Over the past five decades, a simple letter-detection task has been used as a window on the psycholinguistic processes involved in reading. When readers are asked to read a text for comprehension while marking with a pencil all instances of a target letter, they miss some of the letters in a systematic way known as the missing-letter effect. In the current article, we review evidence from studies that have emphasized neuroimaging, eye movement, rapid serial visual presentation, and auditory passages. As we review, the missing-letter effect captures a wide variety of cognitive processes, including lexical activation, attention, and extraction of phrase structure. To account for the large set of findings generated by studies of the missing-letter effect, we advanced an attentional-disengagement model that is rooted in how attention is allocated to and disengaged from lexical items during reading, which we have recently shown applies equally to listening.
We are all familiar with sensory illusions in which our subjective experience does not match the physical reality we are trying to capture. For instance, consider Figure 1a. When asked to report which line is longer, ignoring the brackets, observers see the left line as being longer even though both lines are of equal length. Cognitive illusions are analogous to sensory illusions, but they call upon higher-order processes. An example can be found in Figure 1b. When adult readers are asked to read this sentence and circle all the fs (try this before reading further), between 85% and 90% of them miss the three fs at the ends of the word of (Read, 1983). This phenomenon, known as the missing-letter effect (MLE), is usually defined as a higher omission rate for target letters embedded in function than in content words and in frequent than in less frequent words.

Fig. 1. Two illusions, one perceptual and one cognitive. (See text for explanation.)
The Missing-Letter Effect: Simple to Observe Using Paper and Pencil
The earliest (Corcoran, 1966), simplest to administer, and most frequently used method for eliciting and studying the MLE is the paper-and-pencil method exemplified in Figure 2. The participant is given the dual task of reading a series of passages for comprehension while searching for (and circling) all detected instances of a target letter. A few comprehension questions are administered after each passage to reinforce the importance of the reading task. This powerful method stands out in modern cognitive psychology as both low-tech (no special equipment is required) and particularly easy to administer and score. The development of the stimuli (passages well designed to home in on the effect of interest without untoward confounds) is the main challenge.

Fig. 2. Illustration of the paper-and-pencil method for eliciting and studying the missing-letter effect (a) and the marked text as viewed during scoring (b). The correction grid illustrated in (b) is made of a piece of paper with apertures at the locations of words containing the target letter, both critical (e.g., fox, fur, for, fin, fog) and filler words (if, from), and is overlaid on the text. It is worth noting that in addition to the example provided in the figure, the missing-letter effect has been observed with a large variety of words, target letters, target-letter positions within the word, and languages including Arabic, Chinese, Dutch, English, French, German, Greek, and Hebrew.
The prototypical finding is a higher omission rate for the target letter (f in Fig. 2) when it is embedded in function than in content words (see Saint-Aubin & Poirier, 1997, for an illustration in French) and in frequent than in less frequent words (see Roy-Charland & Saint-Aubin, 2006), a finding also observed in proofreading (see Saint-Aubin, Losier, Roy, & Lawrence, 2015). Although the two factors usually covary (because function words are typically more frequent than content words), it is important to note that the influences of word function and word frequency have been carefully isolated in a number of studies by contrasting function and content words of the same frequency (e.g., at vs. it, with the target being the letter t) or words from the same grammatical class with different frequencies (e.g., cost vs. cyst, with the target being the letter t). As we will show, findings like these are to be expected because the MLE arises from the joint influence of attention, language, and reading processes (Roy-Charland, Saint-Aubin, Klein, & Lawrence, 2007).
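The scoring logic behind this prototypical finding can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the passage, the set of circled tokens, and the FUNCTION_WORDS list are hypothetical, and the function name score_passage is ours rather than part of any published scoring procedure. The critical and filler words are borrowed from the example in Figure 2.

```python
from collections import defaultdict

# Hypothetical function-word list, for this toy example only.
FUNCTION_WORDS = {"for", "if", "from", "of", "the"}

def score_passage(words, circled, target="f", critical=None):
    """Return (omission rate per word class, MLE magnitude).

    words    -- the passage as a list of word tokens
    circled  -- set of token indices in which the reader marked the target
    critical -- optional set of words to score; other target-containing
                words are treated as fillers and ignored
    """
    present = defaultdict(int)  # scored target-containing tokens per class
    missed = defaultdict(int)   # of those, tokens the reader left unmarked
    for i, word in enumerate(words):
        w = word.lower()
        if target not in w:
            continue
        if critical is not None and w not in critical:
            continue  # filler word: contains the target but is not scored
        cls = "function" if w in FUNCTION_WORDS else "content"
        present[cls] += 1
        if i not in circled:
            missed[cls] += 1
    rates = {cls: missed[cls] / present[cls] for cls in present}
    # MLE magnitude: function-word omissions minus content-word omissions.
    mle = rates.get("function", 0.0) - rates.get("content", 0.0)
    return rates, mle

# Hypothetical passage and hypothetical reader responses.
words = "the fox with red fur ran for the fog and a fin rose from it for a while".split()
critical = {"fox", "fur", "for", "fin", "fog"}
circled = {1, 4, 8, 11, 15}  # fox, fur, fog, fin, and the second "for"
rates, mle = score_passage(words, circled, target="f", critical=critical)
print(rates, mle)
```

In this made-up case the reader misses one of the two instances of for and none of the content words, so the omission rate is higher for the function word, which is the direction of the effect described above.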
One advantage of the paper-and-pencil method for exploring the MLE is that it is simple enough to administer in the classroom and to neuropsychological patients. Reflecting its validity and utility, the magnitude of the MLE is positively related to reading skill level in elementary school children (Saint-Aubin & Klein, 2008; Saint-Aubin, Klein, & Landry, 2005). Interestingly, in one study comparing agrammatic Broca’s and Wernicke’s aphasic patients, the Wernicke’s patients generated a normal-sized MLE, whereas the agrammatic Broca’s patients showed no difference in omissions between function and content words (Rosenberg, Zurif, Brownell, Garrett, & Bradley, 1985).
Searching Adds to Reading Without Disturbing It: Neuroimaging and Eye Monitoring
Reading, from an evolutionary perspective, has only recently been added to our behavioral repertoire. Consequently, it does not enjoy a dedicated genetically programmed brain system and is instead achieved through a network of interconnected visual and psycholinguistic modules (Klein & McMullen, 1999). Newman, Kenny, Saint-Aubin, and Klein (2013) conducted an fMRI study to explore the reading network and any additional regions recruited by the letter-search task performed while reading. Functional connectivity (illustrated in Fig. 3) was determined for reading, searching, and reading while searching. Figure 3 nicely shows that all the brain areas involved in reading are also involved in the reading-plus-searching task in which the MLE has been observed. This finding echoes earlier work using eye monitoring, which has shown that all of the eye-movement benchmark effects in reading are observed when readers are also looking for a target letter (Greenberg, Inhoff, & Weger, 2006; Roy-Charland et al., 2007); for example, function words are skipped more frequently than content words, and when fixated, their fixations are shorter. Similarly, frequent words and more predictable words are more likely to be skipped than less frequent words and less predictable words. Finally, fixation duration decreases across multiple readings, whereas the probability of skipping words increases.

Fig. 3. Functional-connectivity maps showing pairwise comparisons between reading, reading-while-searching, and searching conditions, including areas of significant overlap in functionally connected networks (conjunctions), as well as areas that showed significantly greater connectivity in one condition than another. Maps have been redrawn from data reported in Newman, Kenny, Saint-Aubin, and Klein (2013).
A Purely Bottom-Up (Skipping) Hypothesis Is Rejected: Rapid Serial Visual Presentation and Eye Monitoring
In the foundational article outlining the paper-and-pencil MLE test, Corcoran (1966) speculated that target letters were more likely to be missed in function than in content words, at least in part because letters would be harder to detect in skipped than in fixated words. Healy, Oliver, and McNamara (1987) tested this hypothesis by using a rapid-serial-visual-presentation (RSVP) procedure to present the text. With this procedure, words are displayed one at a time at the center of the screen for a fixed duration. Consequently, each word will be fixated during its entire presentation (Saint-Aubin, Kenny, & Roy-Charland, 2010). When Healy et al. (1987) first used this methodology, they did not find an MLE (see the gray filled squares in Fig. 4a). Because every word is fixated in this procedure, Hadley and Healy (1991) later used this finding to support a purely bottom-up account that emphasized skipping. When a fuller set of exposure durations has been used, however, including durations approximating those characteristic of normal reading, a large MLE has been consistently observed (see the remaining data points in Fig. 4a). As would be expected, the overall rate of omissions decreases as exposure duration increases, from about 50% to about 1% (see Fig. 4b) over the range of frame rates tested in the literature. Importantly, so long as misses are occurring at a substantial rate, despite the equivalent exposure durations, target letters are missed more often in function words than in content words.
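For readers unfamiliar with RSVP, the timing of such a presentation can be sketched as follows. This is a minimal sketch of the schedule only (it does not draw anything on a screen), and the function name rsvp_schedule, the sample sentence, and the 250-ms frame duration are assumptions chosen for illustration, not values taken from the studies cited above.

```python
def rsvp_schedule(text, frame_ms):
    """Return (word, onset_ms, offset_ms) for each word, shown one at a time
    at the same location for a fixed frame duration."""
    if frame_ms <= 0:
        raise ValueError("frame_ms must be positive")
    schedule = []
    for i, word in enumerate(text.split()):
        onset = i * frame_ms
        schedule.append((word, onset, onset + frame_ms))
    return schedule

frame_ms = 250  # hypothetical frame duration
frames = rsvp_schedule("the fox ran for the fog", frame_ms)
words_per_minute = 60_000 / frame_ms  # presentation rate implied by the frame duration

for word, onset, offset in frames:
    print(f"{onset:5d}-{offset:5d} ms  {word}")
```

Because every word receives the same frame duration regardless of its class or frequency, any difference in omission rates between function and content words cannot be attributed to differences in exposure.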

Fig. 4. The magnitude of the missing-letter effect (omissions from frequent function words minus omissions from less frequent content words; a) and average omission rates (b) as a function of frame duration from studies using rapid serial visual presentation. The French words son, pour/cour, and des in the key refer to the data from different passages used in Experiment 1 of Saint-Aubin, Klein, and Roy-Charland (2003).
Converging evidence that this difference, with equivalent “bottom-up” information controlled by the RSVP procedure, is caused by the same cognitive processes that are operating during normal reading (while searching) comes from two sources. First, for each target-containing word in a passage, we compared the omission rates for participants who read while searching in the RSVP paradigm with those obtained from a different group of participants using the paper-and-pencil version of the task (as illustrated in Fig. 2). The item-based correlation between both tasks was as good as could be expected given the reliability of each task (corrected r = .95; Saint-Aubin & Klein, 2004). Second, and more directly, we (Saint-Aubin & Klein, 2001, Experiment 5) monitored participants’ eye movements while they read a passage on a computer monitor and signaled the detection of target letters by pressing a response key (see Fig. 5). Not surprisingly, we found that function words (the and for) were skipped more frequently (31%) than content words (e.g., tie and fog; 13.5%) and that the omission rate was higher for skipped (61%) than for fixated (41%) words. Most importantly, in this experiment and in two subsequent ones, we observed a large MLE with both the fixated and skipped words (see Table 1). That there was an MLE for both skipped and fixated words converged with the results from the RSVP experiments in definitively rejecting the proposal that the MLE is caused by skipping.
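An item-based correlation of the kind mentioned above, together with a correction for the unreliability of each task, might be computed as in the sketch below. We assume here that "corrected" refers to Spearman's classic correction for attenuation; the text above does not specify the formula, so this is an assumption. The per-word omission rates and the reliability values are invented for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of item scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def disattenuate(r_xy, rel_x, rel_y):
    """Spearman's correction for attenuation: the observed correlation
    divided by the geometric mean of the two reliabilities."""
    return r_xy / sqrt(rel_x * rel_y)

# Hypothetical omission rates for the same five target-containing words,
# each list contributed by a different group of participants.
rsvp_rates = [0.10, 0.20, 0.40, 0.60, 0.30]
paper_rates = [0.15, 0.25, 0.35, 0.65, 0.20]

r_obs = pearson_r(rsvp_rates, paper_rates)
# Hypothetical reliabilities (e.g., split-half) for each task.
r_corrected = disattenuate(r_obs, rel_x=0.95, rel_y=0.98)
print(round(r_obs, 3), round(r_corrected, 3))
```

The unit of analysis is the word, not the participant: each pair of values describes how often one target-containing word produced an omission in each task.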

Fig. 5. An illustration of how Saint-Aubin and Klein (2001) used eye-monitoring data to determine whether words were skipped or fixated and whether targets were detected or missed. Open circles represent fixations, arrows represent eye movements, and solid squares represent the position of the eyes when a button press signaling a target-letter detection was made. Reprinted from “Influence of Parafoveal Processing on the Missing-Letter Effect,” by J. Saint-Aubin and R. M. Klein, 2001, Journal of Experimental Psychology: Human Perception and Performance, 27, p. 330. Copyright 2001 by the American Psychological Association. Reprinted with permission.
Table 1. Omission Rates for Target Letters Embedded in Fixated and Skipped Words From Three Studies Using the Methods Illustrated in Figure 5
Note: The function words were either the or for in Experiment 5 of Saint-Aubin and Klein (2001), and the corresponding target letters were t and f. In Roy-Charland, Saint-Aubin, Klein, and Lawrence (2007) and Roy-Charland, Saint-Aubin, Lawrence, and Klein (2009), Francophone participants read passages in French in which the function word was des and the target letter was d. In all the studies illustrated here, the content words were 3 letters long and the target letter was in the first position (e.g., as in ten, fin, and dit).
Attentional Disengagement: Integration of Bottom-Up and Top-Down Accounts
After Corcoran’s (1966) initial work, the field was dominated by the bottom-up camp of Healy and colleagues, whose series of studies, which began in 1976, were summarized in 1994 (Healy, 1994). Readers of these studies will see a primarily bottom-up account of the MLE, which evolved from one emphasizing unitization of more frequent and more familiar letter strings and groups of words to one emphasizing the relative processing time of the target letters and the words containing them (e.g., see Moravcsik & Healy, 1995). According to this view, readers miss more target letters in function than in content words because function words are more frequent. Healy and colleagues assumed that completion of processing at the word level leads to interruption of processing at the letter level and that, because of their higher frequency, function words are more likely to be identified as a whole before their constituent letters are fully processed than are less frequent content words.
This view was more or less unchallenged until 1991, when Koriat and Greenberg published a series of three articles in which they advanced the structural account: a purely top-down hypothesis (Greenberg & Koriat, 1991; Koriat & Greenberg, 1991; Koriat, Greenberg, & Goldshmid, 1991). According to the structural account, target letters would be more likely to be omitted in function than in content words because reading is driven by a mental representation of the upcoming text structure. This structure would be used to support meaning integration of content words and as a result would rapidly conceal function words. In short, within the structural account, word function and not word frequency is the critical factor mediating the MLE.
After a decade of disagreement, it became clear that neither a purely bottom-up view nor a purely top-down view could provide a satisfactory account of the MLE because both word frequency and grammatical structure contribute to it. In 2001, after providing data showing contributions from both kinds of processes, we suggested that both accounts might be integrated (Saint-Aubin & Klein, 2001). In 2004, the opposing sides teamed up and proposed such an integrative account, called the guidance-organization (GO) model (Greenberg, Healy, Koriat, & Kreiner, 2004). The interruption assumption, upon which this account and its predecessor (the unitization and processing-time account of Healy and colleagues) depend, implies that letter-detection response times must be faster in the conditions generating the most omissions. When we directly tested this prediction (Roy-Charland et al., 2007; Roy-Charland, Saint-Aubin, Lawrence, & Klein, 2009; Saint-Aubin, Klein, & Roy-Charland, 2003), we found the opposite pattern: Detection response speed was slower in the conditions generating the most omissions.
To accommodate this pattern of response times, we developed the attentional-disengagement (AD) model. One strength of this model is that, similar to the E-Z Reader model of Reichle, Rayner, and Pollatsek (2003), it is rooted in how attentional processes are allocated to and disengaged from lexical items during reading (Roy-Charland et al., 2007). Another strength of the AD model is its explicitness about how the search task is performed and how it is affected by these attentional states. The AD model assumes that attention is driven by reading and comprehension processes, leaving the detection process in the passenger seat. Incorporating ideas from both the bottom-up and top-down accounts of the MLE in reading and comprehension, attention would be disengaged faster from function than from content words because the former are easier to predict, carry less semantic information, and are usually more frequent, allowing faster lexical access. Because lexical access is faster for more frequent words, miss rates are higher for more frequent words within each word class (i.e., for both function and content words). In addition, the typical distinction between overt and covert attention can be applied, with skips during reading representing an overt shift of attention and with attentional disengagement prior to skipping (or during RSVP, when there are no eye movements) representing a covert shift of attention.
Whereas the GO model also incorporates these factors, it incorrectly assumes that interruption of processing at the letter level causes truncation of the distribution of target-detection responses, thus increasing misses and generating faster mean responses for hits. In contrast, in the AD model, faster disengagement results in weaker target signals, which entail a higher rate of omissions and slower responses when weaker targets are detected. The AD model predicts that (a) omissions and eye movements should be influenced by the same factors; (b) an MLE should be possible for targets other than letters (e.g., whole words); (c) like omission rates, target-detection reaction time should decrease as engagement time increases; and (d) an effect similar to an MLE should be observed when listening to passages rather than reading them.
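The opposite response-time predictions of the two accounts can be made concrete with a deliberately simple toy contrast. It is not an implementation of either the GO model or the AD model: the function names, the grids of finishing times and noise levels, the cutoffs, and the signal strengths are all hypothetical, and the only aim is to show how truncation and weaker signals push mean hit latencies in opposite directions while both raise the miss rate.

```python
def truncation_account(cutoff_ms, finish_times_ms):
    """Interruption view: letter processing that has not finished by
    cutoff_ms is cut off, and the target is missed."""
    hits = [t for t in finish_times_ms if t <= cutoff_ms]
    miss_rate = 1 - len(hits) / len(finish_times_ms)
    mean_hit_rt = sum(hits) / len(hits) if hits else None
    return miss_rate, mean_hit_rt

def disengagement_account(signal, noise_levels, base_ms=300, scale_ms=1000):
    """Signal-strength view: earlier disengagement leaves a weaker signal;
    weak evidence is detected less often and, when detected, more slowly."""
    evidence = [signal - n for n in noise_levels if signal - n >= 1]
    miss_rate = 1 - len(evidence) / len(noise_levels)
    if not evidence:
        return miss_rate, None
    # Toy assumption: latency is inversely related to the evidence strength.
    mean_hit_rt = sum(base_ms + scale_ms / e for e in evidence) / len(evidence)
    return miss_rate, mean_hit_rt

FINISH_TIMES = list(range(300, 801, 50))  # hypothetical letter-finishing times (ms)
NOISE = list(range(10))                   # hypothetical noise levels

# Function words: earlier interruption (truncation) or weaker signal (disengagement).
trunc_content = truncation_account(800, FINISH_TIMES)
trunc_function = truncation_account(500, FINISH_TIMES)
ad_content = disengagement_account(10, NOISE)
ad_function = disengagement_account(5, NOISE)

print("truncation    content:", trunc_content, " function:", trunc_function)
print("disengagement content:", ad_content, " function:", ad_function)
```

In both toy accounts the function-word condition produces more misses, but only the truncation account makes the surviving hits faster; the signal-strength account makes them slower, which is the pattern reported above.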
Thus, instead of the MLE being a by-product of eye movements, both the MLE and eye movements are driven by similar psycholinguistic and attentional processes during reading. As such, the MLE can be used as an easy-to-administer and low-cost technique for exploring these processes. This is well illustrated by data we obtained on which words readers expected for each word slot in the passages we administered in our later MLE studies (the proportion of readers expecting a given word is referred to as its cloze probability). In the eye-movement-monitoring literature, it is well known that predictable items are more likely to be skipped than unpredictable items. For instance, in the sentence “Since the wedding day was today, the baker rushed the wedding cake/pies to the reception,” the predictable word cake was more likely to be skipped than the unpredictable but still sensible word pies (Balota, Pollatsek, & Rayner, 1985). Using cloze probability norms developed with an innovative Web-based procedure (see Saint-Aubin et al., 2005, Experiment 3), we predicted omission and skipping rates (Roy-Charland et al., 2007). Results revealed that the probability of expecting a function word, be it the actual word or not, predicted omission and skipping rates over and above the actual role of the word. What is more, the contribution of the actual role of the word completely vanished when the probability of expecting a function word was factored in.
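The two predictability measures contrasted here (expecting the exact word versus expecting any function word) can be derived from the same set of guesses, as the following sketch shows. The guesses, the FUNCTION_WORDS list, and the function name cloze_measures are hypothetical and serve only to illustrate the distinction; they do not reproduce the norming procedure used in the studies cited above.

```python
# Hypothetical function-word list, for this toy example only.
FUNCTION_WORDS = {"the", "a", "of", "for", "to", "his"}

def cloze_measures(guesses, actual_word):
    """Return (P(exact word), P(any function word)) for one word slot,
    given the words that respondents guessed for that slot."""
    if not guesses:
        raise ValueError("need at least one guess")
    g = [x.lower() for x in guesses]
    p_exact = g.count(actual_word.lower()) / len(g)
    p_function = sum(x in FUNCTION_WORDS for x in g) / len(g)
    return p_exact, p_function

# Five hypothetical respondents guessing the next word of a sentence
# whose actual next word is "the".
guesses = ["the", "a", "the", "his", "cake"]
p_exact, p_function = cloze_measures(guesses, "the")
print(p_exact, p_function)
```

In this invented slot, fewer than half of the respondents guess the exact word, yet most of them expect some function word; it is this second, class-level expectation that the results above identify as the better predictor.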
The discovery of powerful, common top-down processes driving eye movements and the MLE—such as the on-line development of a tentative structure of a sentence, which leads to expectations about the role of the upcoming words—suggests that any target in the area occupied by an anticipated function word should be frequently missed (e.g., see Koriat & Greenberg, 1991). We tested this hypothesis by asking readers to detect a word instead of a letter. For instance, the polysemous French word son can assume either a content role, meaning “sound,” or a function role, serving as a possessive adjective meaning “his.” Results revealed that readers missed more instances of son when it assumed a function role, a result that was reproduced with other words (Saint-Aubin et al., 2003).
An Analogous Missing-Phoneme Effect
In addition to being involved in eye-movement control, the cognitive processes highlighted by the MLE are also involved in oral comprehension. In effect, similar psycholinguistic processes operate during oral comprehension and reading, no doubt because reading, aside from the necessity for print-to-sound and print-to-meaning conversions, is built upon oral comprehension skills. In a recent study, we asked one group of participants to read texts for comprehension while searching for a target letter and another group to listen to narrations of the same texts while listening for the corresponding target letter’s phoneme (Saint-Aubin, Klein, Babineau, Christie, & Gow, 2016). In the listening task, we observed a large “missing-phoneme effect”: Participants missed more phonemes embedded in function than in content words, even after we controlled for acoustic factors that covaried with word class. Furthermore, as predicted by a common attentional-disengagement process, the item-based correlation between the letter- and the phoneme-detection tasks was high (see Fig. 6).

Fig. 6. Scatter plot illustrating omission rates for target letters during the reading of a passage (x-axis) and for corresponding target phonemes during listening to the same passage (y-axis). Each dot represents the two omission rates for the same target-containing word contributed by different groups of participants who read and listened, respectively. Reprinted from “The Missing-Phoneme Effect in Aural Prose Comprehension,” by J. Saint-Aubin, R. M. Klein, M. Babineau, J. Christie, and D. W. Gow, 2016, Psychological Science, 27, p. 1023. Copyright 2016 by the Association for Psychological Science.
Conclusion
In sum, for five decades, through the development of successive models, the MLE has been used as a window illuminating a wide range of cognitive factors involved in reading. These include visual factors such as eye-movement patterns and letter position within a word; lexical and syntactic factors such as word frequency, word function, and expectations; and reading-specific factors such as text familiarity, reading skill, and reading development. The influence of all these factors reflects the involvement of both top-down and bottom-up processes. In a nutshell, the MLE arises at the confluence of attentional and psycholinguistic processes. The wealth of the underlying processes it can be used to study, combined with the ease of use of the letter-detection task, makes the MLE a valuable tool for the study of cognitive processes in reading.
Footnotes
Declaration of Conflicting Interests
The authors declared that they had no conflicts of interest with respect to their authorship or the publication of this article.
Funding
Each contributor has been supported by grants from the Natural Sciences and Engineering Research Council of Canada Discovery Grants Program.
