Abstract
We present an eye-tracking experiment examining moment-to-moment processes underlying the comprehension of emoticons. Younger (18–30) and older (65+) participants had their eye movements recorded while reading scenarios containing comments that were ambiguous between literal and sarcastic interpretations (e.g., But you’re so quick though). Comments were accompanied by wink emoticons or full stops. Results showed that participants read earlier parts of the wink scenarios faster than those with full stops, but then spent more time reading the text surrounding the emoticon. Thus, readers moved more quickly to the end of the text when there was a device that may aid interpretation but then spent more time processing the conflict between the superficially positive nature of the comment and the tone implied by the emoticon. Interestingly, the wink increased the likelihood of a sarcastic interpretation in younger adults only, suggesting that perceiver-related factors play an important role in emoticon interpretation.
Advancements in technology have seen an increase in the use of computer-mediated communication, where people interact with one another online, for example, through email and social networking services (Kiesler et al., 1984). As the majority of these interactions are text-based, people use various textual devices such as punctuation (e.g., …, !!!), emoticons (e.g., :-), :-(, or ;-)), or emojis, to enhance the meaning of their messages and to express their thoughts and emotions (Derks et al., 2008; Thompson & Filik, 2016; Weissman & Tanner, 2018). Because previous research has suggested that the winking face emoticon is one of the most common devices to accompany ironic/sarcastic sentences (Thompson & Filik, 2016), in this paper we will focus on the wink emoticon. Specifically, we use eye-tracking during reading to investigate how this device is processed in real-time with comments that are ambiguous between a literal and a sarcastic interpretation, as well as examining its influence on interpretation, and the relationship between reading behaviour and interpretation. We also investigate whether certain perceiver-related factors, specifically, age, personal tendency to use sarcasm, and use of the internet, social media, and emoticons, have an influence on interpretation.
The comprehension of irony and sarcasm
Irony can be defined as a form of non-literal language that involves someone expressing one thing when they actually mean the opposite (Grice, 1975). Ironic language is used frequently in online communication, with research indicating that 7% of emails sent to friends (Whalen et al., 2009) and 73% of blog entries (Whalen et al., 2012) contain some form of verbal irony (with the latter study reporting an average of around two ironic utterances per entry). A common form of irony is ironic criticism or sarcasm, where people express a positive statement to convey a negative meaning, to target individuals and chastise them about their behaviour, for example, saying, You’re such an amazing cook!, to an individual who has burnt their dinner (Dews et al., 1995; Kreuz & Glucksberg, 1989). By talking in this way, the speaker is able to highlight a failed expectation, that is, the fact that events have unfolded in unexpected or undesirable ways (Pexman, 2008). Thus, the use of irony may serve some communicative function that would not be achieved by speaking directly, such as eliciting a particular emotional response in the recipient of the comment (e.g., Colston, 2007; Dews et al., 1995, see Pickering et al., 2018, for a recent overview). From this, it is clear that irony is a frequently used and fascinating form of language, with many subtle and complex functions.
A number of theoretical accounts have been put forward to explain how readers and listeners process and understand ironic comments. The extent to which these theories allow for contextual factors to influence initial processing varies. Specifically, some accounts state that the literal (or salient, which is often literal) meaning (or interpretation) is initially accessed (or constructed) regardless of contextual cues (e.g., Giora, 1997, 2003; Grice, 1975). Following these accounts, extra steps are involved in detecting a mismatch with context and reanalysing the utterance as being sarcastic. However, more interactive accounts suggest that given sufficiently strong context, the sarcastic interpretation can be arrived at without initial recourse to the literal meaning (e.g., Gibbs, 1994, 2002); and that many factors (or “constraints”) can influence this process (e.g., Pexman, 2008). Whereas these accounts have not been explicitly developed to make predictions regarding the influence of textual devices such as emoticons, any influence on online processing would be more readily accommodated by interactive accounts which allow for a wide range of factors (or “constraints”) to influence processing (e.g., Pexman, 2008), than by modular accounts which do not allow contextual factors to influence initial processing (e.g., Giora, 1997, 2003; Grice, 1975).
The role of emoticons in sarcasm comprehension
The role of emoticons in sarcasm comprehension has been investigated by researchers for over a decade. The findings suggest that the wink emoticon is used to signify sarcasm by introducing ambiguity; the wink highlights the discrepancy between the text-based context and the visual cue, which is the emoticon itself (Oleszkiewicz et al., 2017; Rezabek & Cochenour, 1998). Some researchers also suggest that the wink emoticon signifies sarcasm by highlighting the discrepancy between the smiling face, which suggests positivity, and the winking eye, which suggests there is an additional hidden meaning behind the message (Derks et al., 2008).
One of the earliest experiments to investigate the relationship between emoticons and sarcasm comprehension was conducted by Walther and D’Addario (2001). They investigated how participants interpreted ambiguous positive and ambiguous negative messages when they were accompanied by a smiley face, sad face, winking face, or no emoticon. Participants read the ambiguous messages and following each message they completed a questionnaire which asked questions about the writer’s attitude, the writer’s affect, how easy the message was to understand, the sincerity of the writer, the ambiguity of the message, and the emotions portrayed by the writer. The results showed that positive messages accompanied by the winking face were rated as the most sarcastic. However, the positive messages accompanied by the winking face were not significantly more sarcastic than the positive messages accompanied by the smiley face, sad face, or no emoticon; suggesting that emoticons in general make ambiguous messages seem sarcastic, and not the winking face itself.
Contrary to this, other researchers have suggested that it is the wink emoticon itself that makes messages appear more sarcastic. For instance, Derks et al. (2008) found that participants rated emails containing the wink emoticon as significantly more sarcastic than emails that contained no emoticon. In support of this, Filik et al. (2016, Experiment 2) found that ambiguous comments accompanied by the wink emoticon were rated as more sarcastic than comments accompanied by an ellipsis (…) or a full stop. In terms of language production, Thompson and Filik (2016) asked participants to edit the final sentence of written conversations (Experiment 1) or write the final sentence of written conversations (Experiment 2), to make them sound literal or sarcastic. They found that emoticons were used significantly more often when participants were making the final sentence sound sarcastic compared with literal, and that the wink emoticon and tongue-face emoticon (:-p) were used significantly more often than other emoticons to indicate sarcasm. Taken together, these findings suggest that the wink emoticon is a useful tool for enabling people to convey that they are being sarcastic.
Individual differences in sarcasm comprehension
It is clear that sarcasm is a relatively complex form of communication (Channon et al., 2005), which involves the successful integration of a number of pragmatic and contextual cues to interpretation. Consequently, some researchers suggest that people who use sarcasm regularly are more sensitive to sarcastic utterances and better able to interpret the intended meaning. For instance, Ivanko et al. (2004, Experiment 1) found that participants’ use of ironic sentences in a production task, and their interpretation of ironic criticisms and ironic compliments, was predicted by their overall use of sarcasm (as measured by the Sarcasm Self-Report Scale; SSS). However, eye-tracking research by Kaakinen et al. (2014, Experiment 2) did not find a relationship between participants’ processing of irony and their use of sarcasm (also measured by the SSS); although the authors did suggest the differences in findings could be due to them using a different task to Ivanko et al. (2004), who used a word-by-word paradigm, or due to the Finnish translation of the English SSS scale. We will further investigate this issue in the current study by assessing participants’ own personal use of sarcasm (using the SSS, to be consistent with previous research).
Figurative language processing and ageing
Another individual difference which may have important consequences for the processing and interpretation of irony (and figurative language more generally) is that of age. For example, research conducted by Newsome and Glucksberg (2002) investigated the processing of metaphors in younger and older adults. They found that the older adults comprehended metaphors more slowly than the younger adults. However, the older adults were still able to understand the metaphors, suggesting there are no age-related deficits in metaphor comprehension. In addition, Skalicky and Crossley (2019) investigated the processing of satire in younger and older adults. Participants read satirical and non-satirical headlines from The New York Times, and the results showed a difference in the reading times of younger and older adults, with increased age corresponding to longer reading times for the satirical, compared with the non-satirical, headlines.
Furthermore, Uekermann et al. (2008) investigated the processing of proverbs in younger and older adults and found impaired proverb comprehension in the older adults. The older adults also had impaired executive functions compared with the younger adults, as they demonstrated inhibition impairments and reduced working memory. The researchers suggested the results demonstrated age-related deficits in proverb comprehension, and that these deficits were related to the older adults’ reduced executive skills.
In terms of sarcasm, specifically, Phillips et al. (2015) investigated the interpretation of sarcasm in younger and older adults. Participants watched videos in which sincere or sarcastic exchanges were made between two people, and following each video participants were asked questions about what one of the people in the video was thinking, feeling, and doing, and what meaning (e.g., literal or sarcastic) they were trying to communicate to the other person. Participants also read stories which contained sarcastic utterances or control sentences and following each story participants were asked a question about which meaning (e.g., literal or sarcastic) the main character was trying to communicate. The results for the video task showed no age differences for the sincere exchanges; however, there were age-related deficits in the interpretation of the sarcastic exchanges. For the verbal task, there were no age differences for the control sentences; however, there were age-related deficits in the interpretation of the sarcastic utterances. Specifically, the results demonstrated that older adults were more likely to interpret statements literally, rather than sarcastically.
Effects of ageing on eye movements in reading
As we are using eye-tracking during reading to investigate the issues outlined above, it is important to consider the effects of ageing on eye movements during reading more generally. Research has shown that older adults read more slowly, make longer and more frequent eye fixations, more regressive (backward) eye movements, and longer saccades (as they skip words more frequently), compared with younger adults (e.g., Kemper et al., 2004; Kemper & Liu, 2007; Kemper & McDowd, 2006; Kliegl et al., 2004; McGowan et al., 2014, 2015; Paterson et al., 2013; Rayner et al., 2006).
These differences in reading behaviour between older and younger adults could be due to changes in the visual system. For instance, Kerber et al. (2006) found a significant correlation between age and a decline in oculomotor measures. In addition, research has found age-related declines in contrast sensitivity (for a review, see Owsley, 2011; Owsley et al., 1983; Schefrin et al., 1999).
Aims and hypotheses
Most previous research examining the relationship between emoticons and sarcasm comprehension has relied heavily on participants completing rating tasks, with no studies to date (to the authors’ knowledge) investigating the moment-to-moment processes that occur during normal reading using eye-tracking methodology. Therefore, our aim is to examine reading behaviour in both younger (18–30) and older (65+) adults when they encounter comments that are ambiguous between a literal and a sarcastic interpretation and are followed by either a wink emoticon or a full stop (see Table 1 for an example). Participants’ ultimate interpretations of the comments, as well as the relationship between reading behaviour and interpretation, will also be assessed. In addition, we will investigate whether certain perceiver-related factors, specifically, personal tendency to use sarcasm (as assessed by the SSS), and use of the internet, social media, and emoticons, also have an influence on interpretation.
Table 1. Example experimental item and filler item.
Note. Forward slashes and bold text represent analysis regions. Neither of these was visible to participants.
Based on earlier research (e.g., Derks et al., 2008; Filik et al., 2016; Thompson & Filik, 2016), we expect the ambiguous comments accompanied by a wink emoticon to be interpreted as more sarcastic than the ambiguous comments accompanied by a full stop, at least for the younger adult participants. In terms of individual differences, we expect the younger adults to interpret more of the ambiguous comments sarcastically compared with the older adults, as previous research has suggested that older adults are more likely to interpret comments literally rather than sarcastically (Phillips et al., 2015). We anticipate a positive correlation between participants’ self-reported use of sarcasm and their inclination to interpret the ambiguous comments sarcastically (Ivanko et al., 2004). In addition, we expect that younger adults will report using emoticons more than older adults, as previous research has shown that younger adults use emoticons more frequently than older adults (López-Santamaría et al., 2019; Oleszkiewicz et al., 2017; Prada et al., 2018; Spina, 2019). We also expect the younger adults to report using the internet/social media more than the older adults, due to age being negatively correlated with the use of technology and social networking services (Prada et al., 2018).
As this is the first experiment (to our knowledge) to examine eye movement behaviour during reading of ambiguous comments accompanied by emoticons, our hypotheses in relation to reading times are necessarily more speculative. One possibility is that ambiguous comments accompanied by a wink emoticon will have longer reading times compared with ambiguous comments accompanied by a full stop. This is due to the wink emoticon creating a discrepancy between the text-based context (i.e., a superficially positive comment) and the visual cue of the emoticon, which may indicate a non-serious tone. Thus, participants may require additional time to interpret this visual cue to possible sarcasm, as well as to suppress the literal interpretation (Oleszkiewicz et al., 2017; Rezabek & Cochenour, 1998). It should also be noted that we do expect general differences in reading behaviour between the younger and older adults, such as overall longer reading times for older adults (e.g., Choi et al., 2017; Kemper et al., 2004; McGowan et al., 2014, 2015; Rayner et al., 2006).
In terms of the relationship between reading behaviour and interpretation, we would predict a positive correlation between reading times and the likelihood of reaching a sarcastic interpretation. This would follow both from theoretical accounts which state that comprehending sarcasm involves extra (time-consuming) steps over literal interpretation when the context does not provide strong cues to a sarcastic interpretation (e.g., Gibbs, 1994, 2002; Giora, 1997; Grice, 1975; Pexman, 2008), and from previous empirical work using eye-tracking which has demonstrated longer reading times for sarcastic than literal comments (e.g., Filik et al., 2014; Filik & Moxey, 2010; Kaakinen et al., 2014; Olkoniemi et al., 2016; Țurcan & Filik, 2016, 2017). Importantly, these previous eye-tracking studies have typically examined comments for which the context strongly supports either a literal or sarcastic interpretation. A further novel aspect of the current research is that the target comments are more ambiguous (which may also often be the case in real life).
Method
Participants
There were 56 participants in total. Twenty-eight younger adults, aged between 20 and 27 years (M = 23.46, standard deviation (SD) = 2.19, 16 females, 12 males), and 28 older adults, aged between 65 and 76 years (M = 68.61, SD = 2.73, 17 females, 11 males) received an inconvenience allowance (a monetary award) for taking part in the experiment. The younger adults were recruited through posters placed around the University of Nottingham campus, and the older adults were contacted from the School of Psychology, University of Nottingham volunteer database. All participants were native English speakers, had normal or corrected-to-normal vision (11 younger adults and 24 older adults wore glasses), and were not diagnosed with any reading impairments such as dyslexia. The older adults attended an educational setting for a mean of 15.11 years (SD = 3.74), and for the younger adults this was 17.93 years (SD = 2.00), t(54) = 3.52, p < .001, Cohen’s d = 0.94. The older adults reported reading for a mean of 11.71 hr per week (SD = 8.07), and for the younger adults this was 19.36 hr per week (SD = 11.40), t(54) = 2.90, p < .01, Cohen’s d = 0.77.
To ensure that our older adult participants did not suffer from visual or cognitive impairments beyond what might be expected as a result of normal ageing, a number of standard visual and cognitive tests were administered. Specifically, participants’ visual acuity was assessed using an Early Treatment Diabetic Retinopathy Study (ETDRS) chart (Ferris & Bailey, 1996), and participants’ contrast sensitivity was assessed using a Pelli–Robson chart (Pelli et al., 1988), at a viewing distance of 40 cm. Older adults had lower visual acuity (89% of older adults at 0.3 logMAR or better; Snellen = 20/40 or better) compared with the younger adults (100% of younger adults at 0.1 logMAR or better; Snellen = 20/25 or better), and lower contrast sensitivity (85.7% of older adults at 0.4 logMAR or better; Snellen = 20/50 or better) compared with the younger adults (100% of younger adults at 0.3 logMAR or better; Snellen = 20/40 or better), which is typical for the age groups tested (Owsley, 2011).
Participants’ reading speed was assessed through a Radner reading chart (Radner & Diendorfer, 2014). Younger adults could read more words per minute (max reading speed: M = 248.86 wpm, SD = 29.76 wpm; mean reading speed: M = 199.59 wpm, SD = 23.54 wpm) than the older adults (max reading speed: M = 218.08 wpm, SD = 28.91 wpm; mean reading speed: M = 173.64 wpm, SD = 21.94 wpm); max reading speed: t(54) = 3.93, p < .001, Cohen’s d = 1.05; mean reading speed: t(54) = 4.27, p < .001, Cohen’s d = 1.14, which is also typical for the age groups tested (Akutsu et al., 1991; Liu et al., 2017).
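The effect sizes reported for the group comparisons can be checked from the summary statistics alone. The following is a minimal Python sketch, assuming that Cohen’s d was computed with the pooled standard deviation of two equal-sized groups (the text does not state which formula was used); the function names are ours, for illustration only. It uses the maximum reading speed values given above.

```python
import math

def cohens_d(m1, sd1, m2, sd2):
    """Cohen's d with a pooled SD, assuming two groups of equal size."""
    pooled_sd = math.sqrt((sd1 ** 2 + sd2 ** 2) / 2)
    return (m1 - m2) / pooled_sd

def t_from_d(d, n_per_group):
    """Independent-samples t implied by d for two groups of n_per_group each."""
    return d * math.sqrt(n_per_group / 2)

# Maximum reading speed (wpm): younger vs. older adults, 28 per group.
d_max = cohens_d(248.86, 29.76, 218.08, 28.91)
t_max = t_from_d(d_max, 28)
print(round(d_max, 2), round(t_max, 2))
```

Under this assumption, the recomputed values agree with the reported d and t to two decimal places.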
Finally, participants’ cognitive abilities were assessed using the Mini-Mental State Examination (MMSE; Folstein et al., 1975). Two points (hospital, floor) from the orientation section were removed; therefore, the maximum score was 28. Importantly, all participants scored within the range of having no cognitive impairments (older adults: M = 26.11, SD = 1.55; younger adults: M = 27.11, SD = 0.83). The younger adults scored higher on the MMSE than the older adults, t(54) = 3.01, p < .01, Cohen’s d = 0.80, but this finding is consistent with the literature demonstrating age-associated cognitive decline (Deary et al., 2009; Harada et al., 2013; Murman, 2015).
Materials and design
Twenty-eight experimental items taken from Filik et al.’s (2016, Experiment 2) study were modified for the current experiment (see Table 1 for an example). Target sentences were all superficially positive (e.g., But you’re so quick though) and were designed to be ambiguous between a literal and a sarcastic interpretation. When such a comment is interpreted literally, it remains positive (i.e., an instance of praise). However, when a superficially positive comment is interpreted sarcastically, it becomes negative (i.e., criticism, meaning that the person is not quick).
The 28 experimental items were interspersed with 35 “filler” items. The filler items followed the same structure as the experimental items, but only 15 of them were superficially positive (e.g., Your singing was amazing). The remaining 20 were superficially negative (e.g., You’re such a terrible cook) to add variety. When a negative comment is interpreted literally, it remains negative (i.e., criticism), but when a negative comment is interpreted sarcastically, it becomes positive (i.e., praise). The filler items also contained a mixture of emoticons and punctuation devices, such as ellipses (…), exclamation marks (!, !!!), sad face :-(, shocked face :-o, and tongue face :-p, to prevent the participants from noticing that the experiment was investigating the wink emoticon.
Each experimental item and filler item illustrated a different conversation that occurred online between different people, referred to as Person A and Person B in each case; thus, each item had a distinct topic of conversation. The items all followed the same format, with Person A making the first remark, Person B responding, Person A replying with an ambiguous comment aimed at Person B (which could be interpreted sarcastically or literally), and Person B delivering the final remark. Each of the 28 experimental items was followed by a question that was designed to examine whether the participants had interpreted the ambiguous comment (e.g., But you’re so quick though) literally or sarcastically (e.g., Does Person A think Person B is a fast runner?). Eleven of the 35 filler items were followed by similarly designed interpretation questions, and the remaining 24 were followed by comprehension questions that were not related to the interpretation of the target comment, but instead tested recall for factual content of other parts of the text (see Table 1 for an example).
There were two versions of each ambiguous comment; one ended with a wink emoticon, and the other with a full stop, and participants were younger or older adults. Thus, the experiment consisted of a 2 device (wink emoticon vs. full stop) × 2 age group (younger adults vs. older adults) design, with the device factor being both within-subjects and within-items, and the age group factor being between-subjects and within-items.
The experimental items were counterbalanced across two separate stimulus presentation files, such that each item appeared once in each file, in a different device condition (wink emoticon vs. full stop) in each. Thus, each participant read 14 items followed by a wink emoticon, and 14 followed by a full stop. Interspersed with the 28 experimental items were 35 filler items, resulting in 63 trials in each file. The items in each stimulus file were presented in a different randomised order for each participant.
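The counterbalancing scheme can be illustrated with a short Python sketch. The alternating assignment rule below is an assumption made for illustration (the text does not say how items were allocated to conditions within each file), and the names build_lists, list_a, and list_b are hypothetical.

```python
def build_lists(n_items=28):
    """Return two presentation lists mapping item number -> device condition.

    Assumed rule: conditions alternate by item number, and the second list
    reverses the assignment of the first.
    """
    conditions = ("wink", "full_stop")
    lists = []
    for list_index in range(2):
        assignment = {}
        for item in range(1, n_items + 1):
            assignment[item] = conditions[(item + list_index) % 2]
        lists.append(assignment)
    return lists

list_a, list_b = build_lists()

# Each list holds 14 wink and 14 full-stop items, and every item appears
# in opposite conditions across the two lists.
wink_count_a = sum(1 for c in list_a.values() if c == "wink")
wink_count_b = sum(1 for c in list_b.values() if c == "wink")
```

Any rule with these two properties would satisfy the design as described; only the balance within and across lists matters.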
Measures
Following the eye-tracking experiment, participants’ tendency to use sarcasm was assessed using the Sarcasm Self-Report Scale (SSS; Ivanko et al., 2004), which contained 16 items. Eight of the items assessed participants’ general use of sarcasm, for example, Likelihood that you would use sarcasm with someone you just met, and the remaining eight items assessed participants’ use of sarcasm in specific situations, for example, How likely are you to make sarcastic statements in these situations? You have to be at work in 15 min and your friend just accidentally locked your keys in the car. Participants responded to each of the items on a seven-point scale (1 = not at all likely; 7 = extremely likely), and responses were summed to give an overall use-of-sarcasm score (minimum score of 16 and maximum score of 112). In the current sample, the SSS had good internal consistency, with Cronbach’s α = .83.
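The scoring described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are ours, the example responses are hypothetical, and the Cronbach’s α function uses the standard variance-based formula, which we assume (but cannot confirm) matches the computation behind the reported value.

```python
import statistics

def sss_total(responses):
    """Sum 16 item responses (each 1-7) into a total between 16 and 112."""
    if len(responses) != 16:
        raise ValueError("the SSS has 16 items")
    if any(r < 1 or r > 7 for r in responses):
        raise ValueError("responses must lie on the 1-7 scale")
    return sum(responses)

def cronbach_alpha(data):
    """Cronbach's alpha for a list of participants, each a list of item responses."""
    k = len(data[0])
    item_vars = [statistics.variance([row[i] for row in data]) for i in range(k)]
    total_var = statistics.variance([sum(row) for row in data])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical responses from three participants (not data from the study).
example = [[2] * 16, [4] * 16, [3, 5] * 8]
totals = [sss_total(row) for row in example]
alpha = cronbach_alpha(example)
```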
Participants’ tendencies to use the internet, social media, and emoticons were then examined, using questions developed by the current authors (see the online Supplementary Material for the full questionnaire). Participants were first asked How many hours do you spend on the internet per day? followed by How many hours do you spend using social media sites (e.g., Facebook, Twitter, Instagram) per day? To examine participants’ general use of emoticons, participants were asked How often do you use emoticons/emojis when messaging/emailing others? and responded on a four-point scale (1 = never; 4 = often). Participants were then asked 11 questions examining their use of specific emoticons, such as the wink emoticon, again responding on a four-point scale (1 = never; 4 = often).
Procedure
Participants first completed the ETDRS chart and the Pelli-Robson chart, where they had to read out loud the letters on each row until they could no longer identify the letters. This was followed by the Radner reading chart, where participants had to read each sentence out loud as quickly and as accurately as possible.
The eye-tracking experiment was then conducted using an SR Research EyeLink 1000 eye-tracker, which sampled the participants’ right eye position every millisecond, although viewing was binocular. Stimuli were presented on a 17-inch monitor, positioned 58 cm from the participants’ eyes. Three characters subtended approximately 1° of visual angle. The procedure was first explained to the participants, where they were instructed to read the items (silently) at their normal reading pace. They were then asked to sit in front of the computer screen and position their head on a chin-and-forehead rest to help minimise their head movements, after which a calibration procedure was completed.
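For readers who wish to relate the viewing geometry to physical text size, the following Python sketch converts a visual angle into an on-screen extent at the stated viewing distance of 58 cm. The function name is ours, and the resulting character width is an approximation derived from the "three characters per degree" figure, not a value reported in the study.

```python
import math

def extent_for_angle(distance_cm, degrees):
    """On-screen extent (cm) that subtends the given visual angle at distance_cm."""
    return 2 * distance_cm * math.tan(math.radians(degrees) / 2)

# One degree at 58 cm corresponds to roughly 1 cm on the screen,
# so three characters per degree implies about a third of a centimetre each.
one_degree_cm = extent_for_angle(58, 1.0)
char_width_cm = one_degree_cm / 3
print(round(one_degree_cm, 3), round(char_width_cm, 3))
```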
Prior to each trial, a fixation circle appeared in the centre of the screen, and this was followed by a fixation square positioned in the upper left quadrant, which the participants were required to fixate upon in order for the stimulus computer to display the item. If the participant’s point of fixation was not in line with the fixation square, the researcher recalibrated the eye-tracker. Once the participants had read each item, they fixated upon a post-it note positioned on the lower right-hand side of the display monitor and pressed the right-hand button on a hand-held controller to continue the experiment. A question designed to examine whether the participants had interpreted the ambiguous comment literally or sarcastically was displayed following all 28 experimental items and 11 of the 35 filler items. A general comprehension question assessing participants’ recall of information from the text (rather than their interpretation) was displayed following the remaining 24 filler items. For these 24 text-recall questions, older adults had an average correct response rate of 89.0% (SD = 0.50) and the younger adults had an average correct response rate of 89.9% (SD = 0.50). The overall average correct response rate was 89.4% (SD = 0.31), indicating that the participants were engaged in the task. There was no difference in the average correct response rates between the older and younger adults, t(1342) = 0.53, p = .59, Cohen’s d = 0.03.
Participants then completed the MMSE followed by the SSS, and the internet, social media, and emoticons usage questionnaire.
Data analysis
Readers can respond to difficulty with a piece of text in a variety of ways. First, they might pause and make more or longer fixations on the word or phrase that is causing the difficulty. Alternatively (or in addition), they might look back at previous portions of text, to try and relocate key information (see e.g., Frazier & Rayner, 1982), or simply to “buy time” (Mitchell et al., 2008). They may also continue reading whilst trying to overcome the difficulty, resulting in longer reading times on subsequent words or “spillover effects” (see Vasishth et al., 2013, for discussion of different types of eye movement behaviour in reading). To capture these different behaviours, and to examine participants’ moment-to-moment reading behaviour in a relatively fine-grained way, the text is typically broken down into a series of analysis regions, for which a number of different measures of reading behaviour are calculated.
In this study, the 28 experimental items were separated into four analysis regions, as illustrated in Table 1. The pre-critical region contained the one or two words which preceded the ambiguous part of the target comment (e.g., you’re so in the example given in Table 1). The critical region was the one or two words which depicted the ambiguous part of the sentence, which could be interpreted literally or sarcastically (e.g., quick). The post-critical region was the following text up to the wink emoticon or full stop (e.g., though). The final region contained the wink emoticon or full stop itself, followed by the remainder of the scenario (e.g., ;-) Person B: I need to get more people to sponsor me.).
For each analysis region, four measures of reading behaviour are reported. First fixation duration is the duration (in milliseconds) of the initial fixation that a reader makes within a region of text. First-pass reading time (also known as gaze duration if the analysis region is a single word) is the sum of all the fixations a reader makes within the region until their point of fixation leaves the region, either to the left or the right. These two reading measures capture early processing difficulties and can indicate whether readers experienced difficulty immediately on encountering that portion of text. Regression path (or go past) reading time is the sum of all the fixations made within a region, as well as in the previous regions if re-reading occurs, until the reader’s point of fixation goes past the region and onto the following text. This measure can indicate whether readers experienced difficulties and re-read earlier portions of the text to overcome these difficulties. Finally, total reading time is the sum of all fixations made within a region and depicts overall processing difficulty.
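To make the four measures concrete, the following minimal Python sketch computes them for one region from a sequence of fixations. The data format (a list of (region index, duration in ms) pairs in temporal order) and the function name are assumptions made for illustration, not the software used in the study. The sketch also assumes that a region only receives first-pass time if it is entered before any later region has been fixated.

```python
def reading_measures(fixations, region):
    """Compute four reading measures (ms) for one analysis region.

    fixations: list of (region_index, duration_ms) in temporal order.
    """
    total = sum(d for r, d in fixations if r == region)
    first_fix = first_pass = go_past = 0

    # Find the first entry into the region; if a later region is fixated
    # first, the region was skipped and has no first-pass time.
    entry = None
    for i, (r, _) in enumerate(fixations):
        if r > region:
            break
        if r == region:
            entry = i
            break

    if entry is not None:
        first_fix = fixations[entry][1]
        # First pass: consecutive fixations in the region until the eyes leave it.
        i = entry
        while i < len(fixations) and fixations[i][0] == region:
            first_pass += fixations[i][1]
            i += 1
        # Regression path: everything from entry until the eyes move past the region.
        j = entry
        while j < len(fixations) and fixations[j][0] <= region:
            go_past += fixations[j][1]
            j += 1

    return {"first_fixation": first_fix, "first_pass": first_pass,
            "regression_path": go_past, "total": total}

# Hypothetical trial: the reader regresses from region 1 to region 0,
# returns, moves on to region 2, and later re-reads region 1.
trial = [(0, 200), (1, 250), (1, 180), (0, 150), (1, 220), (2, 300), (1, 100)]
measures = reading_measures(trial, region=1)
```

In this hypothetical trial, the regression to region 0 adds to regression path time but not to first-pass time, and the late re-reading of region 1 contributes only to total reading time.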
During the pre-processing of the data, an automatic procedure combined fixations under 80 ms in duration with previous fixations within one character and deleted fixations under 40 ms in duration if they were not within three characters of another fixation. Trials were then eliminated if two or more consecutive regions had zero first-pass reading times, as this suggests there had been track-loss or the participants had failed to read the sentence. This procedure accounted for 5.29% of the data. Data were also removed when reading times were zero for a particular region, and means were calculated from the remaining data. For the pre-critical region, this accounted for 25.99% of first fixation, first-pass, and regression path, and 14.61% of total time data; for the critical region, 12.26% of first fixation and first-pass, 12.32% of regression path, and 8.01% of total time data; for the post-critical region, 14.48% of first fixation and first-pass, 14.14% of regression path, and 10.98% of total time data; and for the final region, <1% of data was lost across all measures (data losses are within the normal range for this kind of study, see, for example, Rayner, 2009).
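The trial-level exclusion rules described above can be sketched as follows. This Python example is illustrative only: it represents each trial as a list of first-pass reading times for the four regions, the function names are hypothetical, and the values are invented. It does not implement the earlier fixation-merging step.

```python
def exclude_trial(first_pass_times):
    """Return True if two or more consecutive regions have zero first-pass time."""
    return any(a == 0 and b == 0
               for a, b in zip(first_pass_times, first_pass_times[1:]))

def usable_values(trials, region_index):
    """Collect one region's reading times from retained trials, dropping zeros."""
    return [trial[region_index] for trial in trials
            if not exclude_trial(trial) and trial[region_index] > 0]

# Hypothetical first-pass times (ms) for the pre-critical, critical,
# post-critical, and final regions of three trials.
trials = [
    [210, 250, 180, 900],  # kept in full
    [0, 240, 0, 850],      # kept; skipped regions are dropped from their own means
    [0, 0, 300, 700],      # eliminated: two consecutive regions with zero time
]
pre_critical = usable_values(trials, 0)
critical = usable_values(trials, 1)
```

Because short regions of one or two words are often skipped, the second rule removes more data from the pre-critical, critical, and post-critical regions than from the much longer final region, consistent with the percentages reported above.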
Results and discussion
Eye-tracking data
Data from the pre-critical, critical, post-critical, and final regions were analysed using linear mixed effects (LME) models through the lme4 package (Version 1.1-21; Bates et al., 2015) in R (Version 3.6.1; R Core Team, 2019). Outliers were removed from the data (older adults and younger adults treated separately) when they were three SDs away from the mean (see Table 2 for the number of trials removed and Table 3 for the means and standard errors for the four measures of reading behaviour). The reading time data were skewed and were therefore logarithmically transformed prior to analysis.
Table 2. Summary of trials removed (three standard deviations away from the mean).
Table 3. Descriptive statistics.
SE: standard error.
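The trimming and transformation steps described above can be sketched as follows. This is a minimal illustration that assumes the three-SD criterion was applied to raw reading times within each age group before the natural-log transform; the text does not state whether trimming was also done separately by condition or region, and the values below are invented.

```python
import math
import statistics


def trim_and_log(times_ms, n_sd=3.0):
    """Drop values more than n_sd sample SDs from the mean, then log-transform."""
    mean = statistics.mean(times_ms)
    sd = statistics.stdev(times_ms)
    kept = [t for t in times_ms if abs(t - mean) <= n_sd * sd]
    return [math.log(t) for t in kept]


# Hypothetical first-pass times (ms); each age group is trimmed separately.
reading_times = {
    "younger": [240, 250, 260] * 7 + [5000],   # one extreme value
    "older": [300, 320, 340] * 7,
}
log_times = {group: trim_and_log(times) for group, times in reading_times.items()}
```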
The next step was to establish the random effects structure for each analysis; therefore, the maximal model was first fitted to the data. The maximal model included intercepts and slopes for all the fixed effects across participants and items, including interactions and correlations (see Barr et al., 2013). We included device (wink emoticon vs. full stop) and age group (younger adults vs. older adults) as fixed factors in the models. For device and age group, the fixed effects were coded using sum coding: full stop = –0.5, wink emoticon = 0.5, older adults = –0.5, and younger adults = 0.5.
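To show what the ±0.5 coding implies for the fixed effects, the sketch below builds the predictor row for each cell of the 2 × 2 design and solves for the coefficients directly from four cell means. The cell means are hypothetical values on a log scale, not data from the study, and random effects are ignored.

```python
DEVICE_CODE = {"full_stop": -0.5, "wink": 0.5}
AGE_CODE = {"older": -0.5, "younger": 0.5}


def design_row(device, age_group):
    """Fixed-effects predictors: intercept, device, age group, interaction."""
    d, a = DEVICE_CODE[device], AGE_CODE[age_group]
    return (1.0, d, a, d * a)


def coefficients_from_cell_means(cells):
    """Solve the saturated 2 x 2 model directly from four cell means."""
    wy, sy = cells[("wink", "younger")], cells[("full_stop", "younger")]
    wo, so = cells[("wink", "older")], cells[("full_stop", "older")]
    intercept = (wy + sy + wo + so) / 4        # grand mean of the four cells
    b_device = ((wy - sy) + (wo - so)) / 2     # wink minus full stop, averaged over age
    b_age = ((wy - wo) + (sy - so)) / 2        # younger minus older, averaged over device
    b_interaction = (wy - sy) - (wo - so)      # difference between the two device effects
    return (intercept, b_device, b_age, b_interaction)


# Hypothetical cell means (log ms); not values from the study.
cells = {
    ("wink", "younger"): 5.40, ("full_stop", "younger"): 5.50,
    ("wink", "older"): 5.80, ("full_stop", "older"): 5.82,
}
coefs = coefficients_from_cell_means(cells)
predicted = {
    cell: sum(b * x for b, x in zip(coefs, design_row(*cell)))
    for cell in cells
}
```

With this coding, each main-effect coefficient is the difference between the two levels of that factor averaged over the other factor, and the intercept is the grand mean of the cells.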
If a model did not converge, we first added the optimizer “bobyqa.” If the model was still non-converging, we trimmed it down by removing perfect or near-perfect correlations and by progressively removing one random component at a time—the component which explained the least amount of variance in the previous non-converging model. Once the random effects structure had been established, we performed a series of likelihood ratio tests comparing the fit of the model with progressively simpler fixed-effects structures to reach the best model for our data. We report the regression coefficients (b), t-values (t), p-values (p), and 95% confidence intervals (CIs), where the lmerTest package (Version 3.1-0; Kuznetsova et al., 2017) was used to compute the p-values (see Table 4 for the fixed-effects parameters).
Table 4. Results of the linear mixed models and the fixed-effects parameters.
CI: confidence interval.
***p ⩽ .001; **p ⩽ .01; *p ⩽ .05; +p ⩽ .10.
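The likelihood ratio tests used to simplify the fixed-effects structure compare the log-likelihoods of two nested models. The sketch below shows the arithmetic for the simplest case, in which the models differ by a single fixed-effect parameter (one degree of freedom). The log-likelihood values are hypothetical; in R, this kind of comparison is typically obtained by passing the two fitted models to anova().

```python
import math


def likelihood_ratio_test_1df(loglik_full, loglik_reduced):
    """Chi-square statistic and p-value for nested models differing by 1 parameter."""
    stat = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods for a model with and without one fixed effect.
stat, p_value = likelihood_ratio_test_1df(loglik_full=-1000.0,
                                          loglik_reduced=-1001.9207)
```

A non-significant result would favour the simpler model, since dropping the term does not reliably worsen the fit.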
Main effects of age group
There were main effects of age group across all regions of text and all measures of reading behaviour, showing that older adults had longer reading times than younger adults. This is in line with previous research (e.g., Kemper et al., 2004; Kemper & Liu, 2007; Kemper & McDowd, 2006; Kliegl et al., 2004; Liu et al., 2017; McGowan et al., 2014, 2015; Rayner et al., 2006) and suggests that older adults had to make longer and more frequent fixations when reading to compensate for age-associated cognitive decline.
Main effects of device
In the pre-critical region (e.g., the words you’re so, in the example given in Table 1), there were shorter first fixation durations in the wink emoticon condition than in the full stop condition. In the critical region (e.g., the word quick, in the example given in Table 1), there were shorter first fixation durations, first-pass reading times, and regression path reading times in the wink emoticon condition than in the full stop condition. One possible explanation for these results is that participants were already able to perceive the presence of an emoticon, which then attracted their attention. They may then have moved forward more quickly in the text in this condition, to reach something which may (or may not) be a helpful cue to interpretation. This suggestion is bolstered by the fact that readers are able to perceive information that is within five degrees of visual angle of the point of fixation (see Schotter et al., 2012, for a review). The emoticon was under five degrees of visual angle from the pre-critical region in approximately half of the experimental materials (15 out of 28) and was under five degrees of visual angle from the critical region in all materials. This would explain why the effect was relatively weak in the pre-critical region (i.e., emerged as a 6 ms effect in first fixation duration only) and then became stronger in the critical region (i.e., was present in first fixation, first-pass, and regression path reading times).
In the later regions of text, the opposite pattern of effects was found. Specifically, in the post-critical region (e.g., the word though, in the example given in Table 1), there were longer first-pass and total reading times in the wink emoticon condition than in the full stop condition. In addition, in the final region (e.g., ;-) Person B: I need to get more people to sponsor me., in the example given in Table 1), there were longer first fixation durations in the wink emoticon condition than in the full stop condition. Because the wink emoticon or full stop appeared right at the beginning of the final analysis region, it is likely that the longer initial fixation in the wink emoticon condition is simply due to the greater visual complexity of the wink emoticon compared with the full stop.
In terms of “later” reading time measures, there were longer regression path reading times in the wink emoticon condition than in the full stop condition. This suggests that readers had gone back to re-read earlier portions of the text more in the wink emoticon condition. In support of this interpretation, there were shorter first-pass reading times in the wink emoticon condition, suggesting that readers had immediately gone back to re-read earlier portions of the text, cutting short the first-pass reading times. In addition, there were longer total reading times in the wink emoticon condition than in the full stop condition, suggesting that this condition leads to greater processing difficulty overall.
These longer reading times for the wink emoticon condition in later measures of reading behaviour would suggest that greater processing effort is required in the wink emoticon condition. This is in line with previous research suggesting that the wink emoticon highlights a discrepancy between the text-based context and the visual cue, which is the emoticon itself (Oleszkiewicz et al., 2017; Rezabek & Cochenour, 1998). Specifically, participants may have taken longer as they were trying to integrate the superficial positive meaning of the target comment with the non-serious tone that is implied by the use of the wink emoticon.
Additional analyses isolating the device from the final sentence revealed that participants fixated this region on 41.5% of trials in the wink emoticon condition, compared with 1.7% of trials in the full stop condition, confirming that the emoticon does indeed attract readers’ attention. However, given the low number of fixations on this smaller region, particularly in the full stop (control) condition, including the final sentence in with the device allows for sufficient data points for meaningful analyses. A goal for future research would be to identify a control condition which would allow for sufficient data points such that reading behaviour on the device alone could be examined in more detail.
Interactions between age group and device
In the final region, there was an interaction between age group and device in first-pass reading times (see Figure 1). Decomposing the interaction showed that the younger adults spent longer reading the full stop condition than the wink emoticon condition (b = 0.11, t = 5.53, p < .001, 2.5% CI = 0.07, 97.5% CI = 0.15). There was no difference in the older adults’ reading times between the wink emoticon and full stop conditions (b = 0.03, t = 1.75, p = .08, 2.5% CI = 0.00, 97.5% CI = 0.07). Following our interpretation above, it seems to be the case that younger adults may have a greater tendency than older adults to look back when there is a wink emoticon, thus shortening first-pass reading times in this region.

Figure 1. Interaction between age group and device for first-pass reading times in the final region.
In addition, there was an interaction between age group and device in total reading times (see Figure 2). However, decomposing this interaction simply showed main effects of age, with older adults having longer reading times in both the full stop condition (b = 0.13, t = 3.12, p < .001, 2.5% CI = 0.05, 97.5% CI = 0.21) and the wink emoticon condition compared with the younger adults (b = 0.08, t = 2.00, p < .01, 2.5% CI = 0.00, 97.5% CI = 0.17), and device, with both older (b = –0.03, t = 2.30, p < .01, 2.5% CI = –0.06, 97.5% CI = 0.00) and younger (b = –0.08, t = 5.79, p < .001, 2.5% CI = –0.10, 97.5% CI = –0.05) adults having longer reading times in the wink emoticon condition than the full stop condition.

Figure 2. Interaction between age group and device for total reading times in the final region.
Interpretation of ambiguous comments
Following each of the 28 experimental items, participants were presented with questions to examine how they had interpreted the ambiguous target comments. Results showed that the younger adults interpreted the ambiguous comments sarcastically 21.9% of the time when they were accompanied by a wink emoticon, and 11.2% of the time when they were accompanied by a full stop. In contrast, the older adults interpreted the ambiguous comments sarcastically 7.7% of the time in both conditions.
The interpretation data were analysed using generalised linear mixed effects models (GLMM) through the lme4 package (Version 1.1-21; Bates et al., 2015) in R (Version 3.6.1; R Core Team, 2019). We included device (wink emoticon vs. full stop) and age group (younger adults vs. older adults) as fixed factors in the model. For device and age group, the fixed effects were coded using sum coding: full stop = –0.5, wink emoticon = 0.5, older adults = –0.5, and younger adults = 0.5. We report the regression coefficients (b), z-values (z), p-values (p), and 95% CI, where the lmerTest package (Version 3.1-0; Kuznetsova et al., 2017) was used to compute the p-values (see Table 5).
Table 5. Results of the generalised linear mixed models and the fixed-effects parameters.
CI: confidence interval; YA: younger adult; OA: older adult.
***p ⩽ .001; **p ⩽ .01; *p ⩽ .05; +p ⩽ .10.
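Because the interpretation models operate on the log-odds scale, it can help to see the raw percentages reported above expressed in those terms. The sketch below converts the observed proportions of sarcastic interpretations into log-odds and an odds ratio. These are descriptive quantities computed from the aggregate percentages only; they are not the model estimates, which also account for participant and item variability.

```python
import math


def logit(p):
    """Log-odds of a proportion p (0 < p < 1)."""
    return math.log(p / (1.0 - p))


# Proportions of sarcastic interpretations reported in the text.
p_wink_younger, p_stop_younger = 0.219, 0.112
p_wink_older, p_stop_older = 0.077, 0.077

# Device effect (wink minus full stop) on the log-odds scale, per age group.
device_effect_younger = logit(p_wink_younger) - logit(p_stop_younger)
device_effect_older = logit(p_wink_older) - logit(p_stop_older)
odds_ratio_younger = math.exp(device_effect_younger)
```

On this descriptive basis, the odds of a sarcastic reading for the younger adults were roughly twice as high with the wink emoticon as with the full stop, whereas the older adults showed no difference.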
There was a main effect of age group, showing that younger adults interpreted more of the ambiguous comments sarcastically compared with the older adults. In addition, there was a main effect of device, where ambiguous comments accompanied by wink emoticons were interpreted as more sarcastic than ambiguous comments accompanied by full stops. Furthermore, there was an interaction between age group and device.
The interaction results demonstrated that the younger adults interpreted the ambiguous comments accompanied by wink emoticons as more sarcastic than the older adults did. However, there was no difference between the younger adults’ and older adults’ interpretation of the ambiguous comments accompanied by full stops. In addition, the younger adults interpreted the ambiguous comments accompanied by wink emoticons as more sarcastic than the ambiguous comments accompanied by full stops. However, there was no difference in the older adults’ interpretation of the ambiguous comments accompanied by wink emoticons and those accompanied by full stops.
These results are in line with previous research showing that younger adult participants tend to interpret ambiguous comments more sarcastically when accompanied by wink emoticons compared with full stops (Derks et al., 2008; Filik et al., 2016). These results also support previous research by Phillips et al. (2015), which suggests that older adults have a reduced ability to suppress an ambiguous comment’s literal meaning and process the sarcastic meaning, although, interestingly, there was no difference in how the groups interpreted the ambiguous comments accompanied by full stops. Notably, the presence of an emoticon did not influence the older adults’ reluctance to adopt a sarcastic interpretation, which may be due to older adults engaging less with the internet and social media (e.g., Prada et al., 2018) and thus having less experience of emoticons in their own interactions (e.g., Oleszkiewicz et al., 2017; see also results below).
Relationship between reading times and interpretation
Because one aim of the study was to investigate the relationship between reading behaviour and ultimate interpretation, two-tailed Pearson’s correlations were conducted (see Table 6). That is, correlations were conducted between the likelihood of interpreting the comment sarcastically and reading times for the entire ambiguous comment (i.e., pre-critical, critical, and post-critical regions combined), and between the likelihood of interpreting comments sarcastically and reading times for the final region of text containing the emoticon.
Table 6. Two-tailed Pearson’s correlations: sarcastic interpretation.
CI: confidence interval.
***p ⩽ .001; **p ⩽ .01; *p ⩽ .05; +p ⩽ .10. Significant correlations (p ⩽ .05) are highlighted in bold.
The results of these correlations show a clear and consistent pattern, with longer reading times in both regions of text being associated with a greater likelihood of sarcastic interpretations for both younger and older adults. Specifically, for the younger adults, a positive correlation was found between first-pass reading times and total reading times across the entire ambiguous comment and the likelihood of interpreting the wink emoticon scenarios sarcastically. In addition, a positive correlation was found between first-pass reading times across the entire ambiguous comment and the likelihood of interpreting the full stop scenarios sarcastically. Furthermore, for the final region, there was a positive correlation between first fixation durations and the likelihood of interpreting the full stop scenarios sarcastically.
For the older adults, a positive correlation was found between first fixation durations and first-pass reading times across the entire ambiguous comment and the likelihood of interpreting the full stop scenarios sarcastically. Furthermore, there was a positive correlation between first fixation durations on the final region and the likelihood of interpreting the wink emoticon scenarios sarcastically.
Overall, the current findings support previous eye-tracking research showing that sarcastic interpretations result in longer reading times than literal ones (e.g., Filik et al., 2014; Filik & Moxey, 2010; Kaakinen et al., 2014; Olkoniemi et al., 2016; Țurcan & Filik, 2016, 2017), but extend these findings to the processing of ambiguous comments. They also point to differences in the time course of processing between younger and older adults—a finding which merits further investigation.
Relationship between sentence interpretation and Sarcasm Self-Report Scale (SSS)
Participants’ Sarcasm Self-Report Scale data were summarised prior to data analysis, through totalling their scores across the 16 questions. An independent samples t-test indicated that the younger adults reported a greater tendency to use sarcasm compared with the older adults (Myounger = 66.86, SE = 2.5; Molder = 43.36, SE = 3.5), t(54) = 5.44, p < .001, Cohen’s d = 1.45.
To investigate whether participants’ tendency to use sarcasm was associated with their interpretation of the ambiguous comments, two-tailed Pearson’s correlations were conducted. A positive correlation was found between SSS scores and likelihood of interpreting the wink emoticon scenarios sarcastically (r = .262, n = 56, p = .05, 2.5% CI = .000, 97.5% CI = .491), but no association was found between SSS scores and the likelihood of interpreting the full stop scenarios sarcastically (r = .079, n = 56, p = .56, 2.5% CI = –.188, 97.5% CI = .335). This is in line with previous research (Ivanko et al., 2004) and supports the idea that people who use sarcasm regularly are more sensitive to sarcastic utterances. Interestingly, this correlation was found for the wink emoticon scenarios only. It may be the case that when comments are ambiguous, people who have a greater tendency to use sarcasm themselves are more sensitive to cues to a sarcastic interpretation (such as emoticons).
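The confidence intervals reported for these correlations can be approximated with the Fisher z-transformation. The text does not state how the intervals were computed, so the method below is an assumption; it does, however, reproduce the reported bounds to within rounding.

```python
import math
from statistics import NormalDist


def pearson_ci(r, n, level=0.95):
    """Approximate confidence interval for Pearson's r via Fisher's z."""
    z = math.atanh(r)                       # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)             # standard error on the z scale
    crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


# Correlations between SSS scores and sarcastic interpretation (n = 56).
low_stop, high_stop = pearson_ci(0.079, 56)   # full stop scenarios
low_wink, high_wink = pearson_ci(0.262, 56)   # wink emoticon scenarios
```

The lower bound for the wink emoticon correlation falls almost exactly at zero under this method, which is consistent with the reported p = .05.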
Relationship between reading times and Sarcasm Self-Report Scale
Because another aim of the experiment was to investigate the relationship between reading behaviour and participants’ use of sarcasm, two-tailed Pearson’s correlations were conducted (see Table 7). Specifically, correlations were conducted between reading times for the entire ambiguous comment (i.e., pre-critical, critical, and post-critical regions combined) and SSS scores, and between reading times for the final region containing the emoticon and SSS scores.
Table 7. Two-tailed Pearson’s correlations: SSS.
SSS: Sarcasm Self-Report Scale; CI: confidence interval.
***p ⩽ .001; **p ⩽ .01; *p ⩽ .05; +p ⩽ .10. Significant correlations (p ⩽ .05) are highlighted in bold.
Results showed a clear and consistent pattern, indicating that participants’ greater tendency to use sarcasm was associated with shorter reading times. Specifically, for the ambiguous comment, negative correlations were found between first fixation durations, first-pass, regression path, and total reading times for both the wink emoticon scenarios and the full stop scenarios and participants’ tendency to use sarcasm.
For the final region, which contained either the full stop or wink emoticon, negative correlations were found between first-pass, regression path, and total reading times for the full-stop scenarios and participants’ tendency to use sarcasm. In addition, negative correlations were found between regression path reading times for the wink emoticon scenarios and participants’ tendency to use sarcasm. These findings would suggest that when participants had a greater tendency to use sarcasm in their own communications, they experienced less difficulty in processing comments which are potentially sarcastic.
Internet, social media, and emoticon use
Independent samples t-tests indicated that the younger adults reported more hours of internet usage per day than the older adults (Myounger = 5.66 hr, SE = 0.6 hr vs. Molder = 1.60 hr, SE = 0.2 hr), t(54) = 6.65, p < .001, Cohen’s d = 1.78, greater use of social media per day (Myounger = 2.27 hr, SE = 0.4 hr vs. Molder = .70 hr, SE = 0.2 hr), t(54) = 4.00, p < .001, Cohen’s d = 1.07, greater use of emoticons (Myounger = 3.32, SE = 0.2 vs. Molder = 2.25, SE = 0.2), t(54) = 4.28, p < .001, Cohen’s d = 1.14, and greater use of the wink emoticon (Myounger = 2.75, SE = 0.2 vs. Molder = 1.46, SE = 0.2), t(54) = 4.89, p < .001, Cohen’s d = 1.31. This result is similar to previous studies that reported negative correlations between emoticon usage and age (Oleszkiewicz et al., 2017; see also Skovholt et al., 2014) and between the use of technology/social networking services and age (Prada et al., 2018).
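For an independent samples t-test, Cohen's d can be recovered from the t statistic and the group sizes. The sketch below applies this conversion to the four t values reported above. The group sizes of 28 are an assumption: they follow from t(54) only if the 56 participants were split equally between the age groups.

```python
import math


def cohens_d_from_t(t, n1, n2):
    """Cohen's d implied by an independent samples t statistic."""
    return t * math.sqrt(1.0 / n1 + 1.0 / n2)


# t values reported in the text; 28 participants per group is an assumption.
reported_t = {
    "internet_hours": 6.65,
    "social_media_hours": 4.00,
    "emoticon_use": 4.28,
    "wink_emoticon_use": 4.89,
}
d_values = {name: round(cohens_d_from_t(t, 28, 28), 2)
            for name, t in reported_t.items()}
```

Under the equal-groups assumption, the converted values agree with the effect sizes reported in the text.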
To investigate whether participants’ use of emoticons in general was related to their interpretation of the ambiguous comments, two-tailed Pearson’s correlations were conducted. A positive correlation was found between the use of emoticons and interpreting the wink emoticon scenarios sarcastically (r = .328, n = 56, p = .01, 2.5% CI = .071, 97.5% CI = .544), but no association was found between the use of emoticons and interpreting the full stop scenarios sarcastically (r = .102, n = 56, p = .45, 2.5% CI = –.165, 97.5% CI = .356).
In addition, to investigate whether participants’ use of the wink emoticon specifically was related to their interpretation of the ambiguous comments, two-tailed Pearson’s correlations were conducted. In line with the findings for emoticon use in general, a positive correlation was found between the use of the wink emoticon and interpreting the wink emoticon scenarios sarcastically (r = .400, n = 56, p = .002, 2.5% CI = .153, 97.5% CI = .600), but no significant correlation was found between the use of the wink emoticon and interpreting the full stop scenarios sarcastically (r = .107, n = 56, p = .43, 2.5% CI = –.161, 97.5% CI = .360). Together, these results suggest that perceiver-related factors, such as participants’ tendency to use emoticons, are associated with how they interpret comments accompanied by emoticons.
Summary and conclusion
The current study extends existing research on emoticons and sarcasm comprehension in a number of novel ways: by investigating the moment-to-moment processes during normal reading using eye-tracking methodology, by examining the relationship between reading behaviour and how a comment is ultimately interpreted, and by examining processing and interpretation in older as well as younger adults.
Our results provide evidence that when processing ambiguous comments, readers move more quickly to the end of the sentence when there is a device which may potentially aid comprehension. However, they will then spend more time in the region of the text containing the device (and in looking back to previous regions) than when there is a full stop, likely due to the discrepancy between the superficially positive literal meaning of the text (i.e., a literal compliment), and the sentiment implied by the emoticon (i.e., some degree of teasing). Indeed, ultimately, readers (at least younger adults) interpreted comments accompanied by the wink emoticon as being more sarcastic than comments accompanied by a full stop. Overall, results showed that the younger adults seemed to have greater sarcastic tendencies than the older adults in a number of respects, both reporting greater use of sarcasm, and showing a greater tendency to interpret ambiguous comments as being sarcastic (at least when accompanied by an emoticon). This finding would fit with previous research showing that older adults have a greater tendency to adopt literal interpretations (Phillips et al., 2015).
Finding a positive association between reading times and the likelihood of interpreting a comment as being sarcastic supports and extends previous eye-tracking research showing that sarcastic interpretations result in longer reading times than literal ones (e.g., Filik et al., 2014; Filik & Moxey, 2010; Kaakinen et al., 2014; Olkoniemi et al., 2016; Țurcan & Filik, 2016, 2017). Specifically, this previous work has examined reading times for comments which are ultimately disambiguated by the context. In contrast, the current work shows that participants also show longer reading times when choosing to interpret an ambiguous comment sarcastically. This is important because, in real life, comments may often be somewhat ambiguous.
We also found that a number of perceiver-related factors, such as personal tendency to use sarcasm, or to use emoticons, can influence comprehension. In relation to theories of sarcasm comprehension, these results would fit well with constraint-satisfaction accounts (e.g., Pexman, 2008), which allow for a wide range of factors to influence processing and interpretation. However, existing constraint-satisfaction models are currently simply descriptive accounts that theoretically allow for a number of factors to come into play; importantly, they are not yet functional computational models that can generate testable predictions. Degen and Tanenhaus (2019) note that the development of constraint-satisfaction models to explain aspects of pragmatic processing is a cutting-edge endeavour that is currently in its infancy. They outline a number of steps that will be necessary in the development of a successful model. First, the relevant constraints (i.e., factors affecting processing and interpretation) need to be identified and quantified. We view the current research as an important contribution towards further refining which constraints are important for sarcasm comprehension, specifically, personal tendency to use sarcasm, the presence of (and readers’ own personal use of) textual devices such as emoticons, and age.
Although we focused on the wink emoticon on the basis of previous research indicating that it is the device most commonly used to indicate sarcasm (Thompson & Filik, 2016), it would be of interest for future research to investigate the online processing and interpretation of a wider range of devices, including ellipsis (Hancock, 2004), exclamation marks (Whalen et al., 2009), the tongue emoticon (Thompson & Filik, 2016), and emojis (Weissman & Tanner, 2018). Finally, it is important for future research to further examine the key constraints in sarcasm comprehension, to continue important progress towards the development of more precise models of comprehension.
Supplemental Material
Supplemental material, QJE-STD-19-077.R3-Supplementary_Material for The role of emoticons in sarcasm comprehension in younger and older adults: Evidence from an eye-tracking experiment by Hannah Elizabeth Howman and Ruth Filik in Quarterly Journal of Experimental Psychology
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was supported by the British Academy (Grant Number: PM140296) awarded to R.F.