Abstract
Objective:
The goal of the present study is twofold: (1) demonstrate the importance of measuring and understanding the relationship between task engagement and vigilance performance, and (2) celebrate the work of Joel S. Warm and expand upon his previous research in two semantic vigilance paradigms.
Background:
The importance of measuring task engagement in cognitive and sensory vigilance tasks has been well documented. But to date, our understanding of the effects of task engagement on semantic vigilance performance is limited.
Method:
Seventy-three participants completed either a standard semantic vigilance task or a lure semantic vigilance task. Participants also completed subjective measures of workload and stress.
Results:
The results indicated that changes in task engagement are associated with correct detection performance. Changes in task engagement may be related to individual differences in the distress associated with performing semantic vigilance tasks.
Conclusion:
In line with the work of Warm and his colleagues (Dember, Warm, Bowers, & Lanzetta, 1984), participants who reported increased task engagement after the vigil outperformed their peers who noted decreased task engagement upon conclusion of the task. Participants reporting increases in engagement with the semantic vigilance tasks also reported significantly greater distress pretask, but not posttask. Instead, increases in postvigil distress were driven by the task to which participants were assigned, not task engagement.
Application:
The present study has several implications for applied settings that involve long duration semantic processing or semantic target identification. Such real-world tasks include aviation, cyber threat detection and analysis, driving, and reading.
Vigilance is the ability to sustain attention to critical information for prolonged periods of time (Hancock & Warm, 1989; Jerison, 1970; Matthews & Davies, 1998; Warm, 1977). The study of vigilance is an area in which Joel S. Warm has made an enormous, wide-reaching impact. In the present experiment, we honor Warm’s contributions to this area of research by linking individual differences in task engagement to indices of vigilance performance, stress, and workload while extending the current literature by using a semantic vigilance paradigm. In addition to studying the phenomenon of vigilance, it is crucial to understand the decline in performance over time associated with vigilance tasks, which is often referred to as the vigilance decrement (Davies & Parasuraman, 1982; Warm, 1977). Toward this end, researchers have sought to “defeat” the decrement for nearly 70 years (Hancock, 2013, 2017; Mackworth, 1948).
One growing area of research in vigilance, which has previously received limited attention in the broader literature, is semantic vigilance. Semantic vigilance tasks require participants to attend to and process characters, symbols, text-speak words, words, or nonwords over extended periods of time (Claypoole, Neigel, Fraulini, Hancock, & Szalma, 2018; Epling, Russell, & Helton, 2016; Fraulini, Hancock, Neigel, Claypoole, & Szalma, 2017; Head, Russell, Dorahy, Neumann, & Helton, 2012; Neigel, Claypoole, Hancock, Fraulini, & Szalma, 2018; Thomson, Besner, & Smilek, 2016; Yap & Seow, 2014). Semantic vigilance tasks require operators to respond to targets that are semantic or lexical in nature and withhold response to neutral stimuli, which are not semantically representative or related to target signals (Pattamadilok et al., 2017). Thus, semantic vigilance tasks are unique in that they do not fall neatly into the cognitive-sensory vigilance distinction (See, Howe, Warm, & Dember, 1995) and could arguably involve both cognitive and sensory processes given the nature and design of the vigilance task.
Interestingly, there is a limited body of work that discusses the stress and workload associated with semantic vigilance tasks. Previous research has indicated that traditional vigilance tasks are high in workload and stressful for operators (Finomore, Shaw, Warm, Matthews, & Boles, 2013; Grier et al., 2003; Grier, 2015; Hancock & Warm, 1989; Matthews & Davies, 1998; Matthews, Warm, & Smith, 2017; Smit, Eling, & Coenen, 2004; Warm & Dember, 1998; Warm, Dember, & Hancock, 1996; Warm, Parasuraman, & Matthews, 2008). Thus, in order to make sound recommendations as human factors practitioners, a more holistic understanding of semantic vigilance tasks is necessary.
Task Engagement and Vigilance Performance
One method for reducing the vigilance decrement is to select operators based on individual differences associated with improved vigilance performance. Much of this work has focused on cognitive differences, such as working memory capacity (Caggiano & Parasuraman, 2004; Helton & Russell, 2011, 2013; Matthews, Warm, Shaw, & Finomore, 2014); personality differences, such as extraversion or introversion (Costa & McCrae, 1992; Shaw et al., 2010); and differences in motivation (Cerasoli, Nicklin, & Ford, 2014; Dember, Galinsky, & Warm, 1992; Dember, Warm, Bowers, & Lanzetta, 1984; Matthews, Davies, & Lees, 1990). Task engagement as an individual difference in vigilance performance has generally been overlooked (Hancock, 2017). Yet task engagement is critical in sustaining attention and cognitive resources over time (Matthews, 2016; Matthews, Warm, Reinerman-Jones, Langheim, & Saxby, 2010; Matthews, Warm, Reinerman-Jones, Langheim, Washburn, & Tripp, 2010; Matthews et al., 2017). Furthermore, task engagement can assist operators “in coping with the task demands” (Matthews, 2016, p. 801; Szalma, 2009, 2014), thereby reducing the vigilance decrement and operator stress (Matthews, Szalma, Panganiban, Neubauer, & Warm, 2013).
Given this, the purpose of the present study is to understand how individual differences in task engagement (i.e., state measures of pre- and post-task engagement) may be associated with the performance of semantic vigilance tasks. We hypothesize that increases in task engagement over the course of the vigil will be associated with improved overall vigilance performance across task types (one easy and one hard). This notion is supported by cognitive resource theory, which suggests that better resource allocation occurs when individuals are engaged with the task (Helton & Russell, 2011, 2013; Helton & Warm, 2008; Matthews, Warm, Reinerman-Jones, Langheim, & Saxby, 2010; Matthews, Warm, Reinerman-Jones, Langheim, Washburn et al., 2010; Matthews et al., 2017).
The Present Study
In the present study, task difficulty was manipulated across two task types wherein different sets of false alarms were used to make performance comparisons. In the standard semantic vigilance task, nonanimal or neutral false alarms are presented to participants (Fraulini et al., 2017; Thomson et al., 2016). Neutral false alarms included responses to nonanimals such as “coffee,” “drum,” or “flake.” In the lure semantic vigilance task, both neutral and “lure” stimuli are presented. Lure false alarms included responses to non-four-legged animals such as “cobra,” “octopus,” or “worm.”
The distinction between the lure task and the standard task is important in terms of determining task difficulty. Participants tend to perform more poorly and rate the lure semantic task as more frustrating because the targets, four-legged animals, are much more similar to the lure false alarms, non-four-legged animals (Claypoole et al., 2018; Neigel et al., 2018). Furthermore, the lure task requires additional decision-making resources in order to correctly detect targets, thus increasing the difficulty of this task. We manipulate task type to observe whether increases in state-based task engagement improve performance across task difficulty.
Finally, we postulate that increases in task engagement over the vigil may be associated with the stress and workload experienced in semantic vigilance tasks. In previous research, state-based task engagement scores tend to be better predictors of vigilance performance than distress or worry (Matthews, 2016; Matthews et al., 2017). This evidence helps support our rationale for examining state-based differences in task engagement in a vigilance paradigm.
Method
Participants
Seventy-three participants (45 women; 28 men) were recruited from the research participation system (SONA) at the University of Central Florida. The average age of participants was 18.77 years (median = 18.00 years, SD = 2.28 years, range = 18–30). All participants reported normal or corrected-to-normal vision. Participants indicated that they did not consume caffeine 24 hr prior to participating in this study.
Measures
Dundee Stress State Questionnaire
The Dundee Stress State Questionnaire (DSSQ; Matthews, 2016; Matthews et al., 2002) was utilized to measure perceived stress associated with the semantic vigilance task. The short-form DSSQ (20 items; Matthews, 2016) measures pre- and posttask engagement, distress, and worry. Higher scores on the DSSQ indicate more of that dimension. For example, higher scores for task engagement imply greater perceived engagement with the task.
In the present study, state changes across the task engagement dimension of the DSSQ were utilized to distinguish between participants who (1) indicated increases in task engagement over time and (2) indicated decreased task engagement over time.
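The grouping described above can be illustrated with a minimal sketch. It assumes that the change score is the posttask engagement score minus the pretask score; the class and function names are hypothetical, and the handling of a zero change score is an assumption, as it is not described in the text.

```python
from dataclasses import dataclass


@dataclass
class EngagementScores:
    """Pre- and posttask DSSQ task engagement scores for one participant."""
    participant_id: str
    pretask: float
    posttask: float


def engagement_change(scores: EngagementScores) -> float:
    """Change score, assumed here to be posttask minus pretask."""
    return scores.posttask - scores.pretask


def engagement_group(scores: EngagementScores) -> str:
    """Label a participant by the direction of the change score."""
    change = engagement_change(scores)
    if change > 0:
        return "increasing"
    if change < 0:
        return "decreasing"
    # Assumption: the text does not say how a zero change was treated.
    return "no_change"
```

A participant scoring 15 pretask and 18 posttask would fall in the increasing group under this sketch, and a participant scoring 20 pretask and 16 posttask would fall in the decreasing group.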
NASA-Task Load Index
The NASA-Task Load Index (NASA-TLX; Hart, 2006; Hart & Staveland, 1988) was used to measure perceived global workload. The NASA-TLX was administered at the end of the task and consists of several subscales including mental demand, physical demand, temporal demand, performance, frustration, and effort. Lower scores on the NASA-TLX indicate less workload associated with the task.
Task Types and Stimuli
The lure semantic vigilance task is unique because of its inclusion of “lures,” which are similar to neutral stimuli in that participants should withhold responses to these nontargets. Lures are also categorically or semantically similar to targets but do not meet the target criteria. For example, in the present lure semantic vigilance task, targets include four-legged creatures, such as “cougar,” “squirrel,” or “llama.” Lures, on the other hand, include words that describe creatures that do not have four legs, such as “walrus,” “turkey,” or “flamingo.” Neutral stimuli are not semantically related to lures or targets and include nonanimal objects such as “cube,” “binder,” or “socket.” The inclusion of lures increases the difficulty associated with the semantic vigilance task (Claypoole et al., 2018; Neigel et al., 2018), whereas standard semantic vigilance tasks are easier in that they require only binary distinctions from participants (Fraulini et al., 2017). In the present study, the standard semantic vigilance task consisted of 10 targets and 90 neutral stimuli (referred to as “distracters” in Thomson et al., 2016), or 100 events per period. The lure task consisted of 10 targets, 10 lure stimuli, and 80 neutral stimuli, for a total of 100 events per period (for five total periods of watch; each period was approximately 2.4 min in length).
All stimuli were presented in white text in 24-point Times New Roman font on a black background using a Dell Optiplex computer and 18-inch Dell computer monitor set to its factory settings (i.e., participants could not adjust the brightness, contrast, etc.). Targets, lures, and neutral stimuli were presented for 200 ms and an interstimulus interval (ISI) cross (“+”) was presented for 1,100 ms, which resulted in a response window of 1,300 ms. The cross was also presented in white, in the center of the screen, on a black background. Both versions of the semantic vigilance task were approximately 12 min in length, and the duration of the entire experiment was approximately 30 min. Participants were randomly assigned to the lure semantic vigilance task (n = 37) or the standard semantic vigilance task (n = 36). Participants were not aware of the task to which they were assigned.
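The composition and timing of one period of watch can be sketched as follows. The event counts and durations are taken from the description above; the names are hypothetical, and the shuffled presentation order is an assumption, as the ordering of events within a period is not described in the text.

```python
import random

STIMULUS_MS = 200   # presentation time for targets, lures, and neutral stimuli
ISI_MS = 1100       # interstimulus interval cross
RESPONSE_WINDOW_MS = STIMULUS_MS + ISI_MS  # 1,300 ms response window

# Events per period of watch for each task type (100 events in both cases).
PERIOD_COMPOSITION = {
    "standard": {"target": 10, "neutral": 90},
    "lure": {"target": 10, "lure": 10, "neutral": 80},
}


def build_period(task: str, rng: random.Random) -> list:
    """Return the event kinds for one period of watch.

    Assumption: events are presented in a shuffled order.
    """
    events = [
        kind
        for kind, count in PERIOD_COMPOSITION[task].items()
        for _ in range(count)
    ]
    rng.shuffle(events)
    return events
```

Passing a seeded `random.Random` instance keeps a generated period reproducible, which may be useful when the same sequence must be shown to every participant.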
Procedure
Upon arrival at the research laboratory, participants were greeted by a research assistant and then reviewed an electronic informed consent. Participants were then seated approximately 58 cm away from the desktop computer. Participants were encouraged to ask questions at any time, except during the vigil. Next, participants completed the pretask DSSQ. The research assistant then started the respective semantic vigilance task on the computer and asked participants to review the task instructions. Both sets of task instructions were identical and instructed participants to press the spacebar only when a target was detected. The instructions specified that participants should respond as quickly as possible to targets and withhold response to all nontargets. If participants did not have questions about the task instructions, participants were asked to press the spacebar to start the semantic vigilance task and the research assistant left the room. Upon conclusion of the vigil, the research assistant returned to the room to prepare the posttask surveys. Participants then completed the posttask DSSQ and NASA-TLX. Finally, participants completed a brief demographics inventory and were given more information about the study.
This research complied with the American Psychological Association code of ethics and was approved by the institutional review board at the University of Central Florida. Informed consent was obtained from each observer.
Data Cleaning and Outlier Removal
Seventy-nine participants were recruited from the SONA study pool in total. However, two participants were removed for being outliers in terms of individual difference measures (i.e., survey malingering). Three participants were removed based on their number of correct detections in the first period on watch (i.e., fewer than seven correct detections in period one; this cutoff was used for data cleaning in Neigel et al., 2018, and Szalma & Teo, 2012). One participant was removed based on excessive distracter and lure false alarms (i.e., 57 false alarms committed in total). Thus, the final sample used in the following analyses included 73 participants in total.
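The exclusion rules above can be expressed as a short screening sketch. The period-one cutoff of seven correct detections comes from the text. The false alarm cutoff is left as a parameter because no threshold is reported (only the excluded participant's total of 57), and the survey outlier flag is assumed to be set by a separate screening step. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

MIN_PERIOD_ONE_DETECTIONS = 7  # fewer than seven correct detections is excluded


@dataclass
class ParticipantRecord:
    participant_id: str
    period_one_correct_detections: int
    total_false_alarms: int
    flagged_survey_outlier: bool  # assumed to come from separate survey screening


def exclusion_reason(record: ParticipantRecord, max_false_alarms: int) -> Optional[str]:
    """Return the reason a participant would be excluded, or None if retained.

    max_false_alarms is a hypothetical cutoff; the text reports no threshold.
    """
    if record.flagged_survey_outlier:
        return "survey_outlier"
    if record.period_one_correct_detections < MIN_PERIOD_ONE_DETECTIONS:
        return "low_period_one_detections"
    if record.total_false_alarms > max_false_alarms:
        return "excessive_false_alarms"
    return None


# Sample accounting as reported in the text.
RECRUITED = 79
REMOVED = {
    "survey_outlier": 2,
    "low_period_one_detections": 3,
    "excessive_false_alarms": 1,
}
FINAL_SAMPLE = RECRUITED - sum(REMOVED.values())
```

Under these counts, `FINAL_SAMPLE` evaluates to 73, which matches the final sample reported above.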
Results
Task engagement change scores were utilized as measures of state individual differences (e.g., Matthews et al., 2017). The sample was divided between participants who indicated decreases in task engagement over time (n = 26 in the lure task; n = 25 in the standard task), which is associated with the traditional vigilance decrement (Hancock, 2017), and participants who indicated increases in task engagement over time (n = 11 in the lure task; n = 11 in the standard task), which is aligned with the cognitive increment (Hancock, 2017). The cognitive increment is an increase in vigilance performance that is observed over time (Hancock, 2013, 2017).
The mean change in the increasing task engagement group was an increase of 2.73 points (SD = 0.64). The mean change in the decreasing task engagement group was a decline of 4.205 points (SD = 0.475). The increasing task engagement group showed almost a three-point increase in engagement scores between pretask and posttask, whereas the decreasing task engagement group showed slightly more than a four-point decrease in task engagement scores between pretask and posttask.
Correct Detection Performance
A two (task type) × two (task engagement type) × five (period on watch) mixed-measures analysis of variance (ANOVA) yielded a nonsignificant main effect of task engagement type on correct detection performance, F(1, 69) = 3.79, p = .056, ηp2 = .052, although the pattern of means indicated that correct detections were higher in the increasing task engagement conditions. Additionally, there was a significant main effect of period on watch on correct detection performance, F(4, 276) = 4.27, p = .004, ηp2 = .058, ε = .843. The main effect of period on watch indicates that, across tasks, correct detection performance declined over time. There were no additional significant main effects or interactions to report for this analysis. Correct detection performance is reported in Figure 1.

Figure 1. Percentage of correct detections over time as a function of task type and task engagement type. Error bars represent the standard error of the mean.
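As a consistency check on the effect sizes reported for the correct detection analysis, partial eta squared can be recovered from an F ratio and its degrees of freedom through the standard identity ηp2 = (F × df_effect) / (F × df_effect + df_error). The sketch below is illustrative only; the function name is hypothetical, and small discrepancies from reported values can arise because the F ratios themselves are rounded.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    effect_term = f_value * df_effect
    return effect_term / (effect_term + df_error)


# Main effect of task engagement type on correct detections: F(1, 69) = 3.79
engagement_effect = partial_eta_squared(3.79, 1, 69)

# Main effect of period on watch on correct detections: F(4, 276) = 4.27
period_effect = partial_eta_squared(4.27, 4, 276)
```

Rounded to three decimal places, these evaluate to .052 and .058, in agreement with the values reported above.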
Neutral False Alarm Performance
Neutral false alarms occur when participants respond to a stimulus that is neither a target nor a lure stimulus (e.g., “book,” “slipper”). A two (task type) × two (task engagement type) × five (period on watch) mixed-measures ANOVA indicated a significant main effect of period on watch on neutral false alarm performance, F(4, 276) = 4.97, p = .001, ηp2 = .067, ε = .810. The number of false alarms to neutral stimuli tended to decrease over time across all tasks. There were no additional significant main effects or interactions to report for this analysis. Neutral false alarm performance is reported in Figure 2.

Figure 2. Number of false alarms to neutral stimuli committed over time as a function of task type and task engagement type. Error bars represent the standard error of the mean.
Lure False Alarm Performance
Lure false alarms occur when participants respond to a non-four-legged animal (e.g., “flamingo,” “walrus”). Analyses were conducted using only the data from participants in the lure semantic task, as this group is the only group exposed to lure stimuli. A two (task engagement type) × five (period on watch) mixed-measures ANOVA indicated a statistically significant main effect of period on watch on lure false alarm performance, F(4, 276) = 28.06, p < .001, ηp2 = .289, ε = .804. The number of false alarms committed to lure stimuli significantly declined over time in the lure task. There were no additional significant main effects or interactions to report for this analysis. Lure false alarm performance is reported in Figure 3.

Figure 3. Number of false alarms committed to lure stimuli over time as a function of task type and task engagement type. Error bars represent the standard error of the mean.
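The performance measures analyzed in this section (correct detections, neutral false alarms, and lure false alarms) follow from classifying each event by its stimulus kind and by whether a response was made within the response window. A minimal sketch of that classification is given below; the function name and outcome labels are hypothetical and are not taken from the original analysis.

```python
def score_response(stimulus_kind: str, responded: bool) -> str:
    """Classify one event given the stimulus kind and whether a response occurred."""
    if stimulus_kind not in ("target", "lure", "neutral"):
        raise ValueError(f"unknown stimulus kind: {stimulus_kind!r}")
    if stimulus_kind == "target":
        # Responding to a four-legged animal is a correct detection.
        return "correct_detection" if responded else "miss"
    if not responded:
        # Withholding a response to any nontarget is correct.
        return "correct_rejection"
    # A response to a nontarget is a false alarm, split by nontarget kind.
    return "lure_false_alarm" if stimulus_kind == "lure" else "neutral_false_alarm"
```

For example, a response to “flamingo” would be scored as a lure false alarm, and a response to “book” as a neutral false alarm.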
Response Time to Targets
A two (task type) × two (task engagement type) × five (period on watch) mixed-measures ANOVA indicated a significant main effect of task type on average response time (in ms) to targets, F(1, 69) = 34.44, p < .001, ηp2 = .333. There was also a significant main effect of period on watch on average response time, F(4, 276) = 33.47, p < .001, ηp2 = .327. Participants in the standard semantic vigilance task responded to targets more rapidly than participants in the lure semantic vigilance task. Response time to targets tended to increase across all tasks over time. There were no additional significant main effects or interactions to report for this analysis. Average response time performance is reported in Figure 4.

Figure 4. Average response time to target stimuli (in ms) over time as a function of task type and task engagement type. Error bars represent the standard error of the mean.
Task Engagement
A two (task type) × two (task engagement type) factorial ANOVA was performed on pre- and posttask engagement scores. A significant main effect of task engagement type indicated that participants who reported increases in task engagement over the vigil reported higher posttask engagement scores (M = 20.41, SD = 4.29) than participants who reported decreases in task engagement (M = 16.04, SD = 5.08), F(1, 69) = 12.13, p = .001, ηp2 = .149. There were no additional significant main effects or interactions to report for these analyses. Task engagement is depicted in Figure 5.

Figure 5. Pre- and posttask engagement scores as a function of task type and task engagement type. Error bars represent the standard error of the mean.
Distress
A two (task type) × two (task engagement type) ANOVA was performed on pre- and posttask distress scores. A significant main effect of task engagement type demonstrated that participants who indicated increases in task engagement over the course of the vigil initially reported higher scores of distress pretask, F(1, 69) = 5.50, p = .022, ηp2 = .074. However, there were no significant differences between distress scores posttask in terms of the task engagement types.
There was a main effect of task type on posttask distress, F(1, 69) = 6.65, p = .012, ηp2 = .088. This result indicated that posttask distress is a function of task type, not state-based task engagement. Participants in the lure semantic vigilance task (M = 10.15, SE = .93) reported significantly more distress posttask compared with participants in the standard semantic vigilance task (M = 6.75, SE = .93). There were no additional significant main effects or interactions to report for these analyses. Distress is depicted in Figure 6.

Figure 6. Pre- and posttask distress scores as a function of task type and task engagement type. Error bars represent the standard error of the mean.
Worry
A two (task type) × two (task engagement type) ANOVA was performed on pre- and posttask worry scores. These analyses did not yield any significant results. Worry is depicted in Figure 7.

Figure 7. Pre- and posttask worry scores as a function of task type and task engagement type. Error bars represent the standard error of the mean.
Workload
A two (task type) × two (task engagement type) ANOVA was performed on global workload scores and for each of the NASA-TLX subscale scores. These analyses did not yield any significant differences between tasks. Global workload is depicted in Figure 8.

Figure 8. Global workload scores as a function of task type and task engagement type. Error bars represent the standard error of the mean.
Discussion
The present study aimed to expand upon the early work of Joel S. Warm, who made significant contributions to research at the intersections of task engagement, stress, workload, and vigilance. To reiterate, it was hypothesized that increases in task engagement over the course of the vigil would mirror increases in performance in two semantic vigilance tasks (Fraulini et al., 2017). Increases in task engagement from pre- to posttask were not significantly related to improvements in false alarm performance or response time to targets. However, the pattern of means in the data suggested that participants who reported an increase in task engagement made more correct detections than participants who declined in engagement. For these reasons, additional research is necessary to understand how state-based task engagement, or other individual difference states such as motivation, may be associated with decision-making criteria and performance strategies in semantic vigilance tasks.
Another goal of the present study was to elaborate upon the stress and workload associated with semantic vigilance tasks, which remain relatively understudied in the literature. In this experiment, increases in task engagement over the course of the vigil were related to reports of higher pretask distress. Individuals increasing in task engagement were more distressed at the onset of the task, but not upon completion of the task. Instead, changes in distress posttask were driven by the task type to which participants were assigned; individuals in the lure semantic vigilance task reported significantly greater distress upon conclusion of the task than individuals in the standard semantic vigilance task. This is likely driven by the stages of decision making required to complete the lure task effectively (Fraulini et al., 2017). For example, participants need to ensure the target is both a creature and a creature with four legs. In the standard semantic vigilance task, participants needed only to determine that a creature was presented amongst nontargets. Thus, task engagement may be related to pretask states of distress, but task demands ultimately affect the posttask levels of these stress states.
Study Limitations
One major limitation to the present study is that participants increasing in task engagement over time reported statistically higher pretask distress scores at the beginning of the semantic vigilance task. It is possible that increases in task engagement over the course of the vigil may have been linked to high distress; however, this group of participants did not report high distress scores posttask. Because it was necessary to assign participants to engagement groups based on their task engagement scores (increasing vs. decreasing), differences in scale usage between groups cannot be ruled out as a contributor to our observed differences between groups. Although this study does have its limitations, there are several implications for theory and practice.
Implications for Theory and Practice
Theoretically, the results of the present experiment indicate that increased task engagement is associated with better overall correct detection performance in two types of semantic vigilance tasks. This result was not statistically significant, but the pattern of means was in the anticipated direction for the performance observed over the course of the task. This further supports the notion that task engagement is highlighted by energetic arousal toward the task and the desire to succeed at performing the task (Matthews, 2016; Matthews et al., 2002; Matthews, Warm, Reinerman-Jones, Langheim, & Saxby, 2010; Matthews, Warm, Reinerman-Jones, Langheim, Washburn et al., 2010). Under the assumptions of cognitive resource theory, task engagement can be more predictive of vigilance performance than measures of worry or distress (Helton, Matthews, & Warm, 2009; Matthews et al., 2017), which the present study extended to semantic vigilance paradigms. However, future research should focus on the role of task engagement in longer duration semantic vigilance tasks and different semantic stimuli (nonwords, text words, etc.).
Practically, this research is important for applied settings that require a great deal of semantic processing for extended periods of time. These settings may include, but are not limited to, aviation, driving, reading, or cyber vigilance. There is also some potential utility in selecting operators who demonstrate greater task engagement. To use one applied example, Naval aviators must interpret new symbols (e.g., flight path mode), which are currently being integrated into fighter cockpits and heads-up displays, under fatigue states and after long-duration missions (Neigel & Priest, 2018). This new cockpit symbology is crucial in the pilot’s landing on aircraft carriers, which may occur under several sea and weather conditions, and is arguably a stressful task to perform (Neigel & Priest, 2018).
By understanding how state-based task engagement is related to semantic processing, we as human factors practitioners can make improved design recommendations. For example, the design of the task facilitates declines in performance over time (Hancock, 2013, 2017) and engagement factors are often overlooked in many applied vigilance settings (Szalma, 2009, 2014). Taken together, this evidence suggests that tasks may be designed in a way that maximizes performance while accounting for changes in task engagement. The relationship between task engagement and vigilance in semantic paradigms should not be neglected, and future research will need to focus on examining task engagement from a position of causal inference.
Key Points
The role of state-based task engagement is neglected in many vigilance settings. The present study suggests that changes in state-based task engagement may be associated with target detection performance.
Task engagement was not related to decreases in the stress and workload associated with semantic vigilance performance.
Future research on semantic vigilance should consider the effects of changes in state- and trait-based task engagement on performance, especially given the implications for cognitive resource theory.
Footnotes
Alexis R. Neigel earned her PhD in applied experimental and human factors psychology from the University of Central Florida in 2017. Her current research involves understanding human attention, decision-making, and operator workload.
Daryn A. Dever was an undergraduate research assistant in the Performance Research Laboratory at the University of Central Florida during this research. She is currently a first year doctoral student in the College of Education and Human Performance at the University of Central Florida.
Victoria L. Claypoole is a postdoctoral research fellow at the Air Force Research Laboratory at Wright-Patterson Air Force Base. She received her PhD in human factors and cognitive psychology in 2018 from the University of Central Florida.
James L. Szalma is an associate professor of psychology at the University of Central Florida and received his PhD in applied experimental and human factors psychology in 1999 from the University of Cincinnati. His primary research interests include vigilance performance, stress response, and workload.
