Abstract
The vigilance decrement is a decline in signal detection rate that occurs over time on a sustained-attention task. The effect has typically been ascribed to conservative shifts of response bias and losses of perceptual sensitivity. Recent work, though, has suggested that sensitivity losses in vigilance tasks are spurious, and other findings have implied that attentional lapses contribute to vigilance failures. To test these possibilities, we used Bayesian hierarchical modeling to compare psychometric curves for the first and last blocks of a visual vigilance task. Participants were a convenience sample of 99 young adults. Data showed evidence for all three postulated mechanisms of vigilance loss: a conservative shift of response bias, a decrease in perceptual sensitivity, and a tendency toward more frequent attentional lapses. Results confirm that sensitivity losses are possible in a sustained-attention task but indicate that mental lapses can also contribute to the vigilance decrement.
Monitoring and surveillance tasks require observers to detect critical signals that can occur infrequently and unpredictably. These tasks are challenging (Warm et al., 2008), and even if detection rates begin at acceptable levels, they tend to decline over the course of a half hour or less (Mackworth, 1948; Nuechterlein et al., 1983). This decline, the vigilance decrement, is one of the oldest and best-established phenomena in the human performance literature (Hancock, 2013). It afflicts performance in work (e.g., Pigeau et al., 1995) and academic settings (Young et al., 2009) and can be exacerbated in clinical disorders of attention (Huang-Pollock et al., 2012).
Analyzing the effect within the framework of signal detection theory (Green & Swets, 1966; Macmillan & Creelman, 2005), researchers have generally attributed the vigilance decrement to a combination of two mechanisms. The first is a conservative shift of response bias (Broadbent & Gregory, 1965) that occurs as observers adapt to the low signal rate. The second is a decline in sensitivity, the ability to discriminate signal from noise (Macmillan & Creelman, 2005). Although criterion shifts appear nearly ubiquitous in vigilance tasks (Broadbent & Gregory, 1965; Wickens et al., 2016), sensitivity losses are most likely when the stimulus event rate is high or when the task imposes heavy demands on perception or working memory (Nuechterlein et al., 1983; Parasuraman, 1979; See et al., 1995).
To explain this selective pattern of effects, theorists have proposed that a sensitivity decrement occurs only when a vigilance task is resource demanding and resources dedicated to the task decrease over time. Different explanations of the resource decline have been offered. One prominent account, the resource-depletion model, proposes that task demands gradually exhaust information-processing capacity (Warm et al., 1996, 2008). A newer hypothesis, the resource-control model, suggests that resources drift to task-unrelated thoughts as the result of flagging executive control (Thomson et al., 2015). In either case, the result is a drop in information-processing quality. Both the resource-depletion and the resource-control models fit well with the finding that vigilance tasks are characterized by high levels of subjective mental workload and stress (Grier et al., 2003; Warm et al., 2008).
But a recent review (Thomson et al., 2016) has suggested that apparent sensitivity decrements in sustained attention are spurious, an artifact of extremely low false-alarm rates. Vigilance experiments typically employ yes–no detection tasks, in which trials are classed into two levels, signal present or absent, and responses are binary. Changes of sensitivity and bias are distinguished by their relative effects on hit and false-alarm rates. Most simply, a loss of sensitivity tends to decrease hit rates and increase false-alarm rates, whereas a conservative change of response bias tends to reduce both variables. Hit and false-alarm rates thus provide the two degrees of freedom necessary to isolate sensitivity from bias in a yes–no task (Macmillan & Creelman, 2005). When false-alarm rates are near zero, further decreases produced by a conservative bias shift can be statistically undetectable, making changes of sensitivity and bias indistinguishable.
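The contrast between sensitivity and bias effects can be illustrated with a minimal sketch of an equal-variance Gaussian signal detection model. The function name and the parameter values below are hypothetical and chosen only for illustration; they are not estimates from any study cited here.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution


def yes_no_rates(d_prime: float, criterion: float) -> tuple[float, float]:
    """Return (hit rate, false-alarm rate) for an equal-variance Gaussian observer.

    `criterion` is c, measured from the midpoint between the noise and signal
    distributions; larger values mean more conservative responding.
    """
    hit = _STD_NORMAL.cdf(d_prime / 2 - criterion)
    false_alarm = _STD_NORMAL.cdf(-d_prime / 2 - criterion)
    return hit, false_alarm


# Hypothetical parameter values, for illustration only.
baseline = yes_no_rates(d_prime=2.0, criterion=0.5)
less_sensitive = yes_no_rates(d_prime=1.5, criterion=0.5)     # hits fall, false alarms rise
more_conservative = yes_no_rates(d_prime=2.0, criterion=1.0)  # hits and false alarms both fall

for label, (hit, fa) in [("baseline", baseline),
                         ("lower sensitivity", less_sensitive),
                         ("conservative shift", more_conservative)]:
    print(f"{label:>20}: hit = {hit:.3f}, false alarm = {fa:.3f}")
```

When the baseline criterion is already very conservative, the false-alarm rate in this model sits near zero, so the two manipulations differ mainly in a quantity that is hard to measure.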
Reviewing the vigilance literature, Thomson et al. (2016) note that the problem of near-zero false-alarm rates has been common; few studies have reported substantial numbers of false alarms, and many have inferred sensitivity losses from patterns of hit rate alone. To circumvent the problem of low false-alarm rates, researchers sometimes transform data to A′, a supposedly nonparametric measure of sensitivity, or to measures of predictive power (e.g., Szalma et al., 2006). Unfortunately, none of those measures isolates sensitivity from bias (Getty et al., 1995; Pastore et al., 2003). “The evidence for a sensitivity decrement,” Thomson and colleagues (2016, p. 74) concluded, “is extremely weak” (for a response, see Fraulini et al., 2017). This leaves the possibility that the vigilance decrement might be entirely the result of response-bias shifts. Alternatively, it allows that the decrement might reflect attentional lapses, occasional periods of mindlessness, or full disengagement from the vigilance task (Jerison et al., 1965; Robertson et al., 1997). In the absence of ceiling or floor effects, lapses would reduce hit and false-alarm rates equally, but with false alarms near zero, again, the effect of lapses would be evident only in hit rates. Although data show clearly that the vigilance decrement is not strictly the result of mindlessness (Grier et al., 2003; Helton & Warm, 2008), the hypothesis that lapses contribute to vigilance losses over time remains plausible.
Statement of Relevance
The ability to monitor for important events over long periods of time is critical in many work and academic settings. A common finding is that the detection rate for unpredictable signals decreases rapidly following the start of a task, but the mechanisms that explain this loss of vigilance remain the subject of debate. By applying a novel method of data analysis, we identified three separate forms of vigilance decline: a trend toward more conservative responding, a loss of perceptual quality, and a tendency toward mental lapses. The ability to distinguish these different forms of vigilance loss may enable better methods of identifying and mitigating vigilance losses outside the lab.
Much existing data thus seem compatible with three potential mechanisms of vigilance decrement, none exclusive of the others: sensitivity losses, bias shifts, and attentional lapses. Although binary signal detection tasks are not well-suited for distinguishing these mechanisms, an ideal way of testing them might be through analysis of the psychometric curve (Kingdom & Prins, 2016). The psychometric curve for a detection task displays positive-response rate as a function of signal intensity. As shown in Figure 1, it is an S-shaped function that can be characterized by three parameters. Shift, a measure of response bias, determines the horizontal position of the curve. Scale, a measure of sensitivity, determines the slope of the curve (a larger value of scale corresponds to lower sensitivity and a shallower curve). Lapse rate, the proportion of trials on which the observer selects a response independently of stimulus intensity, determines the curve’s asymptotes (the hypothetical data shown in the rightmost panel of Fig. 1 are based on the assumption that an attentional lapse always leads to a negative response). An empirical psychometric function spans a range of performance from near floor to near ceiling, in multiple steps. It therefore includes some points at which false-alarm rates are nonnegligible. Further, because a psychometric curve is based on three or more empirical data points, it provides the degrees of freedom needed to distinguish multiple mechanisms of vigilance loss.

Fig. 1. Sample psychometric curves showing positive-response rate as a function of signal strength. Values on the x-axis increase from left to right. Within each panel, solid and dashed lines indicate differing values of a single parameter, either shift (left), scale (middle), or lapse rate (right). Dashed lines represent functions with the higher parameter value. Hypothetical data in the right panel are based on the assumption that a lapse always leads to a negative response.
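As a rough illustration of the three parameters, the sketch below writes a logistic psychometric function in Python. It assumes, as in the hypothetical right-hand panel, that a lapse always produces a negative response; the function name and the example values are assumptions for illustration, and the model actually fitted in the study may be parameterized differently.

```python
import math


def psychometric(x: float, shift: float, scale: float, lapse_rate: float) -> float:
    """Probability of a positive response at signal strength x.

    shift      -- horizontal position of the curve (response bias)
    scale      -- inverse slope; larger values give a shallower curve (lower sensitivity)
    lapse_rate -- proportion of trials with a stimulus-independent response;
                  assumed here to always yield a negative response, which
                  lowers the upper asymptote to 1 - lapse_rate
    """
    logistic = 1.0 / (1.0 + math.exp(-(x - shift) / scale))
    return (1.0 - lapse_rate) * logistic


# Hypothetical values, for illustration only.
levels = [0.75, 1.25, 1.75, 2.25, 2.75, 3.25, 3.75]
for x in levels:
    p = psychometric(x, shift=2.0, scale=0.3, lapse_rate=0.05)
    print(f"x = {x:.2f}  P(yes) = {p:.3f}")
```

With no lapses, the curve passes through .50 at x equal to shift; increasing shift moves the whole curve rightward, and increasing scale flattens it around that point.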
A practical constraint that complicates the analysis of psychometric curves for vigilance tasks is the amount of data needed. Conventional analysis of the psychometric curve, using maximum-likelihood methods, requires hundreds of trials per observer (Kingdom & Prins, 2016). A standard vigilance task, however, elicits just a small number of responses over a period of viewing. Analysis using hierarchical Bayesian modeling (Kruschke & Vanpaemel, 2015; Lee, 2018; Lee & Wagenmakers, 2013) circumvents this difficulty. Hierarchical models assume that parameter values for individual observers are sampled from group-level distributions. The parameters of the group-level distribution can be estimated from observer-level data, even if data from the individual observers are sparse or incomplete (Lee, 2018), and without the risk of distortion that comes from simply aggregating individuals’ raw data or averaging their parameter estimates (Morey et al., 2008). This implies that group-level curves can be recovered from individuals’ data in a vigilance task.
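The hierarchical assumption can be sketched in the generative direction: each observer's parameters are drawn from group-level distributions, and each observer then contributes only a handful of binary responses per stimulus level. The code below is a minimal simulation under assumed, hypothetical group-level values and distributional forms; it does not perform the estimation step, which requires an MCMC sampler.

```python
import math
import random
from statistics import NormalDist

# Hypothetical stimulus levels in arbitrary units, for illustration only.
LEVELS = [-3, -2, -1, 0, 1, 2, 3]


def simulate_group(n_observers: int, trials_per_level: int, seed: int = 1) -> list[dict]:
    """Generate sparse yes-response counts per observer from a hierarchical model.

    All group-level means and standard deviations are hypothetical.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    observers = []
    for _ in range(n_observers):
        # Observer-level parameters are draws from group-level distributions.
        shift = rng.gauss(0.0, 0.5)
        scale = math.exp(rng.gauss(0.0, 0.25))             # log scale keeps it positive
        lapse_rate = std_normal.cdf(rng.gauss(-2.0, 0.4))  # probit scale keeps it in (0, 1)
        yes_counts = []
        for x in LEVELS:
            p_yes = (1.0 - lapse_rate) / (1.0 + math.exp(-(x - shift) / scale))
            yes_counts.append(sum(rng.random() < p_yes for _ in range(trials_per_level)))
        observers.append({"shift": shift, "scale": scale,
                          "lapse_rate": lapse_rate, "yes_counts": yes_counts})
    return observers


group = simulate_group(n_observers=99, trials_per_level=6)
print(group[0]["yes_counts"])  # one observer's sparse data: counts out of 6 per level
```

With only a few trials per level, any single observer's counts are too coarse to fit a curve reliably, which is the situation in which pooling information through group-level distributions becomes useful.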
Here, we used Bayesian hierarchical modeling to test for changes in response bias, sensitivity, and lapse rate in a vigilance task. Figure 2 presents a timeline of events across a representative series of trials. Participants were asked to view a pair of small probes on each trial and to respond any time the distance between the probes exceeded a criterion value. The stimulus and task design were intended to maximize the possibility of a sensitivity decrement in multiple ways. First, probes appeared without an on-screen standard of comparison, requiring the participants to hold the criterion gap size in working memory. Second, trials occurred at a forced pace of 40 per minute, imposing high time stress. Finally, probes were embedded in visual noise, imposing high sensory-perceptual load. All of these characteristics—memory load, time stress, and perceptual load—have been associated with sensitivity losses in earlier work (Nuechterlein et al., 1983; Parasuraman, 1979). Posttask ratings of cognitive workload were collected to characterize task difficulty.

Fig. 2. Sequence of events over a representative series of trials. On each trial, participants viewed a pair of small circular probes embedded in visual noise. They were asked to respond any time the distance between the probes exceeded a criterion value (2 cm). Items are not drawn to scale.
Method
Participants
Participants (N = 99; 67 female and 32 male, by free-response self-report; age: M = 19.44 years, SD = 1.97) were a convenience sample recruited from the departmental subject pools at two large public universities in the United States. Sample size was determined by the number of participants who enrolled between the time the experiment was posted and the end of the academic term. Participants were screened for normal or corrected-to-normal visual acuity using a conventional eye chart.
This research complied with the tenets of the Declaration of Helsinki and was approved by the institutional review board at each testing site. Informed consent was obtained from all participants.
Stimuli and procedure
Experimental software was written in PsychoPy (Version 3.2.4; Peirce et al., 2019). The stimulus on each trial was a pair of probes embedded in dynamic punctate visual noise. Stimuli were drawn in black against a white background. Probes were unfilled circles, drawn in a three-pixel stroke and with a radius of 1 mm. The two probes on each trial were arranged horizontally, separated by a distance that varied as described below. The noise field consisted of 10,000 elements, 2 × 2 pixels each, within a circular field that had a radius of 4 cm and was centered within the display. Stimuli were viewed at a distance of approximately 57 cm, although participants were free to move their heads. Data were collected at two test sites. Stimuli at one site were presented on a 24-in. LED monitor with a resolution of 1,024 × 768 pixels and a vertical refresh rate of 75 Hz. Stimuli at the other site were presented on a 24-in. LED monitor with a resolution of 1,920 × 1,080 pixels and a vertical refresh rate of 60 Hz.
Participants were asked to press the space bar on each trial if the gap between probes exceeded a criterion value of 2 cm and to withhold response otherwise. This choice of stimuli and task was intended to ensure that an increase in the criterion characteristic, the distance between probes, did not change the space-averaged luminance or contrast of the display in a way that might capture attention (Theeuwes et al., 2010). For exposition, we define a gap greater than the criterion distance as a signal.
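Across trials, the distance between probes varied between 0.75 cm and 3.75 cm, in steps of 0.5 cm. The gap $x_t$ for a given trial $t$ was chosen probabilistically in a two-step procedure. First, the trial was randomly designated to be either a noise or a signal event, where the probability of signal was .15. Second, the gap size was selected randomly and with uniform probability from among the range of values corresponding to the designated trial type, that is,
$$x_t \sim \begin{cases} \mathrm{Uniform}\{0.75, 1.25, 1.75\}\ \text{cm} & \text{if trial } t \text{ is a noise event,} \\ \mathrm{Uniform}\{2.25, 2.75, 3.25, 3.75\}\ \text{cm} & \text{if trial } t \text{ is a signal event.} \end{cases}$$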
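The two-step selection procedure can be sketched as follows. This is an illustrative reimplementation in plain Python, not the authors' PsychoPy code; the constant and function names are hypothetical, and the gap values follow from the stated range, step size, and 2-cm criterion.

```python
import random

NOISE_GAPS_CM = [0.75, 1.25, 1.75]          # gaps below the 2-cm criterion
SIGNAL_GAPS_CM = [2.25, 2.75, 3.25, 3.75]   # gaps above the 2-cm criterion
P_SIGNAL = 0.15                             # signal probability in the experimental run


def sample_gap(rng: random.Random) -> tuple[float, bool]:
    """Return (gap in cm, is_signal) for one trial."""
    # Step 1: designate the trial as a signal or a noise event.
    is_signal = rng.random() < P_SIGNAL
    # Step 2: pick a gap uniformly from the values for that trial type.
    gap = rng.choice(SIGNAL_GAPS_CM if is_signal else NOISE_GAPS_CM)
    return gap, is_signal


rng = random.Random(2021)  # arbitrary seed so the example is reproducible
trials = [sample_gap(rng) for _ in range(20000)]
signal_share = sum(is_signal for _, is_signal in trials) / len(trials)
print(f"proportion of signal trials: {signal_share:.3f}")
```

Because signals are rare and are spread over four gap sizes, each individual signal gap size occurs on fewer than 4% of trials.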
Each trial consisted of a 500-ms stimulus display followed by a blank interval of 1,000 ms. The subsequent trial began immediately thereafter, producing an event rate of 40 trials per minute. A response was attributed to trial t if it occurred before onset of the stimulus display for trial t + 1. Participants received no posttrial feedback to indicate whether their judgments were correct or incorrect.
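As a worked check of the trial counts implied by these timings (using the signal probability of .15, the four signal gap sizes, the 20-min experimental run, and the 4-min blocks used in the analysis):

```latex
\begin{align*}
\text{trial duration} &= 500\ \text{ms} + 1{,}000\ \text{ms} = 1.5\ \text{s} \\
\text{event rate} &= \frac{60\ \text{s/min}}{1.5\ \text{s/trial}} = 40\ \text{trials/min} \\
\text{trials per 20-min run} &= 40 \times 20 = 800 \\
\text{trials per 4-min block} &= 40 \times 4 = 160 \\
\text{expected signals per block} &= 0.15 \times 160 = 24 \\
\text{expected trials per signal gap size per block} &= 24 / 4 = 6
\end{align*}
```

Each participant therefore contributes, on average, only about six trials per signal gap size in a given block.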
After reading on-screen instructions, each participant completed a practice run of 3 min, followed by an experimental run of 20 min. Instructions included a sample stimulus display in which probes were separated by 2 cm. The practice run was identical to the experimental run except that (a) the signal rate was 0.50; (b) for the first 60 s, the stimulus display on each trial remained visible for the full trial duration of 1,500 ms; and (c) response errors were followed by a 1-s feedback message reading either, “Oops! It was not a target,” or “Oops! You missed a target,” as appropriate.
Immediately following the experimental run, participants were asked to complete a version of the Subjective Workload Assessment Technique (SWAT) known as ASWAT, in which ratings were made on continuous scales (Luximon & Goonetilleke, 2001). This rating scale for subjective mental workload comprises three items, corresponding to three aspects of a task: mental effort, time load, and psychological stress. Items were presented on screen one at a time, and participants used a mouse to report their ratings on a visual analog scale. Responses were coded on a scale from 0 to 100.
Data analysis
In the first step of analysis, binary responses from the full 20-min vigilance task were converted to a signal detection theory measure of sensitivity, d′, and participants with d′ less than or equal to 0.25 were screened from further analysis. The purpose of this screening was to remove participants with sensitivity near chance (d′ = 0), which might imply a misunderstanding of or failure to comply with task instructions. To correct for ceiling- or floor-level hit and false-alarm rates, we calculated d′ scores using a log-linear correction (Hautus, 1995). No participants were excluded on the basis of this preliminary screening. Across participants, mean d′ for the full 20-min task was 2.74 (range = 1.65−4.15, SD = 0.48); mean hit rate was .90 (range = .59–1.00, SD = .10); mean false-alarm rate was .14 (range = .00–.51, SD = .11); and mean proportion of correct responses, calculated as the average of the hit and correct-rejection rates (Macmillan & Creelman, 2005), was .88 (range = .74–.98, SD = .05).
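A minimal sketch of the d′ calculation with the log-linear correction is given below. The correction adds 0.5 to the hit and false-alarm counts and 1 to the numbers of signal and noise trials before the rates are computed; the function name and the example counts are hypothetical.

```python
from statistics import NormalDist


def d_prime_loglinear(hits: int, misses: int,
                      false_alarms: int, correct_rejections: int) -> float:
    """d' with the log-linear correction for rates of exactly 0 or 1."""
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    return z(hit_rate) - z(fa_rate)


# Hypothetical counts: a perfect hit rate and no false alarms would make the
# uncorrected d' undefined, but the corrected value stays finite.
perfect = d_prime_loglinear(hits=120, misses=0, false_alarms=0, correct_rejections=680)
chance = d_prime_loglinear(hits=10, misses=10, false_alarms=10, correct_rejections=10)
print(f"perfect performance: d' = {perfect:.2f}")
print(f"chance performance:  d' = {chance:.2f}")
```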
Subsequent statistical analysis employed hierarchical Bayesian parameter estimation using a Markov chain Monte Carlo (MCMC) sampling procedure (Kruschke, 2015; Lee & Wagenmakers, 2013). Gap-discrimination data were fitted with logistic psychometric curves (Lee, 2018) characterized by shift, scale, and lapse rate. To simplify model fitting and hypothesis testing, we compared performance between only the first 4-min block and last 4-min block of the 20-min experimental run. The choice of 4 min for block duration was intended to prevent rapid vigilance losses from diluting comparisons between blocks; past work has found that vigilance losses can begin within 5 to 10 min of task onset (Nuechterlein et al., 1983). The model placed unit-normal priors on the standardized mean differences in shift, scale, and lapse rate between first and last blocks (Lee & Wagenmakers, 2013) and assigned vague priors to all remaining parameters. To maintain consistency and facilitate comparisons across parameters, we placed priors on probit-transformed lapse rates rather than on raw lapse rates. This allowed the model to use the same priors for the effect of block on lapse rates as for the effect of block on shift and scale. Full details of the model are presented in the Supplemental Material available online.
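The probit transform referred to above maps a lapse rate in (0, 1) onto the unbounded real line, which is what lets a normal prior be placed on between-block differences. A minimal sketch follows, with hypothetical lapse rates chosen only for illustration:

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def probit(p: float) -> float:
    """Map a probability in (0, 1) to the real line (inverse standard normal CDF)."""
    return _STD_NORMAL.inv_cdf(p)


def inverse_probit(z: float) -> float:
    """Map a real value back to a probability in (0, 1)."""
    return _STD_NORMAL.cdf(z)


# Hypothetical lapse rates for a first and a last block.
first_block, last_block = 0.01, 0.05
difference = probit(last_block) - probit(first_block)
print(f"probit({first_block}) = {probit(first_block):.3f}")
print(f"probit({last_block}) = {probit(last_block):.3f}")
print(f"difference on the probit scale = {difference:.3f}")
```

On the raw scale, a difference between two lapse rates is bounded by the interval (0, 1); on the probit scale it can take any real value, like differences in shift and scale.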
We used the Savage-Dickey ratio (Wagenmakers et al., 2010) to estimate Bayes factors for or against effects of block on each of the three logistic parameters. As a check of model fit, we calculated 95% posterior predictive credible intervals on the basis of a random sample of 1,000 steps from the MCMC chains.
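The Savage-Dickey ratio estimates the Bayes factor for a point null hypothesis as the prior density at the null value divided by the posterior density at the same value. The sketch below assumes a unit-normal prior on the standardized effect and, as a simplification that is not necessarily the density estimator used in the study, approximates the posterior by a normal distribution fitted to the MCMC samples. The function name and the simulated samples are hypothetical.

```python
import random
from statistics import NormalDist, fmean, stdev


def savage_dickey_bf10(posterior_samples: list[float],
                       prior: NormalDist = NormalDist(0.0, 1.0)) -> float:
    """Approximate BF10 for H0: effect = 0 from posterior samples of the effect.

    Assumption: the posterior is approximated by a normal density. If the
    posterior lies very far from zero, its density at zero can underflow to
    0.0, and the division below will raise ZeroDivisionError.
    """
    posterior = NormalDist(fmean(posterior_samples), stdev(posterior_samples))
    return prior.pdf(0.0) / posterior.pdf(0.0)


rng = random.Random(7)  # arbitrary seed for reproducibility
# Simulated stand-ins for MCMC output, not data from the study.
uninformative = [rng.gauss(0.0, 1.0) for _ in range(20000)]  # posterior equals prior
clear_effect = [rng.gauss(1.0, 0.2) for _ in range(20000)]   # posterior well away from zero

print(f"BF10, posterior equal to prior: {savage_dickey_bf10(uninformative):.2f}")
print(f"BF10, clear effect:             {savage_dickey_bf10(clear_effect):.0f}")
```

A posterior that matches the prior gives a Bayes factor near 1, whereas a posterior with little mass near zero gives a large Bayes factor in favor of an effect. Estimates of very large Bayes factors depend on the posterior tail and are correspondingly imprecise.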
Responses for the ASWAT subscales were analyzed separately within a model that placed a normal likelihood function on observed ratings and placed uniform priors, U(0, 100), on the group means and standard deviations of the ratings.
Estimation was performed in the R programming environment (Version 4.0.3; R Core Team, 2020) using JAGS (Version 4.3.0; Plummer, 2019). The estimation procedures ran four MCMC chains for 10,000 warm-up steps and then 50,000 estimation steps each.
Results
All parameter estimates showed

Fig. 3. Mean response rate for the first and last 4-min blocks of the vigilance task (left) and mean change in response rate between the first and last blocks (right), both shown as a function of gap length between the probes. Response rate was calculated as the proportion of trials on which the participant reported that gap size exceeded the criterion value (2 cm). Symbols in both plots represent empirical mean values; error bars represent 95% posterior predictive credible intervals. Gray triangles mark the criterion value separating noise from signal events.
Figure 4 presents posterior density functions of standardized mean differences in shift, scale, and probit of the lapse rate between the first and last blocks of trials. Values reflect estimated group-level differences between blocks, divided by the estimated standard deviation of the differences. All three parameters increased between the first and last blocks (all Bayes factors favoring the alternative over the null hypothesis [BF10s] > 450), indicating a conservative drift of response bias, a loss of sensitivity, and an increasing tendency toward lapses.

Fig. 4. Density plots showing the standardized mean parameter difference between the first and last blocks of the vigilance task, separately for shift (top), scale (middle), and probit-transformed lapse rate (bottom).
Interestingly, standardized mean differences between blocks were similar for scale and the probit-transformed lapse rate and were only slightly smaller for shift. However, such standardized effect sizes do not reveal how strongly each of the three mechanisms contributed to the observed changes in raw response rates (Pek & Flora, 2018). To allow a more meaningful comparison of the effects of shift, scale, and lapse-rate changes, we recalculated posterior predictive intervals for the differences in response rate between blocks, incorporating only one of the three mechanisms at a time. The graph on the left of Figure 5 presents the difference estimates produced by a selective effect of block on shift, holding scale and lapse rate constant. The graph in the middle of the figure presents the estimates produced by a selective effect of block on scale, and the graph on the right presents the estimates produced by a selective effect of block on lapse rate. Changes in shift led to large decreases in response rate between blocks. These decreases were largest near the boundary value of gap size separating noise from signal events and were negligible at more extreme values of gap size. Changes in scale produced more modest effects, causing a small increase in the response rate for events just below the signal boundary and a small decrease in response rate for events just above the signal boundary. Changes in lapse rate produced a decrease in response rate that was most pronounced for events above the signal boundary. Conservative shifts of response bias therefore accounted for most of the observed differences in response rate between blocks.

Fig. 5. Selective effects of shift (left), scale (middle), and lapse-rate (right) changes on the posterior predictive differences in response rate between blocks. Results are shown as a function of gap size. Filled symbols represent mean posterior predictive response-rate differences, and error bars represent 95% credible intervals. Open symbols are empirical data replotted from Figure 3 and represent the combined effects of shift, scale, and lapse-rate changes. Gray triangles mark the criterion value separating noise from signal events.
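The logic of the one-mechanism-at-a-time comparison can be sketched with point values in place of full posterior samples. The code below changes a single parameter from its first-block value to its last-block value, holds the other two constant, and computes the resulting change in response rate at each gap size. All parameter values are hypothetical, the logistic form assumes that a lapse always yields a negative response, and the analysis reported here used posterior predictive intervals rather than point values.

```python
import math

GAPS_CM = [0.75, 1.25, 1.75, 2.25, 2.75, 3.25, 3.75]

# Hypothetical first- and last-block parameter values, for illustration only.
FIRST = {"shift": 2.0, "scale": 0.25, "lapse_rate": 0.01}
LAST = {"shift": 2.2, "scale": 0.30, "lapse_rate": 0.04}


def psychometric(x: float, shift: float, scale: float, lapse_rate: float) -> float:
    """Logistic psychometric function; a lapse is assumed to give a negative response."""
    return (1.0 - lapse_rate) / (1.0 + math.exp(-(x - shift) / scale))


def selective_effect(parameter: str) -> list[float]:
    """Change in response rate at each gap when only `parameter` changes between blocks."""
    counterfactual = dict(FIRST)
    counterfactual[parameter] = LAST[parameter]
    return [psychometric(g, **counterfactual) - psychometric(g, **FIRST) for g in GAPS_CM]


for name in ("shift", "scale", "lapse_rate"):
    changes = ", ".join(f"{d:+.3f}" for d in selective_effect(name))
    print(f"{name:>10}: {changes}")
```

Under these assumed values, the shift change lowers response rates most near the 2-cm boundary, the scale change raises rates just below the boundary and lowers them just above it, and the lapse-rate change lowers rates mainly for gaps above the boundary, matching the qualitative pattern described in the text.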
We excluded ASWAT data for one participant who failed to complete all three subscales. Estimated mean ASWAT ratings for the remaining participants were 57.7 (95% Bayesian credible interval [BCI] = [52.5, 63.0]) for time load, 72.3 (95% BCI = [67.9, 76.7]) for mental effort, and 50.4 (95% BCI = [45.3, 55.5]) for psychological stress. Values were near or above the scale midpoints, consistent with earlier findings that vigilance tasks impose significant workload.
Discussion
Although the vigilance decrement is well-known, disagreements about its mechanisms persist. The effect has often been ascribed to changes in response bias and losses of perceptual sensitivity (Broadbent & Gregory, 1965; Parasuraman, 1979). However, other work has suggested that apparent sensitivity changes in vigilance tasks have been artifactual (Thomson et al., 2016) and that vigilance declines may result either from bias changes alone (Broadbent & Gregory, 1965) or from bias changes in conjunction with attentional lapses (Robertson et al., 1997).
Our findings indicate that, in fact, bias changes, sensitivity losses, and attentional lapses can all contribute to the vigilance decrement. Psychometric curves produced decisive evidence for changes in shift, scale, and lapse rate over blocks of a vigilance task, and all changes were in the direction of reducing the signal detection rate. These results refute the suggestion that sensitivity losses in vigilance tasks are spurious, confirming that the ability to distinguish signal from noise can decline even over a fairly short period of time. In addition, they confirm that attentional lapses—response failures that are independent of stimulus intensity—are a source of vigilance decrement.
The observed increase in lapse frequency across blocks is incompatible with a model that assumes resource depletion without bouts of mindlessness (Warm et al., 2008). Conversely, increases in the scale of the psychometric function over the vigilance task disconfirm a model that presumes mindlessness without sensitivity loss for stimulus-driven responses (Robertson et al., 1997). The results might indicate separate and independent processes of resource depletion and sporadic mental lapses. Alternatively, they could reflect a process like that of Thomson et al.’s (2015) resource-control model, in which the executive system diverts processing resources from the vigilance task, producing sensitivity losses, and occasionally disengages completely from stimulus-driven processing, producing lapses. Research to dissociate lapses from sensitivity losses, within or between observers, might distinguish these models from each other.
Finally, although they give evidence that all three contribute to vigilance failures, our data do not imply that shifting criteria, waning sensitivity, and lapsing attention are equally consequential. Criterion shifts alone accounted for most of the observed change in response rates across blocks. This result suggests that even when sensitivity losses and attentional lapses are possible, interventions targeting response bias may be the most effective method of fighting the vigilance decrement. Additional research will be needed, though, to generalize that conclusion and, more broadly, to extend the current results beyond the particular task and population tested here.
Supplemental Material
sj-pdf-1-pss-10.1177_09567976211007559 – Supplemental material for Psychometric Curves Reveal Three Mechanisms of Vigilance Decrement
Supplemental material, sj-pdf-1-pss-10.1177_09567976211007559 for Psychometric Curves Reveal Three Mechanisms of Vigilance Decrement by Jason S. McCarley and Yusuke Yamani in Psychological Science
Footnotes
Transparency
Action Editor: Krishnankutty Sathian
Editor: Patricia J. Bauer
Author Contributions
J. S. McCarley and Y. Yamani conceptualized the study. J. S. McCarley developed the methodology, programmed the software used in the study, and collected and analyzed the data. Both authors ran the experiment. J. S. McCarley created the figures and wrote the first draft of the manuscript. Both authors reviewed and edited the manuscript and approved the final version for submission.
