Abstract
Objective:
We aimed to discover how varying the length of task breaks would affect the time-on-task effect in subsequent testing periods.
Background:
An important means of preventing errors and accidents caused by mental fatigue and time on task is to intersperse rest intervals within long work periods. Most studies of rest pauses to date have examined their effects in real-world tasks and settings, and their subtler effects on behavior, as measurable by laboratory paradigms, are not well understood.
Method:
We studied a group of 71 participants as they completed a 1-hr auditory oddball task with two rest opportunities. Rest intervals were 1, 5, or 10 min long.
Results:
Improvements in reaction time were significantly positively associated with length of the rest break. However, longer breaks were also associated with steeper decrements in performance in the subsequent task block. Across individuals, the amount of immediate improvement correlated with the extent of later decline.
Conclusion:
Our results support a resource/effort-allocation model of fatigue, whereby longer breaks bias participants toward greater effort expenditure on resumption of the task when cognitive resources may not have been fully replenished.
Application:
These findings may have implications for the refinement of work-rest schedules in industries where time-on-task degradation in performance is an important concern.
Introduction
A basic and ubiquitous countermeasure for on-the-job fatigue is the timely administration of rest breaks during extended periods of work. The restorative effect of breaks in industrial settings has been extensively studied. These experiments have suggested that, on the whole, rest breaks are effective in the short term for increasing productivity (Colquhoun, 2007; Dababneh, Swanson, & Shell, 2001), reducing the risk of on-the-job errors and accidents (Phipps-Nelson, Redman, & Rajaratnam, 2011; Tucker, Folkard, & Macdonald, 2003), reducing boredom and fatigue, and improving subjective well-being (Dababneh et al., 2001; Drory, 1985; for a review, see Tucker, 2003).
Importantly, rest breaks have been shown to mitigate objective performance declines attributable to time on task (TOT; Helton & Russell, 2012; Ross, Russell, & Helton, 2014). Signs of TOT decline include increased errors and reduced speed of output (Helton & Russell, 2011; Mackworth, 1964; Warm, Parasuraman, & Matthews, 2008), which can subsequently affect safety at work (Baker, Olson, & Morisseau, 1994). The impact of TOT on the ability to sustain attention has an especially large and direct effect on workplace performance (Warm et al., 2008; Williamson et al., 2011). As TOT effects can be observed in many commonly used psychological tasks, including tests of sustained attention, the phenomenon is amenable to study in controlled environments. As a result, the cognitive and neurophysiological basis of TOT is becoming increasingly well described (Langner & Eickhoff, 2013; Parasuraman & Jiang, 2012).
Understanding the precise effects of rest breaks on the trajectory of the TOT effect on attention is important for the calibration of work-rest schedules that optimize worker productivity while minimizing the detrimental effects of fatigue (Bhatia & Murrell, 1969; Boucsein & Thum, 1997), as effective performance in many work settings relies heavily on the maintenance of a stable level of attention. However, relatively few studies have investigated these effects using laboratory-based cognitive paradigms. Such studies may reveal finer-grained behavioral effects that are masked by the complexity of experiments that mimic real-world work performance. Critically, it is not yet clearly understood how factors such as break length affect the amount of recovery the rest period provides. One further issue that has not been addressed using laboratory paradigms is whether the immediate improvement afforded by a break comes at any later cost. This issue is hinted at by the fact that break opportunities do not always result in improved performance when composite postbreak measures are used as a basis of comparison (Drory, 1985).
In the present study, we tested how altering the length of rest periods would affect recovery from the TOT effect on an auditory oddball task, a test that requires sustained attention for effective performance. On the basis of predictions of resource theory (Kahneman, 1973; Parasuraman, 1979; Sanders, 1997; Warm et al., 2008), we hypothesized that longer break lengths would result in better immediate postbreak performance without any cost to later performance being incurred. In other words, we predicted that longer breaks would yield a mean improvement in performance with no detrimental effect on subsequent declines in performance due to TOT. Finding an opposite effect (that is, if longer breaks resulted in less improvement or worse performance) would argue against a purely resource-based account and suggest that other factors, such as underload, may also be driving performance efficiency over TOT. Finally, changes affecting the rate of decline in subsequent blocks would support alternative models of compensatory control (Hockey, 1997) or opportunity cost (Kurzban, Duckworth, Kable, & Myers, 2013).
Materials and Methods
Eighty students and staff members from the National University of Singapore were recruited as participants for this study. Participants had no history of attentional disorders. We prescreened participants to ensure they had normal hearing using the Quick Hearing Check (Koike, Hurst, & Wetmore, 1994), a 15-item self-report scale on which greater scores represent poorer hearing. Participants were excluded at the prescreening stage if their score on this scale was 9 or greater (i.e., if they were below the 90th percentile in hearing ability). Volunteers were asked to refrain from using caffeine or alcohol for 6 hr prior to coming into the lab. Study sessions took place in the afternoon between 1:30 and 5:30 p.m. to control for possible circadian confounds. All testing took place in the experimental rooms of the SINAPSE laboratories in the National University of Singapore. Participants who completed the experiment were compensated S$10 for approximately 1 hr of their time. The protocol for this study was approved by the institutional review board of the National University of Singapore.
Study Procedure
Participants in this experiment performed an auditory oddball task, the details of which have been previously reported (Lim, Quevenco, & Kwok, 2013). We reused this paradigm with slight modifications. Briefly, participants were instructed to monitor a stream of tones for a target tone that occurred 25% of the time. Nontargets were 996 Hz tones of ~65 dB presented for 1 s, and targets had a 50% drop in amplitude in the final 50 ms of the stimulus duration. The interstimulus interval was kept constant at 2 s, yielding a total trial time of 3 s. In summary, the event rate was 20 events per minute, and the signal rate was 25%. Targets were presented pseudorandomly such that every 15-s interval contained at least one target, and targets never appeared consecutively. Participants were told to respond to target tones via a mouse click with their dominant hand and to ignore nontargets. Prior to the actual experiment, they were given a 1-min practice run, in which they were required to achieve an accuracy rate of 80%. Ten targets were presented in this practice run. Participants who could not achieve this level of performance after three practice runs were excluded from the study; all excluded individuals reported that they could not discriminate between the target and neutral tones. Nine participants were screened out for this reason, leaving a final sample of N = 71 (38 male; mean age = 23, SD = 2.8).
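The constraints on the target sequence can be illustrated with a short sketch. This is not the authors' stimulus code, which is not described here; it is one possible way to build a 300-event block with 75 targets under two assumptions: that "every 15-s interval" means non-overlapping windows of five trials (5 × 3 s), and that targets beyond the one required per window are scattered uniformly. The function name and parameters are hypothetical.

```python
import random


def make_target_sequence(n_events=300, n_targets=75, window=5, seed=None):
    """Return a list of booleans (True = target) for one task block.

    Constraints taken from the task description:
      * every window of `window` consecutive trials holds at least one target
      * no two targets are adjacent
    Assumption: windows are non-overlapping (trials 0-4, 5-9, ...).
    """
    n_windows = n_events // window
    if n_events % window or n_targets < n_windows:
        raise ValueError("incompatible block size, window, or target count")
    rng = random.Random(seed)
    is_target = [False] * n_events

    # Step 1: one target per window, never directly after the previous target.
    prev = -2
    for w in range(n_windows):
        start = w * window
        options = [i for i in range(start, start + window) if i != prev + 1]
        prev = rng.choice(options)
        is_target[prev] = True

    # Step 2: scatter the remaining targets over trials with no target neighbour.
    remaining = n_targets - n_windows
    order = list(range(n_events))
    rng.shuffle(order)
    for i in order:
        if remaining == 0:
            break
        left = i > 0 and is_target[i - 1]
        right = i < n_events - 1 and is_target[i + 1]
        if not is_target[i] and not left and not right:
            is_target[i] = True
            remaining -= 1
    if remaining:
        raise ValueError("could not place all targets under the constraints")
    return is_target


block = make_target_sequence(seed=1)
signal_rate = sum(block) / len(block)  # 75 / 300 = 0.25
```

With the default arguments, each block matches the stated event count (300), target count (75), and 25% signal rate.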
After successfully achieving criterion on the practice runs, but prior to the experiment, participants were assigned at random to one of six conditions that determined the length and order of the breaks they would receive; break periods were short (1 min; S), medium (5 min; M), or long (10 min; L), and break orders were all ordered pairs of two different lengths (S-M, S-L, M-L, M-S, L-S, L-M). All conditions contained 12 participants except L-M, which had 11. We elected to use this distribution of conditions to keep the task length and number of comparisons manageable while reaping the benefits of having within-subject comparisons in a mixed design.
Participants then performed three 15-min blocks of the auditory oddball task, separated by the two break periods of their assigned length (Figure 1). Each task block contained 300 events, of which 75 were targets. Participants were instructed before the task to emphasize both speed and accuracy in responding. A screen also appeared briefly at the start of each break to instruct participants to relax while maintaining their gaze on a fixation cross presented in the middle of the display. Before and after the paradigm, participants completed the Short Stress State Questionnaire (SSSQ; Helton, 2004; Helton & Naswall, 2015), a brief survey that measures task engagement, distress, and worry, variables that are typically affected by fatigue.
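As an illustration of this design, the sketch below enumerates the six break orders and produces a balanced random assignment. The randomization procedure actually used is not described, so the slot-shuffling scheme, the function names, and the seed are assumptions made for the example only.

```python
import random
from collections import Counter
from itertools import permutations

# Break lengths in minutes, as stated in the text.
BREAK_MINUTES = {"S": 1, "M": 5, "L": 10}


def break_orders():
    """All ordered pairs of two different break lengths (six conditions)."""
    return list(permutations(BREAK_MINUTES, 2))


def assign_conditions(n_participants, seed=None):
    """Balanced random assignment (hypothetical scheme, not the authors')."""
    rng = random.Random(seed)
    orders = break_orders()
    per_condition = -(-n_participants // len(orders))  # ceiling division
    slots = orders * per_condition  # equal number of slots per condition
    rng.shuffle(slots)
    return slots[:n_participants]


assignment = assign_conditions(71, seed=0)
counts = Counter(assignment)  # five conditions with 12, one with 11

# Total rest time per condition: 6, 11, or 15 min on top of 45 min of task.
total_rest = {order: sum(BREAK_MINUTES[b] for b in order) for order in break_orders()}
```

With 71 participants this scheme always yields five conditions of 12 and one of 11, although which condition ends up short depends on the shuffle.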

Figure 1. Schematic of the task paradigm. Participants performed three 15-min task blocks separated by two rest periods, lasting 1, 5, or 10 min. Each participant received two of these three rest lengths, in one of two orders (six total conditions). The oddball task involved monitoring a series of tones for an uncommon (25%) tone that decreased in amplitude in its last 50 ms.
Data Analysis
To measure the effect of TOT, we tested for differences in median reaction time (RT) and accuracy (hit rate and false alarms) in each of the task blocks using one-way repeated-measures ANOVA. Hit rate was calculated as the proportion of correct detections, and false alarms were defined as responses to nontargets. As the number of false alarms was small, we report this variable in absolute numbers. To test the main effect of taking a break on task performance, we first created a plot of RT against target-trial number in each task block, with no data entered if the participant did not make a response for that particular event. Nonresponses to targets did not affect the trial numbers of subsequent targets. We then fitted a linear slope to this plot and performed paired t tests between the predicted RT at the end of the prebreak block and the y-intercept at the beginning of the postbreak block. The former value is the y-value of the fitted line when x = 75 (i.e., the predicted RT of the final target in the block). To test the effects of the different break lengths, we used this same linear fit and extracted (a) the change (difference) in slope from the prebreak to the postbreak block and (b) the percentage change between the predicted RT at the end of the prebreak block and the y-intercept at the beginning of the postbreak block (illustrated in Figure 2). These two variables were subjected to linear mixed-model ANOVAs (West, Welch, & Galecki, 2015) using a diagonal variance-covariance matrix with break length (S, M, L) as a within-subjects factor and break number as a between-subjects factor. Break number refers to whether the break was the first or the second break received during the task. Linear mixed-model analysis was used as each participant underwent only two of the three possible break-length conditions, thus yielding incomplete data for a full repeated-measures analysis.
Post hoc comparisons for these tests were adjusted for multiple comparisons using Holm-Sidak corrections (Holm, 1979). Box corrections were performed for all one-way repeated-measures ANOVAs. Statistical analysis was carried out using MATLAB R2011b and SPSS for Windows, Version 21.0.
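The two dependent measures can be sketched as follows. This is a minimal pure-Python illustration rather than the authors' MATLAB code. It assumes that missed targets are stored as `None`, that the slope change is computed as postbreak minus prebreak, and that the RT change is expressed as a proportion of the prebreak value (multiply by 100 for a percentage). The function and key names are hypothetical.

```python
def fit_line(rts):
    """Least-squares fit of RT against target-trial number (1-based).

    `rts` holds one entry per target in a block; None marks a missed target,
    which is skipped without shifting the trial numbers of later targets.
    Returns (slope, intercept).
    """
    points = [(x, y) for x, y in enumerate(rts, start=1) if y is not None]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def break_measures(pre_rts, post_rts, n_targets=75):
    """Dependent measures for one break, from the blocks before and after it."""
    pre_slope, pre_intercept = fit_line(pre_rts)
    post_slope, post_intercept = fit_line(post_rts)
    # Predicted RT of the final target in the prebreak block (x = 75).
    pre_end_rt = pre_intercept + pre_slope * n_targets
    return {
        "pre_end_rt": pre_end_rt,
        "post_intercept": post_intercept,
        # (a) change in slope; sign convention (post minus pre) is an assumption
        "slope_change": post_slope - pre_slope,
        # (b) proportional decrease from prebreak end to postbreak intercept
        "rt_change": (pre_end_rt - post_intercept) / pre_end_rt,
    }
```

A positive `rt_change` indicates faster predicted responding immediately after the break than at the end of the preceding block.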
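For readers unfamiliar with the correction, the step-down Holm-Sidak adjustment can be sketched as below. This follows the standard textbook definition and is not necessarily identical to the SPSS implementation used for the reported values; the function name is hypothetical.

```python
def holm_sidak(p_values):
    """Step-down Holm-Sidak adjusted p values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest p is corrected for m tests, the next for m - 1, and so on.
        adj = 1.0 - (1.0 - p_values[i]) ** (m - rank)
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, adj)
        adjusted[i] = min(1.0, running_max)
    return adjusted
```

For three pairwise comparisons, the smallest raw p value is adjusted as 1 − (1 − p)³, the next as 1 − (1 − p)², and the largest is left unchanged unless monotonicity raises it.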

Figure 2. Illustration of the calculation of dependent measures. Fit lines were estimated for each task block by plotting reaction time (RT) against target trial number. The effect of each break length was then tested by measuring (a) the change in slope from the prebreak to postbreak block and (b) the percentage change in the predicted RT from pre- to postbreak. For example, to measure the effect of Break 1 on predicted RT, we would calculate the percentage decrease from Y(1,2) to Y(2,1).
Results
Overall Performance
We assessed the effect of TOT overall by subjecting hit rate, false alarms, and median RT in the three task blocks to one-way repeated-measures ANOVAs. Means and standard deviations of these variables are reported in Table 1. As expected, we observed significant declines in hit rate, F(1.69, 118.35) = 19.95, p < 10⁻⁷, partial η² = .22, and increases in median RTs, F(1.90, 133.19) = 6.23, p = .003, partial η² = .08, over the course of the experiment. Corrected tests revealed significant differences for all pairwise comparisons for hit rate (first vs. second, 95% CI [1.51, 6.78], p = .001; first vs. third, 95% CI [3.83, 10.40], p = .000004; second vs. third, 95% CI [0.68, 5.26], p = .007) and in two of the pairwise comparisons for median reaction time (first vs. second, 95% CI [–42.8, –1.9], p = .03; first vs. third, 95% CI [–48.8, –5.4], p = .01; second vs. third, 95% CI [–22.5, 13.1], p = .89). False alarms did not change significantly over the course of the run, F(1.27, 88.58) = 2.24, p = .11.
Table 1. Means of Behavioral Variables
Note. Standard deviations shown in parentheses. Hit rate was measured as the proportion of correct detections. False alarms are reported as absolute numbers.
We next performed paired t tests of the individually fitted slope estimates for all pairwise combinations of blocks. In line with previous findings (Giambra & Quilter, 1987), we observed that the slope of predicted reaction times was significantly steeper in the first than in the second and third task blocks: first versus second, t(70) = 2.46, p = .02, Cohen’s d = 0.42; first versus third, t(70) = 3.49, p = .001, Cohen’s d = 0.60, indicating that TOT effects were more pronounced in the early portion of the work period.
For all subsequent analyses, we focus on speed as our primary measure because it is affected by sleep deprivation and fatigue, because it is a robust predictor of real-world lapses and accidents (Dorrian, Rogers, & Dinges, 2005), and for consistency with our previous experiment (Lim et al., 2013).
Main Effect of Break Periods
We tested the main effect of breaks on performance (regardless of length) by comparing the predicted RT at the end of each task block with the y-intercept of the slope after the subsequent break using a paired t test. This effect was significant for both the first, t(70) = 5.15, p = .0003, Cohen’s d = 0.35, and second break period, t(70) = 2.92, p = .005, Cohen’s d = 0.25, indicating that taking a break had a restorative effect on performance in the short term (Break 1, reduction from 500.9 ms to 448.3 ms; Break 2, reduction from 497.9 ms to 458.6 ms).
Effect of Break Length
Mixed-model analysis revealed a significant effect of break length on the change in predicted RT, F(2, 56.17) = 5.68, p = .006, R² = .34, but no effect of break number, F(1, 56.95) = 1.18, p = .28, and no Break Number × Length interaction, F(2, 64.87) = 1.04, p = .51. Corrected post hoc tests showed that there was a significant difference in this variable between the 1- and 10-min breaks (95% CI [.001, .177], p = .046), and between the 5- and 10-min breaks (95% CI [.012, .169], p = .02), but not between the 1- and 5-min breaks. Analysis of the change in slope between blocks across breaks of differing lengths revealed a significant effect of break length, F(2, 96.052) = 3.30, p = .04, R² = .26; a significant effect of break number, F(1, 133.58) = 16.29, p = .00009, R² = .69; and no Break Number × Length interaction, F(2, 96.052) = 0.38, p = .68. Corrected post hoc tests showed differences between the 1- and 10-min break only (95% CI [–2.62, –0.039], p = .041). Our findings indicate that longer breaks tended to result in relatively steeper postbreak slopes. Changes in accuracy (hit rate) did not differ due to break length. These results are summarized in Table 2.
Table 2. Results of Mixed-Model ANOVAs
Note. Means of each variable are reported with standard deviations in parentheses; all means are normalized.
Change in predicted RT refers to the percentage decrease from the predicted reaction time at the end of the prebreak period to the y-intercept at the beginning of the postbreak period.
p < .05.
Based on these observations, we hypothesized that change in predicted RT might be inversely correlated with the postbreak slope; that is, the greater the immediate rebound effect owing to a break, the steeper the decline would be in the subsequent task block. To test this hypothesis, we calculated separate correlations between these variables over the first and second breaks (to remove dependencies due to the effect of subject). Both these correlations were significant (Figure 3; Break 1, r = −.58, p < 10⁻⁷; Break 2, r = −.51, p < 10⁻⁵). As the slope in each task period was also correlated with the intercept at the beginning of each task block (and not just the change from the previous block), we also performed a partial correlation controlling for this confounding variable. Our result was still significant over both the first (r = −.47, p = .00003) and the second (r = −.39, p = .0009) rest periods.
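The partial correlation described above can be computed from three pairwise Pearson correlations. The sketch below uses the standard first-order formula; it is an illustration, not the authors' analysis script, and the variable roles are assumptions (x = change in predicted RT, y = postbreak slope, z = postbreak intercept).

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

When z is uncorrelated with both x and y, the partial correlation reduces to the ordinary correlation. A drop in magnitude once z is controlled, as with the shift from −.58 to −.47 for Break 1, indicates that part of the raw association is shared with the intercept.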

Figure 3. The immediate restorative effect of a break was significantly correlated with the rate of time-on-task decline (slope) in the block following that break. This relationship held for both the (a) first (r = −.58, p < 10⁻⁷) and (b) second (r = −.51, p < 10⁻⁵) rest periods.
Finally, for each break length, we used our fit parameters (slope and intercept at the beginning of the task block) to extrapolate the average number of minutes postbreak at which a participant declined to the point of prebreak performance. We allowed this value to be negative in the case where participants started the task period with worse performance than when they entered the break. These averages (and standard deviations) were 5.71 (20.34), 7.23 (14.47), and 5.22 (14.54) minutes, respectively, for the 1-, 5- and 10-min breaks. One-way ANOVA revealed no significant difference between these three crossover points, F(2, 139) = .195, ns.
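The extrapolation can be written out explicitly. The sketch below solves for the target-trial number at which the postbreak fit line returns to the predicted prebreak RT and converts it to minutes. The conversion assumes targets arrive at an average of 5 per minute (20 events per minute × 25% targets); the function name and the example values are hypothetical and are not data from the study.

```python
def crossover_minutes(pre_end_rt, post_intercept, post_slope, targets_per_min=5.0):
    """Minutes postbreak at which predicted RT returns to the prebreak level.

    Solves post_intercept + post_slope * x = pre_end_rt for the target-trial
    number x, then converts targets to minutes. The result is negative when
    the participant resumes the task slower than before the break.
    """
    if post_slope == 0:
        raise ValueError("flat postbreak slope: no crossover point")
    crossover_target = (pre_end_rt - post_intercept) / post_slope
    return crossover_target / targets_per_min


# Hypothetical participant: 550 ms before the break, 440 ms right after it,
# slowing by 2 ms per target in the postbreak block.
minutes = crossover_minutes(550.0, 440.0, 2.0)  # 11.0
```

Because the rebound appears in the numerator and the postbreak slope in the denominator, a larger rebound paired with a proportionally steeper slope leaves the crossover time unchanged, which is the pattern the crossover analysis reports.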
Subjective Data
We analyzed subjective data from the SSSQ by first calculating aggregate scores for its three subfactors (task engagement, distress, and worry) and performing paired t tests on the pretest and posttest ratings. Performing the auditory oddball task resulted in significant increases in distress, t(70) = −3.16, p = .002, Cohen’s d = 0.95, and decreases in engagement, t(70) = 8.55, p < 10⁻¹², Cohen’s d = 0.50, but no significant change in worry, t(70) = −.11, p = .91, Cohen’s d = 0.01. We computed a pre-post change score for each of these variables to test whether this was affected by the total length of break time (6 [i.e., 1 + 5], 11 [1 + 10], and 15 [5 + 10] min) during the work period. There was no significant difference in the change of any of the three scales between these conditions: engagement, F(2, 68) = 1.67; distress, F(2, 68) = 0.13; and worry, F(2, 68) = 0.05; all ns.
Discussion
The Length of a Task Break Affects Both Overall Recovery and Effort Allocation Strategies
We demonstrate in this study that manipulating the length of break opportunities leads to significantly different magnitudes of rebound in RT, with the longest break leading to the greatest immediate improvement. However, contrary to our hypothesis, we also observed that there was an inverse relationship between this immediate improvement in RTs and the rate of subsequent decline in performance. This relationship held up even after individual differences in initial RTs were controlled for. Furthermore, the amount of time that participants took to decline to their prebreak performance levels was not significantly different following breaks of different lengths. This suggests that altering break length could relate more strongly to the way in which participants allocate their mental resources in the postbreak period, rather than leading to different amounts of recovery from fatigue per se.
Break length has generally been associated with performance improvements in paradigms that mimic real-world tasks, although we also note that the studies discussed in this section did not all use response times or RTs as their primary measure. For instance, Henning, Sauter, Salvendy, and Krieg (1989) reported that the chosen length of a self-regulated “microbreak” was predicted by fatigue and boredom but also predicted lower error rates in subsequent performance in a group of experienced data-entry operators. However, the authors also concluded that these breaks were not fully utilized by the operators for complete recovery. Boucsein and Thum (1997) found that longer breaks are more effective in fatigue reduction toward the end rather than the beginning of a work shift, when shorter breaks were sufficient to promote recovery. Other studies have also reported no effect of break length. For example, Lisper and Eriksson (1980) found no difference between a 15- and a 60-min rest break during an all-day driving task, although consuming food during the break did have a positive effect.
Laboratory experiments have also provided evidence that break length may have a significant effect on recovery. An early study by McCormack (1958) found better performance (mean response time) following a longer break at the tail end of a vigilance task. Bergum and Lehr (1963) also demonstrated that two 10-min rest periods in the midst of a 90-min vigil had a significant beneficial effect on the percentage of correct detections. Helton and Russell (2012; in response to a paper by Ariga and Lleras, 2011) showed that a brief task interruption on its own without a rest opportunity was insufficient to mitigate the vigilance decrement. More recently, Ross et al. (2014) found further evidence that a 1-min rest break, but not a brief goal switch, was effective in reversing the decrement caused by TOT but only when administered early in the task. In sum, it appears that task interruptions are beneficial but only when they are of sufficient quality and duration and possibly only when given at an appropriate time.
Our own findings extend the literature by providing a more nuanced description of the effect of different break lengths on both immediate performance and later decrements. In line with the studies previously described, RTs improved the most after the longest break and the least after the shortest. However, the reverse pattern was true when we compared the postbreak slopes; performance decline was steepest following the longest break. By about 5 to 7 min into each task block, participants were, on average, performing at the same level as before their previous break, regardless of the length of that break. These data are compatible with an observation made in a recent study by our group: that providing a short midtask break does not result in significantly different performance by the end of the task when compared to a no-break control group (Lim et al., 2013).
It is useful to consider these data within the framework of the current debate in the literature on fatigue between resource theorists (Warm et al., 2008), and those who argue that sensations of fatigue (and accompanying performance declines) owe more to underload (Manly, Robertson, Galloway, & Hawkins, 1999; Smallwood & Schooler, 2006). On aggregate, and in agreement with Ross et al. (2014), we found that providing a break, regardless of length, led to a significant improvement in performance. Furthermore, we observed that longer breaks facilitate greater immediate recovery. Both of these two findings are in line with the general predictions of resource theory.
However, a strictly resource-based explanation does not seem sufficient to explain the tendency for the TOT slope to be steeper following a longer task break. Per this theory, we would not have expected to find evidence against the null hypothesis in our current experiment, or if we did, we would have expected the slope of decline to be steeper following a shorter break. The inverse relationship seen in this data set suggests a shift in allocation strategies and not merely the replenishment of a depleted resource. In other words, when given a longer break, participants either consciously or implicitly biased themselves toward faster responding at the beginning of the subsequent work period. However, this level of performance was relatively less stable and more vulnerable to degradation in the long term, indicating that the amount of actual recovery may not have differed among the break conditions.
We thus speculate that, in addition to allowing for the replenishment of some depleted resources, the different break lengths in this study led participants to pursue different effort-allocation strategies on resumption of the task. This mechanism may be governed by an executive control loop similar to the one proposed in Hockey’s (1997) compensatory control model. In this model, routine, lower-level activity is monitored by higher-control mechanisms that act when the first system is no longer able to self-regulate, for example, under conditions of high stress or excessive demand. Higher control was likely to be required in our task as TOT increased. However, we posit that participants returning from a longer break tended to overestimate the amount of recovery they had received, thus leading them to disengage these processes, resume work at an unsustainable level of performance, and experience a steeper level of decline.
Human factors experts have long been interested in how to optimize work-rest schedules (Bhatia & Murrell, 1969; Janaro & Bechtold, 1985; Tucker et al., 2003). However, the complexity of real-world tasks can mask more subtle effects in performance, such as those seen in the current experiment. Our data indicate that there may be two parameters to consider when deciding how long a break to administer: the amount of actual recovery the break will provide and how the break may affect a participant’s postbreak effort allocation. Specifically, a break that is too long may be counterproductive in tasks where consistent performance above a certain threshold is needed (e.g., if there is a discrete level of performance below which accidents are much more likely to occur). Mathematical modeling using cognitive architectures may be an effective way to use this information in the design of work-rest schedules. Alternatively, it may be useful to provide feedback to workers to prevent overexpenditure of effort in the intervals directly following a return to work.
Total Break Length Does Not Affect Subjective Changes Associated With Fatigue
As a whole, performing the auditory oddball task resulted in increases in subjective distress and declines in engagement, as measured by the SSSQ, an abbreviated version of the Dundee Stress State Questionnaire (DSSQ; Matthews et al., 2002). These data are largely in line with other studies reporting decreases in task engagement, increases in distress, and decreases in worry following performance on a vigilance task (Matthews, Szalma, Panganiban, Neubauer, & Warm, 2013), although worry did not change significantly in our sample. These changes strongly suggest increases in fatigue but may also occur due to factors such as stress and boredom.
Unlike previous studies on factory work (Dababneh et al., 2001) and driving performance (Drory, 1985), longer total break length in this experiment was not associated with differential decreases in distress or task engagement. Task differences may account for these differential findings. Moreover, both the task and rest periods in this experiment were relatively short compared with many other paradigms investigating fatigue. It is thus possible that lengthier experiments may be needed to uncover these differences.
Limitations
There are a few limitations in interpreting the current study. First, the task that we administered required 225 motor responses over the course of 45 min, which may have induced motor fatigue in addition to cognitive fatigue. Hence, declines in performance may not be entirely attributable to changes in psychological processing.
Second, accuracy was relatively low in this study (in the range of 70% to 80%), which may raise concerns about the validity of analyzing RT. We believe that the RT measure was still meaningful and interpretable as RTs were comparable to our previous published experiment (Lim et al., 2013; Experiment 1, M = 492.36, SD = 93.35; Experiment 2, M = 464.61, SD = 131.62, t[101] = 1.076, p = .29), in which accuracy rates were much higher (about 90% to 91%).
Finally, we did not allow participants to take an unconstrained break (e.g., leaving the experiment room, surfing the Internet) in order to maintain control over their activity. This may partially account for the discrepancy with earlier studies that have shown no performance decrements when participants are allowed to rest outside of the experimental setting (Bergum & Lehr, 1963).
Conclusion
In summary, this experiment contributes to the literature on rest and mental fatigue by highlighting a subtle trade-off that is unmasked by variations in break length. These results are in line with a resource model of fatigue that is governed by higher-control mechanisms. Future work in this area could explore whether this trade-off is inevitable or if other interventions can be used to stabilize performance after longer periods of rest. Finally, our refined model may have implications for improving the schedule of real-world work-rest cycles in jobs where performance is particularly prone to TOT degradation.
Key Points
Seventy-one participants performed a 1-hr oddball task with two rest periods of varying lengths.
Longer rests were associated with greater immediate improvement in reaction time (RT), and greater immediate improvement correlated with steeper subsequent RT slope.
We posit that the length of rest period modulates effort-allocation strategies in addition to providing resource recovery.
This finding has implications for refining work-rest schedules in industries where time-on-task degradation is an important concern.
Acknowledgements
This study was funded by NEUROEN Grant R3940000059232. We thank Frances-Catherine Quevenco and Wong Kian Foong for their help in data collection.
Julian Lim is an assistant professor in the Neurosciences and Behavioral Disorders Program at the Duke-NUS Graduate Medical School. He obtained a PhD in psychology from the University of Pennsylvania in 2010 and was a research scientist in the Singapore Institute for Neurotechnology from 2010 to 2014.
Kenneth Kwok is a principal research scientist in the Singapore Institute for Neurotechnology and program director of the Combat Protection and Performance Program at the Defence Science Organization National Laboratories in Singapore. He obtained a PhD in psychology from Carnegie Mellon University in 2003.
