Abstract
People often attribute poor performance to having bad days. Given that cognitive aging leads to lower average levels of performance and more moment-to-moment variability, one might expect that older adults should show greater day-to-day variability and be more likely to experience bad days than younger adults. However, both researchers and ordinary people typically sample only one performance per day for a given activity. Hence, the empirical basis for concluding that cognitive performance does substantially vary from day to day is inadequate. On the basis of data from 101 younger and 103 older adults who completed nine cognitive tasks in 100 daily sessions, we show that the contributions of systematic day-to-day variability to overall observed variability are reliable but small. Thus, the impression of good versus bad days is largely due to performance fluctuations at faster timescales. Despite having lower average levels of performance, older adults showed more consistent levels of performance across days.
People generally agree that they experience good days and bad days in dealing with the cognitive demands of everyday life. But the empirical basis on which people conclude that cognitive performance varies from day to day is actually slim. Important cognitive tasks, such as exams, may repeat over days but are often posed only once on any given day. The same applies to more mundane endeavors. For example, trying to retrieve where one parked the car the day before is a task that many people experience daily. Occasional failures on this task can easily be attributed to having a bad day. This conclusion may actually be wrong because it is based on a sample of just one observation on a single day. Instead of having experienced a bad day in the sense that memory performance was systematically lower than usual throughout the whole day, a person may have suffered from a “bad moment” at the time when he or she tried to retrieve the location of the car and then erroneously generalized the attributes of that moment to the entire day.
Existing data show that moment-to-moment fluctuations in cognitive performance are sizable (Rabbitt, Osman, Moore, & Stollery, 2001). Before drawing any conclusions about the contribution of true day-to-day fluctuations to cognitive performance, one needs to demonstrate that such fluctuations are present after performance fluctuations at faster timescales have been taken into account. Provided that true day-to-day fluctuations can be demonstrated, it would be interesting to know how large they are and whether they differ between early and late adulthood. Normal cognitive aging is associated with declines in many cognitive abilities (Li et al., 2004; Nilsson et al., 2004; Schaie, 2005). Hence, the majority of older adults perform at lower average levels than younger adults do. Experiencing a bad day may hurt the everyday cognitive functioning of an older adult more than it does that of a younger adult. Thus, it is worthwhile to know whether normal aging is associated with an increase in the day-to-day variability of cognitive performance.
Empirical research on these important issues is astonishingly scarce. The generalizability of existing studies on daily fluctuations in cognitive performance is limited by the use of small numbers of participants, occasions, or tasks (Allaire & Marsiske, 2005; Hertzog, Dixon, & Hultsch, 1992; Li, Aggen, Nesselroade, & Baltes, 2001; Nesselroade & Salthouse, 2004; Rabbitt et al., 2001; Sliwinski, Smyth, Hofer, & Stawski, 2006). In the short term, cognitive performance fluctuates considerably in younger adults, and more so in older adults (Anstey, Dear, Christensen, & Jorm, 2005; Deary & Der, 2005; Fozard, Vercruyssen, Reynolds, Hancock, & Quilter, 1994; Li et al., 2004; MacDonald, Hultsch, & Dixon, 2003). Most of these findings pertain to fluctuations in reaction times from trial to trial within a block of trials or from block to block within a testing session.
The mechanisms generating intraindividual variability in cognitive performance at higher temporal resolution may include neuromodulatory influences (Li, Lindenberger, & Sikström, 2001; MacDonald, Nyberg, & Bäckman, 2006), blood-oxygen-level-dependent signal variability (with inverse relations to decision-time variability; Garrett, Samanez-Larkin, MacDonald, Lindenberger, & McIntosh, 2013), inherent noise in decision processes (Schmiedek, Oberauer, Wilhelm, Süß, & Wittmann, 2007), lapses of attention (Weissman, Roberts, Visscher, & Woldorff, 2006), or disturbance by task-irrelevant cognitive processes competing for cognitive resources (Brose, Schmiedek, Lövdén, & Lindenberger, 2012; Riediger, Wrzus, Schmiedek, Wagner, & Lindenberger, 2011). It is unknown whether these mechanisms also contribute to variability in cognitive performance from day to day.
Because a person’s performance level on a particular day usually is measured by his or her average performance on a certain number of trials or blocks of trials, observed day-to-day variability is a combination of a systematic day-to-day variance component (i.e., variance of mean performance across days because of variation in the statistically expected level of performance for that day) and lower-level variance components (i.e., variance of mean performance across days because of random draws of trials or blocks from a distribution of trial-to-trial or block-to-block variability). Lower-level variance components are reduced, but not eliminated, by means of aggregation (cf. Rabbitt et al., 2001). The relation of observed and true day-to-day variance is given by
$$\sigma^2_{\text{observed day-to-day}} = \sigma^2_{\text{true day-to-day}} + \frac{\sigma^2_{\text{block-to-block}}}{n_{\text{blocks}}},$$
with $n_{\text{blocks}}$ being the total number of blocks per day. This implies that the same amount of observed day-to-day variability (i.e., variations in daily performance averaged across blocks) can be due to combinations of, for example, a small day-to-day component and a correspondingly larger block-to-block component or a large day-to-day component and a correspondingly smaller block-to-block component. In the latter case, “good” or “bad” days would be characterized by performance that is systematically high or low on all blocks of a particular day.
To separate these variance components, one needs to assess performance repeatedly across many daily occasions and across at least two blocks within each daily occasion. The estimated block-to-block variance component contains trial-to-trial variability in performance as well as systematic fluctuations across blocks. Given that trial-to-trial variability appears to increase with advancing age in adulthood, one may therefore expect that estimates of block-to-block variability should increase as well. However, to the degree that block-to-block variability is influenced by factors that differ from those influencing trial-to-trial variability, things may look different. For example, if older adults’ performance is characterized by stabler levels of task-related motivation (Brose, Schmiedek, Lövdén, Molenaar, & Lindenberger, 2010), total block-to-block variability may actually be smaller for older adults despite their higher trial-to-trial variability. Analogous reasoning applies to the relation between block-to-block variability and day-to-day variability. In sum, there is a need to disentangle hierarchically nested timescales of variability in cognitive performance because antecedents may differ by temporal resolution.
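The logic of this separation can be illustrated with a minimal sketch. It assumes a balanced design for a single person, with no trends and no stimulus-difficulty effects, and it uses simple method-of-moments estimators rather than the multilevel models applied in the study. The function names and the simulated standard deviations are hypothetical.

```python
import random
import statistics


def simulate_person(n_days, n_blocks, sd_day, sd_block, seed=0):
    """Simulate block scores as a day effect plus block-level noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_days):
        day_effect = rng.gauss(0.0, sd_day)
        data.append([day_effect + rng.gauss(0.0, sd_block) for _ in range(n_blocks)])
    return data


def decompose(data):
    """Method-of-moments variance components for a balanced days x blocks design."""
    n_blocks = len(data[0])  # at least two blocks per day are required
    day_means = [statistics.fmean(day) for day in data]
    # Variance of daily averages: the observed day-to-day variance.
    observed_var = statistics.variance(day_means)
    # Pooled within-day variance estimates the block-to-block component.
    block_var = statistics.fmean([statistics.variance(day) for day in data])
    # Subtract the share of block-level noise that survives averaging.
    day_var = max(0.0, observed_var - block_var / n_blocks)
    return {"observed": observed_var, "block": block_var, "day": day_var}


if __name__ == "__main__":
    estimates = decompose(simulate_person(2000, 2, sd_day=1.0, sd_block=2.0, seed=1))
    print(estimates)
```

With only one block per day, the within-day variance cannot be computed, so the two components cannot be told apart. This is why at least two blocks per daily occasion are needed.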
Cognitive performance is composed of a number of separable cognitive abilities (Carroll, 1993). It is desirable to measure performance on each daily occasion with tasks representing different abilities and content domains to explore day-to-day variability in a comprehensive manner. These requirements were fulfilled by the COGITO Study (Schmiedek, Lövdén, & Lindenberger, 2010), in which 101 younger (aged 20–31) and 103 older (aged 65–80) participants worked on a battery of computerized cognitive tasks measuring the abilities of working memory, episodic memory, and perceptual speed on an average of 100 daily occasions. Here, we report results from nine tasks: one verbal, one numerical, and one figural-spatial for each ability. On each day, participants performed at least two blocks of each task. This design enabled us to separate day-to-day from within-day (i.e., block-to-block) components of intraindividual variability.
Method
Participants, procedure, and tasks
During the daily assessment phase of the COGITO Study, 101 younger adults (51.5% women, 48.5% men; age range: 20–31 years, M = 25.6, SD = 2.7) and 103 older adults (49.5% women, 50.5% men; age range: 65–80 years, M = 71.3, SD = 4.1) completed an average of 101 practice sessions. Both the younger sample and the older sample were quite representative regarding general cognitive functioning, as indicated by comparisons of performance on a digit-symbol task with data from a population-based study and a meta-analysis (Schmiedek, Lövdén, et al., 2010), and showed positive selectivity of comparable effect size regarding self-rated health (Wolff et al., 2012). The attrition rate for participants who had entered the longitudinal practice phase of the COGITO Study was low (i.e., 15 out of 219 participants; for details on rates of and reasons for dropouts in the different study phases, see Schmiedek, Bauer, Lövdén, Brose, & Lindenberger, 2010).
Participants practiced individually in lab rooms containing up to six computer testing stations. Before and after this longitudinal phase, participants completed pretests and posttests during 10 sessions that consisted of 2 to 2.5 hr of comprehensive cognitive test batteries and self-report questionnaires. On average, the time that elapsed between the pretest and the posttest was 197 days for the younger group and 188 days for the older group.
Participants were paid between 1,450 and 1,950 euros, depending on the number and temporal density of completed sessions. The ethical review board of the Max Planck Institute for Human Development approved the study.
In each practice session, participants practiced 12 tasks drawn from a facet structure cross-classifying cognitive abilities (perceptual speed, episodic memory, and working memory) and content material (verbal, numerical, and figural-spatial), with two to eight blocks of trials for each task (for information on all practiced tasks, see Schmiedek, Lövdén, et al., 2010). For the episodic and working memory tasks, presentation time (PT) was adjusted for individual participants on the basis of their pretest performance. This procedure led to comparable performance levels for the working memory tasks, whereas performance on the episodic memory tasks was still somewhat lower for the older adults (see Fig. S1 in the Supplemental Material available online).
Perceptual speed: comparison tasks
In the numerical, verbal, and figural-spatial versions of the comparison task, two strings of five digits each, two strings of five letters each, or two colored three-dimensional objects consisting of several connected parts (fribbles), respectively, appeared on the left and right sides of the screen, and participants had to decide as quickly as possible whether both stimuli were exactly the same or different. If they were different, the strings differed only by one digit or letter and the objects differed only by one part. Number strings were randomly assembled using digits 1 to 9. Letters were lowercase and randomly assembled from all consonants in the alphabet, which ensured that they could not form real words. Each session included two blocks of 40 items, with equal numbers of pairs of same and different stimuli. Images of fribbles used in this task were courtesy of Michael J. Tarr of Brown University (http://www.tarrlab.org/).
All three comparison tasks were scored by dividing the number of correct responses by the total response time (in seconds) and multiplying this quotient by 60 (i.e., creating a score of correct responses per minute). To reduce the influence of outliers, scores above 100 were set to missing (0.5% of the observed data).
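As a small worked illustration of this scoring rule, the sketch below converts a block's number of correct responses and total response time into correct responses per minute and treats scores above 100 as missing. The function name and the use of `None` for missing values are illustrative choices, not part of the study's software.

```python
def comparison_score(n_correct, total_rt_seconds, ceiling=100.0):
    """Return correct responses per minute, or None if the score exceeds the ceiling."""
    if total_rt_seconds <= 0:
        raise ValueError("total response time must be positive")
    score = n_correct / total_rt_seconds * 60.0
    # Scores above the ceiling are treated as outliers and set to missing.
    return None if score > ceiling else score


if __name__ == "__main__":
    # 36 correct responses in 90 s of total response time.
    print(comparison_score(36, 90.0))
    # 40 correct responses in 20 s would exceed 100 per minute.
    print(comparison_score(40, 20.0))
```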
Episodic memory tasks
The verbal, numerical, and figural-spatial versions of the episodic memory task focused on word lists, number-noun pairs, and object position, respectively.
Verbal episodic memory: word lists
Lists of 36 nouns were presented sequentially with PTs of 1,000, 2,000, or 4,000 ms. The interstimulus interval (ISI) was 1,000 ms. Word lists were assembled in such a way that words’ frequency, length, emotional valence, and imageability were balanced across lists. After a list was presented, participants had to recall words in the correct order by entering the first three letters of each word using the keyboard. Two blocks were included in each daily session. The performance measure was the percentage of correctly recalled words multiplied by a score ranging from 0 to 1, which represented the correctness of the order (based on a linearly rescaled tau rank correlation). The resulting scores were logit transformed before being entered in the analyses.
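One way to read this scoring rule is sketched below. Several details are assumptions made only for illustration: Kendall's tau-a is computed over the recalled words, tau is rescaled linearly to the 0 to 1 range as (tau + 1) / 2, and proportions are clipped away from 0 and 1 before the logit transform. The text does not specify these choices, and all function names are hypothetical.

```python
import math
from itertools import combinations


def kendall_tau(presented_positions):
    """Kendall's tau-a between recall order and presentation order.

    presented_positions[i] is the list position of the i-th recalled word.
    """
    pairs = list(combinations(presented_positions, 2))
    if not pairs:
        return 0.0  # order is undefined for fewer than two items (assumption)
    concordant = sum(1 for a, b in pairs if a < b)
    discordant = sum(1 for a, b in pairs if a > b)
    return (concordant - discordant) / len(pairs)


def logit(p, eps=0.01):
    """Log-odds of a proportion, clipped away from 0 and 1 (eps is an assumption)."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def word_list_score(presented_positions, list_length=36):
    """Proportion recalled times an order score in [0, 1], then logit transformed."""
    proportion = len(presented_positions) / list_length
    order = (kendall_tau(presented_positions) + 1.0) / 2.0  # assumed rescaling
    return logit(proportion * order)


if __name__ == "__main__":
    # Half of the 36 words recalled, all in the presented order.
    print(word_list_score(list(range(1, 19))))
```

Under these assumptions, recalling every word in exactly reversed order yields an order score of 0, so the product is clipped to the lower bound before the logit is taken.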
Numerical episodic memory: number-noun pairs
Lists of 12 paired two-digit numbers and plural nouns were presented sequentially with PTs of 1,000, 2,000, or 4,000 ms. The ISI was 1,000 ms. After a list was presented, participants had to enter all numbers on the basis of noun prompts presented in random order. Two blocks were included in each daily session. The performance measure used in the analyses was the logit-transformed percentage of correctly recalled numbers.
Figural-spatial episodic memory: object position
Sequences of 12 color photographs of real-world objects were displayed at different locations in a 6 × 6 grid with PTs of 1,000, 2,000, or 4,000 ms. The ISI was 1,000 ms. After a sequence was presented, the photographs were presented again at the bottom of the screen, and participants had to move the objects in the correct order to their correct locations by clicking on objects and locations with the computer mouse. Two blocks were included in each daily session. The performance measure was the percentage of items placed in the correct locations multiplied by a score ranging from 0 to 1, which represented the correctness of the order (based on a linearly rescaled tau rank correlation). The resulting scores were logit transformed before being entered in the analyses.
Working memory tasks
The verbal, numerical, and spatial versions of the working memory task were an alpha-span task, a memory-updating task, and a 3-back dot-positioning task, respectively.
Verbal working memory: alpha span
Ten uppercase consonants were presented sequentially, with a number located below each letter. For each letter, participants had to decide as quickly as possible whether the number corresponded to the alphabetic position of the letter within the set of letters presented up to this step. Five of the 10 items (targets) had correct position numbers. If position numbers were incorrect (nontargets), they differed from the correct position by 1. PTs were 750, 1,500, or 3,000 ms. The ISI was 500 ms. In each daily session, eight blocks were included. The performance measure used in the analyses was based on the percentage of correct responses. Scores were averaged across odd and even blocks and logit transformed.
Numerical working memory: memory updating
Participants had to memorize and update four one-digit numbers. Four single digits (from 0 to 9) were presented simultaneously for 4,000 ms, one in each of four horizontally placed cells. After an ISI of 500 ms, a sequence of eight updating operations was presented in a second row of four cells below the first one. The updating operations were subtractions and additions from −8 to +8. The updating operations had to be applied to the digits memorized from the corresponding cells above, and the new results then also had to be memorized. Each updating operation was applied to a cell different from the one a step earlier in the sequence such that no two consecutive updating operations had to be applied to the same cell. PTs were 500, 1,250, or 2,750 ms. The ISI was 250 ms. At the end of each trial, the four end results had to be entered. In each daily session, eight blocks were included. The measure of performance used in the analyses was based on the percentages of correct responses. Scores were averaged across odd and even blocks and logit transformed.
Spatial working memory: 3-back task
A sequence of 39 black dots appeared at varying locations in a 4 × 4 grid. Participants had to determine for each dot whether or not it was in the same position as the dot three steps earlier in the sequence. Dots appeared at random locations with the constraints that (a) 12 items were targets, (b) dots did not appear in the same location on consecutive steps, and (c) exactly 3 items each were 2-, 4-, 5-, or 6-back lures—that is, items that appeared in the same position as the items two, four, five, or six steps earlier. The presentation rate for the dots was individually adjusted by varying ISIs (500, 1,500, or 2,500 ms). PT was fixed at 500 ms. In each daily session, four blocks were included. The measure of performance used in the analyses was based on the percentages of correct responses on Trials 4 through 39. Scores were averaged across odd and even blocks and logit transformed.
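The block-level accuracy described above can be sketched as follows. This is a minimal illustration, not the study's scoring code: it assumes that grid cells are coded as integers and responses as booleans, and it omits the averaging across odd and even blocks and the logit transform. The function name is hypothetical.

```python
def three_back_accuracy(positions, responses, lag=3):
    """Proportion correct on the trials that have a dot `lag` steps back.

    positions: grid cell of each dot; responses: True if "same" was answered.
    Only trials from index `lag` onward count (Trials 4 through 39 for 39 dots).
    """
    scored = range(lag, len(positions))
    correct = sum(
        (positions[i] == positions[i - lag]) == responses[i] for i in scored
    )
    return correct / len(scored)


if __name__ == "__main__":
    # Seven dots: Trials 4 and 6 are targets, Trials 5 and 7 are not.
    positions = [0, 1, 2, 0, 5, 2, 7]
    responses = [False, False, False, True, False, False, False]
    print(three_back_accuracy(positions, responses))
```

In this example the participant misses the target on Trial 6 and answers the other three scored trials correctly.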
Data analysis
To estimate day-to-day and block-to-block variance components and to test for age-group differences therein, we separately fitted multilevel models to each task. These models allowed us to flexibly test for age-group differences in the variance components while controlling for individual differences in longer-term trends as well as for variations in the difficulty of the different stimuli used in each daily session. The resulting variance decomposition showed separately for each task how the variability that remains after accounting for trends and task-difficulty variations can be partitioned into a day-to-day component and a block-to-block component. The day-to-day component captures systematic variations of performance across days, indicating the degree to which observed (i.e., total) day-to-day variability is due to performance being systematically higher or lower across blocks on different days (see Multilevel Modeling of Variance Components in the Supplemental Material for a detailed description of the multilevel model).
Results
Results are summarized in Figure 1 and Table 1, which show the estimated variance components of block-to-block and day-to-day variability. The main finding of this study was twofold. First, the average contribution of true day-to-day variability to observed day-to-day variability was highly reliable but comparatively small for most of the tasks. Even after we controlled for variations in task difficulty, a large share of observed day-to-day variability was accounted for by variability at the block-to-block level, especially for the working memory and episodic memory tasks. This means that seemingly good and bad days are to a considerable degree attributable to performance fluctuations at much faster timescales. It follows that single blocks of trials on cognitive tasks are poor indicators of good and bad days.

Fig. 1. Younger and older participants’ estimated day-to-day and block-to-block variance components for the (a) working memory, (b) episodic memory, and (c) perceptual-speed tasks. The total size of the bars corresponds to the variance of observed day-to-day variability (i.e., the variance of average performance across days). This variance is decomposed into a variance component of systematic day-to-day fluctuations and the contribution of block-to-block variability to observed day-to-day variability.
Table 1. Variance Components and Age-Group Differences
Note: Data are variance components. Standard errors are shown in parentheses. Chi-square tests (critical value = 3.84 for p < .05) are based on likelihood ratios comparing unconstrained models (i.e., parameters freely estimated) with constrained models (i.e., parameters constrained to be equal across age groups).
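The likelihood-ratio logic in this note can be illustrated with a short sketch. The critical value of 3.84 corresponds to a chi-square distribution with 1 degree of freedom, so the sketch assumes that each test constrains a single parameter to be equal across age groups. The function name and the log-likelihood values in the usage example are hypothetical and are not taken from the study.

```python
import math


def lr_test_1df(loglik_constrained, loglik_unconstrained):
    """Likelihood-ratio chi-square and p value for a single constrained parameter."""
    chi_square = 2.0 * (loglik_unconstrained - loglik_constrained)
    # For 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(max(chi_square, 0.0) / 2.0))
    return chi_square, p_value


if __name__ == "__main__":
    # Hypothetical log-likelihoods chosen so that the statistic equals 3.84.
    chi_square, p_value = lr_test_1df(-101.92, -100.0)
    print(round(chi_square, 2), round(p_value, 3))
```

A statistic above 3.84 indicates that constraining the variance component to be equal across age groups worsens model fit at the .05 level.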
Second, observed day-to-day variability was smaller for older than for younger adults for all nine tasks, as older adults’ performance was significantly less variable from block to block and from day to day (see Table 1). Day-to-day variance components were reliably different from zero, however, for both age groups and all tasks. Control analyses showed that the age-group differences in variability could not be accounted for by age-group differences in average performance level, time of day of the testing sessions, or the PT conditions to which participants were assigned (see Tables S1–S3 in the Supplemental Material).
Discussion
The results from this study show that day-to-day variability in cognitive performance reliably exists across a wide range of tasks representing broad cognitive abilities. However, day-to-day fluctuations contribute less to observed day-to-day variability in performance than do fluctuations at faster timescales. The small samples of performance that people observe in their lives may lead them to overestimate the effect of good and bad days because people generally lack the computational means, and often the required evidence, to estimate the portion of observed day-to-day variability that is systematic across days. With just one observation per day, sources operating at the level of days cannot be separated from sources operating at timescales with higher temporal resolution. But even if several observations per day are available, it seems unlikely that humans routinely correct observed day-to-day fluctuations for the amount of variability that is operating at higher temporal resolution. For many people, failing three times on a particular day to remember the name of a colleague met at a conference will create the subjective impression of suffering from some general memory deficit on that day, even though short-term fluctuations in memory retrieval success may render this sequence of failures likely to occur by chance alone. More often than not, what seems like a bad day may actually be a series of bad moments.
The finding that older participants showed less fluctuation in performance at both the day-to-day and the block-to-block level can be explained by a number of factors, such as the setting of lower goals regarding performance levels (Shing, Schmiedek, Lövdén, & Lindenberger, 2012), stabler levels of motivation (Brose et al., 2010), lifestyles and circumstances that are characterized by fewer stressful events (Brose, Scheibe, & Schmiedek, 2013) and a lesser need to engage in cognitively demanding self-regulatory efforts (Brose, Schmiedek, Lövdén, & Lindenberger, 2011), and less variability in cognitive-strategy exploration and strategy use (Shing et al., 2012). Higher stability at the block and day levels therefore does not contradict earlier findings of increased reaction-time variability at the trial-to-trial level. The reduced fidelity of older adults’ information-processing systems, which manifests itself in larger reaction-time fluctuations from trial to trial, may not affect variability from day to day, at least not in the context of normal aging. That is, adult age differences in variability differ in direction by timescale. This finding highlights the need for theoretical models of learning and development that articulate variability at different timescales and levels of analysis (Garrett et al., 2013; Nesselroade, 1991; for an example in the domain of human motor performance, see Newell, Mayer-Kress, Hong, & Liu, 2009).
Results from our investigation of day-to-day variability with a large and heterogeneous set of cognitive tasks also indicated that the amount of such variability might differ across tasks or task domains. For example, age differences in day-to-day variability were large for the perceptual-speed tasks. Whereas pacing in the working memory and episodic memory tasks was determined by the individualized PTs, participants were allowed to set their own pace in the three perceptual-speed tasks. This demand characteristic may have exacerbated younger adults’ performance variability on a bad day. Future work needs to arrive at a better understanding of the mechanisms that contribute to variability at different timescales.
One may wonder whether selection on task-relevant aspects of personality profiles, such as conscientiousness, may have contributed to the stability advantage of the older adults. We have some indication that this is unlikely to be the case. After finishing the 100 daily sessions, 85 of the older adults also participated in the German Socio-Economic Panel Study (SOEP; Wagner, Frick, & Schupp, 2007). Comparing the Big Five personality profiles of this subsample to those of the representative SOEP sample, we discovered that the older adults participating in our study had somewhat lower values on items related to conscientiousness (see Fig. S2 in the Supplemental Material).
In summary, despite having lower average levels of performance, older adults maintained stabler day-to-day levels of performance than did younger adults. In many vocational, voluntary, and leisure settings, older adults’ higher degree of consistency from day to day may be an advantageous attribute that positively contributes to their productivity.
Acknowledgements
The authors thank the following people for their important roles in conducting the COGITO Study: Colin Bauer, Annette Brose, Birgit Heim, Katja Müller-Helle, Annette Rentz-Lühning, Werner Scholtysik, Julia Wolff, and a team of highly committed student research assistants.
Declaration of Conflicting Interests
The authors declared that they had no conflicts of interest with respect to their authorship or the publication of this article.
Funding
This work was funded by the Max Planck Society, in part by Grant M.FE.A.BILD0005 from the innovation fund of the Max Planck Society, and by a Sofja Kovalevskaja Award (to M. Lövdén) from the Alexander von Humboldt Foundation, donated by the German Federal Ministry of Education and Research.
