Abstract
Background:
Developing effective interventions to attenuate age-related cognitive decline and to prevent or delay the onset of dementia is a major public health goal. Computerized cognitive training (CCT) has been marketed increasingly to older adults, but its efficacy remains unclear. Working memory (WM), a key determinant of higher order cognitive abilities, is susceptible to age-related decline and a relevant target for CCT in elders.
Objective:
To evaluate the efficacy of CCT focused on WM compared to an active control condition in healthy older adults.
Methods:
Eighty-two cognitively normal adults from two sites (USA and Sweden) were randomly assigned to Cogmed Adaptive or Non-Adaptive (active control) CCT groups. Training was performed in participants’ homes, five days per week over five weeks. Changes in the performance of the Cogmed trained tasks, and in five neuropsychological tests (Trail Making Test Part A and Part B, Digit Symbol, Controlled Oral Word Association Test and Semantic Fluency) were used as outcome measures.
Results:
The groups were comparable at baseline. The Adaptive group showed robust gains in the trained tasks, and there was a time-by-group interaction for the Digit Symbol test, with significant improvement only after Adaptive training. In addition, the magnitude of the intervention effect was similar at both sites.
Conclusion:
Home-based CCT Adaptive WM training appears more effective than Non-Adaptive training in older adults from different cultural backgrounds. We present evidence of improvement in trained tasks and on a demanding untrained task dependent upon WM and processing speed. The benefits over the active control group suggest that the Adaptive CCT gains were linked to providing a continuously challenging level of WM difficulty.
INTRODUCTION
Age is the most important risk factor for developing dementia, a major source of dependence and disability [1–3]. With people living to increasingly advanced ages, the prevalence of dementia is expanding [4]. The risk of developing dementia is one of the most feared aspects of growing old [5], with more than half the people over 65 years old having concerns about their cognitive abilities [6–8]. Models developed by Brookmeyer and colleagues [9] suggest that even small delays in dementia onset could significantly reduce the global burden of dementia. For instance, postponing the onset of Alzheimer’s disease (AD) by one year could result in a decrease of >9 million in the worldwide prevalence of AD [9]. Given the profound personal, social, and economic costs of dementia, there is a need for evidence-based interventions capable of delaying, attenuating or preventing cognitive decline and dementia.
Healthy cognitive functioning is a critical component of well-being, autonomy, and successful aging [10, 11]. Converging lines of evidence indicate that engagement in cognitively stimulating activities throughout the lifespan may reduce the risk of cognitive deterioration and dementia [12–15]. This research supports the cognitive reserve hypothesis, which suggests that certain kinds of life experience, including education, occupational attainment, and cognitively stimulating leisure activities, contribute to the ability to cope with or compensate for age-related cerebral decline, allowing adults to maintain cognitive task performance and daily activities [16]. Interventions that engage individuals in cognitively stimulating activities may help build cognitive reserve enabling them to tolerate greater age- and disease-related brain changes.
Brain plasticity has been shown to persist into old age and may be stimulated by cognitive activities [17, 18]. Some investigators have argued that formal cognitive training is a potential tool for improving mental functioning [19–26]. In fact, recent reports suggest that some forms of cognitive intervention (e.g., speed of processing training) in healthy elders may have long-term effects on everyday functioning [27, 28], and even reduce the risk of developing dementia 10 years after the completion of a training program [29].
In the last decade, there has been a growing interest in investigating the effect of working memory (WM) training on healthy older adults. WM reflects the ability to maintain and manipulate information [30]. It is a central mechanism supporting higher-order cognitive abilities, such as fluid intelligence, language comprehension, problem solving and reasoning [31–35]. WM operations are dependent on a widespread network, including fronto-striatal, parietal, and temporal brain regions [36–40]. There is evidence that WM declines with age [41–45], which can undermine performance of instrumental activities of daily living [46].
The literature on the efficacy of WM training in older adults has been controversial. On the one hand, studies suggest that WM training can improve information processing in older adults [26, 48] allowing them to sustain cognitive functioning and remain active and engaged [47]. For example, there is evidence of cognitive improvements on trained tasks, and near-transfer to tasks not explicitly trained (i.e., that test operations within the same cognitive domain) [38, 48–54]. On the other hand, far-transfer effects (i.e., that reflect operations in other cognitive domains) have been challenging to demonstrate, and results have not been uniform across studies [48, 55–59]. Moreover, there is limited evidence of transfer to meaningful everyday life activities [28, 60]. Inconsistencies in the literature may be due to large differences in type, intensity, and duration of the training programs, as well as variation in the methodology, outcome measures used, and the characteristics of the control groups (e.g., active versus passive versus no-contact) [52].
In recent years, there has been an expanding interest specifically in investigating computerized cognitive training (CCT). CCT involves structured practice on standardized and cognitively challenging tasks and has advantages over traditional methods, including visually appealing interfaces, efficient and scalable delivery, and the potential to constantly adapt training content and difficulty to individual performance [50, 59]. In this context, the CCT industry has grown enormously, with sales approaching $1.3 billion in 2013, and projections of $3.38 billion by 2020 [61, 62]. These programs have been increasingly marketed to older adults, often with broad claims about cognitive and functional improvement allegedly based on strong evidence that frequently does not exist [51, 63]. In 2014, two groups of scientists published open letters on the efficacy of brain-training interventions, with conflicting perspectives. The first letter claimed that brain training games do not provide a scientifically grounded way to improve cognitive functioning or reverse cognitive decline, and highlighted the need for more research by investigators with no conflicting financial interests who will conduct rigorously designed studies that include a control group treated exactly the same as the trained group, except for the specific training [64] (available on http://longevity3.stanford.edu/blog/2014/10/15/the-consensus-on-the-brain-training-industry-from-the-scientific-community-2/). The second letter, published months later [65] (available on www.cognitivetrainingdata.org/the-controversy-does-brain-training-work/response-letter/), argued that a substantial and growing body of evidence shows that certain cognitive-training regimens can significantly improve cognitive function, and generalize to everyday activities.
In their review article from the same year, Lampit and colleagues [59] suggest that, in general, CCT for older adults is ineffective under the following conditions: unsupervised, at-home training, frequency of more than three times per week, or sessions of less than 30 minutes. However, what constitutes the core components of an effective intervention remains an unsettled issue since other studies have reported training-gains for both healthy older adults [50, 66] and MCI patients [67] after home-based, unsupervised CCT with frequency of five days a week. Further investigation is necessary to yield more definitive conclusions about the efficacy of brain-training interventions in older adults [50].
Interestingly, little is known about whether differences in cultural background modulate training effects, which is an important issue for determining if such interventions can be successful in diverse populations. Culture has been shown to influence cognitive skills and information processing. For example, studies using standard neuropsychological tests often suggest robust differences in performance between adults in the US and other nations [68–70]. Furthermore, there is evidence that cognitive skills such as perception, attention and memory can vary across cultures [71, 72]. Gutchess and collaborators emphasize that cultural differences can be associated with the engagement of distinct cognitive processes and strategies, or the processing of different aspects of the information [73]. On the other hand, theoretically, the cognitive operations underlying WM are not specific to particular groups of individuals, but apply more generally [31] and previous work suggests no cross-cultural differences in WM in teenagers [74]. In the current study, we explored the potential impact of culture and background on CCT in older adults by recruiting two different samples of subjects, one from an urban setting in the United States, and the other from a rural setting in Sweden. We anticipated that although the two samples may differ cognitively at baseline, the magnitude of any training effects would not be modulated by site.
Moreover, the influence of age on the training response in older adults continues to be unclear. Results of studies have varied regarding whether the beneficial effect of cognitive training is greater for young-old adults than old-old adults [75–78]. Although a recent meta-analysis on cognitive interventions for older adults reported no significant differences in effect sizes based on age of study participants [79], it is uncertain whether the oldest-old population (∼80 years or older) demonstrates a different training response than young-old adults (∼60 to 80 years). For instance, a study found that oldest-old adults without dementia (age range 75–101 years) were unable to enhance their memory performance after training based on the method of loci [80], a mnemonic technique that had yielded substantial learning benefits in young-old adults [81]. Nevertheless, there is evidence that the oldest-old exhibit cognitive plasticity [82] and can improve WM after cognitive training [19], suggesting that advancing age does not significantly alter the ability of older adults to benefit from cognitive interventions, so long as they have sufficient cognitive capacity to carry out the training program [79]. The impact of cognitive training on the oldest-old adults is an important area of research, requiring additional investigation, especially because this group is the fastest growing segment of the population in developed countries [83]. To help address this issue, our sample comprised a wide age range of older subjects (65–89 years) that included young-old and old-old individuals, whose average age was 10 years older than participants in a similar, previous study on WM training in older adults [50]. We expected that old-old adults would also benefit from training, although we were uncertain about whether the impact would be smaller than that observed in their younger counterparts.
Sensitive to the methodological issues associated with the investigation of cognitive training in older adults, the current study sought to address some of the concerns that have been raised. We conducted a randomized controlled trial (RCT) that used an active control group, recruited participants from two different countries with diverse cultural backgrounds, included a broader age range of older adults, and was funded by an impartial source. The investigation used the commercially available Cogmed® QM platform (Pearson Education, Inc.) to compare an Adaptive WM training condition, wherein task difficulty was continuously modulated based on performance, with an active control condition (Non-Adaptive), wherein the task difficulty remained the same [25]. By continually updating the degree of demand based on real time performance, adaptive training provides an ongoing intellectual challenge throughout the intervention period and ensures that training is neither boring nor overtaxing [57, 84]. A Non-Adaptive group seems to be an excellent active control, since this approach seeks to match all aspects of the actual intervention, except for the increasingly challenging training component [51]. Previous literature indicates that Adaptive CCT can lead to greater cognitive improvement than Non-Adaptive, accompanied by changes in brain function, as indexed by fMRI and event-related potentials [25, 85].
Outcome measures for the current study, as described in the Methods section below, reflected performance on trained Cogmed tasks, and on five neuropsychological tests that evaluated both near- and far-transfer effects. Our primary outcomes reflect the near-transfer effects, and the secondary outcomes reflect the far-transfer effects. We chose outcome measures that relied, in part, on WM, but differed in terms of the content of the material being held in WM and the extent to which WM was the central cognitive process. We hypothesized that near-transfer effects related to Cogmed training would include tasks that require holding, updating and manipulating information that has been visually presented to the participant. Digit Symbol [86] and Trail Making Test, Part B (TMT-B) [87] were chosen as strong candidates for this kind of near-transfer effects.
Other investigators have reported improvements in Digit Symbol after cognitive training [27]. Although Digit Symbol is classically considered a test of processing speed, it requires various activities, such as scanning, matching, switching and writing that are reflective of several cognitive functions like perception, encoding and retrieval processes, transformation of information stored in active memory, and decision-making [88]. Many of these processes are related to WM, and evidence suggests that speed of processing has been tightly linked to WM capacity [89]. When performing the Digit Symbol task, the better a person can hold relevant information in mind, the faster (and better) performance will be. Regarding TMT-B, Sánchez-Cubillo and colleagues [90] have argued its execution is contingent upon WM and task-switching ability. Consistent with this thesis, other research has shown that TMT-B performance is associated with performance on WM tasks [91–93].
Trail Making Test, Part A (TMT-A) is a test of visual attention that requires visual search and psychomotor speed, but places low demands on WM and executive control [90, 94]. There is evidence that adaptive CCT in older adults is associated with improvement in tasks that require attention and maintenance aspects of WM [50, 95]. TMT-A was included to help determine whether Adaptive Cogmed training would have a greater impact than Non-Adaptive Cogmed training on more basic aspects of attention. Although we hypothesize that WM training can lead to gains on untrained tests of attention (consistent with near-transfer effects), given the limited WM load associated with TMT-A, there was no compelling reason to predict that Adaptive WM training would have a greater impact on TMT-A performance than the control condition.
In terms of the far-transfer measures, although phonemic and semantic word list generation requires WM functions, these tasks rely most heavily on lexical and semantic processing [96]. Based on the literature, we hypothesized that we would find robust near-transfer effects, especially in the outcomes that demand more WM resources, but had low expectations for observing far-transfer effects. As noted, we also anticipated that the magnitude of the effects would be similar across sites from the two different countries.
METHODS
Study design, randomization, and sample size calculation
The Successful Aging and Enrichment (SAGE) study was a randomized controlled, two-site, single-blind trial, using a four-group design, including three treatment groups and a control group. The treatment groups consisted of adaptive cognitive training, physical exercise, and mindfulness meditation. The active control group participated in a non-adaptive cognitive training program. In the current report we focus only on the results of the two cognitive training conditions (Adaptive versus Non-Adaptive control). Recruitment occurred between January 2014 and October 2015. At each site, participants were randomly assigned to one of the four intervention groups. The data coordinating center of each site randomized participants to one of the intervention groups using a computer block randomization system based on a Latin square design. We used the Latin square design to counterbalance our four groups, since it is an efficient approach for small RCTs involving more than two treatment conditions [97]. Of note, the randomization was done before the baseline assessment. Neuropsychological assessment was conducted at baseline and following the intervention. Research staff were not blinded to treatment assignment of subjects. However, subjects were blinded to the cognitive training condition (adaptive versus non-adaptive) in which they participated and were unaware of any hypotheses associated with the two cognitive training conditions.
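The block randomization described above can be illustrated with a minimal sketch. It assumes a cyclic 4 x 4 Latin square whose rows serve as allocation blocks and are used in a shuffled order; the group labels, the function names, and the seed are hypothetical, and the actual system used by the data coordinating centers is not described in enough detail to reproduce.

```python
import random

# Hypothetical labels for the four SAGE arms named in the text.
GROUPS = ["adaptive_cct", "physical_exercise", "mindfulness", "non_adaptive_cct"]


def cyclic_latin_square(items):
    """Return an n x n Latin square: row r is `items` rotated left by r."""
    n = len(items)
    return [[items[(r + c) % n] for c in range(n)] for r in range(n)]


def allocation_sequence(n_participants, seed=0):
    """Build an allocation list from blocks of four (one Latin square row each)."""
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    square = cyclic_latin_square(GROUPS)
    sequence = []
    while len(sequence) < n_participants:
        rows = square[:]
        rng.shuffle(rows)  # randomize the order in which the blocks are used
        for row in rows:
            sequence.extend(row)
    return sequence[:n_participants]
```

Because every row and every column of the square contains each group exactly once, group sizes stay balanced after each complete block of four participants.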
Sample size was calculated using G*Power software (G*Power, V.3.1). We determined that a sample size of at least 30 subjects for each experimental group would be sufficient to detect a medium-sized effect of CCT on working memory ability in healthy older adults with 0.90 power at an alpha level of 0.05. Our estimation was based on the medium-sized effect (d = 0.50) observed using the same CCT intervention on a WM outcome (Digit Span Backward) found by Vermeij and colleagues [67] in a pilot study involving 23 healthy older adults.
Participants and assessment
The initial sample consisted of 82 older adults (age range 65 to 89, mean age 73.1±6.1) living independently with preserved functional and cognitive status, who were recruited through community announcements in the Boston metropolitan area, USA (population of ∼4,800,000) [98] and Växjö municipality, a rural region in Sweden (population of ∼90,000) [99]. The study was approved by the appropriate ethical review boards at the participating sites (Boston, Partners Human Research Committee, protocol 2013P002266; and Linnaeus University, Regional Ethical Committee, Linköping, protocol dnr 2013/154-31). All participants completed written informed consent. During a pre-training visit, the participants completed a detailed screening evaluation that included a structured interview to obtain a medical, neurological, and psychiatric history; a formal neurological evaluation, screening for vision and hearing, and a neuropsychological assessment.
To be included in the study, participants had to be 65 years or older, English- or Swedish-speaking, have an estimated intelligence quotient (IQ) ≥ 90 based on the American or Swedish National Adult Reading Test (AMNART, NART-SWE) [100, 101]; score ≥ 26 on the Mini-Mental State Examination (MMSE) [102]; and perform within one standard deviation (SD) of the mean for published age-based norms on the Logical Memory delayed recall test from the Wechsler Memory Scale – Third Edition (WMS-III) [103], and on naming to confrontation, as assessed by the Boston Naming Test (BNT) [104]. Subjects were excluded if they had a history of central nervous system disease or major ongoing psychiatric disorders based on DSM-IV criteria [105], exhibited clinically significant depressive symptoms (scored ≥ 15 on the Geriatric Depression Scale - GDS) [106, 107], had focal abnormalities consistent with a brain lesion as determined by a neurological examination, or had a history of clinically significant medical diseases. Clinical history and baseline performance on neuropsychological tests within one SD of the mean for age allowed us to exclude subjects with mild cognitive impairment [108] or early dementia [109].
Procedure
Participants in both cognitive training groups received identical instructions about the training program according to a standardized protocol. After the baseline assessment, a research assistant (RA) visited participants in their homes and provided a Hewlett-Packard, 15.4” laptop computer and an orientation to the Cogmed software program. Since the training was delivered online, our staff verified technical requirements such as access to an internet browser. Also, to ensure that the instructions were clear and the participant had sufficient computer skills, the RA assisted the participants in accessing the training system for the first time and supervised the practice of the demo version of the training. No training strategies were offered by study personnel. The first training session occurred approximately 7–10 days after the baseline cognitive assessment.
During the intervention period, the participant’s progress was monitored every week by an RA, who contacted the participants by telephone to provide support, as suggested by the software training guidelines (available at: https://cogmed.com/wp-content/uploads/2010/07/Coaching-manual-US-1.0.9.pdf). The conversation focused on three general areas: 1) the subject’s participation during the prior week (e.g., any missed or abbreviated CCT sessions); 2) any technical difficulties; and 3) any questions or issues in need of clarification. Participants were reminded that they could contact the staff with any additional questions. Finally, words of encouragement were provided. The post-training visit was scheduled as close as possible to the end of each participant’s training period (mean days 4.7±5.9). Of note, there was no difference between intervention groups in the number of days between the last training session and the date of cognitive testing (p > 0.5).
Interventions (WM training program)
A commercially available WM CCT program (Cogmed® QM, Pearson Education, Inc.) was utilized in the current study. Time commitments were equivalent for both training conditions, which consisted of 25 individual training sessions of approximately 40 minutes. Participants were instructed to train five days per week for five weeks. Training sessions in both conditions were based on twelve different verbal and visuospatial tasks, which included remembering a sequence of numbers, letters, shapes, or spatial locations for immediate recall. Some exercises involved active manipulation of information, such as entering numbers in reverse to the order that they had appeared or tracking the location and order of pertinent moving circles that are highlighted (see the Supplementary Material for more details about the Cogmed tasks). Participants worked on eight of the possible twelve tasks on each day of training; the tasks that each participant had to complete on a given day were pre-determined by the online training program and were consistent across subjects. The specific tasks varied across the 25 days such that each of the 12 tasks was practiced approximately the same number of times. Allocation of tasks was exactly the same for both conditions. Participants were instructed to perform all tasks within one block of time with minimal breaks between tasks.
Under Adaptive CCT, task difficulty was revised on a trial-by-trial basis with the goal of establishing 60% accuracy, thereby creating a consistently challenging level of subjective difficulty for each individual. Task difficulty was modulated by increasing or decreasing the WM load for each trial, e.g., the number of letters required to be kept in mind. Under the active control training condition, task difficulty remained at a constant, relatively low-load across all training days, which involved two items. The Cogmed training system automatically logs and saves data from each training session, including times and performance levels.
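To make the idea of trial-by-trial adaptation concrete, the following sketch shows one generic way a procedure can settle near 60% accuracy: a weighted up/down staircase. This is an illustrative assumption, not the Cogmed algorithm, which is not specified in this report. The step sizes, the load bounds, and the function names are hypothetical.

```python
def next_load(load, correct, step_up=0.4, step_down=0.6,
              min_load=2.0, max_load=12.0):
    """Weighted up/down staircase (hypothetical parameters).

    The WM load rises by `step_up` after a correct trial and falls by
    `step_down` after an error, clipped to [min_load, max_load].
    """
    load = load + step_up if correct else load - step_down
    return max(min_load, min(max_load, load))


def target_accuracy(step_up=0.4, step_down=0.6):
    """Accuracy at which expected upward and downward moves cancel.

    Equilibrium: p * step_up == (1 - p) * step_down.
    """
    return step_down / (step_up + step_down)


def items_to_present(load):
    """Number of items shown on the next trial (rounded continuous load)."""
    return int(round(load))
```

With the default steps, the equilibrium accuracy is 0.6 / (0.4 + 0.6) = 0.6, so a participant who is correct more often than 60% of the time drifts toward higher loads and vice versa. A non-adaptive control corresponds to never calling `next_load` and keeping the load fixed at two items.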
Outcome measures
Changes in performance on the Cogmed training tasks were measured using standard data provided by Cogmed software. A Cogmed “Training Index” score was computed for each training day based on averaging the difficulty level of items held in WM with 60% accuracy for one spatial WM task and one of two verbal WM tasks (whichever performance was higher). The spatial WM task involved remembering/reproducing the sequence of circles that were lit up on a two-dimensional grid (called the “Grid” task), and the verbal WM tasks involved remembering the order of digits read aloud, and at test, entering the numbers in reverse order on a number grid that remained visible the whole time (“Numbers” task) or that only became visible after all the numbers were provided (“Hidden” task).
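The Training Index as described above can be sketched as follows. This reflects only the verbal description in the text (the mean of the spatial level and the better of the two verbal levels); the software's internal computation may differ, and the function and argument names are hypothetical.

```python
def training_index(grid_level, numbers_level, hidden_level):
    """Average the spatial ("Grid") level with the better verbal level.

    Each argument is assumed to be the difficulty level a participant
    sustained at 60% accuracy on that task for a given training day.
    """
    best_verbal = max(numbers_level, hidden_level)
    return (grid_level + best_verbal) / 2
```

For example, a day with a Grid level of 5.0, a Numbers level of 4.0, and a Hidden level of 6.0 would give an index of (5.0 + 6.0) / 2 = 5.5.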
Several well-established neuropsychological tests were used to assess the effect of the computerized WM training. All participants were tested both pre- and post-intervention. The outcome measures included: 1) Trail Making Test, Part A (TMT-A) [87], which measures visual attention and speed of processing; 2) Trail Making Test, Part B (TMT-B) [87], which measures WM, set shifting, processing speed, and planning/sequencing; 3) Digit Symbol from the Wechsler Adult Intelligence Scale – Fourth Edition (WAIS-IV) [86], which evaluates processing speed, and the WM operations of maintaining information online, monitoring, and manipulation; 4) Controlled Oral Word Association Test (COWAT), also labelled Phonemic Fluency [110]; and 5) Semantic Fluency [111], both of which assess initiation, self-generation, and monitoring. The COWAT included the letters FAS, and the semantic fluency included the categories of vegetables, fruits and animals. Both fluency tasks involve executive control, lexical knowledge and retrieval [96]. Although the tasks place demands on strategic search from semantic memory, phonemic fluency requires a constrained search from a broad set of lexical exemplars, whereas semantic fluency may be accomplished with a relatively more constrained search of exemplars from a superordinate category and relies on semantic associations within the category [112].
As noted in the Introduction, performance on the five administered neuropsychological tests varied in the degree of reliance on WM operations, as well as the content of the material being held in WM. Our primary outcomes were the tests that involved WM, and were used to assess near-transfer effects. One test, TMT-A, was a relatively easy attentional task that placed limited demands on WM. Two tests, Digit Symbol and TMT-B, were excellent candidates for determining near-transfer effects. They depended heavily on WM involving visual material and shared reliance on many of the cognitive processes practiced in the WM training tasks. Finally, our secondary outcomes (COWAT and semantic fluency) were tests whose strong dependence on lexical and semantic processing provided the opportunity to assess far-transfer effects. We elected not to use a long list of outcome measures because of concerns that if positive results were observed on a limited number of specified tests, as predicted, the findings could be challenged as reflecting chance alone due to multiple comparisons.
Statistical analysis
An intention-to-treat (ITT) analysis was carried out in which all randomized participants were included in the statistical analysis [113]. No imputations for the missing data were made. Demographic and neuropsychological baseline characteristics were assessed using raw scores, and differences across treatment conditions and across sites were assessed using Pearson’s chi-square test (χ²) for dichotomous variables and the independent samples t-test for continuous variables. We used Linear Mixed Models (LMM) to model the association between predictors and each of the outcome measures. We compared the fit statistics of three models. The basic model included the fixed main effects of the intervention condition (adaptive versus non-adaptive), assessment time (pre versus post), and site (USA versus Sweden). The second model included the terms described above, as well as all the possible two-way interactions. Lastly, the third model included all the above and the three-way interaction between intervention condition, time and site. The models included participants as random intercepts to adjust for within-participant correlations of repeated measures. The parameters were estimated using the Restricted Maximum Likelihood method, and unstructured covariance was specified to model the covariance structure of both the residuals and the random factors. To control the false discovery rate (FDR), p-values for the interactions were adjusted employing the Benjamini-Hochberg (BH) procedure [114]. Under the BH procedure, each p-value is compared against the threshold P ≤ (i/m)Q, where P is the individual p-value, i is the rank of that p-value among the ordered p-values, m is the total number of tests, and Q is the FDR (which was set at 0.1).
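The BH rule stated above can be sketched in a few lines. This is a generic implementation of the standard step-up procedure, not the authors' SPSS workflow. The function name is hypothetical, and the p-values in the usage note are invented for illustration rather than taken from the study.

```python
def benjamini_hochberg(p_values, q=0.1):
    """Return a list of booleans: True where a test is significant at FDR q.

    Step-up rule: find the largest rank i (1-based, ascending p-values)
    with p_(i) <= (i / m) * q, then reject all hypotheses ranked <= i.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda k: p_values[k])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff_rank = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected
```

For the illustrative input `[0.01, 0.04, 0.03, 0.20, 0.50]` with Q = 0.1, the ordered thresholds are 0.02, 0.04, 0.06, 0.08, and 0.10, so the three smallest p-values are retained as significant and the remaining two are not.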
Utilizing available Cogmed software, performance gains during the course of training were computed using the method described above to derive a Cogmed Training Index score. This analysis focused only on the Adaptive condition, since the performance of the control group was fixed at a constant low level across the five weeks of intervention, and hence, no performance changes were observed for subjects under the control condition. Similar to a previous study [50], performance on the first two sessions was excluded from analyses due to lack of variability, as the starting point was identical for all individuals. For the remaining 23 sessions, mean daily performance was generated by Cogmed software. A repeated measures ANOVA was conducted with time (weeks 1–5) as the within-subject factor to investigate the pattern of performance during training. Lastly, we investigated the correlation between training gains on Cogmed tasks and on the neuropsychological tests. This correlation was based on calculating the percentage change [((Time2-Time1)/Time1)*100] in the Cogmed Training Index score (based on the performance in the last session vs. the third session), and percentage change on the neuropsychological tests.
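The percentage-change computation and its correlation across participants can be sketched as below. The type of correlation coefficient is not stated in the text; a Pearson coefficient is assumed here, and the function names are hypothetical.

```python
import math


def percent_change(time1, time2):
    """Percentage change from Time1 to Time2: ((Time2 - Time1) / Time1) * 100."""
    return (time2 - time1) / time1 * 100


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences (assumed method)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)
```

In use, `percent_change` would be applied per participant to the Training Index (third versus last session) and to each neuropsychological score (pre versus post), and `pearson_r` to the two resulting lists. For tests scored as completion time, such as the Trail Making Test, a negative percentage change indicates improvement.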
Regarding the adherence analysis, we included all subjects for whom we had pre- and post-intervention data, regardless of their compliance level. However, we also re-analyzed the data excluding subjects who completed <75% of the training sessions. In addition, effect sizes [115] (Cohen’s d) were calculated for outcome variables that demonstrated a significant time-by-intervention group interaction. All analyses were performed using IBM SPSS version 23, and results were considered significant at p < 0.05.
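For reference, one common form of Cohen's d is sketched below. The text does not state which variant was computed (for example, between-group change scores versus within-group pre/post differences), so the pooled-standard-deviation form for two independent groups is an assumption, and the function name is hypothetical.

```python
import math
from statistics import mean, stdev


def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups using the pooled sample SD."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * stdev(group_a) ** 2 + (n_b - 1) * stdev(group_b) ** 2
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)
```

Applied to change scores, `group_a` and `group_b` would hold the post-minus-pre differences of the Adaptive and Non-Adaptive groups, respectively.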
RESULTS
Participant enrollment is shown in Fig. 1 according to the CONSORT diagram [76]. A total of 95 persons were screened for eligibility. One was excluded due to depression. Twelve elected not to participate (e.g., due to having insufficient time to commit to the training program). The remaining 82 adults were randomized to one of the two intervention programs. Five subjects withdrew from the study (four from the Adaptive group, and one from the control group) due to family illness, complaints of wrist pain, and unexpected time constraints or need to travel. Differences between the individuals who dropped out (n = 5) and the rest of the sample (n = 77) were evaluated using independent sample t-test for continuous variables and Pearson’s chi-square test for dichotomous variables. There were no group differences in baseline cognitive performance or on any demographic variable (i.e., age, education, gender). However, there was a difference between sites (p = 0.03), as four of the five subjects that dropped out were from the Swedish sample and one from the US sample.

CONSORT flow chart for selection of study participants.
Demographics and baseline neuropsychological performance by intervention group
Demographics and baseline neuropsychological characteristics from the included participants are shown in Table 1. There were no differences between intervention groups in terms of age, years of formal education, sex, global cognition (MMSE), or any of the neuropsychological test scores.
Demographic variables and neuropsychological raw scores in each intervention group
AMNART, American version of the National Adult Reading Test; BNT, Boston Naming Test; COWAT, Controlled Oral Word Association Test; GDS, Geriatric Depression Scale; IQ, Intelligence Quotient; M, mean; MMSE, Mini-Mental State Examination; SD, standard deviation; TMT, Trail Making Test; WAIS-IV, Wechsler Adult Intelligence Scale 4th-Edition; WMS-III, Wechsler Memory Scale 3rd-Edition.
Demographics and baseline neuropsychological performance by site
Table 2 compares the baseline demographic and neuropsychological characteristics of the participants from the two countries (USA and Sweden). Participants from the two sites differed in age [t(80) = 3.95, p < 0.001] and education [t(80) = 5.53, p < 0.001], with participants from the US being older and having more years of education. The Swedish sample had higher scores on the GDS [t(72) = –15.59, p < 0.001]. Although the scores on all cognitive tests were within the normal range, the Swedish subjects showed worse performance on the Digit Symbol (WAIS-IV) [t(80) = 3.89, p < 0.001], and better performance on semantic fluency for animals [t(75) = –2.48, p = 0.01], which did not impact the overall, three-category semantic fluency performance (p = 0.38).
Demographic variables and neuropsychological raw scores in each site group
AMNART, American version of the National Adult Reading Test; BNT, Boston Naming Test; COWAT, Controlled Oral Word Association Test; GDS, Geriatric Depression Scale; IQ, Intelligence Quotient; M, mean; MMSE, Mini-Mental State Examination; SD, standard deviation; TMT, Trail Making Test; WAIS-IV, Wechsler Adult Intelligence Scale 4th-Edition; WMS-III, Wechsler Memory Scale 3rd-Edition. * Represents significant p-values, < 0.05.
Adherence rates
The Cogmed software tracked the number of sessions completed by each subject, and adherence rate was then calculated as the number of sessions completed divided by 25 sessions (i.e., the maximum number of sessions). When analyzing all subjects, adherence rates were not different between intervention groups [t(75) = –1.00, p = 0.32], with very high adherence in both groups (Adaptive group: 96.7% (range: 32–100%); Non-Adaptive group: 98.8% (range: 76–100%)). However, adherence rates differed between sites [t(75) = 2.18, p = 0.03], with 100% adherence in the US sample, and 95.7% in the Swedish sample. Note that only one subject (from the Adaptive Group in the Swedish sample) had a low adherence rate (32%). All other subjects had adherence rates over 75%.
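The adherence calculation described above (sessions completed divided by the 25-session maximum, with a 75% cut-off for the sensitivity analysis) can be sketched as follows. The participant identifiers and session counts are hypothetical and are not study data.

```python
MAX_SESSIONS = 25          # maximum number of training sessions in the protocol
ADHERENCE_CUTOFF = 75.0    # per cent; subjects below this were excluded in the re-analysis


def adherence_rate(sessions_completed: int) -> float:
    """Adherence as a percentage of the 25 scheduled sessions."""
    return 100.0 * sessions_completed / MAX_SESSIONS


# Hypothetical session counts keyed by made-up participant IDs.
sessions_by_subject = {"S01": 25, "S02": 19, "S03": 8}

rates = {subject: adherence_rate(n) for subject, n in sessions_by_subject.items()}
low_adherence = [subject for subject, rate in rates.items() if rate < ADHERENCE_CUTOFF]
print(rates)
print("Below cut-off:", low_adherence)
```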
Performance gains during training (Cogmed Training Index Score)
Across the five weeks of training, participants in the adaptive condition significantly improved their WM performance on the tasks that were trained (F(4,128) = 43.15, p < 0.001), an effect that was not modulated by site (F(4,128) = 1.12, p = 0.17) (Fig. 2).

Performance changes on Cogmed tasks across five weeks in the Adaptive group (based on Training Index Scores). The units on the y-axis reflect an arbitrary number system developed by Cogmed. Graph A represents the performance of the entire sample, and graph B shows the performance by each site. Error bars represent standard errors.
Effects of time, intervention, and site
Table 3 shows means and standard deviations of the raw scores for both intervention groups and sites, and Table 4 presents estimates and standard errors associated with fixed and random effects (Model 2). According to LMM Model 2, we observed an effect of assessment time for TMT-A and Phonemic Fluency, with scores at post-intervention being better than at pre-intervention (TMT-A: [t(76) = 2.40, p = 0.01]; Phonemic Fluency (COWAT): [t(76) = –2.81, p = 0.006]). There was no main effect of intervention or site for any of the neuropsychological outcome measures.
Pre and Post means and standard deviations by intervention and site^a
^a Data represent raw scores; ^b More negative values represent better performance; ^c Three categories. COWAT, Controlled Oral Word Association Test; TMT, Trail Making Test.
Intent-to-Treat Linear Mixed-Effects Model 2 Results^a
^a Data based on raw scores; SE, standard error; *p < 0.05; **p < 0.01; ***p < 0.001. COWAT, Controlled Oral Word Association Test; SF, Semantic Fluency; TMT, Trail Making Test.
Training effects
Our LMM analysis (Model 2) revealed a time-by-group interaction for Digit-Symbol (p = 0.02), with a medium effect size of treatment (d = 0.49) (for more details see Table 5). The interaction remained significant after controlling for FDR using the Benjamini-Hochberg procedure (P≤(i/m)Q) [114] (Fig. 3), and was driven by a training-related improvement for the Adaptive intervention group only [t(36) = –4.61, p < 0.001], and not for the control group [t(39) = –1.12, p = 0.26]. Additional analyses revealed that the groups did not differ at baseline (p = 0.57), but after the five-week intervention the group that received Adaptive Cogmed training performed significantly better on Digit-Symbol than the control group (p = 0.03). In addition, the time-by-group interaction for Digit-Symbol remained significant (p = 0.01) after controlling for speed of processing (TMT-A) performance.
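To make the Benjamini-Hochberg criterion P≤(i/m)Q concrete, a minimal Python sketch of the step-up procedure is given below, where i is the rank of a p-value among m tests and Q is the chosen false discovery rate. The value of Q and the p-values in the example are assumptions chosen for illustration; they are not the values from this study, and the sketch is not intended to reproduce the reported result.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans marking which hypotheses are rejected.

    Step-up rule: find the largest rank i with p_(i) <= (i / m) * q and
    reject every hypothesis whose rank is at or below that i.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda idx: p_values[idx])

    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            largest_rank = rank

    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= largest_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p-values for five outcome measures (illustrative only).
example_p = [0.001, 0.012, 0.04, 0.2, 0.6]
print(benjamini_hochberg(example_p, q=0.05))
```

Because the rule is step-up, a p-value that misses its own rank threshold can still be declared significant when a larger p-value meets the threshold at a higher rank.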

Cognitive performance in Digit Symbol test. *Represents p < 0.05, indicating a significant time-by-group interaction. Error bars represent standard errors.
Effect size (Cohen’s d), SE and 95% confidence interval associated with each outcome measure at post-training
Positive effect size favors adaptive training. COWAT, Controlled Oral Word Association test; TMT, Trail Making Test.
Model 3 revealed no time-by-group-by-site interaction in the cognitive outcomes, indicating that the magnitude of the training effect for Digit Symbol was similar for both sites (see Supplementary Material). It is relevant to highlight that after excluding the subject with a low adherence rate, the pattern and significance of the results described in this section remained the same. Also of note, we repeated the analysis using ANOVA, which yielded the same statistical results as the LMM.
Correlation between training gains and transfer effect
There was a positive correlation between the percentage change in performance on the Cogmed tasks (i.e., training gain) and percentage change in performance on Digit Symbol (r = 0.37; p = 0.02), which remained significant after controlling for site (r = 0.40; p = 0.02) (Fig. 4).
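One way to compute a correlation that controls for a covariate such as site is a first-order partial correlation, sketched below in Python. This is an assumed approach shown for illustration: the text does not specify how site was controlled for, and all data values are hypothetical, with site coded as 0 or 1.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_r(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical percentage changes and site codes (illustrative only).
training_gain = [1.0, 2.0, 3.0, 4.0]
digit_symbol_gain = [2.0, 1.0, 4.0, 3.0]
site = [0, 1, 1, 0]

print(pearson_r(training_gain, digit_symbol_gain))
print(partial_r(training_gain, digit_symbol_gain, site))
```

In this toy data set the site code is uncorrelated with both gain variables, so the partial correlation equals the zero-order correlation.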

Scatterplot representing the correlation between % of change on Cogmed Adaptive training (i.e., training gain) and % of change on Digit Symbol performance.
Exploratory analysis: Effect of age on training response
In an exploratory analysis (LMM) including all subjects, we examined whether the magnitude of the intervention effect was different for young-old versus old-old subjects. To do so, we performed a median split by age for each of the intervention groups. The mean age of the young-old subjects was 68.1±2.22 and of the old-old subjects was 78.0±4.57. The age of young-old and old-old subjects did not differ between intervention groups (young-old: [t(39) = –0.34, p = 0.72]; old-old: [t(39) = –1.34, p = 0.18]). For Digit Symbol, a three-way interaction between intervention group, time, and age-group was not observed [F(1,73) = 0.37, p = 0.68], suggesting that the magnitude of the training effect was not different for young-old and old-old adults. Also of note, we did not find an interaction with age-group for any of the other outcome measures.
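The median split by age within each intervention group can be sketched as follows. The handling of ages exactly equal to the median (assigned here to the young-old subgroup) is an assumption, since the text does not specify it, and the ages are hypothetical.

```python
from statistics import median


def median_split(ages):
    """Label each age as 'young-old' (<= group median) or 'old-old' (> group median)."""
    cutoff = median(ages)
    return ["young-old" if age <= cutoff else "old-old" for age in ages]


# Hypothetical ages per intervention group (illustrative only).
ages_by_group = {
    "Adaptive": [66, 70, 74, 82],
    "Non-Adaptive": [65, 69, 77, 80, 88],
}

# The split is performed separately within each intervention group.
labels_by_group = {group: median_split(ages) for group, ages in ages_by_group.items()}
print(labels_by_group)
```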
DISCUSSION
The current study investigated the effects of five weeks of intensive computerized WM training on cognitively healthy older adults. The formal aspects of the intervention (home-based, computer interface, WM tasks, ∼40 min/day, five days/week over five weeks) were shared by the experimental (Adaptive training) and the control (Non-Adaptive training) conditions, but the level of task difficulty was only adjusted in the experimental condition. This study design allowed us to isolate the role of continuously challenging participants in a WM training program. Training gain on practiced tasks and transfer effects on specified neuropsychological tests were investigated.
Similar to previous work [25, 67], our study employed Cogmed, a computerized WM training paradigm. Execution of the Cogmed tasks requires maintaining stored information, shifting between encoding and retrieval demands, and exercising other aspects of attentional control. The beneficial effects of the Cogmed program have been reported in several clinical populations, including patients with attention-deficit/hyperactivity disorder [116–119]; brain injury [120, 121]; epilepsy [122], and older adults with mild cognitive impairment [67, 123]. However, there is sparse literature regarding the cognitive effects of this paradigm on healthy older adults [67, 124].
The most salient finding of our study of older adults is the intervention-associated improvement in performance on the Digit Symbol task, one of the primary outcomes. There are several reasons why we do not believe that this result is simply a reflection of chance. Importantly, we predicted this outcome based on an analysis of the likely near-transfer effects associated with participation in Adaptive Cogmed training. Transfer effects may occur if the trained and transfer tasks engage overlapping cognitive processing components and brain regions [67, 125]. Although all of the tasks that served as outcome measures relied, in part, on WM, they differed in terms of the content of the material being held in WM and the extent to which WM was the central cognitive process. Cogmed tasks involve training in holding on-line and updating information, operations that are critical to carrying out the Digit Symbol task [88]. The link between Cogmed training and changes in performance on the Digit Symbol task is strengthened by the positive correlation that was observed between performance gains on Cogmed tasks (as measured by the Training Index) and improvement on Digit Symbol. In addition, the beneficial impact of Cogmed training on Digit Symbol continued to be observed after using a procedure to reduce FDR. Critically, the time-by-group interaction observed on Digit Symbol remained significant after controlling for processing speed performance (TMT-A), suggesting that the gains on the Digit Symbol task likely reflect improvement in WM, and not in processing speed. Lastly, the effect size observed (d = 0.49) was medium, in line with a previous study [67], but in contrast to the small effect sizes reported in several computerized cognitive training studies [59].
Our prediction about training-related improvement in TMT-B was not substantiated. We hypothesize that the difference in training effects for Digit Symbol and TMT-B may be due to the fact that maintenance and rapid retrieval of information held in WM, which are extensively practiced in the Cogmed program, are much more relevant to Digit Symbol than TMT-B. For instance, once the symbol–digit relationships are successfully held in WM, it is no longer necessary to rely on visual scanning of the key at the top of the administration sheet to produce correct responses. Reliance on WM can reduce the time to generate an accurate response, resulting in an increased score on the Digit-Symbol test. Cogmed tasks place much less emphasis on shifting sets, which is one of the cardinal features of TMT-B. In fact, prior studies with Cogmed indicate that this program has a limited effect on interference control [50, 126], which contributes to set-shifting abilities. The differential impact of the intervention on Digit Symbol and TMT-B suggests that transfer effects associated with training may be very sensitive to the specific cognitive operations that are practiced during the training period.
In our study both Adaptive and Non-Adaptive Cogmed training were associated with a similar degree of improvement in TMT-A performance (i.e., main effect of time; no time by intervention interaction). This result may reflect increased familiarity with this test and the development of practice effects. However, the Cogmed training protocols (with or without ongoing adjustments to level of task demand) require attention, visual tracking and speeded responses, which could facilitate improvement of performance on relatively simple tasks like TMT-A. As predicted, there seems to be no advantage to the adaptive training intervention when subjects are asked to perform untrained tasks like TMT-A that involve very low-level WM demands.
Although phonemic and categorical word list generation requires WM [96], these tasks rely heavily on lexical and semantic processing. In contrast to Digit Symbol, word list generation does not involve manipulation of visually presented stimuli, but rather depends on access to one’s mental lexicon. Training-related improvement on word fluency tasks would be consistent with far-transfer effects, but no such effects were observed in this study. These findings are in line with literature that has reported inconsistent far-transfer effects after WM training [52, 56]. Previous studies of older adults who have participated in the Cogmed training program also have failed to observe far-transfer effects on measures of reasoning, interference control, and episodic memory [50, 67]. The transfer effects observed in the current study were limited to the processing of external visual stimuli (and not internal mental stimuli). However, it is noteworthy that the kinds of cognitive operations needed to effectively carry out the Digit Symbol task are vital to many complex, real world activities (e.g., driving), which require the maintenance and updating of visual stimuli. Consistent with this idea, Digit Symbol is the only non-memory task in the Preclinical Alzheimer Cognitive Composite (PACC) score, which is an index that has been shown to predict decline in independent, cognitively normal older adults at risk for AD [127–129].
Our intervention focused on the training of only one cognitive domain (WM), which may have limited transfer effects compared to multi-domain training routines. Theoretically, the latter approach engages many cognitive processes, which may yield broader transfer effects [52]. However, the meta-analysis by Mewborn and colleagues [79] challenges this idea, suggesting that multi-domain training interventions are not more successful than single-domain interventions at improving cognition immediately post-intervention. In fact, these authors found that WM training was more effective than all other single-domain interventions and than multi-domain interventions that did not include WM training.
Our study design provided an opportunity to examine whether the impact of computerized Adaptive WM training was influenced by the demographic make-up of participants. Subjects from Sweden and the United States differed in terms of cultural background, age, education, and baseline performance on neuropsychological tests. Växjö, Sweden is a rural community, whereas Boston, Massachusetts is a metropolitan one. On average, participants from the US site were older, had more years of formal education, and performed better on several baseline neuropsychological tests. Despite these noteworthy differences between the two sites, there was no time-by-intervention group-by-site interaction for performance on Digit Symbol, indicating that the magnitude of the training effect was similar across the two sites. These results are consistent with a meta-analysis suggesting that demographic variables such as age and education do not reliably modify the overall effect of cognitive interventions in older adults [79]. Additional research is needed to further clarify the potential influence of cultural factors on cognitive training effects. To the best of our knowledge, our study is the first to compare the impact of CCT on elders from different countries. Confirmation of our results indicating similar training-related effects for participants from diverse backgrounds would be important since CCT has the advantage of being readily accessible through the internet to individuals from different countries and cultural backgrounds. Importantly, our findings suggest good external validity of our computerized training protocol, which was applicable in different cultural contexts.
Results from the current study are pertinent to the question of whether old-old adults can benefit from CCT WM training. Brehmer and colleagues [50] studied normal young-old adults, 60 to 70 years old, and found that participants in the Adaptive training group outperformed those in the control group on several untrained cognitive tasks involving WM and sustained attention, such as Digit Span, Span Board and Paced Auditory Serial Addition Test (PASAT). The current investigation extends these findings to an older sample, 65–89, whose average age was ∼10 years higher (mean age 73.1) than participants in the Brehmer study (mean age 63.7). Our exploratory analysis using a median split of participants by age found that young-old and old-old adults benefit to a similar degree from the training, as measured by the Digit Symbol task. Our findings are in line with a previous report that old-old adults (∼80 years) can improve visual WM after WM training [19]. In addition, results of studies have varied regarding whether the beneficial effect of cognitive training is greater for young-old adults than old-old adults [75–78]. Relevant to this question is a recent meta-analysis on cognitive interventions in older adults that reported no significant difference in effect size based on age of study participants [79]. The authors suggest that advancing age does not significantly alter the ability of older adults to benefit from cognitive interventions, if they have large enough cognitive capacity to execute the training program.
Our results do not support the claims by Lampit and collaborators [59] that training more than three times per week lacks efficacy. Moreover, we would question their suggestion that unlike group-based training, home-based training is ineffective. Our findings and those from other recent studies [50, 131] demonstrate that CCT done in the homes of older participants can improve cognitive performance on untrained tasks. Further confirmation of these results is critical, given the accessibility and relatively low cost of in-home, computer-based cognitive exercises. Participation in either the experimental or control condition was demanding and involved five ∼40-minute sessions per week over five weeks. Although the study did not include a direct measure of motivation, excellent adherence rates in both groups strongly suggest that participants were highly motivated and engaged. We believe that actively communicating with participants on a weekly basis in ways that provided both motivational and technical support contributed to our high adherence rates. Additional research is necessary to determine the extent to which an ongoing relationship with a member of the study or care team and the implementation of structured supervision are essential components of a successful home-based cognitive intervention program, as some investigators have suggested [132]. Consistent with our study, a previous Cogmed home-based CCT with older adults [50] reported a high adherence rate (94%), which is also similar to studies with children suffering from epilepsy [133] who had adherence rates of approximately 90%. Finally, we believe that individuals who agree to participate in an intense-dose training protocol like ours (i.e., five days per week over five weeks) are already highly motivated, which probably contributed to the high adherence rate observed.
In general, it is much more challenging to demonstrate training effects when using an active control group, as was done in the current study, than a comparison group involving either no contact or participation in a very different kind of intervention protocol (e.g., viewing educational videos) [51, 52]. Our active control condition included all aspects of the training intervention except for real-time adjustments to level of task difficulty. The observed benefits of Adaptive CCT over the active control suggest that the gains were not due to practice effects or non-specific intervention effects (e.g., involving motivation, test familiarity, or changes in performance anxiety) [57, 135], but were linked to providing a continuously challenging level of WM difficulty. Additionally, blinding participants to group assignment reduced the likelihood that subject expectations or placebo effects can adequately account for our findings.
Despite the originality and the careful experimental design of the present study, we acknowledge several limitations that remain unaddressed. First, although the study was approved by the Institutional Review Boards at the two sites, it was not registered (e.g., clinicaltrials.gov), which could negatively impact the quality rating of the trial. Second, although the magnitude of improvement did not differ across sites, suggesting that training effects were similar for participants from different cultural backgrounds, further research with greater statistical power is necessary to reach more definitive conclusions. We were not able to disentangle possible cultural differences between sites from demographic and cognitive differences (presented in Table 2), or the environmental context (rural versus urban). Therefore, the present conclusions should be considered with caution. Third, due to the moderate sample size, our study was not able to address mediating factors that predict intervention response, which is critical for identifying those participants who are most likely to benefit from WM CCT. Fourth, although controlling for false discovery rates reduces the likelihood that our findings are due to chance, the transfer effect observed in the current study relies on only one cognitive task, and does not reflect converging evidence from multiple tasks. Further investigation is needed to replicate our findings using other cognitive measures. Fifth, although we provide data on training gains on the Cogmed tasks, we did not include a criterion task (i.e., WM task similar to the ones in which participants were trained) to determine if performance at baseline or after training differed between experimental groups that might help account for the near-transfer effect observed. 
Sixth, in contrast to several other studies, we did not use episodic memory, reasoning or fluid intelligence as outcome measures [50, 67], leaving open the question whether the Cogmed WM training program would result in these kinds of far-transfer effects. Finally, it will be important for future studies to determine whether the cognitive benefits observed are maintained over time and to investigate possible transfer effects to real-world activities.
In conclusion, this multi-site, randomized controlled study demonstrated that healthy older adults can benefit from an intense five-week WM CCT. Compared to an active control condition, Adaptive training effectively improved performance on a task emphasizing WM and processing speed. The benefits over the active control group suggest that our CCT gains were not due to practice or non-specific intervention effects, but were associated with providing a continuously challenging level of WM difficulty. Importantly, the magnitude of improvement was similar in two subject samples with differing demographics and was not modulated by age, indicating that computerized WM training can be effective for older adults from divergent cultural backgrounds. It remains to be determined whether this type of intervention is more effective than other kinds of readily available cognitively stimulating activities. The results observed with this WM training program, which took place in the participant’s own home via the internet, suggest that the program is promising, potentially scalable, and worthy of additional study.
ACKNOWLEDGMENTS
The authors would like to thank Mayada Guzmán for her excellent administrative assistance. The authors also thank Cogmed® & Pearson® for allowing access to the training programs.
This study was funded by the Kamprad Family Foundation, Växjö, Sweden. In addition, the Laboratory of Healthy Cognitive Aging at Brigham and Women’s Hospital has been sustained by NIA Grant R01AG017935 and ongoing support from the Wimberly family, the Muss family, and the Mortimer/Grubman family.
