Abstract
Background:
There are detectable cognitive differences in cognitively unimpaired (CU) individuals with preclinical Alzheimer’s disease (AD).
Objective:
To determine whether cross-sectional performance on the Cogstate Brief Battery (CBB) and Auditory Verbal Learning Test (AVLT) could identify 1) CU participants with preclinical AD defined by neuroimaging biomarkers of amyloid and tau, and 2) incident mild cognitive impairment (MCI)/dementia.
Method:
CU participants age 50+ were eligible if they had 1) amyloid (A) and tau (T) imaging within two years of their baseline CBB or 2) at least one follow-up visit. AUROC analyses assessed the ability of measures to differentiate groups. We explored the frequency of cross-sectional subtle objective cognitive impairment (sOBJ) defined as performance ≤–1 SD on CBB Learning/Working Memory Composite (Lrn/WM) or AVLT delayed recall using age-corrected normative data.
Results:
A+T+ (n = 33, mean age 79.5) and A+T– (n = 61, mean age 77.8) participants were older than A–T– participants (n = 146, mean age 66.3), and comparable on sex and education. Lrn/WM did not differentiate A+T+ or A+T– from A–T– participants. AVLT differentiated both A+T+ and A+T– from A–T– participants; 45% of A+T+ and 25% of A+T– participants met sOBJ criteria. The follow-up cohort included 150 CU individuals who converted to MCI/dementia and 450 age, sex, and education matched controls. Lrn/WM and AVLT differentiated between stable and converter CU participants.
Conclusion:
Among CU participants, AVLT helped differentiate A+T+ and A+T– from A–T– participants. The CBB did not differentiate biomarker subgroups, but showed potential for predicting incident MCI/dementia. Results inform future definitions of sOBJ.
INTRODUCTION
There is a genuine need to develop an efficient way to screen the population for elevated brain amyloid and tau in order to 1) identify appropriate individuals with preclinical Alzheimer’s disease (AD) for anti-amyloid and anti-tau clinical trials, and 2) identify those with AD pathology and thus at highest risk of developing AD dementia once anti-amyloid or anti-tau therapies are identified. Simple cognitive assessment methods show promise in this regard. First, there are detectable cognitive differences in amyloid positive versus negative cognitively unimpaired (CU) individuals, which are most robust for memory measures [1, 2]. Effect sizes are small in amyloid positive CU individuals (d = –0.17), but are larger in individuals with both amyloid and additional disease burden in the form of tau or neurodegeneration (d = –0.47) [1]. Second, subtle objective cognitive impairment (sOBJ), also referred to as subtle cognitive decline, shows promise for contributing to detection of preclinical AD. sOBJ based on a single baseline assessment confers risk of converting from CU to mild cognitive impairment (MCI)/AD at follow-up [3]. For example, Edmonds and colleagues showed that 46% of individuals with sOBJ alone, and 84% of individuals with sOBJ, neurodegeneration, and amyloidosis converted to MCI/AD dementia [3]. Further, individuals with cross-sectional sOBJ show faster accumulation of amyloid per PET neuroimaging, faster entorhinal cortical thinning, and greater risk of conversion to MCI/AD dementia over follow-up relative to CU participants without sOBJ [4].
Various definitions of sOBJ have been suggested based on traditional neuropsychological tests [3, 5–7], but there is no clear agreement on the best definition or measures to use. Cross-sectional definitions have included a <10th percentile cut-off score on composite measures of memory [7] or global cognition [6]. Other approaches have combined cognitive and functional measures, requiring two out of six scores <1 SD below published normative data in two separate cognitive domains or a subtle decline in functional abilities based on a Functional Assessment Questionnaire score of 6–8 [3]. Intra-individual definitions of objective longitudinal cognitive decline (ΔOBJ) may confer greater sensitivity relative to cross-sectional approaches and represent an important method for defining transitional cognitive decline for individuals with preclinical AD [8]. However, whether intra-individual decline is superior to a cross-sectional approach remains a hypothesis to be tested, and no clear a priori cut-offs have been recommended for how to define longitudinal decline at this time. Despite this ambiguity and prior evidence of the utility of cross-sectional sOBJ [3, 4], there is no cross-sectional option for defining Stage 2 transitional cognitive decline based on cognitive measures as recently proposed by the National Institute on Aging-Alzheimer’s Association (NIA-AA) Working Group [8]. Therefore, additional information about the frequency of cross-sectional sOBJ in preclinical AD will help address whether cross-sectional definitions of sOBJ should be considered as another way to define transitional cognitive decline and will inform development of a generalizable, widely accepted definition of sOBJ.
Computerized cognitive measures, such as the Cogstate Brief Battery (CBB), may be a more efficient method for identifying sOBJ relative to traditional, standardized person-administered measures [9, 10]. Further, computerized measures can be incorporated into study designs independent of diagnosis, which helps avoid the frequent circularity between measures used to define sOBJ and those used for current and future diagnosis. If shown to be reliable and valid, computerized measures that provide the option for unsupervised administration may be better suited for large-scale cognitive screening and monitoring than traditional neuropsychological measures or other computerized measures that require in-person administration. The CBB is one such tool that can be administered with or without supervision and consists of four cognitive tasks that measure psychomotor function, attention, working memory, and visual memory [11]. The CBB is already being used for at-home enrichment of clinical trial enrollment through the Brain Health Registry and has been implemented in the next phase of the Alzheimer’s Disease Neuroimaging Initiative (ADNI-3) [12–14]. Early Cogstate Ltd.-affiliated studies suggest that the CBB may show promise for detecting early preclinical cognitive change; CU individuals identified as memory decliners based on longitudinal performance on the One Card Learning subtest from the CBB had a higher frequency of amyloid positivity relative to individuals with stable performance [15, 16]. Cogstate has FDA approval to market the CBB as a medical device under the name Cognigram™ for use as a digital cognitive assessment tool that can be completed in clinic or at home through prescription use in individuals 6–99 years of age. Although the CBB was originally developed to detect change over time, Cognigram is marketed for use on a single occasion or to determine cognitive change over time.
One study showed that a single supervised administration of the Learning/Working Memory (Lrn/WM) Composite from the CBB showed good diagnostic accuracy for AD dementia and MCI [17], but this has not yet been replicated. It remains unclear whether a single CBB administration can help predict elevated brain amyloid or conversion to MCI/dementia in CU individuals.
The primary aim of this study was to help inform evolving definitions of sOBJ by determining whether baseline performance on a computerized measure, the CBB, can help identify CU participants with 1) preclinical AD defined by neuroimaging biomarkers of amyloid and tau, and 2) incident MCI/dementia. We predicted that the CBB Lrn/WM Composite and a traditional neuropsychological measure, Auditory Verbal Learning Test (AVLT) delayed recall, would show comparable 1) diagnostic accuracy and 2) prognostic accuracy based on AUROC analyses. We also applied a conventional cut-off of ≤ –1 SD below age-adjusted normative scores to the CBB and AVLT to investigate the frequency of sOBJ and to facilitate clinical translation of study results. We predicted that CU individuals meeting criteria for a biological diagnosis of AD (CU A+T+), but not those with Alzheimer pathologic change alone (CU A+T–), would show a higher frequency of sOBJ relative to CU A–T– individuals. We also predicted that individuals with incident MCI/dementia defined as CU individuals who later convert to MCI or dementia (CU converters) would show a higher frequency of sOBJ relative to CU stable participants at baseline.
METHODS
The Mayo Clinic Study of Aging (MCSA) is a prospective population-based study of cognitive aging among Olmsted County, MN, residents following an age- and sex-stratified random sampling design [18]. Individuals randomly chosen for recruitment were invited to participate in the MCSA and those without a medical contraindication were invited to participate in imaging studies. PiB-PET began in 2008 and tau-PET imaging began in 2016. The MCSA began administering Cogstate measures in the clinic in 2012 to the newly enrolled 50–69-year-olds during clinic visits. Cogstate administration for those aged 70 and older began the following year.
MCSA study visits included a neurologic evaluation by a physician, an interview by a study coordinator, and neuropsychological testing by a psychometrist [18]. The physician examination included a medical history review, complete neurological examination, and administration of the Short Test of Mental Status (STMS) [19]. The study coordinator interview included demographic information and medical history, and questions about memory to both the participant and informant using the Clinical Dementia Rating scale [20]. Details about the neuropsychological battery have been previously reported [18].
For each participant, performance in a cognitive domain was compared with age-adjusted scores of cognitively unimpaired (CU) individuals using Mayo’s Older Americans Normative Studies [21]. Participants with scores of ≥1.0 SD below the age-specific mean in the general population were considered for possible cognitive impairment, taking into account education, prior occupation, visual or hearing deficits, and other information. A diagnostic determination of CU, MCI, or dementia is made by a consensus agreement between the study coordinator, examining physician, and neuropsychologist using published criteria [18, 22]. The diagnosis at each study visit is conducted blind to prior clinical information, diagnosis, or knowledge of biomarkers. Performance on the neuropsychological battery is considered by the neuropsychologist for diagnostic recommendation, but not by the physician or study coordinator. Performance on Cogstate was not considered for diagnosis.
The study protocols were approved by the Mayo Clinic and Olmsted Medical Center Institutional Review Boards. All participants provided written informed consent.
Inclusion criteria
Two partially overlapping study samples from the MCSA were derived to address study hypotheses: 1) participants with biomarker data, and 2) all available participants, grouped by future diagnosis (see Table 1, which includes information about sample overlap).
Study definitions
¹Biomarker profile defined by neuroimaging biomarkers of amyloid and tau within two years of baseline Cogstate.
²CU at baseline Cogstate and remained CU at all available follow-up visits. N = 51 participants with biomarker data were also represented in this group (n = 29 CU A–T–, n = 14 CU A+T–, n = 8 CU A+T+).
³CU at baseline Cogstate and converted to MCI/dementia at any follow-up visit. N = 18 participants with biomarker data were also represented in this group (n = 5 CU A–T–, n = 5 CU A+T–, n = 8 CU A+T+).
⁴Cogstate is independent of consensus diagnosis; ≤–1 SD cut-off is equivalent to an age-corrected z-score of ≤–1 derived from normative data provided by Cogstate.
⁵AVLT is considered as part of consensus diagnosis; ≤–1 SD cut-off is equivalent to a Mayo’s Older Americans Normative Studies age-corrected scaled score of ≤7.
Note. CU = Cognitively Unimpaired; A = amyloid; T = tau; sOBJ = subtle objective cognitive impairment; CBB = Cogstate Brief Battery; Lrn/WM = Learning/Working Memory; AVLT = Auditory Verbal Learning Test 30-Minute Delayed Recall.
Participants with biomarker data
Individuals were eligible if they were CU, aged 50 or older at the time of their baseline CBB session, and had amyloid PET (using Pittsburgh Compound B [PiB-PET]) and tau PET (AV1451) scans within two years of their baseline Cogstate evaluation (n = 240). After application of the biomarker cut-offs described below to eligible participants, biomarker subgroups included CU individuals with a biological definition of AD (n = 33 A+T+), CU individuals with AD pathologic change (n = 61 A+T–), and CU individuals with normal AD biomarkers (n = 146 A–T–).
Participants grouped by future diagnosis
For this analysis, MCSA participants aged 50+ who were CU at baseline CBB and had at least one MCSA follow-up visit were eligible for inclusion (N = 2,328). A cohort of 150 individuals with a diagnosis of MCI/dementia at any follow-up visit were identified and labeled as CU converters; the majority of these participants converted to MCI (n = 141). The remaining participants (n = 2,178) represented a pool of participants who remained CU at all available follow-up visits and were eligible for matching (CU stable). Each converter was matched to three stable CU participants on age (±5 years), sex, and education (≤12 years, 13–14, 15–16, 17+).
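The matching criteria above can be illustrated with a minimal sketch. The text specifies only the criteria (age within ±5 years, same sex, same education band, three controls per converter); the greedy nearest-age selection without replacement, the `Participant` record, and the function names below are assumptions made for illustration, not the procedure actually used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Participant:
    """Hypothetical record holding only the matching variables."""
    pid: str
    age: float
    sex: str
    education_years: int


def education_band(years: int) -> str:
    """Education categories from the text: <=12, 13-14, 15-16, 17+."""
    if years <= 12:
        return "<=12"
    if years <= 14:
        return "13-14"
    if years <= 16:
        return "15-16"
    return "17+"


def is_eligible_match(converter, candidate, age_window=5.0):
    """True if the candidate meets all three matching criteria."""
    return (
        abs(converter.age - candidate.age) <= age_window
        and converter.sex == candidate.sex
        and education_band(converter.education_years)
        == education_band(candidate.education_years)
    )


def match_controls(converters, stable_pool, n_controls=3):
    """Greedy matching without replacement (assumed, not from the paper).

    For each converter, eligible unused controls are ranked by absolute
    age difference and the closest n_controls are selected.
    """
    used = set()
    matches = {}
    for conv in converters:
        candidates = sorted(
            (c for c in stable_pool
             if c.pid not in used and is_eligible_match(conv, c)),
            key=lambda c: abs(c.age - conv.age),
        )
        chosen = [c.pid for c in candidates[:n_controls]]
        used.update(chosen)
        matches[conv.pid] = chosen
    return matches
```

A greedy pass like this can leave later converters with fewer than three controls when the pool is small; with a pool of 2,178 stable participants for 150 converters this is less of a concern, but an optimal-matching routine would avoid it entirely.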
Biomarker methods
PiB-PET and tau-PET are acquired with a PET/CT operating in three-dimensional mode. The details of the acquisition, processing, and cut-off derivation have been previously published [23–25]. Individuals are considered amyloid positive if they have an SUVR > 1.48 in a previously defined meta-ROI [24] and tau positive if they have an SUVR > 1.25 in a previously defined meta-ROI including temporal lobe structures [24]. To simplify the study design and maintain a focus on individuals on the AD continuum, we grouped participants according to amyloid and tau (AT) imaging status, instead of using the full AT(N) classification scheme [26]. Unspecified or unavailable biomarker data are denoted by an asterisk (*); we did not consider N status and thus use N* to denote this. We limited the current study to individuals meeting the biological definition of AD (A+T+N*) or with AD pathologic change (A+T–N*) per the recently proposed research framework for a biological diagnosis of AD [8]. We refer to individuals who are A–T– as biomarker negative, but in doing so likely include some individuals with non-AD pathologic change (A–T–N+). Because of our focus on individuals within the AD continuum, we did not include A–T+ individuals.
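The grouping rule described above can be written out as a short sketch. The two SUVR cut-offs come from the text; the function names and the ASCII group labels (for example `"A-T-"`) are illustrative assumptions.

```python
from typing import Optional

AMYLOID_SUVR_CUTOFF = 1.48  # PiB-PET meta-ROI cut-off stated in the text
TAU_SUVR_CUTOFF = 1.25      # tau-PET meta-ROI cut-off stated in the text


def at_profile(amyloid_suvr: float, tau_suvr: float) -> str:
    """Return the AT profile; positivity requires SUVR strictly above the cut-off."""
    a = "A+" if amyloid_suvr > AMYLOID_SUVR_CUTOFF else "A-"
    t = "T+" if tau_suvr > TAU_SUVR_CUTOFF else "T-"
    return a + t


def study_group(amyloid_suvr: float, tau_suvr: float) -> Optional[str]:
    """Return the study group, or None for A-T+ participants, who were not included."""
    profile = at_profile(amyloid_suvr, tau_suvr)
    return None if profile == "A-T+" else profile
```

N status is deliberately absent from this sketch, mirroring the N* convention used in the study.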
Cognitive measures
Cogstate Brief Battery (CBB)
Only baseline Cogstate performance is included in the current study. Cogstate version 7 (v7) was used, which provides personal computer (PC), iPad, and web-based administration options. Most participants completed their first Cogstate session in clinic on a PC or iPad (see Stricker, et al. [27] for details about Cogstate procedures). Less than 1% completed their baseline Cogstate session at home through web-based administration on a PC, thus the small differences in performance at home versus in clinic that we previously reported on measures of accuracy are unlikely to impact results presented [28]. The ability to reliably complete and adhere to the requirements of each task was determined by completion checks as previously described [27]. All data values with a failed completion flag were removed from subsequent analyses.
The CBB includes the following tasks: 1) Detection (DET): A simple reaction time paradigm that measures psychomotor function. 2) Identification (IDN): A choice reaction time paradigm that measures visual attention. 3) One Card Learning (OCL): A continuous visual recognition learning task that assesses visual recognition memory and attention. 4) One Back (ONB): A task that assesses working memory and attention. Detection and Identification previously showed limited diagnostic accuracy for MCI and were not included in analyses [17]. Accuracy data (OCL and ONB) were transformed using arcsine transformation to normalize the variables.
The CBB Lrn/WM composite was the primary measure of interest. It is derived using the average of age-corrected z-scores for One Card Learning Accuracy and One Back Accuracy. Normative data were provided by Cogstate [29] in the form of means and SDs by age groups for arcsine transformed OCL accuracy and ONB accuracy scores. Secondary exploratory analyses also used only One Card Learning accuracy as an alternative primary outcome variable since measures of episodic memory are typically more sensitive to early decline in preclinical AD relative to measures of working memory [1, 2], and to allow direct comparison to raw-score analyses. One Card Learning z-scores and arcsine transformed raw scores were used in this analysis.
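As a rough illustration of how such a composite is assembled, the sketch below transforms each accuracy score, converts it to an age-corrected z-score, and averages the two. Two things here are assumptions rather than details given in the text: the transformation is taken to be the arcsine of the square root of the proportion correct, and the normative means and SDs in the usage note are placeholder values, not Cogstate's norms.

```python
import math


def arcsine_sqrt(proportion_correct: float) -> float:
    """Arcsine square-root transform of an accuracy proportion in [0, 1] (assumed form)."""
    return math.asin(math.sqrt(proportion_correct))


def z_score(value: float, norm_mean: float, norm_sd: float) -> float:
    """Standardize a transformed score against an age-group normative mean and SD."""
    return (value - norm_mean) / norm_sd


def lrn_wm_composite(ocl_accuracy, onb_accuracy, ocl_norm, onb_norm):
    """Average of the age-corrected z-scores for OCL and ONB accuracy.

    ocl_norm and onb_norm are (mean, sd) tuples for the participant's
    age group on the transformed scale.
    """
    ocl_z = z_score(arcsine_sqrt(ocl_accuracy), *ocl_norm)
    onb_z = z_score(arcsine_sqrt(onb_accuracy), *onb_norm)
    return (ocl_z + onb_z) / 2
```

For example, with hypothetical norms of (1.00, 0.10) for OCL and (1.30, 0.15) for ONB, `lrn_wm_composite(0.70, 0.92, (1.00, 0.10), (1.30, 0.15))` returns the composite z-score for a participant with 70% and 92% accuracy.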
Auditory verbal learning test
The AVLT is a 15-item word list memory test that is part of the neuropsychological battery used at each MCSA study visit. The Mayo’s Older Americans Normative Studies (MOANS) age-corrected scaled score [21] for 30 min delayed recall was the primary measure of interest; these scaled scores have a mean of 10 and SD of 3. Because many participants were already enrolled in the MCSA and followed longitudinally at the time of the baseline Cogstate session that is the focus of this study, a portion of participants had prior exposure to the AVLT (see Results). The normative data applied [21] do not take prior test exposure into account, which likely decreases the sensitivity of this measure as used in the current study, particularly in the subgroups with greater frequency of prior exposure.
Subtle Objective Cognitive Impairment (sOBJ)
sOBJ was defined using a conventional cut-off of performance of less than or equal to 1 standard deviation below the mean (≤–1 SD) based on age-corrected normative scores. This translates to an age-corrected z-score of ≤–1 for Cogstate measures [29] and an age-corrected MOANS scaled score of ≤7 for AVLT 30 min delayed recall [21].
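The equivalence between the two cut-offs follows from the scaled-score metric (mean 10, SD 3): a scaled score of 7 is (7 − 10) / 3 = −1 in z-score units. A minimal sketch of the flagging rule, with illustrative function names, is:

```python
MOANS_MEAN = 10.0      # scaled-score mean stated in the text
MOANS_SD = 3.0         # scaled-score SD stated in the text
SOBJ_Z_CUTOFF = -1.0   # conventional cut-off of <= -1 SD


def scaled_score_to_z(scaled_score: float) -> float:
    """Convert a MOANS age-corrected scaled score to z-score units."""
    return (scaled_score - MOANS_MEAN) / MOANS_SD


def meets_sobj(z: float) -> bool:
    """True if an age-corrected z-score is at or below the -1 SD cut-off."""
    return z <= SOBJ_Z_CUTOFF


def sobj_flags(lrn_wm_z: float, avlt_scaled_score: float) -> dict:
    """Flag sOBJ separately for each measure, as the measures are evaluated separately."""
    return {
        "lrn_wm": meets_sobj(lrn_wm_z),
        "avlt": meets_sobj(scaled_score_to_z(avlt_scaled_score)),
    }
```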
Statistical methods
Demographic and clinical differences between groups were assessed by linear model ANOVA test for means and Chi-squared test for frequencies. Effect size (Hedges’ g) was computed using a weighted and pooled standard deviation. AUROC analyses were conducted to assess the ability of measures to differentiate two groups. The Youden index method was used to identify cut points [30]. This method defines the optimal cut-point as the point maximizing the Youden function, which is the difference between the true positive rate (TPR) and false positive rate (FPR) over all possible cut-point values. We also directly tested biomarker group-wise discrimination as summarized by the AUROC for select comparisons [31]. Frequency of sOBJ using a conventional cut-off was compared across groups.
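The three quantities named above can be sketched in a few lines using only the standard library. This is an illustration under stated assumptions, not the analysis code used in the study: the small-sample correction in `hedges_g` is the usual approximation 1 − 3 / (4N − 9), the AUROC is computed as the Mann-Whitney probability, and lower scores are assumed to indicate the positive class (biomarker positive or converter).

```python
import math
from statistics import mean, variance


def hedges_g(group_a, group_b):
    """Standardized mean difference with pooled SD and small-sample correction."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    d = (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)
    correction = 1 - 3 / (4 * (n_a + n_b) - 9)  # usual approximation
    return d * correction


def auroc(cases, controls):
    """P(case score < control score), counting ties as 0.5."""
    wins = sum(
        1.0 if x < y else 0.5 if x == y else 0.0
        for x in cases
        for y in controls
    )
    return wins / (len(cases) * len(controls))


def youden_cutpoint(cases, controls):
    """Return (cutoff, J, sensitivity, specificity) maximizing J = TPR - FPR.

    The classification rule is 'score <= cutoff => positive'. When several
    cut-offs tie on J, the lowest one is kept.
    """
    best = None
    for cutoff in sorted(set(cases) | set(controls)):
        tpr = sum(x <= cutoff for x in cases) / len(cases)
        fpr = sum(y <= cutoff for y in controls) / len(controls)
        j = tpr - fpr
        if best is None or j > best[1]:
            best = (cutoff, j, tpr, 1 - fpr)
    return best
```

Because J = TPR − FPR = sensitivity + specificity − 1, maximizing the Youden function is the same as maximizing the sum of sensitivity and specificity.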
For participants with biomarker data, we also used a multivariate logistic regression model to predict group membership (A–T– versus A+T+, A–T– versus A+T–) and to compute odds ratios (OR) and 95% confidence intervals (CI). Multivariate logistic regression models were adjusted for age, sex, and years of education and included a single cognitive measure of interest as the predictor. This allowed us to compute a demographics-adjusted OR for the Lrn/WM composite and AVLT delayed recall. In order to better compare to Lrn/WM, AVLT delayed recall was rescaled to z-score units. For every 1.0 decrease in z-score, the odds of being in the biomarker positive group increase by (OR – 1) × 100%.
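The interpretation of the OR can be made concrete with a short sketch. It assumes the fitted coefficient `beta` is expressed per 1.0 z-score increase in the cognitive measure, so the OR for a 1.0 decrease is exp(−beta); the coefficient and standard error in the usage note are hypothetical values, not study results.

```python
import math


def odds_ratio_per_unit_decrease(beta: float) -> float:
    """OR for a 1.0 z-score decrease, given a coefficient per 1.0 increase."""
    return math.exp(-beta)


def percent_change_in_odds(odds_ratio: float) -> float:
    """Percent change in odds implied by an OR: (OR - 1) x 100."""
    return (odds_ratio - 1) * 100


def wald_ci_per_unit_decrease(beta: float, se: float, z_crit: float = 1.96):
    """Approximate 95% Wald CI for the OR per 1.0 z-score decrease."""
    return (math.exp(-beta - z_crit * se), math.exp(-beta + z_crit * se))
```

For example, a hypothetical coefficient of −0.405 with a standard error of 0.20 corresponds to an OR of about 1.5 per 1.0 z-score decrease, that is, roughly 50% higher odds of being in the biomarker positive group, with the interval given by `wald_ci_per_unit_decrease(-0.405, 0.20)`.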
Secondary analyses were performed with select raw scores to ensure results were not driven by use of age-corrected scores. Secondary analyses also included the STMS to facilitate direct comparison of the CBB results to a public domain, traditional person-administered screening measure.
RESULTS
CU participants with biomarker data
Demographics and group mean comparisons
A+T+ and A+T– participants were older than A–T– participants and were comparable on education and sex (see Table 2). Performance on Lrn/WM across A+T– and A–T– groups was comparable (p = 0.27, Hedges’ g = 0.17). Lrn/WM performance in the A+T+ group was marginally lower (p = 0.07) than in the A–T– group, with a small to medium effect size (Hedges’ g = 0.35). Both of the biomarker positive groups (i.e., A+T+ and A+T–) showed significantly lower AVLT delayed recall performance (ps < 0.01) relative to the A–T– group, with medium effect sizes for both the A+T+ (Hedges’ g = 0.60) and A+T– (Hedges’ g = 0.54) comparisons. Of note, 40.6% of participants with biomarker data had prior exposure to the AVLT overall, with increasing frequency of prior exposure in the biomarker positive groups (see Table 2). Although direct comparison of the A+T+ and A+T– groups was not part of our a priori planned hypotheses, we did include some comparisons of these groups for the interested reader (see Table 2). Mean comparisons of the A+T+ and A+T– groups showed comparable performances on Lrn/WM (Hedges’ g = 0.11) and AVLT delayed recall (Hedges’ g = 0.08; both ps > 0.05).
Demographic characteristics and mean performance on memory measures among cognitively unimpaired (CU) participants
p-values represent linear model ANOVAs for mean comparisons or Pearson’s Chi-squared test for frequency comparisons of amyloid positive groups with the biomarker negative group. A, amyloid; T, tau; CBB, Cogstate Brief Battery; AVLT, Auditory Verbal Learning Test 30-Minute Delayed Recall; MOANS, Mayo’s Older Americans Normative Studies; SS, scaled score; sOBJ, subtle objective cognitive impairment. ¹Performance is considered as part of consensus diagnosis.
sOBJ frequency
For Lrn/WM, the A+T+ group showed a significantly higher frequency of sOBJ relative to the A–T– group (see Table 2). Despite this, sensitivity was very low, with 18% of A+T+ participants meeting sOBJ criteria on Lrn/WM. Only 5% of CU A–T– participants met this cut-off, thus specificity of Lrn/WM was excellent (95%). A+T– and A–T– groups showed a comparable frequency of sOBJ on Lrn/WM. For AVLT delayed recall, both of the biomarker positive groups showed a significantly higher frequency of sOBJ relative to the A–T– group. AVLT sensitivity was 45% for the A+T+ group, 25% for the A+T– group, and only 12% of A–T– participants met the sOBJ cut-off (88% specificity).
Diagnostic accuracy
Overall diagnostic accuracy of Lrn/WM for differentiating both A+T+ and A+T– from A–T– participants was very low and not better than chance (see Table 3). Diagnostic accuracy for the AVLT was low, but AVLT delayed recall discriminated both biomarker positive (A+T+ and A+T–) groups from the A–T– group better than chance (CIs did not include 0.50). Comparing total AUC values for both measures, AVLT delayed recall differentiated A+T– and A–T– groups significantly better than Lrn/WM Composite (p = 0.03). However, both measures comparably differentiated A+T+ and A–T– groups (p = 0.40).
Diagnostic accuracy of memory measures based on derived, optimal cut-off scores and conventional cut-off scores (≤–1 SD) indicative of subtle objective cognitive impairment (sOBJ)
A, amyloid; T, tau; CU, cognitively unimpaired; CBB, Cogstate Brief Battery; Lrn/WM, Learning/Working Memory; AVLT, Auditory Verbal Learning Test 30-Minute Delayed Recall; MOANS, Mayo’s Older Americans Normative Studies. ¹Performance is considered as part of consensus diagnosis.
Logistic regression analyses controlling for age, sex, and education confirmed that these diagnostic accuracy results cannot be fully explained by age differences across groups (see Table 4). AVLT delayed recall significantly separates A+T– and A–T– groups, above and beyond age, education and sex. For differentiation of A+T+ and A–T– groups, the pattern of results remained consistent with frequency and diagnostic accuracy analyses for AVLT delayed recall, but the p-value was at trend level (p = 0.08). Lrn/WM did not significantly separate groups, confirming the above analyses.
Odds ratios from logistic regression models after adjusting for age, education, and sex
For logistic regression analyses, AVLT delayed recall scaled score was rescaled to z-score units to aid interpretation. For every 1.0 decrease in z-score, the odds of being in the biomarker positive group are multiplied by the OR reported above. A, amyloid; T, tau; CU, Cognitively Unimpaired; CBB, Cogstate Brief Battery; AVLT, Auditory Verbal Learning Test 30-Minute Delayed Recall. AVLT performance is considered as part of consensus diagnosis.
To investigate whether the pattern of results would change if raw scores were used and to ensure that the sensitivity of Lrn/WM was not artificially decreased by including a measure of working memory in the composite, we completed secondary analyses using One Card Learning accuracy raw scores and z-scores, as well as AVLT delayed recall raw scores (see Fig. 1 and Table 5). Total AUC values were similar between the Lrn/WM Composite and OCL z-score, between OCL z-score and raw score, and between AVLT delayed recall scaled score and raw score. In secondary analyses, the STMS did not differentiate A–T– and A+T+ groups better than chance. Overall diagnostic accuracy of the STMS for differentiating A+T– from A–T– participants was low, but better than chance (CI did not include 0.50). Sensitivity and specificity were low to poor for the optimally derived cut-off (A+T– versus A–T–), suggesting limited clinical utility.

AVLT 30-minute delayed recall sensitivity/specificity plot for differentiating CU A–T– and CU A+T+ groups. Panel A shows age-corrected scaled scores and panel B shows raw scores. The dotted line indicates the derived optimal cut-off per Youden’s index, which jointly maximizes the sum of sensitivity and specificity.
Diagnostic accuracy of secondary measures
A, amyloid; T, tau; OCL, One Card Learning; AVLT, Auditory Verbal Learning Test 30-Minute Delayed Recall; STMS, Short Test of Mental Status. ¹Performance is considered as part of consensus diagnosis.
CU participants with follow-up data
Demographics and group mean comparisons
Converter and stable CU groups were comparable on age, education, sex, and years of follow-up available (see Table 6). Time to conversion was 2.7 years after baseline Cogstate, on average. The CU converter group showed lower performance relative to the stable CU group (ps < 0.001) on Lrn/WM (Hedges’ g = 0.59) and AVLT delayed recall (Hedges’ g = 0.72). Across CU participants with follow-up data, 57.0% of participants had prior AVLT exposure overall, with comparable frequency of prior exposure across stable and converter groups.
Demographic characteristics and mean performance on memory measures among CU stable and converter participants
p-values represent linear model ANOVAs for mean comparisons or Pearson’s Chi-squared test for frequency comparisons. MOANS, Mayo’s Older Americans Normative Studies; SS, scaled score; sOBJ, subtle objective cognitive impairment. ¹Performance is considered as part of consensus diagnosis.
sOBJ frequency
The CU converter group showed a significantly higher frequency of sOBJ on both Lrn/WM and AVLT relative to the CU stable group (see Table 6). Using a conventional ≤–1 SD cut-off, Lrn/WM showed 25% sensitivity for predicting conversion. Only 12% of CU stable participants met the sOBJ cut-off for Lrn/WM, thus specificity was good (88%). AVLT delayed recall had slightly higher sensitivity (38%) for predicting conversion. Only 14% of CU stable participants met the sOBJ cut-off for AVLT delayed recall, thus specificity was good (86%).
Prognostic accuracy for predicting incident MCI/dementia
Prognostic accuracy of baseline memory performance for differentiating CU stable and CU converter participants was moderate based on total AUC and better than chance (total AUC CIs did not include 0.5) for both Lrn/WM and AVLT (see Table 3).
DISCUSSION
The main finding of this study is that sOBJ, defined based on a single administration of a traditional neuropsychological memory test (AVLT), may help identify CU individuals with 1) preclinical AD as defined by neuroimaging biomarkers of amyloid and tau, and 2) incident MCI/dementia. In accordance with the primary study aim to determine whether baseline CBB performance can help identify CU participants with preclinical AD, a single CBB assessment was not useful in predicting elevated brain amyloid either alone or in combination with tau (A+T– or A+T+). However, CBB Lrn/WM Composite significantly contributed to predicting conversion from CU at baseline to MCI/dementia over follow-up.
Our finding that a single baseline memory test performance could identify 45% of CU individuals with a biological diagnosis of AD based on neuroimaging biomarkers of amyloid and tau (A+T+) and 25% with AD pathologic change alone (A+T–) was somewhat unexpected. Prior studies have shown that individuals with AD pathologic change (A+) but unknown T status do not show impairment in verbal memory when assessed on a single occasion [32–34]. Consistent with our hypothesis, we demonstrated that CU individuals with a biological diagnosis of AD based on positive amyloid and tau biomarkers (A+T+) showed a higher frequency of sOBJ relative to CU A–T– individuals on AVLT delayed recall. However, despite our prediction to the contrary, CU individuals with AD pathologic change alone (A+T–) also showed a higher frequency of sOBJ on AVLT delayed recall. Prior reports have largely shown small effect sizes in studies comparing CU individuals with and without positive amyloid status, with meta-analytic results ranging from –0.14 [2] to –0.17 [1]. Based on a smaller pool of 7 studies using the 2011 preclinical AD framework [35], a medium effect size (–0.47) was demonstrated for Stage 2 (A+ and evidence of either positive neurodegeneration or tau) versus Stage 0 [1]. Our findings suggest a medium effect size on AVLT delayed recall for both CU individuals with AD (A+T+) and AD pathologic change (A+T–) relative to CU individuals without AD biomarkers (A–T–).
Inconsistent with our hypothesis, the total AUC was significantly greater for AVLT delayed recall than for Lrn/WM Composite for differentiating A+T– and A–T– CU groups. Statistical comparison of the total AUC suggested that both AVLT delayed recall and the Lrn/WM Composite comparably differentiated A+T+ and A–T– groups. However, overall diagnostic accuracy of the Lrn/WM Composite was not better than chance and only 18% of the A+T+ group met the sOBJ criterion, suggesting very limited clinical utility. Thus, when considering single assessment results, the AVLT shows greater clinical utility than the CBB for predicting biomarker positive status. Word list memory tests typically have greater sensitivity to MCI and AD dementia relative to other memory testing paradigms, particularly recognition memory [36]. Thus, this difference in clinical utility may simply reflect the type of memory measure used (word list delayed recall versus visual recognition memory) and does not necessarily imply a difference between traditional and computerized cognitive testing. Importantly, the CBB was designed with a focus on sensitivity to detecting change versus sensitivity for use at a single time point. However, given the CBB’s burgeoning use as a cognitive screening measure for clinical trials and for patient care, a better understanding of its clinical utility based on one time point is important. Further, although the CBB has shown some promise in identifying risk of elevated brain amyloid based on memory decline over time on OCL accuracy [15], the initial data presented in this regard suffered from some circularity in study design and there has not been a direct comparison of the added benefit of longitudinal CBB assessment relative to baseline CBB or baseline traditional memory measures.
We plan to directly compare the sensitivity of cross-sectional and longitudinal approaches to defining sOBJ and subtle objective cognitive decline in future work once additional follow-up data have been collected. Relatively few studies have used true longitudinal change over time to define objective subtle cognitive decline [15, 37] even though intra-individual decline is expected to provide a more sensitive means for detecting preclinical cognitive changes.
Although this study presents data regarding the diagnostic and prognostic accuracy of memory measures, memory measures are non-specific with regard to etiology. Low memory performance can be due to a number of factors including numerous other biological etiologies, demographic variables, and transient situational factors. Clinical diagnosis remains the gold standard for describing clinical syndromes. However, the need for data-driven and efficient approaches to identifying risk of preclinical AD based on cognitive measures or other methods is paramount to ongoing clinical trial efforts, and for identifying individuals who will most benefit from treatment once a viable treatment is identified. Predictive algorithms also show some promise in helping to predict the likelihood of elevated brain amyloid. However, in addition to data from in-person cognitive assessments that are typically included in these algorithms, additional information such as genetic testing (i.e., APOE genotype) and/or knowledge of other medical conditions [38, 39] is also often required, and thus these algorithms may not be as simple and efficient as cognitive-based methods alone.
Our results showed that AVLT delayed recall alone could help predict, better than chance, which CU individuals had elevated brain amyloid. Although the total AUC values of 66% for A+T+ and 64% for A+T– are “low” per traditional standards, these results are promising for discriminating among CU individuals. For example, a recent study reported an AUC of 66% for plasma Aβ42 alone and 68% for plasma Aβ42/Aβ40 ratio for discriminating subjects with normal versus abnormal amyloid PET scans in CU individuals with subjective cognitive decline [40]. In addition, the sensitivity of the AVLT in our study is likely artificially lowered in two ways. First, a portion of the sample had prior exposure to the AVLT; thus, unaccounted-for practice effects likely led to slightly improved performance on the assessment used for these participants [41]. Second, AVLT performance was considered during the consensus diagnosis conference, and participants were nonetheless given a CU diagnosis, which would decrease the apparent sensitivity of this measure. Within ADNI, Kandel et al. [42] also demonstrated that AVLT performance helps predict CSF Aβ1–42 among patients with MCI, with predictive utility comparable to neuroimaging biomarkers (FDG-PET and hippocampal volume).
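To make the AUC values discussed above concrete, the total AUC can be read as the probability that a randomly selected biomarker-positive participant scores lower on the memory measure than a randomly selected biomarker-negative participant, with ties counted as one half. The following is a minimal sketch of that interpretation only; it is not the analysis code used in this study, and the function name and all scores are hypothetical illustrations.

```python
def auc_lower_is_positive(positive_scores, negative_scores):
    """Rank-based (Mann-Whitney) AUC, assuming lower scores indicate
    the biomarker-positive group. Ties contribute 0.5."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p < n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical age-corrected z-scores, NOT data from this study.
biomarker_positive = [-1.5, -0.5, 0.0, 0.5]
biomarker_negative = [-0.5, 0.0, 0.5, 1.0, 1.5]

auc = auc_lower_is_positive(biomarker_positive, biomarker_negative)
print(f"AUC = {auc:.3f}")  # 0.5 would indicate chance-level discrimination
```

Under this reading, an AUC of 66% means that in roughly two of three randomly drawn positive/negative pairs the biomarker-positive participant has the lower score.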
A secondary aim of this study was to determine whether performance on the CBB Lrn/WM Composite at baseline could predict conversion from CU to MCI/dementia. As a reference, we also examined AVLT delayed recall performance. Despite the circularity resulting from the AVLT being considered in the diagnosis of MCI, the results were encouraging as both measures achieved moderate overall diagnostic accuracy, with total AUC of 70%. Other cognitive screening measures have similar predictive accuracy values. Both the Montreal Cognitive Assessment (MoCA) and STMS performed similarly for detecting incident MCI with AUCs of 0.70 and 0.71, respectively [43]. Our secondary analyses confirm that the STMS performed similarly within this sample, with a total AUC of 0.74. However, STMS performance is considered as part of the consensus diagnosis. Rizk-Jackson and colleagues [44] found that MRI volumes and FDG PET measures alone showed 65% overall accuracy to predict conversion to MCI within two years, although a model combining MRI and FDG-PET measures achieved 81% total accuracy. Our results are particularly notable given that prior research in this area demonstrated similar AUC values when employing numerous clinical variables. Prediction of incident MCI in CU individuals from the NACC database using 14 non-invasive, clinical variables including performance on various neuropsychological measures resulted in 75% total AUC [45]. Despite our encouraging AUC results for predicting incident MCI/dementia with a single administration of the CBB, derived optimal cut-offs for the CBB fall well within the range of normal performance, complicating meaningful clinical application.
Although this study provides a unique look at the frequency of sOBJ in CU individuals with a biological diagnosis of AD, AD pathologic change, and incident MCI/dementia, there are also important limitations. Most notably, biomarker positive groups were significantly older than biomarker negative groups. Although age-corrected normative scores help account for the substantial age differences across groups, this may have had unintended consequences. For example, biomarker status is known to be significantly associated with age, which precluded use of an age-matched sample for biomarker groups. We aimed to further address this important confound by fitting logistic regression models that controlled for age, education, and sex, and the pattern of results did not change. Also of note, although we report analyses for 3:1 age, sex, and education-matched groups for identification of CU stable versus CU converter individuals, unreported preliminary results without this matching procedure showed equivalent findings. Finally, because the current study’s primary focus was on the CBB, a sizeable portion of AVLT performances were derived from individuals who had already been exposed to the AVLT. Application of normative data based on baseline performance therefore likely reduced the sensitivity of AVLT results by failing to take into account the known influence of practice effects [46, 47]. Of note, this may be particularly relevant for the biomarker subgroups, as both the A+T+ and A+T– groups were more likely to have had prior exposure to the AVLT relative to the A–T– group. Prior work by our group showed that among CU individuals, biomarker subgroups may be differentially sensitive to practice effects, with A–N– and A+N– subgroups both showing continued practice effects on memory measures across multiple visits, whereas A+N+ and A–N+ showed relatively stable performances over time [48].
Future work considering the important role of practice effects when optimizing definitions of sOBJ and ΔOBJ is needed.
Despite these limitations, this study contributes to the rapidly growing body of research focused on how to best define sOBJ in preclinical AD and fosters further research in the area. For instance, results suggest that normative data with better sensitivity to preclinical AD are needed. Alternative cut-offs may also need to be considered. For example, a –0.5 SD cut-off may be appropriate for sOBJ [49, 50]. Knopman and colleagues [51] showed that individuals with sOBJ based on a –0.5 SD cut-off have an increased risk of incident MCI/dementia. Use of norms that adjust for sex and education, in addition to age, is recommended for future studies but was not an option for the independent normative data used in this study from the MOANS and Cogstate [21, 29]. Intra-individual definitions of objective subtle cognitive decline may also enhance the sensitivity of cognitive measures to preclinical AD [37] and represent an important future direction.
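The practical effect of moving the sOBJ cut-off can be illustrated with a short sketch. It assumes scores are already expressed as age-corrected z-scores on a single measure and simply counts the proportion of participants at or below a given cut-off. The function name and the z-scores are hypothetical and are not drawn from this study's data.

```python
def sobj_frequency(z_scores, cutoff=-1.0):
    """Proportion of participants whose z-score is at or below the cut-off
    (i.e., who would meet a single-measure sOBJ criterion)."""
    flagged = sum(1 for z in z_scores if z <= cutoff)
    return flagged / len(z_scores)


# Hypothetical age-corrected z-scores for eight participants.
z_scores = [-1.6, -1.0, -0.7, -0.5, -0.2, 0.1, 0.4, 0.9]

rate_1sd = sobj_frequency(z_scores, cutoff=-1.0)     # stricter criterion
rate_half_sd = sobj_frequency(z_scores, cutoff=-0.5)  # more liberal criterion
print(rate_1sd, rate_half_sd)
```

A more liberal cut-off flags more participants by construction, so any gain in sensitivity to preclinical AD would need to be weighed against the larger proportion of biomarker-negative individuals who would also be flagged.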
In summary, results suggest that a single CBB session may show some utility in predicting conversion to MCI/dementia over follow-up, but is unlikely to be helpful in predicting which CU individuals are more likely to have AD pathologic change or AD based on AD neuroimaging biomarkers of amyloid and tau. In contrast, a traditional word list memory test shows promise in this regard, particularly if normative data can be refined to enhance sensitivity.
ACKNOWLEDGMENTS
The authors wish to thank the participants and staff at the Mayo Clinic Study of Aging. This work was supported by the Rochester Epidemiology Project (R01 AG034676), the National Institutes of Health (grant numbers P50 AG016574, P30 AG062677, U01 AG006786, R37 AG011378, R01 AG041851, RF1 AG55151), a grant from the Alzheimer’s Association (AARG-17-531322), Zenith Award from the Alzheimer’s Association, the Robert Wood Johnson Foundation, The Elsie and Marvin Dekelboum Family Foundation, Alexander Family Alzheimer’s Disease Research Professorship of the Mayo Clinic, Liston Award, Schuler Foundation, GHR Foundation, AVID Radiopharmaceuticals, and the Mayo Foundation for Education and Research. We would like to greatly thank AVID Radiopharmaceuticals, Inc., for their support in supplying AV-1451 precursor, chemistry production advice and oversight, and FDA regulatory cross-filing permission and documentation needed for this work. NHS and MMMi serve as consultants to Biogen and Lundbeck. DSK serves on a Data Safety Monitoring Board for the DIAN-TU study and is an investigator in clinical trials sponsored by Lilly Pharmaceuticals, Biogen, and the University of Southern California. RCP has served as a consultant for Hoffman-La Roche Inc., Merck Inc., Genentech Inc., Biogen Inc., Eisai, Inc., and GE Healthcare.
