Episodic Memory and Learning Dysfunction Over an 18-Month Period in Preclinical and Prodromal Alzheimer’s Disease

Abstract

Recent meta-analyses suggest that episodic memory impairment associated with preclinical Alzheimer’s disease (AD) equates to 0.15–0.24 standard deviations below that of cognitively healthy older adults. The current study aimed to characterize impairments in verbal acquisition and recall detectable at a single assessment, and to investigate how verbal learning and episodic memory deteriorate in preclinical AD. A verbal list-learning task, the International Shopping List Test (ISLT), was administered multiple times over an 18-month period to three groups of participants: amyloid-beta negative healthy older adults (Aβ– CN; n = 50); Aβ+ healthy older adults (preclinical AD; n = 25); and Aβ+ individuals diagnosed with mild cognitive impairment (prodromal AD; n = 22). At baseline, there was no significant difference between the preclinical AD and control groups in rate of acquisition, or in total and delayed recall; however, all indices were impaired in prodromal AD. Performance on ISLT total score improved in the control group over the 18-month period, but showed a moderate magnitude decline in the preclinical AD group (Cohen’s d = – 0.63, [– 1.12, – 0.14]) and the prodromal AD group (Cohen’s d = – 0.36, [– 0.94, 0.22]). No significant impairment in acquisition associated with preclinical AD was seen at baseline. Individuals with preclinical AD showed a significantly different trajectory of performance on the ISLT total score over an 18-month period, compared to those without abnormal Aβ. Individuals with prodromal AD showed substantial impairment on the ISLT at baseline and declined to a greater extent over time.

Keywords

Alzheimer’s disease, cognitive decline, learning curve, memory and learning tests, mild cognitive impairment, neuropsychology, transfer of learning

^lCo-senior authors.

INTRODUCTION

Neuropsychological models of Alzheimer’s disease (AD) emphasize episodic memory dysfunction as a core clinical manifestation of both dementia and its preclinical and prodromal stages. Consequently, assessment of episodic memory is central to the detection of the disease clinically, as well as for monitoring its progression. In non-demented older adults, accumulation of amyloid-β (Aβ) plaques and aggregation of hyperphosphorylated tau are pathological hallmarks of AD that are associated with brain volume loss and subtle cognitive decline that can begin up to 30 years before dementia [1, 2]. This long preclinical period provides an opportunity for understanding the clinical development of AD. As such, understanding how and when elevated Aβ disrupts episodic memory is a central question to understanding the pathogenesis of AD.

In preclinical AD, where abnormally high levels of Aβ (Aβ+) occur in the absence of clinical impairment, neuropsychological studies show that while episodic memory is impaired, the magnitude of impairment is relatively small (e.g., Cohen’s d of 0.15; [3]). By definition, episodic memory impairment is much greater in prodromal AD (Aβ+ patients with mild cognitive impairment [MCI]), typically 1.5 standard deviations below age-matched normative data [4]. In both preclinical and prodromal AD, episodic memory dysfunction becomes clearer over time, with progressive decline evident within prospective studies of preclinical and prodromal AD (d’s 0.20–0.50). In contrast, adults with low Aβ levels (Aβ–) show no deterioration, or even some improvement, in performance on episodic memory tests over the same periods [5–8].

The importance of time and of serial assessments for detecting Aβ+ related memory impairment in non-demented older adults raises the possibility that the time-dependent learning processes required within a single administration of an episodic memory test may be more sensitive to early AD than summary scores derived from the same performance. For example, verbal list learning tests require individuals to learn a set of stimuli over multiple (typically 3–5) trials, with recall of these items required after a brief delay [9]. By considering the rate of learning over trials as well as the related savings score, impairments in episodic memory may become apparent that are not evident when summary scores such as total and delayed recall are used. In older adults with MCI, learning curves derived from verbal list learning tests are flattened compared to those from matched healthy controls [10–13]. In these MCI groups, savings scores are typically 50% lower than in age-matched controls [14, 15]. Thus, consideration of time-dependent learning curves and savings scores may provide greater sensitivity to Aβ+ related memory impairment than is currently seen when using summary immediate and delayed recall scores.

Where appreciation of Aβ+ related memory disruption requires that verbal list learning tests be applied on multiple occasions, the extent to which performance on these tests changes with repeated assessment in Aβ– adults also becomes important. For many standardized verbal list learning tests, repeated administration in older adults is associated with some improvement in performance (i.e., a practice effect), with the magnitude of this varying as a function of the number of administrations, time between administrations, and number or equivalence of the alternative forms [16–18]. Consequently, these practice effects can obscure the presence and magnitude of Aβ+ related memory decline [18]. Therefore, to detect Aβ+ related decline in episodic memory, it would be optimal to use measures of episodic memory for which repeated administration does not give rise to practice effects and instead yields stable and reliable estimates of cognition [19].

Recently we developed a computerized verbal list learning test with multiple parallel forms to enable repeated administration over short periods of time (International Shopping List Test; ISLT [19]). Unlike other list learning tasks, for example, the California Verbal Learning Test (CVLT-second edition; [20]), on each administration of the ISLT each word in the list is generated pseudo-randomly and without replacement from a library of validated stimuli (n = 128). This means that at each assessment, participants receive lists consisting of different words, and any words used in an assessment are not repeated upon subsequent assessments. This method allows many alternate forms to be generated and controls systematic error that can occur with different yet static word lists [19, 22]. This use of random word lists allows the ISLT to be given repeatedly at relatively short retest intervals without resulting in practice effects, and to generate data with high reliability (e.g., r’s for total score >0.80) in both healthy younger and older adults and also in MCI and AD patients [22, 23]. Further, ISLT outcome measures can be separated into specific aspects of memory (e.g., acquisition and retention; [22]). Quantitative and qualitative analysis of ISLT performance may therefore be useful to clarify the nature and magnitude of episodic memory impairment in the earliest stages of AD. However, the extent to which Aβ+ disrupts ISLT performance at baseline (over learning trials) or over repeated assessments in both preclinical and prodromal individuals remains unknown.

The first aim of this study was to determine whether a greater focus on time dependent aspects of learning and memory could improve the sensitivity of verbal list learning, over summary recall scores, to Aβ+ related memory impairment in early AD. The second aim was to understand the nature and magnitude of Aβ+ related disruption to verbal learning and memory over 10 assessments of the ISLT over 18 months. We hypothesized that Aβ+ would be associated with disruption to the acquisition and delayed recall of information in early AD, with this impairment greater than that seen using summary scores. We also hypothesized that Aβ+ would be associated with a deterioration in verbal learning and memory over 18 months, while no change would be evident in Aβ– individuals over the same period.

METHODS

Participants

Participants in the current sample were recruited from the Australian Imaging Biomarkers and Lifestyle Rate of Change Sub-study (AIBL-ROCS), a prospective cohort study of 196 older adults, either cognitively normal or diagnosed with MCI or AD, in which multiple assessments occur between routine AIBL assessments. Recruitment, inclusion/exclusion criteria, and measures used in the AIBL study have been described previously [23, 24]. Briefly, participants undergo neuropsychological, medical, psychiatric and neurological examination at 18-month intervals [24, 25]. Clinical classification has been described previously [24]; briefly, consensus diagnoses were assigned to participants based on Winblad and Petersen criteria for MCI and NINCDS-ADRDA criteria for AD [26, 27]. Clinical diagnoses were made blind to Aβ neuroimaging results. For the purposes of the current study, clinical groups were based on combined clinical and neuroimaging results of the AIBL-ROCS; therefore, 68 of the total sample were removed, as they had no imaging data. Also excluded were 12 individuals with a diagnosis of dementia, 12 individuals with a diagnosis of MCI who were Aβ–, and healthy participants who scored outside the bounds of clinically normal (n = 7). Therefore, the sample comprised 50 Aβ– cognitively normal (CN) older adults, 25 Aβ+ CN older adults (preclinical AD), and 22 Aβ+ adults with MCI (prodromal AD). Demographic and clinical characteristics of these groups are shown in Table 1.

Table 1

Baseline demographic and clinical characteristics of each clinical group

Characteristic	Aβ– CN (n = 50)	Preclinical AD (n = 25)	Prodromal AD (n = 22)
% Female	60%	52%	64%
% APOE ɛ4	10%	24%	55%^**
Age	70.72 (5.81)	75.24 (9.47)^*	79.45 (6.13)^**
Premorbid IQ	109.48 (5.02)	109.04 (7.50)	109.41 (5.84)
MMSE	29.44 (0.73)	28.96 (1.31)	27.36 (1.59)^**
CDR Sum of boxes	0 (0.5)	0 (0)	0.5 (3)^**
HADS Depression	2.44 (2.43)	1.55 (1.34)	3.95 (2.95)^*
HADS Anxiety	4.38 (2.42)	3.00 (2.56)	4.05 (2.28)
CVLT-II Total	55.30 (11.03)	53.38 (9.75)	36.34 (9.80)^**
CVLT-II Delayed	12.65 (4.31)	12.14 (3.80)	5.06 (3.85)^**
LM Immediate	13.71 (5.02)	13.81 (4.50)	8.97 (4.50)^**
LM Delay	12.44 (5.67)	12.44 (5.00)	6.07 (5.02)^**

Note. Means (SD). CDR-SB reported as median and range. Reference group for all pairwise comparisons was the Aβ– CN group. MMSE, Mini-Mental Status Examination; CDR-SB, Clinical dementia rating scale, sum of boxes; HADS, Hospital anxiety and depression scale; CVLT-II, California Verbal Learning Test, Second edition; LM, Wechsler Logical memory. ^*p < 0.05; ^**p < 0.001.

Assessments

Demographic and clinical characteristics

Demographic information was collected at the baseline assessment. Age, gender, and medical history were self-reported. The Wechsler Test of Adult Reading (WTAR; [28]) was used to estimate the premorbid intelligence of participants. Clinical disease severity was rated using the Clinical Dementia Rating scale (CDR; [29]). The Mini Mental State Examination (MMSE; [30]) was used to screen for global cognitive function. APOE genotype was determined from genotyping of blood. Assessment of depressive and anxiety symptoms was conducted using Hospital Anxiety and Depression Scale (HADS; [31]).

Neuroimaging

Aβ imaging with positron emission tomography (PET) was conducted using one of three radioligands: Pittsburgh Compound B (PiB), florbetapir, or flutemetamol. The acquisition protocol for each radioligand has been detailed previously [25]. Briefly, a 30-min acquisition was started 40 min after PiB injection, and 20-min acquisitions were performed 50 min after florbetapir injection and 90 min after flutemetamol injection. For PiB acquisition, standardized uptake value (SUV) data for key regions of interest were summed and normalized to the cerebellar cortex SUV. This resulted in a region-to-cerebellar ratio which was termed SUV ratio (SUVR). For florbetapir, SUVR was generated using the whole cerebellum as the reference region and for flutemetamol, the pons was used as the reference region [32]. Consistent with previous studies, Aβ status was classified as either low (Aβ–) or high (Aβ+). For PiB, an SUVR threshold of ≥1.4 was used [32, 33]. For florbetapir and flutemetamol, SUVR thresholds of ≥1.1 and ≥0.62, respectively, were employed to discriminate between Aβ– and Aβ+, in accord with results of phase III studies.
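The tracer-specific classification rule described above can be illustrated with a minimal sketch. The threshold values are those given in the text; the dictionary and function names are hypothetical and are not part of the AIBL processing pipeline.

```python
# Illustrative sketch only. Thresholds come from the text; all names are hypothetical.
SUVR_THRESHOLDS = {
    "PiB": 1.4,            # reference region: cerebellar cortex
    "florbetapir": 1.1,    # reference region: whole cerebellum
    "flutemetamol": 0.62,  # reference region: pons
}


def is_amyloid_positive(tracer: str, suvr: float) -> bool:
    """Return True (Aβ+) when the SUVR meets or exceeds the tracer's threshold."""
    if tracer not in SUVR_THRESHOLDS:
        raise ValueError(f"Unknown tracer: {tracer}")
    return suvr >= SUVR_THRESHOLDS[tracer]


print(is_amyloid_positive("PiB", 1.52))          # above the PiB threshold
print(is_amyloid_positive("florbetapir", 1.05))  # below the florbetapir threshold
```

Because each tracer uses a different reference region, SUVR values are only comparable to the threshold for the same tracer, which is why the lookup is keyed by tracer name.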

The International Shopping List Test (ISLT)

The ISLT is a 12-item verbal list-learning task consisting of three learning trials and a delayed recall trial [19]. Outcome measures include total number of words recalled at each trial, delayed recall, and a savings score (delayed recall/trial 3), where higher scores equal better performance [22]. For each administration of the ISLT the computer program draws 12 words pseudo-randomly (without replacement) from a set of validated stimuli (n = 128). Thus, participants receive lists consisting of different words, with words used in one assessment, not repeated upon subsequent assessments for that individual. The reliability of the total and delayed recall scores from the ISLT in healthy populations, as well as those with MCI and AD is uniformly high (r_ICC = 0.83– 0.94), while trial by trial reliability is slightly lower (r_ICC = 0.69– 0.92) [19]. Performance on the ISLT was not used in the classification of clinical disease status for any participant.
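The list generation and scoring described above can be sketched as follows. This is an illustration under stated assumptions, not the ISLT software: the word pool is a set of placeholder strings, the total score is assumed to be the sum of the three learning trials, and the savings score is assumed to be expressed as a percentage (consistent with the scale of the values in Table 2). All function names are hypothetical.

```python
import random

POOL_SIZE = 128   # size of the validated stimulus library (from the text)
LIST_LENGTH = 12  # words per ISLT administration (from the text)


def draw_lists(pool, n_sessions, seed=None):
    """Draw one 12-word list per session, never reusing a word across sessions."""
    if n_sessions * LIST_LENGTH > len(pool):
        raise ValueError("Not enough unused words for that many sessions")
    rng = random.Random(seed)
    shuffled = rng.sample(pool, len(pool))  # pseudo-random order, without replacement
    return [shuffled[i * LIST_LENGTH:(i + 1) * LIST_LENGTH]
            for i in range(n_sessions)]


def islt_scores(trial1, trial2, trial3, delayed):
    """Summary measures; total = sum of learning trials (assumption)."""
    total = trial1 + trial2 + trial3
    # Savings = delayed recall / trial 3, expressed here as a percentage (assumption).
    savings = 100.0 * delayed / trial3 if trial3 > 0 else float("nan")
    return {"total": total, "delayed": delayed, "savings": savings}


pool = [f"word_{i:03d}" for i in range(POOL_SIZE)]  # placeholder stimuli
lists = draw_lists(pool, n_sessions=10, seed=1)
print(len(lists), len({w for lst in lists for w in lst}))
print(islt_scores(7, 9, 10, 9))
```

With 128 stimuli and 12 words per list, at most 10 fully non-overlapping lists can be drawn for one individual (10 × 12 = 120 ≤ 128, whereas 11 × 12 = 132 > 128).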

AIBL verbal learning and memory tests

Classification of verbal episodic memory impairment in the AIBL cohort is based on performance on the California Verbal Learning Test, Second Edition (CVLT-II) and on a modified administration of the Logical Memory (LM) subtest of the Wechsler Memory Scale – Revised (WMS -R). The CVLT-II is a 16-item verbal list learning task consisting of five learning trials, immediate free and cued recall trials, and delayed recall and recognition trials [20]. For the purposes of this study, the total of the five learning trials and delayed free recall trial were used. Higher scores indicate better performance. The LM subtest is a paragraph story recall task in which participants are read a short story and are required to repeat as much information as they can remember, both immediately after being read the story (LM-I) and after a 30-min delay (LM-II; [28]). Both tests were administered in the context of the entire AIBL neuropsychological battery [24] and both are used for clinical disease staging in the study.

Procedure

The procedure for the AIBL-ROCS has been described previously [23]. Briefly, participants from the AIBL study were recruited for a series of repeated assessments over short test-retest intervals over an 18-month period. Participants were assessed once per month for four months, then at six months, and then every three months up to 18 months. This procedure was utilized in part to mirror the designs used in clinical trials of new medicines for AD, where in some cases repeated assessments occur at shorter retest intervals which then increase with time (e.g., [34, 35]). An initial training assessment was conducted to prepare participants for regular home visits and to familiarize them with the computerized version of a list-learning task, and therefore the second assessment was considered the baseline. Assessments were conducted either at the participants’ home, a location nearby, or at the Mental Health Research Institute, in Parkville, Australia. During the time between the initial list presentation of the ISLT and the delayed recall trial (approximately 30 min), several other non-verbal computerized cognitive tasks were administered.

Data analysis

All analyses were conducted in RStudio [36] using R version 3.3.1 [37]. Packages used included “afex” for analysis of repeated measures [38], “lme4” for linear mixed model analysis [39], and “ggplot2” for data visualization [40].

Differences in demographic or clinical characteristics

Differences between groups on relevant demographic and clinical characteristics were explored using a series of one-way ANOVAs for continuous variables and chi-square tests for non-continuous and categorical variables (e.g., CDR, APOE status, and sex). Characteristics for which group differences were identified at the p < 0.05 threshold were then added to subsequent analyses as covariates.

Group differences in episodic memory at baseline

To test the hypothesis that learning and recall in preclinical AD would be qualitatively similar but quantitatively different from prodromal AD, while also different from Aβ– CN adults, the analysis proceeded in two stages. First, learning curves on the ISLT trials including the delayed recall trial were compared between groups by submitting the number of words recalled on each trial to a 3×4 (group [Aβ– CN, preclinical AD, prodromal AD]×trial [One, Two, Three, Delayed]) mixed design ANOVA (formula = score ∼ group + age + apoe + hadsd + Error(id/(trial))). Statistically significant group×trial interactions were decomposed using interaction contrasts. Generalized eta-squared ( $η_{G}^{2}$ ) was used to represent the magnitude of the interaction. Second, summary ISLT performance measures, derived from those measures showing sensitivity to Aβ+ in the learning curve analyses, were compared between groups using a series of planned comparisons set within a one-way ANOVA, i.e., between Aβ– CN and preclinical AD, and between preclinical and prodromal AD. Measures of effect size (Cohen’s d) were used to express the magnitude of difference from the comparison group for each measure.
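The text does not state which formula was used for Cohen’s d or its confidence interval. As a worked illustration, the sketch below assumes the pooled-standard-deviation form of d and a large-sample normal approximation for the 95% interval; the function name is hypothetical. Applied to the ISLT total scores for the Aβ– CN and preclinical AD groups in Table 2, these assumptions give values that agree with the reported effect size to two decimal places.

```python
import math


def cohens_d_with_ci(mean_a, sd_a, n_a, mean_b, sd_b, n_b, z=1.96):
    """Cohen's d (group b minus group a) with an approximate 95% CI.

    Assumes a pooled SD and a normal approximation to the sampling
    distribution of d; the source does not confirm either choice.
    """
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    d = (mean_b - mean_a) / math.sqrt(pooled_var)
    se = math.sqrt((n_a + n_b) / (n_a * n_b) + d ** 2 / (2 * (n_a + n_b)))
    return d, d - z * se, d + z * se


# ISLT total at baseline: Aβ– CN (n = 50) versus preclinical AD (n = 25), Table 2.
d, lower, upper = cohens_d_with_ci(26.92, 5.02, 50, 26.78, 4.55, 25)
print(f"d = {d:.2f} [{lower:.2f}, {upper:.2f}]")  # d = -0.03 [-0.51, 0.45]
```

Because the tabled means are covariate-adjusted and rounded, recomputing other rows this way may differ from the reported values in the second decimal place.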

Change in episodic memory over time

To test the second hypothesis that assessing change in memory over time using multiple repeated assessments would allow characterization of Aβ+ related memory change in preclinical AD, the analysis again proceeded in two stages. First, the three main performance measures for the ISLT, identified from the baseline analyses, were submitted to a series of linear mixed model (LMM) analyses using maximum likelihood estimation and an unstructured covariance matrix. In the LMM, group, time, and the group×time interaction were entered as fixed factors; participant was entered as a random factor; and age, APOE ɛ4 status, and the depression subscale of the HADS were entered as covariates (formula = score ∼ group*time + age + apoe + hadsd + (1|id)). From these analyses, the magnitude of the difference between groups in the estimated rate of change for each summary ISLT measure was expressed as Cohen’s d. Second, because performance on verbal list learning tests can be skewed by ceiling effects in healthy populations and by floor effects in those with objective memory impairment, which may bias parametric statistical methods, we repeated the analyses using non-parametric quantile regression. This allowed investigation of the presence of distribution biases within the ISLT performance measures (see Supplementary Material).
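Written out, the model formula quoted above corresponds to a random-intercept model of the following general form. This is a standard reading of the formula syntax rather than a specification taken from the study; the subscripts ($i$ for participant, $j$ for assessment) and coefficient labels are introduced here for illustration, and because group has three levels, $\beta_1$ and $\beta_3$ each stand for two contrast coefficients.

$$
\text{score}_{ij} = \beta_0 + \beta_1\,\text{group}_i + \beta_2\,\text{time}_{ij} + \beta_3\,(\text{group}_i \times \text{time}_{ij}) + \beta_4\,\text{age}_i + \beta_5\,\text{apoe}_i + \beta_6\,\text{hadsd}_i + u_i + \varepsilon_{ij}
$$

$$
u_i \sim \mathcal{N}(0, \sigma_u^2), \qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
$$

Under this reading, $\beta_2$ is the slope over time in the reference group (Aβ– CN), and the $\beta_3$ coefficients express how the slopes of the preclinical and prodromal AD groups differ from it, which is the quantity tested by the group×time interaction.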

RESULTS

Group differences in demographic and clinical characteristics

Fig.1

Mean words recalled for each clinical group at each ISLT learning trial and delayed recall at baseline. Error bars represent 95% confidence intervals. Analyses adjusted for age, HADS depression, and APOE ɛ4 carriage. No significant differences between the performance of the Aβ– CN and preclinical AD group. Prodromal AD group significantly impaired across trials and at delayed recall.

Table 2

Group mean performance on ISLT measures at baseline

Outcome	Aβ– CN, Mean (SD)	Preclinical AD, Mean (SD)	Prodromal AD, Mean (SD)	Cohen’s d: Aβ– CN versus Preclinical AD	Cohen’s d: Preclinical AD versus Prodromal AD
ISLT Total	26.92 (5.02)	26.78 (4.55)	18.47 (4.55)	– 0.03 [– 0.51, 0.45]	– 1.84 [– 2.52, – 1.16]^**
ISLT Delayed	9.47 (2.69)	8.78 (2.40)	3.81 (2.44)	– 0.27 [– 0.75, 0.22]	– 2.05 [– 2.76, – 1.35]^**
ISLT Savings	90.88 (23.69)	85.40 (21.20)	49.13 (21.53)	– 0.24 [– 0.72, 0.24]	– 1.70 [– 2.36, – 1.03]^**

Note. Means (SD) are adjusted for the covariates of age, HADS Depression, and APOE ɛ4 status. ^**p < 0.001.

Statistically significant group differences were identified for proportion of APOE ɛ4 carriers, age, and the depression subscale of the HADS (Table 1); thus, subsequent analyses included these variables as covariates. As expected, groups also differed significantly with respect to MMSE and CDR scores. Comparison of the Aβ– CN group to the preclinical AD group indicated no statistically significant differences in performance in CVLT-II total (d = – 0.18[– 0.66, 0.30]), CVLT-II delayed (d = – 0.12[– 0.60, 0.36]), LM-I (d = 0.02[– 0.50, 0.46]) or LM-II (d = 0.00[– 0.48, 0.48]) scores. As expected, compared to the Aβ– CN group, the prodromal AD group performed significantly worse on all aspects of the CVLT-II and LM tests (CVLT-total d = – 1.78[– 2.35, – 1.20], CVLT-delay d = – 1.82[– 2.40, – 1.24]; LM-I d = – 0.97[– 1.50, – 0.45], LM-II d = – 1.16[– 1.70, – 0.63]). Similarly, compared to the preclinical AD group, the prodromal AD group performed significantly worse on CVLT-II total (d = – 1.74[– 2.41, – 1.07]), CVLT-II delayed recall (d = – 1.85[– 2.53, – 1.17]), LM-I (d = – 1.08[– 1.69, – 0.46]), and LM-II (d = – 1.27[– 1.90, – 0.65]).

Group differences in learning at baseline

Table 3

Results of linear mixed model analyses examining the ISLT outcome measures over 18 months, and mean slopes (SD) for each group

Outcome	Group (df) F	Time (df) F	Group×time (df) F	Aβ– CN Slope (SD)	Preclinical AD Slope (SD)	Prodromal AD Slope (SD)
Total	(122.32) 28.88^**	(596.53) 3.43	(596.14) 9.34^**	0.07 (0.18)^**	– 0.06 (0.19)	– 0.13 (0.19)^**
Delayed recall	(112.23) 34.90^**	(595.22) 1.89	(594.89) 4.24^*	0.02 (0.09)	– 0.01 (0.10)	– 0.05 (0.09)^*
Savings	(138.52) 22.88^**	(597.17) 1.16	(596.69) 0.81	– 0.04 (1.06)	0.02 (1.14)	– 0.39 (1.24)

Note. Analyses covaried for age, HADS Depression, and APOE ɛ4 carriage. ^*p < 0.05; ^**p < 0.001.

Mean words recalled on each of the three ISLT learning trials and at delayed recall are shown in Fig. 1. The mixed ANOVA revealed a statistically significant group×trial interaction, F(5.50, 231.16) = 6.71, p < 0.001, $η_{G}^{2}$ = 0.04. This effect was decomposed using two interaction contrasts. The first interaction contrast compared learning across trials between the Aβ– CN group and the preclinical AD group. No significant group×trial interaction was observed (p = 0.24, d = 0.16) and for both groups, total words recalled improved across the three learning trials, F(2.66, 186.25) = 4.65, p = 0.005, d = 0.29. Savings for the Aβ– CN group were approximately 5% greater than for the preclinical AD group, although the difference was not significant (Table 2). The second interaction contrast compared the rate of learning across trials between the preclinical AD group and the prodromal AD group, and a significant group×trial interaction was observed, F(3, 105) = 10.10, p < 0.001, d = 0.46. Compared to the preclinical AD group, the prodromal AD group recalled fewer words on each trial (Fig. 1). The prodromal AD group had significantly lower savings (approximately – 36%) than the preclinical AD group (Table 2).

Table 2 summarizes the comparisons of the ISLT learning trials, delayed recall, and savings scores between groups. ANOVAs indicated significant differences between groups on all three metrics, and this effect was driven by the prodromal AD group (Table 2). Cohen’s d effect sizes showed that the magnitude of the difference between the preclinical AD and Aβ– CN groups was small for all summary metrics (Cohen’s d’s < 0.30). In contrast, the magnitude of the difference between the preclinical AD and prodromal AD groups was very large (Cohen’s d’s > 1.50).

Group differences in rate of change in episodic memory over 18 months

The results of the LMMs of ISLT performance scores are summarized in Table 3 and shown graphically in Figs. 2–4. Group mean slopes derived from these models are also presented in Table 3. A significant group×time interaction was observed for ISLT total recall. Compared to the Aβ– CN group, the preclinical AD group showed a significantly different rate of change over time, with the difference moderate in magnitude (Cohen’s d = – 0.63[– 1.12, – 0.14]). The prodromal AD group showed significant decline over time; however, relative to the preclinical AD group, the difference in slopes was not significant, although moderate in magnitude (Cohen’s d = – 0.36[– 0.94, 0.22]). A significant group×time interaction was also observed for ISLT delayed recall. Compared to the Aβ– CN group, the preclinical AD group did not show a significantly faster rate of decline over time, but the effect size of the difference in slopes was moderate in magnitude (Cohen’s d = – 0.37[– 0.85, 0.11]). The prodromal AD group showed significant decline over time; however, relative to the preclinical AD group, the difference in slopes was not significant, although of moderate magnitude (Cohen’s d = – 0.42[– 1.00, 0.16]). No significant group×time interaction was observed for the ISLT savings score (Table 3, Fig. 4). Re-analysis of the prospective data for the ISLT using the non-parametric quantile regression yielded a pattern of outcomes consistent with the parametric methods (see Supplementary Table 1 and Supplementary Figures 1–3). This indicates that outcomes of the linear mixed models were not influenced by any group-based biases in data distributions on the ISLT outcome measures (see Supplementary Material).

Fig.2

Mean words recalled on the ISLT total score for the Aβ– CN, preclinical AD, and prodromal AD groups from baseline to 18 months. Baseline is represented by the second assessment as the first was an initial training assessment. Shading represents 95% confidence intervals fitted via a linear smoothing function. Analyses are adjusted for age, HADS depression, and APOE ɛ4 carriage. The preclinical and prodromal AD group slopes are significantly different to the Aβ– CN group slope. A significant increase in performance is also seen in the Aβ– CN group.

Fig.3

Mean words recalled on ISLT delayed recall for Aβ– CN, preclinical, and prodromal AD groups from baseline to 18 months. Baseline is represented by the second assessment as the first was an initial training assessment. Shading represents 95% confidence intervals fitted via a linear smoothing function. Analyses are adjusted for age, HADS depression, and APOE ɛ4 carriage. No significant difference in the slopes between the Aβ– CN and preclinical AD groups. The prodromal AD group significantly declined over the 18 months.

Fig.4

Percentage of savings on the ISLT savings score for Aβ– CN, preclinical and prodromal AD groups from baseline to 18 months. Baseline is represented by the second assessment as the first was an initial training assessment. The shading represents 95% confidence intervals fitted via a linear smoothing function. Analyses are adjusted for age, HADS depression, and APOE ɛ4 carriage. No change over the 18-month period was observed for any of the groups.

DISCUSSION

The first hypothesis that the acquisition and recall of verbal information would be impaired in early AD in the presence of Aβ+ was partially supported. At the baseline assessment, no differences in acquisition, total recall, delayed recall, or the savings scores on the ISLT were observed in the preclinical AD group relative to the Aβ– CN group. Furthermore, effect sizes reflecting differences in performance between the groups for each of these measures were uniformly small (i.e., Cohen’s ds < 0.30), similar to estimates of memory performance detected using the CVLT-II and LM tests (i.e., Cohen’s ds < 0.20), and consistent with the magnitude of impairments in memory in preclinical AD observed in previous reports [7, 42]. This indicates that when assessed on a single occasion, preclinical AD is not characterized by impairment in episodic memory even when the measures of memory are analyzed to consider both the acquisition and retention of verbal information. In contrast, the prodromal AD group showed substantial impairment (i.e., Cohen’s ds > 1.50) on all measures of the ISLT, equivalent to impairment evident on the CVLT-II and LM tests (Cohen’s ds > 1). Thus, for the preclinical and prodromal AD groups, analyses of acquisition and retention did not provide estimates of memory impairment that were greater than the ISLT summary scores or the CVLT-II and LM scores.

The second hypothesis that with repeated assessment, episodic memory would deteriorate in Aβ+ individuals, but not change in those without Aβ, was also partially supported. The preclinical AD group showed a decline in the ISLT total recall score over the 18 months that, when compared to the Aβ– CN group, was moderate in magnitude (Cohen’s d = – 0.63). A decline of equivalent magnitude in verbal list learning was evident in the prodromal AD group on the ISLT measures of total and delayed recall (Cohen’s ds – 0.36 and – 0.42, respectively). This decline in performance on the ISLT is consistent with the decline on other tests of verbal episodic memory in both preclinical and prodromal AD observed previously in other natural history samples [8, 43–45]. These data therefore indicate that cognitive dysfunction in the preclinical stage of AD is most obvious as change over time based on repeated assessments, when compared to individuals without Aβ+.

The observation here, and elsewhere, that individuals with preclinical AD show no substantial impairment in verbal learning and memory when assessed on a single occasion, but do show decline when assessed over time, indicates that while Aβ+ related disruption to medial temporal lobe (MTL) areas begins very early in the AD process, it takes a long time to manifest clinically [41, 46–48]. Thus, this preclinical phase of AD is widely considered to provide an opportunity to intervene so as to prevent further neuronal loss and forestall the development of AD [49]. These data also suggest that the slow course of Aβ+ related memory loss continues through the preclinical stage until it becomes severe enough to warrant clinical classification of MCI, whereupon clinically meaningful memory impairment becomes obvious at a single assessment. The Aβ+ related decline then continues steadily throughout the prodromal phase. Previous studies suggest that Aβ+ related memory decline is correlated with loss of volume in the MTL [50–52]. However, even with this decline in memory, individuals with MCI remain able to live independently and by definition have difficulty only with the most demanding of daily living tasks [4, 53]. Stopping AD in the prodromal or preclinical stages could therefore provide enormous benefit to individuals.

Although the ISLT was designed to minimize practice effects from repeated administration, through multiple parallel forms generated by randomizing word lists at each assessment, the Aβ– CN group did show a slight improvement, of approximately one word, across the eight assessments conducted over the 18 months (Table 3). This slight improvement in ISLT performance in Aβ– CN adults was most likely a consequence of the high frequency of ISLT administration in the study, as no improvements in ISLT performance were observed in previous studies where it was administered four times over three months in CN older adults and three times in a single setting in younger adults [19, 54]. Additionally, performance on the ISLT has remained stable between two assessments over a period of one month in participants with clinically diagnosed AD [22]. While the results of this study suggest that, even with optimization for repeated use, eight administrations within an 18-month period still result in practice effects in healthy individuals, the magnitude of the practice effect on the ISLT (d = 0.10) was much smaller than in previous reports. For example, moderate improvements on the Neuropsychological Assessment Battery (NAB) list learning task and the Rey Auditory Verbal Learning Test (RAVLT) have been observed with fewer administrations, equal to a 0.39 standard deviation annual increase [55] and an improvement of d = 0.51 after three assessments in 30 months [56], respectively.

Indeed, the absence of practice effects from repeated administration of memory tests has been proposed to be characteristic of early AD. While the current study aimed to minimize practice effects through the use of the ISLT in order to gain reliable estimates of Aβ+ associated cognitive change, previous work has shown that the degree of practice effects can provide an understanding of disease progression. For example, reduced practice effects were five times more likely in Aβ+ individuals than in Aβ– individuals [57]. Similarly, the magnitude of practice effects across three consecutive memory assessments was significantly smaller in those who progressed to clinical disease than in those who did not (d = 0.07 and 0.33, respectively) [58]. However, in each case, these estimates of Aβ+ related dysfunction were based on a reduction in the magnitude of practice effects, while still in the presence of performance gains, rather than on observation of a declining trajectory associated with Aβ+, as was shown in the current study (e.g., Fig. 2).
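
To make the practice-effect magnitudes quoted above (e.g., d = 0.10) more concrete, the following minimal Python sketch computes a standardized mean change between two assessments. It assumes a pooled-standard-deviation form of Cohen's d, which may differ from the estimator used in the current study or in the cited reports; the function name and the scores are hypothetical and invented for illustration only.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d_change(baseline, followup):
    """Standardized mean change between two assessments.

    Assumption: d = (mean follow-up - mean baseline) / pooled SD.
    Other definitions (e.g., dividing by the baseline SD) give different values.
    """
    n1, n2 = len(baseline), len(followup)
    if n1 < 2 or n2 < 2:
        raise ValueError("each assessment needs at least two scores")
    # Pooled variance weights each assessment by its degrees of freedom.
    pooled_var = ((n1 - 1) * stdev(baseline) ** 2
                  + (n2 - 1) * stdev(followup) ** 2) / (n1 + n2 - 2)
    if pooled_var == 0:
        raise ValueError("scores have zero variance")
    return (mean(followup) - mean(baseline)) / sqrt(pooled_var)


if __name__ == "__main__":
    # Hypothetical total-recall scores, invented for illustration only.
    first_visit = [20, 22, 24, 26, 28]
    last_visit = [21, 23, 25, 27, 29]
    print(round(cohens_d_change(first_visit, last_visit), 2))  # 0.32
```

Under this definition a positive value indicates improvement (a practice effect) and a negative value indicates decline, so a reduced practice effect and a declining trajectory are distinguished by the sign of d as well as its size.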

In the context of the hypothesis that memory dysfunction in preclinical AD is more evident upon repeated assessment, it is worth considering that the learning curves derived from the individual ISLT trials on the first administration could themselves be considered an index of practice, albeit over minutes rather than over days, months, or years. As no difference in learning curves was observed between the Aβ– CN and preclinical AD groups at baseline (Fig. 1), it is likely that memory dysfunction in preclinical AD, shown here over 18 months and in previous studies over a period of one week [57] or several years [58], requires a time interval longer than minutes to become evident. The reduction in practice effects from high-frequency assessment in preclinical AD groups relative to Aβ– CN older adults suggests that memory dysfunction could be conceptualized as a deficit in learning from repetition. As such, assessing memory primarily with learning-based episodic memory paradigms may provide a more effective method for characterizing this proposed deficit in preclinical AD.
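
One simple way to summarize a within-session learning curve of the kind discussed above is the least-squares slope of words recalled across successive learning trials. The Python sketch below is a minimal illustration under that assumption; it is not the analysis used in the current study, and the function name and trial scores are hypothetical.

```python
def acquisition_slope(trial_scores):
    """Least-squares slope of recall across learning trials (words per trial).

    Trials are assumed to be equally spaced and numbered 1, 2, 3, ...
    """
    n = len(trial_scores)
    if n < 2:
        raise ValueError("at least two learning trials are required")
    trials = range(1, n + 1)
    mean_trial = sum(trials) / n
    mean_score = sum(trial_scores) / n
    # Covariance of trial number and score, divided by variance of trial number.
    numerator = sum((t - mean_trial) * (s - mean_score)
                    for t, s in zip(trials, trial_scores))
    denominator = sum((t - mean_trial) ** 2 for t in trials)
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical words recalled on three learning trials (illustration only).
    print(acquisition_slope([6, 8, 10]))  # 2.0
```

A slope near zero would indicate little gain from repetition within a session, whereas the longitudinal analyses described in the text concern gain, or loss, across sessions separated by weeks or months.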

There are some caveats to the generalization of the current results. First, the AIBL-ROCS sample was selected specifically to undergo high-frequency testing, and as such the repeated assessment schedule employed in AIBL-ROCS does not reflect that typically used in clinical practice. Future research will need to administer repeated assessments in a context similar to that of clinical practice to provide greater generalizability of results. Second, the AIBL sample is unlikely to be representative of the general population, as the study uses detailed inclusion and exclusion criteria to ensure minimal untreated comorbidities. Third, given that many AIBL participants have had prior exposure to cognitive testing, replication in a sample naïve to cognitive testing would help to increase generalization to the broader population. Fourth, because of the testing schedule and the home visits required for the current study, the sample size was small, and replication of the current results in a larger sample is warranted.

Notwithstanding these limitations, our findings highlight an issue with current models that attempt to understand subtle cognitive dysfunction in preclinical AD populations using neuropsychological methods created for the detection of impairment. They underscore the importance of shifting from examination of neuropsychological performance at one timepoint to characterizing change in cognitive processes over time. Ultimately, repeated assessment over shorter time periods will need to become the standard protocol for individuals at risk of AD if meaningful change in cognitive function is to be demonstrated. Our results suggest that even when using cognitive tasks with high test-retest reliability, practice effects can be seen in Aβ– CN adults but not in those with preclinical AD. As such, a potential way forward is to deliberately seek to improve performance in order to separate individuals who are able to improve from those who cannot, potentially increasing the reliability with which disease progression is predicted.

Footnotes

ACKNOWLEDGMENTS

Funding for the study was provided by AstraZeneca Pharmaceuticals LP, the CSIRO Flagship Collaboration Fund and the Science and Industry Endowment Fund (SIEF) in partnership with Edith Cowan University (ECU), The Mental Health Research Institute (MHRI), Alzheimer’s Australia (AA), National Ageing Research Institute (NARI), Austin Health, CogState Ltd, Hollywood Private Hospital, and Sir Charles Gairdner Hospital. The study also receives funding from the National Health and Medical Research Council (NHMRC), the Dementia Collaborative Research Centres program (DCRC), The McCusker Alzheimer’s Research Foundation, and Operational Infrastructure Support from the Government of Victoria. The authors acknowledge the financial support of the Cooperative Research Centre (CRC) for Mental Health. The CRC programme is an Australian Government Initiative. YYL reports grants from the National Health and Medical Research Council (GNT1111603, GNT1147465). The ROCS team wishes to thank the participants in the ROCS study for their commitment and dedication to helping advance research into the early detection and causation of AD, and the clinicians who referred patients to the study.

Authors’ disclosures available online.

The supplementary material is available in the electronic version of this article.

References

1. Jack Jr, Knopman, Jagust, Petersen, Weiner, Aisen, Shaw, Vemuri, Wiste, Weigand, Lesnick, Pankratz, Donohue, Trojanowski (2013) Tracking pathophysiological processes in Alzheimer’s disease: An updated hypothetical model of dynamic biomarkers. Lancet Neurol 12, 207–216.

2. Sperling, Aisen, Beckett, Bennett, Craft, Fagan, Iwatsubo, Jack Jr, Kaye, Montine, Park, Reiman, Rowe, Siemers, Stern, Yaffe, Carrillo, Thies, Morrison-Bogorad, Wagster, Phelps (2011) Toward defining the preclinical stages of Alzheimer’s disease: Recommendations from the National Institute on Aging and the Alzheimer’s Association workgroup on diagnostic guidelines for Alzheimer’s disease. Alzheimers Dement 7, 280–292.

3. Baker, Lim, Pietrzak, Hassenstab, Snyder, Masters, Maruff (2016) Cognitive impairment and decline in cognitively normal older adults with high amyloid-β: A meta-analysis. Alzheimers Dement (Amst) 6, 108–121.

4. Albert, DeKosky, Dickson, Dubois, Feldman, Fox, Gamst, Holtzman, Jagust, Petersen, Snyder, Carrillo, Thies, Phelps (2011) The diagnosis of mild cognitive impairment due to Alzheimer’s disease: Recommendations from the National Institute on Aging and Alzheimer’s Association workgroup on diagnostic guidelines for Alzheimer’s disease. Alzheimers Dement 7, 270–279.

5. Lim, Snyder, Pietrzak, Ukiqi, Villemagne, Ames, Salvado, Bourgeat, Martins, Masters, Rowe, Maruff (2015) Sensitivity of composite scores to amyloid burden in preclinical Alzheimer’s disease: Introducing the Z-scores of Attention, Verbal fluency, and Episodic memory for Nondemented older adults composite score. Alzheimers Dement (Amst) 2, 19–26.

6. Lim, Maruff, Pietrzak, Ames, Ellis, Harrington, Lautenschlager, Szoeke, Martins, Masters, Villemagne, Rowe; AIBL Research Group (2014) Effect of amyloid on memory and non-memory decline from preclinical to clinical Alzheimer’s disease. Brain 137, 221–231.

7. Doherty, Schultz, Oh, Koscik, Dowling, Barnhart, Murali, Gallagher, Carlsson, Bendlin, LaRue, Hermann, Rowley, Asthana, Sager, Christian, Johnson, Okonkwo (2015) Amyloid burden, cortical thickness, and cognitive function in the Wisconsin Registry for Alzheimer’s Prevention. Alzheimers Dement (Amst) 1, 160–169.

8. Donohue, Sperling, Petersen, Sun, Weiner, Aisen; Alzheimer’s Disease Neuroimaging Initiative (2017) Association between elevated brain amyloid and subsequent cognitive decline among cognitively normal persons. JAMA 317, 2305–2316.

9. Strauss, Sherman, Spreen (2006) A compendium of neuropsychological tests: Administration, norms, and commentary, Third Edition, Oxford University Press.

10. Petersen (2004) Mild cognitive impairment as a diagnostic entity. J Intern Med 256, 183–194.

11. Greenaway, Lacritz, Binegar, Weiner, Lipton, Munro Cullum (2006) Patterns of verbal memory performance in mild cognitive impairment, Alzheimer disease, and normal aging. Cogn Behav Neurol 19, 79–84.

12. Ribeiro, Guerreiro, De Mendonça (2007) Verbal learning and memory deficits in mild cognitive impairment. J Clin Exp Neuropsychol 29, 187–197.

13. Perri, Carlesimo, Serra, Caltagirone; Early Diagnosis Group of the Italian Interdisciplinary Network on Alzheimer’s Disease (2005) Characterization of memory profile in subjects with amnestic mild cognitive impairment. J Clin Exp Neuropsychol 27, 1033–1055.

14. Libon, Bondi, Price, Lamar, Eppig, Wambach, Nieves, Delano-Wood, Giovannetti, Lippa, Kabasakalian, Cosentino, Swenson, Penney (2011) Verbal serial list learning in mild cognitive impairment: A profile analysis of interference, forgetting, and errors. J Int Neuropsychol Soc 17, 905–914.

15. Moulin, James, Freeman, Jones (2004) Deficient acquisition and consolidation: Intertrial free recall performance in Alzheimer’s disease and mild cognitive impairment. J Clin Exp Neuropsychol 26, 1–10.

16. Beglinger, Gaydos, Tangphao-Daniels, Duff, Kareken, Crawford, Fastenau, Siemers (2005) Practice effects and the use of alternate forms in serial neuropsychological testing. Arch Clin Neuropsychol 20, 517–529.

17. Calamia, Markon, Tranel (2012) Scoring higher the second time around: Meta-analyses of practice effects in neuropsychological assessment. Clin Neuropsychol 26, 543–570.

18. Goldberg, Harvey, Wesnes, Snyder, Schneider (2015) Practice effects due to serial cognitive assessment: Implications for preclinical Alzheimer’s disease randomized controlled trials. Alzheimers Dement (Amst) 1, 103–111.

19. Lim, Harrington, Ames, Ellis, Lachovitzki, Snyder, Maruff (2012) Short term stability of verbal memory impairment in mild cognitive impairment and Alzheimer’s disease measured using the International Shopping List Test. J Clin Exp Neuropsychol 34, 853–863.

20. Delis, Kramer, Kaplan, Ober (2000) CVLT-II: California verbal learning test: Adult version, Psychological Corporation.

21. Lim, Prang, Cysique, Pietrzak, Snyder, Maruff (2009) A method for cross-cultural adaptation of a verbal memory assessment. Behav Res Methods 41, 1190–1200.

22. Thompson, Wilson, Snyder, Pietrzak, Darby, Maruff, Buschke (2011) Sensitivity and test-retest reliability of the International Shopping List Test in assessing verbal learning and memory in mild Alzheimer’s disease. Arch Clin Neuropsychol 26, 412–424.

23. Lim, Jaeger, Harrington, Ashwood, Ellis, Stöffler, Szoeke, Lachovitzki, Martins, Villemagne, Bush, Masters, Rowe, Ames, Darby, Maruff (2013) Three-month stability of the CogState brief battery in healthy older adults, mild cognitive impairment, and Alzheimer’s disease: Results from the Australian Imaging, Biomarkers, and Lifestyle-Rate of Change Substudy (AIBL-ROCS). Arch Clin Neuropsychol 28, 320–330.

24. Ellis, Bush, Darby, De Fazio, Foster, Hudson, Lautenschlager, Lenzo, Martins, Maruff, Masters, Milner, Pike, Rowe, Savage, Szoeke, Taddei, Villemagne, Woodward, Ames; AIBL Research Group (2009) The Australian Imaging, Biomarkers and Lifestyle (AIBL) study of aging: Methodology and baseline characteristics of 1112 individuals recruited for a longitudinal study of Alzheimer’s disease. Int Psychogeriatr 21, 672–687.

25. Rowe, Ellis, Rimajova, Bourgeat, Pike, Jones, Fripp, Tochon-Danguy, Morandeau, O’Keefe, Price, Raniga, Robins, Acosta, Lenzo, Szoeke, Salvado, Head, Martins, Masters, Ames, Villemagne (2010) Amyloid imaging results from the Australian Imaging, Biomarkers and Lifestyle (AIBL) study of aging. Neurobiol Aging 31, 1275–1283.

26. Winblad, Palmer, Kivipelto, Jelic, Fratiglioni, Wahlund, Nordberg, Bäckman, Albert, Almkvist, Arai, Basun, Blennow, de Leon, DeCarli, Erkinjuntti, Giacobini, Graff, Hardy, Jack, Jorm, Ritchie, van Duijn, Visser, Petersen (2004) Mild cognitive impairment–beyond controversies, towards a consensus: Report of the International Working Group on Mild Cognitive Impairment. J Intern Med 256, 240–246.

27. Petersen, Smith, Waring, Ivnik, Tangalos, Kokmen (1999) Mild cognitive impairment: Clinical characterization and outcome. Arch Neurol 56, 303–308.

28. Wechsler (1997) Wechsler memory scale - Third edition, The Psychological Corporation, San Antonio, TX.

29. Morris (1993) The Clinical Dementia Rating (CDR): Current version and scoring rules. Neurology 43, 2412–2414.

30. Folstein, Folstein, McHugh (1975) “Mini-mental state”. A practical method for grading the cognitive state of patients for the clinician. J Psychiatr Res 12, 189–198.

31. Snaith, Zigmond (1986) The hospital anxiety and depression scale. Br Med J (Clin Res Ed) 292, 344.

32. Villemagne, Burnham, Bourgeat, Brown, Ellis, Salvado, Szoeke, Macaulay, Martins, Maruff, Ames, Rowe, Masters; Australian Imaging Biomarkers and Lifestyle (AIBL) Research Group (2013) Amyloid β deposition, neurodegeneration, and cognitive decline in sporadic Alzheimer’s disease: A prospective cohort study. Lancet Neurol 12, 357–367.

33. Jack Jr, Lowe, Senjem, Weigand, Kemp, Shiung, Knopman, Boeve, Klunk, Mathis, Petersen (2008) 11C PiB and structural MRI provide complementary information in imaging of Alzheimer’s disease and amnestic mild cognitive impairment. Brain 131, 665–680.

34. Pietrzak, Maruff, Snyder (2009) Methodological improvements in quantifying cognitive change in clinical trials: An example with single-dose administration of donepezil. J Nutr Health Aging 13, 268–273.

35. Atri, Frölich, Ballard, Tariot, Molinuevo, Boneva, Windfeld, Raket, Cummings (2018) Effect of idalopirdine as adjunct to cholinesterase inhibitors on change in cognition in patients with Alzheimer disease: Three randomized clinical trials. JAMA 319, 130–142.

36. RStudio Team (2015) RStudio: Integrated development for R. RStudio, Inc., Boston, MA.

37. R Core Team (2015) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.

38. Singmann, Bolker, Westfall, Aust (2017) afex: Analysis of factorial experiments.

39. Bates, Maechler, Bolker, Walker (2015) Fitting linear mixed-effects models using lme4. J Stat Softw 67, 1–48.

40. Wickham (2009) ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag, New York.

41. Lim, Ellis, Harrington, Kamer, Pietrzak, Bush, Darby, Martins, Masters, Rowe, Savage, Szoeke, Villemagne, Ames, Maruff (2013) Cognitive consequences of high Aβ amyloid in mild cognitive impairment and healthy older adults: Implications for early detection of Alzheimer’s disease. Neuropsychology 27, 322–332.

42. Hassenstab, Chasse, Grabow, Benzinger, Fagan, Xiong, Jasielec, Grant, Morris (2016) Certified normal: Alzheimer’s disease biomarkers and normative estimates of cognitive functioning. Neurobiol Aging 43, 23–33.

43. Clark, Racine, Koscik, Okonkwo, Engelman, Carlsson, Asthana, Bendlin, Chappell, Nicholas, Rowley, Oh, Hermann, Sager, Christian, Johnson (2016) Beta-amyloid and cognitive decline in late middle age: Findings from the Wisconsin Registry for Alzheimer’s Prevention study. Alzheimers Dement 12, 805–814.

44. Farrell, Kennedy, Rodrigue, Wig, Bischof, Rieck, Chen, Festini, Devous Sr, Park (2017) Association of longitudinal cognitive decline with amyloid burden in middle-aged and older adults: Evidence for a dose-response relationship. JAMA Neurol 74, 830–838.

45. Petersen, Wiste, Weigand, Rocca, Roberts, Mielke, Lowe, Knopman, Pankratz, Machulda, Geda, Jack Jr (2016) Association of elevated amyloid levels with cognition and biomarkers in cognitively normal people from the community. JAMA Neurol 73, 85–92.

46. Mielke, Machulda, Hagen, Christianson, Roberts, Knopman, Vemuri, Lowe, Kremers, Jack Jr, Petersen (2016) Influence of amyloid and APOE on cognitive performance in a late middle-aged cohort. Alzheimers Dement 12, 281–291.

47. Papp, Mormino, Amariglio, Munro, Dagley, Schultz, Johnson, Sperling, Rentz (2016) Biomarker validation of a decline in semantic processing in preclinical Alzheimer’s disease. Neuropsychology 30, 624–630.

48. Insel, Donohue, Mackin, Aisen, Hansson, Weiner, Mattsson; Alzheimer’s Disease Neuroimaging Initiative (2016) Cognitive and functional changes associated with Aβ pathology and the progression to mild cognitive impairment. Neurobiol Aging 48, 172–181.

49. Sperling, Rentz, Johnson, Karlawish, Donohue, Salmon, Aisen (2014) The A4 study: Stopping AD before symptoms begin? Sci Transl Med 6, 228fs13.

50. Marks, Lockhart, Baker, Jagust (2017) Tau and β-amyloid are associated with medial temporal lobe structure, function, and memory encoding in normal aging. J Neurosci 37, 3192–3201.

51. Mattsson, Insel, Aisen, Jagust, Mackin, Weiner; Alzheimer’s Disease Neuroimaging Initiative (2015) Brain structure and function as mediators of the effects of amyloid on memory. Neurology 84, 1136–1144.

52. Wang, Benzinger, Hassenstab, Blazey, Owen, Liu, Fagan, Morris, Ances (2015) Spatially distinct atrophy is linked to β-amyloid and tau in preclinical Alzheimer disease. Neurology 84, 1254–1260.

53. Petersen, Parisi, Dickson, Johnson, Knopman, Boeve, Jicha, Ivnik, Smith, Tangalos, Braak, Kokmen (2006) Neuropathologic features of amnestic mild cognitive impairment. Arch Neurol 63, 665–672.

54. Rahimi-Golkhandan, Maruff, Darby, Wilson (2012) Barriers to repeated assessment of verbal learning and memory: A comparison of International Shopping List Task and Rey Auditory Verbal Learning Test on build-up of proactive interference. Arch Clin Neuropsychol 27, 790–795.

55. Gavett, Gurnani, Saurman, Chapman, Steinberg, Martin, Chaisson, Mez, Tripodis, Stern (2016) Practice effects on story memory and list learning tests in the neuropsychological assessment of older adults. PLoS One 11, e0164492.

56. Machulda, Hagen, Wiste, Mielke, Knopman, Roberts, Vemuri, Lowe, Jack Jr, Petersen (2017) Practice effects and longitudinal cognitive change in clinically normal older adults differ by Alzheimer imaging biomarker status. Clin Neuropsychol 31, 99–117.

57. Duff, Foster, Hoffman (2014) Practice effects and amyloid deposition: Preliminary data on a method for enriching samples in clinical trials. Alzheimer Dis Assoc Disord 28, 247–252.

58. Hassenstab, Ruvolo, Jasielec, Xiong, Grant, Morris (2015) Absence of practice effects in preclinical Alzheimer’s disease. Neuropsychology 29, 940–948.