Abstract
Keywords
INTRODUCTION
Mild cognitive impairment (MCI) is a broad, heterogeneous clinical syndrome that can include multiple neurodegenerative and non-neurodegenerative conditions [1]. The increasing use of biomarkers in recent years has enabled the identification of a subgroup of MCI patients with underlying Alzheimer’s disease (AD). Imaging and cerebrospinal fluid (CSF) biomarkers, in particular, have proven to be excellent tools for the diagnosis of MCI due to AD (MCI-AD) and both have been incorporated into current diagnostic criteria [2, 3].
The core feature of typical AD is episodic memory impairment. This deficit can be assessed by episodic memory tests [4], but studies comparing their diagnostic performance in biomarker-supported or autopsy-proven patient samples are scarce [5, 6].
One of the most commonly used tests worldwide is the Free and Cued Selective Reminding Test (FCSRT). The FCSRT is considered a typical episodic memory test that ensures encoding by semantic cueing and has been particularly recommended for the diagnosis of prodromal AD [7]. Other tests that use verbal word lists are the word list of the Consortium to Establish a Registry for Alzheimer’s Disease (abbreviated herein as CERAD-WL) and the Rey Auditory Verbal Learning Test (RAVLT). The CERAD-WL test includes a delayed recognition task, which has been closely related to the integrity of the entorhinal and perirhinal cortex, regions known to be selectively vulnerable to AD [8]. Although the current evidence indicates that these tests are reliable tools for the early diagnosis of AD, there are insufficient data on their direct comparison [9]. Some studies advocate the use of the FCSRT based on longitudinal studies in subjects with MCI [10–13]. However, in these studies the progression to AD dementia was assessed during a short follow-up [10–13]. This approach has two major limitations. First, a limited follow-up may underestimate the full clinical spectrum of AD, as some cases may not have the time to progress to the dementia stage. In addition, MCI-AD is heterogeneous and multiple patterns of neurodegeneration and longitudinal trajectories have been recognized [14]. Second, the sensitivity for the diagnosis of AD dementia based only on clinical criteria is limited, as up to a third of the patients may have non-AD pathologies or insufficient AD-related neuropathological changes at autopsy [15, 16]. Thus, we believe that it is necessary to explore the diagnostic accuracy of verbal memory tests in biomarker-supported MCI studies. In this study, we compared the diagnostic accuracy of two commonly used verbal memory tests, the FCSRT and the CERAD-WL, to predict MCI-AD defined by CSF biomarkers.
Importantly, we further provide information about follow-up and the impact of verbal memory profiles on the progression to AD dementia. We hypothesized that the combination of FCSRT and CERAD-WL measures could improve the prediction of MCI-AD and define different prognostic profiles as it would unveil a more accurate picture of verbal memory disintegration along the heterogeneous MCI-AD continuum [14]. This information is crucial to select candidates for biomarker testing in clinical practice and clinical trials [17].
MATERIALS AND METHODS
Study population
Patients were recruited between June 2009 and April 2016 at the Memory Unit at Hospital de la Santa Creu i Sant Pau (Barcelona, Spain) as part of the biomarker program. Briefly, patients underwent a uniform set of clinical, neuropsychological, neuroimaging, and laboratory assessments, including CSF sampling. Subjects were referred by general physicians or neurologists because of cognitive or behavioral complaints. A total of 202 patients with a diagnosis of MCI and available CSF biomarker data were included. We excluded subjects who met criteria for the diagnosis of dementia [18] at baseline or who had a history of stroke, other cerebral lesions, substance abuse, psychiatric comorbidities, or other comorbid conditions with excess mortality.
MCI definition
Patients were diagnosed with MCI using a broad definition [1] and met the following criteria: (1) Mini-Mental State Examination (MMSE) score between 24 and 30; (2) global cognitive status of 3 on the Global Deterioration Scale (GDS); and (3) impaired cognitive performance (defined as a score more than 1.5 SD below age- and education-adjusted normative values) in at least one cognitive domain.
Neuropsychological evaluation
Patients underwent a formal neuropsychological evaluation close to CSF sampling (mean interval 0.9±3 months). A detailed description of the neuropsychological testing protocol and verbal memory tests is provided in the Supplementary Material. Declarative verbal memory was evaluated by FCSRT and CERAD-WL. For FCSRT, four scores were considered: total free recall (FCSRT total free recall), total recall (FCSRT total recall), delayed free recall (FCSRT delayed free recall) and delayed total recall (FCSRT delayed total recall). For CERAD-WL, three measures were considered: recall on trial 3 (CERAD-WL trial 3), delayed recall (CERAD-WL recall) and delayed recognition (CERAD-WL recognition). In the recognition task, true positive (items previously presented) and true negative items (novel items) were added to give a maximum score of 20.
Cut-off values for verbal memory tests
We defined specific cut-offs for this study because the normalization procedures used for the FCSRT and CERAD-WL tests differed [19, 20] and this could interfere with the comparison between the two measures. We z-transformed raw values using means and standard deviations (SD) of 48 age-matched, cognitively normal control subjects (33% male; mean age: 67±6.54 years; education: 13±4.48 years; MMSE: 29.2±1.05; all with a CDR sum of boxes of 0) [21]. Controls had normal CSF values and thus were classified as stage 0 according to the NIA-AA staging of preclinical AD [22]. The impairment threshold was set at –1.5 SD for each test, as in previous studies in patients with MCI [23].
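The cut-off procedure described above can be illustrated with a minimal sketch. It assumes the sample SD of the control group is used and that a z-score at or below –1.5 counts as impaired; neither detail is specified in the text, and the function names and control scores are hypothetical.

```python
from statistics import mean, stdev

IMPAIRMENT_Z = -1.5  # impairment threshold stated in the text


def z_score(raw, control_scores):
    """Standardize a raw test score against the control sample.

    Assumption: the sample SD (n - 1 denominator) is used; the source
    does not say which SD estimator was applied.
    """
    mu = mean(control_scores)
    sd = stdev(control_scores)
    return (raw - mu) / sd


def is_impaired(raw, control_scores, threshold=IMPAIRMENT_Z):
    """Flag a score as abnormal when its z-score is at or below the threshold.

    Assumption: the boundary value itself counts as impaired.
    """
    return z_score(raw, control_scores) <= threshold


# Hypothetical control scores, for illustration only (not study data).
controls = [10, 12, 14, 16, 18]
print(is_impaired(9, controls), is_impaired(10, controls))
```

Deriving both cut-offs from the same control sample places the FCSRT and CERAD-WL scores on a common scale, which is the stated reason for not relying on the published norms of each test.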
CSF biochemical analysis
CSF was obtained by lumbar puncture and collected following previously reported international consensus recommendations [24]. All samples were stored in polypropylene tubes at –80°C. We used commercially available ELISA kits to determine levels of Aβ1–42 (Innotest β-amyloid1–42, Fujirebio-Europe, Gent, Belgium), t-Tau (Innotest hTAU Ag, Fujirebio-Europe), and p-Tau (Innotest Phospho-Tau 181P, Fujirebio-Europe) following the manufacturer’s recommendations. Our laboratory has experience in CSF biomarker determination and participates in the Alzheimer’s Association external quality control program for CSF biomarkers [25].
CSF biomarker cut-offs
We used the t-Tau/Aβ1–42 ratio for the diagnosis of MCI-AD because this parameter has shown excellent diagnostic accuracy in autopsy-proven case series [26]. We applied our previously validated cut-off of 0.52, which has shown good diagnostic performance for AD in our cohort [21] and in other cohorts [26, 27].
Diagnosis of MCI-AD
In accordance with the NIA-AA diagnostic criteria, the MCI-AD label was applied to any MCI patient with CSF biomarker evidence of AD [3] based on the t-Tau/Aβ1–42 ratio. The remaining MCI patients were labeled as MCI-nonAD.
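The labeling rule can be sketched as follows. This is an illustration under stated assumptions: both analytes are expressed in the same concentration units, a ratio strictly above the 0.52 cut-off is taken as the AD profile (the text gives the cut-off but not the direction or the handling of the boundary value), and the function name and input values are hypothetical.

```python
AD_RATIO_CUTOFF = 0.52  # t-Tau/Abeta1-42 cut-off reported in the text


def classify_mci(t_tau, abeta42, cutoff=AD_RATIO_CUTOFF):
    """Label an MCI patient from the CSF t-Tau/Abeta1-42 ratio.

    Assumptions: both concentrations use the same units, and a ratio
    strictly above the cut-off indicates an AD profile.
    """
    if abeta42 <= 0:
        raise ValueError("Abeta1-42 concentration must be positive")
    ratio = t_tau / abeta42
    return "MCI-AD" if ratio > cutoff else "MCI-nonAD"


# Hypothetical concentrations, for illustration only (not study data).
print(classify_mci(600, 400))  # ratio 1.50
print(classify_mci(200, 800))  # ratio 0.25
```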
Genetic analysis
APOE was genotyped according to previously described methods [28].
Longitudinal follow-up
Subjects underwent a clinical and neuropsychological evaluation at least once a year unless marked clinical deterioration occurred. Additional visits were scheduled if judged necessary. As in previous longitudinal studies [29], cognitive progression to AD dementia was operationalized as a loss of more than 3 points between the first and last MMSE administration, an MMSE score <24, or a clinical diagnosis of dementia at follow-up. Dementia with Lewy bodies (DLB), vascular cognitive impairment, and frontotemporal dementia (FTD) were diagnosed according to current clinical criteria [18, 30–32]. Patients who converted to non-AD dementias (n = 12; 11 DLB and 1 FTD) and MCI patients with a follow-up of less than 6 months (n = 27) were excluded from the longitudinal analysis, as in similar studies.
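The operational definition of progression, and the resulting size of the longitudinal sample, can be sketched as below. The function and argument names are hypothetical; the rule itself and the counts come from the text, and the MMSE values in the usage lines are illustrative only.

```python
def progressed_to_dementia(first_mmse, last_mmse, clinical_dementia=False):
    """Apply the three alternative progression criteria from the text.

    Progression is recorded if any one of the following holds:
    a loss of more than 3 MMSE points between first and last
    administration, a last MMSE score below 24, or a clinical
    diagnosis of dementia at follow-up.
    """
    lost_more_than_3 = (first_mmse - last_mmse) > 3
    below_24 = last_mmse < 24
    return lost_more_than_3 or below_24 or clinical_dementia


# Size of the longitudinal sample after the two stated exclusions.
total_mci = 202
non_ad_converters = 12   # 11 DLB and 1 FTD
short_follow_up = 27     # follow-up of less than 6 months
longitudinal_n = total_mci - non_ad_converters - short_follow_up

print(progressed_to_dementia(28, 25), progressed_to_dementia(28, 24))
print(longitudinal_n)
```

A loss of exactly 3 points does not meet the first criterion, because the text specifies a loss of more than 3 points.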
Statistical analysis
Continuous variables are described as mean and standard deviation (SD) and categorical variables are described as percentages. Differences in baseline characteristics between MCI groups were assessed using the t-test for continuous variables and the chi-square test for dichotomous or categorical data. Nonparametric tests were applied when variables did not follow a normal distribution. The performance of MCI groups was compared in a series of analyses of covariance using age, sex, and years of education as covariates. Effect sizes (Cohen’s d) were derived from the means and SD of adjusted scores. Logistic regression analyses tested whether the combination of FCSRT and CERAD-WL measures significantly improved the prediction of MCI-AD. We calculated the intraclass correlation coefficient (raw values) and Cohen’s kappa index (dichotomous classification) to test agreement between CERAD-WL and FCSRT measures. We tested the ability of the various verbal memory tests to discriminate the MCI-AD group by means of receiver operating characteristic curve analysis. We used the Cox proportional-hazards model to estimate the risk of progression from MCI-AD to AD dementia for verbal memory profiles based on CERAD-WL and FCSRT measures. These statistical models were adjusted for age, sex, and education. Statistical significance for all tests was set at 5% (α = 0.05) and all statistical tests were two-sided. All analyses were performed using SPSS 20.0 (Armonk, NY: IBM Corp.).
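Two of the statistics named above, Cohen’s d and Cohen’s kappa, can be illustrated with a short sketch. The analyses in this study were run in SPSS; the code below is not that implementation. It assumes the pooled-SD form of Cohen’s d (the text states only that d was derived from means and SD), and the function names and input values are hypothetical.

```python
from math import sqrt


def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Cohen's d from group means and SDs (pooled-SD form, an assumption)."""
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / sqrt(pooled_var)


def cohens_kappa(test_a, test_b):
    """Cohen's kappa for two dichotomous classifications of the same patients.

    Each input is a sequence of 0/1 flags (1 = abnormal on that test).
    """
    if len(test_a) != len(test_b) or not test_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(test_a)
    observed = sum(a == b for a, b in zip(test_a, test_b)) / n
    p_a = sum(test_a) / n  # proportion abnormal on the first test
    p_b = sum(test_b) / n  # proportion abnormal on the second test
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)  # chance agreement
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical flags for four patients, for illustration only.
print(cohens_kappa([1, 1, 0, 0], [1, 0, 0, 1]))
print(cohens_d(10, 2, 50, 8, 2, 50))
```

Kappa corrects the raw agreement rate for the agreement expected by chance, which is why it is used here for the dichotomized (normal versus abnormal) classification rather than for raw scores.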
Standard protocol approvals, registrations, and patient consent
The study was approved by the local ethics committee and was conducted in accordance with the Declaration of Helsinki. All participants gave their written informed consent to participate in the study.
RESULTS
Patients’ characteristics
Ninety-eight of 202 MCI patients (48.5%) had a CSF profile consistent with AD (MCI-AD). Table 1 shows their demographic, clinical, and CSF biomarker characteristics. Compared with the MCI-nonAD group, the MCI-AD group was older (70.7 versus 66.9 years; p = 0.002) and had a higher frequency of women (61.2% versus 47.1%; p = 0.044) and of APOE ɛ4 carriers (57.1% versus 21.2%; p < 0.001). Global cognitive impairment, as evaluated using the Clinical Dementia Rating Scale Sum of Boxes (CDR-SOB), was higher in the MCI-AD group (2.12 versus 1.33; p < 0.001), but functional impairment, assessed by the Interview for Deterioration in Daily Living Activities in Dementia (IDDD), was similar in both groups (p = 0.450).
Comparison of patient characteristics between MCI-AD and MCI-nonAD groups
Results are mean±SD (standard deviation) for continuous variables or frequency (%). In bold, p < 0.05. MCI-AD, mild cognitive impairment due to Alzheimer’s disease; MCI-nonAD, mild cognitive impairment not due to Alzheimer’s disease; MCI, mild cognitive impairment; IDDD, Interview for Deterioration in Daily Living Activities in Dementia; MMSE, Mini-Mental State Examination; GDS, Geriatric Depression Scale; CDR-SOB, Clinical Dementia Rating Sum of Boxes; CSF, cerebrospinal fluid.
Diagnostic accuracy of FCSRT and CERAD-WL for the diagnosis of MCI-AD
Table 2 shows the diagnostic accuracy for each verbal memory measure alone and in combination. FCSRT total free recall had the highest sensitivity (90.8%) while CERAD-WL recognition showed the highest specificity (79.6%). These two tests yielded similar global predictive values (59.9–65.3% and 59.4–62.8% for FCSRT and CERAD-WL, respectively). The positive predictive and negative predictive values were highest for FCSRT total free recall and CERAD-WL recognition (66.1% and 82.3%, respectively). Globally, FCSRT measures were more sensitive than CERAD-WL measures, but CERAD-WL measures were more specific. Combined measures increased sensitivity at the expense of specificity. For example, sensitivity was highest for the combination (at least one abnormal test) of FCSRT total free recall and FCSRT total recall measures (93.9%). However, this was at the expense of a low specificity (26.9%). The combination of abnormal FCSRT total free recall and FCSRT total recall measures had the highest global predictive value (65.3%) and yielded a sensitivity of 84.7% and a specificity of 47.1%. FCSRT AUC values were slightly superior for raw values, but after adjusting for age, sex, and education, the two tests yielded a similar diagnostic performance.
Diagnostic performance of verbal memory tests and combined memory profiles in MCI-AD subjects
*Adjusted for age, sex and years of education. S, sensitivity; E, specificity; PPV, positive predictive value; NPV, negative predictive value; GPV, global predictive value; AUC, area under the ROC curve; CERAD-WL, Word List from the Consortium to Establish a Registry for Alzheimer’s Disease; FCSRT, Free and Cued Selective Reminding Test; MCI, mild cognitive impairment.
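The accuracy measures defined in the footnote above can be illustrated with a short worked example. The 2×2 counts below are not reported in the text: they are back-calculated, as an assumption, from the group sizes (98 MCI-AD and 104 MCI-nonAD) and from the sensitivity and specificity reported for the combination in which both FCSRT total free recall and FCSRT total recall are abnormal. The function names are hypothetical.

```python
def combine(abnormal_a, abnormal_b, rule="either"):
    """Combine two abnormal/normal flags into a single positive/negative result.

    rule="either": positive if at least one test is abnormal.
    rule="both":   positive only if both tests are abnormal.
    """
    if rule == "either":
        return abnormal_a or abnormal_b
    if rule == "both":
        return abnormal_a and abnormal_b
    raise ValueError("rule must be 'either' or 'both'")


def accuracy_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, PPV, NPV and global predictive value."""
    return {
        "S": tp / (tp + fn),
        "E": tn / (tn + fp),
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
        "GPV": (tp + tn) / (tp + fn + tn + fp),
    }


# Back-calculated counts (assumption, not reported in the source):
# 83 of 98 MCI-AD patients positive, 49 of 104 MCI-nonAD patients negative.
m = accuracy_metrics(tp=83, fn=15, tn=49, fp=55)
print({k: round(100 * v, 1) for k, v in m.items()})
```

With these counts the sensitivity, specificity, and global predictive value round to the 84.7%, 47.1%, and 65.3% given in the text. The "either" rule can only add positive results relative to a single test, which is why it raises sensitivity and lowers specificity; the "both" rule has the opposite effect.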
Low agreement between FCSRT and CERAD-WL
Agreement between FCSRT and CERAD-WL verbal memory measures in this population was low, as measured by both the intraclass correlation coefficient (ICC) and Cohen’s kappa index (κ) (Supplementary Tables 2 and 3). Importantly, similar agreement rates were noted when using raw data (ICC) and dichotomized groups (κ), indicating that the low agreement was not related to the cut-offs used.
FCSRT and CERAD-WL for the prediction of MCI-AD
All verbal memory measures were lower in the MCI-AD group than in the MCI-nonAD group (Supplementary Table 1). Measures with the highest effect size for each test were the FCSRT delayed total recall (d = 0.976) and CERAD-WL trial 3 (d = 0.956). The model fit using forward stepwise logistic regression was best when the CERAD-WL trial 3 and FCSRT delayed total recall were included (Nagelkerke R² = 0.319). Models based on FCSRT measures improved significantly when CERAD-WL measures were added (see Table 3).
Logistic regression analyses to predict MCI-AD in MCI subjects
*Age, sex, and years of education were introduced first as covariates in all analyses; †, Lower scores indicate better model fit; In bold, p < 0.05. MCI-AD, mild cognitive impairment due to Alzheimer’s disease; CERAD-WL, Word list of the Consortium to Establish a Registry for Alzheimer’s Disease; FCSRT, Free and Cued Selective Reminding Test.
Use of FCSRT and CERAD-WL to predict conversion to AD dementia in MCI-AD
Of the 163 patients selected for longitudinal analysis, 78 (47.8%) had an AD CSF profile (MCI-AD). After a mean follow-up of 34.2±24.2 months, almost half of these patients (35/78, 44.9%) had progressed to AD dementia. Four patients labeled as MCI-nonAD (2.4%) at baseline converted to AD dementia during follow-up: two had borderline Aβ1–42 levels and two had a CSF profile compatible with suspected non-Alzheimer disease pathophysiology (SNAP) (for a detailed description of patients’ characteristics, see Supplementary Table 4). As shown in Fig. 1, the combination of FCSRT and CERAD-WL subtests defined different prognostic profiles in MCI-AD patients. After adjusting for age, sex, and education, MCI-AD patients who failed both FCSRT delayed free recall and CERAD-WL recall had a faster progression to AD dementia than patients who failed only one test (HR = 4.7; p = 0.018). Similarly, patients who failed both FCSRT delayed total recall and CERAD-WL recognition tests had a faster progression to AD dementia than patients who failed only one test (HR = 2.39; p = 0.019, Supplementary Table 5). As shown in Fig. 1, amnestic profiles did not differ in the non-amnestic global z-scores, suggesting that progression rates to AD dementia were not driven by global cognitive impairment (data not shown).

Survival plots for the progression to AD dementia in MCI-AD. A) Survival plots of FCSRT delayed free recall and CERAD-WL delayed recall amnestic profiles; B) Survival plots of FCSRT delayed total recall and CERAD-WL recognition amnestic profiles; *, p < 0.05 (Mantel-Cox); ns, no statistically significant difference; +, abnormal result according to the calculated cut-off; -, normal result according to the calculated cut-off; MCI-AD, mild cognitive impairment due to Alzheimer’s disease; FCSRT, Free and Cued Selective Reminding Test; CERAD-WL, Word list of the Consortium to Establish a Registry for Alzheimer’s Disease.
DISCUSSION
In this study, we found that the combination of FCSRT and CERAD-WL measures improves the identification of subjects with MCI-AD defined by CSF biomarkers. In addition, the combination of a semantically-cued test and a free list-learning test with a recognition task defined distinct prognostic profiles in MCI-AD.
We found a relatively low agreement between FCSRT and CERAD-WL, indicating that each of these tests measures different but overlapping aspects of declarative memory [4]. Although both tests evaluate free retrieval, the FCSRT could more closely reflect the hippocampal-mediated consolidation process because registration is ensured by semantic cueing. The CERAD-WL test, in contrast, includes a recognition task that involves both recollection and familiarity, functions that depend more on extra-hippocampal areas [8]. Previous studies have examined the neural substrate of the different components of episodic verbal memory using the RAVLT, a non-cued list-learning test with a delayed recognition task [8]. A study using MRI found a close relationship between RAVLT recognition and atrophy in perirhinal and entorhinal regions [33], while a study based on 18F-fluoro-D-glucose positron emission tomography found that delayed recall performance was associated with hypometabolism in orbitofrontal, cingulate-precuneus, and parietal areas [34]. Conversely, other imaging studies have shown that cued-recall measures of the FCSRT are related to medial temporal lobe atrophy, although other temporal and parietal regions have also been implicated [6, 36]. Moreover, a recent study has highlighted heterogeneity in amnestic FCSRT profiles among MCI-AD patients as defined by CSF core AD biomarkers [37]. Taken together, these data indicate that different verbal memory tests capture different aspects of episodic memory that reflect interrelated but distinct neural circuits.
Comparative data on the diagnostic performance of verbal memory tests in biomarker-supported or autopsy-proven patient samples are scarce, and most studies have relied on clinical diagnosis [10–13]. In the only comparable study so far, the authors found that the FCSRT performed better than the CERAD-WL in predicting MCI-AD [5]. Their sample size (185 patients) was similar to ours (202 patients) and they used a similar broad definition of MCI, but we were unable to replicate their findings. However, we identified some methodological differences that might explain the discrepant results. First, in our study we added both true positive and true negative responses to the CERAD-WL recognition final score, while Wagner et al. considered only true positive responses. Second, we included the CERAD-WL trial 3 score in the analysis but Wagner et al. did not. It is of note that both of these subtests appeared to be important predictors of MCI-AD in our study.
We observed that the FCSRT showed a slightly higher sensitivity than CERAD-WL in the diagnosis of MCI-AD (especially FCSRT total free recall and FCSRT total recall) but both tests had low specificities, reinforcing the notion that MCI is a heterogeneous syndrome [38, 39]. It is known that MCI can be caused by pathophysiological processes other than AD that target the medial temporal lobe structures. Hippocampal sclerosis, TDP-43 pathology, primary age-related tauopathy, and other non-neurodegenerative conditions may explain some of these MCI-nonAD cases [40–42]. Our results could have implications for clinical practice and clinical trials. In clinical practice, when selecting MCI patients for biomarker testing, FCSRT could be used first and then CERAD-WL could be applied if FCSRT is normal. This would maximize the sensitivity to detect MCI-AD in clinical practice. In a clinical trial, measures with high specificity, such as CERAD-WL recognition, could be used in order to minimize the number of screening failures due to biomarker-negative results. Further studies are needed to test whether the addition of a delayed recognition task to the FCSRT may help to increase specificity for the diagnosis of MCI-AD.
The progression rates to AD dementia during the longitudinal follow-up of the patients were similar to those in previous large collaborative studies [23, 43]. Of note, 4 patients with MCI-nonAD converted to AD dementia during follow-up, possibly due to an atypical CSF biomarker AD profile (SNAP or borderline Aβ1–42 levels) or phenocopies due to non-AD conditions [23, 41].
It is also of note in our study that MCI-AD patients with deficits in both FCSRT and CERAD-WL progressed to dementia faster than those with deficits in only one memory test. These data are in agreement with previous studies that have described different patterns of neurodegeneration in MCI-AD [14, 39]. We speculate that because FCSRT and CERAD-WL reflect slightly different neural substrates, MCI patients with deficits in both tests may have a higher pathology burden in AD-related structures and their disorder could therefore be at a more advanced stage in the disease continuum. Further imaging studies are needed to confirm this hypothesis.
This study has several limitations. First, we did not analyze measures of visual memory and we acknowledge that the combination of verbal and visual memory measures could further increase the ability to detect MCI-AD [17]. Second, the inclusion of a second verbal memory test could have caused interference when administering the two tests. To minimize this effect, we allowed a minimum interval between the two memory tests and included a non-verbal task between them. Nonetheless, interference is a potential limitation of all studies that introduce multiple declarative memory tests. Third, we defined MCI-AD using the t-Tau/Aβ1–42 ratio instead of decreased Aβ1–42 levels together with increased t-Tau or p-Tau. Our decision to do so was based on large neuropathological series in which the t-Tau/Aβ1–42 ratio has proven to be the most sensitive CSF biomarker for AD [26, 44]. Finally, the fact that we used broad criteria to define MCI could be seen as a limitation in a heterogeneous syndrome such as MCI. However, the frequency of MCI-AD was consistent with previous CSF-based studies of MCI populations [23, 43], which suggests that the MCI sample in our study is comparable to those of many previous studies.
In summary, we provide evidence that the classification of MCI-AD can be improved by combining FCSRT and CERAD-WL. The diagnostic performance of each test should be considered when MCI-AD is suspected and when selecting patients for biomarker testing in clinical trials. Importantly, the combination had prognostic implications. Our findings highlight the heterogeneity of the amnestic syndrome in the earliest symptomatic phase of AD and reinforce the need to use pathophysiological markers to diagnose AD at this stage.
Footnotes
ACKNOWLEDGMENTS
This work was supported by research grants from the Carlos III Institute of Health, Spain (grants PI11/02425 and PI14/01126 to Juan Fortea, grants PI10/1878 and PI13/01532 to Rafael Blesa, grants PI11/03035 and PI14/1561 to Alberto Lleó) and the CIBERNED program (Program 1, Alzheimer Disease to Alberto Lleó and SIGNAL study), partly funded by Fondo Europeo de Desarrollo Regional (FEDER), Unión Europea, “Una manera de hacer Europa”. This work has also been supported by a “Marató TV3” grant (20141210 to Juan Fortea) and by Generalitat de Catalunya (2014SGR-0235). I. Illán-Gala is supported by the i-PFIS grant (IF15/00060) from the FIS, Instituto de Salud Carlos III.
The authors thank the patients and their relatives for their support for this study. We also thank Laia Muñoz and Carolyn Newey for technical and editorial assistance.
