Abstract
Background:
The Memory Binding Test (MBT) demonstrated good cross-sectional discriminative validity and predicted incident aMCI.
Objective:
To assess whether the MBT predicts incident dementia better than a conventional list learning test in a longitudinal community-based study.
Methods:
As a sub-study in the Einstein Aging Study, 309 participants aged ≥70 who were initially free of dementia were administered the MBT and followed annually for incident dementia for up to 13 years. Based on previous work, poor memory binding was defined using an optimal empirical cut-score of ≤17 on the binding measure of the MBT, Total Items in the Paired condition (TIP). Cox proportional hazards models were used to assess predictive validity adjusting for covariates. We compared the predictive validity of MBT TIP to that of the free and cued selective reminding test free recall score (FCSRT-FR; cut-score: ≤24) and the single list recall measure of the MBT, Cued Recall from List 1 (CR-L1; cut-score: ≤12).
Results:
Thirty-five of 309 participants developed incident dementia. When assessing each test alone, the hazard ratio (HR) for dementia was significant for MBT TIP (HR = 8.58, 95% CI: (3.58, 20.58), p < 0.0001), FCSRT-FR (HR = 4.19, 95% CI: (1.94, 9.04), p = 0.0003) and MBT CR-L1 (HR = 2.91, 95% CI: (1.37, 6.18), p = 0.006). MBT TIP remained a significant predictor of dementia (p = 0.0002) when adjusting for FCSRT-FR or CR-L1.
Conclusions:
Older adults with poor memory binding as measured by the MBT TIP were at increased risk for incident dementia. This measure outperforms conventional episodic memory measures of free and cued recall, supporting the memory binding hypothesis.
INTRODUCTION
Early detection of individuals at risk for dementia is essential to the development of secondary prevention strategies [1, 2]. Memory binding deficits have been proposed as a cognitive marker for individuals at high risk of Alzheimer’s disease (AD) [3–6]. Memory binding refers to the ability to encode or retrieve independent information as part of a more complex unit [7]. Deficits in the binding of color and shape, probed by the Short-Term Memory Binding test, occur in asymptomatic carriers of a presenilin-1 mutation [4, 5]. The binding of face and name, probed by the Face-Name Associative Memory Exam, is related to amyloid burden in clinically normal elderly [3, 6]. However, the predictive validity of these tests with regard to longitudinal outcomes remains unknown.
Buschke and colleagues developed the Memory Binding Test (MBT), previously known as the Memory Capacity Test, to assess semantic binding. The MBT links the same set of semantic category cues to two distinct word lists [8]. Initially, the participant is instructed to learn the first list of 16 items, where the category cue is used to ensure controlled learning and encoding specificity. After all items from list 1 have been learned, Cued Recall from List 1 (CR-L1) is performed. Then the participant learns the second list of items using the same set of category cues, and Cued Recall from List 2 (CR-L2) is performed. The last step is the paired recall condition, in which the participant is asked to recall items from both lists when prompted with the category cue. This step yields the MBT binding measure, the Total number of Items in the Paired condition (TIP). The category cues used in the MBT are designed to control for attention and achieve controlled learning, encoding specificity and cued recall. We previously demonstrated that the memory binding measure, MBT TIP, distinguished persons with dementia and amnestic mild cognitive impairment (aMCI) from cognitively normal elderly in cross-sectional analyses and was highly predictive of incident aMCI [9, 10]. These findings for the binding measure support the memory binding hypothesis. However, it was unclear whether the MBT TIP was predictive of incident dementia and how it compared to conventional list learning tests.
The MBT utilizes two distinct word lists paired with one set of category cues (2 items per cue; see above and the Methods below), whereas the Free and Cued Selective Reminding Test (FCSRT) utilizes a single list of 16 items paired with one set of 16 category cues (1 item per cue), and the same list and cues are used repeatedly in all 3 trials of the FCSRT [11, 12]. The FCSRT is a conventional list learning test that measures episodic memory under controlled learning conditions and has demonstrated predictive validity for dementia. In the FCSRT retrieval phase, participants are given up to two minutes for free recall followed by cued recall for missed items. The total recall score is the sum of free and cued recall scores. The FCSRT free recall and total recall can both identify prevalent dementia and incident dementia [13–17]. The FCSRT free recall score is predictive of future dementia and AD [15, 18], and is correlated with hippocampal neuronal metabolism and volume [19]. Compared to total recall, the free recall score achieves a higher area under the receiver operating characteristic curve (ROC AUC) for the dementia versus no dementia comparison. Furthermore, when both free recall and cued recall are included in the same Cox proportional hazards model to predict incident dementia, only free recall remains significant [16]. In addition, total recall shows a ceiling effect in cognitively normal individuals [16, 20], so free recall on the FCSRT is the preferred measure in the context of dementia prediction.
Impairment on explicit semantic tasks among AD patients has been shown to result from inefficient retrieval and a partially degraded semantic network [21]. Growing evidence supports the hypotheses that semantic knowledge is stored in a complex distributed brain network and that degradation of the semantic systems starts as early as the preclinical AD stage [22–24]. Successful performance on the MBT also requires semantic function, attention and executive function. In this regard, it is important to evaluate whether subtle disturbances in non-memory cognitive domains (semantic systems, attention, and executive function) contribute to the prediction of incident dementia, and how they compare to the MBT TIP measure in the context of dementia prediction.
In the present study, our main goal was to test the hypotheses that the MBT binding measure, TIP, is predictive of incident dementia and outperforms both a conventional measure of episodic memory, the FCSRT free recall score [11, 12], and the MBT single list recall measure, the number of items Cued Recalled from List 1 (CR-L1). In addition, Rentz et al. found that poorer performance on CR-L2 in the 30-min delayed free recall condition was correlated with higher amyloid load in the precuneus brain region among cognitively normal participants [25]. Because we did not collect data on the delayed recall measures [9], we instead assessed whether the MBT immediate cued recall measure of the second list, CR-L2, was predictive of incident dementia. We further assessed whether tests probing non-memory cognitive domains were predictive of incident dementia and compared their performance to that of the MBT TIP.
METHODS
Participants
The present study includes 309 participants who were initially free of dementia and received the first administration of the MBT from May 2003 through December 2007. The current sample was obtained from a sub-study of the Einstein Aging Study (EAS), a population-based longitudinal study that examines predictors of cognitive decline among older adults [26]. As reported previously, these 309 participants constitute ∼40% of all the participants evaluated by the parent EAS study during that same time period, and were similar to the remaining participants in terms of demographic characteristics and cognitive performance [9]. Inclusion criteria for the present study were the same as the EAS inclusion criteria: age 70+ at enrollment, English speaking, community-residing, and ambulatory. Individuals were excluded if they had auditory, visual, or motor impairments too severe to perform the neuropsychological tests, or active psychiatric symptomatology that interfered with the completion of study assessments. Individuals with a history of stroke and other neurologic conditions were not excluded, resulting in a sample that was more representative of the underlying population. The study protocol was approved by the local Institutional Review Board and informed consent was obtained from each participant.
Memory Binding Test
Details of the MBT are described elsewhere [8–10]. Briefly, the MBT utilizes a list of 16 semantic category cues (for example, “flower”) to facilitate the learning of two distinct 16-item word lists (for example, “tulip” on the first list and “carnation” on the second), where the category cues are used to ensure controlled learning, encoding specificity, and cued recall. At the onset of test administration, the participant was instructed to learn the first list of 16 word items. Four words at a time were visually presented on a card. Participants were orally presented with category cues and asked to identify the corresponding item from the first list. After presentation of the first list, study cards were removed and participants were asked to retrieve all 16 words in response to category cues, yielding the index called CR-L1 (Cued Recall – List 1: range 0–16), a single list recall measure. Then a second list was presented in the same way as the first list, followed by cued recall for the second list, yielding the number of items Cued Recalled from List 2 (CR-L2, range 0–16). As the last task, in a condition called the paired recall, the same category cues were orally presented again in the same order, and the participant was asked to recall the items from both lists in relation to the appropriate category cue. The paired condition yields two measures that probe binding: the number of Pairs cued recalled In the Paired condition (PIP, range 0–16; requiring that both items are retrieved correctly for a category cue) and the TIP (Total Items in the Paired condition), which counts all the correctly recalled items from both lists, paired and unpaired (range 0–32). TIP equals “PIP×2 + the number of unpaired items recalled”. Total test administration time was approximately six minutes.
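The scoring rule for the paired condition can be illustrated with a short sketch. The function name `score_paired_condition` and the data layout (a mapping from each category cue to two flags, one per list) are assumptions made only for illustration and are not part of the published test materials; the participant data are invented.

```python
from typing import Dict, Tuple


def score_paired_condition(recalled: Dict[str, Tuple[bool, bool]]) -> Dict[str, int]:
    """Score the MBT paired recall condition.

    `recalled` maps each category cue to a pair of flags:
    (list-1 item recalled, list-2 item recalled).
    This layout is hypothetical and chosen only for illustration.
    """
    # PIP: cues for which BOTH items were recalled (range 0-16)
    pip = sum(1 for first, second in recalled.values() if first and second)
    # Unpaired items: cues for which exactly ONE of the two items was recalled
    unpaired = sum(1 for first, second in recalled.values() if first != second)
    # TIP = PIP x 2 + number of unpaired items recalled (range 0-32)
    tip = 2 * pip + unpaired
    return {"PIP": pip, "unpaired": unpaired, "TIP": tip}


# Invented participant: both items for 6 cues, one item for 5 cues, none for 5 cues
example = {f"cue{i}": (True, True) for i in range(6)}
example.update({f"cue{i}": (True, False) for i in range(6, 11)})
example.update({f"cue{i}": (False, False) for i in range(11, 16)})

scores = score_paired_condition(example)
# Poor memory binding as defined in this study: TIP <= 17
poor_binding = scores["TIP"] <= 17
```

In this invented case the participant recalls 6 pairs and 5 unpaired items, so TIP is 17 and falls at the cut-score used in the present study.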
Previously we showed that the MBT TIP, a measure of memory binding, had better validity for the cross-sectional discrimination of persons with dementia from persons without dementia than the other MBT indices, including PIP [9]. The optimal TIP cut-score of 17 was chosen by maximizing the sum of sensitivity and specificity for distinguishing dementia from non-dementia (sensitivity = 0.95, specificity = 0.86) in previous cross-sectional work [9]; therefore, in the present study we defined poor memory binding using the cut-score of MBT TIP≤17. The cut-scores of CR-L1≤12 and CR-L2≤7 were derived from the same previous cross-sectional work in the same way as the MBT TIP cut-score [9].
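As an illustration of how a cut-score can be chosen by maximizing the sum of sensitivity and specificity, the following is a minimal sketch. The function name `best_cut_score` and the scores are invented for this example; they are not the data or the code used in the earlier cross-sectional work [9].

```python
from typing import Iterable, Tuple


def best_cut_score(case_scores: Iterable[int],
                   control_scores: Iterable[int],
                   candidates: Iterable[int]) -> Tuple[int, float, float]:
    """Return (cut, sensitivity, specificity) maximizing sensitivity + specificity.

    A score <= cut counts as test-positive (low scores indicate impairment).
    Ties are resolved in favor of the first (lowest) candidate encountered.
    """
    cases = list(case_scores)
    controls = list(control_scores)
    best = None
    for cut in candidates:
        sensitivity = sum(s <= cut for s in cases) / len(cases)
        specificity = sum(s > cut for s in controls) / len(controls)
        if best is None or sensitivity + specificity > best[1] + best[2]:
            best = (cut, sensitivity, specificity)
    return best


# Invented TIP scores (range 0-32) for persons with and without dementia
toy_cases = [8, 11, 14, 17, 23]
toy_controls = [18, 21, 24, 26, 29, 31]

cut, sens, spec = best_cut_score(toy_cases, toy_controls, range(0, 33))
```

With these invented scores the selected cut-score happens to be 17; in real data the result depends entirely on the score distributions of the two groups.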
Clinical and neuropsychological assessments
Participants were evaluated at baseline and at annual follow-up visits. For the present study, baseline was defined as the first administration of the MBT from May 2003 to December 2007. Analyses included follow-up visits through January 2017. In addition to the MBT, demographics, medical history, and health status were assessed, and participants received a clinical neuropsychological test battery and a standard neurologic evaluation by a study clinician, as described previously [26].
Assessing education is challenging for elderly participants who attended school decades ago and whose school systems may differ, so in addition to collecting information on years of education, we administered the Wide Range Achievement Test – Third Edition (WRAT-3) reading subtest to each participant [27]. The WRAT-3 reading subtest measures one’s ability to recognize and pronounce words of increasing complexity and unfamiliarity. The WRAT-3 grade score represents the number of years of education that is equivalent to the achieved reading level; it is a widely employed measure of literacy and taps education quality rather than merely years of education [28].
We assessed functional status by clinical evaluation, standardized questionnaires directed to study participants and informants (if available) [29], and instrumental activities of daily living (IADL) [30]. Depressive symptoms were measured by the 15-item Geriatric Depression Scale (GDS, range: 0–15) with higher scores indicating more depressive symptoms [31]. The cumulative number of cardiovascular disease (CVD) events including myocardial infarction, stroke and angioplasty was self-reported at each annual visit.
We assessed global cognitive function using the Blessed Information-Memory-Concentration test (BIMC, range: 0–33) with higher scores indicating poorer performance [32, 33]. Episodic memory was assessed by the FCSRT free recall score (FCSRT-FR, range: 0–48) [11, 12] and the Logical Memory I subtest of the Wechsler Memory Scale – Revised (LM-I, range: 0–50) [34], with higher scores indicating better performance on both tests. Subjective memory impairment was ascertained based on self- or informant-reported information on standardized questionnaires [29], the Informant Questionnaire on Cognitive Decline in the Elderly (IQCODE) [35] and the EAS Health Self-Assessment (HSA) Questionnaire [36]. Language and associated semantic memory, attention, and executive function abilities were assessed by the Category Fluency task (animals, vegetables, and fruits) [37], and confrontational naming was assessed by the Boston Naming Test [38]. Attention was assessed by the Trail Making Test part A [39] and the Digit Span subtest of the Wechsler Adult Intelligence Scale-III [40]. Visuospatial construction was assessed by the Block Design subtest and psychomotor speed and attention were assessed by the Digit Symbol subtest, both from the Wechsler Adult Intelligence Scale-III [40]. Executive function was assessed by the Trail Making Test part B [39] and the Letter Fluency “FAS” task [41].
Diagnostic criteria
Diagnoses were assigned at consensus case conferences attended by a neurologist and a neuropsychologist who did not have access to MBT results. The primary outcome of this study was incident dementia. A dementia diagnosis was assigned based on the criteria from the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition [42], which required impairment in objective memory, impairment in one or more non-memory domains, and significant functional decline. Objective memory impairment was defined as FCSRT-FR≤24 [15] or a score 1.5 standard deviations (SD) or more below the age-adjusted mean on LM-I. Functional decline was determined at case conferences based on information from clinical evaluation, self- or informant-reports on the standardized questionnaires, and impairment score on the IADL [30]. Common dementia subtype assignments included probable or possible AD based on the clinical criteria established by the National Institute of Neurological and Communicative Disorders and Stroke and the Alzheimer’s Disease and Related Disorders Association (NINCDS-ADRDA criteria) [43], and probable or possible vascular dementia [44]. When more than one etiology was clinically evident, multiple diagnoses were assigned. An aMCI diagnosis was assigned based on revised Petersen criteria [45], which required the presence of objective memory impairment as defined above based on the FCSRT-FR and LM-I, the presence of subjective memory impairment based on self- or informant report, the absence of functional decline, and the absence of dementia. In the EAS, persons with aMCI could also have impairment in the non-memory cognitive domains of language, attention, visuospatial, and executive function.
A non-amnestic MCI (naMCI) diagnosis was assigned to participants who did not meet criteria for dementia, had no functional decline, did not meet the criteria for memory impairment, but had impairment (1.5 SD or more below the age-adjusted mean) in ≥1 non-memory domain using the tests for each domain described above. Non-memory domains supporting a diagnosis of naMCI included language, attention, visuospatial, and executive function. Individuals who did not meet any of these criteria were diagnosed as cognitively normal.
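The hierarchy of these criteria can be summarized in a simplified sketch. This is only an illustration of the stated rules: actual diagnoses were assigned by clinicians at consensus case conferences, and the class `Assessment`, its field names, and the function `classify` are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass
class Assessment:
    # All field names are hypothetical, introduced for illustration only
    fcsrt_fr: int                      # FCSRT free recall score (0-48)
    lmi_z: float                       # LM-I z-score relative to the age-adjusted mean
    n_nonmemory_impaired: int          # non-memory domains >= 1.5 SD below the mean
    subjective_memory_impairment: bool
    functional_decline: bool


def objective_memory_impairment(a: Assessment) -> bool:
    # FCSRT-FR <= 24, or LM-I 1.5 SD or more below the age-adjusted mean
    return a.fcsrt_fr <= 24 or a.lmi_z <= -1.5


def classify(a: Assessment) -> str:
    """Apply the stated criteria in order; a simplification of consensus diagnosis."""
    memory_impaired = objective_memory_impairment(a)
    if memory_impaired and a.n_nonmemory_impaired >= 1 and a.functional_decline:
        return "dementia"
    if memory_impaired and a.subjective_memory_impairment and not a.functional_decline:
        return "aMCI"
    if not memory_impaired and not a.functional_decline and a.n_nonmemory_impaired >= 1:
        return "naMCI"
    return "cognitively normal"


# Invented examples
labels = [
    classify(Assessment(20, 0.0, 1, True, True)),     # memory + non-memory + decline
    classify(Assessment(22, -0.2, 0, True, False)),   # memory + complaint, no decline
    classify(Assessment(30, 0.1, 2, False, False)),   # non-memory impairment only
    classify(Assessment(35, 0.5, 0, False, False)),   # no impairment
]
```

The sketch deliberately omits clinical judgment, informant reports, and etiologic subtyping, all of which entered the actual case-conference decisions.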
Statistical plan
Wilcoxon-Mann-Whitney tests were used to compare continuous variables between groups, while Pearson’s Chi-square tests and Fisher’s exact tests were used to compare categorical variables. Time to incident dementia was defined as time from the first administration of the MBT to the time of incident dementia. Kaplan-Meier survival curves and Log-rank tests were used to examine whether MBT TIP at baseline was predictive of incident dementia. Cox proportional hazards models were used to adjust for covariates including age, sex, baseline WRAT-3 grade score (for assessment of education), and GDS depressive symptom score, where age was used as the time scale using the delayed entry approach. The proportional hazards assumption was assessed by testing whether there was a non-zero slope in the plots of the scaled Schoenfeld residuals over functions of time [46, 47]. To assess the risk for various prediction time windows, we restricted the maximum follow-up length to two years, three years, ... , and six years, separately, to fit these Cox models, where the incident dementia events and follow-up durations beyond a specific time window were censored at the end of that time window. We did not assess the seven-year prediction window because there were no new dementia cases beyond six years in the group with poor memory binding (MBT TIP≤17) [9]. FCSRT-FR (optimal empirical cut-score of ≤24) [15] and the MBT single list measure, CR-L1 (cut-score: ≤12) [9], were assessed in the same way as MBT TIP. To test whether the MBT provided additional predictive validity beyond that of a conventional test, we also assessed whether MBT TIP was predictive of incident dementia when adjusting for FCSRT-FR or CR-L1. The MBT CR-L2 and tests from non-memory cognitive domains were evaluated in the same way. For sensitivity analyses, we assessed the predictive validity of various cut-scores on MBT TIP and CR-L1. We also replaced WRAT-3 grade score with years of education in the Cox model.
We further adjusted the Cox model for the number of cumulative CVD events because vascular burden is a risk factor for cognitive decline [48]. While all-cause dementia was the outcome in the main analyses, we repeated the analyses using AD as an outcome to test whether the MBT was predictive of AD. In the analyses described above, we included prevalent aMCI cases at baseline to predict incident dementia because individuals with aMCI were community-residing individuals who may be at risk for dementia. As part of sensitivity analyses, we excluded prevalent aMCI cases from baseline and assessed whether the MBT was still predictive of dementia.
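The data preparation behind the time-window analyses and the age time scale can be sketched as follows. This is a minimal illustration under stated assumptions, not the study code: the names `Record`, `AgeScaleRecord`, and `restrict_to_window` are hypothetical, the cohort is invented, and the model fitting itself (a Cox model with delayed entry) is left to a survival analysis package.

```python
from typing import NamedTuple


class Record(NamedTuple):
    baseline_age: float     # age at first MBT administration (years)
    followup_years: float   # time from baseline to dementia or last visit
    dementia: bool          # True if incident dementia was diagnosed


class AgeScaleRecord(NamedTuple):
    entry_age: float        # delayed entry: participant enters the risk set at this age
    exit_age: float         # age at event or censoring
    event: bool


def restrict_to_window(rec: Record, window_years: float) -> AgeScaleRecord:
    """Censor follow-up at the end of the prediction window, then move to the age scale."""
    if rec.followup_years > window_years:
        # Events and follow-up beyond the window are censored at the window's end
        followup, event = window_years, False
    else:
        followup, event = rec.followup_years, rec.dementia
    return AgeScaleRecord(rec.baseline_age, rec.baseline_age + followup, event)


# Invented cohort of three participants
cohort = [
    Record(75.0, 1.5, True),    # event inside a 3-year window
    Record(80.0, 4.25, True),   # event after 3 years: censored at 3 years
    Record(72.0, 7.0, False),   # no event: censored at 3 years
]
three_year = [restrict_to_window(r, 3.0) for r in cohort]
```

Each resulting record carries an entry age and an exit age, which is the layout a Cox model with age as the time scale and left truncation at study entry would require.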
RESULTS
Among the 309 participants who were free of dementia at baseline, 35 participants developed incident dementia during 1,284 person-years of follow-up (Table 1). Among the 35 all-cause incident dementia cases, 23 were diagnosed as AD alone, 5 as AD with vascular dementia, 5 as vascular dementia, and the remaining 2 as dementia with functional decline. Compared to the participants who remained free of dementia during follow-up, the incident dementia cases were on average 1.7 years older (p = 0.05) at baseline. The incident dementia group had worse global cognitive function indicated by higher BIMC scores (p = 0.02), lower performance on all four MBT indices at baseline (p < 0.001), and poorer baseline performance on FCSRT-FR, LM-I, category fluency, letter fluency, block design, and trail making part A and part B. The two groups were similar with respect to gender, race, education, WRAT-3 grade scores, GDS depressive symptom scores, self-reported cumulative CVD events, digit symbol, digit span, Boston naming, IADL at baseline, and study follow-up time.
Baseline characteristics for the study sample and for groups stratified by incident dementia status at follow-up
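As a quick arithmetic check on the counts reported above, the following sketch computes the crude incidence rate and confirms that the subtype counts add up. The variable names are introduced here for illustration; the crude rate is a simple derived quantity and is not a figure reported by the study.

```python
# Counts taken from the text above
n_incident_dementia = 35
person_years = 1284

subtype_counts = {
    "AD alone": 23,
    "AD with vascular dementia": 5,
    "vascular dementia": 5,
    "other dementia": 2,
}

# Subtype counts should sum to the number of incident cases
total_from_subtypes = sum(subtype_counts.values())

# Incident AD cases (with or without vascular dementia), as used in the sensitivity analysis
n_incident_ad = subtype_counts["AD alone"] + subtype_counts["AD with vascular dementia"]

# Crude incidence rate per 100 person-years (derived here, not reported in the paper)
rate_per_100_py = 100 * n_incident_dementia / person_years
```

The crude rate works out to roughly 2.7 cases per 100 person-years, and the 28 AD cases match the number used when AD is analyzed as the outcome.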
WRAT-3, Wide Range Achievement Test-Third Edition reading subtest grade score; GDS, Geriatric Depression Scale; CVD, cardiovascular disease; BIMC, Blessed Information-Memory-Concentration test; MBT, Memory Binding Test; CR-L1, Number of items Cued Recalled from List 1 on the MBT; CR-L2, Number of items Cued Recalled from List 2 on the MBT; PIP, Number of Pairs cued recalled In the Paired condition on the MBT; TIP, Total number of Items cued recalled in the Paired condition on the MBT; FCSRT-FR, the Free and Cued Selective Reminding Test free recall score; LM-I, Logical Memory I; IADL, Instrumental Activities of Daily Living. (a) Wilcoxon-Mann-Whitney tests were used to compare the continuous variables, while Pearson’s Chi-square test and Fisher’s exact test were used to compare categorical variables. (b) We used FCSRT-FR and LM-I to define objective memory impairment for the clinical diagnosis of dementia at the case conferences.
MBT TIP
Participants with poor memory binding at baseline were less educated, had more depressive symptoms, worse global cognitive function and poorer performances on FCSRT-FR and LM-I, other neuropsychological tests and IADL (Supplementary Table 1). Kaplan-Meier curves differed between groups (Log-rank test, p < 0.0001) (Fig. 1A) and TIP was highly predictive of incident dementia (hazard ratio (HR) = 8.58, 95% CI: (3.58, 20.58), p < 0.0001) (Table 2). When assessing various time windows, HR was high (range: 17.0–32.1, p < 0.0001) although it decreased with longer time windows (Table 3).
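The reported hazard ratio and confidence interval can be checked for internal consistency with a short sketch. It assumes that the interval is a Wald interval computed on the log hazard ratio scale, which is common for Cox models but is an assumption here rather than something stated in the text; the function name `wald_from_ci` is hypothetical.

```python
import math
from typing import Tuple


def wald_from_ci(hr: float, lower: float, upper: float,
                 z_crit: float = 1.96) -> Tuple[float, float, float]:
    """Recover the standard error, z statistic and two-sided p-value from a 95% CI.

    Assumes a Wald interval on the log scale: log(HR) +/- z_crit * SE.
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    # Two-sided p-value under the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Values reported for MBT TIP: HR = 8.58, 95% CI: (3.58, 20.58)
se, z, p = wald_from_ci(8.58, 3.58, 20.58)

# The geometric mean of the CI limits should be close to the point estimate
geometric_midpoint = math.sqrt(3.58 * 20.58)
```

Under this assumption the standard error of the log hazard ratio is about 0.45 and the p-value falls below 0.0001, in line with the reported result. The wide interval reflects the modest number of events.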

Kaplan-Meier survival curves as a function of baseline MBT TIP, MBT CR-L1, and FCSRT-FR. MBT, Memory Binding Test; TIP, Total number of Items cued recalled in the Paired condition on the MBT; CR-L1, Number of items Cued Recalled from List 1 on the MBT; FCSRT-FR, the Free and Cued Selective Reminding Test free recall score.
Memory Binding Test (MBT) as a predictor for incident dementia: Cox proportional hazards model estimates
HR, Hazard Ratio; MBT, Memory Binding Test; TIP, Total number of Items cued recalled in the Paired condition on the MBT; FCSRT-FR, the Free and Cued Selective Reminding Test free recall score; WRAT-3, Wide Range Achievement Test – Third Edition reading subtest grade score; GDS, Geriatric Depression Scale; CR-L1, Number of items Cued Recalled from List 1 on the MBT; CR-L2, Number of items Cued Recalled from List 2 on the MBT. (a) Model 1: MBT TIP was assessed as the main predictor; Model 2: FCSRT-FR was the main predictor; Model 3: MBT TIP and FCSRT-FR were both assessed in the same model; Model 4: MBT CR-L1 was the main predictor; Model 5: TIP and CR-L1 were both assessed in the same model. (b) Age was used as the time scale in these models. (c) Baseline deficit on memory binding was defined as MBT TIP≤17. (d) Baseline deficit on FCSRT was defined as FCSRT-FR≤24. (e) WRAT-3 was used instead of years of education because it is highly correlated with years of education and may be a better proxy for education. (f) Baseline deficit on single list recall was defined as MBT CR-L1≤12. (g) Baseline deficit on CR-L2 was defined as MBT CR-L2≤7.
MBT TIP versus FCSRT-FR
Kaplan-Meier curves for incident dementia differed between groups with and without memory impairment based on an empirical cut-score of ≤24 on the FCSRT-FR (Log-rank test, p < 0.0001) (Fig. 1C). Like MBT TIP, FCSRT-FR alone was predictive of incident dementia (Table 2A): HR = 4.19, 95% CI: (1.94, 9.04), p = 0.0003. When both MBT TIP and FCSRT-FR were included in the same model, HRs were attenuated while remaining significant for both: HR for MBT TIP: 5.73, 95% CI: (2.30, 14.28), p = 0.0002, and HR for FCSRT-FR: 2.76, 95% CI: (1.23, 6.21), p = 0.01, indicating that MBT TIP was predictive of dementia even when adjusting for FCSRT-FR. When assessing various time windows, HRs for TIP were numerically higher than those for FCSRT-FR (Table 3).
Predictive validity of the Memory Binding Test (MBT) for incident dementia for various time windows (a, b)
MBT, Memory Binding Test; TIP, Total number of Items cued recalled in the Paired condition on the MBT; FCSRT-FR, the Free and Cued Selective Reminding Test free recall score; CR-L1, Number of items Cued Recalled from List 1 on the MBT; HR, hazard ratio. (a) Hazard Ratio (HR) and 95% confidence interval for developing incident dementia when the maximum follow-up length was restricted to 2 years, 3 years, 4 years, 5 years, and 6 years, respectively. (b) Hazard ratios were estimated from the Cox proportional hazards model with adjustment for age, sex, WRAT-3 grade score for assessment of education, and GDS depression score. WRAT-3 was used instead of years of education because it is highly correlated with years of education and may be a better proxy for education.
MBT TIP versus MBT CR-L1
Kaplan-Meier curves were significantly different using an empirical cut-score of ≤12 on the MBT CR-L1 (Log-rank test, p < 0.0001) (Fig. 1B). The HR for TIP was higher than the HR for CR-L1 (HR = 2.91, 95% CI: (1.37, 6.18), p = 0.006) (Table 2B). When both TIP and CR-L1 were included in the same model, the HR for TIP was attenuated but remained significant, while CR-L1 lost significance (p = 0.56). When assessing various time windows, HRs for TIP were higher than those for CR-L1 (Table 3).
MBT TIP versus MBT CR-L2
Table 2C shows that MBT CR-L2 was predictive of incident dementia, HR = 6.05, 95% CI: (2.49, 14.70), p < 0.0001. However, when both TIP and CR-L2 were included in the model, CR-L2 lost its significance (p = 0.19).
MBT TIP versus tests of non-memory cognitive domains
As shown in Table 4, when each test was examined as the main predictor (Model A), digit symbol, digit span, Boston naming, trail making part A and IADL were not significant (p > 0.05), while category fluency, letter fluency, block design, and trail making part B were significant. When MBT TIP was also included in the model (Model B), each of these four tests remained significant while the effects of MBT TIP were attenuated, with category fluency, letter fluency, and trail making part B meaningfully impacting the HR of MBT TIP (>25% reduction). When we included category fluency, letter fluency, block design and trail making part B along with MBT TIP in the same model, MBT TIP remained significant, HR = 4.30, 95% CI: (1.59, 11.61), p = 0.004, while category fluency was significant, HR = 0.95, 95% CI: (0.90, 0.999), p = 0.047, and letter fluency, block design, and trail making part B lost significance (p = 0.43, 0.23, and 0.48, respectively).
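The size of an attenuation in the hazard ratio can be expressed in more than one way, and the text does not state which scale underlies the ">25% reduction" criterion. The sketch below, with hypothetical function names, computes the reduction both on the raw HR scale and on the log HR (coefficient) scale, using hazard ratios reported in the text.

```python
import math


def pct_reduction_raw(hr_before: float, hr_after: float) -> float:
    """Percent reduction on the raw hazard ratio scale."""
    return 100 * (hr_before - hr_after) / hr_before


def pct_reduction_log(hr_before: float, hr_after: float) -> float:
    """Percent reduction on the log hazard ratio (Cox coefficient) scale."""
    return 100 * (1 - math.log(hr_after) / math.log(hr_before))


# MBT TIP alone: HR = 8.58
# MBT TIP adjusted for FCSRT-FR: HR = 5.73
# MBT TIP adjusted for the four non-memory tests together: HR = 4.30
raw_fcsrt = pct_reduction_raw(8.58, 5.73)
raw_nonmemory = pct_reduction_raw(8.58, 4.30)
log_fcsrt = pct_reduction_log(8.58, 5.73)
log_nonmemory = pct_reduction_log(8.58, 4.30)
```

On the raw scale the reductions are about 33% and 50%, respectively; on the log scale they are smaller, so the choice of scale matters when a fixed threshold such as 25% is applied.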
Predictive validity of tests for the non-memory cognitive domains: Estimates from Cox proportional hazards regression
HR, hazard ratio; CI, confidence interval; MBT, Memory Binding Test; TIP, Total number of Items cued recalled in the Paired condition on the MBT; IADL, Instrumental Activities of Daily Living. (a) Hazard ratios were estimated from the Cox proportional hazards model with adjustment for age, sex, WRAT-3 grade score for assessment of education, and GDS depression score. WRAT-3 was used instead of years of education because it is highly correlated with years of education and may be a better proxy for education.
Sensitivity analyses
Various cut-scores of MBT TIP were predictive of dementia (p < 0.01) (Table 5). Various cut-scores of MBT CR-L1 with sensitivity values in the equivalent range were also assessed, although not all cut-scores were predictive and HRs were relatively lower. When years of education was substituted for WRAT-3 grade score in the Cox model, MBT TIP was still highly predictive of dementia (HR = 7.92, 95% CI: (3.49, 17.96), p < 0.0001). When further adjusting for the cumulative number of CVD events, MBT TIP remained highly predictive (HR = 8.58, 95% CI: (3.58, 20.57), p < 0.0001). When using AD as an outcome, there were 28 incident AD cases, including 23 without vascular dementia and 5 with vascular dementia. The HR for MBT TIP was 11.90, 95% CI: (4.51, 31.37), p < 0.0001, higher than the HR in the analyses with all-cause dementia as outcome, HR = 8.58, 95% CI: (3.58, 20.58), p < 0.0001. The conclusions from comparing MBT TIP with FCSRT or MBT CR-L1 also held with AD as the outcome. When we excluded 31 prevalent aMCI cases from baseline (Supplementary Table 2, Supplementary Figure 1), 26 participants developed incident dementia during follow-up and MBT TIP was predictive of dementia (HR = 4.85, 95% CI: (1.54, 15.22), p = 0.007) while FCSRT-FR was not (p = 0.08). The MBT TIP remained significant (HR = 4.05, 95% CI: (1.25, 13.12), p = 0.02) when adjusting for FCSRT-FR (p = 0.20).
Prediction of incident dementia using various cut-scores of MBT TIP
MBT, Memory Binding Test; TIP, Total number of Items cued recalled in the Paired condition on the MBT; CR-L1, Number of items Cued Recalled from List 1 on the MBT. (a) The cross-sectional discriminative validity data were taken from an earlier paper. (b) The sum of sensitivity and specificity. The cut-scores of CR-L1≤12 and TIP≤17 were chosen by maximizing the sum of sensitivity and specificity within each index, respectively. (c) Hazard ratios were estimated from the Cox proportional hazards model with adjustment for age, sex, WRAT-3 grade score for assessment of education, and GDS depression score. WRAT-3 was used instead of years of education because it is highly correlated with years of education and may be a better proxy for education. (d) MBT(+) was defined as baseline MBT CR-L1 or TIP score less than or equal to the cut-score; MBT(–) was defined as baseline MBT CR-L1 or TIP score greater than the cut-score.
DISCUSSION
We previously demonstrated that the MBT TIP had good discriminative validity for distinguishing study participants with dementia and aMCI from cognitively normal elderly in cross-sectional analyses [9]. MBT TIP scores also predicted incident aMCI longitudinally [10]. In the present study, we demonstrated that the MBT binding measure, TIP, was predictive of incident dementia. Predictive validity was compared to that of the MBT single list recall measure, CR-L1, and more importantly, the commonly-used FCSRT-FR scores. The MBT TIP at baseline was associated with incident dementia in models including FCSRT-FR or MBT CR-L1. The FCSRT-FR remained predictive with MBT TIP in the same model. When MBT CR-L2 and MBT TIP were included in the same model, MBT TIP remained significant while CR-L2 lost its significance. Category fluency, letter fluency, block design and trail making part B were each predictive of dementia when assessed alone. When adjusting for all four of these tests, MBT TIP remained significant and, among the four tests, only category fluency remained significant. Various sensitivity analyses supported the robustness of these findings.
We were previously unable to compare the MBT to FCSRT for aMCI prediction [10] because the EAS aMCI diagnosis relies heavily on the FCSRT. But in this report, because dementia is a distal outcome, we chose to compare the MBT to FCSRT. The diagnosticians at our case conference had access to the FCSRT-FR scores but not to the MBT. This could create a bias which favors the FCSRT-FR over the MBT. In the present investigation, we showed, despite this potential bias, that the MBT TIP was an independent predictor in models including FCSRT-FR.
Compared to other binding tests including the Short-Term Memory Binding [4, 5] and the Face-Name Associative Memory Exam tests [3, 6] that probe binding of information from different domains (color and shape, face and name, respectively), the MBT focused on binding in the same semantic domain. The MBT was designed to probe the memory binding function using two word lists that share the same category cues. It involves deep semantic processing which ensures encoding specificity and elicits maximum retrieval through controlled learning and cued recall. The paired recall condition that probes binding is inherently more challenging than recall of unpaired word list(s) [49]. Successful performance on the MBT also requires functions from other neurocognitive domains. Compared to the participants remained free of incident dementia during follow-up, participants who developed incident dementia had lower baseline scores on category fluency, letter fluency, block design, trail making test parts A and B (Table 1). Cox regression models show that trail making part A was not predictive of dementia while category fluency, letter fluency, block design and trail making part B were significant. When these four tests were adjusted for in the same model, the HR for MBT TIP was attenuated but still significant while only category fluency remained significant. These results indicate that there may be subtle disturbances in attention and executive function at baseline but they were minor contributors to the prediction of future dementia compared to binding and subtle semantic disturbances. The present study cannot distinguish whether the MBT assesses binding or merely a greater semantic burden, which needs further research.
These results suggest that the MBT may serve as a neurocognitive marker for early detection of dementia. Clinicians interested in identifying persons at high risk for dementia might consider using the MBT. Future studies should investigate the TIP measure as a predictor of biomarkers for AD and assess the separate and joint predictive validity of cognitive and biological markers [25, 50].
One limitation of this study was the relatively small sample size, though the results were highly statistically significant. In addition, as only 40% of the total participants from the EAS during the study period received the MBT, there is potential for selection bias. However, the sub-study sample did not differ from the parent EAS study in terms of demographic characteristics or cognitive variables, making participation bias unlikely [9]. A third potential limitation is that the cut-scores used for longitudinal analyses were based on cross-sectional results [9]. We used these pre-specified cut-scores to avoid capitalizing on chance by defining within-sample cut-scores. Applying the MBT to additional samples of older participants representing different types of cohorts would be helpful. Fourth, our outcome was all-cause dementia rather than AD alone. We chose to study all-cause dementia because most dementia in older adults is of mixed pathology, although AD pathology was more common [51, 52]. The sensitivity analysis with AD as an outcome supports the hypothesis that binding deficits predict AD better than dementia of other causes, and using all-cause dementia as the outcome was a conservative approach. Finally, as mentioned in the Methods section, participants with auditory, visual, or motor impairments that prevented them from completing the MBT were excluded. Although no participant in this study had dementia at baseline, we have successfully administered the MBT to patients with mild dementia [9]. We expect that the test may not be suitable for patients with moderate or severe dementia; this may not be a major concern because the MBT was developed as a challenging test for early detection.
In summary, the present study demonstrated that the MBT was predictive of incident dementia. This conclusion was drawn from a population-based prospective longitudinal cohort study in which participants were systematically evaluated, so it has good generalizability. Compared to imaging or cerebrospinal fluid markers, the MBT is non-invasive and less expensive to administer. It is a short test that takes only a few minutes. A Spanish version of the test has been developed and validated [53–55]. With these desirable properties, the MBT may have the potential to emerge as a behavioral marker for early detection of dementia.
Footnotes
ACKNOWLEDGMENTS
The authors thank Drs. Cuiling Wang and Charles B. Hall for their advice. We also thank the participants, investigators, and staff for this study. This work was supported by the National Institute on Aging at the National Institutes of Health (grant number P01 AG03949).
Albert Einstein College of Medicine owns the copyright for the Memory Binding Test (MBT) and the Free and Cued Selective Reminding Test, makes them available free of charge for academic research, and licenses them for commercial use. Dr. Buschke receives royalties from Albert Einstein College of Medicine when the memory tests, including the Free and Cued Selective Reminding Test and the Memory Binding Test, are used for commercial purposes.
