Abstract
INTRODUCTION
Alzheimer’s disease (AD) is a devastating disorder that results in subtle cognitive and biological changes in the brain long before a formal diagnosis of dementia, or even mild cognitive impairment (MCI), can be established [1, 2]. Identifying early cognitive and functional impairment is therefore increasingly critical for detecting the subtle changes that occur in preclinical disease states. While delayed recall and rate of forgetting were previously considered the hallmark cognitive features of medial temporal lobe dysfunction in AD [3], it has become increasingly recognized that other paradigms are more sensitive and effective in identifying deficits and can have positive predictive value for progression to AD over time. For example, measures of initial learning may be even more sensitive than delayed recall in the identification of MCI [4, 5]. Our group has also shown, through the development of the Semantic Interference Test (SIT), that vulnerability to proactive semantic interference (pSI) is a more sensitive indicator of early MCI, and also more predictive of progression to dementia, than traditional measures such as delayed memory for passages, visual reproduction, Trails B, and category fluency [6]. Measures that are susceptible to pSI were found to be highly associated with amyloid load among community-dwelling elders [7]. This led to the development of a more refined measure, the Loewenstein-Acevedo Scales for Semantic Interference and Learning (LASSI-L), which employs an active encoding strategy to organize the learning of 15 target words within three semantic categories (fruits, clothing, and musical instruments). Immediately thereafter, a second list of semantically matched targets is presented.
Recent studies found that a) cued recall on the LASSI-L using different semantic categories was more discriminative between early MCI and dementia than free recall; b) LASSI-L measures susceptible to proactive interference were the most highly correlated with medial temporal lobe atrophy on MRI; c) cued recall for interference trials produced an extremely high number of semantic intrusions that were highly diagnostic of early AD [8, 9]; and d) amyloid load was more highly related to proactive semantic interference (pSI) relative to traditional neuropsychological measures of memory and non-memory function among elderly participants who were cognitively normal [10].
Given that subtle deficits linked with semantic memory interference have been found to be a predictive cognitive marker of incipient AD, our group aimed to develop a sensitive, more portable, and easily administered instrument that incorporates and builds upon the scientific principles of the SIT and the LASSI-L. The Miami Test of Semantic Interference and Learning (MITSI-L) employs a novel paradigm that maximizes semantic interference effects through the binding and unbinding of semantic associations. Additionally, the instrument was designed to include the following features: a) shorter administration time; b) administration using simple touch screen technology, which has been found to be easily accepted by older adults [11]; c) electronic capture of both responses and response times in milliseconds without the need for a human examiner; d) a challenging recognition format that avoids the errors inherent in voice-recognition software; and e) portability, with the potential for remote administration through web-based systems. These strengths make the test highly useful across both clinical and research settings. In the current investigation, we studied performance on the MITSI-L among 98 persons, either clinically diagnosed with mild cognitive impairment (MCI) or cognitively normal (CN).
METHODS
We recruited 98 participants (24 males and 74 females) from two academic institutions in a research consortium (University of Miami School of Medicine [UM] and Mount Sinai Medical Center [MSMC]) who met diagnostic entry criteria as described below. Dr. Loewenstein was the Principal Investigator of this consortium and led a team of clinicians and neuropsychologists from both study sites who conducted uniform clinical interviews using the Clinical Dementia Rating Scale (CDR). Common neuropsychological test protocols were employed to render diagnoses as described below. These participants were community-dwelling elders and knowledgeable study partners who had been recruited from tertiary memory disorder clinics, community memory screenings, or community talks with various senior groups. All participants were independent in their activities of daily living, had knowledgeable collateral informants, and did not meet DSM-5 criteria for Major Neurocognitive Disorder when clinically interviewed. Exclusion criteria included individuals with a current episode of Major Depression, or any other major psychiatric disorder.
All participants were administered a common clinical assessment protocol, the CDR, and the Mini-Mental State Exam (MMSE). Memory and other cognitive complaints were assessed using the CDR interview, conducted by an experienced geriatric psychiatrist (MG) or a clinical neuropsychologist (DL, RC, MR), each of whom was blind to the neuropsychological test results. The neuropsychological tests were administered by a trained psychometrist who was, in turn, blind to the clinical diagnosis. All persons administering the CDR received formal training and had considerable research experience as investigators of a federally funded Alzheimer’s Disease Research Center, as well as experience with clinical trials that include both an extensive interview and use of the CDR. An identical neuropsychological test battery was administered at each site. The testing was conducted independently of the clinical examination and included the Hopkins Verbal Learning Test-Revised (HVLT-R), the Logical Memory subtest from the National Alzheimer’s Coordinating Center Uniform Data Set (NACC UDS) protocol, Category Fluency, Letter Fluency, the Block Design subtest of the Wechsler Adult Intelligence Scales-Fourth Edition (WAIS-IV), and Parts A and B of the Trail Making Test (TMT-A/B).
Participants were administered the LASSI-L [8] and the MITSI-L on the same day as the diagnostic neuropsychological measures. It is important to note that the MITSI-L was administered at the end of the test battery, whereas the LASSI-L was administered at the beginning. In addition, the MITSI-L scores were not employed in the diagnostic formulation.
Each participant had his or her independent clinical evaluations and neuropsychological test results reviewed at a diagnostic consensus conference. Subjects were selected for the current study if they met the final diagnosis as follows:
Criteria for Cognitively Normal (CN: n = 64) were as follows: a) no evidence, by extensive clinical evaluation or history, of memory or other cognitive decline; c) a Global CDR score of 0, rated by the clinician; d) all memory and non-memory traditional neuropsychological measures scored within normal limits relative to age and education-related norms, as determined by an experienced neuropsychologist (this was typically
Criteria for Mild Cognitive Impairment (MCI: n = 34) were as follows: a) subjective memory complaint reported by the participant and/or the collateral informant; b) evidence on clinical evaluation or history of memory and/or other cognitive decline; c) a Global CDR score of 0.5; d) one or more memory or non-memory measures that were 1.5 SD or more below normal limits relative to age and education-related norms.
The Miami Test of Semantic Interference and Learning (MITSI-L)
The MITSI-L was developed by Loewenstein and Curiel (2015), and consists of a computerized paired associate task that requires the participant to remember nine word pairs. Each word pair represents one of three semantic categories (animals, fruits, and musical instruments). The participant is first presented with the initial nine word pairs (List A). Each semantically related word pair is presented both verbally and visually, one pair at a time, on the computer screen. The participant is then presented with the first word of each word pair, and is instructed to select its matched pair from among four semantically similar targets. For example, if the association is blueberry-peach, the targets for blueberry would be pear, peach, grapes, or strawberry. This recognition paradigm is made challenging by the semantically similar distractors, as well as by the fact that there are three paired associates for each semantic category. List A word pairs are presented again for a second learning trial, and these results become an important measure of maximum performance through the correct pairing of List A targets.
Next, susceptibility to proactive interference is assessed by coupling the initial target word with a new, semantically similar target word (List B). For example, the target word blueberry is now linked to pear, with response choices being grapes, strawberry, pear, or peach. Decoupling the previously learned associations, and linking the original target to a semantically similar target, gives rise to proactive interference. Recovery from proactive interference is assessed by presenting a second learning trial of List B targets. In a previous study by Curiel and Loewenstein [12], the test-retest reliabilities for correct paired associations were r = 0.56 (p < 0.006) for Paired Associates A Trial 2, r = 0.51 (p < 0.02) for Paired Associates B Trial 1, and r = 0.66 (p ≤ 0.001) for Paired Associates B Trial 2. The reliability for Paired Associates A Trial 1 was not statistically significant, indicating insufficient stability over time. The MITSI-L takes 8–10 min to administer as compared to 15–20 min for the LASSI-L.
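The recognition format described above can be illustrated with a minimal sketch. It is not the MITSI-L software: the class names, the `latency_ms` field, and the latency values are hypothetical, and only the blueberry example and its response choices are taken from the text. The sketch scores one trial as the number of correct pairs and the mean latency of correct responses.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class RecognitionItem:
    cue: str                   # first word of the pair shown to the participant
    choices: Tuple[str, ...]   # four semantically similar response options
    target: str                # correct associate for the current list


@dataclass(frozen=True)
class Response:
    selected: str     # option the participant touched
    latency_ms: int   # time from item onset to response (hypothetical field)


def score_trial(items: Sequence[RecognitionItem],
                responses: Sequence[Response]) -> Tuple[int, Optional[float]]:
    """Return (number of correct pairs, mean latency in ms of correct responses)."""
    correct_latencies = [
        resp.latency_ms
        for item, resp in zip(items, responses)
        if resp.selected == item.target
    ]
    n_correct = len(correct_latencies)
    mean_latency = sum(correct_latencies) / n_correct if n_correct else None
    return n_correct, mean_latency


# The blueberry example from the text: List A pairs it with peach, List B with pear.
list_a_item = RecognitionItem("blueberry", ("pear", "peach", "grapes", "strawberry"), "peach")
list_b_item = RecognitionItem("blueberry", ("grapes", "strawberry", "pear", "peach"), "pear")

# Hypothetical responses: choosing "peach" on List B repeats the old List A pairing.
a_score = score_trial([list_a_item], [Response("peach", 1800)])
b_score = score_trial([list_b_item], [Response("peach", 2400)])
```

In this sketch, selecting the List A associate on a List B item counts as an error, which is the kind of response a proactive interference effect would be expected to produce.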
Statistical methods
For demographic information and comparisons between groups on MITSI-L measures, we conducted a series of ANOVAs. Following a statistically significant F at p≤0.05, Chi-square analyses were employed for ordinal data. Logistic regression and ROC analyses were employed to further delineate the ability of specific subtests or combinations of subtests of the MITSI-L to distinguish between MCI and CN groups. SPSS 22 and MedCalc Statistical Software version 16.2.0 were utilized.
RESULTS
As depicted in Table 1, although the difference in age between the CN and MCI groups approached statistical significance, there were no differences between groups with regard to educational attainment or gender distribution.
The MCI group had MMSE scores that were slightly lower than those of the CN group. Even after adjusting for differences in age and MMSE scores, MCI subjects also evidenced lower scores than CN subjects on all subscales of the MITSI-L, including Correct Pairs A1, Correct Pairs A2, Correct Pairs B1, and Correct Pairs B2.
We employed logistic regression and analyzed the area under the receiver operating characteristic curve (aROC) to determine the extent to which MITSI-L subtests could distinguish between groups. As indicated in Table 2, some subtests of the MITSI-L showed high sensitivity and specificity.
When we entered all four MITSI-L subscales as predictors of classification in logistic regression procedures, a combination of List B1 and List A2 resulted in sensitivity of 76.5%, specificity of 89.1%, and overall correct classification of 84.7%. To increase the range of scores, we summed the total correct pairs for List B1 and List A2. An optimal cut-off score of 8 or less yielded an aROC of 0.927 (SE = 0.03), sensitivity of 85.3%, specificity of 84.4%, and an overall correct classification of 84.7% (See Table 2).
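As a check on the arithmetic, the overall classification rate can be recovered from the reported sensitivity, specificity, and group sizes (34 MCI, 64 CN). The sketch below assumes the counts of correctly classified participants can be obtained by rounding the reported percentages; the function name is illustrative.

```python
def overall_accuracy(sensitivity, specificity, n_impaired, n_normal):
    """Recover correctly classified counts and overall accuracy from rates."""
    true_pos = round(sensitivity * n_impaired)   # MCI cases correctly identified
    true_neg = round(specificity * n_normal)     # CN cases correctly identified
    accuracy = (true_pos + true_neg) / (n_impaired + n_normal)
    return true_pos, true_neg, accuracy


N_MCI, N_CN = 34, 64

# Logistic regression model combining List B1 and List A2
regression = overall_accuracy(0.765, 0.891, N_MCI, N_CN)

# Summed List A2 + List B1 score with a cut-off of 8 or less
summed = overall_accuracy(0.853, 0.844, N_MCI, N_CN)

for label, (tp, tn, acc) in (("regression", regression), ("summed score", summed)):
    print(f"{label}: {tp}/{N_MCI} MCI, {tn}/{N_CN} CN, overall {acc:.1%}")
```

Under this rounding assumption, both approaches classify 83 of 98 participants correctly. The summed score trades three fewer correctly classified CN participants for three more correctly classified MCI participants.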
Because HVLT-R delayed recall and delayed recall on NACC passages were a critical part of the initial diagnostic workup for MCI (the vast majority of our cases were amnestic MCI), using them as classifiers to predict diagnosis would be circular. We therefore compared the total correct pairs comprising List A2 and List B1 against variables that were not part of the diagnostic classification. Measures that were examined included a) MMSE, b) immediate memory for passages (i.e., Logical Memory of the NACC UDS protocol), c) maximum cued recall of the 15 targets on List A of the LASSI-L, d) maximum recall of the 15 targets on List B1 of the LASSI-L that were vulnerable to proactive interference, and e) cued recall of LASSI-L List B2, which taps recovery from proactive interference. The LASSI-L was administered during the first part of the neuropsychological evaluation, was not employed in diagnostic classification, and was temporally separated from the MITSI-L, which was administered at the end of the neuropsychological evaluation, approximately 3 h later. Of all of these measures, the highest aROC was 0.93 (SE = 0.03) for total correct MITSI-L pairs comprising List A2 and List B1 (See Fig. 1), which was statistically greater than the aROC for the MMSE, 0.77 (SE = 0.05) [Z = 2.92; p < 0.007]; immediate memory for the NACC passage, aROC = 0.76 (SE = 0.05) [Z = 3.20; p < 0.002]; LASSI-L maximum cued recall, aROC = 0.82 (SE = 0.05) [Z = 2.56; p < 0.02]; and LASSI-L List B cued recall, aROC = 0.80 (SE = 0.05) [Z = 2.48; p < 0.02]. There was no significant difference between MITSI-L pairs comprising List A2 and List B1 and cued recall for LASSI-L B2 targets, aROC = 0.88 (SE = 0.04) [Z = 1.20; p = 0.23].
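The reported p-values can be related to their Z statistics with a short sketch. It assumes a two-sided test against the standard normal distribution; the text does not state how the Z statistics for comparing aROCs were derived, so only the final conversion from Z to p is shown.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a standard normal Z statistic."""
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the standard normal CDF Phi
    return math.erfc(abs(z) / math.sqrt(2))


# Z statistics as reported for each comparison against the MITSI-L A2 + B1 total
reported_z = {
    "MMSE": 2.92,
    "NACC immediate passage memory": 3.20,
    "LASSI-L maximum cued recall": 2.56,
    "LASSI-L List B cued recall": 2.48,
    "LASSI-L B2 cued recall": 1.20,
}

for measure, z in reported_z.items():
    print(f"{measure}: Z = {z:.2f}, two-sided p = {two_sided_p(z):.4f}")
```

Under this assumption, each computed value falls within the bound reported in the text, and Z = 1.20 corresponds to the reported p = 0.23.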
A significant feature of the computerized MITSI-L is the ability to measure the speed of correct responses in milliseconds. Thus, we examined the average time to make a correct paired association on the recognition trials across each learning trial. As can be seen in Table 3, after correcting for differences in age, the only difference in latency between the MCI and cognitively normal elderly groups was on the second trial of List A (A2).
DISCUSSION
This study is one of the first to investigate the ability of a brief, novel, computerized paired associate learning measure to distinguish elderly individuals with MCI from those with normal cognition, with statistically significant group differences across all measures. Results of the study confirmed that high levels of sensitivity and specificity could be established, with excellent area under the ROC curve. Further, the simple touch-screen design of the computerized instrument was generally well-received by older adult participants, notwithstanding its challenging task demands.
A combination of Paired Associate List A correct responses after two trials, combined with Paired Associate List B correct responses (sensitive to proactive interference), resulted in an aROC of 0.93, and a sensitivity and specificity of 85% and 84%, respectively. Importantly, when we compared this classification rate to neuropsychological measures that were not employed in the initial diagnostic workup, the MITSI-L had a significantly greater area under the ROC curve relative to the MMSE, immediate memory for a story passage, and sensitive LASSI-L measures that have previously shown excellent sensitivity and specificity in distinguishing between MCI and normal elderly control subjects [8].
Previous studies by Curiel et al. [12] have shown that test-retest reliabilities are not acceptable for the first MITSI-L trial, and, indeed, MITSI A1 paired associates recognition had lower areas under the ROC curve relative to other indices. Thus, it appears that the other MITSI-L indices are not only more stable, but also have better discriminatory power. All MITSI-L List A learning trials together with the MITSI B1 trial, can be administered in under 9 min, yielding rapidly accessible valuable data.
There are some limitations associated with the current investigation. First, participants with MCI were slightly older than those with normal cognition. For this reason, we entered age as a covariate in all ANOVA analyses, although this had no statistically significant effect on outcome as compared to non-corrected models.
A second potential issue is that while combining List A2 and B1 produced the greatest discriminatory power, it is fair to question whether the current findings more accurately reflect a generalized deficit in learning the paired associates rather than any putative effect of proactive semantic interference on learning the List B paired associates. In post hoc analyses, we derived the ratio of correct B1 responses to correct initial A1 responses; this ratio was 1.21 for CN subjects versus 1.23 for aMCI participants. The corresponding ratio for B2 correct paired associate responses was 1.16 for CN versus 1.19 for MCI participants. Not only are these ratios virtually identical for CN and MCI subjects, but they also indicate the lack of a proactive interference effect. This reflects the difficulty of acquisition and retention of the initial List A associations. For example, MCI participants did not achieve eight of the nine List A target pairs after the first two trials, and fewer than 15% were able to learn two-thirds of the nine target pairs. For CN participants, 35% obtained eight or nine pairs, and 25% did not correctly obtain two-thirds of the target pairs after two learning trials.
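The logic of the B1-to-A1 comparison described above can be sketched as follows. The function name and the input values are hypothetical and are not study data; the sketch only shows how a value above 1 is read.

```python
def list_ratio(later_correct, earlier_correct):
    """Ratio of correct pairs on a later trial to correct pairs on an earlier trial."""
    if earlier_correct <= 0:
        raise ValueError("earlier_correct must be positive")
    return later_correct / earlier_correct


# Hypothetical mean correct pairs (out of nine), chosen only for illustration
ratio = list_ratio(later_correct=4.8, earlier_correct=4.0)
print(f"B1/A1 ratio = {ratio:.2f}")

# A ratio above 1 means List B1 performance exceeded List A1 performance,
# which is the opposite of what a proactive interference effect would produce.
shows_decrement = ratio < 1.0
```

Read this way, the values of roughly 1.2 in both groups mean that first-trial List B performance was about 20% higher than first-trial List A performance, not lower.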
An inspection of the current findings indicated that there was not a significant decrement in performance among either study group when List B associations were made compared to List A associations. Proactive interference can only occur when a sufficient amount of previous learning has maximally impeded new learning [8]. It is also possible that the requirement for the subject to learn nine word pairs, each representing one of three semantic categories, may have been too challenging, significantly impeding learning of List A targets and thus limiting any proactive interference effect that might be observed on List B targets. Thus, the obtained results may reflect the generally greater difficulties with performance on the MITSI-L paired associates tasks as a whole, rather than reflecting putative semantic interference effects.
Nonetheless, a combination of A2 and B1 correct paired associate responses evidenced high levels of sensitivity and specificity in logistic regression and ROC analyses, although the task was difficult for both CN and MCI participants.
This initial study indicates that a brief, computer-administered task using paired associates was successful in distinguishing older adults who were cognitively normal from those with MCI, and did this more effectively than global tests of mental status or tests of paragraph recall, as well as some measures of cued recall. One of the major potential advantages of the MITSI-L over other available computerized measures, such as the CogState MCI/AD battery or the Computerized Battery of the NIH Toolbox, is that it employs a novel assessment paradigm (semantic paired associate learning utilizing a recognition memory platform) that has been sensitive to subtle cognitive changes associated with preclinical AD states. This potentially makes the MITSI-L a more sensitive instrument than other available computerized batteries, given that many are automated versions of traditional assessment paradigms originally developed for the assessment of dementia or traumatic brain injury. Because of its highly challenging format, we are in the process of exploring other methods of initial acquisition and investigating the utility of having one unique semantic category for each of the targets, to determine whether we can build sufficient proactive interference, such as that observed on the LASSI-L, SIT, and other measures that have yielded significant proactive interference effects. Even though semantic interference effects were not substantially elicited with the current version of the MITSI-L, with this future direction in mind, the current version of the MITSI-L is expected to be useful for both screening and assessment purposes, given that it is a computerized paired associate test that offers portability, has been shown to be acceptable to older adults, does not require an examiner, and, most notably, is able to distinguish between groups efficiently, yielding rapid results.
