Abstract
Background:
Researchers have questioned the utility of brief cognitive tests such as the Mini-Mental State Examination (MMSE) and the Montreal Cognitive Assessment (MoCA) in serial administration and suggested that brief cognitive tests may not accurately track changes in Global Cognition.
Objective:
To examine the accuracy of longitudinal changes on brief cognitive tests in reflecting progression in Global Cognition measured using comprehensive neuropsychological assessments.
Methods:
Two hundred and seven participants were assessed with the MMSE, MoCA, and a validated comprehensive neuropsychological battery. Global z-scores on the battery were derived and used to assess overall and significant (≥0.5 standard deviation) decline on Global Cognition. Different patterns of decline on MMSE/MoCA were classified. Accuracy was examined using receiver operating characteristic curve analysis, and sensitivity, specificity, positive (PPV) and negative (NPV) predictive values were reported.
Results:
The overall ability of MMSE/MoCA change scores to discriminate participants who did and did not decline on Global Cognition was fair-to-moderate (AUC [95% CI] = 0.71 [0.64–0.78] & 0.73 [0.66–0.80] for overall decline; 0.78 [0.70–0.85] & 0.80 [0.73–0.86] for significant decline, respectively). Changes in MMSE/MoCA had low accuracy in identifying significant Global Cognitive Decline (PPV = 0.41 & 0.46, respectively) but high accuracy in ruling out significant decline and identifying cognitively stable participants (NPV = 0.89 & 0.88, respectively).
Conclusion:
There is limited utility in brief cognitive tests for tracking cognitive decline. Instead, they should be used for identifying participants who remain cognitively stable on follow up. These results accentuate the importance of acknowledging the limitations of brief cognitive tests when assessing cognitive change.
INTRODUCTION
Brief cognitive tests such as the Mini–Mental State Examination (MMSE) and Montreal Cognitive Assessment (MoCA) are among the most widely used instruments to screen for cognitive impairment and possible dementia [1, 2]. MMSE and MoCA scores have been reported to correlate with performance on comprehensive multi-domain neuropsychological measurements in cross sectional studies [3–5] and hence are widely used as indices of global cognitive function [6–8]. These tests have also been used in longitudinal studies as measurements of cognitive changes over time. Several studies used the MoCA to examine cognitive changes among healthy elderly as well as patients with Alzheimer’s disease (AD) and found significant decline in MoCA scores over a 1-year period [9, 10]. Other studies found that declines on MMSE and MoCA were associated with conversion to mild cognitive impairment (MCI) and higher risks for worse prognosis [11, 12].
However, the utility of serial administration of brief cognitive tests for tracking longitudinal changes in cognition has been questioned [13, 14]. The limited value of repeated administration of brief cognitive tests due to practice effects, measurement error, and wide variability in rates of change has been previously highlighted [14, 15]. Although the validity of changes in brief cognitive test scores was found to improve with observation intervals of more than a year, changes in MMSE scores were still not reliable for observations separated by less than 3 years [14, 16]. In addition, researchers who included both brief cognitive tests and comprehensive neuropsychological batteries in their studies often observed differences in longitudinal changes between the tests. In one study, participants with different rates of cognitive decline had comparable MMSE scores but significantly different comprehensive cognitive profiles at baseline [13]. Similarly, another study found significant changes in MoCA scores while scores on the Repeatable Battery for the Assessment of Neuropsychological Status (RBANS) remained unchanged among healthy elderly participants [15]. Taken together, these studies suggest that changes on brief cognitive tests may not accurately track changes in Global Cognition as measured by comprehensive neuropsychological batteries.
To date, although brief cognitive tests have been taken as a measure of Global Cognition in some studies, no study has examined the reliability and accuracy of longitudinal changes in performance on brief cognitive tests. Hence, the utility of brief cognitive tests as an accurate measure of global cognitive changes remains unclear. In the present study, we examined the MMSE, MoCA, and comprehensive neuropsychological tests in a longitudinal study of participants with and without cognitive impairment. We hypothesized that decline in MMSE and MoCA would not accurately reflect decline in Global Cognition measured on comprehensive neuropsychological tests.
METHODS
Study population
Participants were consecutive patients with cognitive impairment, as measured on a locally validated comprehensive neuropsychological tool [17], recruited from an ongoing longitudinal study from tertiary memory clinics in Singapore. Participants were assessed yearly, and were included in the present study if they had been followed up annually until year 3 (Y3). Healthy controls without cognitive impairment on the VDB were recruited from the community. Participants were eligible if they were 50 years or older; received a clinical diagnosis of no cognitive impairment (NCI), cognitive impairment-no dementia (CIND), or dementia at baseline; and could provide informed consent. Participants with moderate or severe dementia, major psychiatric illness, substance abuse disorders, or incomplete follow-up data were excluded. The study procedures are described in Fig. 1. All procedures involving human subjects were conducted in accordance with the ethical standards of the Committee on Human Experimentation of the institution in which the experiments were done and with the Helsinki Declaration of 1975. Ethical approval was obtained from the National Healthcare Group Domain-Specific Review Board. Written informed consent was obtained beforehand in the participants’ preferred language.

Fig. 1. Breakdown of study sample.
Information from clinical and neuropsychological investigations was reviewed by a team of clinicians, psychologists, and research personnel during weekly research consensus meetings. The diagnosis of dementia was made according to the Diagnostic and Statistical Manual of Mental Disorders, 4th edition (DSM-IV) criteria. Participants with impairment on at least one domain, without significant loss of independence in daily functioning, who did not meet the DSM-IV criteria for dementia were diagnosed with CIND. Participants with no objective cognitive impairment were diagnosed with NCI. Severity of dementia was rated using the Clinical Dementia Rating–global (CDR-global) scale, a 5-point scale assessed via semi-structured interview with a reliable informant (0 = no dementia; 0.5 = questionable dementia; 1 = mild dementia; 2 = moderate dementia; 3 = severe dementia) [18].
Clinical and neuropsychological assessments
All participants underwent a standardised neuropsychological assessment consisting of the MMSE, MoCA, a modified National Institute of Neurological Disorders and Stroke-Canadian Stroke Network protocol (NINDS-CSN) [19, 20], and the 15-item Geriatric Depression Scale (GDS) [21] at baseline (BL) and at one, two, and three years (Y3) after baseline.
In the present study, the NINDS-CSN battery was used to assess longitudinal changes in Global Cognition. It consists of 14 tests assessing six neuropsychological domains: Attention (Digit Span Forward and Backward); Executive function (Colour Trails Test 1 and 2, and Animal Fluency); Language (modified Boston Naming Test); Visuomotor speed (Symbol Digit Modalities Test); Visuospatial function (copy subtask of the Rey Complex Figure Test [RCFT]); and Memory (immediate, delayed, and recognition subtasks of RCFT and Hopkins Verbal Learning Test). Details of the battery have been described previously [20]. Normal distribution of each cognitive test from the NCI group at baseline was assessed. Box-Cox transformations were applied on variables that deviated from normal distribution [22]. The modified Boston naming test was excluded from all analyses as transformations to achieve normal distribution were unsuccessful.
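As a minimal sketch of the transformation step, the Box-Cox family can be written as below. The function name `box_cox` and the parameter `lam` are illustrative. The text does not state how the transformation parameter was chosen or which software routine was used, so a fixed, user-supplied value is assumed here.

```python
import math

def box_cox(values, lam):
    """Apply the Box-Cox power transform to strictly positive values.

    lam is the transformation parameter, assumed to be supplied by the
    caller (how it was selected in the study is not stated in the text).
    """
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        # The limiting case of the family is the natural logarithm
        return [math.log(v) for v in values]
    return [(v ** lam - 1) / lam for v in values]
```

A transformed variable would then be checked again for normality before being standardized.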
Longitudinal changes in Global Cognition
Baseline mean and standard deviation (SD) of each cognitive test in the NCI group were used to compute standardized z-scores. Domain z-scores were derived by averaging individual test z-scores and standardizing using the NCI group mean and SD. Global Cognition z-scores were calculated by averaging domain z-scores and standardizing using the NCI group mean and SD. Changes in Global Cognition were derived by subtracting BL scores from Y3 follow up scores. Two types of decline on Global Cognition were examined: overall decline (BL z-score > Y3 z-score) and significant decline (decline ≥ 0.5 standard deviation in z-score from BL to Y3) [23, 24].
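The scoring steps described above can be sketched as follows. This is an illustration under stated assumptions, not the study's own code (analyses were run in SPSS): the helper names, the toy scores, and the choice of the first three participants as the NCI reference group are all hypothetical. It is also assumed that the composite scores are re-standardized against the baseline NCI mean and SD.

```python
from statistics import mean, stdev

def standardize(values, ref_mean, ref_sd):
    """Convert values to z-scores against a reference mean and SD."""
    return [(v - ref_mean) / ref_sd for v in values]

def average_columns(columns):
    """Average several equal-length score lists participant by participant."""
    return [mean(row) for row in zip(*columns)]

def classify_global_change(bl_z, y3_z, threshold=0.5):
    """Return (overall_decline, significant_decline) for one participant."""
    change = y3_z - bl_z  # Y3 minus BL, so a negative value is a decline
    return change < 0, change <= -threshold

# Toy baseline data: two tests in one domain. The first three
# participants play the role of the NCI reference group (hypothetical).
test_a = [10.0, 12.0, 14.0, 8.0]
test_b = [20.0, 24.0, 28.0, 18.0]
nci = slice(0, 3)

# Step 1: test-level z-scores against the NCI baseline mean and SD
z_a = standardize(test_a, mean(test_a[nci]), stdev(test_a[nci]))
z_b = standardize(test_b, mean(test_b[nci]), stdev(test_b[nci]))

# Step 2: average the test z-scores, then re-standardize against NCI
raw_domain = average_columns([z_a, z_b])
domain_z = standardize(raw_domain, mean(raw_domain[nci]), stdev(raw_domain[nci]))
```

The Global Cognition z-score would repeat step 2 with the domain z-scores as input, and `classify_global_change` would then be applied to each participant's BL and Y3 global scores.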
Longitudinal changes in brief cognitive tests
Different patterns of decline on brief cognitive tests were also examined to determine whether the type of decline affects the accuracy of identifying participants with Global Cognitive Decline. The types of decline were: a) overall decline from BL to Y3; b) consistent decline, defined as overall decline from BL to Y3 without fluctuations between follow up visits; and c) significant decline, defined as a decline of ≥3 points from BL to Y3 on the MMSE and MoCA [25].
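The three patterns can be expressed as a small classification rule. The sketch below is illustrative: the function name `decline_patterns` is hypothetical, and "without fluctuations" is interpreted here as no increase in score between consecutive visits, which is an assumption since the text does not give an operational definition.

```python
def decline_patterns(scores, significant_drop=3):
    """Classify decline on a brief test from [BL, Y1, Y2, Y3] total scores."""
    bl, y3 = scores[0], scores[-1]
    overall = y3 < bl
    # Assumed reading of "no fluctuations": the score never rises
    # from one visit to the next
    no_rise = all(later <= earlier for earlier, later in zip(scores, scores[1:]))
    return {
        "overall": overall,
        "consistent": overall and no_rise,
        "significant": (bl - y3) >= significant_drop,
    }
```

For example, a participant scoring 28, 25, 27, 26 across the four visits would show overall decline, but neither consistent decline (the score rose between Y1 and Y2) nor significant decline (a 2-point drop).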
Statistical analyses
Spearman’s rank correlation was performed to examine bivariate relationships between: a) MMSE/MoCA and Global Cognition at baseline; and b) changes on the MMSE/MoCA and changes in Global Cognition from BL to Y3. Multivariate linear regression analyses were performed to determine the independent association between changes in MMSE/MoCA and changes in Global Cognition after adjusting for demographic factors (age, gender, and education) and baseline MMSE/MoCA score. Receiver operating characteristic (ROC) curve analyses were performed and the resultant area under curve (AUC) statistics were reported to determine the utility of MMSE/MoCA changes in discriminating participants who declined on Global Cognition (i.e., overall decline and significant decline) from participants who remained stable. Sensitivity, specificity, and positive (PPV) and negative (NPV) predictive values were calculated to examine the accuracy with which decline on the MMSE/MoCA could identify participants with overall decline and significant decline on Global Cognition. These values were also reported for different patterns of decline on the brief cognitive tests. Separate analyses were performed for the MMSE and MoCA. Only cases with complete data at all 4 time points (baseline, year 1, year 2, and year 3) were included in the current sample for analysis. p-values < 0.05 were considered statistically significant. All analyses were performed on the Statistical Package for the Social Sciences (SPSS ver. 24).
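For readers less familiar with these accuracy measures, the sketch below shows how they are conventionally computed. It is not the SPSS procedure used in the study: the function names are hypothetical, the AUC is computed with the standard rank-based (Mann-Whitney) formulation, and each denominator is assumed to be non-zero.

```python
def diagnostic_metrics(test_decline, true_decline):
    """Accuracy of decline on a brief test against decline on Global Cognition.

    Both arguments are equal-length lists of booleans, one per participant.
    """
    pairs = list(zip(test_decline, true_decline))
    tp = sum(t and g for t, g in pairs)          # declined on both
    fp = sum(t and not g for t, g in pairs)      # declined on brief test only
    fn = sum(not t and g for t, g in pairs)      # declined on Global Cognition only
    tn = sum(not t and not g for t, g in pairs)  # stable on both
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

def auc(decliner_changes, stable_changes):
    """Probability that a randomly chosen decliner has a lower change score
    than a randomly chosen stable participant (ties count as one half)."""
    wins = 0.0
    for d in decliner_changes:
        for s in stable_changes:
            if d < s:
                wins += 1.0
            elif d == s:
                wins += 0.5
    return wins / (len(decliner_changes) * len(stable_changes))
```

Here a change score is the Y3 score minus the BL score on the brief test, so more negative values indicate greater decline. An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect discrimination.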
RESULTS
Sample characteristics
Participants included in the present study were assessed at BL from August 2010 to December 2013, and at Y3 from August 2013 to November 2016. Two hundred and eighty-six participants were recruited in the present study. Thirty participants died before Y3 follow up and 49 had missing data due to incomplete neuropsychological or clinical evaluations between follow up visits. Change scores on MMSE/MoCA and/or Global Cognition could not be generated for these participants, and they were excluded from the analyses. Therefore, a total of 207 participants were included in the analyses (Fig. 1). Study sample characteristics are presented in Table 1. Participants who were excluded from the sample due to death or incomplete data were not significantly different in demographic and clinical characteristics from participants included in the final analyses.
Table 1. Clinical and neuropsychological characteristics of study participants
Data expressed as mean (SD) or n (%); GDS-15, 15-item Geriatric Depression Scale; NCI, No Cognitive Impairment; CIND, Cognitive Impairment-No Dementia; MMSE, Mini-Mental State Examination; MoCA, Montreal Cognitive Assessment.
Associations between brief cognitive tests and Global Cognition
Significant correlations were found between Global Cognition and MMSE and MoCA at baseline (ρ = 0.85 & 0.87 respectively, p < 0.01) and longitudinally (ρ = 0.46 & 0.47 respectively, p < 0.01). Multivariate linear regression analyses revealed that associations between changes in brief screening test scores and Global Cognition were significant for both MMSE and MoCA, after controlling for demographic variables as well as baseline MMSE/MoCA scores (MMSE, β[SE] = 0.46[0.02]; MoCA, β[SE] = 0.49[0.02]; p < 0.001 for both).
Discriminatory accuracy in declines on brief cognitive tests
Results from the ROC curve analyses revealed that the overall ability of MMSE and MoCA change scores to discriminate participants who declined from those who remained stable was fair to moderate (AUC [95% CI] = 0.71 [0.64–0.78] & 0.73 [0.66–0.80] respectively for overall Global Cognitive Decline, p < 0.01; AUC [95% CI] = 0.78 [0.70–0.85] & 0.80 [0.73–0.86] respectively for significant Global Cognitive Decline, p < 0.0001 for both; see Fig. 2).

Fig. 2. ROC curves for changes in MMSE and MoCA in discriminating A) overall decline on Global Cognition, and B) significant decline on Global Cognition.
Changes on MMSE and MoCA also had low accuracy in identifying actual decline on Global Cognition. The low positive predictive values suggest that among participants who exhibited decline on MMSE and MoCA over the 3-year follow up period, only 64–70% had overall decline and 41–46% had significant decline on Global Cognition as measured on comprehensive neuropsychological tests. Even after considering different patterns of decline on the brief cognitive tests, the findings were generally consistent in that decline on brief cognitive tests does not accurately detect decline in Global Cognition (Table 2).
Table 2. Sensitivity and specificity of MMSE and MoCA decline for Global Cognitive Decline in the whole sample (N = 207)
*Consistent decline was defined as overall decline from baseline to 3-year follow up with no fluctuations in scores on brief cognitive tests. †Significant decline was defined as a decline of ≥3 points from baseline to 3-year follow up. ‡No significant difference was found between AUC for MMSE and MoCA. AUC, area under curve; CI, confidence interval; PPV, positive predictive value; NPV, negative predictive value.
Nevertheless, both MMSE and MoCA had excellent negative predictive values in ruling out cases of significant Global Cognitive Decline (NPV = 0.89 & 0.88, respectively). This was also the case among individuals who did not consistently or significantly decline over the years, as shown by the high specificity (range = 0.81–0.91) and NPV (range = 0.76–0.85). Hence, a lack of change in brief screening test scores accurately identifies participants who remain cognitively stable during follow up.
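When reading PPV and NPV side by side, it may help to recall that both depend on how common decline is in the sample, not only on the test's sensitivity and specificity. The sketch below applies the standard Bayes' rule identities. The function name is hypothetical and the inputs are illustrative values chosen for the example, not figures from Table 2.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence."""
    tp = sensitivity * prevalence              # true positive fraction
    fp = (1 - specificity) * (1 - prevalence)  # false positive fraction
    tn = specificity * (1 - prevalence)        # true negative fraction
    fn = (1 - sensitivity) * prevalence        # false negative fraction
    return tp / (tp + fp), tn / (tn + fn)

# Illustrative inputs only: with the same test characteristics, PPV falls
# and NPV rises as decline becomes less common in the sample.
for prevalence in (0.5, 0.2):
    ppv, npv = predictive_values(0.8, 0.8, prevalence)
    print(f"prevalence={prevalence:.1f}  PPV={ppv:.2f}  NPV={npv:.2f}")
```

This is one reason a test can show a modest PPV alongside a high NPV in a cohort where most participants remain stable.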
DISCUSSION
The present study investigated the utility of brief cognitive tests in tracking longitudinal decline in Global Cognition. Despite linear associations between changes in MMSE and MoCA scores and changes in Global Cognition, decline in scores on both brief cognitive tests over a 3-year follow up period did not accurately reflect either overall or significant decline on Global Cognition. However, a lack of decline in the brief cognitive tests indicated stable cognition.
We found that a decline on MMSE and MoCA had poor PPV for cognitive decline. Even among participants who consistently declined or declined significantly on the brief cognitive tests, only about half had significant decline on Global Cognition. Previous studies have cautioned against the use of MMSE and MoCA in serial administration due to the high measurement error and low test-retest reliability of the tests; a five-point change on the MMSE was needed at one year to be considered a significant change not due to chance [16, 26]. Moreover, a long observation period was necessary for changes in brief cognitive test scores to be reliable [14]. These findings suggest that using brief cognitive tests to measure cognitive decline may lead to high false positive rates, hence challenging the usefulness of brief cognitive tests to measure longitudinal changes in cognition.
However, our results suggest that MMSE and MoCA may have more value in identifying participants who remain cognitively stable. In cross-sectional studies, the MMSE and MoCA perform better as screening tests to rule out dementia than as tools for diagnosing and identifying dementia cases [2, 27]. In the current longitudinal study, the high negative predictive values observed also indicate that changes in brief cognitive test scores are excellent in ruling out cases of cognitive decline. Hence, rather than using MMSE and MoCA scores to track cognitive deterioration, our results suggest that brief cognitive tools should instead be used to identify participants who have stable cognition on follow up.
These findings have important implications for the utility of brief cognitive tests in clinical and community settings. While there have been numerous studies using brief cognitive tests in repeated testing [9, 13], the current study suggests that using brief cognitive tests alone to determine cognitive decline may result in inaccurate diagnosis and prognosis. It is hence important to also determine performance on comprehensive neuropsychological batteries when measuring cognitive deterioration because comprehensive batteries are more sensitive than brief cognitive tests in detecting clinical decline [9]. Moreover, while it remains paramount to accurately track cognitive deterioration over time, a large number (n = 150, 72%) of patients in this study cohort did not exhibit significant cognitive decline over the 3-year follow-up period. Given that the majority of patients remain cognitively stable, our results suggest that brief cognitive tests will be especially useful to accurately identify these patients and that further comprehensive neuropsychological investigations may be spared in these cases. This can lead to cost savings as well as increased efficiency in disease monitoring in the long run. Taken together, the results from this study reinforce the need for researchers and clinicians to exercise caution when inferring cognitive decline and to cultivate a good understanding of the properties of the cognitive tests used [13].
This study has several strengths. Our study included a moderately sized cohort with a long follow up period of 3 years, which allowed for a more accurate measure of cognitive decline using brief cognitive tests [14]. Nevertheless, since cognitive deterioration is a slow and gradual process, studies with even longer follow up periods are necessary to verify the findings from this study [28, 29]. Another strength is the examination of both linear and discriminatory properties of MMSE and MoCA change scores, which provided more comprehensive information for the interpretation of the utility of these brief cognitive tests in longitudinal administration.
The present study also has limitations. Group differences were not examined in this study due to a small sample and low incidence of cognitive decline in the dementia-free group. Brief cognitive tests have been suggested to have varying sensitivities at different levels of cognitive performance [30]. An investigation of diagnostic group differences would likely yield different results in the accuracy of changes in brief cognitive test scores and should be pursued in future efforts. Within the current sample, stratified analyses in dementia patients found that the PPVs of MMSE and MoCA in detecting significant Global Cognitive Decline remained poor to fair (PPV range = 0.56–0.75), suggesting that the brief cognitive tests are unable to accurately track cognitive decline even among participants with a high rate of cognitive deterioration (n = 32, 48%) (refer to Supplementary Table 1). However, due to the small sample of dementia patients in the current study, future studies with larger dementia cohorts are needed to verify the findings from our stratified analyses.
In addition, the results from the present study were derived from an Asian memory clinic cohort with an overall lower level of education and only included milder dementia patients, hence the results may have limited generalizability to other populations in the existing literature. Future studies involving different populations, disease types, and severity will be needed to validate our findings. It is important to further examine the validity of these findings in light of the emphasis on identifying early cognitive decline among the elderly in the community before the onset of cognitive impairment and dementia.
CONCLUSION
In conclusion, we found limited utility in brief cognitive tests in tracking cognitive decline among elderly from a memory clinic cohort. Instead, it is recommended that the use of brief cognitive tests be focused on ruling out cognitive decline and identifying individuals who remain cognitively stable on follow up. Results from this study accentuate the importance of understanding the properties of cognitive tests when assessing cognitive change over time.
ACKNOWLEDGMENTS
This study was supported by the Singapore National Medical Research Council (NMRC) Center Grants NMRC/CG/NUHS/2010 (project No. R-184-005-184-511) and NMRC/CG/013/2013, NMRC Clinician Scientists Individual Research Grant NMRC/CIRG/1446/2016, and the Memory Aging and Cognition Centre (MACC) Research Fund.
