Abstract
Wang et al. analyze the accuracy of the Mini-Mental State Examination (MMSE) and the Montreal Cognitive Assessment (MoCA) as screening tests for detecting dementia associated with Alzheimer’s disease (AD). Such tests are at the center of controversy regarding recognition and treatment of AD. The continued widespread use of tools such as the MMSE (1975) underscores the failure to advance cognitive screening and assessment, which has hampered the development and evaluation of AD treatments. It is time to employ readily available, efficient computerized measures for population/mass screening, clinical assessment of dementia progression, and accurate determination of approaches for prevention and treatment of AD and related conditions.
Keywords
Alzheimer’s disease (AD) is the only cause of death ranked in the top ten globally without precise early diagnosis or effective means of prevention or treatment. Further, AD was identified as a pandemic [1] well before COVID-19 was dubbed a 21st century pandemic [2]. And now, with the realization of the prominent secondary impacts of pandemics, there is a growing, widespread recognition of the tremendous magnitude of the impending burden from AD in an aging world population in the coming decades [3]. This appreciation has amplified the growing and pressing need for a new, efficacious, and practical platform to detect and track cognitive decline, beginning in the preliminary (prodromal) phases of the disease, sensitively, accurately, effectively, reliably, efficiently, and remotely [4–7]. Moreover, the parallel necessity of clarifying and understanding risk factors, developing successful prevention strategies [8–17], and discovering and monitoring viable and effective treatments could all benefit from accurate and efficient screening and assessment platforms.
Modern recognition of AD [18] as a common affliction of the elderly began in 1968 with a paper by Blessed, Tomlinson, & Roth [19] in which two tests, one a brief assessment of cognitive function and the other a measure of daily function, demonstrated impairment that was associated with the postmortem counts of neurofibrillary tangles, composed mainly of microtubule-associated protein-tau (tau), in the brain, though not with the counts of senile plaques, composed mainly of amyloid-β (Aβ). Even in more recent analyses, the tangles correspond with the severity of dementia more than the plaques [20, 21]. Since 1960, a plethora of cognitive tests, paper and pencil [22, 23], simple screening models [24], and computerized [25–27], have been developed to assess the dysfunction associated with AD. However, there has been limited application of Modern Test Theory, which includes Item Characteristic Curve Analysis, in the technological development of such tools [28–31], along with widespread failure to use an understanding of the underlying AD pathological process to guide test development [32, 33]. The lack of such development has likely been a major contributor to the failure of the field to develop timely screening approaches for AD [34, 35], inaccurate assessment of the progression of AD [36], and even now, failure to find an effective approach to stopping AD.
MINI-MENTAL STATE EXAMINATION VERSUS MONTREAL COGNITIVE ASSESSMENT
One of the biggest problems in the AD field is the lack of an efficient and precise screening tool to recognize dementia and early AD in clinical settings. The two most widely used instruments for this purpose have been the Mini-Mental State Examination (MMSE) [37] and the Montreal Cognitive Assessment (MoCA) [38]. The examination by Wang et al. [39] is of great significance with its in-depth meta-analysis of 67 studies examining the utility of the MMSE and the MoCA. Since the MMSE appeared in 1975, these two brief tests have become by far the most widely used tools for rapid assessment of cognitive impairment and dementia. Wang et al. [39] provide a powerful analysis indicating a pooled sensitivity for the MoCA of 0.94 (95% CI 0.906 to 0.954) and specificity of 0.90 (95% CI 0.859 to 0.928), whereas the sensitivity for the MMSE was 0.88 (95% CI 0.859 to 0.903) and specificity was 0.90 (95% CI 0.879 to 0.923). Although the MoCA performs better than the MMSE and the overall performance of these tests is adequate in some restricted situations, the lengthy administration time, the inherent 10% error rate, and the inability of these tests to accurately define function on the continuum from normal cognition to moderate dementia make them no longer acceptable, and there are many other available options.
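To make the practical meaning of these pooled figures concrete, the following minimal sketch converts sensitivity and specificity into positive and negative predictive values using Bayes' rule. The sensitivity and specificity values are the pooled estimates quoted above from Wang et al. [39]; the prevalence values and the function name `predictive_values` are assumptions chosen only for illustration and do not come from that analysis.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a screening test via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Pooled (sensitivity, specificity) estimates reported by Wang et al. [39].
TESTS = {"MoCA": (0.94, 0.90), "MMSE": (0.88, 0.90)}

if __name__ == "__main__":
    # Hypothetical prevalences of dementia in the screened population.
    for prevalence in (0.05, 0.10, 0.30):
        for name, (sens, spec) in TESTS.items():
            ppv, npv = predictive_values(sens, spec, prevalence)
            print(f"prevalence={prevalence:.2f} {name}: "
                  f"PPV={ppv:.2f} NPV={npv:.2f}")
```

Under an assumed prevalence of 10%, the MoCA figures give a positive predictive value of about 0.51, so roughly half of the positive screens would be false positives. This illustrates why a 10% error rate matters more in population screening, where prevalence is low, than in a memory clinic.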
The MMSE was developed in the early 1970s at a time when several very similar sets of questions were being used to screen for cognitive impairment [22]. The MMSE became the most popular largely because of its ease of use and its inclusion of items which were commonly used in clinical mental status examinations at that time. In fact, several of the MMSE items provide good item-response characteristics, though most do not [31], which causes a considerable amount of noise in this test. And the MMSE provides little information in the mild impairment range for distinguishing normal cognition from early dementia, while providing more information in moderate levels of dementia [36], making it a poor screening tool. However, a study of the MMSE’s individual items did lead to development of better screening tests, such as the Mini-Cog [40] and the Brief Alzheimer Screen [22, 42].
The MoCA was developed 30 years later (2005), mostly as an improvement on the MMSE to target a broader range of cognitive functions and discriminate higher levels of function in the mild impairment range. As is evident from Wang et al. [39], the MoCA has largely supplanted the MMSE as the brief cognitive test of choice for the initial evaluation of individuals who may have impaired cognitive function and, when the score warrants, for recommending a comprehensive assessment. However, as noted by Wang et al., the MoCA is also constrained by factors such as age, education, region, and ethnicity [43, 44].
The MMSE and MoCA both involve having the patient sit with a trained tester who will ask and record responses to the specified questions, taking 10 to 15 minutes. Both tests are composed of a series of items which are each given a single point. At the completion of the test, the rater adds up the points to produce the score (Classical Test Theory). The cut-offs of these tests have been studied extensively for their ability to distinguish between individuals with normal cognition or mild cognitive impairment and dementia, with wide-ranging debate about exactly where the precise point of demarcation should be and the effects of age, education, and culture.
For the MMSE and MoCA, as well as nearly all such cognitive tests, there has been essentially no application or consideration of Modern Test Theory, specifically, item characteristic curve analysis. This approach involves determining the characteristics of individual items with respect to the continuum of dysfunction, thus providing measures of difficulty and discriminability for each item [31, 45]. Notably, the value of the MMSE and MoCA items has not been thoroughly analyzed using Modern Test Theory, which would provide a better metric, as item-response methodology improves the statistical power of such tests [28–31]. Once these metrics are determined, a mathematical combination is readily calculated on any appropriately programmed computer to produce a score placed on a continuum of dysfunction, along with a confidence interval. And, in this era, computers can easily perform such calculations. The reason for the failure to apply Modern Test Theory may be related to the fallacious notion of “validity”, in which the whole test, in a copyrighted format, is “validated”. However, in Modern Test Theory, the important issue is the attention to the characteristics of each item, and for ideal testing, it is best to administer items using “computerized adaptive testing”, in which the score on administered items leads to the selection of the next item, so the placement of the subject on the continuum of dysfunction is progressively determined. This approach would provide a far superior metric for cognitive assessment. The increase of power and efficiency of testing would stimulate important developments for clinical management and substantially improve the power-calculations for treatment investigations.
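The item-level approach described above can be sketched in a few lines. The example below assumes a two-parameter logistic item characteristic curve, one common Modern Test Theory model; the source does not specify which model should be used. The item names and their discrimination and difficulty values are hypothetical, not calibrated MMSE or MoCA parameters, and `theta` denotes ability, so that lower values correspond to greater dysfunction.

```python
import math

# Hypothetical item bank: item -> (discrimination a, difficulty b).
# Illustrative assumptions only, not calibrated MMSE/MoCA parameters.
ITEM_BANK = {
    "orientation_date": (1.8, -1.5),
    "three_word_recall": (2.2, 0.0),
    "serial_sevens": (1.2, 0.5),
    "clock_drawing": (1.5, 1.0),
    "delayed_recall": (2.0, 1.5),
}


def p_correct(theta, a, b):
    """Two-parameter logistic item characteristic curve."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information an item contributes at ability level theta."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)


def log_likelihood(theta, responses):
    """Log-likelihood of responses {item: 0 or 1} at ability theta."""
    total = 0.0
    for item, correct in responses.items():
        a, b = ITEM_BANK[item]
        p = p_correct(theta, a, b)
        total += math.log(p) if correct else math.log(1.0 - p)
    return total


def estimate_theta(responses, lo=-4.0, hi=4.0, steps=801):
    """Grid-search maximum-likelihood estimate with an approximate 95% CI."""
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    theta = max(grid, key=lambda t: log_likelihood(t, responses))
    info = sum(item_information(theta, *ITEM_BANK[i]) for i in responses)
    se = 1.0 / math.sqrt(info)
    return theta, (theta - 1.96 * se, theta + 1.96 * se)


def next_item(theta, administered):
    """Adaptive step: choose the unused item most informative at theta."""
    remaining = [i for i in ITEM_BANK if i not in administered]
    if not remaining:
        return None
    return max(remaining, key=lambda i: item_information(theta, *ITEM_BANK[i]))


if __name__ == "__main__":
    responses = {"orientation_date": 1, "three_word_recall": 1, "serial_sevens": 0}
    theta, ci = estimate_theta(responses)
    print("estimate:", theta, "95% CI:", ci)
    print("next item:", next_item(theta, responses))
```

Unlike a summed score, each response here is weighted by how sharply its item discriminates near the subject's current estimate, and the interval narrows as informative items accumulate. The grid search is used only for brevity and because it stays bounded when all responses are correct or all are incorrect; a production system would more likely use a Bayesian or Newton-type estimator.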
In their current forms, neither the MMSE nor MoCA can provide a precise metric of cognitive dysfunction. Further, repeating these tests leads to some degree of learning, contaminating test-retest reliability [46]. Accordingly, the solution with greatest priority would be a test capable of accurately determining an individual’s cognitive dysfunction with respect to the continuum of dementia and that could be administered repeatedly without compromising validity. As cognitive dysfunction associated with AD clearly progresses over time, a patient would be most meaningfully assessed with respect to a “time-index” [36, 47].
ADUCANUMAB (ADUHELM) APPROVAL BY THE FDA IN 2021 AND THE CRITICAL ISSUE OF COGNITIVE ASSESSMENT
The recent controversial approval of aducanumab by the FDA as a treatment for early AD [48] over the unanimous objection of the FDA Scientific Advisory Committee [33] highlights several major issues in understanding AD and accurately measuring the cognitive dysfunction and behavioral and functional changes that are caused by AD and related dementias, especially the issues surrounding meaningful clinical change [49, 50].
The first issue involves assumptions regarding the use of Aβ-related measurement as a surrogate outcome variable. The role of the plaques and the Aβ build-up in the AD brain has been considered, since the beginning of the modern era of AD recognition, to be a hallmark of AD pathology, but not as an indicator of disease severity [19]. There is no doubt that Aβ, whose gene is located on chromosome 21 [51–53], is associated with this fundamental genetic signature of AD, as evidenced strongly by the original link showing the early onset of dementia in Down syndrome, which manifests early deposition of Aβ due to triplication of chromosome 21 and resulting overproduction of the amyloid-β protein precursor (AβPP) [54, 55]. And the most common genetic predisposition to AD, apolipoprotein E (APOE), whose effect to predispose to AD dementia is highly aligned with age [56, 57], also has a major effect on the age of Aβ deposition [58], though many years preceding the appearance of dementia. The controversy, which has gone on for over 30 years, is whether Aβ has any role in the causation of dementia or even any of the associated neurobehavioral symptoms. Consequently, there is clear concern, borne out by over forty billion dollars’ worth of failed trials since 1995 [59], that removal of Aβ from the brain has no or only a minimal effect on the clinical symptoms of AD and its progression [33]. However, the critical issue is the role of the AβPP, which is clearly a central factor in causing the dementia associated with AD, likely because it plays a fundamental role in neuroplasticity, the neurophysiological basis of episodic memory [32, 61].
Second, the selection of subjects for the testing of aducanumab brought up the question of adequate representation across the population of people who have AD [62]. One factor which the lay community must come to appreciate is that preliminary scientific trials require as large and homogeneous a population as possible to minimize the variance in the statistical analysis of the data. Consequently, criticisms of the initial aducanumab studies for not being adequately “diverse” are arguably misplaced. Further, the complaint is applicable to nearly all AD therapeutic investigations subsequent to the original study in this field seeking to discover a viable and effective treatment for AD [63]. Until a truly beneficial treatment is determined, this attack is not appropriate, though specifically focusing on those most at-risk, best determined by age and genetics, is relevant. Moreover, once a positive therapeutic intervention is discovered, there will be a major need to expand screening to be more inclusive and aptly representative to determine the efficacy and benefit of a treatment for all individuals who suffer from AD.
More relevant to the current discussion, a third concern is the outcome measures used in the aducanumab trials. The levels of Aβ are the focus of the anti-amyloid therapies. However, levels of Aβ in the cerebrospinal fluid (CSF) significantly reflect APOE genotype, but not dementia severity, while CSF tau levels significantly reflect dementia severity, but not APOE genotype [64]. And tau pathology is directly causing the synapse loss, which, in turn, causes the dementia [65]. However, even tau levels, be they in the brain or CSF, are still not the precise problem that needs treatment. While dementia in general is the loss of cognitive function, not just memory, AD appears to be a specific disease of neuroplasticity, which usually manifests initially as a loss of episodic or recent memory. As the neuroplastic mechanisms deteriorate over time and the underlying neuronal processing capacities are dulled by loss of supporting axonal and dendritic arborizations, there is a progressive loss of the neuronal components of older memories, semantic memory, and other cognitive functions. Accordingly, for evaluating and treating AD, the focus should be on memory and related cognitive and psycho-social disruptions. Outcome measures such as the MoCA, the MMSE, and the Clinical Dementia Rating Scale Sum of Boxes [66] lack the sensitivity needed for meaningful measurement of the continuum of cognitive dysfunction associated with AD, particularly in its early phases and in treatment trials. Accurate assessments would remove the ambiguity of whether a treatment is beneficial. However, just to be clear, the biological mechanism which leads to the neuropathological mechanism causing tau dysfunction is what needs to be prevented, a process involving neuroplasticity and brainstem neurotransmitters [32, 67].
ONLINE SCREENING AND BRIEF COGNITIVE ASSESSMENT
As suggested in the Wang et al. paper [39], “Moving from the MoCA to a computerized test and evaluating its screening accuracy are important issues in the detection of dementia”. As is evident from the current global clinical state of aging and dementia, a comprehensive major effort is urgently needed to develop and implement an effective dementia prevention strategy where AD patients are identified, through reliable and cost-effective screening, for early intervention, biologically, psychologically, and socially [4, 12]. Accordingly, innovative applications are required now to address this vital and widespread global mandate, including practical and engaging AD screening methods that are low-cost and utilize accurate and precise diagnostic approaches. It is also essential to develop cost-effective assessment tools with the capacity to evaluate established and new treatment and management systems sensitively and precisely. The US Preventive Services Task Force (USPSTF), in its most recent recommendation statement in 2020 [68], essentially reiterated the statements it has made since 1996, concluding “that the evidence is lacking, and the balance of benefits and harms of screening for cognitive impairment cannot be determined”. In balancing the innocuous nature of many preliminary cognitive assessments and the severe harms induced by dementia, routine cognitive screening should be considered suitable from even before 65 for individuals who have an APOE4 gene [69], possible cerebral vascular disease, depression, or subjective cognitive decline. Thus, specific studies are needed to characterize the responses of individuals to screening information, which would accordingly guide decision-making away from harms and toward beneficial outcomes, reinforcing the design of memory screening programs that have public health value [34, 70].
Currently, however, there are only a few readily accessible and convenient tools that meet the needs of valid early and timely detection of AD, and particularly ones that can be practically used for monitoring cognitive change over time. Thus, there have been prominent recent calls by both the private and public sectors around the world for developing widely accessible, reliable, and affordable digital cognitive assessments [4–7]. Responsive solutions would include computerized devices and platforms to screen for subtle memory impairment and other cognitive and functional changes that may distinctively indicate the onset of dementia and AD and precisely track progression. Importantly, brief, simple cognitive testing is needed that can be readily and frequently (daily, weekly, monthly) completed online, with a high level of subject engagement so that the individual being tested will return willingly for further testing. Computerized testing can offer efficient, engaging assessment with substantial improvements in the accuracy and precision of testing. Such symptom testing could also be integrated with promising blood-based biomarkers [71].
There are multiple computerized cognitive tests which have been developed and described in several reviews [25, 72]. A recent systematic review identified 10 self-administered brief online computerized cognitive assessments for older adults, based on the criteria of: 1) sample characteristics; 2) administration time of 30 minutes or less; and 3) psychometric characteristics (requiring 2 cognitive domains) [27]. The highest concurrent validity estimates were mostly reported with respect to the MMSE but not the MoCA. However, none of these tests took less than 3 minutes and 8 took over 12 minutes. And there are numerous other tests which have been adapted to a computer platform, but these are complex [73], take at least 15 minutes, or involve advanced virtual reality (Altoida) [74, 75] or eye-tracking technology (Neurotrack) [76]. And many online tests are designed for assessment of clinical performance, rather than screening or outcome assessment [77]. The recent 2021 review did not assess the desirability of the identified tests or the number of times or frequency with which the tests could be repeated, though computerized tests have been shown to engender positive attitudes in clinical settings [78].
To date, there are three online computerized tests which have been compared to the MoCA: MemTrax, Cognivue, and the Virtual Supermarket Test. MemTrax takes less than 2 minutes and is engaging and fun [25, 79] and shows superiority over the MoCA in two studies [80, 81]. Cognivue, which takes 10 minutes, did not perform as well as the MoCA [82]. The Virtual Supermarket Test outperformed both the MMSE and MoCA but takes 30 minutes and requires virtual-reality technology [83]. MemTrax, taking about 90 seconds and using novel images for every administration, can be used to monitor memory function and processing speed over time with essentially no limit on how often the test can be repeated, owing to the availability of over 3,000 images. MemTrax is used by the Brain Health Registry, showing a 31% online completion rate [84], and results from study of the functional measures on this platform support the validity of online assessments of cognitive function [85]. Machine learning analysis and modeling show that MemTrax can be effectively used to assess episodic memory for predicting cognitive health status [86] and to classify mild cognitive impairment [87]. MemTrax parameters are significantly correlated with 6 of 8 MoCA cognitive domains (visuospatial, naming, attention, language, and abstraction, but not recall or orientation). Future modifications of online tests should be calibrated to detect cognitive impairment with the best sensitivity and specificity possible and should evaluate the impact of all relevant cognitive functions. Additionally, important screening assessments should consider assessing other relevant dimensions, such as depression, anxiety, apathy, agitation, and sleep disturbances [88–91], which may be symptoms also caused by the underlying AD pathology [92].
And a mild behavior impairment index also serves as a marker for identifying neurodegenerative processes comparable to cognitive scales [93]. However, using an online screening tool, such as MemTrax, to assess memory and processing speed, in place of the MMSE and MoCA, with careful assessment of the precision of such a test for the temporal continuum, could greatly facilitate the efficiency and utility of initial cognitive evaluations.
OUTCOME ASSESSMENT MEASURES
A major issue is the meaningful and precise assessment of patient function in the clinic and in clinical research trials. In the first double-blind study of a cholinesterase inhibitor, the need for “more sensitive and specific memory tests” was emphasized [63]. As is clear from essentially all drug trials for AD, most recently the ambiguous results of the aducanumab trials, the current cognitive/behavioral assessment instruments are inadequate to assess precise change in dementia-related function, particularly over the continuum from normal to mild dementia. With precise measures, particularly with respect to the time course of AD, and accurate determination of rate of change, the discernible effect of a therapeutic agent would be definite, and, correspondingly, it would be unmistakably evident if there were no benefit from the treatment.
Further, the current patient evaluation tools do not correspond closely to the biomarker measures. Consequently, it is unclear whether the problem is that the cognitive/behavioral indices fail to register a drug benefit or that the drug has too little benefit to change dementia-related functions, particularly over the early parts of the continuum from normal to mild dementia. In any case, more precise measurement should help to resolve the conundrum. Presented here are updated versions of three tests published 30 years ago [94] which provide a means to accurately and validly assess the dysfunction of AD over time [36, 47]. See the Supplementary Material for details.
Activities of Daily Living (ADL), updated with reference to the FAQ (Functional Activities Questionnaire) [95], the ADCS-ADL Inventory [96], and the AD8 [97]. https://www.medafile.com/AFA/ADLs-IB.htm
Brief Neurocognitive Assessment, updated with reference to the Brief Alzheimer Screen [41]. https://www.medafile.com/AFA/BNS.htm
DSM-5 Inventory, updated with reference to the DSM-5 (Diagnostic and Statistical Manual of the American Psychiatric Association) [98]. The Quick Dementia Rating System has 8 items which are similar to items on this form [99]. https://www.medafile.com/AFA/DSM5-NCI.htm
These three scales were calibrated to allow them to be averaged to improve the power of testing [36]. With these three tests administered during the same session, assessment of clinical patients and research subjects can be measurably improved, which will both facilitate the diagnosis of conditions associated with cognitive impairment and advance therapeutic development for AD.
These scales were averaged and translated into a “time-index” continuum which showed significant changes after 6 months in mildly demented individuals [36]. However, for practical implementation in a clinic or clinical trial, the test items should be analyzed with respect to the individual patients or subjects being evaluated to determine the pattern and pace of change along the continuum of dysfunction for each item, underscoring those respective characteristics specific to each person. This Modern Test Theory process should be continued with iterative analysis until stable metrics of difficulty and discriminability are established. Further, reassessment of subjects after 6 months would provide data for estimation of the impairment of each subject with a “time-index” [47], which would have substantial meaningful value for estimating rate of deterioration in a clinic or in a clinical trial. This approach will facilitate the diagnosis of conditions associated with cognitive impairment and significantly advance therapeutic development for AD. While composite scales are clearly superior to individual measures for assessing change over time [100], using these three tools, which were developed for comparative scoring and are translatable to a “time-index” for long-term research studies, would likely provide substantially more power and increase the effect size well beyond the currently used tools. Further, use of MemTrax, as described above, daily, weekly, or monthly, with over 3,000 different pictures providing essentially no test-retest learning effect, could also substantially improve the precision of measurement for clinical assessment or research trials over shorter periods of time.
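The link between measurement precision, effect size, and trial power can be illustrated with a standard textbook calculation. The sketch below is not derived from the “time-index” data. It assumes a two-arm comparison of means with the usual normal approximation, and it applies the classical attenuation relation in which the observed standardized effect equals the true effect multiplied by the square root of the reliability of the outcome measure. The true effect size and the reliability values are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_arm(effect_size, alpha=0.05, power=0.80):
    """Subjects per arm for a two-sided, two-sample comparison of means
    (normal approximation): n = 2 * (z_alpha + z_power)^2 / d^2."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)
    z_power = z.inv_cdf(power)
    return ceil(2.0 * (z_alpha + z_power) ** 2 / effect_size ** 2)


def observed_effect_size(true_effect_size, reliability):
    """Classical attenuation of a standardized effect by outcome unreliability."""
    return true_effect_size * sqrt(reliability)


if __name__ == "__main__":
    true_d = 0.3  # hypothetical true treatment effect (standardized)
    for reliability in (0.6, 0.9):  # hypothetical outcome reliabilities
        d_obs = observed_effect_size(true_d, reliability)
        print(f"reliability={reliability:.1f} observed d={d_obs:.3f} "
              f"n per arm={n_per_arm(d_obs)}")
```

Because the required sample size scales with the inverse square of the observed effect size, even a modest gain in the reliability of the outcome measure reduces the number of subjects needed to detect the same underlying treatment effect.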
Additionally, dynamic systems models can be applied: dynamic systems analysis allows researchers to capture the process of development over time by explicitly mapping parameters of change onto the aspects of functioning to which they correspond. Such a system can monitor the personal resources needed to change, and the emotional and physiological outcomes through which stress/decline manifests itself and contributes to the disease. Integrating systems and contexts dynamically across multiple timescales moves beyond traditional analytic approaches. As these decline processes are not linear or uniform, and traditional approaches do not adequately account for the complexities of interconnections and circular causality, the use of multilevel dynamic approaches in longitudinal studies will not only shed light on when interventions are most effective but also allow for modeling complex interactions across multiple domains.
DEMENTIA REGISTRY
The current system in the U.S. is not satisfactory for recognizing individuals with impending or early dementia [4, 27]. At this time in the U.S., the Centers for Medicare and Medicaid Services (CMS) supports the Medicare Annual Wellness Visit, which requires an assessment to detect cognitive impairment [101]. However, CMS has not specified an approach to conduct this assessment. In this vacuum, there has been no significant improvement of care for the elderly with cognitive impairment. Recognizing the prevalence and power of “Big Data”, it is time for CMS to develop an online registry, including brief cognitive testing and subject query, with embedded analytics and artificial intelligence-driven machine-learning capabilities to monitor trends and improve predictive modeling as the registry consumes new data. Such a registry could serve as the Medicare Annual Wellness Visit cognitive screen, be available to clinicians for evidence-informed patient management, and provide information for enrollees about possible optional participation in research studies. Such a registry would provide an important safety-net for identifying individuals at very early phases of cognitive deterioration. A well-designed registry could also help to reduce disparities in underserved populations, while helping to meet the most difficult need in research studies, which is recruiting subjects. An example of such a system is the Brain Health Registry [84, 85], but this platform is not designed for clinical use and does not share results with participants or their clinicians. Further, addition of a genetic analysis component would provide critical information for predicting risk, understanding AD, and providing directions for successful prevention and treatment strategies [102].
The time has come to recognize that it is important to screen for evidence of dementia [34] and for the addition of online screening methodology for brain health and cognitive assessment [103], preferably using approaches that can be used globally in a unifiable format with minimal impacts from culture, language, and education.
Early identification of individuals at risk for cognitive deterioration is essential for providing numerous convincing recommendations for slowing the rate of dementia progression and possibly for dementia prevention [4, 15–17]. Whereas the field is complex and there is no specific proof that screening will benefit the population, there is wide-ranging evidence that screening and relevant helpful information is itself beneficial to patients and caregivers [34]. It is time to work on improving population and professional recognition of the risks for developing cognitive impairment, and adopting these recommendations, which are being recognized as practical solutions that will improve quality of life with aging if initiated in a timely fashion.
Footnotes
ACKNOWLEDGMENTS
J. Wesson Ashford (chair), Fred Schmitt, Peter J. Bayley, Herman Buschke, Margaret Dean, Sanford I. Finkel, Lee Hyer, and George Perry are members of the Medical, Scientific, and Memory Screening Advisory Board of the Alzheimer’s Foundation of America (AFA).
