Abstract
Background:
In 1986, the Consortium to Establish a Registry for Alzheimer’s Disease (CERAD) was mandated to develop a brief neuropsychological assessment battery (CERAD-NAB) for AD, for uniform neuropsychological assessment, and information aggregation. Initially used across the National Institute on Aging-funded Alzheimer’s Disease Research Centers, it has become widely adopted wherever information is desired on cognitive status and change therein, particularly in older populations.
Objective:
Our purpose is to provide information on the multiple uses of the CERAD-NAB since its inception, and possible further developments.
Methods:
Since searching on “CERAD neuropsychological assessment battery” or similar terms missed important information, “CERAD” alone was entered into PubMed and SCOPUS, and CERAD-NAB use identified from the resulting studies. Use was sorted into major categories, e.g., psychometric information, norms, dementia/differential dementia diagnosis, epidemiology, intervention evaluation, genetics, etc., also translations, country of use, and alternative data gathering approaches.
Results:
CERAD-NAB is available in ∼20 languages. In addition to its initial purpose of assessing AD severity, CERAD-NAB can identify mild cognitive impairment, facilitate differential dementia diagnosis, determine cognitive effects of naturally occurring and experimental interventions (e.g., air pollution, selenium in soil, exercise), help to clarify the relationship of cognition to brain physiology and neuroanatomy, and assess cognitive status in dementia-risk conditions. Surveys of primary and tertiary care patients, and of population-based samples in multiple countries have provided information on prevalent and incident dementia, and cross-sectional and longitudinal norms for ages 35–100 years.
Conclusion:
CERAD-NAB has fulfilled its original mandate, while its uses have expanded, keeping up with advances in the area of dementia.
BACKGROUND
Of all the measures of the Consortium to Establish a Registry for Alzheimer’s Disease (CERAD; PI Albert Heyman, MD), the neuropsychology battery has withstood the test of time, enjoyed extensive and varied use, and is the focus here. A previous review paper provided information on the first 20 years of experience with all the measures [1].
CERAD’s mandate was to develop brief, valid, and reliable measures to standardize the evaluation and diagnosis of patients with AD. When CERAD was initiated, there was general acceptance of the diagnostic criteria for Alzheimer’s disease (AD), but the multitude of cognitive assessment measures used in different combinations across memory disorders clinics did not facilitate aggregation of information, and resulted in reduced power to explore AD or dementias classified as AD. With input from 24 NIA-sponsored Alzheimer’s Disease Research Centers and other university programs in the US, CERAD developed brief clinical, neuropsychology, neuroimaging, and neuropathology batteries, which were then utilized by these sites. Data from the annual evaluations of up to eight years of African American and White patients (N = 1,094) and nondemented control subjects (N = 463) recruited by these sites constitute the CERAD database. Autopsy examination of the brain confirmed the clinical diagnosis of AD in 87% of autopsied cases. This database provides the natural history of AD before the availability of antidementia medications.
According to the original NINCDS-ADRDA criteria for AD, the neuropsychological aspect of a clinical diagnosis of probable and possible AD dementia should include a brief overall mental status test, and be confirmed by deficits in one (“possible” AD), or two or more areas of cognitive function (“probable” AD), together with progressive worsening of memory and other cognitive functions in the absence of disturbance of consciousness [2]. Relevant areas of cognitive function included orientation for place and time, assessment of memory, language skills, praxis, attention, visual perception, problem-solving, and social functioning; none was prescribed, neither were specific measures required. In the absence of population-based norms, scoring in the bottom 20th percentile based on local age, sex, and education norms, together with further decline over time, was suggested as indicating impairment. Of note, the authors stated that “. . .diagnosis cannot be determined by laboratory tests” [2].
Approximately 25 years later, diagnostic criteria for AD were updated [3, 4]. Although some expansion of the psychological and behavioral characteristics of AD was recommended, little described there required modification of the CERAD neuropsychology assessment battery (CERAD-NAB). It has been pointed out, however, that “The extended CERAD-NAB (i.e., CERAD Plus) follows more closely the criteria for the new DSM-5 Mild and Major Neurocognitive Disorder” and that it “Could be further developed to include a test for social cognition.” [5]. To date, this extension has not happened, at least not formally.
While the original battery remains in use, the need for direct assessment of executive functioning was made clear by users and, together with broader assessment of verbal functioning, was added formally by Monsch and his group, yielding CERAD Plus [6]. Additionally, interest expanded to identifying prodromal AD, in particular mild cognitive impairment (MCI) and its types [7]; identifying dementing disorders originally not distinguished from AD (e.g., frontotemporal dementia, Lewy body disease); and identifying variants of MCI that might predict these different dementing disorders. Currently, identification of the likely presence of AD by blood-based biomarkers appears feasible [8], raising questions regarding the value, or necessity, of measures such as the CERAD-NAB.
The battery, however, remains valuable for an increasing variety of interests. Use for diagnostic evaluation continues in independent memory clinics. Individual measures provide uniformity of evaluation across multiple clinics (e.g., in the Toronto Cognitive Assessment Battery (TorCA), used by the Toronto Dementia Resource Alliance (TDRA) (https://tdra.utoronto.ca/torca-2/, https://tdra.utoronto.ca/, accessed August 24, 2021); and by Team 17 of The Comprehensive Assessment of Neurodegeneration and Dementia (COMPASS-ND) of the Canadian Consortium on Neurodegeneration in Aging (CCNA CCNV) (https://ccna-ccnv.ca/about-us/, accessed August 24, 2021)). There is extensive use in cross-sectional and longitudinal epidemiological surveys of the prevalence and incidence of dementing disorders, which provide information for public health needs. It has facilitated biological enquiry, including helping to identify regions of the brain associated with different cognitive abilities. Availability in multiple languages has facilitated multi-national studies. Some use the entire battery, e.g., AddNeuroMed [9], others use selected measures, as in the Harmonized Cognitive Assessment Protocol (HCAP) studies [10]. The battery has acted as a unifying clinical force among German-speaking clinicians under resources developed by A. Monsch, MD (https://www.memoryclinic.ch).
We report here on the psychometric characteristics of the original neuropsychology battery and to a lesser extent of CERAD Plus, and on the main uses. We also indicate the translations available and in progress, the norms available, how selected measures have been used, the multiple countries served, and in particular the multiple surveys.
CERAD NEUROPSYCHOLOGY ASSESSMENT BATTERY (CERAD-NAB)
Content
In 1986, the CERAD neuropsychology task force, headed by Richard Mohs, PhD, developed a brief battery of seven measures (20–30 minutes administration time), based on the initial NINCDS-ADRDA guidelines for cognitive assessment of AD. In order, these measures were: Verbal fluency (semantic fluency, naming as many animals as possible in 60 seconds); confrontational naming (naming 15 outline drawings selected from the 60-item Boston Naming test) [11]; a brief overall test of cognitive function (Mini-Mental State Exam [MMSE]) [12]; Word List Memory/Learning (10 common nouns administered in random order on three successive occasions, with recall after each occasion; to reduce confusion, this test will be called Word List Learning in this paper); Constructional Praxis (copying four designs ranging in complexity) [13]; Word List Recall (of the 10 original words), and Word List Recognition (the 10 original words embedded with 10 foils). A later addition included recall of Constructional Praxis. Approximately two decades later, measures to assess executive function (Trails A and B), and phonemic fluency (words starting with “S”), taking about 10 minutes, were added to the German translation (CERAD Plus) [6], and in Luxembourg to their French translation.
Scoring individual measures
Summary, derived, and combined scores have been developed. For instance, the Word List Learning measure can yield at least three scores: the score on the third administration, the sum of the three administrations, and the number of intrusions; Word List Recall provides three scores: the number of words recalled, the percentage of words learned that are recalled, and the number of intrusions; similarly, Constructional Praxis Recall can be scored as the score obtained, and as the percentage of the original score retained. Scores of different measures of the same concept may be combined (e.g., memory measure combinations).
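The derived scores described above can be illustrated with a minimal sketch. The function and field names are hypothetical, and the choice of the third learning trial as the denominator for "percentage of words learned that are recalled" is an assumption; individual studies may define the savings score differently.

```python
def word_list_derived_scores(trial_scores, delayed_recall, intrusions=0):
    """Return summary and derived Word List scores (illustrative only).

    trial_scores   -- words recalled on each of the three learning trials (0-10 each)
    delayed_recall -- words recalled on Word List Recall (0-10)
    intrusions     -- count of words produced that were not on the list
    """
    if len(trial_scores) != 3:
        raise ValueError("expected three learning trials")
    if not all(0 <= t <= 10 for t in trial_scores) or not 0 <= delayed_recall <= 10:
        raise ValueError("scores must lie between 0 and 10")

    third_trial = trial_scores[2]
    # Assumption: "words learned" is taken as the third-trial score.
    savings_pct = 100.0 * delayed_recall / third_trial if third_trial else None
    return {
        "learning_total": sum(trial_scores),  # sum of the three administrations
        "third_trial": third_trial,           # score on the third administration
        "delayed_recall": delayed_recall,
        "savings_pct": savings_pct,           # undefined when nothing was learned
        "intrusions": intrusions,
    }


def praxis_retained_pct(copy_score, recall_score):
    """Percentage of the Constructional Praxis copy score retained at recall."""
    return 100.0 * recall_score / copy_score if copy_score else None
```

For example, learning trials of 4, 6, and 8 words followed by a delayed recall of 6 words give a learning total of 18 and, under the stated assumption, a savings score of 75%.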
Total score
A Total Score (TS1), based on the sum of all measures except the MMSE and constructional praxis recall, and excluding derived scores, has been developed, and tested extensively [14]. TS2 includes the Constructional Praxis Recall score [15] but is less used than TS1.
TS1 has been found able to identify clinically relevant change in level of cognition, i.e., progression of AD; to distinguish cognitively normal controls from MCI and AD, and between stable and progressive MCI; and permit early detection of AD, prevalent dementia, and incident dementia (Table 1, and Supplementary Material A, which also includes information on a Total Score for patients with Parkinson’s disease based on CERAD Plus) [14–19].
Table 1. CERAD Total Scores that distinguish cognitively normal performance, MCI, and dementia status from each other (by education categories where available)
Information above has been adapted from tables presented in the references indicated. AD, Alzheimer’s disease; AgeCoDe, Ageing, Cognition and Dementia study of primary care patients age 75 years and over in Germany; AUC, area under the curve; aMCI, amnestic mild cognitive impairment; MCI, mild cognitive impairment; SD, standard deviation; Sensitivity, percentage of cases correctly identified; Specificity, percentage of non-cases correctly identified; TS1, Sum of scores (maximum scores in parentheses) on Verbal Fluency (24), 15-item modified Boston Naming test (15), Word List Learning test (30), Constructional Praxis copy (11), Word List Recall (10), Word List Recognition (10). Total maximum score = 100; TS2, All items of TS1 plus Constructional Praxis recall (11). Total maximum score = 111.
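The composition of TS1 and TS2 given in the table note can be sketched as a short calculation. The dictionary keys and function name are hypothetical, and capping each subtest at its stated maximum (relevant mainly for Verbal Fluency, where more than 24 animals can be named) is an assumption made so that the total cannot exceed 100.

```python
# Maximum subtest contributions to TS1, as listed in the table note.
TS1_MAX = {
    "verbal_fluency": 24,
    "boston_naming": 15,
    "word_list_learning": 30,
    "constructional_praxis": 11,
    "word_list_recall": 10,
    "word_list_recognition": 10,
}
CP_RECALL_MAX = 11  # Constructional Praxis recall, added for TS2


def total_scores(scores, cp_recall=None):
    """Return (TS1, TS2); TS2 is None when praxis recall is unavailable."""
    ts1 = 0
    for name, maximum in TS1_MAX.items():
        value = scores[name]
        if value < 0:
            raise ValueError(f"{name} cannot be negative")
        # Assumption: a subtest score above its maximum is capped.
        ts1 += min(value, maximum)
    ts2 = None if cp_recall is None else ts1 + min(cp_recall, CP_RECALL_MAX)
    return ts1, ts2


example = {
    "verbal_fluency": 30,  # capped at 24 under the assumption above
    "boston_naming": 14,
    "word_list_learning": 21,
    "constructional_praxis": 10,
    "word_list_recall": 7,
    "word_list_recognition": 9,
}
print(total_scores(example, cp_recall=8))
```

The example values are invented for illustration only. The MMSE is deliberately absent from both totals, in keeping with the definition of TS1.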
The CERAD-NAB can assess cognition ranging from normal status through moderate AD (Clinical Dementia Rating [CDR] 2) [20]. Few patients with severe dementia (CDR 3) are able to give scorable responses. At that level, physiologically-based measures may be needed. For the levels at which it is appropriate, the CERAD-NAB has been identified as among the most accurate [21].
PSYCHOMETRIC CHARACTERISTICS OF THE CERAD-NAB
Reliability and validity
Information on one-month test-retest reliability was determined from 632 patients with mild or moderate AD, and 394 control subjects who had been entered into the CERAD database. Correlations (values for AD patients precede those for control subjects) were: Verbal fluency (0.80, 0.71), 15-item Boston Naming test (0.91, 0.71), MMSE (0.87, 0.67), Word List Learning (0.80, 0.65), Word List Recall (0.56, 0.67), Word List Recognition original words (0.53, 0.35), foils (0.60, 0.16). Information on Constructional Praxis recall was not available since this measure was added later. Some correlations were lower because of ceiling effects (for control subjects) and floor effects (for patients with dementia) [22]. With the exception of recognition of foils in the Word List Recognition test (significant at p < 0.01), all correlations were significant at p < 0.0001. Inter-rater reliability was high, with intraclass correlation coefficients ranging from 0.92 for Constructional Praxis to 1.00 for Word List Recall [20, 24]. The availability of training tapes facilitates high inter-rater reliability.
The validity of a diagnosis of dementia was originally confirmed by neuropathology findings [25]. Cross-sectional data indicate that average level of performance was poorer as stage of AD increased, with some measures reaching floor before others in both clinical and nonclinical settings [26 –28]. Longitudinally, the rate of decline increased with decrease in initial level of performance [29], a finding that has not been confirmed [30].
Among the translations, validity has been reported for the Chinese-Cantonese [31], Finnish [32], German [17, 34], Korean [35], and Spanish for Colombia [36]. Evaluation using the German translation indicated that, with rare exception, all measures distinguished normal controls from AD patients, and that dementia was identifiable at an early stage [16]. In addition, the battery discriminated among depression, MCI, and various dementias [33, 37]. A combination of Verbal Fluency, Word List measures, and Constructional Praxis recall discriminated between controls and AD patients with 93% accuracy, and has been used when brevity is necessary.
Norms
Norms for each individual measure, for certain combinations of measures, and for TS1 and TS2, are typically based on epidemiological data. As is common for cognitive measures, age, education, and to a lesser extent gender, had significant effects on cognitive performance in healthy older people [38], with specific tasks differentially affected by these demographic characteristics [39] (see also studies referenced in [40]). Increased age and lower education were often associated with poorer performance, while the effect of sex differed across measures. Level of performance has been found to vary within a given country [23, 41], and across countries [18, 42]. Clinicians who use the CERAD-NAB German translation or CERAD Plus have an additional resource at the www.memoryclinic.ch website, which provides demographically characterized norms for these specific measures.
Published norms are available for the cognitively intact; for MCI; and by stage, for AD. They may be presented by specific age or age categories, cross-tabulated by other demographic characteristics (e.g., education, gender), adjusted for these and additional characteristics (e.g., depression, waist circumference) [43], or determined by regression equations. Most norms are based on cross-sectional data, but some longitudinal norms have also been published. Of those, some are labelled “robust”, i.e., they are obtained from persons whose cognitive status has remained stable over time, and so may offer superior clinical utility [42]. Norms may have been obtained incident to the main focus of a study, as for instance, information on annual performance scores for up to five years on all neuropsychology measures for 236 initially very mild/mild community-resident AD patients in the ALSOVA study [44]. A brief review of many studies that present norms is given in Supplementary Material B.
Factorial structure
The factorial structure of the basic CERAD battery has been examined with inconclusive findings: three factors (memory, language, praxis) in a CERAD tertiary care AD sample (n = 354) [26], a diverse AD pilot sample (n = 35) [45], a Korean sample of patients with dementia (n = 106) [46], and a Colombian sample [36]; five factors (memory, learning, language, praxis, executive function) in healthy elderly [38]; a single factor that assesses severity of cognitive impairment in a CERAD AD sample (n = 913) [47], a Veterans Administration (VA) AD sample [48], and a cognitively normal Yoruba sample (n = 100) [49]; and two factors in a combined AD/nonAD VA sample [49] and a cognitively intact African American sample (n = 86) [50]. Disagreements probably reflect sample differences (e.g., education level), the variables used, their manner of combination and coding, and sharp differences in statistical approaches.
The factorial structure of CERAD Plus has also been examined, using both exploratory factor analysis (EFA; letting the data indicate what is present), and confirmatory factor analysis (CFA; checking whether the data agree with assumptions about what is present) [51]. Principal component analysis of baseline Berlin Aging Study II data, using age-, education-, and gender-corrected z-scores of the 11 CERAD Plus tests (n = 1,380; age 60–80 years; 52% women), indicated solutions with up to five factors, the four-factor solution being preferred, and confirmed as having the best fit when Word List intrusions and the abbreviated Boston Naming test were dropped.
Further CFA, which omitted the dropped measures, identified five models: a general latent factor that included all measures; a two latent factor model, relevant to distinguishing amnestic from nonamnestic MCI; a three latent factor model comparable to that previously reported [26, 46]; and a four latent factor model (verbal memory, visuo-construction, executive functions and processing speed, verbal fluency), initially identified by EFA.
Practice effects
It is important to be able to distinguish practice effects (change in performance due to learning because of multiple administrations of the same measure, and not attributable to aging), from change due to a particular intervention. Review of the few studies then available [52], found, overall, that persons with normal cognition had no gain in Verbal Fluency, some improvement on the Word List measures, and possibly a decline in Constructional Praxis, findings similar to those reported previously [23]. Examining evaluations two and four years post-baseline, a recent study in Finland found statistically significant but minor improvement in modified Boston Naming and Word List Recognition, but not in Verbal Fluency, Constructional Praxis, Word List Learning or Word List Recall in the cognitively normal group [43]. One study, with a one-year interval, found improvement in the Word List measures for the cognitively normal group, but not for those with mild AD [53]. Similarly, a small study of two matched samples with mild/moderate AD found a decrease in the TS1 score of the untreated group across the first three evaluations, with a slight improvement on the fourth occasion [54].
In their own study, Mathews and colleagues [52] used data obtained on five annual occasions from a sub-study of cognitively normal men in the PREADViSE and SELECT studies (N = 308, mean age 70 years, 46% >college education, 88% White) [55]. They found little change over time for the 15-item Boston Naming, Constructional Praxis, and Word List Recognition tests, all of which showed ceiling effects initially. Where there was room for improvement (Word List Learning, Word List Recall), none occurred, probably because the words were changed on each administration. For Verbal Fluency and TS1, there was a statistically significant but clinically irrelevant improvement through year 4, with a slight decline at year 5. It was, however, notable that those with low TS1 on entry had erratic scores, which fell at year 5. Improvements were not attributable to dropout.
This limited information suggests that repeated administrations alone seem to have little positive effect on performance of patients with AD. Performance may increase among those who are cognitively normal, but the increase does not appear to be substantial, and may be concealed by ceiling effects. This is confirmed by analysis of AD patients and control subjects in the CERAD database who provided information on at least two, and up to five occasions (i.e., up to four years) [30]. Over four years control subjects had a minimal annualized improvement (∼3 points) in TS1, but there was an average decline of ∼22 points for AD patients, with a mean annualized rate of –7.2 points regardless of severity of condition. Change scores > ±10.5 points were needed to indicate a reliable, clinically meaningful change, but this was based on a very conservative method. Issues that arise, and appear to be little addressed, are the time interval over which change in score is measured (hours, weeks, months, years), and the impact of familiarity with the testing situation. The latter is sometimes handled by providing practice sessions before what is accepted as the initial administration.
Reliable change indices
Reliable change indices for Verbal Fluency and the Word List tasks have been examined [56]. Data (baseline, 1.5 years, and 3 years later) came from 1,450 participants age ≥75 years in the AgeCoDe study, who remained cognitively normal over a period of three years [57]. Performance, cross-categorized by age (70–79 years, ≥80 years), education (elementary, >elementary schooling), and sex is reported for each measurement occasion, as well as change in score between measurement times, and test-retest reliability. The reliable change indices provided (90% confidence interval), take into account measurement error, practice effects, and normal age-related cognitive decline.
Focusing on the Word List Learning and Recall tasks, and savings scores provided by a sample of 368 women in the Australian Women’s Healthy Ageing Project, reliable change index scores have been presented for the overall group, and separately for those with APOE ɛ4, those without, healthy controls, and persons with AD or MCI. The reliable change index, calculated as the difference in score between two timepoints divided by the standard error of the difference, and accepting ±1.96 standard deviations as reliable change, varies considerably across these groups [42].
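The reliable change index described in this section can be sketched as follows. The text specifies only that the score difference is divided by the standard error of the difference; deriving that standard error from the baseline standard deviation and the test-retest reliability (the widely used Jacobson-Truax formulation), and subtracting a mean practice effect, are assumptions here and not necessarily the procedures used in the cited studies. All names and numbers are illustrative.

```python
import math


def reliable_change_index(score_t1, score_t2, sd_baseline, retest_r,
                          practice_effect=0.0):
    """Score change divided by the standard error of the difference.

    sd_baseline     -- standard deviation of the measure in a reference sample
    retest_r        -- test-retest reliability of the measure (0 <= r < 1)
    practice_effect -- mean change expected from retesting alone (optional)
    """
    if not 0.0 <= retest_r < 1.0:
        raise ValueError("retest_r must be in [0, 1)")
    # Assumed formulation: SEM = SD * sqrt(1 - r); SE_diff = sqrt(2) * SEM.
    sem = sd_baseline * math.sqrt(1.0 - retest_r)
    se_diff = math.sqrt(2.0) * sem
    return (score_t2 - score_t1 - practice_effect) / se_diff


def is_reliable_change(rci, critical_value=1.645):
    """True when |RCI| exceeds the critical value (1.645 for a 90% interval)."""
    return abs(rci) > critical_value


# Invented numbers: a 10-point drop on a measure with SD 5 and reliability 0.75.
rci = reliable_change_index(20, 10, sd_baseline=5.0, retest_r=0.75)
print(round(rci, 2), is_reliable_change(rci))
```

Because the index depends on both the reference standard deviation and the reliability, the same raw change can be reliable in one group and not in another, which is consistent with the group-to-group variation noted above.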
DIAGNOSTIC CAPABILITIES OF THE CERAD-NAB
This battery has been relied on to help identify AD and its subtypes, progression in AD, and precursor conditions, i.e., MCI in its different forms. Of particular interest in this regard is identification of improvement in cognitive functioning not attributable to change due to practice, in particular the extent to which MCI is reversible. Transition rates among normal cognition, MCI and dementia have been reported based on data from the Nun Study [58, 59]. In agreement with other studies, they found both stability in MCI, and reversion to normal cognition, the rate of reversion being twice as great under the age of 90 as above that age.
Identification of AD subtypes
AD subtypes, identified by Q-type factor analysis, indicated that approximately 75% of 960 patients in the CERAD database could be classified into three distinct clusters, and 45% of 465 controls into two distinct clusters. While impaired verbal learning was present for all three AD clusters (a hallmark of AD), they differed on semantic and visual-construction measures. “Control” clusters differed from “AD” clusters [60]. A grade of membership analysis (GoM) on CERAD’s first 718 AD patients identified six clinical pure types—AD with: parkinsonism; depressive symptomatology; mild language problems (parkinsonism and depressive symptoms absent); impaired cognitive status with problems performing instrumental activities of daily living; late onset, mild AD; late onset AD of long duration, severe at entry and with additional chronic disease present. Neuropsychological data yielded five pure types, which differed primarily in level of neuropsychological performance. Clinical and neuropsychological types were not notably related. Relevant identifying data for other pure types (e.g., familial AD) was absent [61]. GoM is based in fuzzy set theory, and these “pure” types may overlap.
The addition of biological characteristics identifies other subtypes. For instance, four distinct trajectories of tau, with unique cognitive profiles and progression have been identified by tau-positron emission tomography [62]. Genetic variants determine some subtypes.
Statistical and pattern of response approaches to identifying and distinguishing among controls, MCI, AD/other dementias
Multiple statistical approaches have been used to identify and distinguish among normal cognition, MCI, dementia, and AD. Most recently this includes studies in Thailand [63], comparing “receiver operating characteristic analysis, Linear Support Vector Machine, Random Forest, Adaptive Boosting, Neural Network models, and t-distributed stochastic neighbor embedding”. Based on a sample of 60 MCI patients and 63 normal controls, Word List Learning and Recall, and Verbal Fluency best discriminated MCI patients from controls. There was, however, considerable overlap between the two groups, and the tests were not predictive of MCI. Machine learning, however, was able to discriminate AD from other dementias with 82% accuracy [64].
In Brazil, a group of investigators consisting in various combinations of Brasil Filho, Pinheiro PR, Pinheiro MCD, Coelho, Costa, and de Castro, examined a multicriteria classification model, a hybrid model, and compared two multicriteria classification methods for AD identification (work published in 2008–2010 in issues of Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). It is not yet clear which input variables facilitate machine learning—neuropsychological, clinical, or biological (e.g., tau, amyloid-β, amyloid-β ratios), or a combination.
Analysis using association rule learning (a rule-based machine learning method), applied to CERAD baseline and one-year data, found that it was possible to distinguish among controls, mild, and moderate to severe AD according to the number of frequent item sets retained across both evaluation time points [65].
The most straightforward, easy-to-use information, comes from the Total Score (Table 1), which can identify cognitively normal persons, persons with AD, and stable and progressive MCI.
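How a Total Score cutoff translates into sensitivity and specificity can be shown with a small sketch. The scores, diagnoses, and the cutoff of 75 below are invented for illustration and are not values from Table 1; the rule that scores at or below the cutoff are classified as cases is likewise an assumption.

```python
def sensitivity_specificity(scores, is_case, cutoff):
    """Classify scores at or below the cutoff as cases and compare with diagnosis.

    Returns (sensitivity, specificity): the proportion of cases correctly
    identified and the proportion of non-cases correctly identified.
    """
    pairs = list(zip(scores, is_case))
    true_pos = sum(1 for s, c in pairs if c and s <= cutoff)
    false_neg = sum(1 for s, c in pairs if c and s > cutoff)
    true_neg = sum(1 for s, c in pairs if not c and s > cutoff)
    false_pos = sum(1 for s, c in pairs if not c and s <= cutoff)
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical TS1 values for four cases and four non-cases.
toy_scores = [55, 62, 78, 70, 81, 88, 90, 74]
toy_is_case = [True, True, True, True, False, False, False, False]
print(sensitivity_specificity(toy_scores, toy_is_case, cutoff=75))
```

Raising the cutoff identifies more cases at the cost of more false positives, which is why published cutoffs are usually reported together with both values and, often, by education category.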
The pattern of response across the neuropsychology battery may identify presymptomatic AD [66], and distinguish among different types of MCI [67, 68]. Comparison of the performance of patients with AD with that of patients with other dementias (vascular, mixed, frontotemporal lobe, Lewy body disease), MCI, and depression, has yielded different findings. One study found that while patients with AD could be distinguished from control subjects, it was not possible to accurately distinguish among AD, vascular dementia, and mixed dementia; MCI from normal control subjects; or AD from depression [33]. Others, however, have found that the pattern of cognitive deficits identified by the battery can distinguish between frontotemporal dementia (FTD) and AD [69, 70]; among FTD, semantic dementia, and AD [37]; between FTD and vascular dementia [71], as well as between dementia and nondementing disorders such as depression [72], and schizophrenia [73]. The pattern of performance of persons with early Huntington’s disease was comparable to that found in frontal-subcortical dementia [74].
In a carefully controlled and statistically sophisticated study [6], CERAD Plus, with its additional executive function and phonemic measures, demonstrated better discriminability than CERAD-NAB in distinguishing between persons with normal cognition and patients with mild or moderate AD, mixed dementia, vascular dementia, and FTD, but was similarly unable to distinguish patients with AD from the group of other dementias. The statistical procedure used, which adjusted for missing values and allowed for inclusion of all measures, also provided a prediction score for the dementia diagnosis identified. (The high non-response rate to Trail Making Test B was itself indicative of a dementing disorder.)
COGNITIVE ASSESSMENT OF NON-DEMENTING CONDITIONS AND COGNITIVE INTERVENTIONS
The CERAD-NAB has also been used to assess cognitive status in non-dementing conditions that are possibly associated with, or risk factors for AD and other dementing disorders, including depression [72, 76], Parkinson’s disease [77, 78], schizophrenia [76], and diabetes [79, 80]. Cognitive status in a wide variety of other disorders has also been explored (e.g., aphasia, fibromyalgia, glaucoma, gustatory muscle strength, Huntington’s disease, hypertension, inflammation, multiple system atrophy, olfactory problems, thyroid conditions).
The battery has also been used in pharmaceutical studies to assess the impact of drug interventions; identify post-operative cognitive decline [81–84]; examine the impact of environmental conditions, both conditions that may have a positive cognitive effect (e.g., selenium in soil) [85, 86], and those that may have a negative effect, such as air pollution (SALIA study) [87].
GENETICS
Major studies that include CERAD measures and genetic information include the Cache County Study on Memory in Aging [88], Dementia Competence Network in Germany [89], and the Leipzig Life-Adult studies [90].
Of particular interest is a community of approximately 5,000 residents in Antioquia, Colombia, with early-onset familial Alzheimer’s disease attributable to the E280A mutation on the presenilin-1 gene. These residents have been studied extensively using CERAD-COL, the Colombian Spanish translation of CERAD-NAB, together with multiple additional measures, because of the information they can provide on biological risk factors for AD [19, 91].
Consideration of the role of APOE4 remains a constant in many studies. For instance, among those with the APOE4 genotype, peripheral blood biomarkers have been found to be associated with semantic and episodic memory impairments among persons with amnestic MCI and AD [92].
CERAD Plus has been used to identify specific genetic markers for cognitive impairment and gene-gene interactions (e.g., GSTO1*C, KIBRA rs17070145 gene [94], estrogen receptors [95]), gene-associated cognitive status in various health conditions (e.g., Parkinson disease) [78, 96], and in genome-wide association studies of post-operative cognitive decline [97].
ASSOCIATION OF NEUROPSYCHOLOGY MEASURES WITH BRAIN NEUROANATOMY
In neuroimaging studies, the battery has been used to identify and examine regions of the brain associated with particular types of cognition [40, 98–100] in neuronal degeneration, and asymmetric disease [101]. Awareness of brain regions associated with specific types of cognition has also been used to confirm the efficacy of Transcranial Pulse Stimulation (TPS), i.e., that TPS focused on specific regions improves cognitive functioning determined by those regions [45].
TRANSLATION, ISSUES IN TRANSLATION
Based on successful clinical use by the Alzheimer’s Disease Research Centers (ADRCs) current at the time, CERAD’s project officer suggested translation to encourage broader uniform use of AD assessments. CERAD did so initially with translations for use in France (for an epidemiological survey) [102], French-speaking Quebec, Spain, Portugal, and Brazil. The complete CERAD neuropsychological battery has now been translated into 18 languages, some with linguistic variants, e.g., three variants for Arabic, two for French, two for Chinese (Cantonese, Mandarin), with additional languages in process (Table 2). In addition, there is a translation into Finnish sign language. Notably, the Word List measures have been translated and administration modified by the 10/66 program for use in multiple low/middle income countries [103].
Table 2. Languages into which CERAD-NAB has been translated
References and an extensive list of norms, including sample descriptions, are provided in Supplementary Material B. Studies in which these translations have been used are given in Supplementary Material C. The CERAD 15-item abbreviation of the Boston Naming test is only provided with evidence of permission for use from the Boston Naming copyright holder, ProEd; https://www.proedinc.com/Reprint-Permissions.aspx. The items of the CERAD 15-item abbreviation of the Boston Naming Test are not uniform across languages, but are selected to represent high, medium, and low frequency words in the language of administration. Differences across languages may be substantial. Chinese (Mandarin), Finnish, and Korean use unique outline drawings that overlap with none of the original 60 Boston Naming test drawings; selection of items for the Hebrew version is uniquely related to the language. BN, modified Boston Naming test; CERAD-NAB, Consortium to Establish a Registry for Alzheimer’s Disease – neuropsychological assessment battery; CP, Constructional Praxis; MMSE, Mini-Mental State Exam; VF, Verbal Fluency; WL, Word List measures.
Translation procedures
With rare exception, translations adhere to best practices: forward and backward translation using professional bilingual translators; panels that include, variously, neurologists, psychologists, neuropsychologists, and social workers [104]; and procedures that are culturally appropriate [31, 103–108]. Translations are piloted and revised as needed. The underlying psychological concepts are maintained. For the Word List measures, particular attention is paid to word frequency (sometimes relying on general knowledge in the absence of word frequency lists), phonemic similarity, and imagery equivalence. Translations were also generally reviewed through the CERAD home office by a native speaker of the language in question. Where feasible, the translated material maintained the format of the original.
Secular changes have occurred since battery development in 1986. The measure most affected has been the culture-sensitive CERAD 15-item abbreviation of the Boston Naming test. (Note: Available only with permission of the copyright holder, ProEd.) The original low, medium, high word frequency in the U.S., represented by the selected outline drawings, has changed over time, and familiarity with the items represented varies across countries. Some countries had developed and normed their own set of items and substituted them (e.g., China (Mandarin), Finland, Israel for Hebrew, Korea), promoting continuity with their previous findings, or particulars of their own language. Other countries (e.g., Thailand) replaced some drawings based on new norms they developed [109]. Valuable examples of how word frequency was maintained, while making the test nationally applicable and providing norms have been published for a Hebrew-speaking population in Israel [110], and for Thailand [109].
Person-specific challenges
More challenging were problems evaluating cognitive status in persons with literacy or physical limitations; administration is not yet uniform. For the minimally literate, or the visually impaired, each item of the Word List Learning and Recognition measures may be read aloud, and then repeated by the person being tested, so maintaining two of the input modes. For the hearing impaired, sound magnification using accessible cheap devices may be feasible; alternatively, if literate, a hearing impaired individual may be asked to read the directions, and pronounce each test word. As yet, there is no substitute for measures that are inappropriate for persons with certain physical limitations.
A largely illiterate rural population in India, unfamiliar with many features the West takes for granted, offered particular challenges. The solution was to maintain the underlying neuropsychological concept assessed by each measure, while modifying the measures so that they were culturally appropriate. For instance, since there was familiarity with neither outline drawings nor photographs of objects, small replicas of items were substituted to evaluate confrontational naming [111, 112]. The resulting battery, in English and Hindi, is available at: https://www.dementia-epidemiology.pitt.edu/indous-instruments/ (checked 31 July 2022).
CERAD-NAB AS A UNIFYING CLINICAL SOURCE
In Switzerland, approximately 10 years after CERAD-NAB was introduced, Monsch and his colleagues translated the battery into German, and recommended its use as a uniform, internationally compatible measure and minimal dataset for German-speaking Europe (Switzerland, Austria, Germany) [6, 113–115]. Validation and psychometric characteristics were determined [34]. A decade later CERAD Plus (see
Importantly, a web site was created that provided clinicians with sex-, age-, and education-categorized norms, initially for CERAD-NAB (based on a sample of 1,100 German-speaking adults, age 49–92 years), and later also for CERAD Plus (adding information from 604 adults, age 55–88 years). Qualified users may deposit their data, which, after processing, provides additional norms for clinic patients (https://www.memoryclinic.ch/de/main-navigation/neuropsychologen/) [116].
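As a rough illustration of how demographically categorized norms of this kind are generally applied, the sketch below converts a raw score into a z-score relative to the norm cell matching a person's sex, age band, and education band. It is not the procedure used by the web site: the band boundaries, the `NORMS` table, the helper names, and every numeric value are placeholders invented for this example and are not published CERAD-NAB norms.

```python
from typing import NamedTuple


class NormCell(NamedTuple):
    mean: float
    sd: float


# PLACEHOLDER values for illustration only; these are NOT published norms.
# Keys: (sex, age band, education band). Bands are hypothetical.
NORMS = {
    ("F", "<70", "<=12y"): NormCell(20.0, 4.0),
    ("F", "<70", ">12y"): NormCell(22.0, 4.0),
    ("F", ">=70", "<=12y"): NormCell(18.0, 4.0),
    ("F", ">=70", ">12y"): NormCell(20.0, 4.0),
    ("M", "<70", "<=12y"): NormCell(19.0, 4.0),
    ("M", "<70", ">12y"): NormCell(21.0, 4.0),
    ("M", ">=70", "<=12y"): NormCell(17.0, 4.0),
    ("M", ">=70", ">12y"): NormCell(19.0, 4.0),
}


def age_band(age: int) -> str:
    """Map age in years to a hypothetical two-level age band."""
    return "<70" if age < 70 else ">=70"


def education_band(years: int) -> str:
    """Map years of education to a hypothetical two-level band."""
    return "<=12y" if years <= 12 else ">12y"


def z_score(raw: float, sex: str, age: int, education_years: int) -> float:
    """Standardize a raw score against the matching norm cell."""
    cell = NORMS[(sex, age_band(age), education_band(education_years))]
    return (raw - cell.mean) / cell.sd


# Example: a 65-year-old woman with 10 years of education and a raw score of 16.
print(z_score(16, "F", 65, 10))
```

A lookup keyed on categories keeps the standardization step independent of the norm source, so updated or enlarged norm samples would only change the table, not the scoring logic.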
ALTERNATIVE MODES OF ADMINISTRATION AND SCORING
Aside from computer-assisted in-person interviewing (CAPI) or telephone (CATI) administration, we have identified few studies that used computer-based resources, and none using an iPad, although computer-based resources for examiner administration, and subject response to cognitive assessments have been available for over four decades [117, 118].
Computer-based scoring
The Verbal Fluency and Word List Learning and Recall measures are the tasks that have most frequently lent themselves to computer-based use. Using CATI, which followed the standard script, the Verbal Fluency and Word List Learning and Recall tasks were administered to 18,072 nationally representative participants age 45 years and older in the REGARDS study, after determining that the subject could respond correctly and had given permission [119–121]. Since the Word List items could not be shown to or read by the participants, the words were recorded for telephone use, ensuring consistent administration. Verbal Fluency task responses were recorded onto electronic sound files that also noted task time. Responses were later played back for scoring.
While gathering data was straightforward, scoring was complex. A human interface was required to recognize problems in administration, including device-related issues; identify accuracy of animal naming, repetitions, and intrusions; and activate computer-based resources for summarizing scoring information. The entire approach for the Verbal Fluency measure involved considerable training and checking [121 and supplement]. The process, however, ensured inter-examiner agreement and accuracy of scoring, separated administration and data gathering from scoring (potentially reducing costs), and facilitated use of more sophisticated staff for scoring. Findings (number of acceptable animal names mentioned), compared well with the conventional approach.
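The counting step at the end of such a pipeline can be sketched as follows, assuming the responses have already been transcribed by a human scorer. This is a minimal illustration and not the REGARDS procedure: the function name, the `animal_lexicon` set, and the rule that any word outside the lexicon is an intrusion are assumptions made here. In practice, judging accuracy (e.g., synonyms or regional names) is exactly the part that required trained staff.

```python
def score_animal_fluency(responses, animal_lexicon):
    """Count correct names, repetitions, and intrusions in a transcript.

    responses: transcribed words in the order spoken.
    animal_lexicon: hypothetical set of lower-case words accepted as animals.
    """
    seen = set()
    correct = repetitions = intrusions = 0
    for word in responses:
        w = word.strip().lower()
        if not w:
            continue  # skip empty transcription entries
        if w not in animal_lexicon:
            intrusions += 1   # not an accepted animal name
        elif w in seen:
            repetitions += 1  # acceptable name, but already given
        else:
            seen.add(w)
            correct += 1      # first occurrence of an acceptable name
    return {"correct": correct, "repetitions": repetitions, "intrusions": intrusions}


# Example with a toy lexicon (illustrative only).
lexicon = {"dog", "cat", "horse"}
print(score_animal_fluency(["Dog", "cat", "dog", "table", "horse"], lexicon))
```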
Related to this work, additional scores for different aspects of the Verbal Fluency test (e.g., number of correct responses in the first and last 30 seconds, mean cluster sizes, number of switches between clusters) have been proposed and evaluated on subjects in two Korean surveys of dementia (KLoSHA, NaSDEK) [122]. A weighted score, based on these subscores and adjusted for demographic characteristics, has been developed, with ability to identify AD comparable to that of the MMSE [12], but without the drawbacks of that measure. The Verbal Fluency test can be self-administered on a smartphone using an Android application, “Traffic Light for Dementia” (TLiDe; https://play.google.com/store/apps/details?id=com.appmd.dementia.signal), which is equipped with voice recognition and automated scoring, and which provides the probability of AD in one to three days [123]. This approach remains to be assessed in non-Korean-speaking populations, where voice recognition may work better.
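To make the subscores named above concrete, the following sketch computes them from time-stamped responses. It is an illustration under stated assumptions, not the scoring used in [122] or in the TLiDe application: a 60-second task is assumed, a cluster is defined here simply as a run of consecutive correct words sharing a semantic subcategory, and the `category_of` mapping and all names are hypothetical. Published clustering rules differ in detail, and the weighted score itself is not reproduced.

```python
from dataclasses import dataclass


@dataclass
class Response:
    word: str
    t: float  # seconds from the start of the task


def fluency_subscores(responses, category_of, task_seconds=60.0):
    """Compute time-window counts, mean cluster size, and switches.

    category_of: hypothetical dict mapping each accepted animal name
    (lower case) to a semantic subcategory label.
    """
    seen = set()
    valid = []  # first occurrences of accepted words, in time order
    for r in sorted(responses, key=lambda r: r.t):
        w = r.word.strip().lower()
        if w in category_of and w not in seen and 0 <= r.t < task_seconds:
            seen.add(w)
            valid.append((w, r.t))

    half = task_seconds / 2
    first_half = sum(1 for _, t in valid if t < half)
    last_half = len(valid) - first_half

    # Assumed definition: a cluster is a run of consecutive same-category words.
    cluster_sizes = []
    previous = None
    for w, _ in valid:
        category = category_of[w]
        if category == previous:
            cluster_sizes[-1] += 1
        else:
            cluster_sizes.append(1)
            previous = category

    switches = max(len(cluster_sizes) - 1, 0)
    mean_cluster = sum(cluster_sizes) / len(cluster_sizes) if cluster_sizes else 0.0
    return {
        "correct_first_half": first_half,
        "correct_last_half": last_half,
        "mean_cluster_size": mean_cluster,
        "switches": switches,
    }


# Example with toy categories (illustrative only).
categories = {"dog": "pet", "cat": "pet", "lion": "wild",
              "tiger": "wild", "cow": "farm", "pig": "farm"}
spoken = [Response("dog", 2), Response("cat", 5), Response("lion", 12),
          Response("tiger", 20), Response("cow", 35), Response("pig", 40),
          Response("dog", 50)]
print(fluency_subscores(spoken, categories))
```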
The HCAP sub-study of the Health and Retirement Study (Lindsay Ryan, personal communication, August 13, 2021), used CAPI to administer the Word List measures, with responses recorded by the interviewer on the original scoring sheets programmed for laptop entry.
Administration of the Mild Cognitive Impairment Screen (MCIS) made fuller use of computer resources [124]. CAPI was used to administer the battery, but responses, including order of word recall, were entered into a programmed scoring algorithm that identified “normal” versus “impaired” status, provided demographic-based information for each item, and gave recommendations for additional evaluation if necessary [125]. (MCIS used directions for administration different from the standard but maintained the original test words.)
OTHER USES
CERAD-NAB has been used to estimate the impact on cognitive functioning of naturally occurring circumstances and planned interventions. Among others, these include naturally occurring selenium [86]; antioxidants (PReADVISE and SELECT trials) [55]; the effect of reality orientation [54]; and of hormone replacement therapy [126, 127].
Dementia has real life consequences for the individual with dementia, family and friends, society, and the business sector. CERAD-NAB data have been used to examine health service use [128], estimate the cost of AD (see CADAS study below), and by the long term care insurance industry [129].
MAJOR EPIDEMIOLOGICAL SURVEYS OF DEMENTIA
A major use of the entire battery, or a subset of its measures, has been in cross-sectional and longitudinal epidemiological surveys. We focus first on U.S. studies, particularly those with an international component, before reporting on epidemiological studies elsewhere. Additional details are given in Supplementary Material C.
Epidemiological studies in the U.S. that use the entire CERAD-NAB
The ADAMS eight-year study of prevalence and incidence of dementia is the first, and to date only nationally representative survey of dementia in the U.S. [130, 131].
Studies of specific geographic areas include the 18-year Cache County Study of Memory in Aging, in Utah [88, 132]; the 10-year KAME project of Japanese residents of King County, Washington State [107, 133], with a counterpart in Hawaii, the Honolulu-Asia Aging Study [107, 134]; the 15-year Monongahela Valley Independent Elders Study in rural Pennsylvania [135], and its counterpart, an older, largely illiterate, rural population in Ballabgarh, India [111, 136]; the 20-year Indianapolis-Ibadan study of Black residents of Indianapolis primarily originating from West Africa, compared with Yoruba residents in Ibadan [137, 138]; and the Chicago Health and Aging Project, which used CERAD in its early years [139]. The Black/White dementia study in the North Carolina Piedmont area is cross-sectional, but with a look-back component permitting assessment of disease incidence [140]; the 15-year School Sisters of Notre Dame study (Nun Study) focused on a specific religious group [59].
Representative U.S. studies that use a subset of the CERAD-NAB
Longitudinal studies using a subset of measures include the REasons for Geographic And Racial Differences in Stroke (REGARDS) [120]; Jackson Heart study [141]; and the ARIC Neurocognitive Study (ARIC NCS) component of the Atherosclerosis Risk in Communities (ARIC) Study [142]. These studies, together with two others that do not include CERAD measures, harmonize their cognitive assessments, share data, and have overlapping participants. ARIC data have been used to examine sex and race disparities in cognitive decline [143, 144]; REGARDS, with its national overview, identified geographic differences in cognitive status [41], and also developed electronic administration and scoring of CERAD measures [121]. Inclusion of the Verbal Fluency and Word List measures in the longitudinal National Health and Nutrition Examination Survey (NHANES, 2011–2014), has facilitated publications on the association between cognition and many common characteristics, and provided nationally representative norms for persons 60 years and over [145].
Use of identical procedures has permitted the creation of ROSMAP, a merger of the longitudinal Religious Orders Study (ROS) and the Rush Memory and Aging Project (MAP) [146–148]. Information on the Black participants has been enhanced with the addition of the Minority Aging Research Study and members of the Clinical Core of the Rush Alzheimer’s Disease Core Center [149].
The Verbal Fluency and Word List measures, in particular, have been included in brief standardized assessments designed for use in epidemiological surveys of dementia in low and middle income countries: the Identification and Intervention for Dementia in Elderly Africans (IDEA) study [150–152] and the 10/66 studies [103].
Forthcoming studies using selected CERAD measures include the Caribbean American Dementia and Aging Study (CADAS) (https://urapprojects.berkeley.edu/projects/detail.php?id_list=Pub1026; accessed August 23, 2021), designed to obtain nationally representative baseline data (including Verbal Fluency) on individuals age 65 years and over living in Puerto Rico and the Dominican Republic (N =∼500), to ascertain the association of dementia with life course socioeconomic status, and the societal costs of dementia. The Determinants of Incident Stroke Cognitive Outcomes and Vascular Effects on RecoverY study (DISCOVERY; 2019–2025; PIs Drs. Natalia Rost, Steven Greenberg; Massachusetts General Hospital) plans to use the Word List measures, among others, to study determinants of post-stroke dementia, in particular among minority patients (https://clinicaltrials.gov/ct2/show/NCT04916210; https://www.resilientbrain.org/discovery.html). The investigators also plan to evaluate telephone administration of the Word List measure.
Countries outside the U.S. with single epidemiological studies
The earliest epidemiological survey of dementia to use the CERAD-NAB was the PreMAP study in south-eastern France, carried out in 1991 [102], which showed that the rate of dementia in this representative sample of community and institutional residents 70 years of age and older was comparable to that elsewhere in Europe. A survey in southern Taiwan on the incidence and prevalence of dementia, focused on the impact of sociodemographic characteristics and urbanization [153, 154]. Luxembourg, a multilingual country, used CERAD Plus and included biological measures [155]. Participants had to speak one of Luxembourgish, French, German, English, Portuguese, or Italian (CERAD is available in all except Luxembourgish). On average participants spoke nearly three languages.
Countries outside the U.S. with multiple epidemiological and health condition studies
Korea
The rate of aging in Korea occasioned concern regarding the prevalence and incidence of MCI and dementia, and possible change in these over time. To clarify this issue, the CERAD clinical and neuropsychology batteries (CERAD-K), and the CERAD Behavior Rating Scale for Dementia, were translated into Korean, and validity and reliability determined [46]. Surveys, initially small and localized, followed by nationally representative and longitudinal, were run. They include the Seoul study [108], the Korean Longitudinal Study on Health and Aging (KLoSHA) [156, 157], the Nationwide Surveys on Dementia Epidemiology in Korea (NaSDEK 2008, NaSDEK 2012, NaSDEK 2018) [158–160], and the Korean Longitudinal Study on Cognitive Aging and Dementia (KLOSCAD), a nationwide, population-based, prospective, longitudinal, cohort study, with follow-up every two years after baseline in 2010–2012 [161]. KBASE, a new longitudinal study of 20–90 year-old residents that includes multiple biological measures, is in progress [162]. Among multiple interests, findings have been reported on change in prevalence rates over time [160], and the rate of dementia mortality based on incident dementia [163]. Particular attention has been paid to developing norms [15, 40]; identifying dementia, in particular AD, its risk factors and its biological characteristics [161]; electronic administration and scoring; and showing the effects on prevalence rates of different diagnostic criteria for MCI and dementia [159].
CERAD-K has been involved in multiple additional studies, including examination of neurotrophic factors, impact of pharmacologic treatment (donepezil), homocysteine and folate levels, effect of ginseng, and in particular, examination of changes in the brain in AD. Of unique interest, and relevance for public health planning, is a finding that, over a period of six months, the cognitive status of AD patients improved when given care by healthy elderly who receive government support [164]. This finding suggests that support of caregivers can reduce the public cost of dementia.
Finland
CERAD-NAB was translated into Finnish in 1999 [165], then into Estonian, and, uniquely, into Finnish sign language. Norms and cut-off values indicative of cognitively normal status, very mild and mild AD; discriminability of MCI, AD, and normal cognition; and the impact of age and education have been determined [32, 166–170]. For norms, see Supplementary Material B.
In Finland, studies using CERAD-NAB have typically focused on cognitive status in specific health conditions, and determinants of and interventions to improve cognitive status. Both the longitudinal CAIDE85+ (Cardiovascular Risk Factors, Aging and Dementia study, age 85+) [171], and the 5-year Kuopio ALSOVA study examined the association of cognitive and ADL functioning. The latter study provided norms for very mild/mild community-resident AD patients for each succeeding year for all CERAD neuropsychology measures [44]. The Drugs and Evidence-Based Treatment in the Elderly (DEBATE) study of cognitively intact 75–90-year-old community residents with stable atherosclerotic disease found that use of anticholinergic drugs was associated with poorer verbal fluency and confrontational naming, but had little effect on memory tests that are more sensitive to identifying early AD [172]. Cognitive decline in Parkinson’s disease was not marked by forgetting; similarly, patients with frontotemporal lobe degeneration also exhibited moderately well-preserved delayed recall and memory but had impaired verbal fluency [70]; the cognitive profile in seasonal affective disorder differed from that in other depression-related disorders [173]. An ancillary project of the Finnish Diabetes Prevention Study found that over two years, cognitive performance was stable among patients with short duration of diabetes (<7 years), but declined with diabetes of longer duration [174, 175]. Additional studies included, among others, examining the impact on cognitive status of cortical acetylcholinesterase activity in multiple sclerosis patients [176]; inflammatory disorders [177]; and late pre-term birth [178].
The longitudinal DR’s EXTRA (Dose Response to Exercise Training) study showed the importance of a multicomponent approach: alone, interventions of improved diet, resistance exercise, or aerobic exercise had no impact on cognitive function, but after four years there was a trend towards improvement in cognition in the diet+aerobic exercise group [43, 179].
The two-year FINGER (Finnish Geriatrics Intervention Study to Prevent Cognitive Impairment and Disability) study found that those who participated in the multicomponent intervention (“nutritional guidance; exercise; cognitive training and social activity; and management of metabolic and vascular risk factors”) improved in cognition as assessed by CERAD and additional measures “regardless of sociodemographic and socioeconomic factors or other baseline characteristics”; the risk of chronic disease was also reduced [180–183]. This positive outcome culminated in World Wide FINGERS (WW-FINGERS), which since 2017 has enrolled studies in over 25 countries that will take a culturally appropriate, harmonized, multicomponent approach, to determine whether cognitive outcome can be improved in persons at risk for dementia. Even absent an impact on cognitive status, health in general may be improved.
German-speaking countries
The CERAD battery has been included in several major longitudinal studies in German-speaking countries (see Supplementary Material C for details). They include Ageing, Cognition, and Dementia (AgeCoDe), a large, randomized sample of older cognitively healthy, primary care patients [184, 185], and the LIFE-Adult Study of 10,000 residents of Leipzig, age 40–79 (with emphasis on age ≥60 years), which assesses the impact on cognition of an extensive array of social, medical, and biomarker information [90, 187]. The longitudinal Berlin Aging Study II (BASE-II) enrolled 1,600 volunteer Berlin residents age 60–80 years, and 600 volunteers age 20–35 years, with re-evaluation planned in seven years. Two days of evaluation included 6 hours of cognitive assessment, and gathering a substantial range of biological and social information [79, 188]. In 2018–2020, 1,100 older BASE-II participants were incorporated into the GendAge study, which has a particular focus on sex and gender differences in dementia [189]. The SALIA study examines the impact of air pollution in the Ruhr on cognitive function [190–193]. Finally, AgeWell.de, modelled on the FINGER study [182], examines the impact of a multicomponent lifestyle intervention on cognitive impairment assessed by a basic set of CERAD measures [194].
PRODEM-Austria is a longitudinal, multi-center study of over 500 patients with early dementia. Assessments include neuroimaging, biomarkers, functional status, cognition (CERAD Plus), and caregiver burden [195, 196].
Italy
The Monzino 80-plus study is an on-going in-home, longitudinal survey of a near total population age 80 years and over, in the province of Varese, Italy, with currently up to nine waves of data. This study provides information on the prevalence and incidence of dementia, as well as outcomes of and risk factors for dementia in older age [197, 198]. In addition, the Word List measures were used in a study examining the impact of anemia on well-being in the oldest-old [199].
Thailand
Translation into Thai has been recent. Advanced statistical approaches, including machine learning, identified those cognitive measures that discriminated MCI from controls, and found that the same tests were not predictive of MCI [63]. Within schizophrenia, similar statistical approaches identified groupings of adverse characteristics, the biological measures to which they may be related, and identified alternative diagnostic categories of schizophrenia [76, 200].
Evaluation of basic biological (including genetic) characteristics in the expression and development of MCI, AD (and schizophrenia), agreed that APOE4 was a risk factor for AD (APOE3 was protective), but not for MCI. APOE4 carriers not only had poorer cognitive status, but also poorer ADL functioning and social skills [201]. In the same sample (and as reported earlier under Genetics), the combination of APOE4 and selected peripheral blood markers predicted greater semantic and episodic memory impairments, suggesting that such a combination might help explain the transition from amnestic MCI to AD [92].
Using SPECT to examine perfusion of a dose of donepezil four hours after administration distinguished two groups: the group with broader perfusion showed a cognitive response after 6 months, the other group did not [202].
Representative multinational studies that use CERAD-NAB (also see Supplementary Material C)
Because of multiple translations, CERAD-NAB lends itself to multinational studies. The longitudinal, multinational AddNeuroMed study, designed to be comparable to the U.S. Alzheimer’s Disease Neuroimaging Initiative (ADNI 1) study, is a public-private collaboration initiated in 2005, with sites in Finland, France, Greece, Italy, Poland, and the United Kingdom. The main focus is identification of AD biomarkers from multiple sources: “We wanted to focus initially on proteomics but have also candidate protein studies, genomics, lipidomics, imaging, and more importantly have sought approaches that combine these various technologies.” [9]. Sample sizes vary by country and by specific issue of concern. The original data have since been modified to improve use and accessibility, and combined with two other major datasets with overlapping interest, with consequent overall increase in value [203].
AddNeuroMed developed several summary scores: Total Score (TS1) distinguished MCI from AD, and stable from progressive MCI [204], while compound scores (combinations of selected measures), identified prodromal AD [18]. Just as REGARDS found that cognition differs within a country [41], AddNeuroMed found country differences in scores that distinguished between control and MCI groups [204].
NU-AGE (France, Italy, Netherlands, Poland, United Kingdom), examined the impact of a culturally adapted diet [205]. Both control and intervention groups improved cognitively, but the differences between them were non-significant. Higher adherence to the NU-AGE diet, however, was associated with statistically significant improvements in global cognition and episodic memory. (NU-AGE, also referred to as NU-AGE diet, and NU-AGE whole diet approach, is called “European Project on Nutrition in Elderly People NU-AGE” on clinicaltrials.gov.)
A study of the cognitive impact of Covid-19 is in process in four countries (COVID-19 international studies “Genetics, Immunological and Neurological Long-term Consequences in Prospective COVID-19 Cohort in Thailand, Japan, Philippines and USA”). Each site plans to recruit 30 severe COVID-19 patients who have recovered, and to re-assess them at 3 weeks, and 3, 6, 12, and 24 months following discharge, with the entire CERAD-NAB administered on the first and last occasions.
Some multinational studies use individual CERAD-NAB measures. The Verbal Fluency and Word List measures were used in the LipiDiDiet trial (Finland, Germany, Netherlands, Sweden), a double-blind, placebo-controlled trial of patients with MCI or mild AD, with random assignment to a nutritional intervention. After 24 months a non-significant clinical improvement was noted in the intervention group. Evaluation at 36 months, however, found that both clinical and cognitive decline slowed in the intervention group vis-à-vis the placebo group. The authors reason that “benefits increased with long-term use”, and that this intervention has “the potential to alter disease trajectories” [206]. LipiDiDiet may continue for 72 months.
The same cognitive measures were included among others in EPIDEMICA, which assessed the prevalence of MCI and dementia in the Central African Republic and the Republic of the Congo [207], and in a representative random sample of community residents age 65 and over in Pune and Kirkee cantonments in India [208].
The most extensive set of studies using only part of the neuropsychology battery are those participating in HCAP [10]. HCAP capitalizes on the major HRS international studies on aging “to facilitate cross-national comparisons of the prevalence and trends of dementia in aging populations around the world”, permitted by adding a uniform neuropsychological assessment that includes up to three CERAD measures (Verbal Fluency, Word List tasks, Constructional Praxis). Study sites are in Latin America, Asia, Africa, and Europe; they include the BASIC-Cognitive study, and studies in the Survey of Health and Retirement in Europe (SHARE-ERIC): France, Italy, Germany, Denmark, and Poland.
LIMITATIONS
Although the CERAD database of up to eight years of data from Black and White AD patients and control subjects seen at the 24 original ADRCs remains available, the depository ceased with cessation of funding. Consequently, with the exception of data gathered by German clinicians (https://www.memoryclinic.ch/de/main-navigation/neuropsychologen/), a CERAD depository for independent clinicians is unavailable. The original data, obtained before useful antidementia medications were developed, provides valuable information on the natural history of AD.
Use of CERAD-NAB has expanded beyond the original mandate. Absent a library search, there is presently no way for interested users to know what others are doing. Access to a depository could provide such information, increase the power of many studies, facilitate examination of otherwise rare events, and provide comparison data. In theory, CERAD-based information could be added to electronic health records (EHRs). Problems related to EHR cost and interoperability, and the absence of EHRs in smaller practices, mean that this is not currently a feasible solution.
While moderated in the summary Total Score [14], ceiling effects on some CERAD-NAB measures restrict evaluation of high performers, floor effects limit evaluation beyond moderate AD (CDR 2), and, as is common, most measures are affected by level of education and age, and to a lesser extent by sex. Over time, the educational level of the older population has increased, suggesting that norms may need to be re-evaluated at intervals of 5 or 10 years.
The original CERAD battery assesses executive function indirectly [209], and verbal functioning partially. CERAD Plus addressed these inadequacies and maintains brevity while standardizing use of measures that are otherwise applied in an ad hoc manner [6]. These additional measures, as well as a test for social cognition recommended by DSM-5, remain to be included in CERAD-NAB. Alternative equivalent Word Lists have been requested but are currently unavailable. This may be a minor limitation for AD patients, given that there appears to be little learning after repeated administration [52].
The guidelines provided for administration and scoring of all measures do not cover all situations (e.g., telephone administration, and difficulties attributable to illiteracy or physical issues). Users have indicated how they have handled such problems, but solutions remain to be formally integrated into CERAD-NAB and disseminated. Cut-points that identify different levels of adequacy of performance for each measure, and Total Scores, should be readily accessible; cognitive profile sheets are also desirable.
Although face-to-face administration remains the norm, electronic administration and scoring have been developed for some CERAD measures. Where appropriate, electronic administration should be encouraged to increase uniformity of administration and accuracy of scoring, possibly reduce nonparticipation, dropout, and cost, and allow evaluation when face-to-face administration is not feasible. For some measures, electronic scoring is somewhat problematic [123], but electronic advances suggest that such procedures will become easier. Which measures can be handled in this manner, by whom and when, the resources needed, and the cost of development and use remain to be determined.
Some limitations reflect general cultural issues. Considerable care is needed when translating the items of the Word List to ensure maintenance of concept. Although the definition of AD includes impaired ADL performance, there is a paucity of studies linking CERAD-based cognitive status to ADL performance, with some exceptions [44, 210].
DISCUSSION
The CERAD program had a mandate to develop brief assessments of AD that would encourage uniformity of data gathering, and aggregation of information. The resulting database of up to eight years of information on over 1,000 carefully evaluated Black and White patients with AD, and nearly 500 cognitively unimpaired persons remains of particular value, since it reflects the natural history of AD before the advent of truly effective interventions.
The CERAD-NAB continues to be in demand, and its use has expanded. Originally designed specifically for identification of AD, further study found that it can also be used to identify MCI, as well as frontotemporal lobe dementia, and Lewy body disease. Additional ways of scoring have been developed, in particular a rapidly determined Total Score [14], which can distinguish among normal cognitive function, MCI, and AD; and differentiate stable MCI from progressive MCI (see Supplementary Material A). CERAD-NAB has lent itself to telephone and computer-based administration and automated scoring that do not require face-to-face presence. Some studies suggest that the battery could be reduced, but there has been little further examination of this possibility.
The constituencies for which CERAD-NAB is relevant have expanded. Originally developed with the aid of the initial ADRCs, these are now served by the National Alzheimer Coordinating Center (NACC), and by NACC’s standardized evaluations [211]. Clinicians and clinical sites that are not affiliated with tertiary care medical centers are now the main users of CERAD-NAB. In addition, use has expanded to include epidemiological surveys of the prevalence and incidence of dementia.
While somewhat longer than commonly used screens of dementia [12, 212], CERAD-NAB provides deeper information on areas of cognitive function, facilitating diagnostic differentiation of dementing disorders. Translation for international surveys has resulted in additional published guidance on maintaining the cognitive concepts being tested while remaining sensitive to cultural issues [111, 112].
Epidemiological studies using CERAD-NAB or CERAD Plus have been and continue to be carried out in multiple regions (U.S., Latin America, Europe, Africa, Asia), enrolling groups of people differing in demographic characteristics (see Supplementary Material C). Within the U.S., different populations can be compared (Black, White, Japanese, Catholic sisters of a particular denomination), as can different geographical settings (urban, rural; northeast, southeast, central, west coast). International uniformity of information allows direct comparison of prevalence and incidence across countries. At both local and national levels, findings can, and in some countries (e.g., Korea) do, provide information on public health issues. It is unclear, however, to what extent the data gathered are accessible.
Selected CERAD measures with established norms (generally any one or all of Verbal Fluency; Word List measures, often with Word List Recognition omitted; Constructional Praxis) are often chosen when cognition is of interest but is not the focus of the study. Such use extends the scope of an investigation, or, as in the case of HRS HCAP and similar studies, provides an area of harmonization permitting study of issues benefitting from a larger database.
It has been estimated that up to 40% of the identified risk factors for AD are modifiable [213]. Through its use in many diverse studies, as well as in multicomponent modifiable risk factor studies such as FINGERS [182] and AgeWell.de [192], CERAD-NAB has shown that it is able to contribute appropriately to this important task. One or more constituent measures have been used in studies examining all the currently identified risk factors, with the exception of traumatic brain injury (Supplementary Material D provides a summary of relevant studies).
The question arises as to whether CERAD-NAB will continue to be relevant, given that AD, even in its early stages, may be identifiable through blood-based biomarkers [8], although problems have been expressed with this approach (see Journal of Prevention of Alzheimer’s Disease, 2021;4(8)). There is little information, however, on whether these biomarkers are also indicative of the behavioral manifestations of AD, namely the cognitive loss and loss of independence. Additionally, while uncommon, biological characteristics can be present without any obvious cognitive disturbance, and vice versa [214]. Similarly, treatments for AD may be arriving on the market, currently aducanumab, donanemab, lecanemab (ClinicalTrials.gov ID: NCT05108922, NCT04437511, NCT03887455), and others are in the pipeline. There is concern regarding their validity, appropriate use, extent to which they are applicable, and their total cost. In addition, it may not be sufficient to determine that plaques and tangles are being reduced or prevented. The cognitive, behavioral, and functional manifestations of AD will still need to be measured to ascertain whether these have been modified.
We anticipate that, being more cost-effective, assessments such as CERAD-NAB will remain useful in the clinic, and in major epidemiological studies where blood biomarkers may be more difficult to handle, and differential diagnosis is a concern. CERAD-NAB can be administered by paraprofessionals after little training and could be programmed to provide data summaries and profiles of performance across cognitive areas. In particular, CERAD-NAB provides a standard measure for assessing the effect of an intervention on cognitive functioning. That an intervention should have an anticipated biological effect does not mean that it does, or that other diagnostic characteristics will change in the desired manner. The final determination of the value of an intervention depends on the extent to which impaired cognition can be delayed safely, or reversed, and independence facilitated. CERAD-NAB and CERAD Plus can remain important in that determination.
Implications for further use
As indicated above, we suggest that neuropsychological assessments will remain useful to identify dementia both at an individual clinical level and for epidemiological surveys. In particular, ongoing surveys of aging in different countries [10, 103] may have a unique relevance to the U.S., which is becoming a majority-minority country, or to any country with an increasing number of older minority residents, since it will be possible to compare “minority” performance in the country of origin with that of the same minority in the country of immigration, with the aim of avoiding diagnostic misclassification. Still needed, however, are norms for minority populations in the U.S., including for persons who ostensibly speak the same language, but come from different countries.
Current interest in dementia research is focused on identifying persons at high risk before manifestations of dementia become obvious. Inclusion of a mini-CERAD-NAB, such as that used by the Health and Retirement Study Harmonized Cognitive Assessment [10], in the complimentary Medicare Annual Wellness visit (for which 96% of the population age 65 years and over is eligible), would provide a baseline against which to assess cognitive change, and facilitate dementia diagnosis since additional relevant concerns, e.g., activities of daily living, depression, are also assessed.
As shown by CERAD Plus, with its improved diagnostic capabilities, measures should not remain static, but should be consistently re-evaluated for continued relevance, and modified to reflect changing diagnostic criteria and new findings. Greater attention should be paid to increasing the efficiency of the battery. Computerization should be encouraged (including for verbal responses which are currently difficult to score electronically). Ideally, scoring would become more reliable, information would be more quickly accessible, and developing databases would be simplified.
CERAD-NAB and CERAD Plus are used for a very broad variety of purposes. Use appears to be limited only by ingenuity, and by the willingness of investigators to share their experience (as when they translate measures) and their findings, when these are published.
ACKNOWLEDGMENTS
The extensive list acknowledging the sites and individuals involved in facilitating and providing critical data basic to developing the entire CERAD program, including CERAD-NAB, was given in Fillenbaum GG, van Belle G, Morris JC, Mohs RC, Mirra SS, Davis PC, Tariot PN, Silverman JM, Clark CM, Welsh-Bohmer KA, Heyman A. CERAD (Consortium to Establish a Registry for Alzheimer’s Disease): the first 20 years. Alzheimers Dement. 2008;4:96-109. We re-acknowledge that group and wish also to acknowledge and thank the many people who used, translated, published, and provided information relevant for the current paper.
CERAD owes a debt to the foresight of the original project officers, Zaven Khachaturian and Teresa Radebaugh; to Neil Buckholtz, who followed them; and especially to the PI: Albert Heyman, MD.
FUNDING
This work was initially supported by NIA grant AG06790 under which the Consortium to Establish a Registry for Alzheimer’s Disease was developed. The present paper was supported by NIA grant #1P30 AG028716 Claude D. Pepper OAIC (Duke University) (Fillenbaum).
CONFLICT OF INTEREST
The authors have no conflict of interest to report.
DATA AVAILABILITY
Data sharing is not applicable to this article as no datasets were generated or analyzed during this study.
