Abstract
Keywords
INTRODUCTION
A timely and accurate diagnosis in people with cognitive complaints can answer concerns about symptoms [1], facilitate adjustment [2], and is essential to initiate appropriate care and treatment [3]. However, differentiating between normal aging, mild cognitive impairment (MCI), and dementia, determining the underlying cause for cognitive complaints, and making a reliable prognosis for a patient is often challenging in clinical practice [4, 5].
The diagnostic assessment of patients presenting at a memory clinic with a cognitive complaint starts with clinical evaluation by a medical doctor and may be followed by neuropsychological assessment (NPA), depending on the setting. Many studies have shown that several single neuropsychological tests, isolated or combined, can predict future dementia [6] and differentiate between Alzheimer’s disease (AD) and other common causes of cognitive impairment [7–10]. Neuropsychological testing has long been embedded for both diagnostic and treatment purposes in daily clinical practice at memory clinics and is a useful additional test according to clinical criteria for AD and neurocognitive disorders [11]. However, the added diagnostic and prognostic value of a complete NPA to the clinical evaluation has not been systematically assessed. Identification of the optimal diagnostic strategy in patients with cognitive complaints is important to ensure an accurate diagnosis and prognosis, without withholding valuable tests or exposing patients to unnecessary ones.
We therefore aimed to evaluate the added diagnostic and prognostic value of NPA as administered after the clinical evaluation in tertiary memory clinic patients and to derive strategies for optimal use of NPA as a diagnostic instrument. A staged clinical decision-making process was used to compare the usefulness of the standard clinical evaluation without and with NPA to diagnose cognitive syndrome, differentiate between AD and other etiologies, and to make a disease course prognosis. We examined change in diagnostic accuracy, correct reclassification, net reclassification improvement and change in diagnostic confidence related to NPA when added to the standard clinical evaluation.
MATERIALS AND METHODS
Participants
We selected participants from the prospective Leiden-Alzheimer Research Netherlands (LeARN) study [12]. In this study, consecutive patients from four Dutch academic memory clinics (Maastricht University Medical Center, VU Medical Center, Radboud University Medical Center, Leiden University Medical Center) were included between 2009 and 2011 and annually followed for two years. Inclusion criteria were referral for a cognitive complaint by a physician or suspicion of a cognitive disorder, a Mini-Mental State Examination (MMSE [13]) score ≥ 20 [14], and a Clinical Dementia Rating (CDR [15]) score ≤ 1. Patients were excluded if the cognitive problems were due to stroke, neurological disorder, major psychiatric disorder, or substance abuse, or when a reliable informant was absent. The Medical Ethics Committee of each participating center approved this study and all patients initially included and their informal caregivers gave written informed consent.
Of the 304 participants who met inclusion criteria for the LeARN study, 221 patients were included in the current analyses. Eligibility for these analyses required availability of at least one follow-up assessment at one or two years after the baseline assessment. Patients who died (n = 3), were institutionalized (n = 4), were unwilling to participate further (n = 62), or could not be reached (n = 14) were therefore excluded. Of the 221 included patients, 163 (74%) had both a 1-year and a 2-year follow-up assessment, 39 (18%) had only a 1-year, and 19 (9%) had only a 2-year follow-up assessment. Eligible patients and excluded patients did not differ in age at baseline, years of education, MMSE score, CDR score, or clinical diagnosis at baseline.
Clinical evaluation
At baseline and follow-up, participants underwent a standardized clinical evaluation consisting of a detailed history provided by the patient and informant, a psychiatric, neurological, and physical examination, and assessment using clinical rating scales. Clinical rating scales included the MMSE [13], the CDR [15], the Geriatric Depression Scale-15 (GDS-15) [16], the Neuropsychiatric Inventory (NPI) [17], and the Disability Assessment for Dementia (DAD) [18].
Neuropsychological assessment
Neuropsychological assessment was administered at baseline and repeated annually, and consisted of a standardized battery of Dutch versions of cognitive tests [19]. Tests included the Auditory Verbal Learning Test (AVLT) [20] comprising an immediate recall of 5 trials of 15 semantically unrelated words and a delayed recall and recognition after 20 minutes to assess verbal memory, the Visual Association Test (VAT) [21] to assess visual memory, the Digit Span task of the WAIS III (forward and backward) [22] to evaluate working memory, a one-minute verbal fluency test using a semantic category cue (animals) [23] to assess language function, the Letter Digit Substitution Test (LDST) [24] to assess information processing speed, the Stroop Color-Word Test (SCWT) [25] and the Trail Making Test (TMT) (Part A ‘letters’ and Part B ‘concept shifting’) [26] to assess attention and executive functioning, and subtests of the Visual and Object Spatial Perception (VOSP) test [27] to assess visual perception. Raw test scores were converted to z-scores using Dutch norms adjusting for age, gender, and education.
Neuroimaging
Neuroimaging by magnetic resonance imaging (MRI) took place at baseline only. Measures of medial temporal lobe atrophy (MTA), scored with the MTA scale [28], white matter lesions scored with the Fazekas scale [29], and the number of infarcts, lacunes and microbleeds were derived from the 3T MRI images, which were acquired following the standardized Parelsnoer protocol [19].
Diagnostic procedure
All patient information was summarized in digital case descriptions and presented to individual members of an expert panel in a stepwise fashion to simulate clinical practice (Fig. 1), as previously described [30]. Each expert panel was composed of clinicians from three different disciplines (geriatrics, geriatric psychiatry, and neurology) with at least five years of clinical experience in a memory clinic. This included experience with integrating neuropsychological test information in their diagnostic reasoning. At each stage, experts were asked to individually diagnose the patients’ syndrome and etiology underlying the cognitive complaints, and to predict the course of cognitive symptoms and daily functioning in the next two years. The syndrome could be classified as: subjective cognitive impairment (SCI), MCI, or dementia; the etiology as: AD, vascular, frontotemporal, Lewy Body or Parkinson’s, other, or no neurodegenerative disease; and the disease course prognosis as: improvement, stable, or decline. Experts also indicated confidence in their diagnosis on a scale from 0% (very uncertain) to 100% (fully convinced) at each stage. Firstly, the information from the clinical evaluation as described above was presented to the individual experts, resulting in the initial clinic only diagnosis of syndrome, etiology and predicted disease course. Secondly, neuropsychological test results (z-scores) and a short summary by the neuropsychologist were presented as an add-on to the clinical information and interpreted by the experts to make a subsequent clinic + NPA diagnosis. The short summary provided by the neuropsychologist consisted of 2-3 sentences in which performance on all individual cognitive domains tested was objectively described as “(very) high”, “above average”, “average”, “below average”, or “(very) low” without including an opinion or diagnosis. Characteristic observations made during testing were also included. Thirdly, the baseline information on the clinical evaluation and NPA results combined with measures from the MRI scan was presented and integrated by the experts in the baseline diagnosis. Lastly, all information from baseline and follow-up was presented to determine the follow-up diagnosis.
The same trio of experts completed all steps for each case. Experts did not take notes during any of the assessments. Experts were not obliged to apply strict decision rules but were recommended to use diagnostic guidelines as they would do in clinical practice. Cases were de-identified and the number of cases in which the expert and the participant originated from the same clinical center was minimized to prevent experts from recognizing a case. However, this could not be fully prevented, which resulted in the recognition of one case by one expert.
Consensus procedure
For the clinic only and clinic + NPA diagnoses for each patient, we adopted the syndrome (SCI, MCI, or dementia), etiology (dichotomized into AD or no AD) and prognosis (dichotomized into decline or no decline) as designated by the majority of experts (at least 2 out of 3).
For the baseline diagnoses, consensus among members of an expert panel on the syndrome was determined and for the follow-up diagnoses, consensus on the etiology and predicted disease course was determined. In case of discrepancy, a meeting was organized approximately 2 weeks after the initial rating in which the expert trios were invited to express their arguments and reach consensus. This procedure resulted in the consensus syndrome, consensus etiology (dichotomized), and consensus disease course (dichotomized) that were used as reference standards for syndrome, etiology and disease course prognosis.
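The majority rule used for the clinic only and clinic + NPA diagnoses can be sketched in a few lines of Python. This is a minimal illustration rather than the study's actual procedure: the function names are hypothetical, the example ratings are invented, and dichotomizing etiology before counting votes is an assumption, since the text does not state the order of the two steps.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_label(ratings: Sequence[str], min_votes: int = 2) -> Optional[str]:
    """Return the label chosen by at least `min_votes` raters, else None."""
    if not ratings:
        return None
    label, votes = Counter(ratings).most_common(1)[0]
    return label if votes >= min_votes else None


def dichotomize_etiology(etiology: str) -> str:
    """Collapse the etiology categories into 'AD' versus 'no AD'."""
    return "AD" if etiology == "AD" else "no AD"


# Hypothetical ratings from one expert trio for a single case.
syndrome = majority_label(["MCI", "MCI", "dementia"])
etiology = majority_label(
    [dichotomize_etiology(e) for e in ["AD", "vascular", "AD"]]
)
# Three different labels: no category reaches two votes.
no_majority = majority_label(["SCI", "MCI", "dementia"])
```

Returning `None` when no label reaches two votes is one possible way to represent a case without a majority; how such cases were handled in the study is not specified in this section.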
Data analysis
Syndrome, etiology and prognosis for the clinic only and clinic + NPA diagnoses were compared against their reference standards. Percentages correctly classified in the clinic only and clinic + NPA diagnoses were calculated for syndrome, etiology and predicted disease course and compared with the McNemar test statistic for paired proportions. We calculated sensitivity and specificity, positive predictive value and negative predictive value for the etiology and disease course classifications of the clinic only and clinic + NPA diagnoses. Mean certainty of the clinic only and clinic + NPA diagnoses and prognoses was compared with paired-sample t-tests. The added value of NPA was expressed in percentages of (correct) reclassifications in syndrome, etiology and predicted disease course from the clinic only to the clinic + NPA diagnosis, determined using reclassification tables. Category-based Net Reclassification Indices (NRI) for events (AD or decline) and non-events (no AD or no decline) were calculated for the etiology and predicted disease course classifications. The NRI is the difference in proportions reclassified upwards and downwards among events versus non-events, where ‘up’ denotes reclassification to the event category and ‘down’ reclassification to the non-event category, or NRI = [Pr(up | events) – Pr(down | events)] + [Pr(down | non-events) – Pr(up | non-events)] = NRI events + NRI non-events [31]. NRI events and NRI non-events represent the net percentage of patients with or without events correctly assigned after receiving the NPA results. Z-scores for the NRIs were calculated according to the method of Pencina et al. [31] and compared with the standard normal distribution. To identify in which cases the NPA had added value, clinical characteristics and NPA results of patients in whom syndrome, etiology, or course was correctly reclassified from clinic only to clinic + NPA diagnosis were compared with those of patients in whom there was no or incorrect reclassification using independent t-tests and chi-square tests. Percentages of correct and incorrect reclassification were calculated for each value of the predictor. We also characterized patients in whom etiology or course was correctly not reclassified from clinic only to clinic + NPA diagnosis to identify in which cases the NPA had no added value. Analyses were conducted with SPSS version 20.0 (Chicago, IL, USA) and significance level was set at p < 0.05.
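The category-based NRI and its z-score can be illustrated with a short Python sketch. It follows the formula given above and the simple asymptotic test commonly attributed to Pencina et al. [31]; it is not the code used in the study, which was analyzed in SPSS. The function and field names are hypothetical, the one-sided p-value is an assumption about how the z-score is compared with the standard normal distribution, and the reclassification counts in the example are assumed for illustration and are not taken from the reclassification tables.

```python
import math
from typing import NamedTuple


class NRIResult(NamedTuple):
    nri_events: float
    nri_nonevents: float
    nri: float
    z: float
    p_one_sided: float


def category_nri(up_events: int, down_events: int, n_events: int,
                 up_nonevents: int, down_nonevents: int,
                 n_nonevents: int) -> NRIResult:
    """Category-based NRI with an asymptotic z-test.

    'up' means reclassified to the event category (e.g. 'AD'),
    'down' means reclassified to the non-event category (e.g. 'no AD').
    """
    nri_events = (up_events - down_events) / n_events
    nri_nonevents = (down_nonevents - up_nonevents) / n_nonevents
    nri = nri_events + nri_nonevents
    # Variance under the null hypothesis of no net reclassification.
    variance = ((up_events + down_events) / n_events ** 2
                + (up_nonevents + down_nonevents) / n_nonevents ** 2)
    if variance == 0:
        raise ValueError("no reclassifications: z-score is undefined")
    z = nri / math.sqrt(variance)
    # Upper-tail probability of the standard normal (one-sided; assumption).
    p_one_sided = 0.5 * math.erfc(z / math.sqrt(2))
    return NRIResult(nri_events, nri_nonevents, nri, z, p_one_sided)


# Illustrative counts only; not taken from the study's tables.
result = category_nri(up_events=14, down_events=5, n_events=102,
                      up_nonevents=6, down_nonevents=7, n_nonevents=119)
print(f"NRI = {result.nri:.3f}, z = {result.z:.2f}")
```

Keeping the events and non-events components separate in the result makes it easy to see which group drives the overall NRI.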
RESULTS
Baseline characteristics of the sample are displayed in Table 1. Of the 221 patients included in the analyses, 52 were diagnosed with SCI, 90 with MCI, and 79 with dementia at consensus syndrome. Based on the consensus etiology, cognitive problems were caused by AD in 102 (46%) patients (2 SCI, 43 MCI, 57 dementia). Clinically relevant cognitive or functional decline over the follow-up period was observed in 128 (58%) participants (5 SCI, 53 MCI, 70 dementia) based on the consensus disease course. The experts did not reach consensus on the reference etiology in 1 case, and on the reference disease course in 3 cases. In these participants a majority decision was adopted.
Comparison of clinic only and clinic + NPA diagnoses
In these analyses, we directly compared the syndrome, etiology, and predicted disease course assigned to patients in the clinic only and in the clinic + NPA diagnoses with the reference standards to examine correct classification. The syndrome was initially correctly classified in 70.7% (n = 152, indeterminate diagnosis in 6 cases) of patients, which increased to 88.5% (n = 192, indeterminate diagnosis in 4 cases) after NPA results were disclosed (χ2 = 15.2, p < 0.001). The mean certainty of the experts in their syndrome classification increased from 68.0% to 74.3% (t = –10.9, p < 0.001) after NPA data were made available. Diagnostic accuracy measures of the etiology and predicted disease course classifications are displayed in Table 2. Etiology was correctly classified in 76.5% of patients by the clinic only diagnosis, which increased to 81.0% by the clinic + NPA diagnosis (χ2 = 10.0, p = 0.002). For the disease course prognosis, the percentage of correctly classified cases increased only slightly after presentation of NPA results, from 74.7% to 75.6% (χ2 = 0.5, p = 0.48). The experts were 6-7% more certain of both etiology (t = –12.1, p < 0.001) and predicted disease course (t = –10.1, p < 0.001) classifications after reviewing the NPA. Supplementary Table 2 presents measures of classification separately for patients with up to 2 years of follow-up and patients with only a 1-year follow-up assessment.
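As a rough illustration of the paired comparison of proportions, the sketch below computes a McNemar chi-square statistic from the two discordant counts and converts a chi-square value with one degree of freedom into a p-value. It is a minimal sketch, not the SPSS procedure used in the study: the function names are hypothetical, the discordant counts are invented, and whether a continuity correction was applied in the original analyses is not stated, so it is left as an optional flag.

```python
import math
from typing import Tuple


def chi2_1df_p_value(chi2: float) -> float:
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    # For 1 df, P(X > x) equals erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


def mcnemar(b: int, c: int, continuity: bool = False) -> Tuple[float, float]:
    """McNemar chi-square test from the two discordant pair counts.

    b: cases incorrect before NPA but correct after (hypothetical input)
    c: cases correct before NPA but incorrect after (hypothetical input)
    """
    if b + c == 0:
        raise ValueError("no discordant pairs: test is undefined")
    diff = abs(b - c)
    if continuity:
        diff = max(diff - 1, 0)
    chi2 = diff ** 2 / (b + c)
    return chi2, chi2_1df_p_value(chi2)


# Hypothetical discordant counts, not taken from the study.
chi2, p = mcnemar(b=15, c=5)
print(f"chi2 = {chi2:.1f}, p = {p:.3f}")
```

Under the one-degree-of-freedom assumption, `chi2_1df_p_value(10.0)` and `chi2_1df_p_value(0.5)` round to 0.002 and 0.48, in line with the p-values reported above for etiology and disease course.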
Reclassification of diagnoses
We also examined the reclassification of syndrome, etiology, and predicted disease course from clinic only diagnosis to clinic + NPA diagnosis and determined whether this reclassification was correct based on the reference standards. Table 3 displays the reclassification of syndromes assigned to patients at the clinic only and clinic + NPA stages compared to the consensus syndrome. Overall, the initial classification of syndrome changed in 20.7% (n = 44, indeterminate diagnosis in 9 cases) of patients after NPA results were reviewed, and these changes were correct when compared to the consensus syndrome in 90.9% (n = 40). Of the cases initially classified as SCI, 33.3% (n = 15) were reclassified to MCI, correctly so in 93.3% (n = 14). Of the patients initially diagnosed with MCI, 25.7% (n = 28) were reclassified to SCI or dementia, and this reclassification was correct in 89.3% (n = 25). Only 1 case with initial dementia was reclassified to MCI, which matched the consensus syndrome.
Etiology was reclassified by the clinical experts in the clinic + NPA diagnosis in 14.5% (n = 32) of patients, and these reclassifications were correct when compared to the consensus etiology in 65.6% (n = 21, Table 4). Overall, there was a net improvement in diagnosing etiology (NRI = 0.10, z = 1.85, p = 0.032) after NPA and this was mainly caused by an improvement of 8.8% in reclassifying the etiology of AD cases from ‘no AD’ to ‘AD’.
The disease course prognosis was reclassified in 15.4% (n = 34) of patients, correctly so according to the consensus prognosis in 52.9% (n = 18). The overall NRI for predicted disease course was 0.01 (z = 0.12, p > 0.05), indicating no significant improvement.
To evaluate the added value of NPA across different syndromes, we examined reclassification from clinic only to clinic + NPA diagnosis according to the clinic only diagnosis of syndrome. Etiology was more often correctly than incorrectly reclassified in patients with an initial SCI or MCI syndrome, but not in those with dementia. NRIs differed per diagnostic group and were significant for the etiology diagnosis in patients with initial SCI (NRI = 0.61, z = 1.81, p = 0.036) and MCI (NRI = 0.17, z = 2.50, p = 0.006), and for the disease course prognosis in patients with initial MCI (NRI = 0.14, z = 1.68, p = 0.047). These improvements were all driven by the NRI events, indicating that the NPA is beneficial for ruling in AD etiology. For the disease course prognosis, there were more correct than incorrect reclassifications in patients with initial MCI, but not in those with SCI or dementia. In patients with initial dementia, a net worsening of classification (NRI = –0.06, z = –1.73, p = 0.042) due to the NPA was found in making a disease course prognosis.
Indicators of correct reclassification
Characteristics of participants in whom syndrome, etiology, or predicted disease course were correctly reclassified or correctly not reclassified (i.e., unchanged correct classification) after NPA are displayed in Supplementary Table 1. Higher MMSE (27.5 versus 25.6, t = –3.5, p = 0.001) and lower CDR scores (0.5 versus 0.6, t = 3.4, p = 0.001) were found to be indicative of correct syndrome reclassification. Correct reclassification of etiology was characterized by worse AVLT delayed recall z-score (–2.2 versus –1.5, t = 2.1, p = 0.039) and better performance on digit span (15.7 versus 13.1, t = –2.5, p = 0.012). Further, lower CDR scores (0.4 versus 0.6, t = –2.6, p = 0.010) and higher AVLT immediate recall z-scores (–0.7 versus –1.5, t = –2.6, p = 0.011) were indicators of correct reclassification of predicted disease course. Figure 2 depicts the distribution of correct and incorrect reclassifications for values of the neuropsychological tests and scales that were related to correct reclassification and Supplementary Figure 1 further specifies the direction of these reclassifications. Patients in whom the NPA did not change an initially correct classification were in general characterized by worse performance on both clinical rating scales and neuropsychological tests. Patients’ age, years of education, score on a depression or disability questionnaire, and other neuropsychological test results were not indicative of correct reclassification or correct no reclassification.
DISCUSSION
In this study, we assessed the added value of NPA to standard clinical evaluation in the diagnostic process in memory clinic patients. NPA changed the diagnosis of the cognitive syndrome in 21% of patients and the underlying etiology as well as the disease course prognosis in 15%. Overall, this led to an increase in correctly classified cases that could be attributed to the NPA of about 18% for the cognitive syndrome, 5% for the underlying etiology, and 1% for the predicted disease course. Results demonstrated that the diagnostic and prognostic value of NPA when added to the clinical evaluation depended on which initial syndrome was diagnosed. The NPA showed added value for diagnosing syndrome and etiology in patients with initial SCI or MCI, and for predicting disease course in patients with an initial MCI syndrome. There was no added diagnostic or prognostic value of the NPA in patients suspected of dementia after the initial clinical evaluation. Further, inclusion of the NPA in the diagnostic process increased confidence in the diagnosis.
NPA had high added value in diagnosing the cognitive syndrome, which is in line with the general idea that neuropsychological tests are accurate in determining the severity of cognitive deficits [32]. The value of NPA in diagnosing syndrome did not depend on initial under- or overestimation of cognitive complaints. There was added value of NPA in diagnosing etiology in patients with an initial SCI or MCI syndrome, and for predicting disease course in patients with initial MCI. Our overall NRI was mainly driven by the events NRI in these patients, indicating that the NPA supported the detection of persons who had AD or declined at follow-up. In patients initially suspected of a dementia syndrome, the added value of the NPA was limited, and in some cases the NPA even misled the experts into believing patients would remain stable. Furthermore, incorrect reclassifications of predicted disease course often occurred in patients with SCI, in whom it is likely to take more than two years before a disease is clinically expressed [33]. When determining whether or not to conduct an NPA or to reclassify after the NPA has been conducted, the tests displayed in Fig. 2 and Supplementary Figure 1 can be used as a guideline in decision-making. Results also showed that age, education, and depression and disability scores were unrelated to the impact of reclassification after NPA on the diagnosis. This indicates that the added value of NPA in the diagnostic process of memory clinic patients is not dependent on these factors.
To our knowledge, no earlier study investigated the added value of a complete NPA to the standard clinical evaluation or examined neuropsychological test results in a memory clinic setting by verifying diagnoses through a reference diagnosis such that correct reclassifications could be determined. Other studies concerning dementia diagnostics found changes caused by the NPA of 11% for the etiology diagnosis [34], 25% for diagnosing a combination of syndrome and etiology [35], and 26% for diagnosing a combination of syndrome and etiology when both NPA and MRI were added to the standard clinical assessment [36]. However, divergent study populations and classification categories prevented reliable comparison with our study.
This study simulated clinical decision-making using comprehensive case descriptions and panel discussions. This combination of individual and plenary approaches has been found to produce very similar results to a full plenary approach as often adopted in clinical practice [37]. Additionally, assigning diagnoses after each step allowed us to compare classifications but also to investigate reclassifications, which are important to consider when the goal is to gain insight into the added value of a diagnostic instrument. The NRI measure is implicitly weighted by the event rate [38], which is useful when certain outcomes are not as prevalent as others, as is the case in this study. Another strength of this study is that diagnoses were compared against a reference standard to assess correctness. It should be noted that clinical applications of the NPA other than diagnostics, such as psycho-education, cognitive training and treatment, were not assessed here, although these are also important uses of NPA [39].
A limitation of the study design is that incorporation bias may have occurred since the index test results were also part of the reference standard, as is often inevitable in dementia diagnostic studies [40]. This may have led to an overestimation of the various measures of diagnostic accuracy. Also, there was a maximum follow-up of only two years. Expected decline may have been overestimated in patients with dementia, whereas two years may be too short for decline to become apparent in persons with SCI. Most patients were followed for two years but some patients only received a one-year follow-up assessment, which slightly affected the diagnostic accuracy measures. Further, although amyloid and tau biomarkers are increasingly used in clinical practice, we did not incorporate this information in the diagnostic process nor did we verify the etiology reference standard based on biomarker information. Moreover, the expert panels already diagnosed etiology and predicted disease course correctly in about 75% of cases based on the clinical evaluation only. This high rate is likely explained by the extensive information made available in the clinical evaluation, leaving less room for improvement for NPA, and by the extensive diagnostic experience of our clinical experts. Additionally, this study was performed in specialized centers and results may therefore not be generalizable to other settings. Use of multidisciplinary teams including a neuropsychologist who interprets and transfers NPA findings in a comprehensive manner during the diagnostic process may overcome this limitation in less specialized centers. The added value of NPA may also depend on the choice of NPA tests or the order of presentation of additional test results, and may differ when more sensitive or specific measures are used or when NPA and MRI are presented in a different order. Lastly, our simulation was a simplification of actual practice because we presented summarized NPA results, which possibly led to an underestimation of the impact of the NPA since a neuropsychologist would also provide a contextual interpretation of the results, and take part in multidisciplinary meetings in clinical practice.
Although the NPA is widely applied in patients presenting with cognitive complaints, its diagnostic and prognostic value had not been evaluated before in a large prospective study simulating clinical practice. We conclude that the added value of the NPA depends on the initial clinical impression of syndrome, and is different for diagnosing syndrome, etiology or predicting disease course. This may stimulate a more individualized approach in the diagnostic evaluation of persons with cognitive complaints, which may benefit cost-effectiveness of NPA. To make an optimal diagnosis and prognosis, we recommend the use of NPA as a decisive investigation in patients who are considered non-demented at initial clinical evaluation and in patients in whom more certainty about the diagnosis is desired. In patients suspected of dementia, the diagnostic value of the NPA is limited and this should lead to reconsideration of administering NPA, bearing in mind alternative purposes of NPA such as inventorying profiles of cognitive strengths and weaknesses, burden and care needs.
Footnotes
ACKNOWLEDGMENTS
This research was performed within the framework of CTMM, the Center for Translational Molecular Medicine, project LeARN (grant 02 N-101). The funding body had no role in the study design, the collection, analysis or interpretation of data, or the writing of the manuscript. The funding body checked the manuscript for possible infringement of intellectual property rights and approved the manuscript for publication without suggesting any revisions.
The authors thank Nico Rozendaal for data management of the LeARN project.
