Abstract
Background:
In patients with Alzheimer’s disease, global assessment scales, such as the Clinical Dementia Rating-Sum of Boxes (CDR-SB), the Clinician’s Interview-Based Impression Plus Caregiver Input (CIBI plus), and the Clinical Global Impression (CGI) are commonly used.
Objective:
To clinically understand and interpret the associations between these scales, we examined the linkages for the total and change scores of CDR-SB, CIBI plus, and CGI.
Methods:
Individual participant data (N = 2,198) from five pivotal randomized placebo-controlled trials of donepezil were included. Data were collected at baseline and scheduled visits for up to 6 months. Spearman’s correlation coefficients ρ were examined between corresponding total and change scores of simultaneous CDR-SB, CIBI plus, and CGI ratings. To link between the simultaneous ratings, equipercentile linking was used.
Results:
Spearman’s correlations between the CDR-SB and CGI total scores, and between the CDR-SB and CIBI plus total scores, were at least adequate (ρ = 0.50 to 0.71, p < 0.01). The correlations between the change scores of CDR-SB and CGI were adequate for weeks 6 to 24 (ρ = 0.44 to 0.65); the remaining change-score correlations were smaller (ρ = 0.09 to 0.35). Overall, the linkages were in line with expectations: for example, a CDR-SB score in the range 3–4 (= very mild dementia) was linked to a CGI score of 3 (= mildly ill), and a CDR-SB increase of 1 was linked to a change score of 5 (= minimal worsening) on both the CGI and the CIBI plus.
Conclusion:
The study findings can be useful for clinicians wishing to compare scores of different scales across patients. They can also help researchers understand results of studies using different scales and can facilitate meta-analyses, to increase statistical power.
INTRODUCTION
Alzheimer’s disease (AD) is a neurodegenerative disorder characterized by progressive impairment in cognition and in social and functional activities, and ultimately by death [1]. Multiple rating scales have been developed to measure functional impairment. These scales ascertain patient and caregiver information in addition to the clinician’s global impressions of the patient’s functional and cognitive capacities [2]. The scales are widely used in research and clinical settings to support a diagnosis, assess disease severity or progression, and quantify treatment efficacy. In 1990, the US Food and Drug Administration (FDA) highlighted that a global clinical assessment is essential and should be a co-primary outcome measure, in addition to measures of cognitive functioning, when testing antidementia drugs, based on the premise that a change in the patient’s global condition is clinically meaningful [3]. Recently, the FDA [4], the European Medicines Agency (EMA) [5], and the Japanese Pharmaceuticals and Medical Devices Agency (PMDA) [6] suggested that, at least for prodromal AD/mild cognitive impairment due to AD, a single primary endpoint examining both cognition and function could also be considered appropriate, a shift from the generally applied co-primary approach; nevertheless, the co-primary approach remains a requirement for established AD according to all three regulatory agencies. In response to the relevant guidelines, various global assessment scales are used.
The Clinical Dementia Rating (CDR) scale is a frequently applied global assessment instrument used to appraise the cognitive and functional aspects of dementia [7]. While the instrument is subjective as it relies on clinical impression after examining the patient and a caregiver/informant, evidence indicates that this measure has high validity and reliability [7–12]. A more expanded alternative to the CDR is the Clinician’s Interview-Based Impression Plus Caregiver Input (CIBI plus). The CIBI plus requires a semi-structured interview with the patient without other clinical information or test scores but with input from a responsible caregiver and so aims to capture ‘global’ changes across domains of cognition, function, and behavior [13, 14]. Finally, the Clinical Global Impression (CGI) scales are used to measure overall symptom severity (CGI-S) or change (CGI-C) in a variety of neurological and psychiatric conditions such as anxiety, bipolar disorder, depression, and schizophrenia [15]. Thus, the CGI is often disorder- and even study-specific. Its administration, scale wording, or scoring instructions may differ, capturing a variety of changes depending on the version included in the study. Nevertheless, the CGI is extremely useful in everyday clinical practice as it is quick and easy to complete, and its interpretation is straightforward even by clinicians who are not experts in a specific condition. Evidence demonstrates that the CGI links well to standardized detailed efficacy scales in anxiety [16–18], depression [19, 20], and schizophrenia [21, 22].
Linking different scales has been undertaken to examine measures of cognition in dementia research [23–26]. However, to date, no study has linked the total and change scores of global assessment scales in AD. Many situations may require the conversion of scores obtained from one scale to equivalent scores on another scale. Score conversions are clinically useful, for example, when comparing different efficacy outcomes across patients. In research settings, linking measures is required when aiming to meta-analyze studies using different scales, and in postmortem research where information may be available on one but not another measure. Therefore, we aimed to link CDR-SB, CIBI plus, and CGI using data from five pivotal double-blind, randomized placebo-controlled clinical trials in AD.
MATERIALS AND METHODS
Individual patient data
We requested individual participant data from double-blind, randomized placebo-controlled trials of donepezil conducted by Eisai Co. Ltd. in AD. Data access was granted following the submission of an a priori analytic plan, and the data were analyzed via a secure Internet cloud-based platform (http://www.clinicalstudydatarequest.com). We included only trials in which patients with AD were assessed using at least two of the following three scales: Clinical Dementia Rating-Sum of Boxes (CDR-SB), Clinician’s Interview-Based Impression Plus Caregiver Input (CIBI plus), and Clinical Global Impression (CGI). Institutional review boards had approved all studies, and all participants had given written informed consent.
Measures
Clinical dementia rating (CDR)
The CDR examines the patient’s performance in six domains: memory, orientation, judgment and problem solving, community affairs, home and hobbies performance, and personal care [7]. It involves a semi-structured interview with the patient and an informant or caregiver. It is clinician-rated on a 5-point scale, with a rating of 0 meaning ‘no dementia’, 0.5 ‘questionable dementia’, and 1, 2, and 3 indicating ‘mild’, ‘moderate’, and ‘severe’ cognitive impairment, respectively. The six domain ratings are summed to create a 0–18 Sum of Boxes (SB) score, which is considered a more detailed index than the CDR global score (0–3) in cases of mild dementia [27], allowing a reliable differentiation between mild cognitive impairment and AD [28, 29]. According to published interpretive guidelines, a CDR-SB score of 0.5–4.0 corresponds to ‘questionable cognitive impairment’, 4.5–9.0 to ‘mild dementia’, 9.5–15.5 to ‘moderate dementia’, and 16–18 to ‘severe dementia’ [28, 29].
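The interpretive ranges quoted above can be written as a small lookup. The following Python sketch is illustrative only: the function name `cdr_sb_stage` is hypothetical, and the label returned for a score of 0 is an assumption, since the quoted ranges start at 0.5.

```python
def cdr_sb_stage(score):
    """Map a CDR-SB score (0-18, in steps of 0.5) to the staging label quoted in the text."""
    # Box ratings are 0, 0.5, 1, 2, or 3, so their sum is always a multiple of 0.5.
    if score < 0 or score > 18 or (2 * score) != int(2 * score):
        raise ValueError("CDR-SB must lie in 0-18 in steps of 0.5")
    if score == 0:
        return "no impairment"  # assumption: 0 is not covered by the quoted ranges
    if score <= 4.0:
        return "questionable cognitive impairment"
    if score <= 9.0:
        return "mild dementia"
    if score <= 15.5:
        return "moderate dementia"
    return "severe dementia"


print(cdr_sb_stage(6.5))
```

Because valid scores move in steps of 0.5, the upper bound of each range (4.0, 9.0, 15.5) is enough to separate adjacent categories.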
Clinician’s Interview-Based Impression Plus Caregiver Input (CIBI plus)
The CIBI plus is a measure of severity or change in four domains: general condition, cognitive function, behavior, and activities of daily living [13, 14]. Ratings are based on a semi-structured interview with the patient and a caregiver, and range from 1 to 7 points; for CIBI of Severity plus (CIBIS plus), 1 corresponds to ‘no symptoms’ and 7 to ‘extremely severe’, whereas for CIBI of Change plus (CIBIC plus), 1 corresponds to ‘markedly improved’ and 7 to ‘markedly worse’.
Clinical global impression (CGI)
The CGI is rated on a 7-point scale, ranging from 1 to 7, and reflects a clinician’s global judgment about illness severity or progression. For the CGI-Severity (CGI-S) scale, 1 corresponds to ‘normal’, 2 to ‘borderline mentally ill’, 3 to ‘mildly ill’, 4 to ‘moderately ill’, 5 to ‘markedly ill’, 6 to ‘severely ill’, and 7 to ‘among the most extremely ill patients’, whereas for the CGI-Change (CGI-C) scale, 1 corresponds to ‘very much improved’, 2 to ‘much improved’, 3 to ‘minimally improved’, 4 to ‘no change’, 5 to ‘minimally worse’, 6 to ‘much worse’, and 7 to ‘very much worse’.
Statistical analysis
Initially, we computed descriptive statistics of the trial measures. Within each study, the donepezil and placebo arms were very similar in terms of patient characteristics (see Supplementary Table 1 for descriptive statistics of the studies) and were pooled to increase statistical power. We analyzed all trials collectively as a single population rather than by trial to maximize the sample size. Nevertheless, linking results were also presented separately for the drug and placebo arms. We used the Spearman correlation coefficient (ρ) and adopted the nominal level of statistical significance (p < 0.05). The Spearman correlation coefficient was considered poor for values < 0.40, adequate for values between 0.40 and 0.70, and strong for values > 0.70 [30]. Then, equipercentile linking was performed. This technique identifies scores on each scale that have the same percentile rank and allows conversion between them. Linking makes no parametric assumption about the association between the two measures, aims at concordance between them, allows for measurement error on both scales, and does not designate independent and dependent variables; thus, linked scores are interchangeable [31]. This method has been used extensively in previous studies [16, 32]. Where multiple assessments over time were available, the median value across the measurement points was used to define the corresponding scores between scales [26]. Equipercentile linking was conducted using the equate library [31] in R 3.6.2 [33].
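To make the idea of equipercentile linking concrete, the following Python sketch links one scale to another by matching percentile ranks. It is a simplified illustration, not the algorithm of the equate library used in the analysis: it uses mid-percentile ranks and linear interpolation between observed score levels, without any presmoothing. The function names and the toy data are hypothetical.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(sorted_scores, value):
    """Mid-percentile rank: percent of scores below `value` plus half the percent equal to it."""
    n = len(sorted_scores)
    below = bisect_left(sorted_scores, value)
    at_or_below = bisect_right(sorted_scores, value)
    return 100.0 * (below + 0.5 * (at_or_below - below)) / n


def equipercentile_link(x_scores, y_scores, x_value):
    """Return the Y score whose percentile rank matches that of `x_value` on X."""
    xs = sorted(x_scores)
    ys = sorted(y_scores)
    target = percentile_rank(xs, x_value)

    # Percentile rank of every distinct Y score; strictly increasing by construction.
    y_levels = sorted(set(ys))
    y_ranks = [percentile_rank(ys, y) for y in y_levels]

    # Outside the observed range, clamp to the lowest or highest observed Y score.
    if target <= y_ranks[0]:
        return y_levels[0]
    if target >= y_ranks[-1]:
        return y_levels[-1]

    # Linear interpolation between the two neighboring Y levels.
    hi = bisect_left(y_ranks, target)
    lo = hi - 1
    frac = (target - y_ranks[lo]) / (y_ranks[hi] - y_ranks[lo])
    return y_levels[lo] + frac * (y_levels[hi] - y_levels[lo])


# Toy data (not trial data): two scales measured on the same ten hypothetical patients.
toy_x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_y = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
print(equipercentile_link(toy_x, toy_y, 5))
```

The function treats both scales symmetrically: swapping the roles of X and Y gives the reverse conversion, which reflects the point that linking has no dependent and independent variable.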
RESULTS
We included five pivotal randomized placebo-controlled trials of donepezil [34–38] that employed at least one simultaneous scheduled administration of the CDR-SB and either the CIBI plus or the CGI in the present analysis. Table 1 details the studies. Data from 2,198 trial participants with AD based on the Diagnostic and Statistical Manual of Mental Disorders, Revised Third or Fourth Edition (DSM-III-R/-IV) [39, 40], were included. None of the trials required biomarker confirmation of AD pathology. There were 854 men and 1,344 women, with a mean age (standard deviation [SD]) of 72.4 (7.5) years. Descriptive statistics for total and change rating scale scores are shown in Table 2. The CDR-SB was administered in all five trials, whereas the CIBI plus was administered in three trials and the CGI in two, with no overlap between them (Table 1). Independent raters scored the CDR-SB and the global measurement in three of the studies [36–38]. Details on the specific versions of CGI used in the selected studies are provided in Supplementary Material A.
Trial characteristics of the analytic dataset
CDR, Clinical Dementia Rating global score; MMSE, Mini-Mental State Examination; ADAS cog, Alzheimer’s Disease Assessment Scale-Cognitive Subscale.
Sample characteristics
N, sample size; CGI-S, Clinical Global Impression-Severity; CDR-SB, Clinical Dementia Rating-Sum of Boxes; CIBIS plus, Clinician’s Interview-Based Impression of Severity Plus Caregiver Input; CIBIC plus, Clinician’s Interview-Based Impression of Change Plus Caregiver Input; ρ, Spearman’s correlation coefficient. Values of ρ in bold are statistically significant (p < 0.05). Sample size varies owing to different visit schedules across trials.
CDR-SB, CGI-S, and CIBIS plus
The Spearman correlation coefficients between CDR-SB and CGI-S scores were adequate at baseline (ρ = 0.68) and strong at week 12 (ρ = 0.71; Table 2), with strong evidence against the null hypothesis of zero correlation (p < 0.01) at both timepoints. Equipercentile linking of CDR-SB scores with CGI-S scores was calculated. A CDR-SB score of approximately 2 corresponded to a CGI-S score of 2, a CDR-SB score of 4 to a CGI-S score of 3, a CDR-SB score of 6.5 to a CGI-S score of 4, and a CDR-SB score of 11.5 to a CGI-S score of 5 (Fig. 1A, Table 3).
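As an illustration of how such coefficients are obtained and graded, the Python sketch below computes Spearman's ρ as the Pearson correlation of average ranks and applies the thresholds stated in the Statistical analysis section (poor < 0.40, adequate 0.40 to 0.70, strong > 0.70). The function names are hypothetical, and grading by absolute value is an assumption, since all reported coefficients are positive.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data.

    Raises ZeroDivisionError if either variable is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def classify_rho(rho):
    """Grade a coefficient using the thresholds quoted in the Methods."""
    magnitude = abs(rho)  # assumption: thresholds are applied to the magnitude
    if magnitude < 0.40:
        return "poor"
    if magnitude <= 0.70:
        return "adequate"
    return "strong"


for rho in (0.68, 0.71):
    print(rho, classify_rho(rho))
```

Under these thresholds, the baseline coefficient of 0.68 grades as adequate and the week-12 coefficient of 0.71 as strong.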

A) Clinical Dementia Rating Sum of Boxes (CDR-SB) linked to Clinical Global Impression Severity (CGI-S) at baseline and week 12. B) Clinical Dementia Rating Sum of Boxes (CDR-SB) linked to Clinician’s Interview–Based Impression of Severity Plus Caregiver Input (CIBIS plus) at baseline. C) Clinical Dementia Rating Sum of Boxes (CDR-SB) Change Score linked to Clinical Global Impression Change (CGI-C) from week 1 to week 24 at scheduled visits with the median score across assessment points. D) Clinical Dementia Rating Sum of Boxes (CDR-SB) Change Score linked to Clinician’s Interview–Based Impression of Change Plus Caregiver Input (CIBIC plus) from week 3 to week 24 at scheduled visits with the median score across assessment points.
Conversion table between CGI-S, CDR-SB and CIBIS plus scores
CGI-S, Clinical Global Impression-Severity; CDR-SB, Clinical Dementia Rating-Sum of Boxes; CIBIS plus, Clinician’s Interview-Based Impression of Severity Plus Caregiver Input.
For both CDR-SB and CIBIS plus scales, only baseline assessments were available. The Spearman correlation coefficient between their scores was deemed adequate (ρ= 0.50 at baseline; Table 2), p < 0.01. Equipercentile linking of CDR-SB scores with CIBIS plus is presented in Fig. 1B and corresponding scores in Table 3. Approximately, CDR-SB score of 2.5 corresponded to CIBIS plus score of 1, CDR-SB of 3.5 to CIBIS plus of 2, CDR-SB of 5.5 to CIBIS plus of 3, CDR-SB of 7.0 to CIBIS plus of 4, CDR-SB of 11.0 to CIBIS plus of 5, and CDR-SB of 13.0 to CIBIS plus of 6.
CDR-SB change, CGI-C, and CIBIC plus
The Spearman correlation coefficients between CDR-SB change and CGI-C scores were poor during the first 3 weeks (ρ = 0.16 to 0.34) and adequate at weeks 6 to 24 (ρ = 0.44 to 0.65), with p < 0.01 at all time points apart from week 1 (Table 2). Equipercentile linking of CDR-SB change with CGI-C scores was calculated. Based on the median values across time points, a CDR-SB decrease of 3.5 corresponded to a CGI-C score of 1, a decrease of 2.0 to a CGI-C score of 2, a decrease of 0.5 to a CGI-C score of 3, no change in CDR-SB to a CGI-C score of 4, an increase of 1.0 to a CGI-C score of 5, an increase of 2.5 to a CGI-C score of 6, and an increase of 5.5 to a CGI-C score of 7 (see Fig. 1C and Table 4).
Conversion table between CGI-C, CDR-SB change and CIBIC plus scores
CGI-C, Clinical Global Impression-Change; CDR-SB, Clinical Dementia Rating-Sum of Boxes; CIBIC plus, Clinician’s Interview-Based Impression of Change Plus Caregiver Input. Conversion scores are based on median value across different measurement points to define the corresponding scores between the scales.
The Spearman correlation coefficients between CDR-SB change and CIBIC plus scores were poor over time (range ρ = 0.09 to 0.35; Table 2). Equipercentile linking of CDR-SB change with CIBIC plus scores was calculated. Based on the median values across time points, a CDR-SB decrease of 5.0 corresponded to a CIBIC plus score of 1, a decrease of 3.0 to a CIBIC plus score of 2, a decrease of 1.0 to a CIBIC plus score of 3, no change in CDR-SB to a CIBIC plus score of 4, an increase of 1.0 to a CIBIC plus score of 5, an increase of 3.0 to a CIBIC plus score of 6, and an increase of 6.0 to a CIBIC plus score of 7 (see Fig. 1D and Table 4). In Supplementary Material B, linking results are presented separately for the drug and placebo arms (see Supplementary Tables 2–5).
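The change-score correspondences reported in this section can be held in a small lookup for illustration. In the Python sketch below, the anchor values are transcribed from the two paragraphs above; the dictionary and function names are hypothetical, and the nearest-anchor rule for intermediate CDR-SB changes is an assumption of this sketch rather than part of the published conversion tables. Given the poor correlations with CIBIC plus and the sparse data at extreme values, any such conversion should be read as approximate.

```python
# Linked CDR-SB change (negative = improvement) for each global change rating, from the text.
CGI_C_ANCHORS = {1: -3.5, 2: -2.0, 3: -0.5, 4: 0.0, 5: 1.0, 6: 2.5, 7: 5.5}
CIBIC_PLUS_ANCHORS = {1: -5.0, 2: -3.0, 3: -1.0, 4: 0.0, 5: 1.0, 6: 3.0, 7: 6.0}


def nearest_rating(cdr_sb_change, anchors):
    """Return the change rating whose linked CDR-SB change is closest to the input.

    Assumption: nearest-anchor matching; exact ties resolve to the lower rating.
    """
    return min(sorted(anchors), key=lambda rating: abs(anchors[rating] - cdr_sb_change))


# A 1-point CDR-SB increase maps to 'minimally worse' (5) on both scales.
print(nearest_rating(1.0, CGI_C_ANCHORS), nearest_rating(1.0, CIBIC_PLUS_ANCHORS))
```

Keeping the anchors in plain dictionaries makes the reverse lookup trivial: `CGI_C_ANCHORS[5]` gives the CDR-SB change linked to a CGI-C rating of 5.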
DISCUSSION
The current study is the first to examine the linkage among global rating scales (i.e., CDR-SB, CIBI plus, and CGI total and change scores) in patients with mild to moderate AD. Based on individual participant data derived from five pivotal randomized controlled trials that compared donepezil and placebo, our results offer researchers and clinicians concordance tables of total and change scores between these global rating scales. The current study has several notable features. First, these broadly used scales have never been linked before. Second, the results are from individual-level data on a large sample of participants (N = 2,198), strengthening the robustness of our findings. Third, the concordance tables may solve practical problems in everyday clinical practice and research. Fourth, placebo- and donepezil-treated participants were included in the analysis, which results in a greater spread of post-baseline scores.
The CDR-SB had adequate correlations with CGI-S and CIBIS plus total scores, increasing our confidence in the linking functions. Results suggested that a CDR-SB score of 2–2.5 (= questionable cognitive impairment) was linked to a CGI-S score of 2 (= borderline mentally ill) and to a CIBIS plus score of 1 (= normal). Similarly, a CDR-SB score of 3.5–4 (= very mild dementia) was linked to a CGI-S score of 3 (= mildly ill) and to a CIBIS plus score of 2 (= borderline mentally ill); a CDR-SB range score of 5.5–7 to a CGI-S score of 4 (= moderately ill) and a CIBIS plus score of 3 (= mildly ill) or 4 (= moderately ill); and a CDR-SB range score of 11–13 (= moderate dementia) to a CGI-S score of 5 (= markedly ill) and to a CIBIS plus score of 5 (= markedly ill) or 6 (= severely ill).
In contrast, the correlations of change scores between the rating scales were lower, especially for the linking of CDR-SB and CIBIC plus ratings, suggesting that conversion of change scores may be less reliable than conversion of total scores, a finding also reported in other studies [41, 42]. These lower correlations may partly reflect floor and ceiling effects in extreme change scores (i.e., the scales lack sensitivity to differentiate moderate from marked change), as is evident in Table 4, where similar scores indicate moderate and marked improvement. In addition, extreme values often result in weakened linkages owing to sparse data. Therefore, more caution is necessary when interpreting results at the very low or very high end of the score range. As for the minimal clinically important difference, our results suggested that a 1-point CDR-SB increase corresponds to minimal worsening (Table 4), whereas Andrews et al. reported that, for patients with mild, moderate, or severe AD, a 2-point CDR-SB increase is indicative of a minimal clinically important decline using anchor-based methods [43]; nevertheless, our study was not designed to identify minimal clinically important differences. Moreover, correlations of change scores between CDR-SB and CGI-C were adequate for weeks 6 to 24. Several potential reasons could explain why the change score correlations increased over time (Table 2). One explanation is that clinicians expect a delayed treatment effect of antidementia drugs [44]; thus, as the trial progressed, the variability of scale scores increased, as did the correlation coefficients between them. Another explanation is that the natural progression of the disease induced an increase in functional impairment and its variability.
One should consider additional general drawbacks when analyzing change scores in AD. First, clinicians appear to find it easier to identify progressive deterioration than to identify amelioration [45, 46], given the understanding that dementia progression is inevitable, even under treatment. In other words, the model of progressive decline is better defined and identified than the model of improvement [47, 48]. Furthermore, in AD, any given patient’s picture may remain stable for an extended time interval or fluctuate within a day or from day to day [49]. Thus, for a given change in symptom severity to accumulate and become clinically detectable, RCTs of long duration may be needed [50]. Indeed, a European task force consensus recommended that 18-month-long trials be used in AD [51], but our analysis was restricted to a maximum of 6 months. To offset this limitation, we present information by period for researchers and clinicians where required.
Participants in our analysis were highly selected since they came from double-blind, randomized placebo-controlled trials with strict inclusion criteria (Table 1). Based on trial inclusion criteria, no patient had very early or prodromal AD (stages 1, 2, and 3) or severe AD (stage 6) since all patients were required to have an initial CDR severity of 1 (mild AD, stage 4) or 2 points (moderate AD, stage 5)—an entry cut-off for all studies (Table 1). This inclusion selection cut-off limited the variability of patient severity needed in linking studies and no linking results were available for total scores corresponding to prodromal or severe dementia. Also, research has shown that such selection criteria allow only limited generalizations from clinical trial data to the general population [52, 53]. Nonetheless, our findings may apply to persons with mild to moderate symptoms of AD. Epidemiological studies with longer follow-up periods may be needed to inform clinical practice better and replicate the current results. Furthermore, there may be order-of-assessment effects in the participant ratings.
Notwithstanding the limitations of our study, our results can be useful in clinical and research settings. In clinical settings, they provide clinicians with an accessible means to interpret different ratings and so, for instance, to compare patients. In research settings, they are useful for meta-analysis (e.g., when the conversion of a covariate or efficacy measure is needed), to pool and/or harmonize data across studies and so increase statistical power, and when postmortem data are examined, since information may be limited and linking could be a sensible alternative.
In conclusion, the provided linkages between CDR-SB and CGI as well as CDR-SB and CIBI plus offer the first evidence-based insight into the associations between these global rating scales, which could contribute to a better understanding and improved interpretation of clinical data and trial results in Alzheimer’s disease.
Footnotes
ACKNOWLEDGMENTS
We acknowledge Eisai Co. Ltd for providing us with the study data. Eisai Co. Ltd did not provide study design, critical input, or manuscript review for the study. The views expressed in this manuscript may or may not express the views of Eisai Co. Ltd. We acknowledge http://www.clinicalstudydatarequest.com for hosting the study data. http://www.clinicalstudydatarequest.com did not provide study design, critical input, or manuscript review for the study. The views expressed in this manuscript may or may not express the views of http://www.clinicalstudydatarequest.com.
Cipriani is supported by the National Institute for Health Research (NIHR) Oxford Cognitive Health Clinical Research Facility, by an NIHR Research Professorship (grant RP-2017-08-ST2-006), by the NIHR Oxford and Thames Valley Applied Research Collaboration and by the NIHR Oxford Health Biomedical Research Centre (grant BRC-1215-20005). The views expressed are those of the authors and not necessarily those of the UK National Health Service, the NIHR, or the UK Department of Health. Efthimiou is supported by project grant No. 180083 from the Swiss National Science Foundation (SNSF).
In the last three years, Stefan Leucht has received honoraria as a consultant and/or advisor and/or for lectures from Alkermes, Angelini, Eisai, Gedeon Richter, Janssen, Johnson and Johnson, Lundbeck, Medichem, Merck Sharp and Dohme, Otsuka, Recordati, Rovi, Sandoz, Sanofi Aventis, Sunovion, and TEVA.
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
