Abstract
The clinical utility of amyloid positron emission tomography (PET) has not been fully established. Our aim was to evaluate the effect of amyloid imaging on clinical decision making in a secondary care unit and to compare our results with those of a previous study in a tertiary center that followed the same methods. We retrospectively reviewed 151 cognitively impaired patients who underwent amyloid (Pittsburgh compound B [PiB]) PET and were evaluated clinically before and after the scan in a secondary care unit. One hundred and fifty concurrently underwent fluorodeoxyglucose (FDG)-PET. We assessed changes between the pre- and post-PET clinical diagnosis and Alzheimer’s disease treatment plan. The association between PiB/FDG results and changes in management was evaluated using χ2 and multivariate logistic regression. Concordance between classification based on scan readings and baseline diagnosis was 66% for PiB and 47% for FDG. The primary diagnosis changed after PET in 17.2% of cases. When examined independently, discordant PiB and discordant FDG were both associated with diagnostic change (p < 0.0001). However, when examined together in a multivariate logistic regression, only discordant PiB remained significant (p = 0.0002). Changes in treatment were associated with concordant PiB (p = 0.009), while FDG had no effect on treatment decisions. Based on our regression model, patients with diagnostic dilemmas, a suspected non-amyloid syndrome, and Clinical Dementia Rating <1 were more likely to benefit from amyloid PET due to a higher likelihood of diagnostic change. We found that changes in diagnosis after PET in our secondary center almost doubled those of our previous analysis of a tertiary unit (17.2% versus 9%). Our results offer some clues about the rational use of amyloid PET in a secondary care memory unit, stressing its utility in mild cognitive impairment patients.
INTRODUCTION
Positron emission tomography (PET) tracers allow moderate to frequent amyloid-β (Aβ) plaques to be detected in the brain. There is abundant evidence relating brain Aβ deposits to the risk of mild cognitive impairment (MCI) and of progression to Alzheimer’s disease (AD) [1, 2]. Although amyloid PET has been included in new proposals of research criteria for AD [3], there are still many uncertainties regarding the implications of a positive amyloid scan in the absence of the cognitive symptoms typical of AD. On the other hand, pathologically proven AD cases with a negative ante-mortem amyloid PET scan have been documented [4]. Therefore, amyloid testing should be put in context with clinical evaluation and other biomarkers.
Three amyloid tracers have been approved for clinical use, but their cost at present is high and there is still insufficient clinical experience [5]. In 2013, Appropriate Use Criteria (AUC) were published. However, these are recommendations mainly based on expert panels [6]. Currently, the Centers for Medicare & Medicaid Services do not provide coverage for amyloid PET scans due to insufficient evidence that these techniques improve health outcomes in dementia [7]. A recent literature review using a structured framework developed for the assessment of oncological biomarkers concluded that large studies assessing the clinical utility of amyloid PET were needed [8]. Several publications have attempted to address this issue; however, many of them come from tertiary care centers with selected patients included in ongoing research protocols and treated by highly specialized neurologists [9–20]. In 2014 we published the experience of the University of California San Francisco Memory and Aging Center (UCSF-MAC) with Pittsburgh compound B PET (PiB-PET) [15]. We showed that discordance between the initial clinical diagnosis and the PET result was a major driving force of diagnostic changes. However, the agreement between clinicians and PET in that center was very high and the percentage of patients with diagnostic changes after PET was lower than in previous reports. One of the caveats of that study was the doubtful generalizability of some of the findings, particularly to less specialized practice settings.
The purpose of this study was to evaluate the effect of PiB-PET and fluorodeoxyglucose (FDG) PET in clinical practice in a secondary care memory unit attending non-selected patients with cognitive complaints referred by general practitioners. To achieve this, we followed the same design as in our previous study at UCSF-MAC, but applied to a less specialized setting at University Hospital Marqués de Valdecilla (UHMV) in Santander (Northern Spain).
We hypothesized that there might be substantial differences in the estimated clinical effect of amyloid PET depending on the particularities of the center. In a secondary care unit like UHMV, PiB-PET might have a greater impact on clinical management and might be more influential for clinicians than in highly specialized tertiary units like UCSF-MAC.
MATERIALS AND METHODS
Study population
We retrospectively reviewed the UHMV Memory Unit database between 2010 and 2015 and, out of the 2116 new patients evaluated, identified 151 who underwent FDG-PET and PiB-PET and were assessed clinically before and after the scan. PET scans were performed under research protocols evaluating the utility of PiB in the differential diagnosis of AD [21]. Tests were ordered by the treating neurologists when considered to be helpful in their diagnostic workup. Patients with unstable medical comorbidities, brain mass lesions, or significant cerebrovascular disease were not eligible. Before the PET scan all patients underwent an assessment by a neurologist, cognitive testing, and structural neuroimaging with CT or MRI. CSF AD biomarkers were not available at the time of disclosure of the PET scan results. FDG-PET and PiB-PET results were revealed simultaneously to the neurologist. Clinical diagnosis was made based on best clinical judgment by the attending neurologists. Up to three diagnoses could be listed in the “differential diagnosis,” ranked in order of likelihood. The post-PET visit included a clinical evaluation and review of PET results. Patients’ records were reviewed retrospectively by two neurologists (CL and AGS) to determine the use of AD-specific medications at the pre- and post-PET visits.
PET scan acquisition and interpretation
All patients underwent PiB-PET and FDG-PET at the Nuclear Medicine Department of UHMV. 11C-PiB synthesis and image acquisition have been described elsewhere [21]. PET scans were visually interpreted by an experienced nuclear medicine specialist (JJB or IB) as positive/negative for cortical PiB uptake. The inter-rater reliability was very high, with a correlation of 93.3% and a kappa coefficient of 0.87 (p < 0.001). When a PiB-PET was considered positive, a global subjective estimation of the amyloid load was given (mild, moderate, or severe), describing which brain areas were involved. Equivocal cases were repeated to rule out technical issues; if they were still considered borderline after repetition, they were removed from the analysis. FDG scans were rated as consistent with “AD” or its variants (including dementia with Lewy bodies) if hypometabolism primarily involved the temporoparietal cortex, posterior cingulate/precuneus, or occipital cortex. Scans were rated as “non-AD” if hypometabolism primarily involved the frontal or anterior temporal cortex (frontotemporal dementia [FTD] pattern) or appeared within normal limits. All PET scan ratings were performed blinded to clinical data. The clinician in charge was given a report including the dichotomous classification of each scan and a description of each tracer’s spatial binding pattern.
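To illustrate how raw agreement and a kappa coefficient relate for dichotomous visual reads, the following is a minimal sketch in Python. It is not the authors' analysis code: the function name `cohen_kappa` and the toy ratings are hypothetical and are not study data, and the sketch assumes two readers rating the same scans as positive or negative.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed proportion of items on which both raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy reads for ten scans (NOT data from this study).
reader_1 = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "pos", "neg", "pos"]
reader_2 = ["pos", "pos", "neg", "pos", "pos", "neg", "pos", "pos", "neg", "pos"]

raw_agreement = sum(a == b for a, b in zip(reader_1, reader_2)) / len(reader_1)
kappa = cohen_kappa(reader_1, reader_2)
print(f"raw agreement = {raw_agreement:.2f}, kappa = {kappa:.2f}")
```

Kappa discounts the agreement expected by chance, which is why it is typically lower than raw agreement when one reading category dominates.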
Standard protocol approvals, registrations, and patient consent
Written informed consent was obtained from all patients or surrogates. The study was approved by our regional review board for human research (Comité Ético de Investigación de Cantabria).
Data analysis
Pre-PET clinical diagnoses were divided into “Aβ” or “non-Aβ” categories based on the association of the clinical syndrome with amyloid pathology (Table 1). Aβ diagnoses consisted primarily of typical and atypical presentations of AD [22]. Dementia with Lewy bodies was also included in the Aβ group due to its high degree of co-pathology with AD. The non-Aβ category consisted of clinical variants of FTD. Amnestic MCI was included in the Aβ category, and non-amnestic MCI was considered a non-Aβ diagnosis [23]. In cases with multiple differential diagnoses, the first item listed was considered the “primary diagnosis”. Patients with both Aβ and non-Aβ diagnoses listed on the differential diagnosis were considered “diagnostic dilemmas”. The primary predictor of interest was concordance between PET result and clinical diagnosis. PiB-positive and FDG-AD scans were considered concordant with an Aβ diagnosis, while PiB-negative and FDG-non-AD scans were considered concordant with a non-Aβ diagnosis. The main outcomes were defined as changes in: 1) primary diagnosis, 2) clinical uncertainty, and 3) AD treatment between the pre- and post-PET visits. Change in primary diagnosis was defined as a change in the first-listed diagnosis from Aβ to non-Aβ or vice versa. Change in AD treatment was defined as initiating or discontinuing cholinesterase inhibitors or memantine. Clinical uncertainty was estimated by the percentage of diagnostic dilemmas. We first assessed the relationship between PET results and clinical outcomes separately for PiB and FDG using χ2 or Fisher’s exact test. Next, we performed logistic regression predicting each outcome when accounting for the following predictors: discordant PiB, discordant FDG, diagnostic dilemma pre-PET, sex, age at PET <65 years, baseline Aβ diagnosis, and Clinical Dementia Rating (CDR).
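The classification rules described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' analysis code: the category labels (`"Abeta"`, `"non-Abeta"`, `"AD"`, `"non-AD"`), all function names, and the 2×2 counts in the usage line are hypothetical. The χ2 helper computes the Pearson statistic without continuity correction; whether the original analysis applied a correction is not stated in the text.

```python
CATEGORIES = ("Abeta", "non-Abeta")  # hypothetical labels for the two groups


def _is_abeta(category):
    if category not in CATEGORIES:
        raise ValueError(f"unknown diagnostic category: {category!r}")
    return category == "Abeta"


def pib_concordant(primary_category, pib_positive):
    """PiB-positive agrees with an Abeta diagnosis, PiB-negative with non-Abeta."""
    return _is_abeta(primary_category) == bool(pib_positive)


def fdg_concordant(primary_category, fdg_pattern):
    """An FDG 'AD' pattern agrees with Abeta, a 'non-AD' pattern with non-Abeta."""
    if fdg_pattern not in ("AD", "non-AD"):
        raise ValueError(f"unknown FDG pattern: {fdg_pattern!r}")
    return _is_abeta(primary_category) == (fdg_pattern == "AD")


def is_diagnostic_dilemma(differential):
    """True when the differential lists both Abeta and non-Abeta diagnoses."""
    return set(CATEGORIES) <= set(differential)


def primary_diagnosis_changed(pre_category, post_category):
    """True when the first-listed diagnosis crossed between the two groups."""
    return _is_abeta(pre_category) != _is_abeta(post_category)


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    margins = (a + b, c + d, a + c, b + d)
    if 0 in margins:
        raise ValueError("every row and column must have a non-zero total")
    return n * (a * d - b * c) ** 2 / (margins[0] * margins[1] * margins[2] * margins[3])


# Hypothetical counts (NOT study data): rows = discordant / concordant scan,
# columns = diagnosis changed / unchanged.
statistic = chi_square_2x2(10, 20, 30, 40)
print(f"chi-square = {statistic:.3f}")
```

Encoding concordance as a comparison of two booleans keeps the rule symmetric: the same function covers a positive scan with an Aβ diagnosis and a negative scan with a non-Aβ diagnosis.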
Table 1. Specific diagnoses at baseline
AD, Alzheimer’s disease; PPA, primary progressive aphasia; MCI, mild cognitive impairment; bvFTD, behavioral variant frontotemporal dementia; CBS, corticobasal syndrome. * B12 deficiency, immune mediated cognitive impairment, psychiatric, systemic disease.
RESULTS
PET scans were ordered by seven different neurologists: three are experts in behavioral neurology and the remaining four are general neurologists. We compared the degree of clinical concordance with PET results and found no statistically significant differences across neurologists (PiB p = 0.48; FDG p = 0.46). Additionally, we stratified the sample, comparing the three neurologists with more experience in behavioral neurology with the general neurologists, and found no differences in concordance (PiB p = 0.88; FDG p = 0.54). The most frequent etiologic subgroup in our cohort was amnestic MCI followed by AD (Table 1). In most patients, an Aβ diagnosis was expected before PET. The average age of our patients was relatively young and most of them were at initial stages at the time of the study. Only a quarter of our patients were on AD drug treatment pre-PET (Table 2). Three PiB-PET scans were considered “equivocal” and removed from the analysis.
Table 2. Clinical and demographic characteristics
MMSE, Mini-Mental-State Examination; AD, Alzheimer’s disease; ChEI, cholinesterase inhibitors; CDR, Clinical Dementia Rating.
Concordance between PET results and clinical suspicion
Overall concordance between classification based on scan readings and pre-PET diagnosis was 66.2% for PiB and 46.7% for FDG. PiB concordance was higher than FDG concordance in typical AD (p = 0.05) and in amnestic MCI (p = 0.00002), and PiB concordance was higher in AD than in MCI (p = 0.03) and in corticobasal syndrome (p = 0.001) (Fig. 1A). We found no differences regarding age (PiB p = 0.63; FDG p = 0.17) or CDR (PiB p = 0.94; FDG p = 0.25) (Fig. 1B, C). Overall, PiB and FDG agreed in classifying 74% of patients.

Fig. 1. A) Percentage of concordance between the initial diagnosis and PiB and FDG PET results. PiB concordance was higher than FDG concordance in typical AD (80% versus 57%, respectively) and in amnestic MCI (57% versus 20%, respectively). PiB concordance was higher in AD than in MCI (80% versus 57%, respectively) and in corticobasal syndrome (80% versus 0%, respectively). We found no differences regarding age (B) or CDR (C). AD, typical Alzheimer’s disease; PPA, primary progressive aphasia; MCI, mild cognitive impairment; bvFTD, behavioral variant frontotemporal dementia; CBS, corticobasal syndrome; MMSE, Mini-Mental-State Examination; CDR, Clinical Dementia Rating.
Diagnostic changes after PET
Table 3. Factors associated with diagnostic changes
*Adjusted by all other covariates included in the model. CDR, Clinical Dementia Rating.
The primary diagnosis changed after PET in 17.2% of the patients. Tested separately, discordant PiB and discordant FDG results were both strongly associated with diagnostic change. In the crude analysis, there was a highly significant association between diagnostic dilemmas pre-PET and changes in diagnosis. When including both PET scans as predictors in a single logistic regression model, diagnostic changes were associated with discordant PiB (p = 0.0002) but not discordant FDG (p = 0.14) (Table 3). When both scans agreed with the clinical diagnosis, changes were exceptional (1.5%). In contrast, diagnostic changes were frequent when both scans were discordant with the clinical diagnosis (45.6%) or when PiB was discordant but FDG was not (60%); however, when FDG was discordant but PiB agreed with the clinical diagnosis, clinicians tended to rely more on PiB and changed the diagnosis in only 2.9% of cases (Table 4).
The full logistic regression model (Table 3) shows that diagnostic dilemmas and discordant PiB remained significantly associated with diagnostic changes after p-value adjustment. Additionally, when a non-Aβ syndrome was suspected, the diagnosis was more likely to be changed after PET; the same was true for patients with CDR <1, which is consistent with the fact that 34.6% of all diagnostic changes took place in amnestic MCI patients.
Changes in the clinician’s diagnostic confidence
The number of diagnostic dilemmas decreased significantly from 37.7% pre-PET to 15.6% post-PET (p = 0.00002).
Treatment changes after PET
In 45% of the patients a treatment change took place after the PET results. The most common change was the addition of an AD drug (85.3%). FDG results did not influence treatment. However, we found that concordance between PiB-PET and clinical diagnosis was significantly associated with treatment change (p = 0.006), and this result remained statistically significant in the full logistic regression model (p = 0.009) (Table 5). The main diagnostic group in which changes took place was amnestic MCI (47% of treatment changes), and 94% of these changes consisted of the initiation of an AD drug.
Comparison with a tertiary center
The UCSF-MAC and UHMV study populations had, on average, a similar age at disease onset (UCSF-MAC 65.0 years versus UHMV 67.3 years) and were both evaluated at early disease stages (MMSE: UCSF-MAC 22.7 versus UHMV 24.2). However, the percentage of patients treated with AD drugs was higher in UCSF-MAC (46% on cholinesterase inhibitors and 39% on memantine) than in UHMV (75% untreated), and UHMV had a predominance of suspected Aβ pathology (72.8% in UHMV versus 46% in UCSF-MAC). Another distinction between the two study populations is that while MCI was the most frequent diagnostic category in UHMV (49%), it was rare at UCSF-MAC (7%). This is consistent with the differences in CDR between the two populations (CDR <1: UHMV 89.9% versus UCSF-MAC 42%).
Table 4. Diagnostic changes according to concordance of FDG-PET and PiB-PET with clinical diagnosis
Table 5. Changes in AD treatment in relation to concordance of clinical diagnosis with PET results
*Adjusted by: discordant PiB, discordant FDG, diagnostic dilemma pre-PET, sex, age at PET < 65 years, baseline Aβ diagnosis, and Clinical Dementia Rating.
A finding shared with the UCSF-MAC study was that clinical concordance with PiB was higher than with FDG, a difference that was statistically significant for classical forms of AD. Additionally, in both studies PiB results were more decisive for clinicians than FDG results, so when the PiB results were discordant with the FDG results, clinicians tended to follow the PiB.
We found that agreement between clinical diagnosis and amyloid PET was lower in UHMV than in UCSF-MAC (UHMV 66.2% versus UCSF-MAC 84%). That lower agreement in the secondary center was in line with a higher rate of changes in diagnosis after PET (UHMV 17.2% versus UCSF-MAC 9%). Likewise, the reduction in clinical dilemmas was more intense in UHMV, with a 22.1% reduction after PET compared to 8% at UCSF-MAC. Finally, whereas the influence of PiB-PET on treatment was not significant for the UCSF-MAC patients, there was a clear effect on treatment in UHMV, where PET played a confirmatory role.
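The between-center differences quoted above follow from simple arithmetic on the reported percentages. The short Python sketch below reproduces them; the variable and function names are hypothetical, while every number is taken from the text.

```python
# Percentages as reported in the text.
pib_agreement = {"UHMV": 66.2, "UCSF-MAC": 84.0}     # clinical diagnosis vs PiB-PET
diagnostic_change = {"UHMV": 17.2, "UCSF-MAC": 9.0}  # primary diagnosis changed
dilemmas_uhmv = {"pre_pet": 37.7, "post_pet": 15.6}  # diagnostic dilemmas at UHMV


def percentage_point_gap(higher, lower):
    """Absolute difference between two percentages, in percentage points."""
    return round(higher - lower, 1)


agreement_gap = percentage_point_gap(pib_agreement["UCSF-MAC"], pib_agreement["UHMV"])
dilemma_drop = percentage_point_gap(dilemmas_uhmv["pre_pet"], dilemmas_uhmv["post_pet"])
change_ratio = round(diagnostic_change["UHMV"] / diagnostic_change["UCSF-MAC"], 2)

print(agreement_gap)  # 17.8 percentage points lower agreement at UHMV
print(dilemma_drop)   # 22.1 percentage points fewer dilemmas after PET
print(change_ratio)   # 1.91, i.e. diagnostic changes "almost doubled"
```

These are absolute differences in percentage points, not relative changes; the ratio of 1.91 is the only relative figure.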
DISCUSSION
One of our main findings was that changes in diagnosis after PET in UHMV almost doubled those of our previous analysis of the UCSF-MAC patients. The percentages of diagnostic change after amyloid PET reported in previous studies vary widely, from 9% to 79%, and similar disparities are found when other indicators are analyzed, such as influence on AD-specific treatment or clinicians’ confidence in diagnosis [9–20]. These differences are related to study design and methodology. In general, single-site specialty studies, like the current work and our previous analysis of the UCSF-MAC series, tend to show lower clinical repercussion than large multicenter studies. For instance, diagnostic changes after PET were estimated to be 9%, 19%, 23%, and 23%, respectively, in single-center studies [11, 18], in contrast to 32.6%, 54.6%, and 79%, respectively, in larger multicenter studies [12, 20]. This is in line with preliminary data from the Imaging Dementia Evidence for Amyloid Scanning (IDEAS) study, organized by the Alzheimer’s Association, which is currently assessing the clinical utility of amyloid PET in 674 clinical practices. Interim results from the first 4,000 people scanned show that care plans shifted after amyloid PET results for 67.6% of participants (Rabinovici, personal communication). These differences might be related to an overrepresentation, among single-site studies, of more specialized centers with earlier access to amyloid PET technology, a bias that would be diluted in large multicenter studies.
Given the heterogeneity among published studies, the comparison between UHMV and UCSF-MAC, applying the same design and methods, has notable value because it allows a straightforward interpretation and offers clues about the different utility of these tests depending on the context. The different rates of diagnostic change could be partially explained by the fact that the agreement between clinical diagnosis and amyloid PET, the largest determinant of diagnostic change in both studies, was 17.8 percentage points lower in UHMV. The discordance between the clinician’s initial diagnosis and the result of the scan could be a proxy for the amount of additional information offered by the test. Therefore, in our setting, amyloid PET seems to play a more valuable role than in tertiary units like UCSF-MAC. The differences in discordance between centers might be caused by many factors, such as neurologist expertise, methodological differences in clinical workup, and a diverse patient profile. We did not find significant differences among UHMV neurologists. However, the average patient profile seen by each group is clearly distinct. Age at onset and disease stage at recruitment were similar in both studies. However, UCSF-MAC patients were frequently referred for second opinions and for inclusion in research protocols, as reflected by the fact that almost half of them were already treated with AD drugs at recruitment, and Aβ pathology was the first suspected diagnosis in less than half of the cases. In UHMV, most patients were referred by general practitioners for diagnosis and treatment, AD being the most common initial diagnosis. Some of the patients’ characteristics reflect the particularities of a secondary care center versus a tertiary center highly specialized in FTD, like the UCSF-MAC. A major difference between the two populations is that while the most frequent diagnostic category in UHMV was MCI, it was almost non-existent at UCSF-MAC. This is of special importance because patients with CDR <1 in our series were significantly more likely to change diagnosis after PET, which is in line with the fact that a third of diagnostic changes took place in amnestic MCI patients.
Changes in treatment were a major clinical outcome of our study. There was a clear effect on treatment in UHMV, where PET played a confirmatory role. Thus, AD treatments were initiated in many patients, mostly those with amnestic MCI, when PiB-PET was positive. This pattern has also been found in other studies in which clinicians’ decisions to start AD treatments were supported by amyloid PET results [20]. In contrast, the influence of PiB-PET on treatment was not significant for the UCSF-MAC patients. In both studies, PET scan information helped to increase diagnostic certainty, indirectly estimated by a decrease in the percentage of patients with clinical dilemmas; again, this effect was more intense in UHMV than in UCSF-MAC. The increase in the clinician’s confidence in diagnosis is a constant finding across studies assessing the clinical utility of amyloid PET. In our study, increased diagnostic confidence facilitated a more proactive attitude towards AD treatment. There is evidence in the literature that early AD treatment might be beneficial [24, 25]. Additionally, many of our patients are illiterate and have a very basic premorbid functional level; therefore, it is sometimes not straightforward to establish a clear loss of function, which may become evident to the family only relatively late. In these cases, in which functional impairment is doubtful, a positive amyloid test might reinforce the decision to start treatment.
A finding shared with the UCSF-MAC study was that clinical concordance with PiB was higher than with FDG, a statistically significant difference for classical forms of AD. Additionally, in both studies PiB results were more decisive for clinicians than FDG results, so when the PiB results were discordant with the FDG results, clinicians tended to follow the PiB. This is supported by the fact that in our full logistic regression model, clinical discordance with the FDG-PET results was not significantly associated with diagnostic change, despite a strong association in the univariate analysis. Discordance with FDG was not associated with treatment changes in either study. Our naturalistic approach is not suitable for a direct comparison between the two PET tracers. Therefore, these results must be taken with caution. Since we have no pathology data available, we are only reflecting clinicians’ behavior and not the true sensitivity or specificity of the test. However, from a qualitative point of view, we consider that FDG-PET could be very helpful in the diagnosis of complicated cases, especially those in which co-pathology is suspected.
Amyloid PET tracers approved for clinical use are still very expensive, and it is therefore relevant to provide clinicians with guidelines for rational and cost-effective use. The AUC propose that amyloid PET should be used in patients with uncertain diagnosis in three clinical scenarios: 1) MCI, 2) atypical dementia, and 3) early-onset dementia. Our data strongly support the indication of testing for MCI patients. On the one hand, we found the highest level of discordance in this group and consequently the highest levels of diagnostic change; on the other hand, treatment changes were more frequent after concordant PET results in MCI, a population in which the test mainly played a confirmatory role in decisions regarding the initiation of treatment. We found only partial evidence supporting the second scenario: the degree of discordance was significantly higher in corticobasal syndrome, an atypical syndrome, than in typical AD (p = 0.001), indicating that amyloid PET could be of more help in these patients. Our study might be underpowered to detect significant differences in other atypical cases, where numbers were small for the specific categories. We did not find any association between patient age at study entry and clinical discordance with the PET results. However, most of the patients were relatively young, as clinicians are aware of the age-related decrease in specificity, so we were unable to contrast the utility of the test with older patients [26].
Our data offer some hints about the patient profile in which the test would provide more information. In addition to PiB discordance, the main predictors of diagnostic change in the full regression model were diagnostic dilemmas, an initial diagnosis of a non-Aβ syndrome, and CDR <1. Therefore, according to our results, the archetypical patient in whom the test is more likely to be helpful is a relatively young patient (our population averaged 67.3 years old) studied at an early disease stage, in whom the main suspected diagnosis is not AD but AD cannot be ruled out in the differential diagnosis.
The study has some caveats. PET images were not rated using semiquantitative methods. However, in a previous study we compared a semiquantitative analysis, using a standardized uptake value ratio (SUVR) threshold, with a subjective assessment method and found high concordance between the two methods [21]. The retrospective design precludes a direct estimation of clinicians’ change in diagnostic confidence; we attempted to quantify this factor by the degree of clinical dilemma reduction after the test. Additionally, despite our multivariate analysis, we cannot completely separate the influence of PiB and FDG or control for the evolution of clinical symptoms or the availability of additional data at the post-PET visit. In our study, we have no neuropathological data; therefore, we are unable to contrast clinical or PET results with a gold standard. Our design follows a naturalistic approach, attempting to observe and quantify clinician behavior in real practice.
This study represents a rare opportunity to assess, using the same methodology, the differential effect of amyloid PET between a secondary and a tertiary center, supporting the hypothesis that this test plays a more relevant role in a less specialized context. There is a bias in the scientific literature toward studies coming from tertiary centers, but we think that our results, evaluating the clinical repercussion of amyloid PET in a secondary care memory unit, are more likely to be generalizable to average clinical practice. Large prospective multicenter studies like the ongoing IDEAS, including centers with diverse characteristics, are still needed to robustly evaluate the clinical contribution of amyloid PET.
