Abstract
Background:
The diagnosis of incipient symptomatic stages of early-onset dementia is challenging. Magnetic resonance imaging (MRI) is a readily accessible biomarker.
Objective:
We aim to determine the distribution and diagnostic performance of the existing atrophy visual rating scales on MRI in initial stages of the most frequent neurodegenerative early onset dementias.
Methods:
We evaluated the usefulness of visual atrophy scales in 200 subjects: 70 sporadic early-onset Alzheimer’s disease (AD) patients (48 amnestic and 22 non-amnestic), 14 patients with autosomal-dominant AD (ADAD), 25 sporadic frontotemporal dementia patients [11 with behavioral variant (bvFTD), 9 with semantic variant of primary progressive aphasia (svPPA), and 5 with non-fluent primary progressive aphasia (nfvPPA)], 7 with genetically determined FTD (genetic FTD), 25 with mild cognitive impairment due to non-degenerative disorders, and 59 healthy controls. All had MMSE≥18, 3T brain MRI, and biomarker-supported diagnosis. Two raters evaluated six frontal, temporal, and parietal scales. Inter-rater reliability and diagnostic performance, in terms of area under the receiver operating characteristic curve and balanced accuracy, were analyzed.
Results:
The best scales to discriminate AD from controls were the anterior cingulate scale for amnestic AD and the posterior atrophy scale for sporadic non-amnestic AD and ADAD. The anterior temporal scale was the best for sporadic bvFTD and svPPA, and the anterior cingulate scale was the best for nfvPPA. All scales performed well for genetic FTD. However, no scale demonstrated good performance at discriminating AD from FTD or from non-degenerative disorders.
Conclusions:
Clinicians should interpret atrophy scale assessments with caution in subjects with early-onset cognitive impairment, given that none of the evaluated scales met the requirements for a diagnostic biomarker.
INTRODUCTION
Early onset dementia (EOD) (age of onset under 65) accounts for up to 10% of dementia cases and its diagnosis is frequently challenging [1]. The higher frequency of atypical presentations entails a clinical overlap between different diseases that might lead to a relevant delay in reaching an accurate diagnosis [2].
Several imaging, genetic, and biochemical biomarkers have been developed and are included in the current Alzheimer’s disease (AD) and frontotemporal dementia (FTD) diagnostic criteria [3–6]. Structural magnetic resonance imaging (MRI) for brain atrophy evaluation is one of the most widely available biomarkers. Although volumetric quantification methods have been developed, they are complex and time-consuming, making them challenging to integrate into clinical practice. For that reason, visual assessment remains the most widely used tool for brain atrophy assessment in clinical environments.
Visual assessment has proven useful in late-onset patients [7–10]. However, the different clinical, neuropsychological, and neuropathological features of early onset patients could lead to a different diagnostic performance compared to late onset patients [11–13]. In a recent study, an extensive and well-structured rating protocol was applied to a cohort consisting mostly of EOD patients, but it included a wide range of disease stages [14]. In this context, the authors concluded that the Fronto-Insular (FI) and Medial Temporal Atrophy (MTA) scales were the best to differentiate AD and FTD from controls.
However, these scales are most needed at the time of the first evaluation, which usually takes place in the initial symptomatic stages: mild cognitive impairment (MCI) and mild dementia. We therefore aim to elucidate the diagnostic performance of visual rating assessment on MRI in the initial stages of the most frequent neurodegenerative EOD (AD and FTD) with a wide range of clinical phenotypes, in both sporadic and genetically determined cases.
METHODS
Two hundred subjects evaluated at the Alzheimer’s disease and other cognitive disorders Unit at Hospital Clínic de Barcelona were enrolled in this cross-sectional study. The study was approved by the Hospital Clínic Barcelona Ethics Committee and all participants gave written informed consent. All subjects had a clinical onset before 65 years and Mini-Mental State Examination (MMSE)≥18 and were selected from the Early-onset Dementia Cohort and the Genetic counselling program for familial dementias (PICOGEN) [12]. All of them had a neurological and neuropsychological evaluation, 3T brain MRI, and biomarker-supported diagnosis.
Patients were classified into the following groups:
Sporadic early onset AD (EOAD) (n = 70): Twenty-seven with MCI (Pfeiffer Functional Activities Questionnaire, FAQ≤6) and 43 with mild dementia (FAQ > 6). All subjects had a typical AD core cerebrospinal fluid (CSF) biomarker profile (n = 65) or positive amyloid positron emission tomography (amyloid-PET) (n = 5) and fulfilled the National Institute on Aging and Alzheimer’s Association (NIA-AA) diagnostic criteria for MCI due to AD or AD dementia. They were further classified into amnesic (A-EOAD, 48 subjects with memory impairment as the main feature in clinical and neuropsychological evaluation) and non-amnesic variants (NA-EOAD, 22 subjects with predominant language, executive, or visuospatial impairment) [3, 4].
Autosomal dominant Alzheimer’s disease (ADAD) (n = 14): Seven with MCI and seven with mild dementia due to AD. All of them carried a pathogenic mutation in the PSEN1 gene.
Sporadic FTD (n = 25): Eleven subjects fulfilled the diagnostic criteria for behavioral variant of FTD (bvFTD), five for non-fluent variant of primary progressive aphasia (nfvPPA), and nine for semantic variant of primary progressive aphasia (svPPA) [5, 6]. All FTD patients had normal AD CSF biomarker results (n = 22) or negative amyloid-PET (n = 3).
Genetic FTD (n = 7): Two subjects with C9orf72 expansion (2 bvFTD), four with GRN mutation (three nfvPPA, one bvFTD), and one case with MAPT mutation (bvFTD).
Non-degenerative MCI (n = 25): Subjects with MCI with normal AD CSF biomarkers and clinical features compatible with fibromyalgia, chronic fatigue syndrome, or psychiatric disorders such as anxiety or depression.
Controls (n = 59): Subjects with no cognitive complaints and normal AD CSF biomarker results. Forty-two controls were age-matched with sporadic EOAD and FTD patients and 17 controls were age-matched with ADAD patients.
CSF biomarkers determination
CSF levels of amyloid-β, total-tau, and phosphorylated-tau were measured using Innotest ELISAs following manufacturer’s instructions (Fujirebio, Ghent, Belgium).
Brain MRI
High-resolution T1-weighted images were acquired on a 3 Tesla scanner (Siemens Magnetom Trio, Erlangen, Germany) at the Magnetic Resonance Image Core Facility, using proprietary three-dimensional magnetization-prepared rapid-acquisition gradient echo (MPRAGE) sequences (TR = 2300 ms; TE = 2.98 ms; acquisition matrix 256×256; voxel size 1×1×1 mm; slice thickness = 1 mm; no interslice gap).
Visual rating assessment
MRI visual rating assessment was performed by two clinical dementia experts (M.B., N.F.) blinded to all clinical information. They evaluated the following scales: anterior temporal scale (AT), medial temporal atrophy (MTA), posterior atrophy (PA), and anterior atrophy scales including orbitofrontal (OF), anterior cingulate (AC), and fronto-insular (FI) regions [16]. Raters scored the two sides separately and the final score was the mean of the two sides. These scales were assessed for every subject following the detailed rating protocol previously developed and validated in a pathologically proven cohort by Harper et al. [14]. Different degrees of atrophy were scored according to the indications and reference images provided by this rating protocol [14]. Scores ranged from 0 to 3 in the OF, AC, FI, and PA scales, and from 0 to 4 in the AT and MTA scales.
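The scoring rule described above (each side rated separately, final score equal to the mean of both sides, with scale-specific maximum values) can be illustrated with a minimal sketch. The names `SCALE_MAX` and `final_score` are hypothetical and introduced only for illustration; they are not part of the rating protocol or of any software used in the study.

```python
# Maximum rating per scale, as stated in the text:
# 0 to 3 for OF, AC, FI, and PA; 0 to 4 for AT and MTA.
SCALE_MAX = {"OF": 3, "AC": 3, "FI": 3, "PA": 3, "AT": 4, "MTA": 4}


def final_score(scale: str, left: int, right: int) -> float:
    """Return the mean of the left and right ratings for one scale.

    Hypothetical helper: it rejects ratings outside the range allowed
    for the given scale before averaging the two sides.
    """
    max_score = SCALE_MAX[scale]
    for side in (left, right):
        if not 0 <= side <= max_score:
            raise ValueError(
                f"{scale} ratings must lie between 0 and {max_score}, got {side}"
            )
    return (left + right) / 2


if __name__ == "__main__":
    # Illustrative ratings only, not study data.
    print(final_score("MTA", 2, 3))
```

Because the final score is an average of two integer ratings, it can take half-point values, which is why cut-offs for these scales are not necessarily whole numbers.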
Statistical analysis
Statistical analysis was conducted using Stata/IC 14.2 (College Station, TX, USA). Categorical data were analyzed by χ2 test and quantitative data by ANOVA and Student’s t-test with Bonferroni post-hoc procedure. Inter-rater and intra-rater reliability of the rating scales was determined by the intraclass correlation coefficient (ICC), using a two-way random-effects model with absolute agreement for single measures [ICC(2,1)] and average measures [ICC(2,k)]. Diagnostic accuracy was estimated using the area under the curve (AUC) of the receiver operating characteristic (ROC) curve. Balanced accuracy was calculated as 0.5×(sensitivity + specificity). Logistic regression models were fitted to determine the AUC for combinations of multiple visual rating scores.
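As an illustration of the two accuracy measures named above, the following sketch computes balanced accuracy from a confusion table and the AUC as the probability that a randomly chosen patient scores higher than a randomly chosen control (ties counted as one half), which is equivalent to the area under the empirical ROC curve. This is not the Stata procedure used in the study; the function names and the example values are assumptions made for illustration.

```python
def balanced_accuracy(tp: int, fn: int, tn: int, fp: int) -> float:
    """Balanced accuracy = 0.5 x (sensitivity + specificity)."""
    sensitivity = tp / (tp + fn)  # proportion of patients rated abnormal
    specificity = tn / (tn + fp)  # proportion of controls rated normal
    return 0.5 * (sensitivity + specificity)


def auc(patient_scores, control_scores) -> float:
    """Area under the empirical ROC curve via pairwise comparison.

    Each patient/control pair contributes 1 if the patient score is
    higher, 0.5 if the scores are tied, and 0 otherwise.
    """
    wins = 0.0
    for p in patient_scores:
        for c in control_scores:
            if p > c:
                wins += 1.0
            elif p == c:
                wins += 0.5
    return wins / (len(patient_scores) * len(control_scores))


if __name__ == "__main__":
    # Hypothetical counts and scores, not study data.
    print(balanced_accuracy(tp=40, fn=10, tn=45, fp=5))
    print(auc([2, 3, 1.5], [0, 1, 1.5]))
```

The pairwise formulation handles the many ties that arise with coarse ordinal scales such as these, where scores only take a handful of distinct values.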
RESULTS
Demographics
Demographic data are shown in Table 1. The ADAD subjects and their paired controls were younger than the other groups (p < 0.01). The proportion of women was higher in controls. No differences in disease duration were found between groups. EOAD and bvFTD patients had lower MMSE scores than controls, and A-EOAD and bvFTD patients had lower MMSE scores than non-degenerative MCI subjects (p < 0.01).
Table 1. Demographics and mean visual rating scores
a: Amnesic AD versus behavioral variant FTD; b: Amnesic AD versus Semantic dementia; c: Amnesic AD versus genetic FTD; d: Amnesic AD versus controls; e: Amnesic AD versus non-degenerative MCI; f: Non-amnesic AD versus behavioral variant FTD; g: Non-amnesic AD versus Semantic dementia; h: Non-amnesic AD versus genetic FTD; i: Non-amnesic AD versus controls; j: Non-amnesic AD versus non-degenerative MCI; k: Behavioral variant FTD versus Semantic dementia; l: Behavioral variant FTD versus Non-fluent aphasia; m: Behavioral variant FTD versus controls; n: Behavioral variant FTD versus non-degenerative MCI; o: Semantic variant FTD versus Non-fluent aphasia; p: Semantic variant FTD versus genetic FTD; q: Semantic variant FTD versus controls; r: Semantic variant FTD versus non-degenerative MCI; s: Non-fluent aphasia versus genetic FTD; t: Genetic FTD versus controls; u: Genetic FTD versus non-degenerative MCI; v: Genetic FTD versus behavioral variant; w: MCI versus controls; x: Familial AD versus younger controls; y: Non-fluent aphasia versus amnesic AD. *Indicates significance at p < 0.01, otherwise p < 0.05. sA-EOAD, sporadic A-EOAD; sNA-EOAD, sporadic NA-EOAD; Ndg-MCI, non-degenerative MCI; NA, not applicable.
Inter- and intra-rater reliability of visual rating scores
The kappa index showed substantial inter-rater agreement on all the scales. The ICC values for intra- and inter-rater agreement are included as Supplementary Material (Supplementary Tables 1 and 2).
Mean rating scores per diagnostic group
Detailed rating score data are summarized in Table 1 and Fig. 1. A-EOAD had higher scores than controls in all scales (p < 0.05) and higher scores than non-degenerative MCI in the AC and FI scales (p < 0.05). NA-EOAD showed higher scores than controls in PA, AT, and OF (p < 0.05) and higher scores than non-degenerative MCI in PA and AT (p < 0.05). In the comparison between EOAD subgroups, a tendency toward a higher PA score in NA-EOAD and a higher MTA score in A-EOAD was found, although it did not reach statistical significance. In ADAD, MTA (p < 0.05) and PA (p < 0.01) scores were higher than in controls.

Fig. 1. Distribution of mean rating scores for each group.
Compared to controls and non-degenerative MCI, bvFTD obtained higher scores in AC, FI, OF, AT, and MTA (p < 0.05) and svPPA in FI (p < 0.05) and AT (p < 0.01). No differences were found for nfvPPA. All scale scores were higher in genetic FTD (p < 0.05).
The bvFTD and svPPA groups showed higher scores in MTA and AT (p < 0.01) than the EOAD groups. Genetic FTD had higher scores in the FI, OF, and AT scales (p < 0.05) than A-EOAD, and in the AC, FI (p < 0.01), and AT (p < 0.05) scales than NA-EOAD. The detailed distribution of rating scores is included as Supplementary Figure 1.
Rating scales diagnostic performance
The detailed diagnostic performance of the scales for each group comparison is shown in Table 2. Diagnostic performance refers to the comparison against controls unless otherwise specified. In A-EOAD, AC (AUC = 0.80), followed by FI and MTA (AUC = 0.77), showed good diagnostic accuracy. In NA-EOAD and ADAD, the PA scale was the best one (AUC > 0.80). However, none of the scales showed acceptable diagnostic accuracy for discriminating EOAD from FTD or from non-degenerative MCI (AUC≤0.75).
Table 2. Diagnostic performance of brain atrophy scales for each group comparison
a: Sensitivity and specificity of scales with AUC over 0.75 and balanced accuracy over 70% are shown. The optimal cut-off for each scale is the one with the highest balanced accuracy. Cut-offs should be interpreted as: < cut-off = normal, ≥ cut-off = abnormal. Se, sensitivity; Sp, specificity; Ndg-MCI, non-degenerative MCI.
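The cut-off rule stated in this footnote (a score at or above the cut-off is abnormal, and the optimal cut-off is the one that maximizes balanced accuracy) can be sketched as a simple search over the observed score values. The function name `best_cutoff`, the tie-breaking choice (the lowest cut-off wins), and the example scores are assumptions for illustration and do not reproduce the study's analysis.

```python
def best_cutoff(patient_scores, control_scores):
    """Return (cutoff, balanced_accuracy, sensitivity, specificity).

    Every observed score is tried as a candidate cut-off. A score >= cutoff
    is classified as abnormal and a score < cutoff as normal. When several
    cut-offs reach the same balanced accuracy, the lowest one is kept
    (an arbitrary choice made for this sketch).
    """
    candidates = sorted(set(patient_scores) | set(control_scores))
    best = None
    for cutoff in candidates:
        sensitivity = sum(s >= cutoff for s in patient_scores) / len(patient_scores)
        specificity = sum(s < cutoff for s in control_scores) / len(control_scores)
        ba = 0.5 * (sensitivity + specificity)
        if best is None or ba > best[1]:
            best = (cutoff, ba, sensitivity, specificity)
    return best


if __name__ == "__main__":
    # Hypothetical mean (left/right) scores, not study data.
    patients = [2, 2, 3, 1, 2.5, 1.5]
    controls = [0, 0.5, 1, 0.5, 1, 0]
    cutoff, ba, se, sp = best_cutoff(patients, controls)
    print(f"cut-off {cutoff}: BA={ba:.2f}, Se={se:.2f}, Sp={sp:.2f}")
```

Maximizing balanced accuracy weights sensitivity and specificity equally, which keeps the chosen cut-off from being driven by unequal group sizes.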
Diagnostic performance in bvFTD was very good (AUC > 0.90) for AT, OF, and AC scales and good for FI, MTA, and PA (AUC > 0.75). In svPPA, diagnostic performance was very good (AUC > 0.90) for AT and MTA scales and good (AUC > 0.75) for FI and OF, while in nfvPPA, only the AC scale showed good diagnostic performance (AUC = 0.78). All scales showed a very good (AUC > 0.90) or excellent (AUC > 0.97) diagnostic performance in genetic FTD.
Logistic regression models showed that combining different visual scales improved the AUC in the following group comparisons: EOAD versus healthy controls (HC) (AUC 0.88), A-EOAD versus HC (0.88), and FTD versus HC (0.95) (Table 3).
Table 3. Diagnostic accuracy of brain atrophy rating scales in combination
The combinations of scales with better diagnostic accuracy than the scales evaluated separately are shown for each diagnostic comparison. Se, sensitivity; Sp, specificity; BA, balanced accuracy; Ndg-MCI, non-degenerative MCI.
DISCUSSION
In our EOD cohort, atrophy visual rating proved reliable when based on a well-structured evaluation [14]. The best scales for distinguishing AD and FTD from controls differed according to the clinical phenotype. The AC scale was the best for A-EOAD and the PA scale for NA-EOAD, highlighting the limited diagnostic accuracy of MTA in EOAD. In contrast, AT was the best scale to discriminate bvFTD and svPPA from controls, and the AC scale was the best for nfvPPA. All the scales had a very good diagnostic performance for genetic FTD.
Previous studies have investigated the diagnostic performance of brain atrophy scales in dementia, most of them focused on late-onset dementias and without biomarker-supported diagnosis [7–10, 20]. Only one recent study examined a pathologically proven sample mostly focused on EOD, but in advanced clinical stages (EOAD mean MMSE was 16.6±6.3) [14]. Furthermore, another previous study with no biomarker-supported diagnosis found better diagnostic performance of visual rating scales in moderate-severe AD cases than in mild ones [20]. To our knowledge, our study is the first to evaluate visual brain atrophy assessment exclusively in EOD in the initial symptomatic stages (MCI and mild dementia, MMSE 23.3±3.5) in a well-characterized cohort with biological confirmation of the disease. We consider this of interest because this is the timeframe in which these subjects seek medical advice, and thus precisely when these radiological biomarkers should matter most for establishing an accurate diagnosis. Additionally, in clinical practice the diagnostic dilemma is often not between AD or FTD and healthy controls but between the different neurodegenerative dementias (AD versus FTD) and non-degenerative cognitive impairment; for this reason, a group of non-degenerative MCI was also included in the evaluation.
FTD and AD showed higher atrophy scores than non-degenerative MCI and controls. Higher atrophy scores were found in FTD, and they were especially remarkable in the genetic cases, in accordance with recent data reported on a larger genetic FTD cohort [21]. As expected, all FTD variants except nfvPPA showed a predominant atrophy pattern in the scales reflecting frontotemporal damage, with relatively less atrophy in posterior areas (i.e., PA scale) [22].
Different atrophy spreading patterns between AD phenotypes have been defined. While amnestic AD is characterized by early atrophy of the hippocampus and medial temporal lobes before spreading to the neocortex, non-amnestic AD shows relative sparing of the hippocampus, and other regions such as parietal areas are more affected [23]. In our study, a tendency toward greater atrophy on the PA scale in NA-EOAD and on the MTA scale in A-EOAD was found. This predominant posterior atrophy in non-amnestic forms compared to the typical amnestic ones agrees with their clinical phenotype [23, 24].
Furthermore, the underlying pathological changes of the different dementia types are related to patterns of volume loss as well as to the clinical phenotype. Regarding AD, recent data obtained from volumetric assessment in a pathologically confirmed cohort reported differential atrophy patterns between early and late-onset AD, with more involvement of medial temporoparietal and medial temporal regions, respectively [25, 26]. Regarding FTD, the relationship between phenotypes, pathological changes, and patterns of atrophy is even more difficult to establish due to the overlap of these factors. However, it has been described that FTD variants with underlying tau and TDP43 pathologies could have similar patterns of frontotemporal atrophy [25]. Diagnostic performance was good or excellent in most scales for FTD variants, except for nfvPPA. The good diagnostic accuracy achieved by the MTA scale in these FTD variants was especially notable, in contrast with its lower discriminative capacity in EOAD [13, 27]. This finding is especially relevant for the amnestic phenotypes, in which more hippocampal atrophy and higher MTA ratings would be expected. In addition, the MTA scale was not able to discriminate AD from FTD or from non-degenerative MCI. Taking into account that MTA is considered a hallmark of AD and a biomarker in the current AD diagnostic criteria, it is important to note its low diagnostic performance in EOAD. This finding is consistent with previous studies highlighting that hippocampal atrophy would not meet the consensus requirement for a valid AD biomarker of at least 80% sensitivity and specificity [13, 28].
Although the functions of the anterior cingulate and their relation with AD remain understudied, previous reports have described its volume loss in AD [29]. It has been related to executive function and to features such as unawareness of memory deficits, flexible thinking, and apathy [30]. In line with this, the AC scale was the best for A-EOAD in our cohort; however, its diagnostic performance was still disappointing (66% sensitivity, 83% specificity).
The PA scale showed the best diagnostic performance for NA-EOAD and ADAD. Since parietal atrophy is more common in atypical AD presentations and is not characteristic of FTD, the PA scale was expected to obtain good diagnostic performance in NA-EOAD, but surprisingly it was not enough to meet the consensus criteria for AD biomarkers [28].
Combining different scales improved the diagnostic accuracy for some diagnostic groups (EOAD, A-EOAD, and FTD groups) when compared to controls. Unfortunately, the multiple visual ratings did not demonstrate an added value when compared to other degenerative diseases (i.e., EOAD versus FTD, FTD versus EOAD).
Since none of the scales reached the AD biomarker criteria, clinicians should be wary of over-relying on the presence of atrophy at the first evaluation of a subject with early-onset cognitive impairment. More broadly, we might reconsider the role of brain atrophy assessment in the current AD diagnostic criteria, defining whether it should be a supportive criterion for clinical diagnosis instead of being categorized as a biological AD biomarker in early-onset patients. This would make it possible to weigh the added diagnostic value of visual atrophy assessment against other more specific AD biomarkers (i.e., total and phosphorylated tau in CSF), which would be more in line with what is defined in the new AD research framework [31].
The main limitation of our study is the relatively small sample size in some of the FTD subgroups, although, as far as we are aware, this is the largest reported cohort focused on early stages of EOAD and FTD patients with biomarker-supported diagnosis.
In summary, although we found differences in visual scale scores between groups, the scales showed little utility in the differential diagnosis of EOD in early stages. Furthermore, the results of the present study showed that none of the scales met the requirements for a valid diagnostic biomarker. Further studies in other well-characterized cohorts evaluating the usefulness of single visual rating scales in early stages of EOD are needed to confirm our data.
ACKNOWLEDGMENTS
The authors thank patients, their relatives, and healthy controls for their participation in the research. This work was supported by Spanish Ministry of Economy and Competitiveness-Instituto de Salud Carlos III and Fondo Europeo de Desarrollo Regional (FEDER), Unión Europea, “Una manera de hacer Europa” [PI14/00282 to Dr. A. Lladó], PERIS 2016–2020 Departament de Salut de la Generalitat de Catalunya [SLT002/16/00408 to Dr. Sanchez-Valle]. Dr. Neus Falgàs received funding from Hospital Clínic Barcelona [Ajut Josep Font]. Dr. Anna Antonell [PERIS 2016–2020 SLT002/16/00329] and Dr. Albert Lladó [PERIS SLT008/18/00061] received funding from Departament de Salut de la Generalitat de Catalunya.
