Abstract
Background
The amyloid, tau, neurodegeneration (ATN) framework provides a biological staging model of Alzheimer's disease (AD) using magnetic resonance imaging (MRI), cerebrospinal fluid (CSF), or positron emission tomography (PET) biomarkers. MRI, being non-invasive, accessible, and cost-effective, holds promise as a biomarker.
Objective
To evaluate the utility of MRI-based automated brain volumetry in classifying cognitive impairment severity—cognitively unimpaired (CU), mild cognitive impairment (MCI), and dementia—as well as ATN profiles, independently.
Methods
We analyzed 394 subjects from the Alzheimer's Disease Neuroimaging Initiative. First, we assessed how well MRI volumetry stratifies cognitive stages. Next, we tested its ability to distinguish A+T+N+ from A-T-N- individuals and compared this with its ability to classify clinical stages in the same subjects. Finally, we evaluated its predictive power for cognitive severity in A+T+ and A-T- subgroups, irrespective of neurodegeneration (N), to examine the added value of volumetry across AT profiles.
Results
MRI volumetry showed comparable performance to established biomarkers in identifying CU, MCI, and dementia, and offered complementary value when combined with phosphorylated tau. Hippocampal and temporal gray matter volumes distinguished A+T+N+ from A-T-N- classes with accuracies of 0.81 and 0.78, respectively. In A+T+ versus A-T- comparisons, the highest classification performance for cognitive severity was observed in the A-T- group.
Conclusions
MRI-based brain volumetry can effectively classify cognitive stages and distinguish biological subtypes in AD. It is a promising tool for clinical staging and predicting impairment severity, especially when used alongside phosphorylated tau.
Introduction
Classification of cognitive decline in aging relies on two dimensions: (i) the severity of the disease (cognitively unimpaired (CU), mild cognitive impairment (MCI), and dementia), in accordance with the standard criteria of the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5), and (ii) the etiology of the disease, characterized by its biological definition regardless of its severity (i.e., clinical manifestation). 1
The clinical definition of Alzheimer's disease (AD) has been widely debated,2,3 in particular due to its phenotypical 1 and biological 4 diversity. Several biological processes contribute to the progression of dementia that may not necessarily be attributed to AD. Therefore, several neuropathological studies have shown the importance of describing this disease from a biological perspective. 1 The amyloid (A), tau (T), and neurodegeneration (N) framework represents an important step towards the stratification of the AD continuum based on biomarkers, helping to distinguish highly probable AD from non-AD causes of cognitive impairment, also referred to as suspected non-Alzheimer pathology (SNAP).5,6 According to this scheme, a subject is considered A+ if abnormal amyloid-β42 peptide levels are measured in the cerebrospinal fluid (CSF) or abnormal uptake is observed on amyloid positron emission tomography (PET) imaging; T+ if the CSF concentration of phosphorylated tau 181 (pTau) or tau accumulation on tau-PET imaging is abnormal; N+ if total tau protein in CSF or the metabolic signal measured with fluorodeoxyglucose (FDG) PET lies beyond the specific cut-offs of “biologically healthy” standards.7–11 Conversely, a negative suffix (i.e., A-, T-, N-) indicates absence or normal values for these biomarkers.
Recently, the Alzheimer's Association workgroup introduced updated guidelines emphasizing a biological approach to AD diagnosis. 12 These criteria advocate for the use of core A and T biomarkers to identify AD pathology, even in individuals without clinical symptoms. Concurrently, the International Working Group proposed its own revised definitions, focusing on a combined clinical and biological framework recommending an AD diagnosis when there is both a clinical presentation of cognitive impairment and supportive A and T biomarker evidence. 13
Complementary to total tau CSF biomarker and FDG PET, atrophy assessed with structural magnetic resonance imaging (MRI) is considered an important surrogate for neurodegeneration. 14 Voxel-based volumetry has been widely used to explore patterns of brain atrophy in AD by enabling voxelwise comparisons across participants and providing a quantitative and unbiased approach to explore structural brain differences.4,15–17 Alternatively, volume-based approaches can reach comparable accuracy for disease prediction, even at early stages of AD. 3 One of the existing toolboxes for fully automated brain segmentation is the MorphoBox research application, which estimates volumes from T1-weighted brain MR scans and proved to be effective in discriminating clinically-defined MCI and dementia on a standardized data set from the Alzheimer's Disease Neuroimaging Initiative (ADNI). 18
Using the same data set, this work aims to investigate the potential of automated brain volumetry to determine the clinico-biological spectrum of AD characterized either by cognitive staging or by the ATN classification scheme, 19 using three different experiments (Figure 1). First, we stratify the dataset by severity of cognitive decline into three clinically-defined diagnostic groups, i.e., CU, MCI, and dementia, and assess the power of volumetry to discriminate dementia and MCI patients, respectively, from CU. The objective of this first experiment is to reproduce the previous work of Schmitter et al. (2015) 18 and extend it with an evaluation of the volumetric performance in comparison to CSF biomarkers. Despite being invasive, these biomarkers are regarded as the state-of-the-art approach for identifying AD pathology in clinical settings. Therefore, we compare volumetry with biology and explore the potential benefits of combining the two, as more and more studies show the possibility of detecting biomarkers in the plasma.20,21 Second, we focus on the biological substrate and only consider the extremes of the biomarker spectrum: triple positive (A+T+N+) and triple negative (A-T-N-) participants. We assess the power of volumetry to discriminate these two subsets, hypothesizing that these biologically-defined AD and non-AD categories would have very distinct brain volumetry profiles. Lastly, we return to clinical severity by stratifying the dataset into two subgroups according to their AT profiles, regardless of the neurodegeneration axis (N), and evaluating the potential of volumetry to tease apart the different stages of cognitive decline (CU, MCI, dementia).

Summary of the three experiments performed to investigate the discrimination power of volumetry. In order, the experiments evaluate (1) how well different brain volumes separate a normal cognitive state from a mild decline or from a severe impairment; (2) the capacity of volumes to identify the two extremes of the AD spectrum defined by the biological axis (specifically A+T+N+ and A-T-N- subjects) compared with their power to discriminate the clinical profiles; (3) the ability of volumetry to tease apart clinical severity in two biologically defined groups (i.e., A+T+ and A-T-).
For better understanding of this study and its various stratifications, the following sections have been structured according to the corresponding experiments as described in Figure 1.
Methods
Data and experiments overview
Data used in this study were obtained from the ADNI database (adni.loni.usc.edu), launched in 2003 as a public-private partnership led by principal investigator Prof. Michael W. Weiner. The primary goal of ADNI has been to test whether serial MRI, PET, other biological markers, and clinical and neuropsychological assessment can be combined to measure the progression of MCI and early AD. For more information, visit www.adni-info.org.
In this work, we use the “complete annual year two visits” 1.5T standardized ADNI dataset. 22 Participants were divided according to the severity of the cognitive impairment into dementia, MCI or CU using the procedures described in Petersen et al. (2010): 23 inclusion criteria included a Hachinski Ischemic Score of 4 or less, and participants were followed for 12 months using standard cognitive and functional measures typical of clinical trials. Additionally, they were classified into ATN groups according to positivity/negativity of their CSF or PET biomarkers for amyloid-β (“A”, CSF Aβ42 or PET AV45), tau (“T”, CSF phosphorylated tau 181 (pTau)) and neurodegeneration (“N”, CSF total tau (tTau) or FDG PET).
Overall, we consider 394 subjects from the ADNI-1 dataset who have all the biomarkers needed to stratify the data in several ways. Of these, 111 subjects were CU, 186 were diagnosed with MCI and 97 with dementia (see Table 1 for a detailed overview of the dataset according to the ATN classification; the number of subjects considered positive only on the PET markers is given in parentheses).
Number of subjects per ATN group and age (average and standard deviation) and clinical diagnosis (dementia, MCI: mild cognitive impairment, CU: cognitively unimpaired). Note that a subject can be positive (or negative) on each of the 3 axes (A, T, N), which is why the numbers do not sum to the total number of subjects. Among the A+ subjects, 17 tested positive for the AV45 PET marker, while among the N+ subjects, 25 tested positive for FDG PET.
We rate biomarkers’ negativity (-, i.e., normal) and positivity (+, i.e., abnormal) according to corresponding cut-offs following literature for both CSF6,10,11,24—using Elecsys method 22 —and PET biomarkers.7,8 We apply a 10% gray zone to consider participants still “biologically healthy”, because of the variability of the different cut-offs according to the analysis methodology used. We consider positivity in each one of ATN dimensions if the subject's biomarkers values do not respect either CSF or PET biomarkers cut-offs (see Supplemental Table 1 for more details).
To have a comprehensive description of the ATN classification, we consider a subject A+ if the Aβ42 (CSF biomarker) concentration values are smaller than the threshold, or the radioligand florbetapir AV45 uptake (PET imaging) is above the conventional threshold; T+, if CSF pTau181 (CSF biomarker) concentration is higher than the threshold, while, as for the neurodegeneration axis (N), we categorize a subject N+ if thresholds are not met in terms of tTau (CSF biomarker) or the radiotracer FDG (PET). 25
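The labelling rule described above can be sketched in a few lines of Python. This is only an illustration: the cut-off values below are hypothetical placeholders (the study's actual cut-offs are in its Supplemental Table 1), the function names are invented for this sketch, and the 10% gray zone is read here as a tolerance band beyond each cut-off inside which a subject is still treated as normal.

```python
# Hypothetical cut-offs for illustration only; the study's actual values
# are listed in its Supplemental Table 1 and are not reproduced here.
CUTOFFS = {
    "abeta42": 1000.0,  # CSF Abeta42: abnormal when BELOW the cut-off
    "av45": 1.10,       # amyloid PET (AV45) uptake: abnormal when ABOVE
    "ptau": 25.0,       # CSF pTau181: abnormal when ABOVE
    "ttau": 300.0,      # CSF total tau: abnormal when ABOVE
    "fdg": 1.20,        # FDG PET signal: abnormal when BELOW (hypometabolism)
}
# Assumed reading of the gray zone: values within 10% beyond a cut-off
# are still counted as normal.
GRAY_ZONE = 0.10


def is_abnormal(value, cutoff, abnormal_when):
    """Return True/False, or None when the measurement is missing."""
    if value is None:
        return None
    if abnormal_when == "below":
        return value < cutoff * (1.0 - GRAY_ZONE)
    return value > cutoff * (1.0 + GRAY_ZONE)


def any_positive(*flags):
    """An axis is positive if any available marker on it is abnormal."""
    return any(flag for flag in flags if flag is not None)


def atn_profile(abeta42=None, av45=None, ptau=None, ttau=None, fdg=None):
    """Build a label such as 'A+T-N+' from CSF and PET measurements."""
    a = any_positive(is_abnormal(abeta42, CUTOFFS["abeta42"], "below"),
                     is_abnormal(av45, CUTOFFS["av45"], "above"))
    t = any_positive(is_abnormal(ptau, CUTOFFS["ptau"], "above"))
    n = any_positive(is_abnormal(ttau, CUTOFFS["ttau"], "above"),
                     is_abnormal(fdg, CUTOFFS["fdg"], "below"))
    return "".join(axis + ("+" if flag else "-")
                   for axis, flag in (("A", a), ("T", t), ("N", n)))


print(atn_profile(abeta42=600.0, ptau=40.0, ttau=450.0))
print(atn_profile(abeta42=1500.0, av45=1.3, ptau=15.0, ttau=200.0))
```

Taking the logical OR of the CSF and PET markers on each axis mirrors the statement that a subject is positive when either marker fails its cut-off; missing measurements are simply ignored.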
In experiment 1, the regional brain volumes of all 394 subjects are used to predict the three clinical stages representing the severity of the cognitive impairment: dementia, MCI, and CU. The volumes of regions of interest estimated by the MorphoBox research application are used as features and compared to the CSF biomarkers in terms of classification power. Subsequently, pTau is combined with each volume, extending the univariate feature space to a two-dimensional one.
In experiment 2, we focus on the biological axis and again use the regional brain volumes to discriminate two biological labels: triple positivity (A+T+N+) and triple negativity (A-T-N-). We define these two sets of subjects as ground truth using CSF biomarkers and PET imaging values and evaluate the discrimination power of brain volumes for characterizing the biological axis, in particular comparing their ability to distinguish A+T+N+ from A-T-N- subjects with their ability to detect clinical impairment (dementia and MCI versus unimpaired). To perform a fair comparison between the biological and clinical severity axes, we restrict both classifications to the 242 participants with triple positivity or triple negativity (Figure 2(a)), as shown for the clinical assessment in Figure 2(b).

Experiment 2: cohort characteristics with respect to (a) biological, namely ATN classification, and (b) clinical severity (CU: cognitively unimpaired, MCI: mild cognitive impairment, dementia: Alzheimer's disease dementia). In the latter classification, we indicate which of the triple positive subjects are cognitively normal (yellow box) and which are not (red). This figure represents the data stratification for the results shown in Figure 5 (colors are visible in the online version).
Finally, in the last experiment, the whole dataset (N = 394) is split into two subsamples according to paired A and T positivity or negativity (Figure 3): in this scenario, participants are included irrespective of their clinical assessment to capture the severity of cognitive impairment ranging from none to mild and severe.

Experiment 3: cohort characteristics when separating A+T+ participants from A-T-. In group 2 (blue box) there are 6 subjects clinically diagnosed with dementia who have negative biomarkers on both the A and T axes: these cases may correspond to non-AD diagnoses such as limbic-predominant age-related TDP-43 encephalopathy (LATE), vascular dementia, frontotemporal dementia (FTD) or Lewy body dementia (LBD) (colors are visible in the online version).
Image processing and analysis
Volumetry-based discrimination (experiment 1)
First, we derive 120 brain volume estimates from the 3D T1-weighted MPRAGE images of the entire group of 394 subjects using the MorphoBox research application as detailed in Schmitter et al. (2015). 18 The analysis mainly focuses on regions that have proved to be linked with AD-related brain atrophy, namely hippocampus, ventricles, temporal gray matter, and additionally CSF, total gray and white matter (see Supplemental Table 2 for a detailed list of selected regions as features of interest). For all experiments, we consider volumetry detrended for age18,26 to discard any potential confounding effect due to aging. Using a univariate logistic regression, we evaluate the performance of classifying the clinical stages considering separately each normalized regional volume with respect to Aβ42, pTau, and tTau values. Given the discrimination power of pTau, we then combine each normalized volume with pTau values. Since the number of features is typically much smaller than the number of samples, the classification problem is not linearly separable. We therefore employ a support vector machine (SVM) model with a polynomial kernel to categorize the clinical groups. The kernel type as well as the regularization penalty coefficient and class weight are considered as hyper-parameters optimized with a nested cross-validation (see the Supplemental Material).
Training is performed with a stratified 5-fold outer loop and a 3-fold inner loop, so that each outer fold leaves 20% of the data for testing.
All hyper-parameters are optimized with a grid-search in the inner loop of the nested cross-validation on the training set. Receiver operating characteristic (ROC) curves are used to compare performances against the null-model performance, based on 1000 iterations of label permutation.
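To make the splitting scheme concrete, the following sketch generates the index sets of a nested, stratified cross-validation with 5 outer and 3 inner folds. It is a minimal illustration using only the Python standard library, not the pipeline used in the study; the helper names are invented here, and the model fitting and grid search that would run on each split are omitted.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_folds, seed=0):
    """Split sample indices into n_folds lists with similar class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin across folds.
        for position, idx in enumerate(indices):
            folds[position % n_folds].append(idx)
    return folds


def nested_splits(labels, outer=5, inner=3, seed=0):
    """Yield (inner_train, inner_validation, outer_test) index lists."""
    outer_folds = stratified_folds(labels, outer, seed)
    for k, test_idx in enumerate(outer_folds):
        train_idx = [i for j, fold in enumerate(outer_folds) if j != k for i in fold]
        train_labels = [labels[i] for i in train_idx]
        # Inner folds are positions within train_idx, never touching test_idx.
        for val_positions in stratified_folds(train_labels, inner, seed + k + 1):
            val_set = set(val_positions)
            inner_val = [train_idx[p] for p in val_positions]
            inner_train = [i for p, i in enumerate(train_idx) if p not in val_set]
            yield inner_train, inner_val, test_idx


# Class sizes taken from the cohort description (111 CU, 186 MCI, 97 dementia).
labels = ["CU"] * 111 + ["MCI"] * 186 + ["dementia"] * 97
splits = list(nested_splits(labels))
print(len(splits))  # 15 inner splits: 5 outer folds x 3 inner folds
```

Hyper-parameters would be chosen on the inner validation sets only, and the held-out outer fold (roughly 20% of the subjects) would be used once to report performance.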
ATN triple positivity and negativity (experiment 2)
In this section, we consider the subset of triple positive (A+T+N+) and triple negative (A-T-N-) subjects in order to compare the performance of a machine learning classifier in discerning biological labels with respect to clinical ones. Only 242 subjects are taken into account, discarding those with mixed positivity and negativity.
Using a univariate logistic regression, we evaluate the performance of the classification considering each normalized regional volume. The model is trained following a nested cross-validation (stratified 5-folds as outer loop and 3-folds for the inner), leaving 20% of the data for testing.
Next, we evaluate the permutation test score that generates a null distribution by calculating the accuracy of the classifier on 1000 permutations of the dataset, where features remain in the same order but labels undergo different permutations. This is the distribution for the null hypothesis which states there is no dependency between the features and labels. An empirical p-value is then calculated as the percentage of permutations for which the score obtained is greater than the score obtained using the original data. Also in this case, a 5-fold cross validation is applied. See the Supplemental Material for further details.
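The permutation procedure can be illustrated with a short standard-library sketch. It is a simplification of what is described above: it scores a single feature directly with a rank-based AUC rather than running a cross-validated classifier, and it uses the common "+1" correction so that the empirical p-value is never exactly zero. Both choices, as well as the synthetic data, are assumptions made for this example.

```python
import random


def auc(scores, labels):
    """Rank-based AUC: chance that a positive outscores a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def permutation_p_value(scores, labels, n_permutations=1000, seed=0):
    """Compare the observed AUC with AUCs obtained after shuffling the labels."""
    rng = random.Random(seed)
    observed = auc(scores, labels)
    shuffled = list(labels)
    null_scores = []
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # features keep their order, labels are permuted
        null_scores.append(auc(scores, shuffled))
    exceed = sum(s >= observed for s in null_scores)
    p_value = (exceed + 1) / (n_permutations + 1)
    return observed, null_scores, p_value


# Synthetic example: 20 negatives and 20 positives with a shifted mean.
rng = random.Random(42)
labels = [0] * 20 + [1] * 20
scores = [rng.gauss(float(y), 1.0) for y in labels]
observed, null_scores, p_value = permutation_p_value(scores, labels)
print(f"observed AUC = {observed:.2f}, empirical p = {p_value:.4f}")
```

The list `null_scores` is the null distribution; a histogram of it with the observed score marked gives the kind of plot referred to for the Supplemental Material.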
A multivariate approach is explored to evaluate the classification performance combining multiple brain volumes: hippocampus (left and right), ventricles, total white and gray matter, total CSF, cortical gray matter, temporal white and gray matter (left and right), frontal white and gray matter (left and right), third and fourth ventricle.
Training has been performed following the same procedure: a nested cross-validation (stratified 5-folds as outer loop and 3-folds for the inner), leaving 20% of the data for testing. The optimized and chosen hyperparameters are reported in the Supplemental Material.
Subsequently, the classifier's performance in distinguishing between the two biological profiles (triple positivity and negativity) and the clinical severity is compared using a bootstrap method with confidence intervals, and a permutation test is conducted to determine the statistical difference between the AUCs of the biological and clinical class distinctions.
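A percentile bootstrap is one simple way to attach a confidence interval to an AUC. The sketch below is a generic illustration rather than the study's exact procedure: the resampling scheme, the 90% level (matching the intervals shown in the figures) and the synthetic data are assumptions of this example.

```python
import random


def auc(scores, labels):
    """Rank-based AUC: chance that a positive outscores a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(scores, labels, n_boot=1000, level=0.90, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    rng = random.Random(seed)
    n = len(scores)
    estimates = []
    while len(estimates) < n_boot:
        sample = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        ys = [labels[i] for i in sample]
        if 0 < sum(ys) < n:  # the AUC needs both classes in the resample
            estimates.append(auc([scores[i] for i in sample], ys))
    estimates.sort()
    lo_idx = round((1 - level) / 2 * n_boot)
    hi_idx = round((1 + level) / 2 * n_boot) - 1
    return estimates[lo_idx], estimates[hi_idx]


# Synthetic scores: 50 negatives and 50 positives with a shifted mean.
rng = random.Random(1)
labels = [0] * 50 + [1] * 50
scores = [rng.gauss(float(y), 1.0) for y in labels]
point = auc(scores, labels)
low, high = bootstrap_auc_ci(scores, labels)
print(f"AUC = {point:.2f}, 90% CI = ({low:.2f}, {high:.2f})")
```

Running the same routine on the biological and on the clinical labels of the same subjects gives two intervals that can be compared side by side.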
Discarding the neurodegeneration axis (experiment 3)
When we consider the separation between A+T+ and A-T- subjects, we aim to distinguish three stages of cognitive impairment (unimpaired, mild impairment, and dementia). Therefore, this becomes a multiclass problem where we try to distinguish each class versus the other two (one versus all).
Also in this case, we apply univariate and multivariate approaches optimized with a nested cross-validation. Random forests are used to perform the classifications, generating one ROC curve for each of the three predicted classes. In the univariate case, we consider each volume as a feature and, in order to score the regions based on their classification power, we use a micro-average (Equation 1), calculated as the ratio between the true positives summed over all classes and the sum of true positives and false positives over all classes:
$$\text{micro-average} = \frac{\sum_{c} TP_c}{\sum_{c} \left( TP_c + FP_c \right)} \qquad (1)$$
where $TP_c$ and $FP_c$ are the numbers of true positive and false positive results for class $c$. In this way, the classes with the most observations carry more weight.
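A small numerical example shows why micro-averaging favours the larger classes. The counts below are hypothetical and chosen only for illustration; the macro-average, which weights every class equally, is included as a point of comparison and is not used in the study.

```python
def micro_average(per_class_counts):
    """Pool true and false positives over all classes before taking the ratio."""
    total_tp = sum(tp for tp, _ in per_class_counts.values())
    total_fp = sum(fp for _, fp in per_class_counts.values())
    return total_tp / (total_tp + total_fp)


def macro_average(per_class_counts):
    """Average the per-class ratios, giving every class the same weight."""
    ratios = [tp / (tp + fp) for tp, fp in per_class_counts.values()]
    return sum(ratios) / len(ratios)


# Hypothetical (true positives, false positives) per class, not study data.
counts = {"CU": (20, 5), "MCI": (90, 30), "dementia": (3, 2)}

print(f"micro-average = {micro_average(counts):.3f}")
print(f"macro-average = {macro_average(counts):.3f}")
```

Here the large MCI class pulls the micro-average towards its own ratio (0.75), while the small dementia class (0.60) has little influence; the macro-average is lower because it counts that small class as much as the others.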
As for the multivariate approach, the number of features to select is considered as an additional hyper-parameter within the nested cross-validation. Therefore, the selected brain regions are learned during the training step and used as features on unseen data. Given the multiclass problem, we report the AUCs of each ROC separately to evaluate the performance in the stratification of the three stages of cognitive decline.
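Reporting one AUC per class in a three-class problem amounts to treating each class in turn as "positive" and the other two as "negative". The sketch below illustrates this one-versus-rest computation on a handful of hypothetical predicted probabilities; the numbers and function names are invented for the example, and no classifier is trained here.

```python
def auc(scores, labels):
    """Rank-based AUC: chance that a positive outscores a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def one_vs_rest_aucs(probabilities, labels, classes):
    """probabilities: one dict per subject mapping class name -> predicted probability."""
    result = {}
    for c in classes:
        binary = [1 if y == c else 0 for y in labels]
        scores = [p[c] for p in probabilities]
        result[c] = auc(scores, binary)
    return result


# Hypothetical classifier outputs for six subjects (not study data).
classes = ["CU", "MCI", "dementia"]
labels = ["CU", "CU", "MCI", "MCI", "dementia", "dementia"]
probabilities = [
    {"CU": 0.7, "MCI": 0.2, "dementia": 0.1},
    {"CU": 0.5, "MCI": 0.4, "dementia": 0.1},
    {"CU": 0.3, "MCI": 0.5, "dementia": 0.2},
    {"CU": 0.6, "MCI": 0.3, "dementia": 0.1},
    {"CU": 0.1, "MCI": 0.3, "dementia": 0.6},
    {"CU": 0.2, "MCI": 0.5, "dementia": 0.3},
]

for name, value in one_vs_rest_aucs(probabilities, labels, classes).items():
    print(f"{name} versus rest: AUC = {value:.3f}")
```

In this toy example the intermediate MCI class gets the lowest AUC, since its scores overlap with both neighbouring classes.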
All the accuracy results presented in this work are obtained from the nested cross-validation performed on the test set, which was not utilized during the different models’ training phase. Furthermore, we have employed a bagging classifier with bootstrapping to estimate the variance of the random forests, as it relies on 10 estimations using random subsets.
Results
The ADNI-1 standardized subset comprises a total of 394 subjects: 148 participants with triple positivity (A+T+N+) (age 73.75 ± 7.4, MMSE: 25.77 ± 2.58, 43% F and 57% M) and 94 with triple negativity (A-T-N-) (age 74.4 ± 6.7, MMSE: 28.47 ± 1.41, 45% F and 55% M). The remaining 152 have a mixed combination of positivity and negativity. Figure 4(a) gives an overview of the distribution of the ATN profiles across the clinical groups with respect to the relative volumes of the left and right hippocampus. Considering the wide spectrum of combinations of biomarker positivity-negativity and clinical stage, we propose a color- and shape-encoding that takes into account solely CSF and PET biomarkers. Figure 4(b) contains a simplified color coding focusing on the two classes of interest in gray and brown. Both panels of Figure 4 show dashed lines that indicate the normative ranges established by determining the linear regression prediction intervals associated with a specific percentile. This was done using linear regression against age and sex on the healthy cohort under the simplifying assumption that normalized volumes exhibit a normal distribution at each age with consistent variance. The two colored dotted lines represent the learned linear regressions for the A-T-N- subset (in gray) and the A+T+N+ subset (in brown). The regression line for the A-T-N- group closely aligns with the continuous line, which represents the mean of the normative ranges derived from cognitively unimpaired subjects.

(a) Scatterplots of hippocampus relative volume with respect to age. Dots represent individual participants. Lines indicate the 10th, 90th (dashed) and 50th (solid) percentiles for healthy controls according to the clinical severity classification. These lines indicate an age-matched normative range for the volume of interest. This was done using linear regression against age and sex on the healthy cohort. Shapes of the scatter plot represent the clinical diagnosis (dementia, MCI, unimpaired) and color represents the ATN biological profile. (b) Same as in (a) but highlighting the two groups of interest: A-T-N- in grey and A+T+N+ in brown. The dotted lines represent the learned linear regressions of the two subsets.
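The normative percentile lines can be approximated with an ordinary least-squares fit on the healthy cohort. The sketch below is deliberately simplified and should be read as an assumption-laden illustration: it regresses on age only (sex is omitted), uses a constant residual standard deviation instead of a full prediction interval, and runs on made-up values rather than ADNI data.

```python
from statistics import NormalDist, mean


def fit_line(ages, volumes):
    """Ordinary least squares for volume = intercept + slope * age."""
    age_mean, vol_mean = mean(ages), mean(volumes)
    sxx = sum((a - age_mean) ** 2 for a in ages)
    sxy = sum((a - age_mean) * (v - vol_mean) for a, v in zip(ages, volumes))
    slope = sxy / sxx
    intercept = vol_mean - slope * age_mean
    residuals = [v - (intercept + slope * a) for a, v in zip(ages, volumes)]
    # Residual standard deviation with n - 2 degrees of freedom.
    sigma = (sum(r * r for r in residuals) / (len(ages) - 2)) ** 0.5
    return intercept, slope, sigma


def percentile_line(age, intercept, slope, sigma, percentile):
    """Volume at a given percentile, assuming normal residuals with constant variance."""
    z = NormalDist().inv_cdf(percentile / 100.0)
    return intercept + slope * age + z * sigma


# Hypothetical healthy-control data: age in years, relative hippocampal volume.
ages = [60, 65, 70, 75, 80, 85]
volumes = [0.52, 0.50, 0.49, 0.47, 0.46, 0.43]
intercept, slope, sigma = fit_line(ages, volumes)

for pct in (10, 50, 90):
    value = percentile_line(72, intercept, slope, sigma, pct)
    print(f"{pct}th percentile at age 72: {value:.3f}")
```

A subject whose volume falls below the 10th-percentile line for their age would lie outside the normative range drawn in the figure.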
Comprehensive ADNI subset (experiment 1)
Volumetry as discriminator for cognitive impairment severity as opposed to CSF biomarkers
In this section, we evaluate the effectiveness of brain volumetry in identifying cognitive impairment severity. By analyzing brain volumes in a univariate manner, we examine the ability of each brain region to differentiate between dementia versus CU and MCI versus CU. Among all the brain regions considered, only those with performance similar to the CSF biomarkers (where the performance did not reject the null hypothesis in DeLong's test comparing the ROC curves of brain regions to CSF biomarkers) are presented in Tables 2 and 3.
Logistic regression performance in terms of AUC in distinguishing cognitively unimpaired participants from dementia and MCI patients using relevant brain regions as univariate features (hippocampus, temporal gray matter, ventricles, CSF and temporal white matter). As a reference, the same model's performance using solely biomarker values (namely, Aβ42, total tau and pTau) is reported in the last three columns.
SVM performance in terms of AUC combining in a multivariate fashion each brain region with pTau with respect to using the three biomarkers together as features.
Results show that hippocampal and temporal gray matter volumetry yield the highest AUCs, with 0.825 and 0.834, respectively, for distinguishing dementia from CU, and 0.819 and 0.736 for distinguishing MCI from CU subjects (Table 2). The performance of other regions ranges from 0.4 to 0.56, rendering them less relevant and not comparable to the CSF biomarkers or the brain regions associated with dementia.
Phosphorylated tau, used as a univariate feature of the logistic regression, also has high AUC scores in both classification tasks (0.826 for dementia versus CU and 0.755 for MCI versus CU). As the highest classification scores are reached by both volumetry and pTau, we then evaluate the classification power of brain volumetry enriched with pTau CSF levels in a multivariate analysis. The brain volumes combined with pTau perform better than either feature considered independently, as summarized in Table 3. As expected, hippocampal volumetry (along with pTau) outperforms temporal gray matter and the three CSF biomarkers combined.
From this experiment, we can infer that the volumetry of the hippocampus and the temporal gray matter can discriminate cognitive impairment severity as effectively as the CSF biomarkers. However, volumetry performance is overall better when combined with pTau.
ATN triple positivity and negativity (experiment 2)
Biological versus clinical categories
Using each volume individually, hippocampal and left temporal gray matter volumetry distinguish A+T+N+ from A-T-N- patients with 0.78 and 0.81 AUC, respectively (Figure 5(a) and (b)).

Experiment 2: univariate and multivariate approach to classify participants using volumetry in (i) A+T+N+ and A-T-N- cohorts (in blue) and (ii) clinically impaired and unimpaired cases (in red). (a) ROC curves using univariately the left temporal GM and (b) the hippocampus, resulting from the logistic regression. (c) AUC scores for the univariate approach comparing biological (blue) to severity (red) labels. Each AUC score underwent a permutation test to assess its significance, and the features (or brain regions) that were statistically significant are indicated with a *. The brain regions that reach significance are statistically far from the null-model performance. (d) ROC curves with 90% confidence intervals of the SVM multivariate approach using all brain regions as features. The two AUCs indicated in the legend of the figure are significantly different (permutation test). The null-model confidence intervals (0.493–0.51) are represented with a grey box around the diagonal. The data stratification used for these results is represented in Figure 2 (colors are visible in the online version).
Figure 5(c) summarizes the AUC scores for other brain regions comparing the clinical labels to the biological ones. In Supplemental Figure 1, we plot histograms of the permutation scores (the null distribution) to illustrate examples of significant AUC scores. These scores are unlikely to be obtained by chance and are therefore considered significant. The significant regions are indicated with an asterisk and the variance of the estimates (0.08 on average) is represented with an error bar (Figure 5(c)). Permutation tests show that brain volumetry predicts clinical labels (cognitively normal versus abnormal) significantly better than biological profiles (A+T+N+ versus A-T-N-; p-value < 0.01). The ROC curves with 90% confidence intervals are shown in Figure 5(d). This experiment probes the effectiveness of brain volumetry in distinguishing the two extreme biological profiles of the ATN spectrum. Yet, brain volumetry is more effective at classifying cognitive impairment severity than at classifying biological profiles.
Discarding the neurodegeneration substrate (experiment 3)
Finally, we address the distinction of cognitive impairment severity stages in A+T+ (N = 291) and A-T- (N = 103) subjects separately (see Figure 3). In both the univariate and multivariate approaches, we obtain better performance on stratifying severity stages (CU versus MCI, CU versus dementia, and MCI versus dementia) in the A-T- subsample of participants. Table 4 shows the random forest performances: the AUC scores are colored according to the specific cognitive impairment stage classification (CU in blue, MCI in red, and dementia in green) and they are reported in the two subsets (A+T+ and A-T- participants). The best performance in A-T- participants is generally reached when the model distinguishes CU from MCI or dementia groups, with an AUC of 0.80 for both hippocampus and ventricles. Nonetheless, the high accuracy reached in predicting severe cognitive impairment cannot be considered significant owing to the small number of participants with both an A-T- profile and a dementia diagnosis.
Random forest performance in terms of AUC using a univariate approach (feeding the model with one brain region at a time) to tease apart the severity axis (unimpaired, mild and severe) within two biologically defined groups (A+T+ and A-T-). The values are separated among the two groups (A+T+ and A-T-) according to the cognitive impairment (C.I.). The last column values (A-T- with severe cognitive impairment) are marked with a * because these performances are not significant given the number of participants belonging to this class (6).
In the A+T+ group, the classifier performance is good for teasing apart CU participants from the other cognitive stages, considering solely the hippocampal volume (AUC = 0.81), but it performs poorly for the other cognitive stages (MCI versus dementia) and when considering other brain region volumes (see Table 4). The null model performance is 0.48 ± 0.08 for A+T+ and 0.49 ± 0.07 for A-T-.
In the multivariate analysis, the brain volumetry features are automatically selected and differ in the two subsets of participants (see Supplemental Figure 2).
In the A+T+ group, hippocampus, temporal and frontal gray matter, temporal white matter and fourth ventricle are combined. In the A-T- group, by contrast, only hippocampus and temporal gray matter are selected.
Figure 6 displays the ROC curves of the multivariate approach with 90% confidence interval. Also in this case, the classification reaches the best performance in the A-T- participants, keeping in mind that the severe cognitive impairment performance is not significant. The best performance in A+T+ participants is obtained in distinguishing the CU group from the others, while classification is at chance level in MCI and dementia groups. The features are automatically selected within the nested cross-validation and are illustrated in Supplemental Table 2.

Multiclass discrimination of the three stages of clinical severity using a multivariate approach by combining multiple volumes. ROC curves with 90% confidence intervals of the random forest. Each class performance is plotted separately to gain knowledge about the class by inspecting its corresponding classifier performance. The micro-average AUC scores of the A+T+ and A-T- subsets are 0.67 and 0.82, respectively.
Overall, these results show that brain volumetry is able to tease apart cognitive impairment stages within two different biological profiles.
Discussion
In summary, we have assessed the efficacy of MR-based automated volumetry in classifying clinical and AT(N) biological profiles across a variety of individuals, encompassing both those with AD and non-AD facets. Our aim was to identify the specific contexts where this tool demonstrates heightened effectiveness using three different experiments.
In experiment 1, we have reproduced and built upon the findings of Schmitter et al. 18 Initially, we explored univariate SVM classifiers to evaluate the power of individual brain regions of interest to discriminate CU, MCI, and dementia patients, irrespectively of their ATN profile. Subsequently, we employed a multivariate SVM adding the pTau as an additional feature, resulting in improved discrimination performance. The primary objective was to evaluate the effectiveness of volumetry as a tool for distinguishing unimpaired individuals from those with mild and severe cognitive impairment, comparing its performance to that of AD biological biomarkers used as univariate variables in the classification. As a second step, we have incorporated volumetry along with biological biomarkers, specifically pTau. Given the superior performance of pTau compared to other CSF biomarkers to discriminate clinical severity and its potential future detection in blood,20,21 pTau represents a relevant non-invasive biomarker. This aligns with our emphasis on highlighting the non-invasive aspect of MRI volumetry as well.
The main goal of experiment 2 was to investigate whether automated brain volumetry could effectively differentiate biological profiles (A+, T+, N+ and A-, T-, N-). As a benchmark, we compared the performance of the same volumetric measurements in assessing impairment severity within the same groups of subjects, aiming to distinguish two classes: cognitively unimpaired (CU) and cognitively impaired (MCI, dementia) subjects. Although this resulted in unbalanced A+, T+, N+ and A-, T-, N- groups, we prioritized a fair comparison by examining the same pool of participants, first along the biological axis and then along the impairment severity axis. Notably, volumetry exhibited significantly superior performance in clinical severity stratification compared to the classification of biological profiles. This discrepancy can be attributed to the well-known phenomenon that biomarker positivity may manifest much earlier than cognitive impairment, which typically occurs concurrently with neurodegeneration. 27 Moreover, the A-, T-, N- group contains MCI and dementia patients for whom cognitive decline is likely linked to neurodegenerative non-AD pathologies.
The main aim of the last experiment was to investigate whether volumetry accounts for clinical severity when participants with A+, T+ and A-, T- biological profiles are considered separately. In this case, we disregarded the contribution of N in the ATN profiling because of the circularity between MRI-based atrophy measurements (i.e., MorphoBox) and the other features of neurodegeneration considered in the ATN profiling (i.e., hypometabolism in FDG PET and total tau). Overall, our results demonstrated the high accuracy of volumetry in differentiating CU from impaired patients. However, the accuracy decreased when trying to discriminate MCI from dementia. When focusing on the A-, T- group, there was a tendency for improved discrimination performance. Nevertheless, the sample size for this subgroup was limited (N = 6), so the observed trend should be interpreted with caution, as the small number of subjects limits the generalizability of the finding. Further research will be needed to understand whether MRI can truly help in the stratification of subjects with non-AD diseases and complement other biomarkers in the characterization of clinical severity.
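The caution about small subgroups can be illustrated with a confidence interval on a classification accuracy. The sketch below uses the Wilson score interval, which is a standard choice for proportions but is not the method reported in the study; the counts (5 of 6 and 50 of 60 correct) are hypothetical and chosen only to show how the interval widens when the sample is small.

```python
import math


def wilson_interval(correct, total, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95% coverage)."""
    p = correct / total
    denom = 1.0 + z * z / total
    center = (p + z * z / (2.0 * total)) / denom
    half = z * math.sqrt(p * (1.0 - p) / total + z * z / (4.0 * total * total)) / denom
    return center - half, center + half


# Hypothetical counts with the same observed accuracy (about 0.83).
small = wilson_interval(5, 6)     # a subgroup of six subjects
large = wilson_interval(50, 60)   # a subgroup ten times larger
print("n = 6 :", tuple(round(v, 2) for v in small))
print("n = 60:", tuple(round(v, 2) for v in large))
```

With the same observed accuracy, the interval for six subjects spans a much wider range than the one for sixty, which is why a trend seen in such a subgroup can only be treated as preliminary.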
Reappraising ADNI data
From the traditional perspective of viewing ADNI data according to the clinical labels provided, we found that brain volumetry is a good predictor of clinical severity compared with the CSF biomarkers used in ATN profiling. Our results are in line with previous observations from Schmitter et al. (2015), 18 suggesting the potential of volumetry to distinguish dementia and mild cognitive impairment from CU. This is further supported by reappraising the data from a different angle, focusing on the biology of AD and its biomarkers.
In doing so, we draw attention to the CSF biomarkers and PET imaging measurements, which are known to correlate with AD pathology, while still taking into account the clinical assessment given by the cognitive labels ‘CU’, ‘MCI’, and ‘dementia’.
To visualize the combination of the factors under consideration, we proposed a color-and-shape encoding, which mainly highlights participants with biologically defined AD but no clinical cognitive decline, and vice versa.
It is not entirely surprising that some participants with positive CSF biomarkers do not present any cognitive symptoms and are thus considered cognitively unimpaired: biological changes generally begin before cognitive symptoms appear, and these participants are at the preclinical stage of AD. 28 Indeed, many studies have considered the scenario of cognitively unimpaired participants with an AD biological profile, 29 trying to assess the likelihood of longitudinal cognitive decline in asymptomatic at-risk or early AD patients with biomarker positivity. 30
It is also interesting to consider cognitively impaired participants with negative AD biomarkers. These participants likely suffer from non-AD pathologies. Several brain diseases stand as alternatives to AD, such as diffuse Lewy body dementia (LBD), vascular dementia, and frontotemporal dementias (FTD), as well as less prevalent entities. Phenotypic differences between typical AD presentation and LBD, FTD, and vascular dementia may help tease apart these diagnoses, even though they may also co-occur in the same patient. Recently, non-AD pathologies that may mimic the AD phenotype (including episodic memory deficit of the hippocampal type and medial temporal lobe atrophy) have been described under the general label of suspected non-Alzheimer's disease pathophysiology (SNAP). These can include hippocampal sclerosis, limbic-predominant age-related TDP-43 encephalopathy (LATE), primary age-related tau pathology (PART), and frontotemporal lobar degeneration (FTLD). 31
In general, altering the perspective on the data makes it apparent that each subject can be labeled in multiple ways depending on the chosen approach (biological or clinical). It is important to consider two sets of labels: the clinical ones (dementia, MCI, unimpaired), which correlate very well with morphometry, and the biological ones, which have earlier onsets. 27 The correlation between brain volumetry and clinical severity is expected because brain atrophy and cognitive impairment appear at a later stage of the disease and almost simultaneously. 27
This work demonstrates that hippocampal and temporal gray matter volumes perform well in discriminating clinical severity, as compared to CSF biomarkers and PET imaging values (experiment 1). Indeed, the temporal lobes and hippocampus are brain structures known to correlate well with typical phenotypes of dementia due to AD.
Nonetheless, when used to discriminate dementia from MCI, volumetry shows an important limitation with respect to the CSF biomarkers: the univariate classifier of each volume of interest performs only slightly better than chance, as reported in Schmitter et al. 18 Our findings are consistent with the existing literature regarding the ability of volumetry 18 and pTau 32 to predict cognitive decline.
Biological markers of Alzheimer's disease
The most common CSF biomarkers of AD pathology are Aβ42 and pTau. Recent studies indicate that brain-derived proteins can also be measured in the blood, opening the prospect of AD blood tests becoming available in the near future.20,21 This is in line with our effort to investigate less invasive biomarkers using brain volumetry, and with our results on the multivariate combination of volumetry and pTau values. Potentially, less invasive biomarkers (such as plasma pTau181 and pTau217, together with brain volumes) could be combined to improve the stratification of cognitive decline, an approach that has already proved effective using tau PET.21,32 If all these biomarkers were available through non-invasive tests, one might question the relevance of automated volumetry. In this work, we address this point by examining the relevance of and need for imaging diagnostics, especially in specific subsets of participants.
Volumetry as biomarker and methodological perspectives
The role and importance of computing brain volumes emerged especially when we stratified the data into A + T+ and A-T- participants. Here, it is important to note that the neurodegeneration dimension was discarded from the biological profile. Stratifying the participants allowed us to better investigate the strength of volumetry in this particular sub-context.
The modest performance achieved in the A + T+ cohort may suggest that brain volumetry contributes little to detecting cognitive impairment in this group; one explanation could be the existence of co-pathologies, such as LATE and/or a mixed etiology of clinical AD. On the other hand, the preliminary findings in A-T- participants suggest that automated volumetry might be a potential biomarker and non-invasive stratification tool for classifying clinical stages. However, given the small sample size, this remains a hypothesis which will require further testing in future studies.
Indeed, another important reason for distinguishing these two subsets of participants is to investigate the “LATE” pathology 5,6,29,33 within the SNAP group. 6
SNAP describes A- individuals, i.e., those with normal levels of amyloid-beta protein, who nevertheless show abnormal neurodegeneration. 34 The progression of cognitive impairment in these participants is generally slower than in A + N+ subjects. Even though this subset of participants could also include non-AD neurodegeneration, we highlight the complementary role of volumetry as support in the clinical stratification. Among these A-T- individuals, there were six participants with dementia (Figure 3). The underlying pathology is independent of AD and may be associated with cellular lesions involving pathological aggregation of another misfolded protein, TDP-43, especially in the medial temporal regions.
As mentioned in the previous sections, we cannot draw conclusions about this specific sub-class because of the very small sample size. This can be an important factor in both the variability of the ROC curves (i.e., the confidence intervals are larger) and the overall number of features selected (four in the A-T- subset and ten in the A + T+ subset). Automatic feature selection plays a crucial role, since the random forest can be sensitive to correlated measures, and we are indeed considering overlapping brain regions (left/right brain volume and total brain volume).
Overall, we anticipated good performance in identifying cognitively normal A-T- participants, considered biologically healthy, as well as good classification of cognitively unimpaired A + T+ participants, since the volumetric information encodes meaningful correlations with atrophy that is concomitant with cognitive decline.
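The point about overlapping regions can be checked directly: a total volume that is the sum of left and right volumes is, by construction, strongly correlated with each of them. The sketch below computes a Pearson correlation on made-up volumes; the values and variable names are hypothetical and do not come from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical left/right volumes for five subjects (arbitrary units).
left = [3.1, 3.4, 2.8, 3.9, 3.0]
right = [3.0, 3.5, 2.9, 3.8, 3.2]
total = [l + r for l, r in zip(left, right)]  # overlaps with both by construction

r_left_total = pearson(left, total)
print(f"correlation(left, total) = {r_left_total:.2f}")
```

When several such near-duplicate features are available, a tree-based selector may pick any one of them, so the number and identity of the selected features can vary between subsets for reasons unrelated to biology.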
Limitations and future works
While stratifying subjects according to the A + T+ (N+) and A-T-(N-) profiles, we acknowledge that focusing on these subsets only can be limiting as it disregards a substantial portion of mixed possibilities (e.g., A + T-, A-T+ subjects) and creates unbalanced subgroups. However, the purpose of this study was also to reappraise the data from different points of view.
We also recognize that there are different thresholds and analysis methodologies (such as Elecsys and Luminex) for the AT classification, which is why we applied a 10% gray zone to mitigate false positives and false negatives in the classification. At the same time, we strove to be as inclusive as possible in our definition of AT groups by incorporating PET imaging measurements such as FDG 8 and AV45. 7 This approach allows us to identify biologically defined AD patients more precisely, as a participant is considered positive if either the CSF biomarker or the PET measurement threshold is met. Nevertheless, the A-T- subgroup was small and should be enlarged in future studies to validate any result drawn from it.
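One possible reading of this labeling rule is sketched below. It is an assumption-laden illustration, not the study's implementation: the cutoffs (1000 and 1.0) are hypothetical placeholders, the gray zone is interpreted as a band of plus or minus 10% around each cutoff whose values are treated as indeterminate, and the helper names `marker_status` and `combine` are invented for this example.

```python
def marker_status(value, cutoff, abnormal_if="above", gray=0.10):
    """Return '+', '-' or '?' for one marker; '?' means the value is in the gray zone."""
    lower, upper = cutoff * (1.0 - gray), cutoff * (1.0 + gray)
    if lower <= value <= upper:
        return "?"
    is_high = value > upper
    if abnormal_if == "above":
        return "+" if is_high else "-"
    return "-" if is_high else "+"


def combine(statuses):
    """'+' if any marker is clearly positive, '-' if all are clearly negative, else '?'."""
    if "+" in statuses:
        return "+"
    if all(s == "-" for s in statuses):
        return "-"
    return "?"


# Hypothetical cutoffs: a CSF marker abnormal when low, a PET marker abnormal when high.
CSF_CUTOFF, PET_CUTOFF = 1000.0, 1.0

clearly_positive = combine([
    marker_status(700.0, CSF_CUTOFF, abnormal_if="below"),  # well below cutoff: '+'
    marker_status(1.05, PET_CUTOFF, abnormal_if="above"),   # inside gray zone: '?'
])
clearly_negative = combine([
    marker_status(1300.0, CSF_CUTOFF, abnormal_if="below"),
    marker_status(0.80, PET_CUTOFF, abnormal_if="above"),
])
indeterminate = combine([
    marker_status(950.0, CSF_CUTOFF, abnormal_if="below"),
    marker_status(0.80, PET_CUTOFF, abnormal_if="above"),
])
print(clearly_positive, clearly_negative, indeterminate)
```

Under this reading, a single clearly abnormal measurement is enough for a positive label, whereas a negative label requires every available measurement to fall clearly on the normal side, which reduces the risk of borderline values driving the classification.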
A key limitation of this study is that the Aβ42 cutoff is influenced by inter-individual differences in overall CSF Aβ production. Additionally, CSF Aβ42 is highly labile and sensitive to handling, including freeze-thaw cycles, which can lead to a reduction in its concentration. Using the CSF Aβ42/Aβ40 ratio can help mitigate these effects, but we did not have sufficient measurements in this cohort.
Conclusions
In this work, we have investigated the power of MRI-based automated brain volumetry to discriminate between cognitive stages (CU, MCI, dementia) and biological (A + T + N+, A-T-N-) profiles.
We found that brain volumetry is strongly associated with clinical severity irrespective of the biological profile. Moreover, hippocampal and temporal gray matter volumes could distinguish A + T + N+ from A-T-N- profiles well.
Future work may address the importance of volumetry in the A-T- subset of individuals, especially in a longitudinal framework, examining the role of morphometry in patients converting from no impairment to mild cognitive impairment or from mild to severe cognitive impairment, and disentangling comorbid neuropathologies frequently associated with AD, especially in the oldest-old population. This may be extremely helpful in providing patients with accurate individualized prognostic estimates for dementia and thus maximizing the effect of upcoming disease-modifying treatments.
Supplemental Material
sj-docx-1-alz-10.1177_13872877251339840 - Supplemental material for The complementary role of automated brain volumetry to stratify ADNI participants within the ATN framework
Supplemental material, sj-docx-1-alz-10.1177_13872877251339840 for The complementary role of automated brain volumetry to stratify ADNI participants within the ATN framework by Ilaria Ricchi, Alessandra Griffa, Ricardo Corredor-Jerez, Jonas Richiardi, Jean-François Démonet, Gilles Allali, Bénédicte Maréchal, Olivier Rouaud and in Journal of Alzheimer's Disease
Footnotes
Acknowledgments
Data collection and sharing for this study was funded by the Alzheimer's Disease Neuroimaging Initiative ADNI (National Institutes of Health Grant U01 AG024904) and DOD ADNI (Department of Defense award number W81XWH-12-2-0012). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: AbbVie, Alzheimer's Association; Alzheimer's Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Cogstate; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd; Janssen Alzheimer Immunotherapy Research & Development, LLC.; Johnson & Johnson Pharmaceutical Research & Development LLC.; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC.; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health. The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer's Therapeutic Research Institute at the University of Southern California. ADNI data are disseminated by the Laboratory for NeuroImaging at the University of Southern California. The Contribution was prepared by IR at the direction of the employer (Siemens Healthineers) and within the scope of their employment, and copyright in the Contribution is owned by the employer.
Author contributions
Ilaria Ricchi (Investigation; Methodology; Software; Validation; Visualization; Writing – original draft; Writing – review & editing); Alessandra Griffa (Writing – original draft; Writing – review & editing); Ricardo Corredor-Jerez (Supervision; Writing – review & editing); Jonas Richiardi (Conceptualization; Supervision; Writing – review & editing); Jean-François Démonet (Conceptualization; Supervision; Writing – review & editing); Gilles Allali (Writing – review & editing); Bénédicte Maréchal (Conceptualization; Data curation; Writing – review & editing); Olivier Rouaud (Validation; Writing – review & editing).
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interests
The authors declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: The authors RC and BM are currently employees of Siemens Healthineers International AG, Lausanne, Switzerland and IR was also employed for this project.
Supplemental material
Supplemental material for this article is available online.
