Abstract
Background:
Alzheimer’s disease (AD) patients show heterogeneous cognitive profiles, which suggests the existence of cognitive subgroups. A deeper understanding of this heterogeneity could help move the field toward a precision medicine perspective.
Objective:
In this study, we aimed 1) to investigate AD cognitive heterogeneity as a product of the combination of within- (factors) and between-patients (sub-phenotypes) components, and 2) to promote its assessment in clinical practice by defining a small set of critical tests for this purpose.
Methods:
We performed factor mixture analysis (FMA) on neurocognitive assessment results of N = 230 patients with a clinical diagnosis of AD. This technique allowed us to investigate the structure of cognitive heterogeneity in this sample and to characterize the core features of cognitive sub-phenotypes. Subsequently, we performed a test selection based on logistic regression to highlight the best tests to detect AD patients in our sample. Finally, the accuracy of the same tests in the discrimination of sub-phenotypes was evaluated.
Results:
FMA revealed a structure characterized by five latent factors and four groups, which were identifiable by means of a few cognitive tests and were mainly characterized by memory deficits with visuospatial difficulties (“Visuospatial AD”), typical AD cognitive pattern (“Typical AD”), less impaired memory (“Mild AD”), and language/praxis deficits with relatively spared memory (“Nonamnestic AD”).
Conclusion:
The structure of cognitive heterogeneity in our sample of AD patients, as studied by FMA, could be summarized by four sub-phenotypes with distinct cognitive characteristics easily identifiable in clinical practice. Clinical implications under the precision medicine framework are discussed.
INTRODUCTION
Clinical heterogeneity in neurological practice is a critical and underestimated issue, highly impacting both diagnosis and prognosis (for a review, see [1]). Indeed, interindividual differences in the clinical manifestation of a disease may reveal biological or epigenetic differences [2] which could strongly affect drug mechanisms of action [3] and the effectiveness of other treatments (e.g., cognitive training) [4]. Heterogeneity has been studied in many neurological disorders, including psychosis [5], schizophrenia [6], stroke [7, 8], Parkinson’s disease [9], and multiple sclerosis [10]. Taken together, these findings highlight the need for a paradigm shift toward precision medicine. This issue is particularly relevant in neurodegenerative diseases, where clinical phenotypes reflect the combination of heterogeneity in brain aging [11], age-related cognitive decline [12], and baseline individual differences. This can lead to high variance in both behavioral and in vivo biomarkers, seriously hindering understanding of the disease, as in the case of Alzheimer’s disease (AD) [13]. According to the DSM-5, the core symptom for the diagnosis of neurocognitive disorder due to AD is a progressive decline in memory, with alteration of at least one other cognitive domain. In clinical practice, however, the pattern of cognitive deficits observed in AD patients is highly variable and, according to the Alzheimer Precision Medicine Initiative (APMI), there is a strong need for patient-tailored interventions accounting for individual-specific biological profiles [3, 14]. For this reason, it is of crucial importance that research on AD focus on clinical variability, in terms of possible sub-phenotypes [13].
Previous research has characterized cognitive heterogeneity in AD through theory-driven approaches (e.g., [15]). However, only a few studies have dealt with this issue in a data-driven manner. For example, Cappa and colleagues [16] suggested the existence of four sub-phenotypes mainly characterized by the differential impairment of visuospatial/perceptual abilities, memory, perception, calculation, and language. Other studies have shown that AD patients can be classified either into eight clusters of cognitive features [17], or simply based on the presence/absence of memory impairment [18]. Taken together, these studies suggest the presence of cognitive AD sub-phenotypes, but the number of clusters explaining variability across profiles is not yet clear. One of the reasons behind this lack of consensus is that no studies have combined the investigation of inter-individual differences with that of intra-individual latent factors, which could lead to a finer understanding of the structure of AD cognitive heterogeneity.
In the present study, we aimed to investigate cognitive sub-phenotypes in AD through a relatively new approach for the study of heterogeneity, namely factor mixture analysis (FMA) [19, 20] (see Methods for details), whose effectiveness has been proven in different domains, including mild cognitive impairment (MCI) and dementia [21]. The strength of this method is that it fosters a finer-grained description of heterogeneity compared to standard approaches. Indeed, by employing a combination of categorical and continuous latent variables, FMA allows heterogeneity to be studied both at the group level (i.e., classifying individuals into subgroups) and within subgroups [22]. This technique is specifically suitable for our purpose since it identifies both the latent factors (i.e., linear combinations of cognitive scores) and the potential sub-phenotypes of patients who share common cognitive patterns. This approach could help map clinical heterogeneity in AD, thus contributing to the implementation of the precision medicine perspective in clinical neuropsychology practice. Furthermore, our second aim was to find the minimum set of cognitive tests able to highlight the features of AD sub-phenotypes effectively and rapidly in clinical routine, under the hypothesis that extensive neuropsychological batteries may be reduced to a smaller set of critical tests without losing diagnostic accuracy or quality in the description of the cognitive profile, while optimizing the use of resources [23].
MATERIALS AND METHODS
Participants and procedure
The study group is a retrospective sample of N = 268 consecutive patients selected from a larger cohort of patients with neurological disorders referred to the neuropsychological service of the University of Padua (Italy). Inclusion criteria were: 1) clinical diagnosis of probable AD based on the criteria of the National Institute of Neurological and Communicative Disorders and Stroke and the Alzheimer’s Disease and Related Disorders Association (NINCDS-ADRDA) [24]; 2) availability of the Mini-Mental State Examination (MMSE) [25] score within an extensive cognitive assessment (i.e., Esame Neuropsicologico Breve 2 - ENB2 [26]; see Supplementary Table 1). Pathophysiological biomarkers were not available in this retrospective sample, but all patients included in the final sample showed a clinical phenotype of AD, in line with the latest recommendations [27]. Patients showing comorbidity with psychiatric or other neurological diseases were excluded. Only patients attending the neuropsychological service for the first time for clinical assessment were included in the study. Thirty-eight patients were discarded due to missing data; thus, the final sample was composed of N = 230 patients (age range: 58–93; age: M = 77.1, SD = 6.3; education: M = 7.3, SD = 3.9; MMSE: M = 21.5, SD = 3.5; 151 F). Furthermore, a sample of N = 326 age- and education-matched healthy controls (HC), who were administered the same neuropsychological assessment, was compared with the sample of patients. All participants gave their written consent for the anonymous use of the data. The study was conducted in accordance with the Declaration of Helsinki and was approved by the Ethical Committee for the Psychological Research of the University of Padova.
Statistical analysis
Factor mixture analysis (FMA) to study heterogeneity
Heterogeneity in our sample of AD patients was investigated by means of a statistical technique called FMA [19, 20], which allows evaluation of the factorial structure of a phenomenon while simultaneously investigating the existence of sub-populations (i.e., clusters of participants) [20, 28], without assuming that all participants in a sample are representative of the same population, as traditional factor analysis models do. In particular, FMA goes beyond standard factor analysis since it does not rely on the assumption that factors are normally distributed. Furthermore, it assumes that correlations between latent factors could vary across subpopulations, thus allowing clusters to be identified within a heterogeneous sample [6]. Finally, FMA assumes a parametric structure within each class and can be used to test a series of structural hypotheses: in this way it helps to understand complex phenotypic structures that are simultaneously categorical and dimensional [22, 29–31].
For these reasons, FMA is a suitable technique to model the underlying structure of psychological [19] and psychopathological [22] constructs. Recent studies have shown that FMA can be useful in the identification of sub-groups in HC, MCI, and dementia [21]. We thus decided to adopt the FMA to investigate the presence of cognitive sub-phenotypes in AD and to describe their key features. The FMA was applied on data from the whole cognitive battery except one test (i.e., token test) which was discarded due to its null variability.
Importantly, we adopted an exploratory approach in order to highlight the most reliable model of AD cognitive heterogeneity. To this end, AD cognitive scores were first scaled on HC data, then we estimated 49 FMA models by testing all the combinations from 1 up to 7 factors, and from 1 up to 7 groups to find the best combination fitting our data. The Bayesian Information Criterion (BIC) [32] was calculated for each model and the one with the lowest BIC was chosen as indicating the most plausible combination of latent factors and groups. Only the results relative to the best model will be reported and discussed.
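The model-selection step described above can be sketched as follows. This is a minimal illustration and not the code used in the study: `fit` is a hypothetical placeholder for any routine that fits an FMA model and returns its log-likelihood and number of free parameters, and `toy_fit` is an invented stand-in so that the example runs. Only the grid (1 to 7 factors, 1 to 7 groups) and the lowest-BIC rule are taken from the text.

```python
import math
from itertools import product

def bic(log_lik, n_params, n_obs):
    """Bayesian Information Criterion: lower values are preferred."""
    return n_params * math.log(n_obs) - 2.0 * log_lik

def select_best_model(fit, n_obs, max_factors=7, max_groups=7):
    """Evaluate every (factors, groups) pair and return the BIC table
    together with the pair that minimizes BIC.

    `fit` is a hypothetical callable: fit(k_factors, g_groups) must
    return (log_likelihood, number_of_free_parameters).
    """
    table = {}
    for k, g in product(range(1, max_factors + 1), range(1, max_groups + 1)):
        log_lik, n_params = fit(k, g)
        table[(k, g)] = bic(log_lik, n_params, n_obs)
    best = min(table, key=table.get)
    return table, best

# Toy stand-in for a real fitting routine (assumption, illustration only):
# the log-likelihood peaks at 5 factors and 4 groups, while the number of
# parameters grows with both.
def toy_fit(k, g):
    log_lik = -1000.0 - 40.0 * ((k - 5) ** 2 + (g - 4) ** 2)
    n_params = 10 * k + 5 * g
    return log_lik, n_params

bic_table, best_model = select_best_model(toy_fit, n_obs=230)
```

With 7 x 7 combinations the table holds 49 entries, matching the number of models compared in the study; the winning pair depends entirely on the fitting routine supplied.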
Selection of the best subset of tests
Our second aim was to find the core set of cognitive tests with the highest diagnostic accuracy (i.e., in discriminating AD versus HC). To this end, we ran a logistic regression model with participants’ status (either AD or HC) as dependent variable, and the whole set of tests as predictors. Then, this model was used as input for a backward stepwise procedure based on the Akaike Information Criterion (AIC) [33], which returned the best set of tests for the prediction of participants’ status. In order to control for the variability among HC data and to match sample sizes, this procedure was repeated 1000 times, each time randomly selecting 230 out of 326 HC to match AD sample size, and the logistic model was built on a dataset of N = 460 (230 AD and 230 HC). Thus, each iteration resulted in a selection of tests providing the highest classification accuracy between AD and the random sample of HC. Tests selected in ≥95% of iterations were included in the best subset. As a control analysis, we tested the efficacy of this subset in the discrimination between AD and HC, and, more importantly, in the detection of AD sub-phenotypes. In other words, the set of tests which best detected AD patients was also tested for its ability to identify individual cognitive sub-phenotypes. To this end, we employed three machine-learning classifiers, i.e., Random Forest (RF), Support Vector Machine (SVM), and Naïve Bayes (NB) with a 10-fold cross-validation design (see the Supplementary Material for details). Again, the procedure was repeated 1000 times employing random selections of HC. Finally, the accuracy resulting from the selected tests was compared to that of the whole battery. All analyses were performed by means of R Software [34] and custom coding. The FMA was performed by means of the FactMixtAnalysis R package [35]. Machine learning analyses were performed by means of the RWeka [36] R package.
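The resampling logic of this selection procedure can be sketched as below. It is an illustrative outline only: `select_tests` is a hypothetical callable standing in for the backward stepwise AIC procedure (which is not reimplemented here), and `toy_select`, the test names, and the data are invented for the example. The elements taken from the text are the size-matched random draw of controls, the repetition over many iterations, and the ≥95% retention rule.

```python
import random
from collections import Counter

def stable_subset(ad_rows, hc_rows, select_tests, n_iter=1000,
                  threshold=0.95, seed=0):
    """Repeat test selection on size-matched AD/HC samples and keep the
    tests chosen in at least `threshold` of the iterations.

    `select_tests` is a hypothetical callable standing in for the backward
    stepwise AIC procedure: it receives (ad_rows, hc_sample) and returns
    an iterable of selected test names.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_iter):
        # Draw as many controls as there are patients, without replacement.
        hc_sample = rng.sample(hc_rows, len(ad_rows))
        counts.update(set(select_tests(ad_rows, hc_sample)))
    frequency = {test: c / n_iter for test, c in counts.items()}
    kept = sorted(t for t, f in frequency.items() if f >= threshold)
    return kept, frequency

# Toy selector (assumption): keep a test when the group means differ by
# more than 0.3. A real analysis would run stepwise logistic regression.
def toy_select(ad, hc):
    selected = []
    for test in ad[0]:
        ad_mean = sum(r[test] for r in ad) / len(ad)
        hc_mean = sum(r[test] for r in hc) / len(hc)
        if abs(hc_mean - ad_mean) > 0.3:
            selected.append(test)
    return selected

# Invented data: "memory" always separates the groups, "naming" never does,
# and "fluency" does so only when one outlying control happens to be drawn.
ad = [{"memory": -2.0, "naming": 0.0, "fluency": 0.0} for _ in range(230)]
hc = [{"memory": 0.0, "naming": 0.0, "fluency": 0.0} for _ in range(326)]
hc[0]["fluency"] = 100.0

kept, frequency = stable_subset(ad, hc, toy_select, n_iter=200)
```

A predictor whose selection depends on which controls are drawn ends up with a selection frequency below 1, which is what the 95% threshold is meant to filter out.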
RESULTS
Four cognitive sub-phenotypes of Alzheimer’s disease
A BIC value was computed for all 49 FMA models (Fig. 1a) and the lowest BIC (i.e., best balance between model likelihood and parsimony) highlighted a model with five factors and four groups (i.e., clusters) as the solution that best fitted our data. We also computed BIC weights [37], a transformation of BIC values into a probability space (range: 0–1), which quantifies the evidence in favor of one model being better than the others. The model with 5 factors and 4 groups showed a rounded BIC weight close to 1, indicating a ∼100% probability of being the best solution within the set of tested models (Fig. 1b).
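For readers unfamiliar with BIC weights, a minimal sketch of the transformation follows. It assumes the usual form, in which each model's BIC difference from the best model is exponentiated and the results are normalized to sum to 1; the BIC values in the example are made up and are not those of the study.

```python
import math

def bic_weights(bic_values):
    """Convert BIC values into weights that sum to 1.

    w_i = exp(-0.5 * (BIC_i - BIC_min)) / sum_j exp(-0.5 * (BIC_j - BIC_min))
    """
    best = min(bic_values)
    # Subtracting the minimum first keeps the exponentials numerically stable.
    rel = [math.exp(-0.5 * (b - best)) for b in bic_values]
    total = sum(rel)
    return [r / total for r in rel]

# Made-up BIC values for three competing models (illustration only).
weights = bic_weights([2380.7, 2406.3, 2433.5])
```

Because the weights depend on BIC differences through an exponential, a gap of a few tens of BIC units is enough to push the best model's weight very close to 1, which is consistent with the rounded weight reported above.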

Comparison of FMA models. a) Bayesian Information Criterion (BIC) for each FMA model. The minimum value of BIC indicates the best solution, i.e., 5 factors and 4 groups. b) BIC weights are computed in probability space, with 1 indicating 100% probability of being the best model compared to the alternatives.
An oblique (Promax) rotation was applied to factor loadings (see Fig. 2) to improve their interpretation. The first factor (F1) loaded mainly on verbal memory tests, F2 on visuospatial abilities, F3 on working memory, while F4 mainly loaded on attention and executive functions, and F5 on language and praxis abilities.

Factor loadings of the latent components emerged in the FMA. The values were Promax rotated to improve interpretability. Columns from F1 to F5 correspond to the five-factor solution derived from FMA. Colored cells indicate the most important tests for each factor (i.e., loading values above a threshold of |0.2|). According to the highest loadings, each factor can be interpreted as follows. F1, verbal memory; F2, visuospatial abilities; F3, working memory; F4, attention and executive functions; F5, language and praxis abilities; Prose Memory (Imm.), immediate recall prose memory; Prose Memory (Del.), short-delayed recall prose memory; Int.Mem.10 s, interference memory (10 seconds); Int.Mem.30 s, interference memory (30 seconds); TMT-A, Trail Making Test A.
According to this factorial structure, the sample of AD patients was split into four clusters including 20% (45/230), 18% (42/230), 46% (106/230), and 16% (37/230) of patients, respectively (Fig. 3a), which loaded on different combinations of factors (Fig. 3b). A deeper look into the clusters’ cognitive profiles revealed that in profiles belonging to Cluster 1, memory difficulties were mainly accompanied by visuospatial deficits (F2); thus we called this cluster “Visuospatial AD”. In Cluster 2, patients showed the typical AD cognitive pattern, characterized by predominant memory deficits; for this reason this cluster can be labelled as “Typical AD”. On the other hand, Cluster 3 showed a less impaired memory performance, and thus can be called “Mild AD”. Finally, Cluster 4 was mainly explained by deficits in language and praxis abilities (F5), with relatively spared memory; thus this cluster could be labelled as “Nonamnestic AD” (Fig. 3c). Notably, clusters should not be considered as being associated with a single cognitive feature, but as patterns distinguishable from each other based on specific cognitive weaknesses. For a clearer clinical interpretation of the clusters’ cognitive profiles, summary statistics of cognitive scores are reported in Table 1.

Clusters (i.e., sub-phenotypes) characterization. a) Cluster size distribution. b) Cluster comparison across factors. F1, verbal memory; F2, visuospatial abilities; F3, working memory; F4, attention and executive functions; F5, language and praxis abilities. c) Mean normalized score obtained by each cluster in the different cognitive tests. Notably, the TMT-A score (time in seconds) was transformed into a velocity measure (i.e., 25 items/time) to be comparable with the other measures (i.e., higher values indicate better performance). Each score was z-scored on the HC sample (N = 326). Prose Memory (Imm.), immediate recall prose memory; Prose Memory (Del.), short-delayed recall prose memory; Int.Mem.10 s, interference memory (10 seconds); Int.Mem.30 s, interference memory (30 seconds); TMT-A, Trail Making Test A.
The table reports Mean, SD and 95% Confidence Interval (CI) of cognitive scores for each cluster of patients (z-scored on N = 326 HC). Prose Memory (Imm.), immediate recall prose memory; Prose Memory (Del.), short-delayed recall prose memory; Int.Mem.10 s, interference memory (10 seconds); Int.Mem.30 s, interference memory (30 seconds); TMT-A, Trail Making Test A.
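The two score transformations mentioned in the captions above (TMT-A time converted to a velocity, and z-scoring on the HC sample) can be sketched as follows. The function names and the toy values are illustrative, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the text does not specify it.

```python
from statistics import mean, stdev

def tmt_a_velocity(time_seconds, n_items=25):
    """Items completed per second, so that higher values mean better performance."""
    return n_items / time_seconds

def z_on_controls(patient_scores, control_scores):
    """Standardize patient scores using the control mean and SD."""
    mu = mean(control_scores)
    sd = stdev(control_scores)  # sample SD (n - 1); an assumption
    return [(x - mu) / sd for x in patient_scores]

# Made-up completion times in seconds (illustration only).
hc_times = [30.0, 40.0, 50.0]
ad_times = [100.0, 50.0]

hc_speed = [tmt_a_velocity(t) for t in hc_times]
ad_z = z_on_controls([tmt_a_velocity(t) for t in ad_times], hc_speed)
```

After the velocity transform, slower patients obtain negative z-scores, so TMT-A points in the same direction as the accuracy-based tests.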
Previous findings have shown that age, sex, and education might impact AD heterogeneity and drive diverging pathophysiologic paths across subtypes [38]. Thus, we checked whether these variables, as well as MMSE score, could explain our clusters by means of a logistic regression model. Significant main effects of sex (χ2 = 18.3, p < 0.001) and MMSE (χ2 = 13.9, p < 0.001) emerged. More specifically, between-clusters post-hoc comparisons suggested that the Nonamnestic AD patients (Cluster 4) were characterized by a better global cognition (MMSE score) than Visuospatial (Cluster 1; t[77.5] = –5.04, p < 0.001) and Typical AD patients (Cluster 2; t[72] = –3.9, p < 0.001), while Mild AD patients (Cluster 3) had significantly higher MMSE as compared to Visuospatial AD patients (Cluster 1; t[69.4] = –3.6; p = 0.003). All p-values were Bonferroni-corrected for multiple comparisons. Moreover, the proportion of females in Mild AD patients (Cluster 3) was significantly higher than in the other clusters (Cluster 1: χ2[1] = 18.6; Cluster 2: χ2[1] = 37; Cluster 4: χ2[1] = 29.5; all Bonferroni-corrected ps < 0.001). These results indicate that, to some extent, AD cognitive heterogeneity might reflect sex-related and global cognitive functioning differences (see Supplementary Figure 1).
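As a reminder of what the Bonferroni correction does to the p-values reported above, a minimal sketch follows. The p-values in the example are invented, and the number of comparisons in the family is an assumption of the example, since the text does not state how many comparisons were corrected for.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Invented raw p-values for a family of three comparisons.
adjusted = bonferroni([0.0005, 0.02, 0.4])
significant = [p < 0.05 for p in adjusted]
```

In this toy family only the first comparison stays below 0.05 after correction, which shows how the adjustment becomes stricter as the number of pairwise cluster comparisons grows.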
Precision medicine in clinical practice: cognitive sub-phenotypes are captured by few tests
The stepwise procedure (see Methods section) run on a logistic model for the discrimination of N = 230 AD versus N = 230 HC over 1000 iterations highlighted nine tests as the most critical for the diagnosis of AD (i.e., without distinguishing between sub-phenotypes; Fig. 4). This set of core tests included Digit span, Prose memory (delayed), TMT-A, Verbal fluency, Abstract thinking, Overlapping figures, Spontaneous drawing, Clock drawing test, and Praxis abilities.

Selection of tests for the classification between AD patients and healthy controls. A logistic regression model was built for 1000 iterations, each time on N = 230 AD and a random selection of N = 230 (out of 326) HC. A stepwise regression procedure was run for each iteration and the best predictors (i.e., tests) in the discrimination between AD and HC were highlighted. The tests which resulted as the best predictors in ≥95% of iterations were selected. Prose Memory (Imm.), Immediate recall prose memory; Prose Memory (Del.), Delayed recall prose memory (5 minutes delay); Int.Mem.10 s, Interference memory (10 seconds); Int.Mem.30 s, Interference memory (30 seconds); TMT-A, Trail Making Test A.
As a control analysis, we checked whether the diagnostic accuracy (i.e., AD versus HC) of the subset of tests was comparable to that of the whole battery by means of three machine-learning algorithms using a 10-fold cross-validation design. All algorithms showed a mean accuracy > 87% (i.e., 90.7%, 89.2%, 87.7%, respectively), and the difference in the classification performance between the subset of tests versus the whole battery was negligible (see Fig. 5a), indicating that using the selected 9 tests did not have a negative impact on diagnostic accuracy.
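The cross-validation scheme used for these accuracy estimates can be illustrated with a short, self-contained sketch. Note the substitution: the study used Random Forest, SVM, and Naïve Bayes through RWeka, whereas the example below uses a simple nearest-centroid classifier purely to keep the code dependency-free. The data and all function names are invented for illustration.

```python
import random

def k_fold_indices(n, k=10, seed=0):
    """Shuffle indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def nearest_centroid_fit(X, y):
    """Return the mean feature vector of each class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, v in enumerate(row):
            acc[j] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in acc] for c, acc in sums.items()}

def nearest_centroid_predict(centroids, row):
    """Assign the class whose centroid is closest in squared distance."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(row, centroids[c]))
    return min(centroids, key=dist)

def cross_val_accuracy(X, y, k=10, seed=0):
    """Mean held-out accuracy over k folds."""
    scores = []
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train = [i for i in range(len(X)) if i not in held_out]
        model = nearest_centroid_fit([X[i] for i in train],
                                     [y[i] for i in train])
        hits = sum(nearest_centroid_predict(model, X[i]) == y[i]
                   for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / len(scores)

# Invented, well-separated one-feature data (illustration only).
X = [[-2.0 + 0.01 * i] for i in range(20)] + [[2.0 + 0.01 * i] for i in range(20)]
y = ["AD"] * 20 + ["HC"] * 20
accuracy = cross_val_accuracy(X, y, k=10)
```

Each observation is held out exactly once, so the reported accuracy always refers to cases the classifier did not see during fitting. Repeating the whole procedure over random HC subsamples, as in the study, would wrap `cross_val_accuracy` in an outer loop.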

Classification of AD versus HC and identification of clusters (i.e., sub-phenotypes). a) Accuracy obtained by three machine-learning algorithms in the discrimination between AD and HC (error bars indicate SD computed across 1000 iterations; in each iteration the classification was performed between N = 230 AD and a random subsample of N = 230 HC from the whole HC sample of N = 326), both using the whole cognitive battery (blue line) and a subset of selected subtests (red line). This subset was selected by means of a recursive stepwise procedure across 1000 iterations (see Methods). b) Accuracy obtained by three machine-learning algorithms in the classification of the four cognitive sub-phenotypes that emerged from FMA. The classification was performed using the selected tests (red line). The blue line indicates the reference (overfitted) classification using the whole battery of tests. Despite the diminished accuracy using the subset of tests, all algorithms still showed a good classification performance, suggesting that the four clusters (i.e., sub-phenotypes) could also be identified by means of a few tests. RF, Random Forest; SVM, Support Vector Machine; NB, Naïve Bayes.
We then tested the accuracy of the full and the reduced set of tests in the classification of cognitive sub-phenotypes (i.e., clusters) using the same classification approach. Accuracy obtained using the whole battery versus the selected tests is shown in Fig. 5b. Importantly, ceiling accuracy (i.e., overfitting) was expected when using the whole battery, since the phenotypes (clusters) were derived from the same tests; thus the performance using the whole battery should be considered as a reference. Our focus was on the performance of the selected tests, which maintained a good classification accuracy (RF = 89.6%, SVM = 91.7%, NB = 90.4%) with a relatively small drop (7% on average) compared to the whole battery (see Supplementary Table 2 for further details on classification performance).
These results indicate that the selected tests can both identify critical core deficits for AD detection and capture cognitive sub-phenotypes (i.e., clusters). This suggests that a quick cognitive assessment based on a few tests might potentially be useful in clinical practice for identification and cognitive phenotypization of AD patients.
DISCUSSION
Precision medicine is a field of medicine which aims to optimize effectiveness of disease treatment (or prevention) by taking into account specific individual characteristics. In this study, we contribute to this approach by studying heterogeneity of cognitive profiles in a sample of patients with a clinical diagnosis of AD. We first aimed at investigating the presence of latent factors and how they combine to create clusters of patients with similar profiles (i.e., cognitive sub-phenotypes). Secondly, we aimed at supporting the implementation of this approach in the clinical routine by identifying a core set of cognitive tests able to characterize sub-phenotypes at the individual level.
We evaluated the cognitive heterogeneity in AD patients by means of FMA, a relatively novel approach that may overcome some of the limitations of dimensionality reduction and cluster-analysis techniques adopted in previous research on this topic. This approach suggested a model with five factors and four cognitive clusters as the most suitable for explaining our data. The latent factors were mainly grounded on memory (F1), visuospatial abilities (F2), working memory (F3), attention and executive functions (F4), and language and praxis abilities (F5). Along these dimensions, four cognitive sub-phenotypes emerged. The most represented (46% of patients) was called Mild AD (Cluster 3) since it was characterized by a mild general impairment. The Visuospatial AD cluster (Cluster 1) included 20% of patients, whose cognitive profile was mainly characterized by visuospatial deficits. Then, 18% of patients belonged to the Typical AD cluster (Cluster 2), which was characterized by a homogeneous cognitive profile with deficits primarily affecting memory performance. Finally, 16% of patients were labelled as Nonamnestic AD (Cluster 4) since they showed more deficits in language and praxis abilities, and relatively spared memory. Patients in the latter cluster also showed a higher MMSE score and a relatively younger age (although the age difference was not significant) compared to other clusters. The contrast between profiles characterized by memory versus non-memory deficits is consistent with recent studies [18] and confirms memory involvement as one of the main dimensions explaining interindividual cognitive variability in AD. Moreover, the Visuospatial AD cluster is consistent with recent findings [39] suggesting that such a profile may be explained by a predominant right temporoparietal pattern of brain atrophy and hypoperfusion [16].
Our findings are also consistent with a previous study [40] on heterogeneity in patterns of global cognitive measures (i.e., MMSE and Dementia Rating Scale) employing Latent Class Analysis (LCA). The application of FMA in our work allows some of the limitations of LCA to be overcome; moreover, we addressed heterogeneity of cognition in AD across many domains, thus providing a characterization of the clusters’ cognitive profiles.
To date, the literature on cognitive heterogeneity in AD has led to inconsistent results, with some studies agreeing on the existence of four clusters [16], while others suggest different solutions [17, 18]. This weak consensus might be explained by a lack of ground truth, e.g., out-of-sample validation of findings or relation between cognitive profiles and known neuroanatomical patterns.
A recent study on patients with mild to moderate AD reliably identified typical and atypical cognitive profiles describing 79.6% and 20% of patients, respectively [41]. Here, applying a similar approach we found similar results, but with a more fine-grained description of patients with atypical cognitive profile. For instance, we highlighted patients with characteristic visuospatial deficits which were not identified by Qiu et al.’s study since visuospatial measures were unavailable in the sample used for clusters’ identification. Furthermore, the results of the present study showed a substantial convergence with findings on neuroanatomical heterogeneity in AD and MCI patients (for a review, see [42]). For instance, a recent work by Dong and colleagues [43] analyzed MRI data of 314 AD and 530 MCI patients, and identified in both samples a four-dimensional categorization of neuroanatomical alterations, mainly characterized by: 1) a largely normal anatomy; 2) classical AD-like neuroanatomical pattern; 3) diffuse pattern of atrophy mainly involving parietal and dorsolateral regions with relatively spared medial temporal lobe (MTL); and 4) predominant involvement of MTL. Other studies on AD and prodromal AD patients have found neuroanatomical subtypes mainly characterized by right temporoparietal [39] or parieto-occipital [44] atrophy, clinically related to visuospatial difficulties.
Aside from the contribution of the present study to the controversial literature on cognitive subtypes in AD, the main take-home message of this work is that AD cognitive heterogeneity should be taken into account in the clinical routine. Many protocols of cognitive interventions on AD patients have proven their efficacy at the group level [45]. However, one of the main goals of neurocognitive assessment is to highlight cognitive strengths and weaknesses at the individual level, and design tailored cognitive trainings accordingly, with a positive impact on patients’ global functioning and quality of life. Moreover, some authors [46] have suggested that phase II pharmacological trials would benefit from taking into account finer neurocognitive descriptions of AD patients, since these features may dramatically change drug effect [3]. The identification of individual AD cognitive sub-phenotypes could to some extent improve accuracy and precision in the estimation of prognosis, with different clinical phenotypes being plausibly related to different neurobiological patterns [43]. Future investigations should also shed light on heterogeneity in early-onset AD patients, since pure AD pathology is more frequent in this population and comorbidities are more rarely present [18]. Taken together, the present findings highlight the necessity of further investigating the complex association between cognitive profiles and related neurobiological features, and could potentially lead to the development of finer-grained behavioral biomarkers of disease and disease progression, in a precision medicine perspective [13]. The approach adopted in this study is pivotal in clinical contexts, as it sheds light on the possibility of providing clinicians with a quick toolbox able to identify individual cognitive sub-phenotypes.
This was the main reason behind the second aim of our paper, i.e., to highlight the minimum set of cognitive tasks able to accurately identify AD patients, as well as their cognitive sub-phenotype. First, we identified the best set of tests for the discrimination between AD and HC by means of a recursive stepwise procedure (Digit span, Delayed prose memory, TMT-A, Verbal fluency, Abstract thinking, Overlapping figures, Spontaneous drawing, Clock drawing test, and Praxis abilities). This reduced cognitive battery allowed us to identify AD patients and their cognitive sub-phenotypes (i.e., clusters) with an accuracy of about 87%. This implies that a few critical cognitive tests could replace the administration of a full neuropsychological battery, not only for a diagnostic purpose, but also for a fine-grained description of AD cognitive profile. Indeed, our results showed that a subset of tests performed comparably to the whole cognitive battery, both in the discrimination of AD versus HC and in the identification of AD cognitive sub-phenotypes. A quicker assessment is more suitable for clinical practice since clinicians are required to evaluate a patient’s cognitive profile in short time windows. Moreover, the probability of measuring mental fatigue instead of proper cognitive deficits is reduced when fewer tests are employed.
A main limitation of the present study is the lack of biomarkers of AD and postmortem confirmation of pathology. Although in principle we cannot rule out the possibility of misdiagnosis (which would affect cognitive heterogeneity), diagnoses were made by expert physicians through careful application of standard clinical criteria; thus we do not believe that our results were driven by potential misdiagnoses. However, future studies will build on the present findings through the recruitment of a prospective sample of patients diagnosed with AD also by means of standard biomarkers.
A further limitation is the absence of an independent sample to test the generalization of our findings. In future studies we will recruit new patients and investigate the relation between cognitive sub-phenotypes and brain anatomical/functional patterns.
As a final remark, the use of data-driven models to study behavioral heterogeneity has some limitations, e.g., the possibility that results are not always generalizable beyond the data they are trained on [47]. In the present work, given our aims and taking into account the limited retrospective sample size, we decided to employ a data-driven method (i.e., FMA) to foster the interpretability of results. However, other valuable methods could be adopted, such as computational models (e.g., [2, 48]). We hope that the present study could be a starting point for the generation of hypotheses that could be tested in future studies applying computational models to larger datasets. We believe that the combination of data-driven and theory-driven approaches could boost the study of clinical heterogeneity in AD and other diseases.
In conclusion, our sample of AD patients was best described by four cognitive sub-phenotypes which could be detected even by means of a few tests, making this investigation suitable for clinical practice. The mapping of AD cognitive heterogeneity is important for two main reasons. First, it allows a more fine-grained description of the individual disease, which is desirable in a precision medicine framework. Second, it improves our understanding of AD pathology, by characterizing which features contribute more to the interindividual variability in the clinical manifestation of the disease.
DISCLOSURE STATEMENT
Authors’ disclosures available online (https://www.j-alz.com/manuscript-disclosures/21-0719r1).
