Abstract
Background:
Lipidomics may provide insight into biochemical processes driving Alzheimer’s disease (AD) pathogenesis and ensuing clinical trajectories.
Objective:
To identify a peripheral lipidomics signature associated with AD pathology and investigate its potential to predict clinical progression.
Methods:
We used Bayesian elastic net regression to select plasma lipid classes associated with the CSF pTau/Aβ42 ratio as a biomarker of AD pathology in preclinical and prodromal AD cases from the ADNI cohort. Consensus clustering of the selected lipid classes was used to identify lipidomic endophenotypes and study their association with clinical progression.
Results:
In the APOE4-adjusted model, ether-glycerophospholipids, lyso-glycerophospholipids, free-fatty acids, cholesterol esters, and complex sphingolipids were found to be associated with the CSF pTau/Aβ42 ratio. We found an optimal number of five lipidomic endophenotypes in the prodromal and preclinical cases, respectively. In the prodromal cases, these clusters differed with respect to the risk of clinical progression as measured by clinical dementia rating score conversion.
Conclusion:
Lipid alterations can be captured at the earliest phases of AD. A lipidomic signature in blood may provide a dynamic overview of an individual’s metabolic status and may support identifying different risks of clinical progression.
INTRODUCTION
Current diagnostic research criteria for the early detection of Alzheimer’s disease (AD) are based on disease-defining biomarkers of amyloidosis, tauopathy, and neurodegeneration [1]. These biomarkers, however, are not precise enough to predict individual clinical trajectories and risk of clinical conversion [2]. More recently, multi-omics approaches have been studied to account for the heterogeneity of clinical courses in AD and identify different clinico-pathological endophenotypes as a potential basis for personalized medicine [3, 4].
As one important example, lipidomics provides insight into metabolic endophenotypes that may modify the effect of AD pathology on neurodegeneration and clinical trajectories. Lipids are involved in many downstream processes of AD pathology, such as membrane remodeling, modulation of trans-membrane proteins, including amyloid-β protein precursor (AβPP) and its secretases, maintenance of blood-brain barrier function, myelination, cell signaling, and inflammation. In addition, they may even influence upstream events such as oxidative stress pathways and alterations of energy balance [5, 6]. Recent genetic studies supported the role of lipids in AD pathogenesis even beyond the apolipoprotein E ɛ4 allele (APOE4), which is considered the major genetic risk factor for late-onset sporadic AD (LOAD) [7]. Genome-wide association studies (GWAS) have identified associations between disease status and several genes involved in lipid homeostasis, such as CLU (clusterin), SORL1 (sortilin-related receptor 1), ABCA7 (ATP-binding cassette, sub-family A, member 7), and PLD3 (phospholipase D3) [7], in addition to the microglia-related PLCG2 (phospholipase C gamma 2) [8].
Our study used targeted lipidomics data from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) cohort to identify lipid alterations in the blood associated with a biomarker of AD pathology, namely the cerebrospinal fluid (CSF) pTau/Aβ42 ratio, in people with preclinical or prodromal AD. In a secondary exploratory analysis, we determined lipidomic endophenotypes within prodromal and preclinical cases, respectively, using a consensus clustering approach. We investigated whether these lipidomic endophenotypes contributed to predicting subsequent clinical progression, as determined by Clinical Dementia Rating (CDR) score conversion, in preclinical and prodromal AD cases.
MATERIALS AND METHODS
Cohort overview
This study used data provided by the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (http://adni.loni.usc.edu). ADNI is a large, multicenter, longitudinal study of older adults launched in 2003 by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, the Food and Drug Administration, private pharmaceutical companies, and non-profit organizations. The study was designed to acquire serial neuroimaging, clinical and neuropsychological assessments, and other biologic markers to monitor the progression of mild cognitive impairment (MCI) and early AD. A full description of the study protocols and analytical methods is provided at http://www.adni-info.org/.
The final cohort consisted of 529 participants from the ADNI cohort having a baseline diagnosis of either cognitively normal or mild cognitive impairment along with complete CSF biomarker, lipidomics, and body mass index (BMI) data. BMI values were sorted into three categories as follows: BMI_low (average weight): 18.5–24.9 or (underweight): < 18.5, BMI_medium (overweight): 25–29.9 and BMI_high (at least moderately obese): > 30. We further classified our participants into three diagnostic groups based on their CSF pTau/Aβ42 status, such that the cognitively normal (CN) group represents cognitively normal participants with CSF pTau/Aβ42 below the cut-off (0.025) [9]. Preclinical and prodromal groups had CSF pTau/Aβ42 above the optimized cut-off and an initial diagnosis of cognitively normal and MCI, respectively.
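The grouping rules above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the 'CN'/'MCI' labels are hypothetical, and the handling of boundary values (a BMI of exactly 30, or a ratio exactly equal to the cut-off) is an assumption, since the text does not specify it.

```python
from typing import Optional

PTAU_ABETA_CUTOFF = 0.025  # CSF pTau/Abeta42 cut-off stated in the text


def bmi_category(bmi: float) -> str:
    """Map a BMI value to the three categories described in the text."""
    if bmi < 25:
        return "BMI_low"      # underweight or average weight
    if bmi < 30:
        return "BMI_medium"   # overweight
    return "BMI_high"         # assumption: a BMI of exactly 30 is counted here


def diagnostic_group(baseline_dx: str, ptau_abeta_ratio: float) -> Optional[str]:
    """Return 'CN', 'preclinical', 'prodromal', or None.

    baseline_dx is assumed to be 'CN' or 'MCI' (hypothetical labels).
    """
    # Assumption: a ratio exactly at the cut-off is treated as negative.
    positive = ptau_abeta_ratio > PTAU_ABETA_CUTOFF
    if baseline_dx == "CN":
        return "preclinical" if positive else "CN"
    if baseline_dx == "MCI" and positive:
        return "prodromal"
    return None  # MCI below the cut-off is not one of the three groups


print(bmi_category(27.5), diagnostic_group("MCI", 0.04))
```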
APOE genotyping
At the baseline visit, blood samples were obtained from the participants, shipped to the central biomarker analysis lab at the University of Pennsylvania, and processed using an APOE genotyping kit, as further described in the ADNI general procedures manual (http://adni.loni.usc.edu/wp-content/uploads/2010/09/ADNI_GeneralProceduresManual.pdf). For subsequent analysis, we coded participants’ APOE genotype according to the presence of the ɛ4 allele as follows: 0, no ɛ4 allele; 1, one or two ɛ4 alleles.
CSF biomarkers measurements
CSF amyloid-β (1-42) (CSF Aβ42) and CSF Phospho-Tau (181P) (CSF pTau) were measured using the fully automated Roche Elecsys® immuno-assay platform at the UPenn/ADNI Biomarker Laboratory. CSF biomarkers Aβ42 and pTau/Aβ42 were binary classified based on the optimized cut-offs 977 pg/ml and 0.025, respectively. These cut-offs were determined on the ADNI cohort then validated against the visual reads of amyloid-β PET, as explained in [9].
Lipidomics data
Targeted lipidomics analysis was carried out on the plasma samples from ADNI participants using ultra-high-performance liquid chromatography coupled with chromatographic separation to characterize isomeric and isobaric lipid species. Mass spectrometry analysis was performed on an Agilent (6490 QQQ) mass spectrometer in positive ion mode with dynamic scheduled multiple reaction monitoring (MRM). The analysis was conducted following the lipidomics protocol developed by Kevin Huynh and Peter Meikle at the Metabolomics Laboratory of the Baker Heart and Diabetes Institute. A detailed description of their lipidomics platform is provided in the methodology file (ADNI_ADMCLIPIDOMICSMEIKLELABLONG_METHODS_20210121.pdf) and the respective articles [10, 11].
After applying the standard normalization and batch correction procedures, measurements from 692 lipid species were provided in the file (ADMCLIPIDOMICSMEIKLELABLONG.csv). All the lipid measurements were log10- and z-transformed before any analysis. The 692 lipid species were then merged into 107 composite scores defined through a hierarchical clustering approach that was applied within each of the lipid subclasses/classes.
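The transformation step can be illustrated with a short Python sketch. It is not the study's code: the function name is hypothetical, and the use of the sample standard deviation is an assumption, as the text does not state which estimator was used.

```python
import math
import statistics


def log10_z_transform(values):
    """Log10-transform positive measurements, then scale to mean 0 and SD 1."""
    logged = [math.log10(v) for v in values]
    mean = statistics.fmean(logged)
    sd = statistics.stdev(logged)  # assumption: sample standard deviation
    return [(x - mean) / sd for x in logged]


# Hypothetical concentrations spanning two orders of magnitude
print(log10_z_transform([1.0, 10.0, 100.0]))
```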
Statistical analysis
Selection of salient lipids associated with biomarkers of AD pathology
We used Bayesian elastic net regularized logistic regression to select lipid composite scores associated with the CSF pTau/Aβ42 ratio as a biomarker of AD pathology. Regularized logistic regression methods were developed to carry out simultaneous parameter estimation and variable selection [12, 13]. Elastic net offers an optimum regularization and variable selection, particularly in high-dimensional data settings, such as the current lipidomics data, where features are often highly collinear and their number may exceed the sample size [13, 14]. As one of the regularization approaches, the elastic net provides a reasonable compromise between the ridge (L2) and lasso (L1) penalties [13, 14]. It performs an effective feature selection via the lasso penalty while better handling correlated features via the ridge penalty [14, 15]. Adopting a Bayesian approach offers several advantages over classic elastic net regularized regression [12, 16]. First, Bayesian methods provide a straightforward statistical inference for the estimated coefficients through the posterior distributions and credibility intervals [12, 16]. Second, it allows for simultaneous estimation of both penalty parameters (L2 and L1) and model parameters [12, 16]. This is particularly important in controlling the double shrinkage problem (overly shrunken coefficient estimates), which arises from the sequential estimation of penalty parameters through cross-validation in the classic method. Additionally, Bayesian approaches have shown better variable selection in real data examples and simulation studies [12].
Before conducting the analysis, lipid composite scores were transformed into W-scores using regression models estimated on the control group. W-scores are analogous to Z-scores yet adjusted for particular covariates, namely age and sex [17]. An initial filtering step was carried out to include only the top 60% of lipid composite scores correlated with the CSF pTau/Aβ42 status in the regularized logistic regression models. Then, a Bayesian logistic regression model with elastic net regularization was fitted in the RStan interface. We adapted the scripts provided by Sara van Erp on GitHub (https://github.com/sara-vanerp/bayesreg), implementing elastic net priors in Bayesian regularized regression models using the Stan language [16]. A training dataset (80% of the whole cohort) was used for estimation of model parameters through Markov Chain Monte Carlo (MCMC) sampling (No-U-Turn Sampler (NUTS) algorithm). The resulting estimates were then used to predict the outcome in the test dataset (20% of the whole cohort). Lipid composite scores were selected based on the credible interval criterion, where a variable is excluded if the credibility interval covers 0. A credibility interval level of 50% was used as recommended in [12]. Salient lipid composite scores were those selected in more than 50% of the 100 cross-validation iterations. Three different models were calculated: 1) Reference model, using the demographic criteria (age and sex); 2) Lipid model, using lipid composite W-scores; and 3) Lipid model + APOE4, where participants’ APOE4 status was added as a covariate to the Lipid model.
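For orientation, one common way to write the classic elastic net objective and a corresponding Bayesian prior is sketched below. The symbols (β for the coefficients, ℓ for the logistic log-likelihood, λ1 and λ2 for the lasso and ridge penalty parameters, p for the number of features) are notation introduced here for illustration; the exact prior parameterization in the cited implementations may differ.

```latex
\hat{\beta} = \arg\min_{\beta} \Big\{ -\ell(\beta)
  + \lambda_1 \sum_{j=1}^{p} |\beta_j|
  + \lambda_2 \sum_{j=1}^{p} \beta_j^{2} \Big\},
\qquad
p(\beta \mid \lambda_1, \lambda_2) \propto
  \exp\Big( -\lambda_1 \sum_{j=1}^{p} |\beta_j|
            - \lambda_2 \sum_{j=1}^{p} \beta_j^{2} \Big)
```

In the classic setting, λ1 and λ2 are tuned by cross-validation before the coefficients are estimated. In the Bayesian setting, they can be given hyperpriors and sampled jointly with β, which is the simultaneous estimation referred to above.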
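Two of the steps above, the W-score transformation and the credible interval selection criterion, can be sketched as follows. This is a minimal sketch using NumPy, not the study's R/Stan code: the function names are hypothetical, and both the residual-SD scaling and the use of an equal-tailed interval are assumptions about details the text leaves open.

```python
import numpy as np


def w_scores(y, covariates, is_control):
    """Covariate-adjusted scores: residuals from a regression fitted on
    controls only, divided by the control-group residual SD (assumption)."""
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y)), np.asarray(covariates, dtype=float)])
    ctrl = np.asarray(is_control, dtype=bool)
    beta, *_ = np.linalg.lstsq(X[ctrl], y[ctrl], rcond=None)
    resid_ctrl = y[ctrl] - X[ctrl] @ beta
    dof = ctrl.sum() - X.shape[1]
    sd = np.sqrt(np.sum(resid_ctrl ** 2) / dof)
    return (y - X @ beta) / sd


def selected_by_credible_interval(draws, level=0.50):
    """Flag coefficients whose credible interval excludes 0.

    draws has shape (n_posterior_draws, n_features). An equal-tailed
    interval is assumed here.
    """
    draws = np.asarray(draws, dtype=float)
    tail = (1.0 - level) / 2.0
    lower = np.quantile(draws, tail, axis=0)
    upper = np.quantile(draws, 1.0 - tail, axis=0)
    return (lower > 0) | (upper < 0)
```

Repeating the selection over the cross-validation iterations and keeping the features flagged in more than half of them would give the set of salient scores described in the text.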
Prediction of clinical progression
Lipidomic endophenotypes based on consensus clustering. We applied hierarchical clustering to those lipid composite scores that had been found associated with the CSF pTau/Aβ42 ratio in the previous regularized regression analysis. The clustering was performed separately in the preclinical and prodromal subgroups. We employed a consensus clustering approach using data subsampling [18, 19], repeated 5,000 times to ensure the stability and robustness of clustering results. During each repetition, 80% of the data samples (participants) were randomly selected for agglomerative hierarchical clustering using Ward’s criterion to minimize the total within-cluster variance. A consensus matrix (cluster-based similarity matrix) was then constructed. Each element in the matrix is a number between 0 and 1 inclusive, representing the proportion of times that two samples (participants) were clustered together out of the times that the same samples were chosen together in the sub-sampling process. The final cluster assignment was then defined through the consensus function, the cluster-based similarity partitioning algorithm (CSPA), first introduced by Strehl and Ghosh and implemented in the diceR library [18]. CSPA is an efficient consensus function that re-clusters the data samples by applying hierarchical clustering to the constructed consensus matrix [18, 19]. Hence, the cluster labels are inferred at the hierarchy level of the optimal number of clusters (k), defined as described below.
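The subsampling scheme can be sketched with NumPy and SciPy as below. This is an illustrative sketch rather than the diceR implementation: the function names are hypothetical, the default of 200 repetitions is chosen only to keep the example fast, and the average-linkage re-clustering of 1 minus the consensus matrix is an assumed stand-in for the CSPA step.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def consensus_matrix(X, k, n_reps=200, frac=0.8, seed=0):
    """Proportion of subsamples in which each pair of rows shares a cluster,
    out of the subsamples in which both rows were drawn."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    together = np.zeros((n, n))
    sampled = np.zeros((n, n))
    for _ in range(n_reps):
        idx = rng.choice(n, size=int(round(frac * n)), replace=False)
        # Ward's criterion on the subsample, cut into k clusters
        labels = fcluster(linkage(X[idx], method="ward"), t=k, criterion="maxclust")
        same = labels[:, None] == labels[None, :]
        sampled[np.ix_(idx, idx)] += 1
        together[np.ix_(idx, idx)] += same
    with np.errstate(invalid="ignore", divide="ignore"):
        cons = np.where(sampled > 0, together / sampled, 0.0)
    np.fill_diagonal(cons, 1.0)
    return cons


def consensus_labels(cons, k):
    """Re-cluster participants using 1 - consensus as a distance
    (assumption: average linkage)."""
    dist = squareform(1.0 - cons, checks=False)
    return fcluster(linkage(dist, method="average"), t=k, criterion="maxclust")
```

In practice, the consensus matrix would be computed for each candidate k, and the final labels taken at the k chosen by the stability criteria.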
The optimal number of clusters was defined based on a composite score combining the proportion of ambiguous clustering (PAC) score and Dunn’s index estimated within the consensus clustering. PAC is a robust estimate of cluster stability, mainly when data samples are not independent [20], an intrinsic feature of omics data. PAC score is the fraction of sample pairs with consensus index values falling in the intermediate interval, i.e., PAC window. In a perfect clustering, the consensus matrix would consist of zeros or ones, and therefore the PAC score would be zero [20]. Thus, the lower the PAC score, the more stable and near perfect the clusters. We used a PAC window of (0.1,0.9) in our analysis.
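As a small illustration of the PAC definition, the sketch below counts the fraction of distinct participant pairs whose consensus index falls inside the window. The function name is hypothetical, and treating the window as an open interval is an assumption based on the notation (0.1,0.9).

```python
def pac_score(consensus, lower=0.1, upper=0.9):
    """Fraction of distinct sample pairs with a consensus index strictly
    between lower and upper (the PAC window)."""
    n = len(consensus)
    ambiguous = 0
    pairs = 0
    for i in range(n):
        for j in range(i + 1, n):
            pairs += 1
            if lower < consensus[i][j] < upper:
                ambiguous += 1
    return ambiguous / pairs


# A perfectly stable clustering contains only zeros and ones off the diagonal
perfect = [[1, 1, 0], [1, 1, 0], [0, 0, 1]]
print(pac_score(perfect))
```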
Conversely, Dunn’s index estimates clustering internal validity considering compactness and separation measures [21]. The larger the Dunn’s index, the better the inter-cluster separability and intra-cluster compactness. The composite score was computed as PAC score divided by Dunn’s index value; accordingly, the lower the composite score, the better the clustering.
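The composite criterion can be sketched as follows. Several variants of Dunn's index exist; the one shown (smallest between-cluster distance over largest within-cluster diameter, with Euclidean distances) is an assumption, as are the function names and the example values.

```python
import math


def dunn_index(points, labels):
    """Smallest between-cluster distance divided by the largest
    within-cluster diameter. Requires at least two clusters and at
    least one cluster with two distinct points."""
    min_between = math.inf
    max_within = 0.0
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])
            if labels[i] == labels[j]:
                max_within = max(max_within, d)
            else:
                min_between = min(min_between, d)
    return min_between / max_within


def composite_score(pac, dunn):
    """PAC divided by Dunn's index; lower values indicate better clustering."""
    return pac / dunn


# Hypothetical (PAC, Dunn) pairs for candidate numbers of clusters
candidates = {3: (0.30, 0.40), 4: (0.25, 0.45), 5: (0.10, 0.50)}
best_k = min(candidates, key=lambda k: composite_score(*candidates[k]))
print(best_k)
```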
Lipidomic endophenotypes and risk of CDR conversion. We assessed the potential of the defined lipidomic endophenotypes to predict Clinical Dementia Rating (CDR) score conversion from a value of 0 to 0.5 or from 0.5 to 1 or higher in the preclinical and prodromal sub-cohorts, respectively. Using Bayesian survival analysis, we estimated the risk of conversion over a follow-up period of six years (average follow-up = 4.15 ± 1.72 years) while accounting for censoring. We further explored the effect of several covariates, namely age, sex, BMI, APOE4, and years of education, on the estimated risk of conversion. Finally, Bayesian multivariate analysis (MANOVA) was conducted to reveal which lipid composite scores distinguished clusters at low versus high risk of clinical progression.
The whole analysis workflow is summarized in Fig. 1. All analyses were performed in R (version 3.6.3) using the following packages: RStan (version 2.21.2), RStanArm, brms, bayestestR, BayesFactor, pROC, diceR.

Overview of the data analysis workflow. This figure summarizes the analysis workflow adopted by this study as described in the Materials and Methods section. Panel A displays the preparation of the final cohort based on the defined inclusion criteria then the classification of the final diagnostic groups based on the CSF pTau/Aβ42 ratio. The statistical analysis is demonstrated in panels B and C. Panel B illustrates the selection of salient lipids associated with biomarkers of AD pathology through Bayesian elastic net regularized logistic regression models. Panel C explains the steps to predict clinical progression in the diagnostic groups, namely prodromal and preclinical. First, we defined clusters of participants having similar lipid profiles within each diagnostic group. Then we explored the defined clusters for the risk of conversion to MCI or dementia.
RESULTS
Demographic characteristics
A summary of the demographic characteristics of our final cohort is provided in Table 1. The diagnostic groups did not differ in age, sex, or education years. The distribution of BMI categories differed between groups; the preclinical group had the highest proportion of the BMI-low category. As expected, the APOE ɛ4 allele was more prevalent in the preclinical and prodromal groups (≥60%) compared with the normal control group (pTau/Aβ42 -ve) (18%). AD CSF biomarker levels (pTau and pTau/Aβ42) were higher in prodromal participants than in the preclinical group.
Overview of cohort demographics
Summary of the demographic characteristics of our cohort split into the final three diagnostic groups cognitively normal elderly
Selection of salient lipids associated with biomarkers of AD pathology
Bayesian elastic net regularized logistic regression models performance
Using only age and sex as predictors, the performance of the Reference model was not better than random prediction. The Lipid model improved the prediction accuracy. The cross-validated area under the receiver operating curves (CV-AUC), CV-Accuracy, CV-Sensitivity, and CV-Specificity at the optimum threshold were 0.65, 0.66, 0.68, and 0.61, respectively. However, the best performance was achieved by the Lipid + APOE4 model; the estimated CV-AUC, CV-Accuracy, CV-Sensitivity, and CV-Specificity increased to 0.76, 0.71, 0.69, and 0.77, respectively. Supplementary Table 1 provides an overview of all tested models.
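For readers less familiar with these metrics, the sketch below shows how sensitivity, specificity, and accuracy follow from predicted probabilities at a given threshold. The text does not define how the optimum threshold was chosen; maximising Youden's J is used here purely as an assumption, and the function names and example values are hypothetical.

```python
def classification_metrics(y_true, y_prob, threshold):
    """Return (sensitivity, specificity, accuracy), predicting 1 when
    the probability is at or above the threshold."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted = 1 if p >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 1:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / len(y_true)
    return sensitivity, specificity, accuracy


def youden_threshold(y_true, y_prob):
    """Candidate threshold maximising sensitivity + specificity - 1."""
    def j_statistic(t):
        sens, spec, _ = classification_metrics(y_true, y_prob, t)
        return sens + spec - 1
    return max(sorted(set(y_prob)), key=j_statistic)


labels = [0, 0, 1, 1]
probs = [0.1, 0.2, 0.7, 0.9]
print(youden_threshold(labels, probs))
```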
Identification of salient lipids
The Lipid + APOE4 model selected a set of twenty-eight lipid composite scores in at least 50% of cross-validation repetitions (Supplementary Table 2). Each feature’s relative importance and stability were determined by the median posterior β-coefficient and the frequency of selection across the cross-validations. According to these criteria, the lyso-glycerophospholipid (LPL), alkenyl-glycerophospholipid (plasmalogen), free fatty acid (FFA), cholesterol ester, and sphingolipid (complex ceramide) lipid classes/subclasses ranked at the top of the list. Both lyso-phosphatidylcholine (LPC_7: poly-unsaturated fatty acid (PUFA)) and lyso-alkyl-phosphatidylcholine (LPC_O_2: long-chain fatty acid (FA)) were positively associated with the CSF pTau/Aβ42 ratio. Similarly, phosphatidylcholine harboring arachidonic acid (PC_5: arachidonic acid (AA)) showed a positive association. Conversely, plasmalogens such as alkenyl-phosphatidylcholine (PC_P_5: docosahexaenoic acid (DHA), eicosapentaenoic acid (EPA) & PC_P_2: saturated and mono-unsaturated FA) and alkenyl-phosphatidylethanolamine (PE_P_5: AA, DHA) showed negative associations.
Except for AA (FA_3), free fatty acids (FA_1: saturated, mono-unsaturated, PUFA) were negatively associated with the AD biomarkers. Cholesterol esters (Chols_ester_3: PUFA & Chols_ester_2) and long-chain acyl-carnitines (AC_4: PUFA) were positively associated with AD biomarkers, while di-acylglycerol (DG_3: EPA, DHA) and alkyl-di-acylglycerol (TG_O_3) showed negative associations.
Complex ceramides including hexosyl-ceramides (hexCER_6 & hexCER_7), gangliosides (GM1), and sulfatides were found to be positively associated with AD biomarkers yet di-hydro-ceramides (dhCER_1), gangliosides (GM3_3: very long FA), and sphingomyelin (SM_3: very long FA) were negatively associated. Figure 2 displays the median posterior β-coefficients and their credibility intervals across the cross-validations, as estimated by the Lipid + APOE4 model. Lipid species, constituting each of the salient lipid composite scores, are listed in Supplementary Table 3.

Salient lipids associated with CSF pTau/Aβ42 ratio. We used Bayesian elastic net logistic regression (Lipid+APOE4 model) to select salient lipid composite scores associated with CSF pTau/Aβ42 ratio. Estimated posterior β-coefficients are represented as points with their respective 50% and 90% credibility intervals as thick and thin error bars, respectively. The points’ color codes for their corresponding lipid class. LPC_O_2: Lyso-alkyl-phosphatidylcholine (long/ very long FA), Choles_ester_3: Cholesteryl ester (PUFA), hexCER: Hexosyl-ceramide, FA_3: Free fatty acid (AA), PC_5: Phosphatidylcholine (AA), LPC_7: Lysophosphatidylcholine (PUFA), AC_4: Acylcarnitine (PUFA), GM1: GM1 gangliosides, Choles_ester_2: Cholesteryl ester, SULF_1: Sulfatides, LPE_1: Lyso-phosphatidylethanolamine (saturated FA), PI_1: Phosphatidylinositol (PUFA), LPI_3: Lyso-phosphatidylinositol (AA), GM3_3: GM3 gangliosides (very long FA), dhCER: Dihydroceramide, LPC_P_2: Lyso-alkenyl-phosphatidylcholine (long FA), SM_3: Sphingomyelin (very long saturated FA), PI_2: Phosphatidylinositol (saturated, monounsaturated FA), LPC_5: Lysophosphatidylcholine (long, very long FA), LPC_2: Lysophosphatidylcholine (odd numbered FA), TG_O_3: Alkyl-diacylglycerol, DG_3: diacylglycerol (EPA & DHA), PC_P_2: Alkenyl-phosphatidylcholine (saturated and mono-unsaturated FA), PE_P_5: Alkenyl-phosphatidylethanolamine (AA, DHA), PC_P_5: Alkenyl-phosphatidylcholine (DHA & EPA) and FA_1: Free fatty acid.
Prediction of clinical progression
Lipidomic endophenotypes based on consensus clustering
We conducted consensus clustering to identify lipidomic endophenotypes based on the set of lipid composite scores selected by the Lipid + APOE4 model.
In the prodromal sub-cohort, we determined the optimum number of clusters to be (k = 5), as demonstrated in Supplementary Figure 1. Of the prodromal participants, 28% fell into the cluster (I), 23% in the cluster (IV), 20% each in the clusters (II) and (V), and 9% in the cluster (III). Apart from the BMI categories distribution, there was no conclusive evidence for differences in age, sex, years of education, APOE4 status, or the CSF levels of AD biomarkers between the defined clusters (Supplementary Table 4).
Following the same approach, we determined k = 5 to be the optimal number of clusters for the preclinical sub-cohort, as shown in Supplementary Figure 2. Of these participants, 28% fell into cluster (I), while the rest were equally distributed over the remaining clusters. Details on the distribution of demographic characteristics, APOE4 genotype, and BMI categories can be found in Supplementary Table 5.
Lipidomic endophenotypes and risk of CDR conversion
We evaluated the risk of CDR conversion among prodromal sub-cohort clusters with and without adjusting for the effect of covariates as demonstrated in Supplementary Table 6. Cluster (IV) was chosen as the reference group since it exhibited a lower risk of CDR conversion. Moreover, cluster (IV) enclosed a relatively large proportion of participants. As shown in Fig. 3, the clusters (II) (HR = 1.97 (1.26–3.10)) and (V) (HR = 1.99 (1.30–3.00)) had an increased risk of conversion in the APOE4 adjusted model. To investigate whether these effects differed between sexes, we repeated the Bayesian survival models (APOE4 adjusted) in the male and female data subsets, respectively (Table 2). In men, the lipid profiles of clusters (II and V) showed an increased risk of conversion, whereas cluster (III) showed a decreased risk of conversion relative to the reference cluster (IV). In women, only cluster (II) had an increased risk of conversion.

Lipid endophenotypes predict clinical progression to dementia. We conducted a Bayesian survival analysis to estimate the risk of clinical progression to dementia among the pre-defined clusters of the prodromal sub-cohort. Clinical progression in the prodromal sub-cohort is defined as the conversion of the clinical dementia rating score (CDR) from a value of 0.5 to 1. Clusters (II and V) were found to have an approximately two-fold higher risk of progression to dementia compared with the reference cluster (IV).
Risk of clinical progression among prodromal lipidomic endophenotypes
Bayesian survival analysis was conducted to estimate the relative risk of progression to dementia among prodromal lipidomic endophenotypes while adjusting for APOE4. The APOE4-adjusted model was selected based on the sensitivity analysis provided in Supplementary Table 6, which investigated the relative risk of several covariates. We further replicated the same model on male and female subsets separately to explore sex-specific effects of lipidomic endophenotypes on clinical progression. Throughout the analysis, we set cluster (IV) as our reference group. Results were interpreted in terms of high-density intervals (HDI) of posterior distributions, where hazard ratios with an HDI not covering 1 were considered relevant and reported in red.
Finally, we conducted Bayesian multivariate analysis to identify differences in lipid composite scores between the reference cluster (IV) and the remaining clusters (Supplementary Table 7). Figure 4 shows the specific lipid profile for each cluster of the prodromal sub-cohort.

Heterogeneity of lipidomic endophenotypes among the prodromal sub-cohort. The specific lipid profile of each cluster is demonstrated on a heatmap in terms of average w-scores. On the color scale, red represents scores higher than expected in the age and sex-matched control group, and blue color represents lower scores. Bayesian multivariate analysis was conducted to identify lipid composite scores distinguishing clusters at higher risk of clinical progression from the reference group. Cluster (IV) was set as the reference group and marked by (Ref.). Clusters (II and V) were defined as groups at higher risk of progression and marked by (#). Asterisk (*) points to lipid scores that showed evidence of group differences. PC_5: Phosphatidylcholine (AA), PC_P_2: Alkenyl-phosphatidylcholine (saturated and mono-unsaturated FA), PC_P_5: Alkenyl-phosphatidylcholine (DHA & EPA), PE_P_5: Alkenyl-phosphatidylethanolamine (AA, DHA), PI_1: Phosphatidylinositol (PUFA), PI_2: Phosphatidylinositol (saturated, monounsaturated FA), LPC_2: Lysophosphatidylcholine (odd numbered FA), LPC_5: Lysophosphatidylcholine (long, very long FA), LPC_7: Lysophosphatidylcholine (PUFA), LPC_O_2: Lyso-alkyl-phosphatidylcholine (long/very long FA), LPC_P_2: Lyso-alkenyl-phosphatidylcholine (long FA), LPE_1: Lyso-phosphatidylethanolamine (saturated FA), LPI_3: Lyso-phosphatidylinositol (AA), dhCER: Dihydroceramide, hexCER: Hexosyl-ceramide, GM3_3: GM3 gangliosides (very long FA), GM1: GM1 gangliosides, SM_3: Sphingomyelin (very long saturated FA), SULF_1: Sulfatides, Choles_ester_2: Cholesteryl ester, Choles_ester_3: Cholesteryl ester (PUFA), DG_3: diacylglycerol (EPA & DHA), TG_O_3: Alkyl-diacylglycerol, FA_1: Free fatty acid, FA_3: Free fatty acid (AA) and AC_4: Acylcarnitine (PUFA).
In the preclinical sub-cohort, there was no evidence of a difference in risk of CDR conversion between the five clusters. Essentially identical results were obtained whether we adjusted or not for covariates.
DISCUSSION
We explored different lipid classes in preclinical and prodromal AD cases to analyze the relationship between lipid metabolism markers and biomarkers of amyloid and tau pathology, as well as clinical progression.
Our first goal was to determine associations between peripheral lipid alterations and pathology markers of AD in the CSF. Ether glycerophospholipids, particularly plasmalogens, showed lower levels in preclinical and prodromal AD participants compared with controls. Conversely, we found arachidonic acid-containing phosphatidylcholine, PUFA (omega-3) lyso-phosphatidylcholine, and lyso-alkyl-phosphatidylcholine with predominantly saturated/mono-unsaturated long-chain fatty acids to be increased. Low levels of plasmalogens have been frequently linked to AD pathology [22], whether measured in brain tissue [23–25], CSF [25], or plasma samples [26]. Depletion of grey matter plasmalogens (DHA and AA at sn-2) was found to be associated with disease progression and severity in AD patients [27–30]. A recent study by Lim et al. proposed that ether-lipid dysregulation may partly mediate the effect of two major AD risk factors, namely, age and APOE4.
Toledo et al. showed that higher baseline levels of long-chain and PUFA-containing alkyl phosphatidylcholines (PC ae 42:4, PC ae 44:4) correlated with abnormal levels of CSF Aβ42 in preclinical and prodromal AD participants of the ADNI cohort and predicted conversion from MCI to AD dementia [32]. In the current study, we observed that high levels of arachidonic acid-containing phosphatidylcholine and long-chain alkyl lyso-phosphatidylcholines (LPC-O) were associated with the CSF pTau/Aβ42 ratio. Results from both studies suggest an early role of arachidonated phosphatidylcholines, particularly long-chain alkyl isomers and their lyso derivatives, in AD pathogenesis, even in cognitively normal individuals with pathological levels of CSF AD biomarkers. These phosphatidylcholine species are known precursors of potent inflammatory mediators, including platelet-activating factor (PAF) and arachidonic acid. Additionally, they are highly abundant in platelets and immune cells [33, 34]. This points to a potential regulatory role in inflammation processes and would represent a possible link between inflammation and AD [32].
Complex ceramides, including glycosylated ceramides, GM1 gangliosides, and their precursors hexosyl-ceramides and sulfatides, showed higher levels in prodromal and preclinical AD participants, in contrast to di-hydro-ceramides, sphingomyelins, and GM3 gangliosides, which were decreased. Several studies suggested a shift in sphingolipid metabolism towards ceramide accumulation [35, 36] and depletion of sphingomyelins, particularly those with long-chain FA (C22, C24) [37, 38], and sulfated sphingolipids [35] early in the course of AD [39]. Ceramides, key bioactive molecules in sphingolipid metabolism, were suggested to contribute to the increased susceptibility of neurons and oligodendrocytes to apoptotic cell death [40]. This hypothesis was further supported by the elevated activity of enzymes involved in ceramide synthesis, namely sphingomyelinases and ceramidases, in brain tissue of AD cases [38]. Consistent with these findings, gene expression of sphingomyelinases and serine palmitoyl transferase enzymes was found to be upregulated in AD patients’ brain tissue [36, 39].
The second goal of our study was to identify distinct lipidomic endophenotypes and assess their association with clinical progression. Lipidomics endophenotyping offers a global mapping of the alterations in biochemical pathways [41]. These alterations may partly reflect underlying AD pathology. Additionally, these endophenotypes can capture complementary information related to an individual’s specific comorbidities and/or genomic characteristics that could partly explain the diversity observed in clinical trajectories within AD populations [3]. In the prodromal sub-cohort, the lipid profiles of clusters (II and V) were associated with a higher risk of clinical progression. In both clusters, we observed lower levels of PUFA (mainly AA) containing plasmalogens and phosphatidylcholines associated with a compensatory increase of plasmalogens, mainly alkenyl phosphatidylcholines, containing saturated and mono-unsaturated FAs. Higher levels of cholesterol esters, complex ceramides together with the depletion of long-chain sphingomyelins, and di-hydro-ceramides were also noted in clusters (II and V) participants. Cluster (III) lipidomic profile was associated with a lower risk of progression (CDR conversion) yet only in men. Cluster (III) constituted a group of prodromal participants with a higher prevalence of low BMI and a slightly higher proportion of APOE4 carriers compared with the reference cluster (IV).
Previous studies used logistic regression or machine learning algorithms to investigate the association of lipids with dementia risk in cognitively normal individuals [42–44] and people with MCI [32, 45]. Several studies have found higher levels of sphingomyelin, phosphatidylcholines, and lysophosphatidylcholine associated with conversion from MCI to AD/dementia [32, 47]. Conversely, Mapstone et al. [43] and Ma et al. [45] showed that lower baseline levels of phosphatidylcholines and lysophosphatidylcholine were significantly associated with accelerated cognitive decline [45] and risk of conversion to MCI/AD compared to cognitively stable participants [43].
In a different approach, Wood et al. [48] addressed heterogeneity in lipid alterations patterns within groups of MCI and AD cases. They defined subgroups within each diagnostic group according to their Mini-Mental State Examination score (low versus high). Based on the literature, they focused on two lipid classes, ethanolamine plasmalogens and diacylglycerols. MCI and AD cases had elevated levels of diacylglycerols and plasmalogens depletion compared with controls [48]. Low and high Mini-Mental State Examination MCI cases, however, showed no differences in both lipid classes [48]. In contrast to such a hypothesis-driven approach, here we explored the diversity of lipidomic endophenotypes within prodromal cases using an unsupervised clustering approach. Thus, our findings serve to generate rather than confirm hypotheses on the association of lipid profiles with the risk of conversion.
Recent evidence suggests that sex has an effect on the association of lipids with AD pathology and rates of cognitive decline [31, 50]. In our study, cluster (III) showed a decreased risk of conversion in men but not in women. This cluster had high levels of long-chain fatty acid lysophosphatidylcholines (both acyl and ether) and plasmalogens, together with low levels of acylcarnitines. Sex-specific remodeling of lipid metabolism has been suggested previously, with high levels of sphingomyelins and phosphatidylcholines reported in women [49, 50]. Conversely, lysophosphatidylcholine and ceramides were found at higher levels in men [49]. Thus, phospholipases may have higher activity in men and sphingomyelin synthase may have higher activity in women [49]. Consequently, we adjusted lipid scores for age and sex based on the control group in an attempt to control for the complex interaction of lipids with sex during different stages of AD. Although we started with a substantial number of cases, the sample sizes within the preclinical and prodromal sub-cohorts and their respective lipid endophenotype clusters were small, so it was not feasible to conduct the full analysis in a sex-stratified fashion, as recommended in [49, 50].
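One common way to implement an adjustment of this kind is to fit a linear model of each lipid score on age and sex in the control group only and then subtract the control-derived prediction from the scores of all participants. The Python sketch below assumes this residual approach; the helper `adjust_on_controls`, its arguments, the sex coding, and the toy data are hypothetical, and the exact procedure used in this study may differ.

```python
import numpy as np


def adjust_on_controls(score, age, sex, is_control):
    """Residualize a lipid score on age and sex.

    Coefficients are estimated in controls only and then applied to
    every participant (hypothetical helper for illustration).
    """
    design = np.column_stack([np.ones_like(age, dtype=float), age, sex])
    beta, *_ = np.linalg.lstsq(design[is_control], score[is_control], rcond=None)
    return score - design @ beta


# Toy data: four controls and two non-controls.
age = np.array([60.0, 65.0, 70.0, 75.0, 80.0, 72.0])
sex = np.array([0.0, 1.0, 0.0, 1.0, 1.0, 0.0])  # arbitrary 0/1 coding
is_control = np.array([True, True, True, True, False, False])
# Simulated score: linear in age and sex, plus an offset of 2.0 in non-controls.
score = 1.0 + 0.05 * age - 0.3 * sex + np.where(is_control, 0.0, 2.0)

adjusted = adjust_on_controls(score, age, sex, is_control)
```

Because the coefficients come from controls alone, the adjusted scores of the other participants retain any deviation that is not explained by normal age- and sex-related variation, which in this simulated example is the offset added to the non-controls.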
Lack of consistency across the results of metabolomics studies has been, and remains, a major limitation that hinders the inclusion of lipid markers in diagnostic biomarker panels for AD [50, 51]. This heterogeneity is related to many factors, among them variability in data processing procedures and analytical platforms [51], as well as study design, sample size, the distribution of relevant risk factors, and the statistical approaches used [50]. Another probable factor is the lack of strong effects, which contributes to inconsistent findings across studies. In our Bayesian regression models, we observed overall small contributions from individual lipid composite scores to the association with CSF biomarkers of AD pathology, as indicated by poor model performance and by small posterior coefficients with wide credible intervals. In addition, metabolomics data are inherently highly collinear. This could contribute to the high variance observed within the models and to the difficulty of assessing the relative importance of variables [52]. Taken together, the wide variance observed in metabolomics data limits their integration into the first line of the diagnostic workflow and renders them likely more useful for improving the accuracy of other prognostic markers [48].
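The effect of collinearity on coefficient uncertainty can be made concrete with the variance inflation factor (VIF). In an ordinary least-squares model with two predictors, the VIF of each predictor equals 1/(1 − r²), where r is their Pearson correlation. The Python sketch below uses simulated data, not study data, and the helper `vif_two_predictors` is hypothetical; regularized models such as the elastic net temper this inflation but do not remove the underlying ambiguity between correlated predictors.

```python
import numpy as np


def vif_two_predictors(x1, x2):
    """Variance inflation factor for a two-predictor linear model."""
    r = np.corrcoef(x1, x2)[0, 1]
    return 1.0 / (1.0 - r ** 2)


rng = np.random.default_rng(0)
x1 = rng.normal(size=500)
independent = rng.normal(size=500)               # unrelated to x1
collinear = x1 + 0.1 * rng.normal(size=500)      # nearly a copy of x1

vif_independent = vif_two_predictors(x1, independent)
vif_collinear = vif_two_predictors(x1, collinear)
```

A VIF near 1 means that the variance of a coefficient estimate is essentially unaffected by the other predictor, whereas a large VIF means that the same data support a wide range of coefficient values for the two correlated predictors.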
Several limitations of this study need to be acknowledged. Instead of using raw lipid scores, we used composite scores based on hierarchical clustering applied within each lipid class. Such an approach could have masked the effects of some individual lipid species. Our objective was to reduce data dimensionality and to mitigate the impact of multicollinearity, particularly on the estimation of regression coefficients and on model stability. At the same time, we wanted to maintain the representation of all investigated lipid subclasses/classes and to identify subsets of functionally similar lipid species. Finally, given the heterogeneity of lipidomics data, particularly in individuals with early AD, even larger cohorts are needed to identify endophenotypes robustly. In future analyses, we would like to tune and then validate our approach on a larger sample derived from multiple cohorts and particularly enriched with participants in the preclinical stage of AD.
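The dimensionality-reduction step described above can be sketched as follows: lipid species within one class are grouped by hierarchical clustering, and the standardized species of each group are averaged into one composite score. This is a minimal Python sketch on simulated data; the helper name `composite_scores`, the correlation distance, the average linkage, and the averaging rule are assumptions and not necessarily the settings used in this study.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist


def composite_scores(X, n_groups):
    """Return (scores, labels) for one lipid class.

    X is a samples x species matrix; hypothetical helper for illustration.
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Correlation distance (1 - r) between species, i.e. between columns.
    tree = linkage(pdist(Z.T, metric="correlation"), method="average")
    labels = fcluster(tree, t=n_groups, criterion="maxclust")
    # One composite per group: the mean of its z-scored species.
    scores = np.column_stack(
        [Z[:, labels == g].mean(axis=1) for g in np.unique(labels)]
    )
    return scores, labels


# Toy data: six simulated species driven by two latent factors.
rng = np.random.default_rng(0)
f1, f2 = rng.normal(size=(2, 200))
noise = 0.1 * rng.normal(size=(200, 6))
X = np.column_stack([f1, f1, f1, f2, f2, f2]) + noise

scores, labels = composite_scores(X, n_groups=2)
```

The trade-off noted above is visible in the sketch: the six correlated species collapse into two nearly uncorrelated composites, which stabilizes regression estimates, but any effect specific to a single species is averaged away.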
CONCLUSION
Our study shows that alterations in lipids, particularly those harboring poly-unsaturated fatty acids and ether bonds, can be captured at the earliest stages of AD. Lipidomic profiles provide an overview of an individual’s metabolic status while reflecting the balance within and between interacting biochemical pathways. Hence, identifying distinct lipidomic endophenotypes could help to characterize differences in AD risk and clinical trajectories. Refining and validating this approach could open a new avenue to adjuvant interventions that modulate lipid metabolic pathways and allow targeting of the subjects with the largest expected benefit.
ACKNOWLEDGMENTS
This study was supported by the Marie-Curie Innovative Training Network BBDiag (EU-Horizon 2020, Project ID: 721281).
Data collection and sharing for this project was funded by the Alzheimer’s Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904) and DOD ADNI (Department of Defense award number W81XWH-12-2-0012). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: AbbVie, Alzheimer’s Association; Alzheimer’s Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Cogstate; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd.; Janssen Alzheimer Immunotherapy Research & Development, LLC.; Johnson & Johnson Pharmaceutical Research & Development LLC.; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC.; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health. The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer’s Therapeutic Research Institute at the University of Southern California. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California.
