A Community-Based Study Identifying Metabolic Biomarkers of Mild Cognitive Impairment and Alzheimer’s Disease Using Artificial Intelligence and Machine Learning

Abstract

Background:

Currently, there is no objective, clinically available tool for the accurate diagnosis of Alzheimer’s disease (AD). There is a pressing need for a novel, minimally invasive, cost friendly, and easily accessible tool to diagnose AD, assess disease severity, and prognosticate course. Metabolomics is a promising tool for discovery of new, biologically, and clinically relevant biomarkers for AD detection and classification.

Objective:

Utilizing artificial intelligence and machine learning, we aim to assess whether a panel of metabolites as detected in plasma can be used as an objective and clinically feasible tool for the diagnosis of mild cognitive impairment (MCI) and AD.

Methods:

Using a community-based sample cohort acquired from different sites across the US, we adopted an approach combining Proton Nuclear Magnetic Resonance Spectroscopy (¹H NMR), Liquid Chromatography coupled with Mass Spectrometry (LC-MS) and various machine learning statistical approaches to identify a biomarker panel capable of identifying those patients with AD and MCI from healthy controls.

Results:

Of the 212 measured metabolites, 5 were identified as optimal to discriminate between controls, and individuals with MCI or AD. Our models performed with AUC values in the range of 0.72–0.76, with the sensitivity and specificity values ranging from 0.75–0.85 and 0.69–0.81, respectively. Univariate and pathway analysis identified lipid metabolism as the most perturbed biochemical pathway in MCI and AD.

Conclusion:

A comprehensive method of acquiring metabolomics data, coupled with machine learning techniques, has identified a strong panel of diagnostic biomarkers capable of identifying individuals with MCI and AD. Further, our data confirm what other groups have reported, that lipid metabolism is significantly perturbed in those individuals suffering with dementia. This work may provide additional insight into AD pathogenesis and encourage more in-depth analysis of the AD lipidome.

Keywords

Alzheimer’s disease machine learning metabolomics plasma markers targeted mass spectrometry

INTRODUCTION

Alzheimer’s disease (AD) is a clinically and patho-physiologically heterogeneous, complex neurodege-nerative disorder and is the most prevalent form of dementia [1]. The major historical neuropathologic characteristics include the accumulation of amyloid-β extracellular plaques and intracellular neurofibrillary tangles composed primarily of tau [2, 3]. Mild cognitive impairment (MCI) is the prodromal state of AD where objective cognitive deficits are present, but not of sufficient severity to impair functional capacity [4]. In some instances MCI is thought to be a transitional state between normal aging and AD; however, not all cases will convert to AD [5]. The conversion rate of MCI to AD is estimated to be ap-proximately 10% per year [6] and that number in-creases annually [7]. Current therapies for AD are typically initiated only after a clinical diagnosis of either MCI or AD is made. Their modest benefit may thus be partly explained by the fact that irreversi-ble brain pathology has already accumulated by the time the clinical features are recognized [8]. Increasing knowledge of the etiopathogenesis of AD holds promise for the development of new targeted therapies and biomarker-guided prevention strategies [9]. The development of valid and reliable biomarkers for AD will not only aid clinicians in recognizing the disease in its earliest symptomatic stages but will be especially important for effective disease-prevention or disease-modifying therapies when they become available [10]. Furthermore, reliable biomarkers might permit preclinical intervention or even prevention, before substantial neuropathological damage has occurred and before the onset of prodromal or manifest AD. Researchers are curr-ently developing novel treatments for AD that could potentially prevent neurodegeneration [11]. The dev-elopment of early biomarkers is a necessary first step to the design of such prevention and early-in-tervention trials.

Although substantial progress has been made in identifying positron emission tomography (PET) and cerebrospinal fluid (CSF) biomarkers for diagnosing AD, the impractical and invasive nature of these modalities make them a less attractive approach for most patients. Moreover, financial and logistical hur-dles limit the use of these markers as primary diagnostic tools [10, 12]. As tissue lesion, organ dysfunction and pathological states can cause alteration in both the chemical and protein composition of blood, most of today’s clinical laboratory tests are based on the analysis of the liquid component of blood, including serum and plasma [13]. Therefore, in the case of AD diagnosis, attention has turned toward the development of blood-based biomarkers [9, 10]. One of the major hurdles of measuring brain derived proteins associated with disease onset and progression in the blood has always been the blood-brain barrier (BBB) restricting their movement [9, 14]. Thus, blood-based markers may not entirely reflect the biochemical milieu of the CNS. Measuring small molecules that can flow freely across the BBB may provide a more comprehensive picture of CNS activity and provide an accurate readout of what is happening within the brain [9 , 15].

Metabolomics is a rapidly developing technology that is capable of quantifying the entire complement of metabolites in a given biological specimen, bio-fluid, or tissue [16] and promises great potential for early diagnosis, to monitor therapy, and for understanding the pathogenesis of many diseases [17]. We and others have shown the merit of metabolic pro-filing techniques for successfully characterizing neurodegenerative diseases [18 –25]. This technique has recently proven useful in identifying plasma bio-marker panels for differentiating preclinical AD subjects from controls [12 , 26–28], although these studies require further validation.

Plasma metabolomic analyses, therefore, represent a potential opportunity for the surveillance and early identification of individuals at risk of conversion to MCI and in turn to manifest AD, and thereby delineate a population for early therapeutic intervention [29 –31]. The complexity and heterogeneity of AD biospecimens make it necessary for optim-ized analyses to be carried out on large sample sizes. In this study, we present a combination of ¹H NMR and LC-MS based metabolomics to biochemically profile plasma harvested from patients at four sites across the US, including the Alzheimer’s Disease Research Centers (ADRC) at Emory and Rush, and samples obtained from the greater Rochester, NY and Irvine, CA communities, which is likely to have increased sample diversity in terms of ethnicity, educational level, and other relevant sociodemographic parameters. Using artificial intelligence and machine learning techniques on the two types of metabolomic data, our aim was to develop a diagnostic algorithm capable of accurately discriminating between cognitively intact controls, MCI, and AD participants.

MATERIAL AND METHODS

Samples

The study was conducted in the metabolomics division at the Beaumont Research Institute, MI, USA. A community-based sample cohort consisted of unrelated samples kindly provided by the ADRCs at Emory (Atlanta) and Rush (Chicago), and samples obtained from the greater Rochester, NY and Irvine, CA communities through the Rochester/Orange County Aging Study (R/OCAS). The sample cohort consisted of 77 AD patients, 71 MCI patients, and 101 cognitively healthy controls (HC). Peripheral blood was obtained following written informed consent for participation in the study and meeting study requirements [12, 26]. Study protocols and informed consent were approved by each respective Institutional Review Board (IRB). The combined sample set and proposed analysis was approved by the IRB at Beaumont Health (IRB # 2016-148).

The diagnosis of probable dementia due to AD was established according to the NINCDS-ADRDA criteria [32]. The diagnosis of MCI was determi-ned according to the Mayo Clinic criteria [33, 34]. Blood samples were obtained upon enrollment, along with the assessment of cognitive state for diagnostic determination. Cognitive assessments were varied across the sites with the Rush and Emory ADRCs utilizing the core ADRC cognitive protocols supplemented with other measures including for example, the Cambridge Cognitive Test (CAMCOG) [33] and the Mini-Mental State Examination (MMSE) [35]. Participants from the R/OCAS were grouped into diagnostic categories using a clinically relevant neuropsychological battery as reported in our prior work [12 , 36]. The primary measure of memory in this battery was the Rey Auditory Verbal Learning Test (RAVLT) [37]. In addition, critical parameters such as ethnicity, education and other sociodemographic factors for each patient was reported (Supplementary Table 1).

Sample preparation and data collection

For this investigation, blood collections have taken place at four notable AD research centers across the US. Sampling protocols have been previously reported [12, 26].

For NMR data collection, samples were prepared using a modified version of the method as described by Mercier et al. (2011) [38]. A total of 300μl of plasma were filtered through pre-washed (×7) 3.5 KDa filters (Amicon Micron YM-3; Sigma-Aldrich, St. Louis, MO) via centrifugation at 13,000 g, at 4°C for 30 min. To 228μl of the filtrate 28μl of D₂O and 24μl of 11.77 mM sodium 2,2-dimethyl-2-2silapentane-5-sulfonate (DSS (D6) in 50 mmol NaH₂PO₄ buffer (pH7) were added. Using a liquid handler system (Bruker Biospin, USA), 200μl of the mixture was transferred to a 3 mm NMR tube for analysis. All ¹H NMR spectra were acquired as previously described by Ravanbakhsh et al. (2015) [39] and Graham et al. (2016) [22]. Data collection took place on a Bruker Ascend III HD 600 MHz spectrometer (Bruker-Biospin, MA, USA) equipped with a 5 mm TCI cryo-probe at 300 K. Five hundred and twelve transients were acquired for each sample and chemical shifts (δ) are reported in parts per million (ppm). The singlet at 0.00 ppm produced by the methyl groups of the internal standard 4,4-dimethyl-4-silapentane-1-sulfonic acid (DSS-d6) was used for spectral referencing and quantification. All spectra were processed and analyzed using Chenomx NMR Suite (v8.1, Chenomx, Canada).

For targeted mass spectrometry, plasma samples were analyzed using a combination of direct injection mass spectrometry with a commercially available reverse-phase LC-MS/MS Kit (AbsoluteIDQ™ P180 Kit) as previously described [12 , 41]. Briefly, data were acquired on a Waters TQ-S mass spectrometer coupled to Acquity I-Class ultra-pressure liquid chromatography system. The samples were delivered to the mass spectrometer by a LC method followed by a direct injection (DI) method. MetIQ (Biocrates) was used to control the entire assay workflow, from sample registration to automated calculation of metabolite concentrations, to the export of data into other data analysis programs.

Statistical analysis

Univariate analysis

Using MetaboAnalyst (v 4.0) [42], a student’s t- test and Mann Whitney U test were performed for all pair-wise comparisons for both parametric and non-parametric distributions, respectively. To acco-unt for multiple comparisons false discovery rates (0.05 < fdr; q-values) were calculated. In order to find out whether sample demographics were statistically significantly different one-way Analysis of Variance analysis (ANOVA) and chi-square tests were conducted using IBM SPSS Statistics toolbox (ver 24.0).

Machine learning-based regression analysis

The machine learning models investigated in this study include Logistic Regression, Linear Discriminant Analysis, Quadratic Discriminant Analysis, Gaussian Naïve Bayes, Linear Support Vector Ma-chine (SVM), Gaussian kernel SVM, K Nearest Neighbors, Decision Trees, Random Forest, Gradient Boosted Machine, and XGBoost. Prior to examining the predictive performance of the various machine learning (ML) approaches, sum normalized, generalized log transformed (glog) and auto-scaled data from each individual group were subjected to explo-rative multivariate statistical data analysis by Principal Component Analysis (PCA) to check for potential outliers or systematic variation (p < 0.05). Further, a Partial Least Squares-Discriminant Analysis (PLS-DA) was undertaken to assess the variation associated with sample collection sites. A Recursive Feature Elimination (RFE) method using logistic regression was applied to each pair-wise comparison (HC versus MCI, MCI versus AD, and HC versus AD) to find the most discriminative variables [43 –45]. We used the top 5 discriminant variables to avoid overfitting. Following variable selection, numerous machine-learning models were built. To find the optimal com-bination for each algorithm, hyper-parameters such as cost values, kernel functions and the numbers of trees to train the data set were tuned. 10-fold cross validation was performed during model development. Once an optimal model was developed, prediction was carried out through 5-fold cross validation. Therefore, using the optimal hyperparameters, our models have essentially been trained on 4 folds and tested on the remaining fold. This was repeated 5 times across all datasets. In small datasets, one potential pitfall is that randomly splitting the data into train and test sets may bias the models giving over optimistic results. In order to prevent this, 5-fold cross validations were used for prediction.

The Synthetic Minority Over-Sampling Technique (SMOTE) method from the imbalanced-learn library in python and K-Nearest Neighbors (knn) were used to account for oversampling. Different number of neighbors and different number of data points for oversampling were tested to determine the best combination. Using the k-fold approach, only the training portion was oversampled, and the test set kept as original. Model performance was assessed by classification accuracy rate, by the area under a receiver operating characteristic (AUROC) curve, and by the identification of true discriminating features. Supplementary Figure 1 displays the schematic of the approach used to identify the best plasma biomarker panel for each pair-wise comparison.

Metabolite set enrichment analysis

Metabolite set enrichment analysis (MSEA) was undertaken using MetaboAnalyst (v4.0) [46]. Met-abolite names were converted to their respective Human Metabolite Database (HMDB) identifiers, and the raw data were imported in rows. The raw data were subsequently normalized to the sum, log transformed and auto scaled. The pathway-associated metabolite set was the chosen metabolite library, and all compounds in this library were used. Pathways with a Holm corrected p-value less than <0.002 were considered to be statistically, significantly different between the various pair-wise comparisons.

RESULTS

Using a combination of ¹H NMR and DI–LC–MS/MS we profiled plasma collected from four different sites across the United States, from participants with MCI or AD and compared this with plasma from healthy, cognitively intact controls (HC). A total of 46 and 183 metabolites were accurately identified and quantified in plasma using ¹H NMR and DI–LC–MS/MS, respectively. Of those recorded metabolites, 17 were detected across both platforms. Using these metabolite concentrations, three pair-wise univariate, and multi-variate statistical comparisons were carried out (HC versus MCI, MCI versus AD, and HC versus AD). Important sociodemographic factors for each participant have been reported in Supplementary Table 1. As evide-nced using the mean values of the respective MMSE score for each group and the p-values calculated using the Chi-square test (Supplementary Table 2), each sample group is well defined based on their cognitive scoring [47]. Supplementary Table 2 also reports the results of the multi-group comparisons of important demographic factors such as age and gender. The results of the one-way analysis of variance (ANOVA) and Chi-square test revealed that both gender and age were not statistically different between the groups. Supplementary Table 3 lists the results of the univariate analysis comparing the mean concentrations of all metabolites from HC versus MCI participants. As shown in the table, of the 212-recorded metabolites, 23 metabolites reached statistical significance (p < 0.002: q < 0.05). ‘Acetic acid’, ‘lysoPC a C16 : 1’, ‘C18 : 2’, ‘SM C24 : 1’ and ‘SM C24 : 0’ were identified by RFE to be the most discriminative between HC and MCI plasma extracts (Fig. 1A). In this particular comparison, 5 nearest neighbors with 200 oversampled points provided the best predictive outcome for all ML based models (Fig. 1B). The relative performance of 11 ML models used in this study were assessed by calculating a classification accuracy rate, AUROC, sensitivity and specificity values for each model. Based on these performance metrics the best three classifiers were found to be logistic regression, linear support vector machine (linear svm), and kernel support vector machine (kernel svm) (Fig. 1C), with AUC values = 78.3, 79.1, and 78.7, respectively. Following the ML based predictive model development using HC versus MCI data, the results of the pathway topology analysis identified amino sugar metabolism, alanine metabolism, and the glucose-alanine cycle to be the top three biochemical pathways that were significantly disrupted as a direct result of MCI (Supplementary Table 4).

Fig. 1

Performance evaluation metrics for each ML based model to include sensitivity, specificity, and AUC for the metabolite panels identified using the RFE variable selection algorithm for distinguishing HC from MCI.

Supplementary Table 5 lists the results of the univariate analysis comparing the mean concentrations of all metabolites from MCI versus AD participants. Of the 212 metabolites, 20 were found to have significantly different plasma concentrations (p < 0.002; q < 0.05). In this instance, RFE identified ‘Acetic acid’, ‘PC ae C36:0’, ‘SM C26:0’, ‘Acetone’, and ‘PC ae C36:4’ as the most important metabolites for separating MCI sufferers from AD patients (Fig. 2A). The best oversampled model for this data set was developed using 9 nearest neighbors with both groups oversampled 500 times (Fig. 2B). Using these settings among the various ML algorithms, logistic regres-sion analysis, linear discriminant analysis and linear svm analysis were found to be top 3 outperforming classifiers providing AUC values = 72.9, 72.1, and 73.1, respectively (Fig. 2C).

Fig. 2

Subsequent MSEA revealed that multiple biochemical pathways were significantly perturbed (p <0.002) in AD as compared to MCI. These include ketone body metabolism, fatty acid biosynthesis, cysteine and methionine metabolism, carnitine meta-bolism, mitochondrial β-oxidation of short-chain saturated fatty acid (SCSFA), mitochondrial β-oxid-ation of long-chain saturated fatty acid (LCSFA), and folate metabolism (Supplementary Table 6).

Supplementary Table 7 lists the results of the univariate analysis comparing the mean concentrations of all metabolites for HC versus AD participants. For this comparison, 12 of the recorded metabolites had statistically, significantly different concentrations (p < 0.002; q < 0.05) between HC and AD plasma. In this case, RFE identified ‘SM C16:0’, ‘PC ae C42:1’, ‘PC ae C40:2’, ‘PC ae C42:3’, and ‘PC ae C38:4’ to be the most important classification features (Fig. 3A). The data were oversampled using 3 nearest neighbors with both groups oversampled 500 times (Fig. 3B). In this scenario logistic regression, linear svm analysis and kernel svm analysis exhibited a significantly higher classification accuracy (70.2, 70.2, 70.2), and AUC values = 74.7, 74.7, 75.8, respectively.

Fig. 3

MSEA of the mean concentration data identified Tryptophan metabolism, glutamate metabolism, phospholipid metabolism, taurine and urea cycle, and sphingolipid metabolism to be significantly perturbed between HC and AD plasma extracts (p < 0.002; (Supplementary Table 8).

DISCUSSION

To the best of our knowledge, this is the first reported study to apply ¹H NMR and DI/LC-MS/ MS to biochemically profile plasma from AD and MCI patients with the aim of identifying diagnostic biomarkers of disease, in comparison to control subjects. As metabolites exhibit a wide range of chemical and physical properties, it is currently impossible to employ one single analytical platform to analyze the entire metabolome. Thus, combining quantitative NMR and MS-based metabolomics approaches will increase our coverage of the metabolome making it much more likely we will identify robust, clinically useful biomarkers of disease. Moreover, given the complex and heterogeneous nature of AD, combining metabolomics data obtained from various analytical platforms may better reflect potential etiologic mechanisms and provide improved insight into the underlying pathobiological processes [48]. An additional strength of this particular study is the multi-center approach, where community-based samples have been collected from across the US and are likely to reflect a higher phenotypic diversity and real-life scenarios (i.e., more applicable to the gen-eral population). To account for any potential technical differences in sample acquisition between the various institutes and prior to employing any supervised classification approaches, we employed PCA to each individual group to make sure that systematic (technical) variation was not overshadowing any biological variation. Our results show that variation due to sample collection was negligible as evidenced by the scores plots in Supplementary Figure 2A–C. No separation was apparent due to the samples originating from different sites. While PCA is generally satisfactory to show that no variation was evident between collection sites, we decided to run PLS-DA models and measured the explained variation by each latent variable (LV). The top LVs for each model explained between 2 and 3% of the variation which is comparable to technical noise (Supplementary Figure 3A–C). Statistical analysis for multi-group comparison of demographic information revealed no group or gender differences between each group.

Based on univariate statistics, substantial alterations in plasma levels of various metabolites were detected (p < 0.05). A notable finding of this targeted metabolomics study was distinct metabolites belonging to acylcarnitines (AC), phosphatidylcholines (PC), and sphingomyelin (SM) classes and how they are related to AD. In particular, they are present at much lower levels in AD as compared to HC, as also reported by several other groups [49, 50]. Knowing that approximately 20% of total glucose-derived energy is consumed by the brain, it is not surprising that as the increasingly, inefficient AD brain attempts to maintain energy homeostasis, it switches to an alternative source of energy in the form of lipids. Strikingly, we observed that lipid levels are also lower in MCI when compared to HC, and that they tend to rise in AD samples. This has also been reported by other groups [10 , 26] and we hypothesize that this drop-in lipid levels is directly related to the initial insult of AD pathogenesis in the brain. While these changes are in plasma, our group have also observed these changes in post-mortem human brain (data not presented) and we believe that the changes in plasma may occur in parallel to those observed in brain [23]. We speculate that during the prodromal phase of AD multiple metabolic pathways are perturbed and as a result, many metabolites are affected. After the emergence of clinical symptoms, the brain tries to maintain a resting level of specific metabolites and cannot reach initial values as observed in healthy, cognitively intact individuals due to significantly advancing hallmark pathology. We have previously reported a similar pattern in preclinical AD, where certain blood lipid levels are lower just prior to the emergence of clinical features and return to near normal levels later during the course of the disease [12].

Several ML approaches, such as knn [51], decision tree [52], logistic regression [25], and svm [53], have been routinely used by metabolomics investigators, but it is not quite clear which, if any, of these models are the best choice for the analysis of metabolomic data sets. Toward this end, we attempted to systematically evaluate several ML algorithms through classification accuracy rate, sensitivity and specificity, and the correct identification of the true discriminant features to build multi-biomarker models for clinical diagnosis and to predict AD and MCI based on the measurement of metabolite concentrations in plasma samples. The best three classifiers differentiating HC and MCI participants using the RFE based variable selection were logistic regression, linear svm, and kernel svm with classification accuracies of 72.1%, 70.9%, and 71.5%, respectively. In terms of AUC, the average AUC using these RFE features were 78.3, 79.1, and 78.7 which demonstrates the utility of this subset over all available metabolites. MCI differed from control specimens due to decreased levels of lysoPC a C16:1, C18:2, SM C24:0, and increased levels of acetic acid and SM C24:1 (Fig. 1A).

The same procedure was repeated with the top five metabolites identified using the RFE method, namely, acetic acid, PC ae C36:0, SM C26:0, acetone, and PC ae C36:4 for distinguishing MCI from AD (Fig. 2A). We used the concentration values of these metabolites to develop ML models which accurately distinguish MCI from AD participants (Fig. 2B). Of those models, AUC values of 72.9, 72.1, and 73.1 (95% CI) for logistic regression analysis, LDA, and linear svm analysis were acquired, respectively (Fig. 2C). AD differed from MCI samples due to increases in PC ae C36:4, SM C26:0, and PC ae C3:0 and decreases in acetic acid and acetone.

Figure 3A illustrates an optimal five-feature panel selected by RFE, and the corresponding performance metrics provided by each ML based approach utilizing this panel are presented in Fig. 3B. Linear discriminant analysis, quadratic discriminant analysis, and kernel svm models have similar AUC perfor-mance (74.7, 77.5, 74.7) and accuracy (70.8%, 72.5%, 70.2%) for AD diagnosis (Fig. 3C). The panel can simply be characterized with respect to concentration changes in PC and SM groups, respectively.

It is worth noting that both the logistic regression and linear SVM models performed equally well. Of note, each statistical algorithm follows different optimization procedures to reach optimal classification and predictive accuracy. Therefore, finding out the most compatible statistical approach with the metabolomics data in hand will help to maximize the predictive accuracy for a certain disease which is clinically vital.

Many studies have reported perturbations in amino acid levels in the blood and brain of AD patients and mouse models [54 –56]. In parallel with those studies, the results of our MSEA when comparing the plasma metabolic profile of HC and MCI participants identified significant changes in amino acid metabolism (Supplementary Table 4). Identifying amino acid changes in patients with MCI is important as these changes may be directly related to the onset of AD. When neurons fail to catabolize glucose efficiently, during AD progression, they may become reliant upon amino acid oxidation as an alternative source for energy production. If neuronal amino acids become depleted or if the machinery used to metabolize amino acids becomes dysregulated, the neurons may enter apoptosis, contributing to the developing pathophysiology. However, even if neuronal energy levels are maintained via amino acid oxidation, the increased amounts of ammonia released during amino acid catabolism could lead to neuronal cell death as the urea cycle would become saturated as a result [55, 57].

In order to provide insights into the pathogenic mechanism or biochemical perturbations associa-ted with MCI and AD, we performed enrichment analyses as previously mentioned (Supplementary Table 6). Overall, significant alterations in fatty acid-related metabolism (Fatty Acid Biosynthesis, Mitochondrial Beta-Oxidation of SCSFA, and Mitochondrial Beta-Oxidation of LCSFA), ketone body metabolism and folate metabolism were observed. The association between fatty acids and AD has previously reported and their protective effect against age-related cognitive decline and onset of AD have been shown [58]. Our data suggested that dysregulation of fatty acid related metabolism lead to changes in levels of carnitines, phosphocholines, and sphing-olipids. Carnitine performs a crucial role in the energy metabolism of the brain, controlling the influx of fatty acids into the mitochondria, facilitating the oxida-tion of pyruvate, and protecting the neurons from the accumulation of membrane-destabilizing acylCoA [58 –60]. Non-esterified LCFAs or their activated forms exert a large variety of harmful side-effects on mitochondria, such as enhancing the mitochondrial reactive oxygen species generation in distinct steps of the beta-oxidation and therefore potentially increasing oxidative stress [61]. Taken all together we believe that during the progression from MCI to AD, fatty acid metabolism is downregulated to avoid significant peroxisomal side-effects. It is also important to stress, that while not all cases of MCI will phenotypically convert to AD due to the heteroge-neity of the condition (likely to encompasses other emerging neurodegenerative conditions), it does highlight the clinical need to develop a test to identify those at greatest risk of converting from MCI to AD earlier. It is also important to note that all three comparative tests (HC versus MCI, HC versus AD, MC versus AD) resulted in three different sets of metabolites being selected, implicating slightly different metabolic pathways. This reinforces the emerging idea that the pathobiology is not a monotonic increase of a single process (proteopathy), but rather a dynamic set of metabolic processes which enter into and exit the pathobiological cascade at different time points and for different lengths of time.

While this study presents a novel approach, it is not free of limitations. Having access to all the soc-iodemographic information for all the samples may have improved our model performance and enabled us to highlight any correlations between these in-teresting factors and particular metabolic variables. Secondly, having access to imaging and CSF bio-markers according to the NIA-AA research framework for defining AD [62] would also have been extremely useful to compare/include with our meta-bolomics analyses. Unfortunately, due to the cost-prohibitive nature of these tests and the limited budget available for these combined studies, we do not have all this information for all samples. Finally, while our results are encouraging, further validation work is required.

Conclusion

In summary, our group, for the first-time, have combined quantitative, targeted metabolomics data acquired using ¹H NMR and DI-LC-MS/MS in-tegrated with several robust machine learning approaches to identify blood-based biomarkers for the detection of MCI and AD. We propose that perturbations in lipid metabolism may be integral to the evolution of AD neuropathology as well as to the eventual onset of AD. We believe that minimally invasive blood metabolite markers as presented herein could have future clinical utility for AD and MCI diagnosis. Based on these results, further work is warranted to validate our findings in a much larger cohort. Considering the fact that lipids are central for prediction and evaluation of AD neuropathology, intensive investigation of the entire brain and blood lipidome is warranted.

Footnotes

ACKNOWLEDGMENTS

We thank Emory Alzheimer’s Disease Research Center and Rush Alzheimer’s Disease Center for providing plasma samples. This work was partially funded by US National Institutes of Health grants R01AG030753, generous contributions made by the John and Marilyn Bishop Foundation and the Fred A. & Barbara M. Erb Foundation, and a Department of Defense grant (DOD W81XWH-09-1-0107) awarded to H.J.F.

Authors’ disclosures available online ().

The supplementary material is available in the electronic version of this article: .

References

Livingston

, Sommerlad

, Orgeta

, Costafreda

, Huntley

, Ames

, Ballard

, Banerjee

, Burns

, Cohen-Mansfield

, Cooper

, Fox

, Gitlin

, Howard

, Kales

, Larson

, Ritchie

, Rockwood

, Sampson

, Samus

, Schneider

, Selbaek

, Teri

, Mukadam

(2017) Dementia prevention, intervention, and care. Lancet 390, 2673–2734.

Holscher

(2005) Development of beta-amyloid-induced neurodegeneration in Alzheimer’s disease and novel neuroprotective strategies. Rev Neurosci 16, 181–212.

Belbin

, Brown

, Shi

, Medway

, Abrahams

, Passmore

, Mann

, Smith

, Holmes

, McGuiness

, Craig

, Warden

, Heun

, Kölsch

, Love

, Kalsheker

, Williams

, Owen

, Carrasquillo

, Younkin

, Morgan

, Kehoe

(2011) A multi-centre study of ACE and the risk of late-onset Alzheimer’s disease. J Alzheimers Dis 24, 587–597.

Albert

, DeKosky

, Dickson

, Dubois

, Feldman

, Fox

, Gamst

, Holtzman

, Jagust

, Petersen

, Snyder

, Carrillo

, Thies

, Phelps

(2011) The diagnosis of mild cognitive impairment due to Alzheimer’s disease: Recommendations from the National Institute on Aging-Alzheimer’s Association workgroups on diagnostic guidelines for Alzheimer’s disease. Alzheimers Dement 7, 270–279.

Burns

, Zaudig

(2002) Mild cognitive impairment in older people. Lancet 360, 1963–1965.

Mitchell

, Shiri-Feshki

(2009) Rate of progression of mild cognitive impairment to dementia–meta-analysis of 41 robust inception cohort studies. Acta Psychiatr Scand 119, 252–265.

Geslani

, Tierney

, Herrmann

, Szalai

(2005) Mild cognitive impairment: An operational definition and its conversion rate to Alzheimer’s disease. Dement Geriatr Cogn Disord 19, 383–389.

Graham

, Chevallier

, Elliott

, Hölscher

, Johnston

, McGuinness

, Kehoe

, Passmore

, Green

(2015) Untargeted metabolomic analysis of human plasma indicates differentially affected polyamine and L-arginine metabolism in mild cognitive impairment subjects converting to Alzheimer’s disease. PloS One 10, e0119452–e0119452.

Hampel

, O’Bryant

, Molinuevo

, Zetterberg

, Masters

, Lista

, Kiddle

, Batrla

, Blennow

(2018) Blood-based biomarkers for Alzheimer disease: Mapping the road to the clinic. Nat Rev Neurol 14, 639–652.

10.

Fiandaca

, Mapstone

, Cheema

, Federoff

(2014) The critical need for defining preclinical biomarkers in Alzheimer’s disease. Alzheimers Dement 10, S196–212.

11.

Cao

, Hou

, Ping

, Cai

(2018) Advances in developing novel therapeutic strategies for Alzheimer’s disease. Mol Neurodegener 13, 64.

12.

Mapstone

, Cheema

, Fiandaca

, Zhong

, Mhyre

, MacArthur

, Hall

, Fisher

, Peterson

, Haley

, Nazar

, Rich

, Berlau

, Peltz

, Tan

, Kawas

, Federoff

(2014) Plasma phospholipids identify antecedent memory impairment in older adults. Nat Med 20, 415.

13.

Psychogios

, Hau

, Peng

, Guo

, Mandal

, Bouatra

, Sinelnikov

, Krishnamurthy

, Eisner

, Gautam

, Young

, Xia

, Knox

, Dong

, Huang

, Hollander

, Pedersen

, Smith

, Bamforth

, Greiner

, McManus

, Newman

, Goodfriend

, Wishart

(2011) The human serum metabolome. PloS One 6, e16957–e16957.

14.

Montagne

, Barnes

, Sweeney

, Halliday

, Sagare

, Zhao

, Toga

, Jacobs

, Liu

, Amezcua

, Harrington

, Chui

, Law

, Zlokovic

(2015) Blood-brain barrier breakdown in the aging human hippocampus. Neuron 85, 296–302.

15.

Zipser

, Johanson

, Gonzalez

, Berzin

, Tavares

, Hulette

, Vitek

, Hovanesian

, Stopa

(2007) Microvascular injury and blood-brain barrier leakage in Alzheimer’s disease. Neurobiol Aging 28, 977–986.

16.

Wishart

(2011) Advances in metabolite identification. Bioanalysis 3, 1769–1782.

17.

Gowda

GAN

, Zhang

, Gu

, Asiago

, Shanaiah

, Raftery

(2008) Metabolomics-based methods for early disease diagnostics. Expert Rev Mol Diagn 8, 617–633.

18.

Kaddurah-Daouk

, Zhu

, Sharma

, Bogdanov

, Rozen

, Matson

, Oki

, Motsinger-Reif

, Churchill

, Lei

, Appleby

, Kling

, Trojanowski

, Doraiswamy

, Arnold

, Pharmacometabolomics Research Network (2013) Alterations in metabolic pathways and networks in Alzheimer’s disease. Transl Psychiatry 3, e244.

19.

Kaddurah-Daouk

, Rozen

, Matson

, Han

, Hulette

, Burke

, Doraiswamy

, Welsh-Bohmer

(2011) Metabolomic changes in autopsy-confirmed Alzheimer’s disease. Alzheimers Dement 7, 309–317.

20.

Han

, Rozen

, Boyle

, Hellegers

, Cheng

, Burke

, Welsh-Bohmer

, Doraiswamy

, Kaddurah-Daouk

(2011) Metabolomics in early Alzheimer’s disease: Identification of altered plasma sphingolipidome using shotgun lipidomics. PloS One 6, e21643–e21643.

21.

Motsinger-Reif

, Zhu

, Kling

, Matson

, Sharma

, Fiehn

, Reif

, Appleby

, Doraiswamy

, Trojanowski

, Kaddurah-Daouk

, Arnold

(2013) Comparing metabolomic and pathologic biomarkers alone and in combination for discriminating Alzheimer’s disease from normal cognitive aging. Acta Neuropathol Commun 1, 28–28.

22.

Graham

, Kumar

, Bjorndahl

, Han

, Yilmaz

, Sherman

, Bahado-Singh

, Wishart

, Mann

, Green

(2016) Metabolic signatures of Huntington’s disease (HD): (1)H NMR analysis of the polar metabolome in post-mortem human brain. Biochim Biophys Acta 1862, 1675–1684.

23.

Pan

, Nasaruddin

, Elliott

, McGuinness

, Passmore

, Kehoe

, Holscher

, McClean

, Graham

, Green

(2016) Alzheimer’s disease-like pathology has transient effects on the brain and blood metabolome. Neurobiol Aging 38, 151–163.

24.

Graham

, Turkoglu

, Kumar

, Yilmaz

, Bjorndahl

, Han

, Mandal

, Wishart

, Bahado-Singh

(2017) Targeted metabolic profiling of post-mortem brain from infants who died from sudden infant death syndrome. J Proteome Res 16, 2587–2596.

25.

Yilmaz

, Geddes

, Han

, Bahado-Singh

, Wilson

, Imam

, Maddens

, Graham

(2017) Diagnostic biomarkers of Alzheimer’s disease as identified in saliva using 1H NMR-based metabolomics. J Alzheimers Dis 58, 355–359.

26.

Fiandaca

, Zhong

, Cheema

, Orquiza

, Chidambaram

, Tan

, Gresenz

, FitzGerald

, Nalls

, Singleton

, Mapstone

, Federoff

(2015) Plasma 24-metabolite panel predicts preclinical transition to clinical stages of Alzheimer’s disease. Front Neurol 6, 237.

27.

Stamate

, Kim

, Proitsi

, Westwood

, Baird

, Nevado-Holgado

, Hye

, Bos

, Vos

SJB

, Vandenberghe

, Teunissen

, Kate

, Scheltens

, Gabel

, Meersmans

, Blin

, Richardson

, De Roeck

, Engelborghs

, Sleegers

, Bordet

, Ramit

, Kettunen

, Tsolaki

, Verhey

, Alcolea

, Lleo

, Peyratout

, Tainta

, Johannsen

, Freund-Levi

, Frolich

, Dobricic

, Frisoni

, Molinuevo

, Wallin

, Popp

, Martinez-Lage

, Bertram

, Blennow

, Zetterberg

, Streffer

, Visser

, Lovestone

, Legido-Quigley

(2019) A metabolite-based machine learning approach to diagnose Alzheimer-type dementia in blood: Results from the European Medical Information Framework for Alzheimer disease biomarker discovery cohort. Alzheimers Dement (N Y) 5, 933–938.

28.

Oeckl

, Otto

(2019) A review on MS-based blood biomarkers for Alzheimer’s disease. Neurol Ther 8, 113–127.

29.

Mapstone

, Gross

, Macciardi

, Cheema

, Petersen

, Head

, Handen

, Klunk

, Christian

, Silverman

, Lott

, Schupf

, Alzheimer’s Biomarkers Consortium–Down Syndrome (ABC-DS) Investigators (2020) Metabolic correlates of prevalent mild cognitive impairment and Alzheimer’s disease in adults with Down syndrome. Alzheimers Dement (Amst) 12, e12028.

30.

Weng

, Huang

, Tang

, Cheng

, Chen

(2019) The differences of serum metabolites between patients with early-stage Alzheimer’s disease and mild cognitive impairment. Front Neurol 10, 1223.

31.

van der Velpen

, Teav

, Gallart-Ayala

, Mehl

, Konz

, Clark

, Oikonomidi

, Peyratout

, Henry

, Delorenzi

, Ivanisevic

, Popp

(2019) Systemic and central nervous system metabolic alterations in Alzheimer’s disease. Alzheimers Res Ther 11, 93.

32.

McKhann

, Knopman

, Chertkow

, Hyman

, Jack

Jr. , Kawas

, Klunk

, Koroshetz

, Manly

, Mayeux

, Mohs

, Morris

, Rossor

, Scheltens

, Carrillo

, Thies

, Weintraub

, Phelps

(2011) The diagnosis of dementia due to Alzheimer’s disease: Recommendations from the National Institute on Aging-Alzheimer’s Association workgroups on diagnostic guidelines for Alzheimer’s disease. Alzheimers Dement 7, 263–269.

33.

Petersen

, Caracciolo

, Brayne

, Gauthier

, Jelic

, Fratiglioni

(2014) Mild cognitive impairment: A concept in evolution. J Intern Med 275, 214–228.

34.

Petersen

, Stevens

, Ganguli

, Tangalos

, Cummings

, DeKosky

(2001) Practice parameter: Early detection of dementia: Mild cognitive impairment (an evidence-based review). Report of the Quality Standards Subcommittee of the American Academy of Neurology. Neurology 56, 1133–1142.

35.

Folstein

, Folstein

, McHugh

(1975) “Mini-mental state”. A practical method for grading the cognitive state of patients for the clinician. J Psychiatr Res 12, 189–198.

36.

Mapstone

, Lin

, Nalls

, Cheema

, Singleton

, Fiandaca

, Federoff

(2017) What success can teach us about failure: The plasma metabolome of older adults with superior memory and lessons for Alzheimer’s disease. Neurobiol Aging 51, 148–155.

37.

Rey

(1964) L’examen clinique en psychologie, Presses universitaires de France, Paris.

38.

Mercier

, Lewis

, Chang

, Baker

, Wishart

(2011) Towards automatic metabolomic profiling of high-resolution one-dimensional proton NMR spectra. J Biomol NMR 49, 307–323.

39.

Ravanbakhsh

, Liu

, Bjordahl

, Mandal

, Grant

, Wilson

, Eisner

, Sinelnikov

, Hu

, Luchinat

, Greiner

, Wishart

(2015) Accurate, fully-automated NMR spectral profiling for metabolomics. PLoS One 10, e0124219.

40.

Bahado-Singh

, Poon

, Yilmaz

, Syngelaki

, Turkoglu

, Kumar

, Kirma

, Allos

, Accurti

, Li

, Zhao

, Graham

, Cool

, Nicolaides

(2017) Integrated proteomic and metabolomic prediction of term preeclampsia. Sci Rep 7, 16189.

41.

Graham

, Rey

, Yilmaz

, Kumar

, Madaj

, Maddens

, Bahado-Singh

, Becker

, Schulz

, Meyerdirk

, Steiner

, Ma

, Brundin

(2018) Biochemical profiling of the brain and blood metabolome in a mouse model of prodromal Parkinson’s disease reveals distinct metabolic profiles. J Proteome Res 17, 2460–2469.

42.

Xia

, Mandal

, Sinelnikov

, Broadhurst

, Wishart

(2012) MetaboAnalyst 2.0—a comprehensive server for metabolomic data analysis. Nucleic Acids Res 40, W127–W133.

43.

Richhariya

, Tanveer

, Rashid

(2020) Diagnosis of Alzheimer’s disease using universum support vector machine based recursive feature elimination (USVM-RFE). Biomed Signal Process Control 59, 101903.

44.

Lin

, Yang

, Zhou

, Yin

, Kong

, Xing

, Lu

, Jia

, Wang

, Xu

(2012) A support vector machine-recursive feature elimination feature selection method based on artificial contrast variables and mutual information. J Chromatogr B Analyt Technol Biomed Life Sci 910, 149–155.

45.

Guyon

, Weston

, Barnhill

, Vapnik

(2002) Gene selection for cancer classification using support vector machines. Mach Learn 46, 389–422.

46.

Chong

, Soufan

, Li

, Caraus

, Li

, Bourque

, Wishart

, Xia

(2018) MetaboAnalyst 4.0: Towards more transparent and integrative metabolomics analysis. Nucleic Acids Res 46, W486–W494.

47.

O’Bryant

, Humphreys

, Smith

, Ivnik

, Graff-Radford

, Petersen

, Lucas

(2008) Detecting dementia with the mini-mental state examination in highly educated individuals. Arch Neurol 65, 963–967.

48.

Boksa

(2013) A way forward for research on biomarkers for psychiatric disorders. J Psychiatry Neurosci 38, 75–77.

49.

Varma

, Oommen

, Varma

, Casanova

, An

, Andrews

(2018) Brain and blood metabolite signatures of pathology and progression in Alzheimer disease: A targeted metabolomics study. PLos Med 15, e1002482.

50.

Costa

, Joaquim

HPG

, Forlenza

, Gattaz

, Talib

(2019) Three plasma metabolites in elderly patients differentiate mild cognitive impairment and Alzheimer’s disease: A pilot study. Eur Arch Psychiatry Clin Neurosci 270, 483–488.

51.

Lee

, Styczynski

(2018) NS-kNN: A modified k-nearest neighbors approach for imputing metabolomics data. Metabolomics 14, 153.

52.

Hummel

, Strehmel

, Selbig

, Walther

, Kopka

(2010) Decision tree supported substructure prediction of metabolites from GC-MS profiles. Metabolomics 6, 322–333.

53.

Bahado-Singh

, Yilmaz

, Bisgin

(2019) Artificial intelligence and the analysis of multi-platform metabolomics data for the detection of intrauterine growth restriction. PLoS One 14, e0214121.

54.

Esposito

, Belli

, Toniolo

, Sancesario

, Bianconi

, Martorana

(2013) Amyloid beta, glutamate, excitotoxicity in Alzheimer’s disease: Are we on the right track?. CNS Neurosci Ther 19, 549–555.

55.

Griffin

JWD

, Bradshaw

(2017) Amyloid beta, glutamate, excitotoxicity in Alzheimer’s disease: Are we on the right track?. Oxid Med Cell Longev 2017, 5472792–5472792.

56.

Ibanez

, Simo

, Martin-Alvarez

, Kivipelto

, Winblad

, Cedazo-Minguez

, Cifuentes

(2012) Toward a predictive model of Alzheimer’s disease progression using capillary electrophoresis-mass spectrometry metabolomics. Anal Chem 84, 8532–8540.

57.

Fekkes

, van der Cammen

, van Loon

, Verschoor

, van Harskamp

, de Koning

, Schudel

, Pepplinkhuizen

(1998) Abnormal amino acid metabolism in patients with early stage Alzheimer dementia. J Neural Transm (Vienna) 105, 287–294.

58.

Cunnane

, Schneider

, Tangney

, Tremblay-Mercier

, Fortier

, Bennett

, Morris

(2012) Plasma and brain fatty acid profiles in mild cognitive impairment and Alzheimer’s disease. J Alzheimers Dis 29, 691–697.

59.

Nasaruddin

, Hölscher

, Kehoe

, Graham

, Green

(2016) Wide-ranging alterations in the brain fatty acid complement of subjects with late Alzheimer’s disease as detected by GC-MS. Am J Transl Res 8, 154–165.

60.

Wilson

, Binder

(1997) Free fatty acids stimulate the polymerization of tau and amyloid beta peptides. In vitro evidence for a common effector of pathogenesis in Alzheimer’s disease. Am J Pathol 150, 2181–2195.

61.

Schonfeld

, Reiser

(2017) Brain energy metabolism spurns fatty acids as fuel due to their inherent mitotoxicity and potential capacity to unleash neurodegeneration. Neurochem Int 109, 68–77.

62.

Jack

Jr. , Bennett

, Blennow

, Carrillo

, Dunn

, Haeberlein

, Holtzman

, Jagust

, Jessen

, Karlawish

, Liu

, Molinuevo

, Montine

, Phelps

, Rankin

, Rowe

, Scheltens

, Siemers

, Snyder

, Sperling

(2018) NIA-AA Research Framework: Toward a biological definition of Alzheimer’s disease. Alzheimers Dement 14, 535–562.