Abstract
Background:
Cognitive impairment is a clinical manifestation that occurs in the course of dementias such as Alzheimer’s disease. The association between cognitive impairment and gut microbiota is unclear.
Objective:
We aimed to identify gut microbiota characteristics and key gut microbiota biomarkers associated with cognitive impairment in a relatively large cohort of older adults in China.
Methods:
A total of 229 adults aged ≥60 years from Shenzhen, China were recruited into this cross-sectional study. Participants were divided into cognitive impairment (CI) and no cognitive impairment (NCI) groups according to the results of the Mini-Mental State Examination. Diversity analysis and network analysis were used to compare the gut microbiota of the two groups. The linear discriminant analysis effect size method and machine learning approaches were applied sequentially to identify gut microbiota biomarkers. The relationship between biomarkers and lifestyle factors was explored using transformation-based redundancy analysis (tb-RDA).
Results:
A total of 74 CI participants and 131 NCI participants were included in the analysis. The CI group demonstrated lower α-diversity compared to the NCI group (Shannon: 2.798 versus 3.152, p < 0.001). The density of the gut microbiota interaction network was lower in the CI group (0.074) compared to the NCI group (0.081). Megamonas, Blautia, Pseudomonas, Stenotrophomonas, and Veillonella were key biomarkers for CI. The tb-RDA revealed that increased fruit intake and exercise contribute to a higher abundance of Megamonas, Blautia, and Veillonella.
Conclusions:
We identified a significantly reduced abundance of certain beneficial gut microbiota in older Chinese adults with cognitive impairment.
INTRODUCTION
Cognitive impairment is a clinical manifestation that occurs in the course of dementias such as Alzheimer’s disease. 1 People with cognitive impairment experience difficulty in remembering, learning, concentrating, and decision-making, which adversely affects their daily lives. 2 The underlying causes of cognitive impairment are multifaceted, and the exact pathogenic mechanisms remain poorly understood.
Aging is a key contributor to cognitive impairment, and with a global aging population, the number of people with dementia is projected to reach 78 million by 2030. 3 The burden of dementia on countries and families is immense, with the projected global cost of dementia reaching US$2.8 trillion after correcting for care costs. 4 Several studies have shown that modifiable factors such as exercise, 5 social engagement, 6 and nutrition 7 are associated with cognitive impairment. Most modifiable factors also affect gut microbiota composition and diversity, directly or indirectly, by altering the gut environment, nutrition, and inflammation. 8, 9 Understanding gut microbiota characteristics may therefore provide insight into how modifiable factors contribute to cognitive impairment at a microscopic level.
The human gut microbiota is often referred to as the body’s ‘second genome’. 10 It plays a crucial role in shaping intestinal epithelial cells, improving intestinal integrity, modulating host immunity, harvesting energy, and protecting against pathogens. 11 In addition, bidirectional communication between the gut and the central nervous system (CNS) can be achieved through the microbiota-brain-gut axis, with potential effects on CNS disorders. 12 Previous studies have investigated the association between gut microbiota and cognitive impairment, 13–15 but most were animal studies, with only a limited number in humans. 15, 16 Moreover, most of these studies had small sample sizes, averaging around 40 cases per group, which leads to relatively unstable and less representative results. Findings on the diversity of the gut microbiota and on the direction of change in specific microbiota were also inconsistent across studies.
In this paper, we aimed to identify gut microbiota characteristics and key gut microbiota biomarkers associated with cognitive impairment in a relatively large sample of older adults in China. We also explored the effect of modifiable factors such as diet, sleep, and exercise on the key gut microbiota biomarkers we identified. The study findings may provide new insights into the mechanisms of cognitive impairment and inform new intervention strategies to prevent and ameliorate cognitive impairment.
MATERIALS AND METHODS
Research participants
In a cross-sectional study conducted between 24 August 2020 and 5 July 2021, we recruited 229 participants, mainly from community activity centers for older adults in Longhua District, Shenzhen, China. Our inclusion criteria were: 1) age ≥60 years; 2) willingness to participate in this study. Our exclusion criteria were: 1) a history of taking antibiotics at least two months before sampling; 2) a history of taking probiotics or prebiotics in the past month; 3) a history of taking gastrointestinal motility drugs or other drugs that affect gastrointestinal motility in the past month; 4) a history of gastrointestinal operations in the past five years (except cholecystectomy and appendectomy); 5) any infectious disease or cancer; 6) inability to cooperate with a cognitive evaluation due to aphasia, deafness, blindness, or other physical diseases. The cognitive function of all participants was evaluated using the Mini-Mental State Examination (MMSE, Chinese version) in a face-to-face interview. We classified participants into two groups: cognitive impairment (CI) and no cognitive impairment (NCI). As in previous studies, 17–19 cognitive impairment was defined using education-specific MMSE score criteria (illiteracy ≤17 points, primary school ≤20 points, junior high school and above ≤24 points). 20
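The education-specific cutoffs above amount to a simple lookup. The following Python sketch is illustrative only: the function name and the education labels are assumptions made for this example and are not taken from the study's analysis code.

```python
# Education-specific MMSE cutoffs as stated in the text (score <= cutoff => CI).
# The dictionary keys are hypothetical labels chosen for this illustration.
MMSE_CUTOFFS = {
    "illiterate": 17,
    "primary_school": 20,
    "junior_high_or_above": 24,
}


def classify_cognition(mmse_score: int, education: str) -> str:
    """Return "CI" or "NCI" for one participant."""
    if not 0 <= mmse_score <= 30:
        raise ValueError("MMSE scores range from 0 to 30")
    cutoff = MMSE_CUTOFFS[education]
    return "CI" if mmse_score <= cutoff else "NCI"


print(classify_cognition(20, "primary_school"))  # at the cutoff
print(classify_cognition(21, "primary_school"))  # one point above the cutoff
```

Because the cutoff is inclusive, a participant with primary school education and a score of exactly 20 falls in the CI group, while a score of 21 falls in the NCI group.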
Ethics approval and consent to participate
The study was conducted in accordance with the Declaration of Helsinki and was approved by the Biomedical ethics committee of the Medical Department of Xi’an Jiaotong University (Approval Number: 2020-10). Written informed consent was obtained from each of the participants or family caregivers.
Collection of demographic, medical history, lifestyle, and MMSE data
We conducted standardized training for study investigators. Study investigators first explained the research purpose to potential participants and began the questionnaire investigation after obtaining informed consent. For participants who were unable to complete the survey independently, family members or primary caregivers helped to complete the answers. The collected data were entered using Wenjuanxing (www.wjx.cn 21 ), a widely used online questionnaire platform in China, and then exported in CSV format for analysis.
We collected demographic, medical history, lifestyle, and MMSE data through a face-to-face interview. The general information included: 1) demographic characteristics such as age, sex, educational level, and marital status; 2) history of previous diseases such as hypertension and diabetes; 3) lifestyle factors such as exercise status, sleep duration, and diet status. Cognitive function was evaluated with the MMSE across five dimensions: orientation, registration, attention and calculation, recall, and language (details of the questionnaire are given in Supplementary Material 1).
Collection and analysis of oral and stool samples
Oral epithelial cell samples were collected at the end of the questionnaire process. The collected samples were immediately sent to Chi Biotech Co., Ltd. (Shenzhen, China) and stored at –20°C before DNA extraction. After extracting DNA from the samples, rs429358 and rs7412 of APOE were detected, generating APOE haplotypes.
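How the two SNPs map to APOE alleles can be sketched as follows. This is a minimal illustration, not the laboratory's pipeline: it assumes forward-strand genotype calls and the commonly used allele definitions (ε2 = T at rs429358 with T at rs7412, ε3 = T with C, ε4 = C with C), and it ignores the rare ε1 haplotype. The function name is hypothetical.

```python
def apoe_genotype(rs429358: str, rs7412: str) -> str:
    """Infer an APOE genotype such as "e3/e4" from two unphased SNP genotypes.

    Each argument is a two-letter genotype, e.g. "TC".
    Assumed allele definitions (rs429358 allele / rs7412 allele):
    e2 = T/T, e3 = T/C, e4 = C/C. The rare e1 haplotype (C/T) is ignored,
    so a double heterozygote is reported as e2/e4.
    """
    a, b = rs429358.upper(), rs7412.upper()
    if len(a) != 2 or len(b) != 2 or set(a) - set("CT") or set(b) - set("CT"):
        raise ValueError("genotypes must be two letters drawn from C and T")
    n_e4 = a.count("C")  # each C at rs429358 marks one e4 allele
    n_e2 = b.count("T")  # each T at rs7412 marks one e2 allele
    if n_e2 + n_e4 > 2:
        raise ValueError("combination implies the rare e1 haplotype")
    n_e3 = 2 - n_e2 - n_e4
    return "/".join(["e2"] * n_e2 + ["e3"] * n_e3 + ["e4"] * n_e4)


def is_e4_carrier(genotype: str) -> bool:
    """True if the genotype string contains at least one e4 allele."""
    return "e4" in genotype.split("/")


print(apoe_genotype("TC", "CC"))
```

Under these assumptions, a participant heterozygous at rs429358 and homozygous C at rs7412 is reported as e3/e4 and counted as an ε4 carrier.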
Stool samples were collected from participants using stool collection tubes containing a preservation solution (Hcy Technology Co., Ltd., Shenzhen, China). The collected samples were immediately sent to Chi Biotech Co., Ltd. (Shenzhen, China) and stored at –80°C until DNA extraction. After extraction of microbial genomic DNA, the V4 region of the 16S ribosomal RNA (rRNA) gene was amplified and sequenced using the Illumina® Novaseq platform. See Supplementary Material 2 for details.
Bioinformatic analysis
The 16S rRNA gene sequence data were first demultiplexed, and barcodes and PCR amplification primers were trimmed, using bcl2fastq (version 2.20). The sequence data were then analyzed using VSEARCH (version 2.14.2), 22 and an amplicon sequence variant (ASV) abundance table was generated by denoising, correcting errors, merging paired-end reads, and removing chimeras. The taxonomy of the representative sequence of each ASV was annotated with the naive Bayesian classifier of the Ribosomal Database Project (RDP) in QIIME 1, 23 with Greengenes 13_8 (99% OTUs dataset) as the reference database. The taxonomy table was then exported and used for downstream analysis with R software (version 4.1.1).
Chord diagrams were employed to illustrate microbiota composition at the phylum and genus levels. All subsequent analyses were primarily based on genus-level microbiota data. The α-diversity was used to measure species diversity (richness) within a sample, while β-diversity was used to describe species differentiation between samples. 24 The vegan package (version 2.5-7) was used to calculate α-diversity indices (Chao1, ACE, Shannon, and Simpson) and β-diversity (Bray-Curtis and Jaccard dissimilarity). The β-diversity dissimilarities were visualized using principal coordinate analysis (PCoA). Correlation analysis between gut microbiota was performed using the sparse compositional correlation (SparCC) algorithm implemented in FastSpar (version 1.0.0) 25 and presented in a network diagram. Differential features of the gut microbiota between the two groups were identified using the linear discriminant analysis (LDA) effect size (LEfSe) method (http://huttenhower.sph.harvard.edu/galaxy/ 26 ), with the threshold set at an LDA score > 3 in this study. See Supplementary Material 2 for details.
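To make the diversity measures concrete, the sketch below computes the Shannon and Simpson indices and the Bray-Curtis and Jaccard dissimilarities for toy count vectors. It is a plain-Python illustration rather than the vegan implementation used in the study. Two choices are assumptions: the Simpson index is reported as 1 − Σp², and the Jaccard dissimilarity is computed on presence/absence (the text does not state which Jaccard variant was used). The sample vectors are made up.

```python
import math


def shannon(counts):
    """Shannon index H = -sum(p * ln p) over taxa with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson(counts):
    """Simpson index reported as 1 - sum(p^2) (an assumed convention)."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def bray_curtis(x, y):
    """Bray-Curtis dissimilarity: sum|x_i - y_i| / sum(x_i + y_i)."""
    return sum(abs(a - b) for a, b in zip(x, y)) / (sum(x) + sum(y))


def jaccard(x, y):
    """Presence/absence Jaccard dissimilarity: 1 - |shared| / |union|."""
    present_x = {i for i, a in enumerate(x) if a > 0}
    present_y = {i for i, b in enumerate(y) if b > 0}
    return 1.0 - len(present_x & present_y) / len(present_x | present_y)


# Hypothetical genus counts for two samples (not study data).
sample_a = [10, 10, 10, 10]  # even community
sample_b = [37, 1, 1, 1]     # dominated by one genus

print(shannon(sample_a), shannon(sample_b))
print(bray_curtis(sample_a, sample_b), jaccard(sample_a, sample_b))
```

The even community has the higher Shannon index, which is the sense in which a lower index indicates a less rich or less evenly distributed microbiota. The two toy samples share all four genera, so their presence/absence Jaccard dissimilarity is zero even though their Bray-Curtis dissimilarity is not.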
Statistical analysis
Continuous variables were presented as medians with interquartile ranges. Categorical variables were presented as frequencies with percentages (%). Groups were compared using a t-test or Mann-Whitney test for continuous variables and the χ2 or Fisher’s test for categorical data. The chisq.posthoc.test package (version 0.1.2) was also used for multicategorical variables. Permutational multivariate analysis of variance (PERMANOVA) was used to test for differences in β-diversity dissimilarity. Least absolute shrinkage and selection operator (Lasso) regression was used to select the final biomarkers from the differential genus-level microbiota obtained by LEfSe analysis. We then used Multivariable Logistic Regression (MLR), Random Forest (RF), Gradient Boosting Machine (GBM), and Extreme Gradient Boosting (XGBoost) with 5-fold cross-validation to establish models for identifying cognitive impairment based on the biomarkers. The receiver operating characteristic (ROC) curve was drawn and the average area under the ROC curve (AUC) was calculated to assess the performance of the models. Spearman correlation was used to explore the relationship between the biomarkers and the extent of cognitive impairment evaluated with the MMSE. Transformation-based redundancy analysis (tb-RDA), a commonly used method for determining the influence of environmental factors on microbial communities, 27–29 was applied to identify lifestyle factors significantly associated with the relative abundance of microbiota biomarkers. A permutation test was then used to verify the significance of the tb-RDA model. As a sensitivity analysis, we generated a new dataset using propensity score matching and repeated the analysis steps described above. See Supplementary Material 2 for other details. Missing values were handled by multiple imputation using the mice package (version 3.14.0). A p-value < 0.05 was considered statistically significant for all tests.
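The "transformation" in tb-RDA refers to transforming the abundance table before an ordinary redundancy analysis is fitted. The text does not state which transformation was applied; the sketch below assumes the Hellinger transformation, a common choice for community data, purely for illustration. The function name and input values are hypothetical.

```python
import math


def hellinger(table):
    """Hellinger-transform a samples x taxa abundance table (list of rows).

    Each value becomes sqrt(abundance / row total), so every transformed
    row has a sum of squares equal to 1.
    """
    transformed = []
    for row in table:
        total = sum(row)
        if total <= 0:
            raise ValueError("each sample needs a positive total abundance")
        transformed.append([math.sqrt(v / total) for v in row])
    return transformed


# Hypothetical counts for two samples and two genera (not study data).
print(hellinger([[1, 3], [0, 4]]))
```

Taking the square root of relative abundances dampens the influence of highly abundant genera, which is one reason such a transformation is applied before a linear method like RDA.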
RESULTS
Participants characteristics
After excluding participants who did not meet the inclusion criteria, this study ultimately included 74 CI and 131 NCI participants. CI participants were significantly older than NCI participants (79.00 versus 68.00, Mann-Whitney test, p < 0.001, Table 1). No significant difference was found between the two groups in sex, education, occupation, marital status, or the proportion of APOE ɛ4 carriers (all p > 0.05). The history of hypertension, diabetes, hyperlipidemia, and other diseases also showed no significant difference between the CI and NCI groups (all p > 0.05).
Main characteristics of participants (N = 205)
CI, cognitive impairment; NCI, no cognitive impairment.
Differences in composition and diversity of gut microbiota between CI and NCI participants
The composition of the gut microbiota in the CI and NCI participants was assessed at different taxonomic levels. At the phylum level, the relative abundances of Firmicutes (CI versus NCI: 44.25% versus 44.54%) and Proteobacteria (CI versus NCI: 30.34% versus 33.40%) were slightly lower in the CI group (p < 0.05, Fig. 1A). At the genus level, the relative abundances of Shigella (CI versus NCI: 9.80% versus 7.82%) and Akkermansia (CI versus NCI: 5.60% versus 5.47%) were higher in the CI group (p < 0.05, Fig. 1B), while the relative abundance of Bacteroides (CI versus NCI: 7.18% versus 7.74%) was lower in the CI group (p < 0.05).

Composition of the gut microbiota in the CI and NCI groups. The ten microbiota with the highest relative abundance at the phylum level (A) and genus level (B) were displayed. Chi-square tests and post hoc chi-square tests were conducted for group comparisons. Each genus was shown in a color similar to that of its corresponding phylum; for example, Shigella, Enterobacter, Sutterella, and Escherichia all belonged to Proteobacteria. Abbreviations: CI, cognitive impairment; NCI, no cognitive impairment.
All α-diversity indices were consistently lower in the CI group than in the NCI group (Supplementary Figure 1A-D, all p < 0.05). The PCoA analysis of β-diversity also showed significant differences between the CI group and the NCI group (all p < 0.05), with participants in the CI group having higher β-diversity (Supplementary Figure 1E-H).
Network characteristics of gut microbiota between CI and NCI participants
We identified a more complex network of interactions in the NCI group (Fig. 2B) than in the CI group (Fig. 2A) (network density: 0.081 versus 0.074). A total of 96 bacterial genus pairs were present in both networks, and most of these (92.7%) were positively correlated. Among them, four pairs (Stenotrophomonas and Delftia, Delftia and Brevundimonas, Delftia and Burkholderia, Ruminococcus and Dorea) had a greater correlation coefficient in the CI group, whereas two pairs (Butyricicoccus and Blautia, Clostridium and Blautia) had a smaller correlation coefficient in the CI group than in the NCI group (Supplementary Table 1). We found that the hub bacterial genera were predominantly Proteobacteria in the CI group (Fig. 2C) and principally Firmicutes in the NCI group (Fig. 2D).

Correlation network of gut microbiota at the genus level in the CI and NCI groups. The sparse compositional correlation (SparCC) algorithm was used to analyze the correlation between the gut microbiota of the CI (A) and NCI (B) groups, and correlated pairs with an absolute correlation coefficient > 0.5 and p < 0.05 were selected to construct the networks. Red lines represented positive correlations, green lines represented negative correlations, and circle size represented relative abundance. The CytoHubba plugin was used to select the fifteen bacterial genera with the highest Maximal Clique Centrality (MCC) values as the core microbiota and to generate the subnetworks of the CI group (C) and the NCI group (D), in which circle size represented the MCC value. CI, cognitive impairment; NCI, no cognitive impairment.
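The edge-selection rule and the network density reported above can be sketched as follows. This is an illustration under assumptions, not the study's code: the density is taken to be the standard undirected-graph definition (edges divided by the number of possible pairs), only genera with at least one retained edge are counted as nodes, and the genus labels and correlation values are invented.

```python
def build_network(correlations, r_cutoff=0.5, p_cutoff=0.05):
    """Keep genus pairs with |r| > r_cutoff and p < p_cutoff as edges.

    `correlations` is an iterable of (genus_a, genus_b, r, p) tuples.
    """
    return [
        (a, b, r)
        for a, b, r, p in correlations
        if abs(r) > r_cutoff and p < p_cutoff
    ]


def network_density(edges):
    """Density of an undirected graph: edges / (n * (n - 1) / 2).

    Assumption: n counts only genera that appear in at least one edge.
    """
    nodes = {g for a, b, _ in edges for g in (a, b)}
    n = len(nodes)
    if n < 2:
        return 0.0
    return len(edges) / (n * (n - 1) / 2)


# Hypothetical correlations between placeholder genera (not study data).
toy = [
    ("G1", "G2", 0.80, 0.01),   # kept: strong positive, significant
    ("G2", "G3", -0.60, 0.02),  # kept: strong negative, significant
    ("G1", "G3", 0.30, 0.01),   # dropped: |r| <= 0.5
    ("G3", "G4", 0.70, 0.20),   # dropped: p >= 0.05
]
edges = build_network(toy)
print(len(edges), network_density(edges))
```

In this toy network, two of the three possible pairs among the three connected genera are linked, giving a density of 2/3. A lower density, as reported for the CI group, means that a smaller share of possible genus pairs meet the correlation criteria.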
Identification of key differential gut microbiota biomarkers between CI and NCI participants
The LEfSe analysis showed that 75 taxa differed between the two groups (p < 0.05, Supplementary Figure 2). At the phylum level, Synergistetes and Fusobacteria increased in the CI group, while Firmicutes, Bacteroidetes, and Actinobacteria decreased. At the genus level, 30 bacterial genera differed between the two groups, of which ten (Shigella, Acidaminococcus, Synergistes, Pyramidobacter, Alistipes, Pseudomonas, Fusobacterium, Aquabacterium, Stenotrophomonas, and Sediminibacterium) increased in the CI group.
We obtained sixteen gut microbiota biomarkers by Lasso regression to construct models. The five most important biomarkers were Megamonas, Blautia, Pseudomonas, Stenotrophomonas, and Veillonella (Supplementary Table 2). All four models effectively distinguished CI from NCI participants. The average AUC of the RF model (90.29%) was higher than that of the MLR (88.25%), GBM (88.80%), and XGBoost (89.66%) models (Fig. 3). At the peak of the Youden index, the average sensitivity of the RF (83.29%) and XGBoost (85.82%) models was better than that of the MLR (77.16%) and GBM (75.70%) models (Supplementary Table 3).
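The two performance measures reported here, the AUC and the sensitivity at the peak of the Youden index, can be illustrated for a single test fold. The sketch below is plain Python under stated assumptions: the labels and predicted probabilities are invented, the function names are hypothetical, and in the study these quantities were averaged over the five cross-validation folds.

```python
def roc_points(labels, scores):
    """Return (threshold, sensitivity, specificity) for every distinct score.

    A sample is called positive when its score >= threshold.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < t)
        points.append((t, tp / n_pos, tn / n_neg))
    return points


def auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties = 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_optimum(labels, scores):
    """Operating point maximising J = sensitivity + specificity - 1."""
    return max(roc_points(labels, scores), key=lambda pt: pt[1] + pt[2] - 1)


# Hypothetical predicted probabilities for one test fold (1 = CI, 0 = NCI).
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_prob = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]

print(auc(y_true, y_prob))
print(youden_optimum(y_true, y_prob))
```

For these toy values the AUC is 0.875, and the Youden index peaks at a threshold of 0.7, where sensitivity is 0.75 and specificity is 1.0. Reporting sensitivity at this point, as in Supplementary Table 3, fixes one operating point per model so that the models can be compared on the same criterion.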

Receiver operating characteristic (ROC) curves for different types of models on the test set. Models for identifying CI through sixteen gut microbiota biomarkers selected by Lasso regression were constructed using 5-fold cross-validation. The red curve represented the average ROC curve after 5-fold cross-validation, and the grey curve represented ROC curve in each cross-validation procedure. CI, cognitive impairment; MLR, Multivariable Logistic Regression; RF, Random Forest; GBM, Gradient Boosting Machine; XGBoost, Extreme Gradient Boosting; AUC, the area under the ROC curve.
Association between the biomarkers and the dimensions of cognitive function
We found that the associations between the biomarkers and the five cognitive dimensions fell into two categories, and most were positive correlations (Supplementary Figure 3). Among the positively correlated biomarkers, Megamonas had the strongest correlation with orientation (p < 0.001). Among the biomarkers negatively correlated with the five dimensions, Pyramidobacter had the strongest correlation with attention and calculation (p < 0.001), and Pseudomonas correlated with all five dimensions (p < 0.05).
Impact of lifestyle factors on the biomarkers
We found that the intake of grains, beans, fruits, and eggs, as well as exercise and sleep duration, had a significant impact on the abundance of the biomarkers (p < 0.05, Fig. 4). The abundance of most biomarkers, such as Faecalibacterium, Sutterella, Megamonas, and Veillonella, was positively correlated with fruit intake and exercise. The abundance of these four biomarkers increased more with greater fruit intake and exercise than that of other genera. We found that the abundance of Pseudomonas and Synergistes increased with longer sleep duration and a greater intake of beans, grains, and eggs, with Synergistes increasing more than Pseudomonas.

Transformation-based redundancy analysis (tb-RDA) of gut microbiota biomarkers and lifestyle factors. A permutation test was used to verify the significance of the tb-RDA model. The yellow arrows indicated biomarkers, blue arrows indicated lifestyle factors, and the angle of yellow arrows and blue arrows indicated correlation (acute angle: positive correlation; obtuse angle: negative correlation; right angle: no correlation). CI, cognitive impairment; NCI, no cognitive impairment.
Sensitivity analysis based on datasets with propensity score matching
We used propensity score matching to identify 48 pairs of matching participants in CI and NCI groups. The two groups did not differ in age and sex (all p > 0.05, Supplementary Table 4). Similar to the above analysis, the microbiota composition showed a slightly lower relative abundance of Firmicutes in the CI group than the NCI group (43.11% versus 44.83%, p < 0.05, Supplementary Figure 4). Diversity analysis continued to indicate lower α-diversity and higher β-diversity in the CI group (all p < 0.05, Supplementary Figure 5), and the network density of microbiota interactions in the CI group remained lower than that in the NCI group (0.059 versus 0.082, Supplementary Figure 6).
The LEfSe analysis identified 43 differential taxa, of which 88.37% were consistent with those identified in the above analysis (Supplementary Figure 7). Further selection using Lasso regression yielded twelve bacterial genera as biomarkers, nine of which overlapped with the sixteen biomarkers found in the above analysis. Megamonas and Veillonella remained in the top five biomarkers (Supplementary Table 5). Models constructed using these twelve biomarkers showed the RF model continued to perform the best among the models (AUC = 89.67%, Supplementary Figure 8 and Supplementary Table 6). Most biomarkers remained positively correlated with the five cognitive dimensions (Supplementary Figure 9). Similar to the above analysis, grain and fruit intake and sleep duration significantly affected biomarkers’ abundance (p < 0.05, Supplementary Figure 10).
DISCUSSION
Our study investigated the pattern of gut microbiota in older Chinese adults living with cognitive impairment. We found that the composition of gut microbiota differed between the CI and NCI groups, with the CI group exhibiting significantly lower α-diversity and higher β-diversity. Sixteen bacterial genera, including Megamonas, Blautia, Pseudomonas, Stenotrophomonas, and Veillonella, were identified as potential gut microbiota biomarkers for identifying cognitive impairment. Greater fruit intake and exercise favored an increase in the abundance of gut microbiota biomarkers found in the NCI group and a decrease in those found in the CI group. Sensitivity analyses conducted on the new dataset generated using propensity score matching did not change the key findings and conclusions.
Our study demonstrated significantly lower α-diversity and higher β-diversity in the CI group. This is similar to previous studies on Alzheimer’s disease. 30–32 While some studies found no significant differences in α-diversity and β-diversity between Alzheimer’s disease patients and cognitively normal individuals, 33 the majority of studies reported lower α-diversity in Alzheimer’s disease patients, 31, 32 and Guo et al. also found higher β-diversity in Alzheimer’s disease patients. 15 Our findings suggested a lower richness and a more uneven distribution of gut microbiota in the CI group.
Our investigation found that the abundance of Pseudomonas and Stenotrophomonas, both of which are Gram-negative bacteria, was elevated in the CI group. Under stressful environmental conditions conducive to biofilm formation, some Pseudomonas spp. can produce amyloid curli, a crucial component of biofilms. 34–37 These amyloids may enter the CNS through the circulation and accelerate amyloid-β fibrillation, after which microglia may be activated, causing a series of immune responses that promote neurodegeneration and cognitive impairment. 38–40 Our study also found a negative correlation between Pseudomonas and the five cognitive dimensions. Similar to our findings, Stenotrophomonas was reported to be enriched in the gastrointestinal tract of patients with moderate Alzheimer’s disease in the study by Chen et al. 41
Our study found that Megamonas, Blautia, and Veillonella were reduced in the CI group. They were all positively correlated with the five cognitive dimensions. Megamonas is a genus of Gram-negative bacteria that can ferment various carbohydrates, producing acetic, propionic, and lactic acids as end products. 42 These short-chain fatty acids (SCFAs) can exert direct or indirect effects on CNS processes, ultimately yielding protective benefits for cognitive function. 43 Renson et al. also observed a positive correlation between Megamonas and memory/attention in a longitudinal study, 44 which was consistent with our results. Blautia is also a member of a group of bacteria that can produce SCFAs. 45 Parkinson’s disease patients with mild cognitive impairment have been observed to exhibit a significant reduction in Blautia abundance compared to those with normal cognitive function. 46 Veillonella is a genus of nonmotile anaerobic cocci that is part of the normal microbiota of the gastrointestinal tract and is capable of producing propionic acid. 47, 48 Lu et al. 49 reported a decrease in Veillonella abundance among hypertensive individuals with cognitive impairment compared to those with normal cognitive function.
Our study demonstrated that the combination of sixteen gut microbiota biomarkers had high discriminative power for cognitive impairment. In particular, the Random Forest model performed well, with an AUC > 90%. In the six previous studies that used gut microbiota as biomarkers to discriminate individuals with Alzheimer’s disease from those with normal cognitive function, the models had AUC values ranging from 76% to 94%. 31, 50–52 Our study’s model outperformed all studies but one. 31 However, Liu et al.’s study had a limited sample size of only about 30 individuals per group, which could increase the uncertainty of its results. Our models have important public health implications: the non-invasive, rapid, and easy sampling method could have great potential for application in clinical practice.
Our study demonstrated that microbiota biomarkers with higher abundance in the NCI group were positively correlated with fruit intake and exercise, and negatively correlated with sleep duration and bean, grain, and egg intake. Conversely, the biomarkers that were abundant in the CI group exhibited a negative association with fruit intake and exercise, and a positive association with sleep duration and bean, grain, and egg intake. Fruits are known to contain polyphenolic compounds that can be metabolized by the gut microbiota and thereby confer protection against neurotoxic damage, reduce inflammation, and enhance cognitive function. 53 A meta-analysis indicated that a high level of physical activity was associated with a decreased risk of cognitive impairment. 54 Our findings implied that healthy lifestyle choices, such as consuming sufficient fruit and engaging in regular exercise, could promote the growth of gut microbiota that are beneficial for cognitive function, and consequently confer protection against cognitive impairment.
Our study has several limitations. First, owing to the difficulties of diagnosing CI with a comprehensive clinical examination in a field setting, we defined CI using MMSE results. Second, due to the binary nature of the classification variable (CI or NCI), we were unable to assess the impact of the severity of cognitive impairment on gut microbiota. Third, the 16S rRNA gene amplicon method employed in this study precludes the accurate identification of bacteria at the species level. Fourth, our results were obtained without controlling for confounding factors such as diet and lifestyle. Caution should therefore be exercised when extrapolating our findings to other populations, as differences in diet, climate, and genetics may exist between countries. Fifth, as this is a cross-sectional study, the findings serve primarily as evidence of association rather than definitive evidence of causality.
Conclusion
Our study revealed significant differences in gut microbiota composition between participants with and without cognitive impairment, and the sixteen identified gut microbiota biomarkers effectively discriminated between the two groups. Further, a diet rich in fruits and regular exercise could promote the growth of gut microbiota that are beneficial for cognitive function.
AUTHOR CONTRIBUTIONS
Jing Wang (Data curation; Formal analysis; Visualization; Writing – original draft; Writing – review & editing); Gong Zhang (Data curation; Formal analysis; Writing – review & editing); Hao Lai (Methodology; Writing – review & editing); Zengbin Li (Methodology; Writing – review & editing); Mingwang Shen (Writing – review & editing); Chao Li (Writing – review & editing); Patrick Kwan (Writing – review & editing); Terence J. O’Brien (Funding acquisition; Writing – review & editing); Ting Wu (Writing – review & editing); Siyu Yang (Writing – review & editing); Xueli Zhang (Writing – review & editing); Lei Zhang (Conceptualization; Funding acquisition; Project administration; Supervision; Writing – review & editing).
ACKNOWLEDGMENTS
We thank all the participants and field investigators for their valuable contributions.
FUNDING
This work was supported by the Ministry of Science and Technology of the People’s Republic of China (2022YFC2304900 and 2022YFC2304905 to LZ), the National Key R&D Program of China (2022YFC2505100 and 2022YFC2505103 to LZ); Outstanding Young Scholars Support Program (3111500001 to LZ); Epidemiology modeling and risk assessment (20200344 to LZ); and Xi’an Jiaotong University Young Scholar Support Grant (YX6J004 to LZ). This work was also supported by the National Health and Medical Research Council (NHMRC) Investigator Grant (APP1176426 to TJO).
CONFLICT OF INTEREST
The authors have no conflict of interest to report.
DATA AVAILABILITY
The datasets used and analyzed during the current study are available from the corresponding author upon reasonable request.
