Abstract
Background
Ischemic stroke (IS) is a prevalent and serious neurological disorder, and inflammation and immune responses are crucial in the development of IS. O-GlcNAcylation is a form of post-translational modification that plays roles in numerous significant biological processes.
Objective
The major objective of the current study was to examine the involvement of O-GlcNAcylation associated genes in the pathogenesis of IS.
Methods
We downloaded two IS datasets from the GEO database, and subsequently the infiltration level of immune cells was quantified and compared. Differentially expressed O-GlcNAcylation genes were identified and machine learning algorithms were utilized to screen the hub genes. Subsequently, the IS samples were further classified based on hub genes through consensus clustering.
Results
Overall, nineteen O-GlcNAcylation related DEGs were identified. Through the machine learning algorithms, eight hub genes related to immune cell infiltration were identified. GSEA results showed that the hub genes were significantly correlated with the immune system, RNA metabolism, and translation. Two distinct subclusters mediated by O-GlcNAcylation were then defined, and functional analysis of cluster-specific DEGs demonstrated their participation in processes related to inflammation and immune response.
Conclusion
O-GlcNAcylation has a significant impact on the pathogenesis of IS, which is correlated with immunological response and metabolic activity. The findings of this research could serve as a valuable guide for exploring the molecular mechanisms of IS and offer insights into drug screening and immunotherapy for IS.
Introduction
Stroke frequently causes impairment to the central nervous system. Among all causes of death, it ranks third after malignant tumors and heart disease. Ischemic stroke (IS) and hemorrhagic stroke are the two primary categories, with IS representing approximately 80% of cases. 1 IS is characterized by high rates of morbidity, disability, mortality, and recurrence, which impose a heavy psychological and economic burden on patients, families, and society. 2 Therefore, early diagnosis and effective emergency treatment for individuals suffering from IS are imperative to minimize the risk of impairment and fatality.
Brain injury after cerebral ischemia involves complex pathophysiological mechanisms, in which inflammatory and immune processes play fundamental regulatory roles. 3 Previous research has indicated that the immune system is involved in the development of stroke from the acute phase to the convalescent phase. 4,5 Regulating the immune status of the central nervous system can enhance the prognosis and neurological function of individuals with IS. 6 Exploring the infiltration profile of various immune cells and assessing the variations of immune components will provide new insights and perspectives for elucidating the molecular mechanism underlying stroke and for selecting targets for immunotherapy.
O-GlcNAcylation is a form of post-translational modification where a solitary O-linked N-acetylglucosamine (GlcNAc) unit is added to protein serine or threonine residues. 7 The process of modification is controlled by two enzymes, namely O-GlcNAc transferase (OGT) and O-GlcNAcase (OGA), which are responsible for adding and eliminating GlcNAc from proteins. This modification is reversible and dynamic in response to metabolic stress (e.g., hypoxia), and plays roles in numerous important biological processes, including gene expression, signal transduction, and cellular metabolism.8–10 Furthermore, an increasing body of evidence indicates that O-GlcNAcylation plays a crucial role in controlling the maturation and activities of immune cells. 11 Notably, the levels of both OGT and OGA are elevated in the brain in comparison to other tissues, indicating a significant involvement of O-GlcNAcylation in brain functionality, including neurogenesis, synaptic assembly and plasticity, neuronal functions, and energy metabolism. 12 Cells are safeguarded against different stress conditions, such as oxidative stress, glucose deprivation, and endoplasmic reticulum stress, through a sudden increase in O-GlcNAcylation. 13 Therefore, we can infer a strong association between O-GlcNAcylation and the development of IS. The dysregulation of O-GlcNAcylation has been linked to various illnesses, such as cancer, diabetes, Alzheimer's disease, and IS.14–16
To date, there has been no comprehensive analysis of genes related to O-GlcNAcylation in IS. The biological process of O-GlcNAcylation in IS and its relationships with clinical features and immune state are still unclear. Hence, in this investigation, we thoroughly assessed genes associated with O-GlcNAcylation in IS. In both IS and control samples, we investigated the expression patterns of genes associated with O-GlcNAcylation. Additionally, we conducted consensus clustering, immune infiltration analysis, and functional enrichment analysis utilizing differential O-GlcNAcylation genes. Furthermore, we employed machine learning algorithms to identify eight hub genes that have the potential to predict the occurrence of IS. The potential of our research lies in establishing a basis for enhanced comprehension of the molecular mechanism involved in IS and providing insights into the potential of O-GlcNAcylation as a therapeutic target for IS.
Materials and methods
Acquisition of microarray data
We obtained the datasets GSE16561 and GSE22255 from the GEO database.17,18 The GSE16561 expression profile dataset contained 39 IS patients and 24 healthy control subjects, and GSE22255 contained 20 IS patients and 20 controls matched for sex and age. The GSE16561 dataset was profiled on the GPL6883 platform, and GSE22255 on the GPL570 platform. Batch effects were first removed using the sva package to create an integrated dataset, which included 59 IS patients and 44 controls. 19 PCA was implemented with the FactoMineR and factoextra packages. Expression values were then quantile-normalized across samples using the preprocessCore package.
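The quantile normalization step can be illustrated with a minimal Python sketch. It is not the preprocessCore implementation used in this study: the function name quantile_normalize and the toy matrix are hypothetical, and ties are broken by order of appearance rather than averaged, which is a simplification.

```python
def quantile_normalize(matrix):
    """Quantile-normalize a genes x samples matrix given as a list of rows."""
    n_genes = len(matrix)
    n_samples = len(matrix[0])
    # Column-wise view: one list of expression values per sample.
    columns = [[matrix[g][s] for g in range(n_genes)] for s in range(n_samples)]
    # Sort each sample, then average across samples at each rank.
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [
        sum(col[r] for col in sorted_cols) / n_samples for r in range(n_genes)
    ]
    # Replace every value by the mean of its within-sample rank.
    normalized = [[0.0] * n_samples for _ in range(n_genes)]
    for s, col in enumerate(columns):
        order = sorted(range(n_genes), key=lambda g: col[g])
        for rank, g in enumerate(order):
            normalized[g][s] = rank_means[rank]
    return normalized


# Hypothetical toy matrix: 4 genes (rows) x 3 samples (columns).
toy = [
    [5.0, 4.0, 3.0],
    [2.0, 1.0, 4.0],
    [3.0, 4.5, 6.0],
    [4.0, 2.0, 8.0],
]
normalized = quantile_normalize(toy)
```

After normalization every sample contains exactly the same set of values, so remaining differences between samples reflect gene ranks rather than overall intensity distributions.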
Genes associated with O-GlcNAcylation were obtained from the MSigDB (v7.4) using the search terms GOBP_PROTEIN_O_LINKED_GLYCOSYLATION and REACTOME_O_LINKED_GLYCOSYLATION. 20 In total, 151 genes related to O-GlcNAcylation were obtained by screening the retrieved data (Supplementary Table S1).
Selection of differentially expressed O-GlcNAcylation genes
The limma package was employed to identify the differentially expressed genes (DEGs) between IS and controls in the integrated dataset. 21 The threshold for differential genes was set at the adjusted P value < 0.05. The results of differential gene expression were visualized using a combination of volcano plot and heatmap. By intersecting the DEGs with the genes related to O-GlcNAcylation, we identified the O-GlcNAcylation genes that exhibited differential expression.
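The filtering and intersection logic described above can be sketched in a few lines of Python. This is an illustration, not the limma workflow itself: it assumes the Benjamini-Hochberg procedure (the default adjustment in limma; the text does not name the method used for the DEGs), and all P values and the GENE_X, GENE_Y, and GENE_Z names are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw P values for five genes (illustrative only).
raw_p = {"OGT": 0.001, "THBS1": 0.008, "GENE_X": 0.039, "GENE_Y": 0.041, "GENE_Z": 0.60}
genes = list(raw_p)
adj = dict(zip(genes, benjamini_hochberg([raw_p[g] for g in genes])))

# Keep genes passing the adjusted P < 0.05 threshold.
degs = {g for g in genes if adj[g] < 0.05}

# Tiny stand-in for the 151-gene O-GlcNAcylation list.
o_glcnac_set = {"OGT", "THBS1", "POMT1"}
o_glcnac_degs = degs & o_glcnac_set
```

In this toy example GENE_X has a raw P value below 0.05 but an adjusted value of about 0.051, so it is excluded, which shows why the threshold is applied after multiple-testing correction.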
Functional enrichment analysis
The clusterProfiler package was employed to analyze the biological functions of DEGs, covering Gene Ontology (GO) and KEGG. 22 GO categories comprised biological processes (BPs), cellular components (CCs), and molecular functions (MFs). The P value was adjusted with the Benjamini-Hochberg approach. The differentially expressed O-GlcNAcylation genes were then submitted to the STRING database to obtain the protein-protein interaction (PPI) network.
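Over-representation analyses of this kind are commonly based on the hypergeometric test. The following Python sketch shows that calculation for a single term under this assumption; the function name enrichment_p_value and all counts are hypothetical and do not come from the study.

```python
from math import comb


def enrichment_p_value(n_universe, n_term, n_query, n_overlap):
    """Upper-tail hypergeometric P value, P(X >= n_overlap).

    n_universe: number of background genes
    n_term:     background genes annotated to the term
    n_query:    number of query genes (e.g., DEGs)
    n_overlap:  query genes annotated to the term
    """
    total = comb(n_universe, n_query)
    upper = min(n_term, n_query)
    tail = sum(
        comb(n_term, k) * comb(n_universe - n_term, n_query - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20 background genes, 5 in the term,
# 4 query genes of which 3 fall in the term.
p = enrichment_p_value(n_universe=20, n_term=5, n_query=4, n_overlap=3)
```

The resulting P values for all tested terms would then be adjusted for multiple testing, as stated above.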
Identification of hub genes by machine learning
To further screen hub genes for IS, three different machine learning algorithms, namely LASSO, random forest, and SVM, were utilized. The glmnet package was used to perform LASSO analysis with family = “binomial”, alpha = 1, type.measure = “deviance”, and nfolds = 10. 23 The random forest algorithm utilized recursive feature elimination (RFE) to rank the differential O-GlcNAcylation genes. This was done using the randomForest package with ntree = 500 and nodesize = default, 24 and the genes with a relative importance greater than 2.50 were identified as the candidate hub genes. The kernlab package was utilized for feature selection using SVM-RFE algorithm with sizes = c(2,4,6,8, seq(10,40,by = 3)), functions = caretFuncs, and method = “cv”. 25 Spearman's correlation analysis was employed to illustrate the association among hub genes. The pROC package was used to estimate the diagnostic effectiveness of hub genes by applying ROC curves and calculating the AUC. 26
Immune cell infiltration analysis
The ssGSEA method in the GSVA package was implemented to quantify the activation level of immune cell marker gene sets (e.g., T cells, B cells, and macrophages) in individual samples by calculating their enrichment scores. Higher enrichment scores indicate greater infiltration levels of the corresponding immune cell types in the sample. 27 The correlation between different immune cell types was determined by Spearman's correlation test and displayed as a correlation matrix. Differential immune infiltration enrichment scores between IS and control groups were examined by the Wilcoxon test.
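Spearman's correlation, used here for immune cell scores, is the Pearson correlation of the ranks. The following Python sketch illustrates the coefficient only (no significance test); the expression values and enrichment scores are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values for five samples.
gene_expr = [2.1, 3.5, 1.0, 4.2, 2.9]
neutrophil_score = [0.30, 0.55, 0.10, 0.50, 0.40]
rho = spearman(gene_expr, neutrophil_score)
```

Because only ranks enter the calculation, the coefficient is insensitive to monotonic transformations of the enrichment scores.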
Gene set enrichment analysis (GSEA)
The correlation analyses between each hub gene and all other protein-coding genes were performed using the integrated expression profile dataset. Pearson's correlation coefficient was firstly calculated, then the genes significantly correlated with each hub gene were ranked and subjected to GSEA. The clusterProfiler package was utilized to functionally clarify the biological importance of hub genes using GSEA, based on the GO, KEGG, and Reactome pathway databases. 28
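The idea behind GSEA on a correlation-ranked gene list can be sketched as a running-sum statistic. The Python example below is a simplified, unweighted (Kolmogorov-Smirnov-like) version and makes no claim to reproduce clusterProfiler, which by default weights genes by their ranking metric and assesses significance by permutation. The gene names and the pathway are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score for a ranked gene list.

    Assumes the gene set contains some, but not all, of the ranked genes.
    """
    hits = [g in gene_set for g in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must partially overlap the ranked list")
    running, best = 0.0, 0.0
    for is_hit in hits:
        # Step up at a gene-set member, step down otherwise.
        running += 1.0 / n_hits if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical genes ranked from most positively to most negatively
# correlated with a hub gene.
ranked = ["G1", "G2", "G3", "G4", "G5", "G6", "G7", "G8"]
es_top = enrichment_score(ranked, {"G1", "G2", "G4"})
es_bottom = enrichment_score(ranked, {"G7", "G8"})
```

A positive score indicates that the gene set is concentrated among positively correlated genes, and a negative score that it is concentrated among negatively correlated genes.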
Consensus clustering analysis
Using the expression profile of the eight hub genes, the ConsensusClusterPlus package was utilized to determine the optimal number of unsupervised clusters in IS samples. 29 PCA was employed to determine the variation in expression of hub genes between the two clusters. The ggplot2 package was used to draw the PCA diagram. The limma package was utilized to identify the DEGs between the two clusters, and the clusterProfiler package was employed for enrichment analysis. The pheatmap package was used to create the heatmap.
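The principle of consensus clustering, repeated subsampling followed by counting how often two samples fall in the same cluster, can be sketched as follows. This Python example is a strong simplification and not the ConsensusClusterPlus algorithm: it clusters one-dimensional values with a tiny two-means routine, whereas the study clustered eight-gene expression profiles. All function names, parameters, and values are hypothetical.

```python
import random


def two_means_1d(values, n_iter=20):
    """Tiny 1-D k-means with k = 2, initialized at the minimum and maximum."""
    lo, hi = min(values), max(values)
    labels = [0] * len(values)
    for _ in range(n_iter):
        labels = [0 if abs(v - lo) <= abs(v - hi) else 1 for v in values]
        group0 = [v for v, lab in zip(values, labels) if lab == 0]
        group1 = [v for v, lab in zip(values, labels) if lab == 1]
        if group0:
            lo = sum(group0) / len(group0)
        if group1:
            hi = sum(group1) / len(group1)
    return labels


def consensus_matrix(values, n_resamples=50, fraction=0.8, seed=0):
    """Fraction of resamples in which two items share a cluster."""
    rng = random.Random(seed)
    n = len(values)
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    size = max(2, int(fraction * n))
    for _ in range(n_resamples):
        idx = rng.sample(range(n), size)
        labels = two_means_1d([values[i] for i in idx])
        for a, i in enumerate(idx):
            for b, j in enumerate(idx):
                sampled[i][j] += 1
                if labels[a] == labels[b]:
                    together[i][j] += 1
    return [
        [together[i][j] / sampled[i][j] if sampled[i][j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


# Hypothetical scores for six samples forming two well-separated groups.
scores = [1.0, 1.2, 0.9, 5.0, 5.3, 4.8]
cm = consensus_matrix(scores)
```

Consensus values near 1 within groups and near 0 between groups indicate a stable partition; in practice the number of clusters is chosen by comparing such matrices across candidate values of k.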
Gene set variation analysis (GSVA)
Initially, the reference sets were obtained from the MSigDB, including Hallmark, KEGG, and Reactome gene sets. Next, the GSVA toolkit and its ssGSEA algorithm were employed to calculate the GSVA enrichment score for each set of genes. 30 The limma package was utilized to compare the difference in the enrichment score of every gene set among the two clusters.
MicroRNAs (miRNAs) - transcription factors (TFs) network analysis
TFs can either activate or suppress gene transcription, whereas miRNAs regulate gene expression at the post-transcriptional level. The miRNAs-TFs regulatory network for the hub genes was built using the RegNetwork database and visualized in Cytoscape software. 31
Statistical analysis
All statistical tests were implemented utilizing R software 4.2.2. The difference between the two groups was examined using either the Wilcoxon test or the Student's t-test. Pearson's or Spearman's correlation test was utilized to determine the correlation between the variables. Statistical significance was determined by considering P values less than 0.05 as significant, with all P values being two-sided.
Results
Identification and function analysis of DEGs
To obtain the integrated dataset for subsequent analysis, batch effects were first removed from the GSE16561 and GSE22255 cohorts (Figure 1A and B). The dataset was then normalized using the quantile normalization procedure, and differential gene expression analysis was performed (Figure 1C and D). In IS samples, a total of 2044 DEGs were identified, with 1290 genes showing downregulation and 754 genes showing upregulation (Figure 1E and F, Supplementary Table S2). The DEGs displayed distinct expression patterns in the IS and control samples, which supported the involvement of certain genes in the pathological processes of IS.

Identification of DEGs in the integrated expression profile. (A) Raw PCA plot for the integrated expression profile before de-batching. (B) Combat PCA plot for the integrated expression profile after de-batching. (C) Statistics of gene expression in the combined dataset prior to quantile normalization. (D) Statistics of gene expression in the combined dataset following quantile normalization. (E) Volcano plot showing DEGs between IS patients and healthy controls. Red nodes indicated upregulated DEGs, while blue nodes indicated downregulated DEGs. (F) Heatmap depicting the expression levels of DEGs. The pink indicated IS samples, and green indicated healthy control samples. Red indicated high expression, whereas blue indicated low expression.
Then we conducted enrichment analysis to explore the potential functions of these DEGs. For the GO-BPs analysis, these DEGs were significantly enriched in mitochondrial gene expression, positive regulation of cytokine production, ribosome biogenesis, T cell receptor signaling pathway, immune response-related cell activation, ribonucleotide metabolic process, and lymphocyte differentiation (Figure 2A). For the GO-CCs analysis, these DEGs were mainly enriched in mitochondrial matrix, mitochondrial inner membrane, organellar ribosome, mitochondrial ribosome, vesicle lumen, and specific granule (Figure 2B). For the GO-MFs analysis, DEGs were predominantly enriched in ribosome structure, rRNA binding, protein serine/threonine kinase activity, nuclease activity, electron transfer activity, death receptor activity, glucose binding, and damaged DNA binding (Figure 2C). In addition, KEGG analyses suggested that DEGs were enriched in pathways of neurodegeneration, NF-kappa B pathway, apoptosis, T cell receptor pathway, Th1 and Th2 cell differentiation, ribosome, Th17 cell differentiation, and N-Glycan biosynthesis (Figure 2D).

Function enrichment analysis of DEGs. (A-C) The enriched items involved in BPs, CCs, and MFs by DEGs. (D) The significantly enriched KEGG pathways by DEGs. GeneRatio, which represented the proportion of enriched gene count to total DEGs, was used to sort the top 20 enriched items. The color from red to blue represented the significance of P value, while the number of genes enriched into the items was indicated by the size of the dots.
To examine the involvement of genes associated with O-GlcNAcylation in the development of IS, we utilized the VennDiagram package to determine the overlap between DEGs and O-GlcNAcylation related genes. Among 151 O-GlcNAcylation related genes, six were upregulated and thirteen were downregulated in IS patients compared to healthy controls (Figures 3A and B). For the GO-BPs analysis, these nineteen differential O-GlcNAcylation related genes were enriched in protein O-linked glycosylation, macromolecule glycosylation, glycoprotein biosynthetic process, O-glycan processing, and protein O-linked mannosylation. For the GO-CCs analysis, these nineteen DEGs were enriched in rough endoplasmic reticulum membrane, platelet alpha granule, sarcoplasmic reticulum, and Golgi cisterna. For the GO-MFs analysis, the nineteen DEGs were mainly enriched in glycosyltransferase, UDP-glycosyltransferase, acetylglucosaminyltransferase, and hexosyltransferase activity (Figure 3C). Moreover, KEGG analyses indicated that these nineteen DEGs were enriched in O-glycan biosynthesis, N-Glycan biosynthesis, the Rap1 pathway, and the PI3K-Akt signaling pathway.

Function enrichment analysis of O-GlcNAcylation related DEGs. (A) Venn diagram of O-GlcNAcylation related genes and upregulated DEGs. (B) Venn diagram of O-GlcNAcylation related genes and downregulated DEGs. (C) The enriched items about BPs, CCs, and MFs by O-GlcNAcylation related DEGs. (D) Mainly enriched KEGG pathways by the O-GlcNAcylation related DEGs. (E) Visual network of the relationships between O-GlcNAcylation related DEGs and the top five enriched KEGG pathways.

Expression analysis of O-GlcNAcylation related genes in IS. (A) Volcano plot showing the variation in expression of O-GlcNAcylation related genes between IS patients and healthy controls. Upregulated genes were represented by red nodes, while downregulated genes were represented by blue nodes. (B) Heatmap depicting the expression levels of O-GlcNAcylation related DEGs. The pink indicated IS samples, and green indicated healthy controls. Red indicated high expression, whereas blue indicated low expression. (C) Histogram displaying the expression levels of O-GlcNAcylation related DEGs in IS compared to controls. (**P < 0.01, ***P < 0.001, ****P < 0.0001).
Three algorithms were applied to the nineteen O-GlcNAcylation related DEGs in order to identify hub genes. The LASSO algorithm was utilized to select the minimal set of features for constructing the LASSO classifier, and thirteen genes were identified, including THBS1, B4GALT5, PLOD2, PLOD1, OGT, DPM3, TMTC4, GALNT12, POMT1, SLC35C2, SPON1, ADAMTSL2 and TMTC3 (Figure 5A). The random forest algorithm identified ten genes, namely GALNT12, VEGFB, POMT1, THBS1, SLC35C2, PLOD2, B3GNT5, SPON1, B4GALT5, and OGT, with a relative importance greater than 2.50 (Figure 5B). With the SVM-RFE algorithm, the classifier achieved the lowest error when the feature count was nineteen (Figure 5C). Finally, eight hub genes shared by the three algorithms were identified, namely THBS1, GALNT12, SLC35C2, OGT, B4GALT5, PLOD2, POMT1, and SPON1 (Figure 5D).
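The overlap reported above can be checked directly from the gene lists given in the text. In the Python sketch below, the two sets are copied from this paragraph; because SVM-RFE retained all nineteen DEGs, it does not narrow the overlap, so intersecting the LASSO and random forest lists is sufficient (the variable names are illustrative).

```python
# Genes selected by LASSO (thirteen genes, as listed in the text).
lasso_genes = {
    "THBS1", "B4GALT5", "PLOD2", "PLOD1", "OGT", "DPM3", "TMTC4",
    "GALNT12", "POMT1", "SLC35C2", "SPON1", "ADAMTSL2", "TMTC3",
}

# Genes with random forest importance > 2.50 (ten genes, as listed in the text).
rf_genes = {
    "GALNT12", "VEGFB", "POMT1", "THBS1", "SLC35C2",
    "PLOD2", "B3GNT5", "SPON1", "B4GALT5", "OGT",
}

# SVM-RFE kept all nineteen DEGs, so it adds no further restriction here.
hub_genes = lasso_genes & rf_genes
print(sorted(hub_genes))
```

The intersection contains exactly the eight hub genes named in the text; VEGFB and B3GNT5 are excluded because LASSO did not select them.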

Selection of hub genes among O-GlcNAcylation related DEGs by machine learning. (A) Identification of hub genes using the LASSO method. (B) The ranking of genes based on their relative importance using the random forest algorithm. (C) The feature selection process using the SVM-RFE algorithm. (D) Venn diagram illustrating the common hub genes identified by three algorithms. (E) Circos plot displaying the correlation analysis between eight hub genes. A positive correlation was symbolized by the red color, while a negative correlation was symbolized by the green color. (F) The ROC curves estimating the diagnostic accuracy of hub genes.
To investigate the association between these genes, Spearman's correlation analysis was utilized, and the results showed high correlations among the eight hub genes (Figure 5E). Among these hub genes, three (THBS1, B4GALT5, and PLOD2) were upregulated and five (GALNT12, SLC35C2, OGT, POMT1, and SPON1) were downregulated in IS patients compared to healthy controls, suggesting their potential roles during the development of IS. Additionally, we assessed the diagnostic accuracy of every hub gene in the combined dataset. The AUC values were 0.740 for THBS1, 0.731 for GALNT12, 0.694 for SLC35C2, 0.725 for OGT, 0.734 for B4GALT5, 0.709 for PLOD2, 0.716 for POMT1, and 0.664 for SPON1, indicating that these hub genes had moderate discriminative power for IS (Figure 5F).
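For a single gene, the AUC equals the probability that a randomly chosen IS sample has a higher expression value than a randomly chosen control, with ties counted as one half. The Python sketch below computes it that way; it is an illustration rather than the pROC implementation, and the expression values are hypothetical.

```python
def roc_auc(case_values, control_values):
    """AUC as the probability that a random case scores above a random control."""
    wins = 0.0
    for c in case_values:
        for h in control_values:
            if c > h:
                wins += 1.0
            elif c == h:
                wins += 0.5  # ties count as half a win
    return wins / (len(case_values) * len(control_values))


# Hypothetical expression values of an upregulated gene.
is_expr = [7.9, 8.4, 8.1, 9.0]
ctrl_expr = [7.5, 8.0, 7.8, 8.2]

auc_up = roc_auc(is_expr, ctrl_expr)
# Swapping the groups mimics a downregulated gene: the raw value drops below 0.5.
auc_down = roc_auc(ctrl_expr, is_expr)
```

One assumption worth stating: for a downregulated gene the raw value falls below 0.5, and reversing the direction of comparison (equivalently, reporting 1 minus the raw value) yields an AUC above 0.5.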
Numerous studies have indicated that immune cells might have a significant impact on the progression of IS through various mechanisms.32–34 The evaluation of infiltration of distinct immune cells revealed significant interactions among immune cell populations in IS (Figure 6A). Activated CD4 T cells were significantly negatively correlated with macrophages, activated dendritic cells, monocytes, and gamma-delta T cells. Macrophages were significantly positively correlated with neutrophils, natural killer cells, regulatory T cells, and monocytes. Compared with healthy controls, eosinophils, activated dendritic cells, macrophages, gamma-delta T cells, natural killer cells, mast cells, plasmacytoid dendritic cells, neutrophils, and Th17 cells presented higher infiltration levels in IS. However, CD56dim natural killer cells, activated CD8 T cells, and activated B cells presented lower infiltration levels in IS (Figure 6B).

Changes in the infiltration of immune cells in IS. (A) Heatmaps illustrating the associations among different compositions of infiltrated immune cell types. (B) Box plots showing the infiltration levels of immune cells in IS patients and controls. ns P > 0.05, **P < 0.01, ***P < 0.001. (C) Correlation analysis between immune infiltration and each hub gene. The P value was indicated by the color, while the correlation coefficient was depicted by the size of the circle.
To gain a deeper understanding of the involvement of hub genes in immunomodulation, we conducted Spearman correlation analysis to investigate the potential association between hub genes and the infiltration of immune cells. The correlation analysis indicated that the eight hub genes exhibited significant positive or negative associations with the infiltration of various types of immune cells (Figure 6C). For example, the expression of B4GALT5 exhibited significant positive correlations with mast cell, neutrophil, Th17 cell, eosinophil, natural killer cell, macrophage, activated dendritic cell, immature dendritic cell, natural killer T cell, regulatory T cell, monocyte, and MDSC but negative correlations with activated CD8 T cell and CD56dim natural killer cell. The expression of THBS1 showed significant positive correlations with immature dendritic cell, Th17 cell, eosinophil, mast cell, and natural killer cell but a negative correlation with activated CD8 T cell. Therefore, the hub genes could potentially regulate immune characteristics in the development of IS.
Correlation analyses between each hub gene and other genes were performed using data from the integrated cohort, and the genes significantly correlated with each hub gene (P < 0.05) were ranked and subjected to GSEA (Figure 7 and Supplementary Figure 2). The results of GSEA-Reactome (Figure 8) revealed that B4GALT5 was enriched in the translation, neutrophil degranulation, immune system, metabolism of RNA, and mitochondrial translation pathways. GALNT12 was enriched in the translation, rRNA processing, nonsense-mediated decay, metabolism of RNA, and MyD88 deficiency (TLR2/4) pathways. OGT was enriched in the translation, tRNA processing, mitochondrial translation, metabolism of RNA, rRNA processing, transcriptional regulation by small RNAs, and transport of mature mRNA pathways. PLOD2 was enriched in the hemostasis, platelet activation, mitochondrial translation, clotting cascade, olfactory signaling pathway, extracellular matrix organization, citric acid cycle, and mRNA splicing pathways. POMT1 was enriched in the metabolism of RNA, tRNA processing, translation, rRNA processing, DNA repair, mitochondrial translation, and mRNA splicing pathways. SLC35C2 was enriched in the interferon signaling, mitochondrial translation, membrane trafficking, interleukin-10 signaling, GPCR ligand binding, asparagine N-linked glycosylation, vesicle-mediated transport, and cytokine signaling in immune system pathways. SPON1 was enriched in the rRNA processing, metabolism of RNA, translation, nonsense-mediated decay, viral mRNA translation, and mRNA splicing pathways. THBS1 was enriched in the mitochondrial translation, translation, metabolism of RNA, tRNA processing, interleukin-10 signaling, NGF-stimulated transcription, interleukin-4 and interleukin-13 signaling, and PI3K/AKT signaling pathways. These results suggested that the hub genes are strongly associated with the immune system, RNA metabolism, and translation. The top 20 results of GSEA

Heatmaps showing the top 50 genes positively correlated with each hub gene in the integrated IS cohorts. The red indicated high expression, and blue indicated low expression of associated genes.

The top 20 GSEA results for reactome pathways of each hub gene. A positive correlation was indicated by an enrichment score greater than 0, whereas a negative correlation was indicated by a score less than 0.
Using the consensus clustering technique, the IS samples were further grouped according to the expression pattern of eight hub genes. It was determined that the ideal count of subtypes was two, namely cluster A and cluster B (Figure 9A, Supplementary Figure 5). Cluster A contained 28 IS samples, while cluster B had 31 samples. A significant difference in the expression of hub genes between the two clusters was observed (Figure 9B). Cluster A exhibited significantly lower expression levels of B4GALT5 and THBS1 compared to cluster B, while the expression of OGT was significantly higher in cluster A than in cluster B (Figure 9C).

Construction of two clusters of IS samples using hub genes in the integrated cohorts. (A) Plot of the consensus matrix for k = 2. (B) Heatmap displaying the expression of hub genes between two clusters. Red represented high expression, and blue represented low expression. (C) Boxplot showing the expression levels of hub genes across two clusters. ns P > 0.05, **P < 0.01, ***P < 0.001.
GSVA was then conducted, and several pathways were enriched. As illustrated in Figure 10, cluster A had higher enrichment scores than cluster B in protein secretion, peroxisome, mismatch repair, homologous recombination, MYC targets, E2F targets, primary immunodeficiency, N glycan biosynthesis, tRNA processing, and nucleotide biosynthesis. In contrast, cluster A had lower enrichment scores than cluster B in coagulation, KRAS signaling, Notch signaling, inflammatory response, Hedgehog signaling, TNFa signaling, glycosaminoglycan metabolism, ECM receptor interaction, adenylate cyclase activating pathway, signaling by ERBB4, TP53 activity regulation, interleukin-10 signaling, and gap junction assembly, indicating that the two clusters had different immunological features and molecular mechanisms underlying the process of IS.

Heatmaps showing the enrichment levels of Hallmark, KEGG, and Reactome gene sets between two clusters.
The expression patterns of the two clusters were shown to be significantly different according to PCA (Figure 11A). To further validate the clusters, 702 DEGs were identified between the two clusters based on the integrated expression profile, including 277 upregulated genes and 425 downregulated genes in cluster B (Figure 11B, Supplementary Table S3). Subsequently, we analyzed the biologically relevant functions of DEGs between the two clusters. For the GO-BPs analysis, these DEGs were primarily enriched in cytokine-mediated pathway, cell chemotaxis, leukocyte activation and migration, T cell differentiation, granulocyte migration, and cell-cell adhesion regulation. For the GO-CCs analysis, these DEGs were involved in tertiary granule lumen, external side of plasma membrane, serine-type peptidase complex, and tertiary granule. For the GO-MFs analysis, DEGs were enriched in chemokine receptor binding, cytokine receptor binding, and the activity of immune components, such as receptor ligand, cytokine, cytokine receptor, and chemokine (Figure 11C). In addition, KEGG analyses revealed that these DEGs were highly enriched in various pathways, including TNF signaling pathway, cytokine-cytokine receptor interaction, IL-17 signaling pathway, FoxO signaling pathway, NOD-like receptor pathway, lipid and atherosclerosis, complement and coagulation cascades, PI3K-Akt pathway, NF-kappa B pathway, and chemokine pathway (Figure 11D). These results suggested that two different molecular subtypes existed in IS patients.

Differential and functional analysis of two immune clusters. (A) PCA plot depicting that IS samples were categorized as two clusters. (B) Volcano plot describing the differential expression of genes across two clusters. Red nodes represented upregulated genes, while blue nodes represented downregulated genes. (C) Main BPs, CCs, and MFs enriched by cluster-specific DEGs. (D) Main KEGG pathways enriched by cluster-specific DEGs.
The regulatory network of miRNAs and TFs provided insights into the interaction between miRNAs, TFs, and the hub genes. The regulatory network of miRNAs-TFs, built by RegNetwork, consisted of 143 nodes and 167 edges (Figure 12). Of the 143 nodes, 8 were hub genes, 36 were TFs, and 99 were miRNAs. MAX was predicted to regulate OGT, SLC35C2, and B4GALT5, while THBS1, SLC35C2, and B4GALT5 could be regulated by EGR1 and USF1. OGT was a possible downstream target of miR-101, miR-1271, miR-16, miR-182, miR-24, miR-340, miR-497, miR-548c-3p, and miR-96, while THBS1 was predicted to be regulated by miR-101, miR-19a, miR-19b, miR-21, and miR-340. This network suggested that these miRNAs or TFs might regulate the expression of some crucial mRNAs in IS, thereby playing key roles in the development of IS. These discoveries offered guidance for future therapeutic strategies targeting these genes.
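The regulator-target relationships named in this paragraph can be represented as a small mapping and queried for shared regulators. The Python sketch below encodes only the edges explicitly mentioned in the text, not the full network of 167 edges, and the variable names are illustrative.

```python
from collections import defaultdict

# Regulators of each hub gene, restricted to those named in the text.
regulators = {
    "OGT": {
        "MAX", "miR-101", "miR-1271", "miR-16", "miR-182", "miR-24",
        "miR-340", "miR-497", "miR-548c-3p", "miR-96",
    },
    "SLC35C2": {"MAX", "EGR1", "USF1"},
    "B4GALT5": {"MAX", "EGR1", "USF1"},
    "THBS1": {"EGR1", "USF1", "miR-101", "miR-19a", "miR-19b", "miR-21", "miR-340"},
}

# Invert the mapping: regulator -> set of hub genes it targets.
targets_of = defaultdict(set)
for gene, regs in regulators.items():
    for reg in regs:
        targets_of[reg].add(gene)

# Regulators predicted to act on both OGT and THBS1.
shared_ogt_thbs1 = regulators["OGT"] & regulators["THBS1"]

# Regulators with at least three hub-gene targets among the listed edges.
multi_target = sorted(r for r, t in targets_of.items() if len(t) >= 3)
```

Among the listed edges, miR-101 and miR-340 are shared by OGT and THBS1, and the three TFs MAX, EGR1, and USF1 each target three hub genes.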

The miRNAs-TFs-hub genes regulatory network. The RegNetwork database was used to characterize the miRNAs-TFs regulatory network. This network comprised 143 nodes and 167 edges. Red color denoted hub genes.
Discussion
Nucleocytoplasmic proteins frequently undergo O-GlcNAcylation, which is a prevalent form of post-translational modification. O-GlcNAcylation controls various cellular processes including protein synthesis, gene expression, cell signaling, metabolic pathways, and programmed cell death.8–10,35 The deregulation of O-GlcNAcylation has been linked to a range of human diseases including cancer, diabetes, cardiovascular disorders, Alzheimer's disease, and stroke.14–16 A better understanding of the role of O-GlcNAcylation in pathophysiological mechanisms could facilitate the development of novel strategies for therapeutic intervention.
In mouse models of cerebral ischemia, there was an initial increase in O-GlcNAcylation (within 1–4 h after ischemia) followed by a subsequent decrease. The slight rise in brain O-GlcNAcylation ameliorated ischemia-reperfusion (I/R) injury and neurological impairments, whereas the disruption of the temporary increase in O-GlcNAcylation worsened the brain damage and increased mortality. 36 The ischemic regions of human brains also exhibited dynamic changes in O-GlcNAcylation. 36 The collective data indicate that O-GlcNAcylation has a significant regulatory function in cerebral I/R injury and also suggest that enhancing O-GlcNAcylation could be a potential approach to improve the clinical outcome of IS.37,38 The XBP1s/HBP/O-GlcNAc axis is formed by combining spliced X-box binding protein-1 (XBP1s) with the hexosamine biosynthetic pathway (HBP) and O-GlcNAcylation.37,39 Extensive research has been conducted on this axis in cases of heart and brain I/R injury. In the brain, the upregulation of XBP1s in neurons enhanced the expression of HBP enzymes and elevated the levels of O-GlcNAcylation. Simultaneously with the activation of XBP1s, there was a rapid rise in O-GlcNAcylation following stroke, specifically in neurons situated in the penumbra. The brain of neuron-specific OGT knockout mice exhibited reduced O-GlcNAcylation activation and experienced a more unfavorable outcome. 40 The data presented here demonstrate that the XBP1s/HBP/O-GlcNAc axis is functional in the brain and its activation offers neuroprotection following IS.38,41 O-GlcNAcylation has been implicated in the regulation of cellular stress responses, which are activated during ischemic events. The modification of proteins involved in stress response pathways can enhance cell survival and reduce apoptosis, thereby providing a protective effect against ischemic injury.
This protective role of O-GlcNAcylation is particularly evident in the context of I/R injury, where it has been shown to attenuate calcium overload, reduce mitochondrial permeability transition pore opening, and modulate inflammatory responses. Increasing O-GlcNAcylation is a promising cytoprotective strategy to improve functional outcomes after stroke. Thiamet-G or glucosamine could enhance O-GlcNAcylation and significantly improve neurologic function after stroke.38,40
Nevertheless, the molecular mechanisms and signaling pathways linked to O-GlcNAcylation in IS have not been extensively explored. In the present study, we first obtained the integrated dataset from the GSE16561 and GSE22255 cohorts by removing the batch effects. The functional analysis of DEGs between the IS and control samples revealed enrichment of mitochondrial gene expression, cytokine production, NF-kappa B pathway, immune response, T cell receptor pathway, apoptosis, and lymphocyte differentiation, which was consistent with previous studies. 42 By intersecting the DEGs identified in the combined dataset with O-GlcNAcylation related genes, we obtained nineteen O-GlcNAcylation related DEGs, including six upregulated and thirteen downregulated genes in IS patients. The analysis of these DEGs indicated that they were primarily enriched in the O-linked glycosylation process. Then, eight hub genes, THBS1, GALNT12, SLC35C2, OGT, B4GALT5, PLOD2, POMT1, and SPON1, were identified by three machine learning algorithms. Additionally, we assessed the diagnostic accuracy of every hub gene in the integrated dataset. The AUC values for these hub genes indicated their central involvement in O-GlcNAcylation mediated pathogenesis of IS, though their predictive performance requires validation in independent patient cohorts. Moreover, we found strong synergistic or antagonistic interactions among hub genes in IS.
THBS1, a critical mediator of hemostasis, promotes platelet activation in thrombus formation. miR-487b enhanced the growth, migration, and tube formation of umbilical vein endothelial cells by regulating THBS1 in IS, 43 and THBS1 was identified as one of the most valuable immunosuppression-related genes, which might be proposed as a potential target for stroke. 44 In the present work, THBS1 was upregulated in IS patients. However, Chen et al. found no significant association of THBS1 expression level with IS risk or long-term death after IS. 45 The GALNT gene family is responsible for initiating O-glycan synthesis. GALNT12 deficiency was found to be associated with colorectal cancer susceptibility due to glycosylation pathway defects. 46 In addition, GALNT12 facilitated the malignant characteristics of glioblastoma via the PI3K/Akt/mTOR axis and might serve as a potential target for glioblastoma. 47 OGT reduction contributed to the hypoxia-induced inflammatory response in vascular endothelial cells, while upregulated OGT protected against cerebral I/R injury by inhibiting Drp1 in mice, suggesting that an increased O-GlcNAcylation level could promote cell survival under cellular stress, making OGT a potential target for intervention in I/R injury. 48 SLC35C2 can function as a transporter of GDP-fucose, competing with SLC35C1. Alternatively, it can act as a factor that boosts the fucosylation of Notch, which is essential for efficient Notch signaling in mammals. 49 The downregulation of GALNT12, SLC35C2, and OGT in IS patients may result in glycosylation pathway defects and a decreased O-GlcNAcylation level.
Diabetes and obesity showed a positive correlation with the expression of B4GALT5. By promoting adipocyte commitment and reducing inflammation caused by M1 macrophages, downregulation of B4GALT5 improved insulin resistance. 50 The upregulation of B4GALT5 in IS may promote hypoxia-induced inflammation. PLOD2 is expressed in fibroblasts, where it is required for tube formation, collagen fiber alignment, and extracellular matrix stiffening. PLOD2 was increased in the early stage after cerebral ischemia in young rats, suggesting an essential role in post-stroke angiogenesis. 51 POMT1 is a critical enzyme involved in protein O-mannosylation. In the nervous system, POMT1 is expressed especially in astrocytes, immature neurons, and mature neurons. 52 Mutations in the POMT1 gene, which lead to a decrease in glycosylated alpha-dystroglycan, have been related to congenital muscular dystrophies with concomitant central nervous system lesions. 53 By binding to the BACE1 binding site of amyloid precursor protein, SPON1 inhibits the initiation of amyloidogenesis. Additionally, SPON1 can alleviate memory dysfunction and cognitive impairment in a mouse model of Alzheimer's disease. 54 The downregulation of SPON1 in IS could worsen the brain damage and neurological deficits caused by ischemia.
Numerous studies have demonstrated that the immune system is involved throughout the whole process of stroke, including the mechanisms underlying risk factors, neurotoxicity, tissue repair and remodeling. The communication between the nervous system and the immune system regulates the development of stroke. Cerebral ischemia after stroke can induce the initial immune inflammatory response, aggravate the neurological deficit, and lead to stroke-associated immunosuppression, which is related to susceptibility to infection. Peripheral immune cells, such as dendritic cells, neutrophils, B cells, T cells, and macrophages, cross the blood-brain barrier into the ischemic brain tissue after IS. 55 Compared with healthy controls, activated dendritic cells, eosinophils, macrophages, neutrophils, mast cells, natural killer cells, and Th17 cells showed higher infiltration levels in IS. This further illustrates the significance of immunity during the progression of IS. The interplay between O-GlcNAcylation and immune function is crucial, especially in the context of stroke, where inflammation and immune responses are pivotal in determining the extent of neuronal damage and recovery. O-GlcNAcylation can modulate the activity of immune cells, such as lymphocytes and monocytes, which are involved in the inflammatory cascade following a stroke. This modification can alter the expression of cytokines and chemokines, which are critical mediators of inflammation and immune cell recruitment to the site of injury. Moreover, we found that the eight O-GlcNAcylation hub genes had significant correlations with the infiltration of immune cells. The GSEA findings also indicated that these hub genes were highly correlated with the immune system and metabolism. Therefore, the hub genes could potentially regulate immune characteristics in the development of IS.
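A correlation between hub gene expression and immune cell infiltration scores can be sketched as follows. The choice of the Spearman rank correlation is an assumption made for illustration, since the correlation method is not specified in this section, and the helper names `average_ranks`, `pearson` and `spearman` are hypothetical. The input vectors are toy values, not data from the study.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Toy values: expression of one gene and one infiltration score per sample.
    gene_expression = [5.1, 6.3, 4.8, 7.0, 6.1]
    infiltration_score = [0.20, 0.35, 0.15, 0.50, 0.30]
    print(round(spearman(gene_expression, infiltration_score), 3))
```

In practice each hub gene would be paired with each immune cell type, and the resulting p-values would need correction for multiple testing.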
The miRNA-TF regulatory network provided information on the interactions among miRNAs, TFs, and the hub genes. C-MYC protects mouse models from ischemia by increasing the expression of SIRT1, which is targeted by miR-200b-5p, while EGR1 expression exacerbates brain injury by reducing BDNF expression.56,57 miR-302a-3p overexpression targets E2F1 expression, promotes nerve repair, and alleviates IS. 58 In a mouse model of stroke with middle cerebral artery occlusion, miR-1224 negatively regulates the activation of NK cells through an SP1-dependent mechanism. 59 miR-101 protects neuronal cells from apoptosis and brain injury by regulating the JAK2/STAT3 signaling pathway, and miR-21 protects against ischemic injury by targeting the p53/Bcl-2/Bax pathway.60,61 Elevated miR-19a-3p mediates ischemic injury by inhibiting ADIPOR2, and low expression of miR-497 is closely associated with poor prognosis in patients with IS. 62 The network analysis suggests that these miRNAs or TFs may have significant involvement in the development of IS through complex transcriptional regulatory mechanisms.
Through consensus clustering, the IS samples were further categorized into two clusters according to the expression profile of the eight O-GlcNAcylation hub genes. The GSVA analysis found that cluster A had higher enrichment scores in protein secretion, mismatch repair, MYC targets, E2F targets, primary immunodeficiency, N-glycan biosynthesis, and nucleotide biosynthesis, whereas cluster B had higher enrichment scores in coagulation, inflammatory response, KRAS signaling, Hedgehog signaling, TNFα signaling, Notch signaling, TP53 activity regulation, and interleukin-10 signaling. Subsequently, the biological functions of the DEGs between the two clusters were analyzed, and these DEGs were primarily concentrated in pathways and processes associated with the immune system. These results indicate that the two clusters mediated by O-GlcNAcylation have different immunological features and molecular mechanisms underlying the process of IS. Therefore, it has become crucial to develop novel medications that target O-GlcNAcylation and immune-related pathways in IS.
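The idea behind consensus clustering can be sketched as repeated clustering of random subsamples, followed by counting how often each pair of samples lands in the same cluster. The sketch below is a simplified illustration, not the procedure used in this study: the base clusterer (a plain k-means), the 80% subsampling fraction, the number of resamples, and the function names `kmeans` and `consensus_matrix` are all assumptions, and the input is a toy dataset with a single feature per sample.

```python
import random
from typing import List, Sequence


def kmeans(points: Sequence[Sequence[float]], k: int, rng: random.Random,
           iters: int = 20) -> List[int]:
    """Plain k-means with random initial centers; returns one label per point."""
    centers = [list(p) for p in rng.sample(list(points), k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each non-empty center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def consensus_matrix(points: Sequence[Sequence[float]], k: int, n_resamples: int = 50,
                     frac: float = 0.8, seed: int = 0) -> List[List[float]]:
    """Fraction of co-sampled runs in which each pair of samples shares a cluster."""
    rng = random.Random(seed)
    n = len(points)
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    size = max(k, int(frac * n))
    for _ in range(n_resamples):
        idx = rng.sample(range(n), size)
        labels = kmeans([points[i] for i in idx], k, rng)
        for a, i in enumerate(idx):
            for b, j in enumerate(idx):
                sampled[i][j] += 1
                if labels[a] == labels[b]:
                    together[i][j] += 1
    return [
        [together[i][j] / sampled[i][j] if sampled[i][j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    # Toy data: two well separated groups of three samples, one feature each.
    toy = [[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]]
    for row in consensus_matrix(toy, k=2):
        print([round(v, 2) for v in row])
```

Consensus values close to 1 or 0 for most pairs indicate a stable partition; the number of clusters is usually chosen by comparing the consensus matrices obtained for several values of k.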
There were, however, some limitations to our study. First, our findings relied solely on bioinformatics analysis and lack confirmation from biological experiments. Second, the sample size of the GSE16561 and GSE22255 cohorts was relatively small, and more prospective data with larger sample sizes are required to confirm our results. In addition, there may be a selection bias because our data came from a public database. Lastly, further research is required to clarify the mechanisms of the O-GlcNAcylation hub genes in the pathophysiological processes of IS, and the potential application of the hub genes in IS diagnosis needs more exploration.
Conclusion
This study provides, for the first time, a comprehensive analysis of O-GlcNAcylation related genes in the pathogenesis of IS. We demonstrate a strong association of O-GlcNAcylation related genes with immune infiltration and immunological responses among IS patients. O-GlcNAcylation is a critical regulatory mechanism that influences immune responses and cellular stress pathways in IS. By modulating the activity of immune cells and stress response proteins, O-GlcNAcylation can potentially alter the course of ischemic injury and recovery. The eight O-GlcNAcylation hub genes selected through machine learning showed potential for the diagnosis of IS and for the classification of distinct IS subtypes. Understanding the precise mechanisms by which O-GlcNAcylation affects these processes could lead to novel therapeutic strategies for managing stroke and its associated complications.
Supplemental Material
Supplemental material for Identification of O-GlcNAcylation related genes and immune infiltration profile in ischemic stroke utilizing bioinformatics and machine learning by Hongchao Liu, Zhihao Wei, Yajun Yang, Yu Zhang and Jiaqiong Li in Technology and Health Care:
sj-docx-1-thc-10.1177_09287329251389496
sj-xls-2-thc-10.1177_09287329251389496
sj-csv-3-thc-10.1177_09287329251389496
sj-xls-4-thc-10.1177_09287329251389496
Footnotes
Author contributions
Liu HC and Wei ZH conceived the study. Yang YJ, Zhang Y and Li JQ participated in the study design, data analysis and statistics. Liu HC drafted the manuscript. All authors read and approved the final manuscript.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Henan Provincial Medical Science and Technology Research Project (LHGJ20230835), and the Medical Key Cultivation Discipline Program of Luoyang (STE-2022-5).
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data availability statement
The datasets presented in the current study can be found in the GEO repository (accession numbers: GSE16561 and GSE22255).
Supplementary material
The supplementary material for this article can be found online.
References