Abstract
Naotai formula (NTF) is clinically used for stroke treatment, yet its molecular mechanisms involving vascular and metabolic regulation remain unclear. This study combines network pharmacology (NP) and Mendelian randomization to explore NTF’s therapeutic targets and pathways in stroke. Stroke-related genes were sourced from public databases, and NTF’s active compounds were screened using SwissADME. Summary-data-based Mendelian randomization (SMR) analysis, combined with colocalization, integrated stroke genome-wide association study data with blood expression quantitative trait loci and protein quantitative trait loci datasets to identify genes/proteins causally linked to stroke risk. Protein–protein interaction (PPI) and drug-compound-target networks were constructed using Cytoscape and R. Gene Ontology and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses identified functional roles. Molecular docking assessed interactions between key compounds and prioritized targets. A total of 579 overlapping genes linked NTF and stroke. SMR identified 44 stroke-associated genes, with vascular endothelial growth factor A, angiotensinogen (AGT), and lipoprotein(a) replicated. Validation in Brain-eMeta supported eight of these targets, indicating tissue relevance. Enrichment analyses highlighted pathways including PPAR signaling, cholesterol metabolism, and vascular function. Core targets (Adiponectin, C1Q and collagen domain containing (ADIPOQ), Scavenger Receptor Class B Member 1 (SCARB1), and AGT) emerged from PPI networks. Molecular docking predicted favorable binding between NTF’s Calycosin and AGT, a key renin–angiotensin system protein. NTF likely mitigates stroke by modulating genes involved in cholesterol metabolism and vascular regulation. The predicted Calycosin–AGT interaction provides a genetically informed hypothesis for a possible role in renin–angiotensin modulation. This integrative approach provides genetic and mechanistic insights into NTF’s therapeutic efficacy.
INTRODUCTION
Stroke ranks as the second leading cause of both disability and death globally, disproportionately affecting low- and middle-income countries. 1 The global incidence of stroke continues to rise annually, reaching a striking 13.7 million new cases in 2016 alone. 2 Despite this high prevalence, effective acute treatments remain critically underutilized worldwide. Fewer than 5% of ischemic stroke patients receive timely thrombolysis, and access to mechanical thrombectomy remains limited. 3 This treatment gap fuels devastating long-term disability and diminishes quality of life for millions. Stroke manifests with heterogeneous clinical symptoms, ranging from subtle neurological impairments to profound paralysis, speech deficits, and cognitive dysfunction. 4 Consequently, stroke places an enormous strain on health care systems due to the substantial costs of acute care, long-term rehabilitation, and ongoing management of complications. 5 Globally, the estimated economic burden of stroke exceeds US$721 billion, representing 0.66% of the global GDP in 2022. 6 Thus, the development of effective, accessible, and affordable stroke therapies is an urgent global health priority.
Traditional Chinese Medicine (TCM) formulas, including Naotai formula (NTF), are clinically utilized for stroke management in China. 7 NTF is a composite herbal remedy consisting of Astragalus mongholicus (Radix Astragali), Ligusticum chuanxiong, Pheretima, and Bombyx batryticatus. 8 A clinical trial protocol has been registered to investigate NTF for hypertensive cerebral small vessel disease, 9 suggesting clinical interest in its cerebrovascular applications. Preclinical studies suggest NTF may exert neuroprotective, anti-inflammatory, and circulation-enhancing effects relevant to stroke. For example, NTF mitigates ischemic stroke-induced neuronal ferroptosis by modulating iron metabolism homeostasis.10–12 Specifically, NTF upregulates the Heat Shock Factor 1/Heat Shock Protein Family B 1 (HSF1/HSPB1) pathway, inhibits transferrin receptor 1 expression to reduce neuronal iron uptake, and enhances ferritin heavy chain 1 expression to increase ferritin iron storage. 10 However, its broader molecular mechanisms remain incompletely understood.
Network pharmacology (NP) offers a systems-level approach to investigate the multicomponent nature of TCM formulas.13,14 By constructing drug-compound-target-disease networks, it facilitates systematic identification of potential drug targets and pathways. 15 This framework has been widely applied in TCM research using TCMSP-based screening for hypothesis generation. 16 While NP excels at hypothesis generation, it primarily reveals correlations. 17 Mendelian randomization (MR) provides a complementary approach to infer causality using genetic variants as instrumental variables. 18 Recent studies have demonstrated the translational utility of combining MR with hypothesis-generating pipelines, including in stroke. For instance, Zhang et al. used MR to prioritize druggable genes for cerebrovascular disease. 19 Moreover, a workflow integrating NP, MR, and molecular docking has recently been applied to Zhenbao pills in stroke treatment. 20
Based on these frameworks, this study aimed to elucidate the mechanisms of NTF in stroke treatment. We employed NP to identify potential molecular targets and pathways of NTF in stroke. Subsequently, we utilized MR to assess the causal relevance of NP-identified stroke-related target genes. Finally, molecular docking was used to explore interactions between key formula compounds and target proteins. This integrative strategy offers a systemic approach to dissecting TCM mechanisms and has the potential to identify new therapeutic targets and contribute to precision stroke treatment and drug development. Our analysis suggests that NTF influences stroke-relevant processes such as vascular regulation and lipid metabolism, which are well established in the pathophysiology of atherosclerosis and cerebrovascular disease. 21
MATERIALS AND METHODS
Study Design
This study employed an integrated approach of NP, MR, and molecular docking to investigate the mechanisms of NTF in stroke (Fig. 1). The workflow comprised compound screening, target prediction, integration with stroke-related genes, genetic causal inference, functional enrichment, and molecular docking validation.

Fig. 1. Study design and identification of candidate therapeutic targets.
Identification of Potential Targets
Bioactive compounds of NTF were retrieved from TCMSP 22 (https://old.tcmsp-e.com/tcmsp.php, accessed January 2025), BATMAN-TCM 23 (http://bionet.ncpsb.org.cn/batman-tcm/#/home, accessed January 2025), and the Similarity Ensemble Approach 24 (https://sea.bkslab.org/, accessed January 2025). Both “known” and machine learning-predicted targets from BATMAN-TCM (score > 0.84) were included. Pharmacokinetic properties were evaluated using SwissADME 25 (http://www.swissadme.ch/, accessed January 2025), and compounds were retained if they had a bioavailability score > 0.5, high gastrointestinal absorption, and satisfied at least two of five drug-likeness rules (Lipinski, Ghose, Veber, Egan, and Muegge). Compounds with SMILES strings exceeding 200 characters were excluded. Predicted targets of the retained compounds were compiled.
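The retention criteria above can be expressed as a small filter. The following is a minimal sketch, not the authors' pipeline: the `Compound` record, its field names, and the example entries are hypothetical, and it assumes the SwissADME results have already been exported into such records.

```python
from dataclasses import dataclass

# The five drug-likeness rule sets named in the text.
DRUG_LIKENESS_RULES = ("lipinski", "ghose", "veber", "egan", "muegge")


@dataclass(frozen=True)
class Compound:
    """Hypothetical record holding the SwissADME fields used for screening."""
    name: str
    smiles: str
    bioavailability_score: float
    gi_absorption: str        # "High" or "Low"
    rules_passed: frozenset   # names of the drug-likeness rules satisfied


def passes_screen(c, min_rules=2, max_smiles_len=200):
    """Apply the four retention criteria described in the text."""
    n_rules = sum(1 for rule in DRUG_LIKENESS_RULES if rule in c.rules_passed)
    return (
        c.bioavailability_score > 0.5
        and c.gi_absorption.strip().lower() == "high"
        and n_rules >= min_rules
        and len(c.smiles) <= max_smiles_len
    )


# Illustrative entries only; names, SMILES, and values are placeholders.
compounds = [
    Compound("compound_a", "C" * 20, 0.55, "High", frozenset({"lipinski", "veber", "egan"})),
    Compound("compound_b", "C" * 20, 0.55, "Low", frozenset(DRUG_LIKENESS_RULES)),
    Compound("compound_c", "C" * 20, 0.55, "High", frozenset({"lipinski"})),
    Compound("compound_d", "C" * 250, 0.85, "High", frozenset(DRUG_LIKENESS_RULES)),
]

retained = [c for c in compounds if passes_screen(c)]
print([c.name for c in retained])
```

Treating the 200-character limit as inclusive is an assumption; the text only states that longer SMILES strings were excluded.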
Acquisition of Stroke-Related Disease Genes
Stroke-related genes were obtained from OMIM 26 (https://www.omim.org/, accessed January 2025), MalaCards 27 (https://www.malacards.org/, accessed January 2025), and GeneCards 28 (https://www.genecards.org/, accessed January 2025). For GeneCards, only genes with relevance scores in the top 25% for “stroke” were included. The intersection of compound-related targets and stroke-associated genes defined the candidate set for subsequent causal analyses.
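The construction of the candidate set can be sketched as set operations. This is an illustrative reading rather than the authors' code: it assumes the three disease sources are pooled by union, that "top 25%" means the top quarter of genes by relevance score (rounded down), and all scores below are invented for the example.

```python
def top_quartile(scores):
    """Return the genes whose relevance score falls in the top 25%.

    Assumption: the cutoff is by rank, rounding the quartile size down
    and keeping at least one gene when any are present.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    keep = max(1, len(ranked) // 4)
    return set(ranked[:keep])


def candidate_targets(compound_targets, omim, malacards, genecards_scores):
    """Intersect compound targets with the pooled stroke-related genes."""
    disease_genes = set(omim) | set(malacards) | top_quartile(genecards_scores)
    return set(compound_targets) & disease_genes


# Toy inputs; gene symbols are real, memberships and scores are illustrative.
genecards_scores = {
    "AGT": 50.0, "VEGFA": 45.0, "LPA": 30.0, "TP53": 10.0,
    "ALB": 8.0, "INS": 5.0, "EGFR": 3.0, "MYC": 1.0,
}
omim = {"NOTCH3"}
malacards = {"LPA", "COL4A1"}
compound_targets = {"AGT", "LPA", "ADIPOQ", "MYC"}

candidates = candidate_targets(compound_targets, omim, malacards, genecards_scores)
print(sorted(candidates))
```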
MR Analysis and Colocalization Analyses
For MR, we utilized summary-level data from large-scale stroke genome-wide association studies (GWAS; primary: C_STROKE FinnGen R12; validation: UK Biobank, GWAS Catalog). Blood expression and protein quantitative trait loci (eQTL/pQTL) data were sourced from the eQTLGen Consortium 29 and three European cohorts (Fenland, 30 UKB-PPP, 31 and deCODE 32 ), respectively. Detailed descriptions of all databases, screening criteria, and cohort sample sizes are provided in the Supplementary Data. The use of public data obviated the need for further ethical approval.
We conducted a two-sample summary data-based MR (SMR) analysis to infer causal effects of genetically predicted gene/protein expression on stroke risk, using cis-QTLs as instrumental variables. Associations with p_SMR < 0.05 and p_HEIDI > 0.01 were considered nominally significant, reflecting common practice in exploratory SMR analyses. To account for multiple testing, False Discovery Rate (FDR)-adjusted p-values (Benjamini–Hochberg) were also reported, but nominal thresholds were retained for primary inference. Colocalization analysis was then performed on significant SMR hits to test for a shared causal variant (posterior probability [PP].H4 > 0.5, PP.H3 < 0.5), following established guidelines.33,34
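To make the thresholds concrete, the sketch below computes the standard single-instrument SMR estimate (the ratio of the variant-outcome effect to the variant-exposure effect, with a delta-method standard error and an approximate one-degree-of-freedom chi-square test) and then applies the nominal, FDR, and colocalization rules stated above. It is an illustration under stated assumptions, not the SMR software itself: the HEIDI test needs linkage disequilibrium information and is taken here as a precomputed input, and all numeric inputs are invented.

```python
import math


def smr_test(b_zx, se_zx, b_zy, se_zy):
    """Single-instrument SMR estimate from QTL (zx) and GWAS (zy) effects.

    Returns (b_xy, se_xy, p_smr). The test statistic
    T = z_zx^2 * z_zy^2 / (z_zx^2 + z_zy^2) is treated as chi-square, 1 df.
    """
    z_zx = b_zx / se_zx
    z_zy = b_zy / se_zy
    b_xy = b_zy / b_zx
    # Delta-method standard error of the ratio estimate.
    se_xy = math.sqrt(se_zy**2 / b_zx**2 + (b_zy**2 * se_zx**2) / b_zx**4)
    t_smr = (z_zx**2 * z_zy**2) / (z_zx**2 + z_zy**2)
    # Upper tail of a 1-df chi-square: P(Z^2 > T) = erfc(sqrt(T / 2)).
    p_smr = math.erfc(math.sqrt(t_smr / 2.0))
    return b_xy, se_xy, p_smr


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):          # walk from largest p to smallest
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def is_nominal_hit(p_smr, p_heidi):
    """Nominal significance rule stated in the text."""
    return p_smr < 0.05 and p_heidi > 0.01


def is_colocalized(pp_h4, pp_h3):
    """Colocalization rule stated in the text."""
    return pp_h4 > 0.5 and pp_h3 < 0.5


# Invented inputs for illustration only.
b_xy, se_xy, p_smr = smr_test(b_zx=0.5, se_zx=0.05, b_zy=0.04, se_zy=0.015)
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(b_xy, se_xy, p_smr, adjusted)
```

Because nominal thresholds drive primary inference here, a gene can pass `is_nominal_hit` while its FDR-adjusted p-value exceeds 0.05, so the two columns are best read side by side.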
PPI Network and Enrichment Analyses
High-confidence targets identified through MR were used to construct a protein–protein interaction (PPI) network via the STRING database. 35 Genes that did not reach the predefined interaction confidence (STRING score > 0.4, default setting) were not included in the main network. Network topology was analyzed using the igraph R package. Seven centrality parameters were calculated for each node. Each gene was ranked across these measures, and an average rank was derived to generate a composite score. The top-ranked genes were defined as core targets. Details of the core target selection rule are provided in the Supplementary Data. We then performed Gene Ontology and KEGG pathway enrichment analyses using the “clusterProfiler” R package (p < 0.05).
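The average-rank composite can be illustrated in a few lines. This sketch is a simplified stand-in for the igraph-based analysis: it assumes the centrality values have already been computed, uses only two measures instead of seven, gives tied genes their mean rank (an assumption; the tie rule is not stated in the text), and the numbers are illustrative rather than study results.

```python
def rank_descending(values):
    """Map each gene to its rank (1 = largest value); ties share the mean rank."""
    ordered = sorted(values, key=values.get, reverse=True)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over the run of genes tied with ordered[i].
        while j + 1 < len(ordered) and values[ordered[j + 1]] == values[ordered[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[ordered[k]] = mean_rank
        i = j + 1
    return ranks


def composite_scores(measures):
    """Average each gene's rank across all centrality measures (lower is better)."""
    genes = set.intersection(*(set(m) for m in measures.values()))
    per_measure = [
        rank_descending({g: m[g] for g in genes}) for m in measures.values()
    ]
    return {g: sum(r[g] for r in per_measure) / len(per_measure) for g in genes}


# Illustrative centrality values; not taken from the study's network.
measures = {
    "degree": {"ADIPOQ": 9, "SCARB1": 7, "AGT": 7, "FN1": 4},
    "betweenness": {"ADIPOQ": 0.30, "SCARB1": 0.10, "AGT": 0.25, "FN1": 0.05},
}

scores = composite_scores(measures)
ranking = sorted(scores, key=scores.get)
print(ranking)
```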
Drug–Target Interaction Prediction and Molecular Docking
Molecular docking simulations were performed with AutoDock Vina. 36 Three-dimensional structures of ligands were obtained from PubChem and protein receptors from the Protein Data Bank (PDB). A binding affinity < –5.0 kcal/mol was considered to indicate stable binding. All detailed statistical parameters are available in the Supplementary Data.
Validation Analyses in Brain Tissue
To assess tissue relevance, we validated candidate genes using the Brain-eMeta eQTL resource, 37 which was generated by meta-analyzing GTEx brain samples (n ≈ 233), the CommonMind Consortium (n = 467), 38 and ROSMAP (n = 494) 39 cohorts through the MeCS framework, yielding an effective sample size of ∼1,194. Candidate NTF–stroke targets with expression data available in Brain-eMeta were tested using SMR. Associations with p_SMR < 0.05 and p_HEIDI > 0.01 were considered nominally significant.
RESULTS
A Network-Based Approach Identifies Hundreds of Potential Stroke Targets for NTF
To begin dissecting the complex pharmacology of NTF, we first constructed a comprehensive map of its potential molecular interactions in stroke. An initial screen of traditional medicine databases identified 1,517 protein targets for NTF’s 313 bioactive compounds (Supplementary Table S1). In parallel, we compiled a list of 2,140 genes associated with stroke from public genetic databases (Supplementary Table S2). Intersecting these datasets revealed 579 candidate genes at the interface of NTF pharmacology and stroke pathology (Fig. 1B, Supplementary Table S3), providing a foundational pool of targets for subsequent causal validation.
MR Distills Candidate List to Genetically Validated Causal Targets
With hundreds of potential targets identified, our next critical step was to distinguish which of these were likely to be causally involved in stroke pathogenesis. To achieve this, we employed a two-sample summary-data-based MR (SMR) approach. Analysis of blood eQTL data 29 revealed that the genetically predicted expression of 44 genes was significantly associated with stroke risk (Fig. 2A). Colocalization analysis further refined this list, indicating that seven of these signals, including that for ATG7 (PP.H4 = 0.954, PP.H3 = 0.17), likely arise from a shared causal variant (Fig. 2B). We then extended this validation to the proteome level using three pQTL cohorts,30–32 identifying 14, 15, and 8 proteins, respectively, with causal links to stroke, of which 6, 5, and 5 showed colocalization evidence (Fig. 2C). This multilayered genetic evidence, supported by cross-cohort replication of angiotensinogen (AGT) and lipoprotein(a) (LPA) in the pQTL analyses (Table 1) and colocalization for proteins such as ADIPOQ (PP.H4 = 0.578, PP.H3 = 0.08; Fig. 2D), yielded a nonredundant union of 16 high-confidence targets (Table 1, Supplementary Tables S4–S6 and Supplementary Figures S1–S4). The complete selection process is depicted in Supplementary Figure S5.
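For readers unfamiliar with the PP.H3 and PP.H4 quantities, the sketch below shows how colocalization posteriors are commonly derived from per-variant approximate Bayes factors: H3 corresponds to two distinct causal variants and H4 to a single shared one. It is a simplified illustration rather than the analysis run in this study. The prior values (1e-4, 1e-4, 1e-5) are assumed defaults that the text does not state, and the log Bayes factors are invented.

```python
import math


def logsumexp(xs):
    """Numerically stable log(sum(exp(x))) for a list of log values."""
    m = max(xs)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(x - m) for x in xs))


def coloc_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities [H0, H1, H2, H3, H4] for one region.

    labf1 / labf2: per-variant log approximate Bayes factors for trait 1
    (e.g. the QTL) and trait 2 (e.g. stroke), over the same variants.
    """
    sum1 = logsumexp(labf1)
    sum2 = logsumexp(labf2)
    sum12 = logsumexp([a + b for a, b in zip(labf1, labf2)])
    # H3 counts pairs of *different* variants: all pairs minus same-variant pairs.
    ratio = math.exp(sum12 - sum1 - sum2)
    if ratio >= 1.0:
        log_h3 = float("-inf")
    else:
        log_h3 = math.log(p1) + math.log(p2) + sum1 + sum2 + math.log1p(-ratio)
    log_h = [
        0.0,                        # H0: no association with either trait
        math.log(p1) + sum1,        # H1: trait 1 only
        math.log(p2) + sum2,        # H2: trait 2 only
        log_h3,                     # H3: both traits, distinct variants
        math.log(p12) + sum12,      # H4: both traits, shared variant
    ]
    norm = logsumexp(log_h)
    return [math.exp(h - norm) for h in log_h]


# Invented examples: both signals peak at the same variant, then at different ones.
shared = coloc_posteriors([0.1, 12.0, 0.3], [0.2, 11.0, 0.1])
distinct = coloc_posteriors([12.0, 0.1, 0.3], [0.2, 11.0, 0.1])
print(shared[4], distinct[3])
```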

Fig. 2. Causal inference for NTF targets on stroke risk using Mendelian randomization (MR).
Table 1. Significant SMR Associations Between Genetically Predicted Molecular Traits and Stroke Risk
The table presents significant causal associations identified by SMR (p_SMR < 0.05, p_HEIDI > 0.01). Data source indicates the origin of the summary statistics used for the analysis. Analysis type specifies whether the molecular trait is gene expression (eQTL) or protein abundance (pQTL). The “FinnGen (C_STROKE),” “Fenland,” “UKB-PPP,” and “deCODE” groups represent discovery analyses for eQTLs and pQTLs. The “GWAS Catalog (GCST9003861)” and “UK Biobank (6150_3)” groups represent replication analyses. OR (95% CI) refers to the odds ratio and 95% confidence interval for stroke risk per standard deviation increase in the molecular trait. Gene symbols for key findings discussed in the main text are shown in bold.
eQTL, expression quantitative trait loci; HEIDI, heterogeneity in dependent instruments test; pQTL, protein quantitative trait loci; SMR, summary-data-based Mendelian randomization.
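The odds ratios and confidence intervals described in the table note follow from the SMR effect estimate on the log-odds scale. A minimal sketch, assuming a normal approximation and using invented values for the estimate and its standard error:

```python
import math


def odds_ratio_ci(beta, se, z=1.959964):
    """Convert a log-odds estimate and its standard error to OR with a 95% CI.

    beta is the SMR estimate per standard deviation increase in the
    molecular trait; z is the two-sided 95% normal quantile.
    """
    return (
        math.exp(beta),
        math.exp(beta - z * se),
        math.exp(beta + z * se),
    )


# Invented numbers for illustration; not taken from Table 1.
or_point, or_low, or_high = odds_ratio_ci(beta=0.08, se=0.031)
print(f"OR = {or_point:.3f} (95% CI {or_low:.3f} to {or_high:.3f})")
```

An interval that excludes 1 corresponds to p_SMR < 0.05 under the same normal approximation.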
Functional Analysis Reveals Convergence of Causal Targets on Stroke-Relevant Pathways
Having established a core set of 16 high-confidence targets, we sought to understand their collective biological role. Of the original 313 compounds, 36 had predicted interactions with at least one of the 16 targets and therefore constituted the drug-compound-target network visualized in Fig. 3A. A subsequent PPI network demonstrated that these targets form a tightly connected functional module of 14 nodes and 26 interactions, rather than acting independently (Fig. 3B). The PPI network and the 14 core targets are presented in Supplementary Table S7. Hub proteins such as ADIPOQ (degree = 9) and SCARB1 were central to this network. Pathway enrichment analysis then revealed significant associations with cholesterol metabolism, vascular tone regulation, and PPAR signaling (Fig. 3C). These pathways share overlapping targets such as AGT, vascular endothelial growth factor A (VEGFA), and FN1, suggesting interconnected functional modules (full enrichment results in Supplementary Table S8).

Fig. 3. Network pharmacology and functional enrichment analyses of core stroke-related targets.
Molecular Docking Predicts a Direct Inhibitory Mechanism for Calycosin on AGT
The strong genetic link to AGT and its centrality in the enriched vascular pathways40,41 prompted us to investigate a more direct, molecular-level mechanism. We hypothesized that a specific NTF compound might physically interact with and modulate AGT function. To test this in silico, we performed molecular docking simulations. 36 Among 18 candidate compounds, Calycosin, a major isoflavonoid in NTF, was predicted to bind AGT with the highest affinity (binding energy of −6.3 kcal/mol; Fig. 4A). The docking model positioned Calycosin (PubChem CID: 5280448) within a binding pocket of AGT, forming interactions with residues near the renin cleavage site (Fig. 4B). This result provides a plausible molecular hypothesis, suggesting that Calycosin could interact with AGT to influence the renin–angiotensin system (RAS) cascade, but experimental confirmation is required (docking details in Supplementary Table S9).
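To put the docking score in more familiar terms, a binding energy can be converted to an approximate dissociation constant through ΔG = RT ln(Kd). The sketch below applies this to the reported −6.3 kcal/mol, under the assumptions that the docking score is read as a binding free energy and that the temperature is 298.15 K. Docking scores are empirical estimates, so the result is an order-of-magnitude guide only.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def kd_from_delta_g(delta_g_kcal, temperature_k=298.15):
    """Approximate dissociation constant (mol/L) from a binding free energy.

    Uses delta_G = R * T * ln(Kd); treating a docking score as delta_G
    is an assumption made for illustration.
    """
    return math.exp(delta_g_kcal / (R_KCAL * temperature_k))


def is_stable_binding(delta_g_kcal, threshold=-5.0):
    """Threshold from the Methods: affinity below -5.0 kcal/mol."""
    return delta_g_kcal < threshold


calycosin_agt = -6.3  # kcal/mol, as reported for Calycosin and AGT
kd_molar = kd_from_delta_g(calycosin_agt)
print(f"approximate Kd: {kd_molar * 1e6:.1f} micromolar")
print("passes stability threshold:", is_stable_binding(calycosin_agt))
```

Under these assumptions the score corresponds to a Kd in the tens-of-micromolar range, which fits the text's framing of the result as a hypothesis that needs experimental confirmation.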

Fig. 4. Molecular docking analysis of NTF compounds with the prioritized target AGT.
Validation in Brain Tissue
To validate the blood-based MR findings, we examined candidate genes in brain tissue datasets. Among the 579 candidate NTF–stroke targets, 153 genes had expression data available in the Brain-eMeta eQTL resource. SMR analysis in the C_STROKE cohort identified 16 of these genes as nominally associated with stroke risk (p_SMR < 0.05, p_HEIDI > 0.01). The complete SMR results are summarized in Supplementary Table S10. Eight genes (COMT, SQSTM1, COX6B1, NARS2, EP300, ATG7, ITGB3, and KARS1) overlapped with blood-based SMR hits, with COMT and ATG7 having colocalization support in blood QTL analysis. Of these, all except ATG7 showed negative associations with stroke risk. Moreover, all except EP300 displayed consistent effect directions between blood and brain analyses (Table 2). These findings support the reliability of SMR-prioritized NTF targets in brain tissue.
Table 2. Significant Brain Tissue-Based SMR Associations Between Genetically Predicted Gene Expression and Stroke Risk (C_STROKE Cohort, Brain-eMeta eQTLs)
Associations with p_SMR < 0.05 and p_HEIDI > 0.01 were considered nominally significant. Odds ratios (OR) and 95% confidence intervals (CI) reflect the effect of increased gene expression on stroke risk. Gene symbols in bold indicate overlap with blood-based SMR findings.
DISCUSSION
NTF, a composite herbal remedy, is clinically applied in stroke management, yet a comprehensive understanding of its vascular and metabolic mechanisms has remained elusive.7,11 Our study addresses this gap by integrating NP with MR, a strategy designed to advance beyond correlational data13–15,17 toward genetically validated, causal insights into drug action.18,42–44
Our MR analysis identified AGT, LPA, and VEGFA as high-confidence causal targets for stroke. This genetic prioritization is a critical advance over traditional NP, which often yields extensive but unvalidated target lists. The identification of AGT is particularly compelling, as it directly implicates the RAS, 41 a central pathway in cardiovascular disease. This finding suggests that NTF may exert its therapeutic effects through mechanisms shared by established RAS inhibitors, providing a hypothesis that can be explored in future experimental studies. Importantly, several SMR-prioritized targets were also validated in brain tissue eQTL data, supporting their tissue relevance.
Pathway analysis of the high-confidence targets revealed that NTF likely modulates a network of interconnected biological processes. We observed significant enrichment in cholesterol metabolism, PPAR signaling, and vascular function, alongside cellular stress responses such as HIF-1 and AMPK signaling.45–49 These pathways share overlapping targets, including AGT, VEGFA, and FN1, suggesting that the genetically supported signals converge on a coordinated vascular–metabolic module. The involvement of targets such as GPX7 further points to a role in mitigating oxidative stress.50–52
To explore a potential direct mechanism, we modeled the interaction between NTF’s bioactive compounds and the genetically validated target AGT. Our docking simulations predicted a strong binding affinity between Calycosin, a key constituent of Radix Astragali, 53 and AGT. The model shows Calycosin interacting with residues near the renin cleavage site,54,55 suggesting it may function as a direct inhibitor of angiotensin production. This computational result should be regarded as hypothesis-generating rather than confirmatory and requires experimental validation.
LIMITATIONS
Our study has several limitations. The MR analysis relied on QTL data from European-ancestry populations,32,44,56–63 and the findings may not be fully generalizable to other populations. Furthermore, our results rest on computational and genetic inference. Although we incorporated brain tissue-based SMR validation to improve tissue relevance, the predicted effects of NTF on target pathways and the direct binding of Calycosin to AGT must be confirmed in cellular assays, protein–ligand binding experiments, and animal models of stroke. Importantly, while our findings identified genetically supported targets, they did not confirm functional regulation by NTF. Experimental work, including transcriptomic or epigenetic assays, is required to determine whether NTF compounds modulate targets such as AGT or TLR4.
CONCLUSIONS
In conclusion, our study provides robust genetic evidence linking NTF to key causal pathways in stroke, most notably the RAS and lipid metabolism. By identifying high-confidence targets such as AGT and proposing a specific molecular interaction with Calycosin, we offer a clear path forward for experimental investigation. Future work should focus on validating this interaction and characterizing the functional consequences of NTF’s multitarget activity in preclinical models. This work demonstrates how modern genetic tools can deconstruct the complexity of traditional medicines, paving the way for more targeted therapeutic strategies.
AUTHORS’ CONTRIBUTIONS
Conception and design: H.J.; administrative support: B.Z.; collection and assembly of data: C.G.; data analysis and interpretation: D.W.; article writing: all authors; final approval of article: all authors.
Footnotes
FUNDING INFORMATION
This research was supported by the Traditional Chinese Medicine Research Program of Hubei Provincial Health Commission (No. ZY2011M071) and The Shizhen Talent Project for Traditional Chinese Medicine in Hubei Province.
AVAILABILITY OF DATA AND MATERIALS
All data generated or analyzed during this study are included in this published article.
DISCLOSURE STATEMENT
The authors declare no conflict of interest.
Supplementary Material
