Abstract
Background:
Human diseases are multi-factorial biological phenomena resulting from perturbations of numerous functional networks. The complex nature of human diseases explains frequently observed marginal or transitory efficacy of mono-therapeutic interventions. For this reason, combination therapy is being increasingly evaluated as a biologically plausible strategy for reversing disease state, fostering the development of dedicated methodological and experimental approaches. In parallel, genome-wide association studies (GWAS) provide a prominent opportunity for disclosing human-specific therapeutic targets and rational drug repurposing.
Objective:
In this context, our objective was to elaborate an integrated computational platform to accelerate discovery and experimental validation of synergistic combinations of repurposed drugs for treatment of common human diseases.
Methods:
The proposed approach combines adapted statistical analysis of GWAS data, pathway-based functional annotation of genetic findings using gene set enrichment technique, computational reconstruction of signaling networks enriched in disease-associated genes, selection of candidate repurposed drugs and proof-of-concept combinational experimental screening.
Results:
It enables robust identification of signaling pathways enriched in disease susceptibility loci. Therapeutic targeting of the disease-associated signaling networks provides a reliable way for rational drug repurposing and rapid development of synergistic drug combinations for common human diseases.
Conclusion:
Here we demonstrate the feasibility and efficacy of the proposed approach with an experiment application to Alzheimer’s disease.
INTRODUCTION
The genetic component of common human diseases results from the combination of disease-predisposing alleles in sets of genes (disease susceptibility loci), which cooperatively increase the risk of disease. Genome-wide association studies (GWAS) were designed for the identification of common variants (alleles) associated with specific phenotypic traits and are routinely used for discovering the polygenic component of complex human diseases. Methodologically, two criteria are commonly applied for validation of GWAS: significance threshold of genetic associations and replication across different human populations [1]. While high significance threshold (such as p < 10–8) is generally considered as disease relevant, low significant GWAS associations are often neglected on account of the high risk of false-positive findings, which are poorly replicated and reflect non-specific population to population diversity rather than disease state. At the same time, the aggregated effect of high significant genetic associations is often too modest to explain the observed disease risk, a challenge often referred to as “missing heritability”. In parallel, it has been suggested that single nucleotide polymorphisms (SNPs) with lower statistical significance could also contribute to the observed disease risk, provided that the corresponding genes cluster into relevant signaling networks [2, 3]. This idea stimulated the development of pathway-based approaches in order to enhance functional annotation and interpretation of GWAS (often referred to as GSEA for Gene Set Enrichment Analysis). But skepticism towards this strategy is reasonably justified by poor replication of low significant associations. Another methodological limitation is the correction applied for testing multiple genetic associations within genes of different sizes (e.g., Bonferroni correction applied by dividing the p-values by the number of SNPs) that disregards complex connections between gene length and function and could generate a non-representative list of disease-associated long-size genes, especially those implicated in neuronalsignaling.
Despite these methodological challenges, we remain convinced that GWAS can potentially rationalize and accelerate development of innovative combinational therapies for complex human diseases. For instance, it has been stated that selecting therapeutic targets supported genetically is likely to double the success rate of clinical programs [4, 5]. Moreover, by capitalizing on existing drugs, drug repurposing has the potential to accelerate the evaluation of GWAS findings into preclinical and clinical studies. As an illustration, Sanseau et al. proposed the use of replicated high confidence genetic associations (p < 10–7) for identifying drugs repurposing opportunities. With the available GWAS catalogue, the authors assigned 19 approved drugs as repurposing candidates for 16 disease traits [6]. Though promising, this strategy could be limited by too rigorous criteria for selecting genetically associated therapeutic hits and does not explore disease-relevant information that could be deduced from the analysis of low significant geneticassociations.
We present here an integrative systems biology approach designed for pathway-based functional annotation of GWAS data, including low significant genetic associations, and applied to the development of biologically plausible combinational therapies for common human diseases (Fig. 1A). Efficacy of this methodological concept was validated by an experimental proof-of-concept studies for Alzheimer’s disease (AD).

Integrated approach for combinational drug repurposing from genetic networks. A) General pipeline. Pathway-based functional annotation of GWAS findings is applied for formalized identification and reconstruction of biologically relevant signaling networks enriched in disease-associated genes. Inclusion in these functional cascades is defined as a criterion for prioritizing therapeutic hits in GWAS data. B) We hypothesize that different individual genes from the same disease-associated pathway could induce similar signaling defects though they are poorly replicated by independent GWAS due to genetic diversity between human populations. Meta-analysis of GWAS from different population for Alzheimer’s disease was performed using a fixed effect meta-analysis. It allows robust identification of signaling defects though each single gene is poorly replicated due to genetic diversity. As well, meta-analysis of GWAS datasets lead to effective and accumulative detection of genetic association for neuronal ionotropic glutamate receptors only when applied without gene size correction. Red cells: p < 0.05.
MATERIALS AND METHODS
GWAS analysis
We selected five GWAS corresponding to three indications (AD, type-I (T1D) and type-II diabetes (T2D)) performed on unrelated case-control samples from Caucasian populations and available at the time we initiated this project. We selected also one GWAS (ADHD) performed on parent/child trios. All GWAS are summarized in Table 1. The four datasets available at the raw genotype level (AD-TGEN, AD-ADNI, T1D-WTCCC, and T2D-WTCCC) were analyzed following the same process using Plink (https://pngu.mgh.harvard.edu/∼purcell/plink) [7] and a set of internal bash and R 4.1.2 scripts (https://cran.r-project.org). Quality control was based on minor allele count (<10), Hardy-Weinberg equilibrium (p < 10–8 in controls), missing data rate per individual (>5% except for AD-ADNI>10%) and per SNP (>5%), inbreeding coefficient (>0.2), relatedness (identity by descent >0.1875), the number of heterozygous SNP (<40) and gender mismatch following common principles [8]. Population stratification was assessed by Principal Component Analysis (PCA) with Eigensoft (https://www.hsph.harvard.edu/alkes-price/software) [9]. Association to the disease was assessed by Logistic Regression (LR), using an additive model and the 5 resulting axes of the PCA as covariables, found to be the most efficient approach to control for population stratification [10, 11].
Genome-wide association studies. AD, Alzheimer’s disease; ADHD, attention deficit/hyperactivity disorder; T1D and T2D, Type-I and II Diabetes
The datasets available at the summary level only were analyzed using an LR additive model (AD-GSK) or a TDT test (ADHD). In order to generate one association result per disease for each SNP, a fixed-effect meta-analysis model was applied to the SNPs shared by at least two GWAS within a given disease using the method implemented in the rma.uni() function of the metafor R package version 3.4-0 [12]. Genomic annotation of the results was done with dbSNP build 134 and the GRCh37.2 human genome build.
From SNPs to genes
First SNPs were assigned to genes when they fall within its first and last exons. SNPs outside these limits were mapped to the closest gene within a distance 20 kb from the first and last exons. The value of 20 kb was chosen based on the average length of haplotype blocks in the CEU population that ranges between 5.9 kb and 16.3 kb [18, 19]. To summarize the association between disease and all the SNPs assigned to a gene into a single statistic, we applied the ‘minimum p-value’ (min-p) approach that attributes to the gene the p-value of its most significant SNP [3, 20–24]. We recognize that applying min-p may not be the most optimal method to summarize association signals when multiple, distinct variants in the gene contribute to the overall association signal, min-p. However, and although there are numerous cases in which rare variants in different parts of the gene may lead to the same or a similar disease phenotype, generally only one or very few linkage-disequilibrium blocks in a gene will harbor common causal variants, supporting the use of min-p. In principle, for multiple-testing, it is recommended to adjust gene size and linkage disequilibrium structure by applying a classical phenotype-permutation procedure [25]. Here, the use of GWAS data available at the summary statistics level only prevents the application of such a procedure.
Gene set enrichment analysis (GSEA)
In order to identify enriched pathways, we used a common enrichment analysis that compares the proportion of associations exceeding a given significance threshold within a gene set, to the proportion outside, by using the Fisher’s exact test implemented in the fisher.test() R function [26].
Master gene sets
We elaborated a list of gene sets containing master genes for core signaling modules of canonical pathways (that we refer to as pathway-specific matrices) or some functional groups of genes. Most of them represent modified versions of canonical pathways found in Ingenuity (https://www.ingenuity.com) and KEGG (https://www.genome.jp/kegg) databases but are depleted of signaling proteins with redundant functions complicating pathway identification (general transcription factors, most protein kinases, phosphatases, Gα proteins, adenylate cyclases, etc.). For instance, gene sets for neurotransmitter-receptor mediated signaling pathways include neuronal receptors only. Some of the gene sets were generated after manual curation of comprehensive reviews and are therefore potentially biased (e.g., mitochondrial permeability transition pore complex). In total, 159 pathway-specific gene sets were used for characterization of GWAS and can be provided on request. These proprietary generated “pathway-specific matrices” are universal in that they can be used for enrichment of large-scale genetic data for any human disease, regardless its etiology. The lists of genes in “AβPP processing” set includes ADAM10, ADAM17, APH1A, APH1B, APOE, APP, BACE1, BACE2, MAPT, NCSTN, PSEN1, PSEN2, and PSENEN and genes set “AβPP processing and AD susceptibility loci”—ADAM10, ADAM17, APH1A, APH1B, BACE1, BACE2, MAPT, PSENEN, NCSTN, ABCA7, APP, PSEN1, PSEN2, APOE, CASS4, CELF1, FERMT2, HLA-DRB5, INPP5D, MEF2C, NME8, PTK2B, SORL1, ZCWPW1, SLC24A4, CLU, PICALM, CR1, BIN1, EPHA1, CD2AP and TREM2.
Disease-associated signaling networks reconstruction
Proteins encoded by master genes from genetically enriched signaling pathways/functional groups were used as seeds for constructing networks of protein-protein interactions (PPI). Human protein interaction data were extracted from HPRD (https://www.hprd.org; Release 9 for direct interactions), Ingenuity (IPA, QIAGEN Redwood City, https://www.qiagen.com/ingenuity), and Biogrid (https://thebiogrid.org) databases. Network construction was implemented with R 4.1.2 (https://cran.r-project.org) and visualized with Cytoscape 3.6.1 (https://www.cytoscape.org). The network construction was performed as follows:
Experimental findings from PPI databases were additionally inspected for distinguishing direct protein interactions from data obtained for protein complexes, which were manually curated and integrated in the PPI network according to the “spoke model” [27].
Selection of candidate drugs
The information of drug-target interaction was mainly obtained from Ingenuity (https://www.ingenuity.com/products/ipa) and by manual literature surveying (see Supplementary Table 1). Many approved drugs, even from the same therapeutic class, have unique pharmacodynamics profile, modify activity of several proteins, and could be functionally non-equivalent in experimental essays. This complexity sometimes obliged us to test a few drugs for the same therapeutic target; for instance, three generics mexiletine, moricizine and zonisamide were included as sodium channel blockers in the list of candidate drugs (see Supplementary Table 1). For evaluating candidate drugs toxicity, pharmacokinetic profile and IP status Medicines Complete (https://about.medicinescomplete.com), RxList (https://www.rxlist.com), FDA Orange Book (https://www.accessdata.fda.gov/scripts/cder/ob/index.cfm), EPO (https://www.epo.org), and USPTO (mycharhttps://ppubs.uspto.gov) databases were used.
Functional in vitro assays and analysis
Neuronal cells
Rat primary cortical neurons were obtained and maintained in culture as described previously [28]. Aβ1 - 42 batches contained low and high molecular weight of Aβ1 - 42 oligomers and were prepared and controlled as previously described in detail [29]. BDNF and β-estradiol were used to validate the Aβ1 - 42 toxicity model and as positive controls (not shown); memantine was considered a reference neuro-protective drug with clinically demonstrated beneficial effect in moderate to severe states of AD. All drugs were purchased from Sigma-Aldrich. Cells were pre-treated for 1 h with tested drugs or their combinations and then intoxicated with 10μM Aβ1 - 42 for 24 h. Drugs were maintained in cell culture until the end of Aβ intoxication. 24 h after Aβ intoxication, culture supernatants were analyzed with the Cytotoxicity Detection Kit following the manufacturer’s instructions (LDH, Roche Applied Science, Meylan, France). Three separate plates with 6 replicates each were performed for each drug in each experiment and analyzed.
Endothelial cells
Human brain microvascular endothelial cerebral cells (HBMEC, cells system) were rapidly thawed in a water bath at +37°C, and medium was immediately completed with Dulbecco’s modified Eagle’s medium (DMEM; Pan Biotech) containing 10% of fetal calf serum (FCS; GIBCO). Cell suspensions were centrifuged at 180× g for 10 min at +4°C and the pellets were suspended in CSC serum-free medium (CSC serum free, Cell System) with 1.6% of Serum free RocketFuel (Cell System), 2% of Penicillin 10.000 U/ml and Streptomycin 10 mg/ml (PS; Pan Biotech) and were seeded at the density of 20 000 cells per well in 96 well-plates (Matrigel layer biocoat angiogenesis system, BD). On Matrigel support, endothelial cerebral cells spontaneously started the process of capillary network morphogenesis [30]. Three separate cultures were performed per condition, 6 wells per condition.
One hour after HBMEC seeding on matrigel, drugs or VEGF-165 were solved in culture medium (+0.1% DMSO) and then pre-incubated with HBMEC for 1 h before the Aβ1 - 42 application. One hour after drugs or VEGF incubation (2 h after cell seeding on matrigel), Aβ1 - 42 peptide was added to a final concentration of 2.5μM diluted in control medium in presence of drugs or VEGF, for 18 h incubation.
For capillary network quantification, two pictures with 4x lens were taken using InCell AnalyzerTM 1000 (GE Healthcare) in light transmission per well. All images were taken in the same conditions. Analysis of the angiogenesis networks was done using Developer software (GE Healthcare). The total length of capillary network was assessed. Data were expressed in percentage of control conditions (no intoxication, no amyloid = 100 %) in order to express the amyloid injury. All values were expressed as mean±SEM of the 3 cultures (n = 6 wells per condition) with SEM standing for standard error of the mean.
For identification of synergistic drug combinations, every candidate drug was systematically tested in binary combinations with other drugs selected for these experiments.
Statistical analysis
Analyses were performed with R 4.1.2 (https://cran.r-project.org). Statistical tests were two-tailed and conducted at a 5% significance level. Data were normalized as percentage of effect between Aβ1 - 42 oligomer treated and untreated cultures. We applied a Dunnett’s test for comparison of more than one experimental group against the reference Aβ1 - 42 intoxicated conditions. Drug combination analysis was performed with calculation of Combination Indexes (CI) recognized as the standard measure of combination effect and indication of synergy (CI <1) or antagonism (CI >1) compared to the expected additive effect (CI = 1) [31]. CI were deduced from the two most popular approaches: 1) The Bliss Independence model is an effect-based approach following the principle that drug effects are outcomes of probabilistic processes and assumes that drugs act independently [32]. It allows comparing of the observed combination effect to the expected additive effect by the common formula for probabilistic independence. The model applies only to effects ranging between 0% and 100%. 2) The Loewe additivity model is a dose-effect-based approach following the principle of dose equivalence and relies on the individual dose-effect curves [33, 34]. When the effect of the combination was greater than the maximum effect of both single drugs, the standard Loewe model could not be applied. In this particular situation, if the combination is significantly greater (with a t-test), it represents evidence of synergy and the CI is set to 0 [35]. A combination with a CI <1 for at least one of the two approaches was interpreted assynergistic.
Miscellaneous
Genes abbreviations were standardized with GeneCard (https://www.genecards.org). Materials developed internally can be made available upon request through a data access agreement.
RESULTS
Adapted statistical analysis and functional annotation of GWAS data
Despite development of compensatory mechanisms, signaling pathways could be equally sensitive to dysfunction of any constituent protein integrated in the signaling chain. We hypothesized that different genes from the same disease-causative pathway might be associated with disease in independent GWAS due to heterogeneity across human populations, but functional effects they induce are consimilar in affected human groups (Fig. 1B). Hence, replication criteria at the level of individual genes cannot be formally applied to definitively reject marginally significant genetic data, and prioritization of genetic data provided by meta-analysis of independent GWAS data for a given indication can attenuate to some extent the negative impact of inter-population genetic heterogeneity on identification of genetically supported therapeutic targets.
Following these considerations, the first step of our approach consisted in collecting independent GWAS data obtained, when possible, from different human populations for the disease of interest, and from other diseases as control for biological relevance and specificity. GWAS analysis was performed using Plink and a set of internal bash and R scripts; results from three different populations for the same disease were summarized using a fixed effect meta-analysis and used for functional annotation of genetic findings for AD (see Materials and Methods).
Then, pathway-based functional annotation is reasonably considered as an efficacious and pertinent way for biological interpretation of large-scale genetic findings, but has serious methodological limitations given its high sensitivity to the protocol used for gene size correction and to the content of predefined sets of genes for enrichment analysis. To evaluate this impact, the pathway-based functional annotation was performed by applying a gene set enrichment analysis of genetic associations obtained for neurodegenerative AD, for a neuro-behavioral attention deficit/hyperactivity disorder (ADHD), and for T1D and T2D, two related metabolic disorders with different etiology. Firstly, gene size correction underrates signaling cascades integrating long-size genes and can lead to misinterpretation of the genetic data by enrichment analysis, especially for neuronal diseases (Figs. 1B, 2). As shown in Fig. 2A, gene sets corresponding to signaling for several main neuronal synapses are not genetically enriched in AD or ADHD after gene size correction. These findings look counter-intuitive, contrary to results obtained when gene size correction is not applied. The latter allows obtaining more biologically plausible and disease-specific results when statistical significance threshold for including GWAS for enrichment is decreased from p < 5×10–2 to 10–2, while 10–3 seems to be too restrictive for obtaining instructive findings (Fig. 2B). For instance, this modification of GSEA protocol not only confirms the neuronal nature of AD and ADHD diseases and distinguishes these two types of brain disorders from T1D, but it also supports accumulating functional findings that demonstrate prominent role of neuronal signaling in etiology of T2D [36, 37]. Interestingly, we found that genes set for insulin secretion pathway was statistically enriched in AD and T2D supporting current hypothesis that these two disorders share some risk factors, and that AD might be considered as “Type-III” diabetes [38, 39]. We suppose that more conservative threshold for GWAS significance attenuates the negative impact of an increased number of false-positive findings generated by statistical analysis without gene size correction and enables revealing biologically relevant information. To further evaluate the pertinence of the detected biological signals, we arbitrarily selected several directly interacting proteins with well-established involvement in amyloid processing and APP/APOE signaling in AD and verified impact of gene size correction and statistical threshold on the identification of this functional module. We concluded that applying p < 10–2 threshold without gene size correction reveals representative APP/APOE functional cascade in AD GWAS data and clearly discriminates AD from ADHD (Fig. 3).

Impact of gene size correction and significance threshold on enrichment analysis. Gene set enrichment of GWAS data was performed with (A) or without (B) gene size correction. Performing enrichment of GWAS with statistical significance threshold p < 10–2 without correction generates biologically plausible and disease-specific results allowing formulating medical hypotheses about disease mechanism and providing instructive guidelines for their functional validation. Gene sets were exported from KEGG.

Impact of gene size correction and significance threshold to the enrichment of the APP/APOE signaling complex. Applying a significance threshold of p < 10–2 without gene size correction permits to highlight the APP/APOE signaling complex for AD and discriminates well AD from ADHD, two neurological diseases with different etiology.
Refining functional annotation with master gene sets
Gene set enrichment analysis of GWAS data with predefined sets of functionally connected genes was developed for fast biological annotation of large-scale genetic data and a priori, decreases negative impact of false-positive findings on obtained results. Gene sets of canonical pathways are routinely used for this purpose. At the same time, these gene sets were mainly built by manual curation of functional findings, thereby is obviously biased, conditioned by prior knowledge of cellular processes and vary significantly between available online resources such as KEGG or Ingenuity databases. Moreover, the discriminatory power of pathway-based analysis is limited by the incertitude on which genes can be considered as proper constituents of a particular signaling pathway. Additionally, numerous proteins are often implicated in several different signaling cascades or represent house-keeping genes.
For these reasons, in addition to canonical pathways, we elaborated a proprietary in-house database of gene sets focused around ‘master genes’ specific for core signaling modules of canonical pathways (that we refer to as pathway-specific matrices) or representing distinct functional classes of signaling proteins potentially allowing more discriminative identification of enriched groups of proteins. The pathway-specific matrices were depleted of house-keeping genes and particular groups of signaling proteins, which could compromise detection and unambiguous identification of involved signaling cascade (Materials and Methods). For instance, in our application, the large AD pathway found in KEGG that contains groups of ubiquitous house-keeping genes is not enriched in genes associated to AD, while two master gene sets integrating proteins involved in AβPP processing and genes increasing risk of development AD (susceptibility loci) [40] are strongly and specifically over-represented (Fig. 4A). Without generalizing this example, we do not consider our in-house database of master gene sets as an alternative to canonical pathways, but as a complementary informative resource for gene set enrichment of large-scale genetic data.

Functional annotation of GWAS data. A) Gene set enrichment for AD with genes set corresponding to the KEGG Alzheimer’s disease pathway and disease-associated master gene sets for genes implicated in AβPP processing and genes associated with increased risk of development AD. Complementary analysis with both sources increases operational efficacy of the approach. Though the KEGG AD pathway integrates highly representative list of genes with well-documented role in AD, it also contains large groups of house-keeping genes like genes encoding proteins of electron transport chain that could strongly affect the results of gene set enrichment analysis or even generates misleading findings. B) Representative examples for discriminative and common signaling pathways for human neurological and metabolic disorders with different pathological mechanisms. In comparison with enrichment results obtained with KEGG gene sets covering entire neuronal synapses proteomes (insert, on the left), using master gene sets allows to identify specific functional groups of neuronal receptors and channels enriched in AD GWAS. VGCCs, voltage-gated calcium channels; VGSCs, voltage-gated sodium channels.
Robustness and specificity of enrichment based on master gene sets were evaluated by comparative functional annotation of GWAS data in four human diseases. It revealed characteristic, although partially overlapping, patterns of enriched master gene sets in various diseases (Fig. 4B). Some of the obtained findings are biologically meaningful and endorse the pertinence of the applied approach. For instance, a significant enrichment for genes implicated in immune response was detected as expected in T1D, while the genetic association of genes set for T-cell receptors confirms recent publications indicating an unsuspected role of immune signaling in ADHD behavioral abnormalities [39, 41]. Furthermore, enrichment analysis based on master gene sets allowed to precise the relative importance of different classes of neuronal receptors constituting neuronal synapses for development of AD. We found that genes encoding glutamate receptors were strongly and in relatively specific manner enriched in AD. Glutamate signaling is supposed to play an important role in the etiology of AD. Memantine (an antagonist of NMDA receptors) was approved as AD treatment in 2003 [42–45]. Surprisingly, despite supposed relative sparing of the GABAergic system in AD patients, GABA receptors are also enriched in AD [46]. These data suggest a pivotal role for the imbalance between excitatory and inhibitory glutamate/GABA neuronal signaling in onset and development of AD in comparison with neuronal activity mediated by other main brain neurotransmitters. Noteworthy, this observation could explain the well-documented relationship between susceptibility to epileptic seizure and development of cognitive impairments in AD [47]. Another interesting finding concerns enrichment of angiogenic proteins supporting a causative role for vascular abnormalities in the development of AD [48]. At the same time, we did not detect enrichment of acetylcholine receptors, suggesting that enrichment based on master gene sets should be rather used for fast prioritization, but not for definitive discarding of signaling cascades for further functionalvalidation.
We conclude that our overall approach enables robust functional annotation of GWAS and might be applied for introductory identification of therapeutic opportunities. At the same time, it is not pertinent for detailed specification of associated signal transduction cascades, since number of genes in master gene sets is limited. For this reason, we elaborated a formalized approach for reconstructing signaling networks extended from master genes in gene sets over-represented in disease GWAS datasets. As well, we expected that integration of disease-associated genes in such networks let additional negative selection of false-positivefindings.
Reconstructing disease-associated signaling PPI networks
As a next step, we elaborated a formalized approach for reconstructing signaling networks extended from enriched master gene sets for which we expected that integration of disease-associated genes in such networks allow additional rejection of false-positive findings (see Materials and Methods). These disease-associated signaling networks were built using known human protein-protein interactions (PPI) listed in databases such as Biogrid. Since we were guided by the conceptual assumption that the entire disease-relevant network is enriched in disease-associated genes, the primary networks were further refined by prioritizing interactions between proteins encoded by genes associated with AD (genetic filtering step, Fig. 5). The genetic filtering was applied for all functional classes of proteins in order to obtain core networks enriched in disease-associated genes and putative therapeutic targets. However, we supposed that some core networks might be insufficiently informative for selecting enough therapeutic targets for such an indication as AD. For this reason, the core networks were expanded by re-integrating known therapeutic targets interacting with master genes referred to as extension 1 (Fig. 5 and Supplementary Table 1). These re-integrated targets we further prioritized if they have at least one interacting protein partner encoded by a gene associated with AD. The same extension was also applied to few core network proteins other than master genes, selected arbitrarily, and referred to as extension 2 (Fig. 5 and Supplementary Table 1).

Reconstruction of disease-associated signaling pathways and selection of candidate drugs. Proteins encoded by master genes are used as seeds for primary network reconstruction. For disclosing signaling cascades enriched in disease susceptibility loci, PPI interactions including associated genes are prioritized for core network generation (genetic filtering step). Extensions 1 and 2 allow integration of additional therapeutic targets in core networks. Insert - total number of targets and total number of approved generic drugs modulating these targets that have been identified in PPI networks for glutamate receptors, GABA receptors, calcium, sodium, and potassium channels using different PPI network construction protocols. In brackets, number of drugs finally selected for functional validation studies. For example, 90 therapeutic targets modulated by 40 generic drugs with oral formulation were identified in five core PPI networks, and 14 of these medicaments were prioritized for experimental validation studies. *a smaller number of drugs compared to the number of targets is explained by multi-target nature of some drugs: for example, dalfampridine modulates activity of 15 different potassium channels.
This iterative network reconstruction strategy has the advantage of respecting the key role of master genes in identified disease-associated signaling cascades. Principally for this reason, the better part of elaborated networks is biologically interpretable and as a rule, allows generation of testable hypotheses for experimental validation studies. As an illustration, the glutamate signaling networks can be readily represented as schematic images of well-established functional cellular complexes targeted by approved drugs (Fig. 6). Importantly, we noticed that the corresponding network includes functional partners for glutamate receptors from pathways that were not enriched in GWAS data. For instance, adenosine and dopamine GPC receptors were automatically integrated as constituent proteins into glutamate cascades, increasing opportunities for candidate drug repurposing [49, 50].

Signaling network of neuronal glutamate receptors enriched in disease-associated genes. Only the subset of network corresponding to post-synaptic density signaling complexes for iono- and metabotropic glutamate receptors is shown. Adenosine and dopamine receptors were automatically integrated in the glutamate signaling cascades. A few proteins from other networks, ionotropic KCNA2 and SCN4A channels and GABAB receptors that are involved in functional regulation of glutamatergic synapses at both pre- and postsynaptic sites, were manually added to glutamate signaling complexes (black circle) on purpose to demonstrate their direct interaction with other types of neuronal receptors/channels. Both proteins, encoded by associated genes (red symbols), and their partners, encoded by genes non-associated with AD, in the PPI network were prioritized as therapeutic targets. p-value threshold for including associated genes was p < 0.01.
Selection of candidate repurposed drugs
Targeting known drug targets in the reconstructed disease-associated signaling networks allows the generation of drug repurposing hypotheses. We focused on candidate drugs modulating neuronal signaling networks for glutamate receptors, GABA receptors and calcium, sodium, and potassium channels, characterized by a large available pharmacopoeia. Selection of candidate drugs for identified target proteins was mainly carried out by using information available in Ingenuity and Medicines Complete databases supplemented by manual literature surveying for some targets (Supplementary Table 1). A first round of selection gave priority to approved generic drugs with available oral formulation. As a result, a total of 83 approved drugs were selected: 40 drugs directly from the five core networks and 43 additional drugs after applying network extension 1.
As a second round, candidate drugs were further evaluated for their safety including known side effects, warnings labels, carcinogenic, teratogenic and mutagenic potentials, pharmacokinetic properties like, for instance, crossing of blood-brain barrier, and their patentability as new therapeutics for AD, namely, their freedom to operate (FTO) status as treatments for AD. The relevant information was additionally gathered from RxList database and reference patent FDA Orange book, EPO and USPTO databases. Because of strong limitations imposed by these criteria, some drugs are obviously not optimal for modulating AD-associated networks. From this second round we selected 14 drugs from core networks and 5 drugs from extension 1 (see insert in Fig. 5 and Supplementary Table 1). Finally, 3 drugs were also added to the final selection after applying network extension 2 (Supplementary Table 1).
Experimental screening for synergistic drug combinations
Results from pathway-based functional annotation allow creating a preliminary biological signature of the disease that can be used to guide the design of pertinent functional assays for experimental validation studies. In virtue of the pleiotropic nature of signaling pathways, we applied multiple-assays screening approach and candidate drugs were evaluated in different disease-relevant cellular settings successively. As a result, in vitro assays with neuronal and endothelial cells were prioritized for AD as master gene sets associated with these cellular entities were genetically highlighted.
We decided to focus experimental validation studies on discovery of synergistic drugs combinations for AD. We have assumed that synergistic interaction between target proteins indicates their etiological involvement in onset and progression of AD, thereby minimizing probability of false-positive experimental findings. Drugs were first tested as single agents in vitro in neuronal Aβ toxicity model of AD, validated by the treatments with BDNF, β-estradiol, and the reference compound memantine [29]. In total, more than 70% of the candidate drugs preserved primary rat cortical neuronal cultures from toxic effects of soluble Aβ1 - 42 oligomers in a dose-dependent manner (Fig. 7A, Supplementary Figure 1A). Over the 14 drugs selected from the core networks, 11 (∼80%) were found to be neuroprotective, suggesting that they can be considered as a valid and reliable source for a first and fast identification of candidate drugs. Over the 5 drugs selected from network extension 1, 3 (∼60%) were found to be neuroprotective, providing additional candidates although it appears less productive. Finally, network extension 2 could also lead to relevant candidates around non-master genes in AD-associated networks that may represent a special interest (Supplementary Table 1). Importantly, drugs targeting proteins in AD-associated networks encoded by associated or non-associated genes demonstrated similar efficacy in neuro-protective testing.

Therapeutic targeting of disease-associated signaling networks allows efficacious discovery of synergistic drug combinations. A) Neuro-protective effect of single drugs (left) and binary drug combinations (right) against the toxicity of soluble Aβ1 - 42 oligomers. Normalized data for the difference between Aβ1 - 42 oligomer-treated and untreated or drug-treated cell cultures are presented. Memantine, approved for AD treatment, was used as a reference neuro-protective agent. Examples of single drugs with bell-shaped or sigmoid dose-response curves are shown. Sub-active concentrations of individual drugs were used for testing drugs interaction in binary combinations. All values are mean±SEM. **p < 0.01, ***p < 0.001 versus Aβ; ANOVA with Dunnett’s test. S, synergy. Data for niclosamide found to be toxic at minimal tested concentration in these experimental settings are not shown. B) Synergistic binary drug combinations in neuro-protection assays against cytotoxic soluble Aβ1 - 42 oligomers. Selected candidate drugs were systematically tested in binary drugs combinations. Drug combinations were annotated as synergistic (red cells) according to the Bliss and Loewe approaches as described in Materials and Methods. Grey cells, non-synergic combinations. C) Pattern of drug interactions in synergistic versus non-synergistic binary combinations. In case of synergistic combination, baclofen and acamprosate do not both interact synergistically with the same other drug (common synergistic interaction). On the contrary, in non-synergistic combination, both baclofen and cinnarizine cooperate synergistically with two other drugs - cinacalcet and trimetazidine. Overall, drugs from synergistic combinations have half as many common synergistic interactions than drugs constituting non-synergistic pairs (data not shown). Red lines, synergistic drug combinations.
Active drugs were then systematically studied for possible synergic interaction in binary combinations. Detailed dose-response curves established for candidate drugs were used as a pre-requisite for the assessment of synergy. Drug combination effects were assessed by calculation of Combination Indexes (CI) recognized as the standard measure of combination effect and indication of synergy (CI <1) or antagonism (CI >1) compared to the expected additive effect (CI = 1) (see Materials and Methods). We found that about 30% of the tested binary combinations protected, in a synergistic fashion, cortical neuronal cultures from toxic Aβ1 - 42 oligomers, at least at one drug concentration (Fig. 7A, B and Supplementary Table 2). Such a high success rate possibly indicates that the identified signaling networks are inter-connected and constitute a larger authentic functional network targeted by neurotoxic Aβ as the presumed causative agent of AD. Coherently, we observed a notable regularity in drug interactions in characterized synergistic combinations. As exemplified in Fig. 7C, two synergistically co-operating drugs, when tested separately in other combinations, interact synergistically with preferably different drugs. In practical terms, these preliminary findings could be interpreted as a sign that drugs with partially overlapping functional properties, i.e., targeting the same signaling network(s), might be less suitable for developing a synergistic combination.
Most of the neuro-protective drugs were tested in a second functional assay, where practically half of them preserved endothelial cells against deleterious effect of Aβ (Supplementary Figure 1B and Supplementary Table 2). Moreover, some combinations were found to be synergistic in both neuronal and endothelial in vitro models of AD (Supplementary Figure 2). These data demonstrated that perturbed signaling pathways, presumably recognized as neuronal, can be also responsible for pathological manifestations of AD in the cell types other than neurons.
DISCUSSION
Translation of large-scale genetic data, provided through GWAS, into a drug repurposing pipeline remains challenging. The number of therapeutic targets associated with disease traits at a genome-wide significance level (such as p < 10–8) is limited, while the number of nominally associated targets (p < 0.05) exceeds capacities of experimental validation studies. Obviously, GWAS cannot be mechanistically used for drug repurposing without a rigorous prioritization of biologically plausible targets. Here, we have presented an integrative approach for combinational repurposing from GWAS findings. This approach combines a pathway-based functional annotation using gene set enrichment technique, and a computational reconstruction of signaling networks enriched in disease-associated genes allowing rapid identification and experimental validation of candidate drugs.
The development of such a comprehensive and integrated approach inevitably raises a set of general methodological issues. The analysis of large-scale genetics data is a rapidly progressing field, which is only partially standardized [8, 51–53]. It is now widely recognized that an approach based on single highly significant SNPs may underestimate the complexity of cellular processes [3], and studying the cumulative effect of polymorphisms in multiple genes acting in functional pathways provides complementary information in understanding genetic determinants of common diseases [54]. However, assigning SNPs to genes, assessing the genetic association at the level of a whole pathway, and re-constructing disease-relevant signaling modules remain complex methodological issues for which no standard has yet emerged. For instance, we observed that gene set enrichment of GWAS that are not corrected by gene size provides much more biologically plausible results for different classes of neurodegenerative, behavioral, auto-immune, and metabolic human diseases. Finally, systems biology analysis is strongly determined by the current state of omics databases and personalized prioritization and/or interpretation of these datasets. Decision about which database(s) to use and which type of information is worthy of retention for analysis (among multiple types of physical, functional, and predicted protein-protein or drug-target interactions for instance) is far from straightforward and, ideally has to be confirmed each time by functional validation studies. Evidently, we are far from claiming to solve these methodological problems here. Limited by the use of GWAS available either at the raw genotype or at the summary statistics level (excluding de facto the use of phenotype-permutation procedures), and in the absence of clear standards or experimentally justified guidelines for functional annotation of GWAS, we applied at each step the strategies that appeared the most reasonable for our purposes.
In order to minimize the risk of false-positives through the whole process, we elaborated and validated several adaptations to the standard approaches that includes (Fig. 1A): optional prioritization of genetic data generated by meta-analysis of independent GWAS data, selection of reasonable conservative genetic significance threshold for GWAS data allowing i) validation of at least some known biological findings/drugs targets for disease under study, and ii) robust identification of specific disease-associated signaling cascades, using of master gene sets as a complementary source in addition to available canonical pathways databases for enrichment analysis, reconstruction of disease-associated signaling networks with master genes proteomes as seeds, with subsequent prioritization of protein interactions integrating genetically associated genes (genetic filtering), prioritization of networks for selecting candidate targets/drugs for functional validation studies, multiple functional assays defined by physiological role that can be attributed to genetically enriched signaling cascades in disease GWAS data, prioritization of pleotropic synergistic drugs combinations effective in more than one cellular setting.
The resulting high success rate of this approach in discovering effective mono- and combinatorial candidate treatments for AD demonstrates its feasibility and efficacy for generating innovative therapies for complex human diseases. The experimental findings indicate that both proteins encoded by associated genes and their non-associated functional partners in disease-relevant signaling networks, can be used as hits for successful drug repurposing. Hence, inclusion in disease-associated signaling network can be proposed as one of the effective norms for prioritizing therapeutic targets in GWAS data, even if this therapeutic target is not genetically associated with human disease.
The study described here was carried out as a part of a drug development program for AD launched in 2009. As an additional proof-of-concept, one of the synergistic combinations made of baclofen/acamprosate (PXT864) was subsequently successfully validated in animal AD models [28]. It was also tested in a first-in-patient phase IIa clinical trial involving 45 AD patients (registered as ClinicalTrial.gov: NCT02361424) and demonstrated preliminary encouraging safety and efficacy data presented at CTAD 2016 and AAIC 2018. Remarkably, this combination targets receptors implicated in neuronal excitatory/inhibitory balance, which, as suggested by functional annotation of GWAS data, could play a crucial role in the pathogenesis of AD. Though these signaling cascades are foremost considered as neuronal, excitatory/inhibitory imbalance might be responsible for pathogenic manifestations of AD in several other cellular settings. Indeed, our ensuing experiments demonstrated that combination of baclofen/acamprosate attenuates cytotoxic effect of Aβ in both neuronal and endothelial cells in vivo; it also decreased hallmarks of neuro-inflammation and alleviated cognitive deficits in a panel of functional tests in acute and transgenic models of AD [28]. Interestingly, functional glutamate and GABA receptors are expressed in endothelial cells, while accumulating evidence demonstrates a causative role for vascular abnormalities in the development of AD [55]. Considering that glutamate and GABA receptors are affected by Aβ and modulate themselves processing of AβPP, deteriorative excitatory/inhibitory imbalance might represent the self-promoting vicious cycle in development of AD and a promising target for development of innovative therapeutics or as enhancers for already approved AD treatments [56, 57].
Finally, the proposed integrated approach supports the emerging trend to capitalize on a comprehensive and computerized knowledge of a disease derived from large-scale molecular data, to accelerate the identification of relevant therapeutic targets and the selection of drug candidates for clinical evaluation [58]. Given the rapid emergence of approaches for systematic drug repurposing and combination, fast progress in this area will be assured by unbiased comparisons as well as integration of different methodological strategies [59–65].
