Abstract
Dengue virus (DENV) is a deadly arbovirus that is transmitted primarily by Aedes aegypti and causes dengue infection in humans. According to WHO, every year around 390 million humans are affected by DENV, of which around 50 million deaths are reported. Knowledge of the various diseases caused by DENV would help to clarify the infection mechanism and to design new antiviral drugs. We propose a quasi-clique and a quasi-biclique algorithm to identify the infection gateway proteins of the human body and the possible pathways by which DENV leads to various diseases. For this, we have examined three networks: the dengue–human protein–protein interaction network, the human protein interaction network, and the human protein-disease association network. The predictions suggest that DENV may lead to various diseases in the human body, including cancer, asthma, ulcerative colitis, multiple sclerosis, and premature birth. Some of these results have recently been validated experimentally. This study may provide potential targets for more effective anti-dengue therapies.
1. Introduction
Dengue virus (DENV) is a growing threat to human health worldwide. It consists of four serotypes: DENV-1, DENV-2, DENV-3, and DENV-4. All four serotypes cause the same disease, but some, for example DENV-2, are linked with more severe dengue disease. The serotypes differ in how they interact with the antibodies in human blood. The viral genome codes for 10 proteins. Three are structural proteins: capsid (C), the precursor of membrane protein (PrM/M), and envelope protein (E). The remaining seven are nonstructural (NS) proteins: NS1, NS2A, NS2B, NS3, NS4A, NS4B, and NS5.
DENV can survive and replicate in both human and mosquito hosts. Because the dengue genome encodes only 10 viral proteins, the virus needs to manipulate its hosts at a molecular level (Doolittle and Gomez, 2011). Virus-host protein–protein interactions (PPIs) therefore shed light on the function of viral proteins and on how the virus reproduces and causes disease. Several studies have reported PPIs between dengue proteins and human proteins. Moreover, DENV is capable of causing several other diseases in the human body. Some medical reports state that patients infected with DENV develop other complications such as organ damage, severe bleeding, dehydration, and even death. Complete pathway analysis of DENV is therefore important for understanding how it functions in the human body.
Considerable research is under way on pathway analysis for different diseases. In Yamanishi et al. (2004), a variant of the kernel-based method is used for pathway protein interaction. Some studies have used Bayesian classification for PPI prediction in yeast (Chowdhary et al., 2009). However, most of these techniques are applied within a single organism, such as yeast or human. Identifying PPIs between viral and human proteins, together with pathway analysis of the interacting human proteins, will help to predict the diseases that may occur due to infection by a particular viral protein.
In Khadka et al. (2011), a yeast two-hybrid test is used to screen for new interactions between DENV and human proteins. The article also demonstrates that DENV, like other pathogens, preferentially interacts with proteins that are centrally located in the human protein interaction network. It is also reported that the human proteins that interact with dengue proteins also interact with hepatitis C virus (HCV) and HIV proteins. However, experimental methods are time consuming and costly because they require expensive laboratory equipment, so they reveal only a small part of all the possible diseases that occur due to DENV infection. This motivates the application of computational methods to investigate the pathways of dengue disease.
The main objective of this article is to predict the diseases that may occur due to DENV infection in the human body. We use a quasi-clique and a quasi-biclique algorithm to find the gateway proteins of infection in the human interactome. Analysis of gateway proteins is important because viral proteins target them to enter and affect a particular cell. Three networks are considered for generating the gateway proteins. The first is the dengue–human PPI network, which stores interactions between dengue proteins and human proteins. The second is the human PPI network, which depicts the interactions between the human proteins present in the first network and the rest of the human proteins. The third is the human-disease association network, which records the associations between human proteins and different diseases. The quasi-biclique algorithm is applied to the dengue–human PPI network to extract its strongly interconnected subnetworks. The quasi-clique algorithm is then applied to the human PPI network to identify quasi-cliques whose human proteins overlap with the human proteins of the quasi-bicliques. These overlapping proteins are called gateway proteins. The complete pathways of these gateway proteins are analyzed using Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway and gene ontology (GO) analysis.
Furthermore, we applied the quasi-biclique algorithm to the human-disease association network corresponding to each quasi-clique. The overlapping proteins are also treated as gateway proteins. Thus, two sets of gateway proteins are generated in this study. Some diseases are directly associated with human proteins that interact with different dengue proteins. These direct interactions are also reported in this article.
2. Materials and Methods
2.1. Databases and preprocessing
This section describes the collection and preprocessing of the datasets.
2.1.1. Dengue–human protein interaction database: DenvInt
The dengue–human PPI database DenvInt (https://denvint.000webhostapp.com) comprises 545 nonredundant interactions between 341 different human proteins and 10 dengue proteins (Dey and Mukhopadhyay, 2017). The database was prepared from dengue–human PPIs published in peer-reviewed journals and from virus databases such as VirHostNet and VirusMentha. To build a high-quality PPI repository, only experimentally validated interactions are included: PPIs based on the yeast two-hybrid test (Folly et al., 2011; Khadka et al., 2011; Le Breton et al., 2011; Dumrong Mairiang et al., 2012), bacterial two-hybrid test (Folly et al., 2011), RNA affinity chromatography screen (Bidet et al., 2014), complex pull-down assay and co-ap assay (Dumrong Mairiang et al., 2013), and colocalization (Balinsky et al., 2013). Every interaction in the database is annotated with dengue protein, human protein, interaction type, dengue serotype, experimental procedure, article name, author name, and PubMed publication id (PMID) (Supplementary Material S1).
2.1.2. Human protein interaction database
A human protein interaction database is prepared to capture the relationships between the human proteins present in the DenvInt database and the rest of the human proteins in the human PPI network. Many publicly available databases collect and store human PPI data, such as BioGrid, MINT, HPRD, and STRING. The DenvInt database contains 341 distinct human proteins, and not all of them are equally relevant for pathway analysis. To reduce the computational complexity, we considered only those human proteins that are identified by the γ-quasi-biclique algorithm on the DenvInt database. These human proteins are then submitted to the STRING database (http://string-db.org) to prepare a human PPI database. To collect the PPI information, we used four active prediction parameters (Text-mining, Co-Expression, Experiments, and Databases), a high confidence value of 0.700, and a minimum evidence-score threshold of 0.621.
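The confidence cutoff described above can be illustrated with a short sketch. This is not the authors' pipeline: the function name `filter_high_confidence`, the edge tuples, and the scores are hypothetical placeholders, and scores are assumed to lie on a 0 to 1 scale (STRING's downloadable files store the combined score multiplied by 1000, so such values would need rescaling first).

```python
# Hypothetical toy edges: (protein_a, protein_b, combined_score in [0, 1]).
# Names and scores are placeholders, not values taken from STRING.
TOY_EDGES = [
    ("H1", "H2", 0.99),
    ("H1", "H3", 0.95),
    ("H4", "H5", 0.72),
    ("H2", "H5", 0.41),
]

def filter_high_confidence(edges, threshold=0.700):
    """Keep undirected edges whose combined score meets the threshold."""
    kept = set()
    for a, b, score in edges:
        # Skip self-loops; store each edge once regardless of direction.
        if a != b and score >= threshold:
            kept.add(frozenset((a, b)))
    return kept

human_ppi = filter_high_confidence(TOY_EDGES)
```

Storing each edge as a `frozenset` makes the network undirected, so an interaction reported as (H2, H1) is not counted separately from (H1, H2).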
2.1.3. Human protein-disease association database
The human protein-disease association database is downloaded from http://geneticassociationdb.nih.gov. The database was prepared from published articles on complex diseases and disorders. It contains ∼12,400 associations, covering 4260 unique diseases and 3576 unique human proteins/genes. The human-disease association database is given in Supplementary Material S2.
2.2. Definitions
Before discussing the proposed methods and algorithms, some terms need to be defined.
3. Algorithms
3.1. Quasi-biclique algorithm
A biclique is a complete bipartite subgraph. It consists of two disjoint vertex subsets where every vertex is adjacent to all vertices in the other subset (Li et al., 2008). In many studies, the quasi-biclique algorithm has been used to predict PPIs, interaction sites, and motifs (Liu et al., 2008, 2010). PPI data are usually not fully available, so the subgraphs formed by interacting protein group pairs may not always be perfect bicliques (Liu et al., 2008). They are more often near-complete bipartite subgraphs. Therefore, if we search only for bicliques, some important protein interactions may be missed. For this reason, we search for quasi-bicliques rather than bicliques, so that interactions that are absent from a protein interaction subnetwork can still be discovered.
In this study, we propose a new technique to generate quasi-bicliques from the PPI network (Algorithm 1).
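Algorithm 1 itself is not reproduced in this text, so the following is only an illustrative sketch of one simple way to obtain dense bipartite subgraphs, not the authors' method. It assumes a density-based definition (a vertex pair set is a γ-quasi-biclique when the fraction of possible edges present is at least γ), enumerates every subset of the left side (feasible here because there are only 10 dengue proteins), and keeps the right-side vertices adjacent to at least γ·|left| of the chosen left vertices. That selection guarantees density ≥ γ. The names `density`, `quasi_bicliques`, `min_left`, `min_right`, and the toy graph are hypothetical, and the sketch does not filter for maximality.

```python
from itertools import combinations

def density(adj, left, right):
    """Fraction of possible left-right pairs that are actual edges."""
    if not left or not right:
        return 0.0
    edges = sum(1 for u in left for v in right if v in adj.get(u, set()))
    return edges / (len(left) * len(right))

def quasi_bicliques(adj, gamma=0.5, min_left=2, min_right=1):
    """Return (left, right, density) triples with density >= gamma.

    adj maps each left vertex (e.g. a viral protein) to its set of
    right-side neighbours (e.g. human proteins).
    """
    left_all = sorted(adj)
    right_all = sorted(set().union(*adj.values())) if adj else []
    found = []
    for k in range(min_left, len(left_all) + 1):
        for left in combinations(left_all, k):
            # Keep right vertices linked to at least gamma * k left vertices.
            right = [v for v in right_all
                     if sum(v in adj[u] for u in left) >= gamma * k]
            if len(right) >= min_right:
                found.append((set(left), set(right), density(adj, left, right)))
    return found

# Hypothetical toy network with placeholder names (D = viral, H = human).
ADJ = {"D1": {"H1", "H2", "H3"}, "D2": {"H1", "H2"}, "D3": {"H3", "H4"}}
strict = quasi_bicliques(ADJ, gamma=0.9)
loose = quasi_bicliques(ADJ, gamma=0.5)
```

Lowering γ from 0.9 to 0.5 on the toy graph admits larger but sparser subgraphs, which is the trade-off explored when γ is varied over a range.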
3.2. Quasi-clique algorithm
A clique is a complete subgraph of a given graph; the cliques of interest are the largest ones (Brunato et al., 2007). Maximum-size subgraphs that are almost fully connected are called quasi-cliques. We have developed the quasi-clique algorithm based on the degree of the proteins in the human PPI network, and we apply a heuristic approach to find all possible quasi-cliques of the PPI graph.

Let G = (V, E) be a graph, where V and E are the vertex set and edge set of the graph, respectively. Let SU be a subgraph of G. According to the clique relaxation model, SU is called a quasi-clique if the degree of each of its vertices within SU is at least γ(|SU| − 1), where γ signifies the density of the network (Pastukhov et al., 2018). The algorithm can be described as follows (Algorithm 2):
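Algorithm 2 is not reproduced in this text. As a hedged illustration of the degree condition stated above, the sketch below checks whether a vertex set satisfies the γ(|SU| − 1) bound and grows a quasi-clique greedily from a seed, trying higher-degree vertices first. The greedy strategy, the function names, and the toy graph are assumptions made for illustration and should not be read as the authors' heuristic.

```python
def build_adjacency(edges):
    """Build a symmetric adjacency map from undirected edge pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def is_quasi_clique(adj, members, gamma):
    """True if every member has at least gamma * (|S| - 1) neighbours in S."""
    members = set(members)
    need = gamma * (len(members) - 1)
    return all(len(adj.get(v, set()) & members) >= need for v in members)

def greedy_quasi_clique(adj, seed, gamma):
    """Grow a quasi-clique from seed, trying high-degree vertices first."""
    current = {seed}
    candidates = sorted((v for v in adj if v != seed),
                        key=lambda v: (-len(adj[v]), v))
    grew = True
    while grew:  # repeat passes until no vertex can be added
        grew = False
        for v in candidates:
            if v not in current and is_quasi_clique(adj, current | {v}, gamma):
                current.add(v)
                grew = True
    return current

# Hypothetical toy graph with placeholder vertex names.
ADJ = build_adjacency([("A", "B"), ("A", "C"), ("B", "C"),
                       ("C", "D"), ("B", "D"), ("D", "E")])
exact = greedy_quasi_clique(ADJ, "A", gamma=1.0)
relaxed = greedy_quasi_clique(ADJ, "A", gamma=0.6)
```

With γ = 1.0 the condition reduces to an ordinary clique; relaxing γ to 0.6 lets vertex D join even though it is not adjacent to A.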
4. Proposed Methodology
This section describes the proposed methodology for finding the gateway proteins of DENV infection. Most biological networks are incomplete due to the lack of experimental data. The incompleteness of these networks is modeled as a quasi-biclique and quasi-clique problem. Two sets of gateway proteins are identified in this study. The steps for generating the gateway proteins are described below.
We have prepared the dengue–human PPI database from the literature and some existing databases. These interactions are modeled as a bipartite network in which dengue proteins and human proteins form two separate vertex sets.
Then the γ-quasi-biclique algorithm proposed above is applied to this bipartite graph to gather the strong interactions between dengue proteins and human proteins. We varied the network density γ from 0.5 to 0.9, and the minimum number of dengue proteins in a quasi-biclique is 2.
Using this algorithm we obtained four quasi-bicliques: QB1, QB2, QB3, and QB4. The densities and the numbers of dengue and human proteins of these quasi-bicliques are shown in Table 1. The second quasi-biclique, QB2, consists of 4 dengue proteins (C, E, NS3, and NS5) and 54 human proteins. These four dengue proteins have the four highest degrees among the dengue proteins in the dengue–human PPI network.
The unique human proteins of these quasi-bicliques are then submitted to the STRING database to prepare a human PPI network. It depicts the interactions between the dengue-targeted human proteins and the rest of the human proteins.
Then the γ-quasi-clique algorithm is applied to this human PPI network to identify the human proteins that overlap with the human proteins generated by the γ-quasi-biclique algorithm in step 2. We considered γ values from 0.5 to 0.9. By this procedure we obtained eight quasi-cliques, as shown in Table 2. The corresponding quasi-bicliques and densities are also given in Table 2. From the human proteins present in QB1 we obtained QC1, from QB2 we obtained QC2, from QB3 we obtained three quasi-cliques (QC3, QC4, and QC5), and from QB4 we obtained three quasi-cliques (QC6, QC7, and QC8).
Some of the human proteins of each of the eight quasi-cliques overlap with the four quasi-bicliques. These overlapping human proteins are considered gateways of infection. The list of overlapping proteins and the corresponding quasi-cliques and quasi-bicliques is given in Table 3. These overlapping human proteins are further analyzed to predict their functionality.
The human gene-disease association database is downloaded from http://geneticassociationdb.nih.gov. A bipartite graph is prepared in which human proteins are on one side and diseases are on the other. The edges of the graph signify the association of human proteins with diseases. Then, the γ-quasi-biclique algorithm described above is applied to this network. Instead of dividing the bipartite graph into a dengue protein set and a human protein set, here it is divided into human proteins in one set and all the diseases in the other set. Three quasi-bicliques are generated, QBD1, QBD2, and QBD3, with a density threshold γ of 0.5 (Table 4). The overlapping human proteins between these quasi-bicliques and the quasi-cliques are reported in Table 4. These proteins are very important for the spread of any type of disease, as they are gateway proteins.
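The overlap step described above amounts to a set intersection between the human side of each quasi-biclique and the members of each quasi-clique. The sketch below illustrates this with placeholder protein names; the function name `gateway_proteins` and the toy memberships are hypothetical and do not reproduce Table 3.

```python
def gateway_proteins(quasi_bicliques, quasi_cliques):
    """Map (quasi-biclique, quasi-clique) name pairs to their shared proteins."""
    overlaps = {}
    for qb_name, qb_humans in quasi_bicliques.items():
        for qc_name, qc_members in quasi_cliques.items():
            shared = set(qb_humans) & set(qc_members)
            if shared:  # record only non-empty overlaps
                overlaps[(qb_name, qc_name)] = shared
    return overlaps

# Hypothetical toy memberships (placeholder names, not the paper's data).
QB = {"QB1": {"H1", "H2", "H3"}, "QB2": {"H2", "H4"}}
QC = {"QC1": {"H2", "H5", "H6"}, "QC2": {"H7", "H8"}}

overlaps = gateway_proteins(QB, QC)
# The gateway set is the union of all pairwise overlaps.
gateways = set().union(*overlaps.values())
```

Keeping the pairwise map, rather than only the final union, preserves which quasi-biclique and quasi-clique each gateway protein came from.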
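Before a quasi-biclique search can run on the disease network, the association list has to be turned into a protein-to-disease bipartite graph restricted to the proteins of one quasi-clique. The following sketch shows that restriction and a density check against the γ = 0.5 threshold. All names, the toy associations, and the helper functions are hypothetical and are not drawn from the real association database.

```python
from collections import defaultdict

def restrict_associations(pairs, proteins):
    """Build protein -> set of diseases, keeping only the given proteins."""
    wanted = set(proteins)
    adj = defaultdict(set)
    for protein, disease in pairs:
        if protein in wanted:
            adj[protein].add(disease)
    return dict(adj)

def bipartite_density(adj, proteins, diseases):
    """Fraction of protein-disease pairs that are recorded associations."""
    proteins, diseases = list(proteins), list(diseases)
    if not proteins or not diseases:
        return 0.0
    hits = sum(1 for p in proteins for d in diseases if d in adj.get(p, set()))
    return hits / (len(proteins) * len(diseases))

# Hypothetical toy associations (placeholder names only).
PAIRS = [("H1", "disease_a"), ("H1", "disease_b"),
         ("H2", "disease_a"), ("H3", "disease_c"), ("H9", "disease_a")]
QUASI_CLIQUE = {"H1", "H2", "H3"}  # hypothetical quasi-clique members

sub = restrict_associations(PAIRS, QUASI_CLIQUE)
d = bipartite_density(sub, ["H1", "H2"], ["disease_a", "disease_b"])
```

In the toy data, proteins H1 and H2 with the two diseases cover three of four possible pairs, so the candidate passes a 0.5 density threshold.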
Table 2. Quasi-Cliques Found from Human Protein Interaction Database That Overlap with the Human Proteins Present in the Quasi-Bicliques
Table 3. Overlapping Human Proteins of Quasi-Cliques and Quasi-Bicliques
Table 4. Quasi-Bicliques Generated from Human Protein-Disease Association Network Corresponding to Quasi-Cliques
In summary, we have explored three networks in this article: the dengue–human interaction network, the human protein interaction network, and the human protein-disease association network. Together they are used to find the potential pathways of DENV infection that lead to various diseases in the human body. We generated four quasi-bicliques from the dengue–human interaction network, eight quasi-cliques from the human protein interaction network, and three quasi-bicliques from the human protein-disease association network. The overlapping relations among them are shown in Figure 2.

Figure 2. The diagrammatic representation of the overlapping relation between quasi-bicliques and quasi-cliques. The overlapping proteins are treated as gateway proteins.
5. Results and Discussion
5.1. KEGG pathway analysis of quasi-cliques
We analyzed the KEGG pathways and GO terms of the quasi-cliques using the DAVID tool. A few significant GO terms, along with the KEGG pathways of the four quasi-cliques, are given in Table 5. QC1 mainly consists of proteins that participate in the translational elongation biological process. QC1 overlaps with QB1, QB2, and QB3. The list is given in Table 5. The human proteins PSMC1 and RPL12 of QC1 and QC2 overlap with QB1 and QB2, and they also interact with dengue proteins NS3, NS5, C, and E. In the same manner, RPL24 of QC1 overlaps with QB2 and interacts with dengue proteins NS3 and E. We observed that the three quasi-cliques QC1, QC2, and QC3 share the same KEGG pathway, ribosome. The ribosome is one of the important cellular units responsible for making proteins. The same ribosomal proteins (RPS7, RPS27, and RPL12) are present in all three quasi-cliques. In addition, QC1 and QC2 contain three other ribosomal proteins: RPL5, RPL6, and RPL24.
Table 5. The Significant Gene Ontology Terms and KEGG Pathways Found in the Quasi-Cliques
GO, gene ontology.
Several studies have shown that ribosomal biogenesis is related to cancer, especially colorectal cancer (Lempiäinen and Shore, 2009; Lascorz et al., 2011). It also has a direct influence on tumorigenesis through some ribosomal functions (Mao-De and Jing, 2007). Thus, DENV can lead to or cause several tumors and cancers through the ribosomal gateway proteins RPS7, RPS27, RPL12, RPL5, RPL6, and RPL24. The 10 proteins of quasi-clique QC4 regulate apoptosis, and their KEGG pathways are the NOD-like receptor signaling pathway and small cell lung cancer. Their GO molecular function is zinc ion binding. Only one human protein of QC4, BIRC2, overlaps with all three quasi-bicliques QB1, QB2, and QB3. BIRC2 interacts with dengue proteins NS5, NS1, and C. So, these dengue proteins can damage the NOD-like receptor signaling pathway and cause lung cancer through the gateway protein BIRC2.
QC5 mainly consists of proteins that participate in the positive regulation of the nitrogen compound metabolic process and in transcription regulator activity. Malfunctioning of these proteins may therefore hamper the normal regulatory roles of these transcription factors. These proteins are also involved in the KEGG pathways in cancer. Human protein STAT3 of QC5 overlaps with quasi-biclique QB3 and also interacts with dengue proteins E and NS1. This study therefore suggests that dengue proteins E and NS1 may lead to cancer in the human body, and that the main gateway protein responsible is STAT3. Some interesting evidence for a relationship between mosquito-borne diseases and cancer is presented in Benelli et al. (2016). The cancer pathways are activated by mosquito biting. For example, Aedes aegypti, which carries DENV, transfers reticulum sarcoma with the help of tumor cells.
The host proteins of QC6 and QC8 are involved in the JAK-STAT signaling pathway, which signifies that the dengue proteins in QB4 interact with QC6 and QC8 through the common human protein STAT3. The JAK-STAT signaling pathway transmits cellular signals to the nucleus of the cell, which handles DNA transcription. Disruption of JAK-STAT functionality may therefore cause different immune deficiency syndromes and cancers (Ivashkiv and Hu, 2004). Different studies (Lin et al., 2004; Souza-Neto et al., 2009) have found that when dengue proteins interact with human proteins, they suppress the JAK-STAT signaling pathway. The quasi-clique QC7 consists of eight human proteins, which are involved in calcium ion binding and regulate apoptosis. The KEGG pathway associated with these proteins is antigen processing and presentation. Through this process, cells take up and process proteins from inside or outside the cell. The gateway proteins responsible for this activity are STAT3, HSPA5, and CALR. The association between dengue proteins and antigen processing is reported in Hershkovitz et al. (2008).
So, GO and KEGG pathway analysis of the eight quasi-cliques helps to identify the gateway proteins that are targeted by the dengue proteins as the disease spreads in the human body. Dengue proteins may damage the regulatory system and the immune system and may lead to different types of cancer, including lung cancer and colorectal cancer.
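The enrichment idea behind the GO and KEGG analysis in this section can be illustrated with a plain hypergeometric upper-tail probability: how likely it is to see at least a given number of term-annotated proteins in a quasi-clique by chance. This is a generic sketch, not a reimplementation of DAVID, which reports a more conservative modified Fisher exact statistic (the EASE score). The function name and the example numbers are hypothetical.

```python
from math import comb

def hypergeom_upper_tail(population, annotated, sample, hits):
    """P(X >= hits) when `sample` genes are drawn without replacement
    from `population` genes, `annotated` of which carry the term."""
    top = min(annotated, sample)
    total = comb(population, sample)
    # math.comb returns 0 when k > n, so impossible terms vanish.
    favourable = sum(comb(annotated, k) * comb(population - annotated, sample - k)
                     for k in range(hits, top + 1))
    return favourable / total

# Hypothetical numbers: 10 background genes, 3 annotated with a term,
# a 3-gene quasi-clique in which all 3 members carry the term.
p_value = hypergeom_upper_tail(population=10, annotated=3, sample=3, hits=3)
```

A small value indicates that the term is over-represented in the quasi-clique relative to the background; in practice such values are also corrected for multiple testing.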
5.2. Literature support of the diseases predicted by quasi-cliques and human-disease association network
To analyze the diseases associated with the human proteins in the quasi-cliques, and thus obtain the possible pathways of pathogenesis leading to various diseases, we applied the quasi-biclique algorithm to the human gene-disease association network corresponding to the eight quasi-cliques. We obtained three quasi-bicliques, QBD1, QBD2, and QBD3, as shown in Table 4. QBD1 overlaps with two quasi-cliques, QC3 and QC5, through two human proteins, namely STAT3 and EGFR. These two proteins are associated with four different diseases. QBD2 overlaps with QC4 through four proteins, which are connected with seven diseases. QBD3 overlaps with QC6 through three host proteins, which are associated with two diseases.
We searched the existing literature for support of the diseases reported in Table 4. Many of them are already established in recent articles. Asthma is a chronic long-term lung disease that narrows the airways of the lung. In Pang et al. (2017), the authors identified diabetes, cardiac disorders, and asthma as independent risk factors for severe organ involvement during dengue infection. DENV causes colitis and other liver and kidney diseases (Park et al., 2008; Lizarraga and Nayer, 2014; Samanta and Sharma, 2015). DENV infection during pregnancy causes multiple problems such as miscarriage, stillbirth, infertility, and premature birth (Ribeiro et al., 2017). It may also cause bleeding tendencies if the patient undergoes a cesarean section (Chitra and Panicker, 2011). Multiple sclerosis is a central nervous system disease that disrupts the flow of information within the brain. Recently published literature has shown that dengue fever causes multiple sclerosis in some cases, as blood cells are affected by dengue fever (Fragoso and Brooks, 2016).
Schizophrenia is a mental disorder defined by abnormal social behavior. Research shows that because DENV affects the neurons, it increases the risk of schizophrenia in patients. An 18-year-old male student from a rural family was diagnosed with a schizophrenia-like disorder due to dengue viral fever in Fragoso and Brooks (2016). Dengue infection increases the level of TNF-α, which may stimulate inflammation and endothelial dysfunction (Kurane, 2007; Cheng et al., 2015). Many studies have addressed dengue-HIV coinfection. Some research reports that dengue protein NS5 downregulates the HIV coreceptor protein CXCR4 (López-Lemus et al., 2012; Pang et al., 2015). Six clinical reports of dengue serotype-1 infection in HIV-infected patients are identified in Delgado-Enciso et al. (2017). Case studies showed that DENV-HIV coinfected patients have a higher eosinophil proportion and pulse rate but a lower serum hematocrit level (Pang et al., 2015). According to the WHO 2009 criteria, this coinfection causes a major problem due to anemia and severe plasma leakage. Such patients therefore need close monitoring from the first stage.
The above analysis indicates that most of the diseases found in Table 4 have evidence in published articles for their connection with DENV transmission. DENV affects the gateway proteins EGFR, STAT3, IL10, CALR, JAK2, and IL10RA to cause these diseases in the human body. The algorithms used in this study may therefore help to identify possible pathways of DENV infection leading to various diseases.
5.3. Gateway protein analysis
In this study, we have identified two sets of gateway proteins. Using KEGG pathway and GO analysis, we obtained 22 gateway proteins for dengue infection: PSMC1, RPS7, WWP1, RPS27, RPL12, RPL5, RPL6, RPL24, RRP12, CLU, HNRNPC, NAP1L1, PAIP1, PTBP1, STAT3, HSPA5, BIRC2, GTPBP4, HSP90AB1, NFKBIA, IL10, and CALR. The second set, obtained by applying the quasi-biclique algorithm to the human protein-disease association network, contains six gateway proteins: EGFR, STAT3, IL10, CALR, JAK2, and IL10RA. These proteins play a very significant role in dengue infection in the human body. The average degree of the gateway proteins (26.56) is considerably higher than that of the nongateway proteins (7.94). The degrees of these gateway proteins (calculated from HPRD database release 9) are given in Supplementary Material S3. This degree difference may reflect the fact that viral proteins tend to attack proteins with higher connectivity, so that the maximum number of neighboring proteins is affected, which in turn causes multiple other diseases in the human body. Most of the gateway proteins of the first set are involved in JAK-STAT signaling pathways and pathways in cancer. Gateway protein BIRC2 is directly related to esophageal cancer (Brown et al., 2011) and cervical cancer (Choschzick et al., 2012). We predict that a list of ribosomal proteins, namely RPS27, RPL12, RPL5, RPL6, RPL24, RRP12, and RPS7, are used as infection gateways by DENV.
In Wang et al. (2015), the authors established that ribosomal proteins are linked to the development of different hematological, metabolic, and cardiovascular diseases and cancer. Dengue infection can therefore lead to several ribosome-related diseases. HSPA5 is an important protein that regulates the ER stress response in the human cell. Viruses such as dengue, Ebola, and influenza synthesize large amounts of viral proteins, especially capsid proteins, and induce an ER stress response in the infected cells by stimulating HSPA5 expression (Booth et al., 2015).
The second set is concerned with diseases such as asthma, ulcerative colitis, multiple sclerosis, premature birth, kidney cancer, and colorectal cancer. The degree of the EGFR protein in the HPRD database (version 9) is 163, so a large number of proteins in the human body can be affected through the EGFR protein network. In Komposch and Sibilia (2016), the authors illustrated that a change in EGFR signaling can cause acute and chronic liver damage. In addition, deregulation of the EGFR signaling pathway causes chronic lung disease and also lung cancer (Vallath et al., 2014). The rest of the diseases associated with these gateway proteins are reported in Supplementary Material S4. According to our analysis, dengue proteins attack the gateway proteins to enter the human PPI network. The probability of the diseases associated with these gateway proteins therefore increases during or after dengue infection in the human body.
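The degree comparison between gateway and nongateway proteins reduces to averaging node degrees over two groups. The sketch below shows that computation on placeholder data; the function name, the degree values, and the group membership are hypothetical and are not the HPRD figures reported above.

```python
from statistics import mean

def mean_degree_split(degree, gateway):
    """Return (mean degree of gateway proteins, mean degree of the rest).

    Raises statistics.StatisticsError if either group is empty.
    """
    gateway = set(gateway)
    inside = [d for p, d in degree.items() if p in gateway]
    outside = [d for p, d in degree.items() if p not in gateway]
    return mean(inside), mean(outside)

# Hypothetical degrees for placeholder proteins.
DEGREE = {"H1": 30, "H2": 20, "H3": 5, "H4": 7}
gateway_mean, other_mean = mean_degree_split(DEGREE, {"H1", "H2"})
```

A difference in means alone does not establish statistical significance; a rank-based test on the two degree lists would be one way to check it.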
5.4. Direct PPI with diseases
The dengue–human PPI database contains 545 interactions between 10 dengue proteins and 341 unique human proteins. We prepared the database from published literature. The human protein-disease association database contains ∼12,400 associations, covering 4260 unique diseases and 3576 unique human proteins. Of the 341 unique human proteins in the dengue–human PPI database, 88 also appear in the human-disease database. Some of these proteins are associated with more than one disease. The detailed list is given in Supplementary Material S5. The degree of each human protein is depicted in Figure 3. Human protein APOE (apolipoprotein E) has the maximum degree, 88; that is, APOE is associated with 88 different types of diseases. Forty-five human proteins have degree 1, that is, each is associated with a single disease.
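The direct overlap and the disease degree described above can be computed by filtering the association list to dengue-interacting proteins and counting distinct diseases per protein. The sketch below uses placeholder names; `disease_degree` and the toy records are hypothetical and do not reproduce Supplementary Material S5.

```python
from collections import Counter, defaultdict

def disease_degree(associations, dengue_targets):
    """Count distinct diseases for each dengue-interacting protein."""
    diseases = defaultdict(set)
    for protein, disease in associations:
        if protein in dengue_targets:
            diseases[protein].add(disease)  # sets ignore duplicate records
    return {p: len(d) for p, d in diseases.items()}

# Hypothetical toy records (placeholder names only).
ASSOC = [("H1", "disease_a"), ("H1", "disease_b"), ("H1", "disease_a"),
         ("H2", "disease_c"), ("H7", "disease_a")]
TARGETS = {"H1", "H2", "H3"}  # hypothetical dengue-interacting proteins

degrees = disease_degree(ASSOC, TARGETS)
# Frequency of each degree value, i.e. the data behind a degree histogram.
histogram = Counter(degrees.values())
```

Proteins with no disease association (H3 in the toy data) simply do not appear in the result, which mirrors the fact that only a subset of the dengue-interacting proteins overlaps with the disease database.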

Figure 3. Frequency of dengue viral interacting human proteins with different human diseases.
6. Conclusion
In this study, we have identified the possible infection pathways of DENV in the human body. To this end, we first computed the quasi-bicliques of the dengue–human PPI network. The proteins of these quasi-bicliques were then mapped onto the quasi-cliques of the human PPI network. The quasi-cliques were also used to find the proteins that overlap with the human protein-disease association network. These two sets of overlapping proteins are described as gateway proteins for DENV infection. They are very important for the spread of disease in the human body, as dengue proteins attack these gateway proteins to enter the human PPI network. We analyzed the KEGG pathways and GO terms of these gateway proteins. This analysis reveals that these proteins are related to different diseases such as lung cancer, kidney cancer, asthma, ulcerative colitis, multiple sclerosis, premature birth, and infertility. Most of the predicted diseases have evidence in the published literature.
