Abstract
BACKGROUND:
The COVID-19 pandemic, caused by the new virus of the coronavirus family, SARS-CoV-2, could lead to acute respiratory syndrome. The molecular mechanisms related to this disorder are still debatable.
METHODS:
In this study to understand the pathogenicity mechanism of SARS-CoV-2, using the bioinformatics approaches, we investigated the expression of involved genes, their regulatory, and main signaling pathways during the time on days 1, 2, 3, and 4 of SARS-CoV infected cells.
RESULTS:
Here, our investigation shows the complex changes in gene expression on days 2 and 3 post-infection. The functional analysis showed that especially related to immune response, response to other organisms, and defense response. IL6-AS1 is the predicted long non-coding RNA and is a key regulator during infection. In this study, for the first time has been reported the role of IL6-AS1. Also, the correlation of differential expression genes with the level of immune infiltration was shown in the relationship of Natural killer cells and T cell CD 4
CONCLUSION:
In the current study, identification of the altered expression pattern of genes in SARS-CoV-infected cells in time course also can help identify and link the molecular mechanisms and explore the holistic view of infection of SARS-CoV-2.
Introduction
The COVID-19 pandemic is caused by SARS-CoV-2, a virus from the coronavirus family that had seriously threatened humanity [1]. Explanation of the basic biology process and how the virus functions in infected cells can still be a debatable issue. the major receptor SARS-CoV-2 on membranes of target cells is the angiotensin-converting enzyme (ACE2) which initiates a complex downstream signaling pathway [2]. In some cases, following infection, acute respiratory syndrome is caused by a severe immune response in lung epithelial cells are the main cells involved [3]. Also, ACE2 acts as a functional receptor on the surface of many target cells, including lung pneumocytes, small intestinal enterocytes, proximal renal tubule cells, vascular endothelial cells, and arterial smooth muscle [4].
In this study, to precisely identify the dynamic changes in the gene expression pattern caused during the time of infection, re-analyzed datasets, including samples from human bronchial epithelial cells infected with SARS-CoV, at different time points in 1, 2, 3, and 4 days. In the first step differential expressed (DE) genes were identified with meta-analysis. Then after microRNAs (miRNAs) and long non-coding RNAs (lncRNAs) enriched with DE genes, network analysis was performed. In addition, the Gene ontology, pathway, functional analysis, and time series analysis are used to study dynamic biological processes. Finally, immune cell infiltration related to DE genes was assessed to provide insight into immune changes in this disorder. The present study’s aim is to discover insights into the performance of SARS-CoV and link it to the biological mechanism of SARS-CoV-2.
Methods
Microarray data processing
In this step, the re-analysis of the expression profiling by an array of Human airway epithelium cultures infected with SARS-CoV is performed. The gene expression profile from GSE17400 [5], GSE33267 [6], GSE37827, GSE47960, GSE47961, and GSE47962 [7] are extracted from the Gene Omnibus Expression (GEO) database [8], which each contains the expression information of infected cell samples and mock samples as controls at different post-infection times. The Expression profile data were selected at 24, 48, 72, and 96 hours (days 1, 2, 3, and 4). These times are determined based on the purpose of the study. Due to the several and close times in these datasets, in this study, we only focused on the days from the first day to the fourth day of infection. After normalization by log2 transformation and quality control checking, the group comparison was performed with the meta-analysis for each time separately using the NetworkAnalyst database [9]. Genes were declared as differentially expressed and had adjusted
Gene regulatory network construction and analysis
Gene regulatory networks were created using microRNAs (miRNAs), and long non-coding RNAs (lncRNAs) predicted with the DE genes by the Enrichr web server [11]. For this purpose, the miRTarBase and lncHUB libraries were used, respectively and the
The gene interaction networks were retrieved based on STRING Interactome [13], using CluePedia plugin version 1.5.4 [14] with a confidence cutoff of 0.80 were constructed. Retrieving interactions between genes were selected based on activation, post-translational modification, binding, and inhibition links. The integrated network for each time point was built via the Cytoscape Merge tool. Using the cytoHubba [15], the top 5% of important hub nodes were calculated and extracted from merged complex networks based on computing the topological parameters, including Degree, Betweenness centrality, closeness centrality, and Eccentricity scores.
Gene ontology (GO), pathway, and functional analysis
Function and gene ontology analysis for each network at different time points was performed using the ClueGO plugin version 2.1.7 [16] of Cytoscape software. GO “Biological Process” [17] and KEGG (Kyoto Encyclopedia of Genes and Genomes) [18] web resources were chosen for retrieving pathways. Signaling pathways with an adjusted
Short time series gene expression data analysis
Short Time Series Expression Miner (STEM) software version 1.3.11 [19] was used to detect significant temporal expression profiles. For profile identification, gene sets are considered as one replication. The dynamical pattern of genes assigned to the profile and GO enrichment for sets of genes having the same temporal expression pattern were identified. The profiles were ordered according to significant results with adjusted
Immune infiltration evaluation
Using the CIBERSORT algorithm, the proportions of immune cell subsets correlated with DE genes were analyzed via the TIMER database, a comprehensive resource for systematical analysis of immune infiltrates [20].
The principal component analysis results of datasets for each time point (24 hr, 48 hr, 72 hr, and 96 hr) are depicted. Samples of each dataset consist of infected cells and uninfected cells as control samples. The outcomes show the samples from each group are separated correctly based on their states (1A). The percentages of overlapped DE genes are represented (1B).
Meta-analysis determined differentially expressed genes
We reused the GSE17400, GSE33267, GSE37827, GSE47960, GSE47961, and GSE47962 datasets, the expression profiles of affected cells and normal samples. Due to the necessity of quality checking in the first step [21], the quality control with principal component analysis (PCA) was performed. The plots shown groups were segregated based on their states, indicating the satisfactory quality of these datasets (Fig. 1A). The comparison of affected cells and wild-type samples with the meta-analysis identified DE genes with dramatic changes during days 2 and 3 and then decreased on day 4. The numbers of DE genes during infection are 220, 2341, 2827, and 1763, respectively. There are relative overlaps in DE genes at each time point (Fig. 1B).
The top microRNAs and long non-coding RNAs enriched with DE genes, with the highest centrality parameters in gene regulatory networks at each time point are shown
The top microRNAs and long non-coding RNAs enriched with DE genes, with the highest centrality parameters in gene regulatory networks at each time point are shown
First, the gene interaction assay and the analysis of non-coding RNAs as the key regulators of genes were performed. For this purpose, miRNA and lncRNA enrichment analyses were performed by libraries of the Enrichr database. The 0, 23, 101, and 2 miRNAs and 19, 60, 27, and 25 lncRNAs enriched with genes, respectively. The 20 top them are represented in Table 1. The hsa-miR-146a-5p and hsa-miR-124-3p are overlapped miRNAs during disease. IL6-AS1 is involved during 1–4 days of the disorder.
The gene interaction networks with 36, 789, 810, and 326 genes were achieved. FOSB, CXCL2, IYD, IL6, HES1, NR4A1, CSRNP1, NFKBIA, ASB9, RGR, CALML4, DUSP1, RPS6KA1, IFNL1, JUN, BCL3, FAM8A1, BBS2, DUSP8, RNASE4, TMEM154, and MAP3K14 are involved genes during infection.
The 5% of top genes with the highest scores of Degrees, Betweenness centrality, Closeness centrality, and Eccentricity parameters in the integrated network in each post-infected time are demonstrated. (a: 24 hr, b: 48 hr, c: 72 hr, and d: 96 hr). As seen, significantly the complexity of the gene interaction network increases at 48 hours after infection and then decreases at 96 hours.
For time series analysis, the significant profiles of genes with a pattern of expression during infection are shown. The gene ontology results for each cluster of genes are presented. 24 hr are considered as 0 points and other time points in order of time post-infection, are arranged respectively.
These genes are linked to the TNF signaling pathway, IL-17 signaling pathway, and C-type lectin receptor signaling pathway. Finally, the merged networks consisting of 55, 872, 938, and 353 nodes were constructed. The complex networks on the second and third days are shown. The topology of the networks was analyzed to identify the crucial genes. The nodes with the top rank of the Degree, Betweenness centrality, Closeness centrality, and Eccentricity parameters were considered critical nodes. The 5% of top genes were explored from networks and are represented in Fig. 2. These are especially related to the B cell receptor signaling pathway, IL-17 signaling pathway, Toll-like receptor signaling pathway, NOD-like receptor signaling pathway, TNF signaling pathway, Cytosolic DNA-sensing pathway, RIG-I-like receptor signaling pathway, and RIG-I-like receptor signaling pathway.
Gene ontology and functional analysis of DE genes revealed interconnected pathways that could be associated with viral infection. TNF signaling pathway, NOD-like receptor signaling pathway, C-type lectin receptor signaling pathway, Viral carcinogenesis, IL-17 signaling pathway, Cytosolic DNA-sensing pathway, Relaxin signaling pathway, Apoptosis, Toll-like receptor signaling pathway, RIG-I-like receptor signaling pathway, Cellular senescence, NF-kappa B signaling pathway, Human T-cell leukemia virus one infection, Neurotrophin signaling pathway, MAPK signaling pathway, FoxO signaling pathway, Insulin resistance are enriched on days 2, 3, and 4 are strongly related to the infection. Also, the C-type lectin receptor signaling pathway, IL-17 signaling pathway, and TNF signaling pathway are paths enriched during all days of infection.
Short time series expression minor analysis
The results of the temporal expression patterns analysis of DE genes showed eight significant profiles clustered. In 4 profiles, gene ontology analysis has significant results with adjusted
Proportion of immune cells for each time point.
Cell-type Identification by Estimating Relative Subsets of RNA Transcripts (CIBERSORT) an analytical tool that allows immune cell profiling analysis by DE gene expression microarray data. Immune infiltration analysis (Fig. 4) investigated the role of immune cells during SARS-CoV infection. The result of the analysis revealed elevated levels of T cell CD8
Discussion
The acute respiratory syndrome is the major complication caused by the coronavirus family viruses, so that has a significant public health problem [1]. The high-throughput methodology is extremely helpful in investigating biological events globally. The objective of this study is to reuse the expression profiling by an array of infected cells with SARS-CoV to develop the model to explore its function and find critical genes and their key regulators by mapping the interaction network.
The network analysis investigated whole gene interactions. The critical nodes, FOSB, CXCL2, IYD, IL6, HES1, NR4A1, CSRNP1, NFKBIA, ASB9, RGR, CALML4, DUSP1, RPS6KA1, IFNL1, JUN, BCL3, FAM8A1, BBS2, DUSP8, RNASE4, TMEM154, and MAP3K14, are obtained based on measures combination of different centrality parameters in each network. The genes have a high rank of centrality indexes correlated with immune responses to pathogens, including C-type lectin receptor signaling pathway, B cell receptor signaling pathway, IL-17 signaling pathway, Toll-like receptor signaling pathway, NOD-like receptor signaling pathway, TNF signaling pathway, Cytosolic DNA-sensing pathway, and RIG-I-like receptor signaling pathway.
Also, in the regulatory levels, IL6-AS1 is the lncRNA predicted to be a key regulator during infection. In this study, for the first time has been reported the role of IL6-AS1 in this disease. The hsa-miR-146a-5p and hsa-miR-124-3p are miRNAs involved during disease. The hsa-miR-146a-5p could enhance the replication of infectious bronchitis virus (IBV), a type of coronavirus [22] and its role in SARS-CoV2 infection could be central. The hsa-miR-124-3p was shown in the previous study associated with cardiovascular complications of COVID-19 by controlling the transferrin receptor that interacted with ACE2 [23].
In functional and enrichment analysis, significant results included many associated pathways in response to viral infection, particularly immune response and inflammation. In addition, functions such as apoptosis, focal adhesion, gap junction, cellular senescence, autophagy, and mitophagy are interesting results related to symptoms. The other considerable results could be known as viral myocarditis and carcinogenesis. Also, the functional analysis has shown this infection process is similar to other pathogenic events, including Influenza A, Kaposi sarcoma-associated herpesvirus infection, Human immunodeficiency virus one infection, Epstein-Barr virus infection, Hepatitis B, Hepatitis C, Herpes simplex virus one infection, Human papillomavirus infection, Salmonella infection, Human cytomegalovirus infection, Measles, Leishmaniasis, Tuberculosis, and Toxoplasmosis. The temporal expression patterns analysis showed eight significant profiles, a set of genes that exhibit a similar pattern during infection. The gene ontology of these sets major involved in immune response, response to the organism, defense response, and external stimuli.
Also, the results of the correlation of differential expression of genes at each point in time with immune infiltration are the other interesting results of this study. On the first day mostly immune infiltration NK cells, in days 2, 3, and 4, frequently the correlation of differential expression gene with level of immune infiltration T cell CD 4
Conclusion
Here, using computational tools, we generated a systematic view of the pathogenic associated with SARS-CoV2 infection to explore the comprehensive molecular mechanisms of this disorder based on time series analysis.
Availability of data and materials
The materials and data used and/or analyzed during this study are available from the corresponding author on request.
Ethics approval and consent to participate
Not applicable. This study, as a re-analysis article, is exempt from Institutional Review Board approval.
Consent for publication
Not applicable.
Competing interests
The authors stated there is no conflict of interest to declare.
Funding
This research received no specific grant from any funding agency.
Author’s contributions
M.K., M.S.; Conceptualization. R.F.; Formal analysis and investigation. R.F., F.K., Writing original draft. M.K., M.S.; Review and editing. M.K.; Supervision.
Footnotes
Acknowledgments
This study was supported by the Isfahan University of Medical Science.
