Abstract
Many published reports describe the benefits of exercise for human health. Although the evidence points to a positive effect of exercise on disease prevention, many aspects of the underlying mechanism require further investigation. This study aims to determine the critical genes through which exercise affects human health.
GSE156249, which includes 12 gene expression profiles of vastus lateralis muscle biopsies from healthy individuals before and after a 12-week combined exercise training intervention, was extracted from the Gene Expression Omnibus (GEO) database. The significant differentially expressed genes (DEGs) were assembled into an interactome using Cytoscape software and the STRING database. The network was analyzed to find the central nodes and subnetwork clusters. The nodes of the prominent cluster were assessed via gene ontology using ClueGO. Eight significant DEGs and 100 first neighbors were analyzed via network analysis. The network comprised two clusters, and COL3A1, BGN, and LOX were identified as the central DEGs. The critical DEGs were associated with cancer prevention processes.
Introduction
The advantages of exercise are described in numerous published reports. A positive role in cardiovascular health, support of mental health, improvement of skeletal muscle physiology, and promotion of several key physiological conditions are among the benefits of exercise [1, 2, 3]. Several investigations have focused on the anticancer properties of exercise. It is reported that exercise induces irisin, the fat-browning myokine, which can be considered an anticancer agent [4]. It is also proposed that exercise may support the immune system against cancer promotion [5].
Since genomic investigations identify large numbers of differentially expressed genes (DEGs), scientists are interested in finding the whole set of genes affected in the studied cases [6]. Because high-throughput methods yield so many DEGs, a complementary analysis is required to detect the critical DEGs among them [7]. Bioinformatics has been introduced as a suitable approach for analyzing high-throughput experiments [8]. Network analysis, as a bioinformatics approach, is widely used to interpret high-throughput findings [9, 10].
Box plot of the gene expression profiles of the samples before and after exercise.
Protein-protein interaction analysis can be applied to construct a network of genes or proteins as an interactome. The genes are the elements (nodes) of the network, which are connected by edges. In scale-free networks, the number of connections and the set of neighboring genes differ from node to node, depending on the properties of the genes. Nodes that are connected to a large number of first neighbors are called hub nodes. It is proposed that hubs are the critical elements of the constructed network [11, 12].
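The notion of a hub can be illustrated with a minimal Python sketch that counts the distinct first neighbors of each node in an undirected edge list and ranks the nodes by that count. The function names and the toy edge list are hypothetical and serve only as an illustration; they are not taken from the STRING network analyzed in this study.

```python
from collections import defaultdict


def degree_table(edges):
    """Return {node: number of distinct first neighbors} for an undirected edge list."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {node: len(nbrs) for node, nbrs in neighbors.items()}


def top_hubs(edges, k=3):
    """Return the k nodes with the most first neighbors (ties broken by name)."""
    degrees = degree_table(edges)
    return sorted(degrees.items(), key=lambda item: (-item[1], item[0]))[:k]


# Toy, hypothetical edge list (generic node labels, not real interactions).
toy_edges = [
    ("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"),
    ("B", "C"), ("B", "D"), ("C", "F"), ("G", "E"),
]

print(top_hubs(toy_edges))  # [('A', 4), ('B', 3), ('C', 3)]
```

In this toy network, node A has the most first neighbors and would be ranked as the leading hub candidate.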
A dense part of a network, in which the genes are linked by a large number of connections, is known as a cluster. It is proposed that the elements of a cluster are involved in similar functions. Cluster analysis has been applied to explore the molecular mechanisms of several diseases [13, 14, 15].
Gene ontology is another approach that is used to determine the related molecular functions, biochemical pathways, and biological processes for the studied genes [16]. In the present study, gene expression profiles of vastus lateralis muscle biopsies from healthy individuals before and after a 12-week combined exercise training intervention, obtained from GEO, are analyzed via network analysis to find the main effect of exercise on human health.
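How dense a group of nodes is can be quantified, for example, as the fraction of possible node pairs within the group that are actually connected. The following Python sketch computes this edge density; it is an illustrative measure with hypothetical names and toy data, and it is not necessarily the criterion that Cytoscape clustering tools apply.

```python
def edge_density(nodes, edges):
    """Fraction of possible undirected pairs among `nodes` that are connected."""
    node_set = set(nodes)
    n = len(node_set)
    if n < 2:
        return 0.0
    # Keep only edges with both endpoints inside the group; drop self-loops and duplicates.
    present = {
        frozenset((a, b))
        for a, b in edges
        if a != b and a in node_set and b in node_set
    }
    possible = n * (n - 1) // 2
    return len(present) / possible


# Toy, hypothetical network: a fully connected triangle A-B-C with a tail C-D-E.
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]

print(edge_density(["A", "B", "C"], toy_edges))            # 1.0
print(edge_density(["A", "B", "C", "D", "E"], toy_edges))  # 0.5
```

The triangle reaches the maximum density of 1.0, whereas the network as a whole is only half as dense, which is the kind of contrast that marks a cluster.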
GEO includes sets of GSEs, which contain gene expression profiles of samples under various conditions such as diseases and treatments. GSE156249, which comprises 24 GSMs, was selected from GEO. This GSE contains two groups of samples: the first group comprises 12 gene expression profiles of vastus lateralis muscle biopsies from healthy individuals, and the second group includes the corresponding profiles after a 12-week combined exercise training intervention. In the first step, the studied profiles were statistically matched via box plot analysis in the GEO2R program. The top 250 DEGs based on p-value were extracted via GEO2R analysis. Considering FC
A subcluster of the main network containing 68 nodes, including the 6 queried genes (INS-IGF2, MXRA5, THBS4, BGN, COL3A1, and LOX).
Statistical analysis revealed that the gene expression profiles before and after exercise were comparable (see Fig. 1). Eight significant DEGs were identified as factors discriminating between the pre-exercise and post-exercise states. Network analysis is a suitable method for detecting additional proteins that differentiate the gene expression profiles before and after exercise. In this regard, adding 100 first neighbors gave rise to two clusters of proteins. Clustering analysis of networks is a well-known method for assessing the molecular mechanisms of several diseases [16, 17, 18, 19]. The two identified clusters differed considerably: cluster 1 included 6 of the 8 queried DEGs, while cluster 2 contained 2 queried DEGs (see Figs 2 and 3). Protein interaction networks are reported to be mainly of the scale-free type [20]. Cluster 1 obeyed the rules of scale-free networks, whereas cluster 2 did not. In scale-free networks, a few nodes, known as central nodes, are differentiated from the others [21]. Since cluster 1 is a scale-free network, its central nodes, which play an important role in the response to exercise, can be determined. As shown in Table 1, the analysis revealed that COL3A1, BGN, and LOX were the top central nodes in cluster 1. This can be considered the first outcome of the analysis. Cluster 2, although not a scale-free network, appeared as a dense cluster. It is reported that the nodes of a dense part of a network act in a very similar manner [22]. Therefore, it can be concluded that the elements of cluster 2 are involved in a similar biological process. This is the second finding of the investigation.
Given the importance of cluster 1 (which contains 6 queried genes), gene ontology analysis of its elements identified 33 biological terms related to COL3A1, LOX, BGN, and THBS4.
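The contrast drawn above, between a network in which a few nodes stand out and a dense cluster in which all nodes behave alike, can be made concrete by comparing degree distributions. The Python sketch below is only an informal illustration on two hypothetical toy graphs (a star and a fully connected group); it is not a formal test for scale-free behavior and does not reproduce the analysis of clusters 1 and 2.

```python
from collections import Counter, defaultdict


def degree_distribution(edges):
    """Return {degree: number of nodes with that degree} for an undirected edge list."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            neighbors[a].add(b)
            neighbors[b].add(a)
    return dict(Counter(len(nbrs) for nbrs in neighbors.values()))


# Hypothetical star: one central node linked to five peripheral nodes.
star = [("hub", f"leaf{i}") for i in range(1, 6)]

# Hypothetical dense group: four nodes, every pair connected.
labels = "WXYZ"
clique = [(a, b) for i, a in enumerate(labels) for b in labels[i + 1:]]

print(degree_distribution(star))    # one node of degree 5, five nodes of degree 1
print(degree_distribution(clique))  # four nodes, all of degree 3
```

In the star, a single node is clearly differentiated from the rest, so a central node can be named; in the fully connected group, every node has the same degree and none stands out.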
Cluster 2, which contains 40 nodes of the main network, including 2 queried genes (THY1 and PRND).
The 6 queried nodes of cluster 1. The top three nodes appeared as hub nodes.
The GO terms that are related to the queried genes of cluster 1
As shown in Fig. 1, the distributions of gene expression values in the samples follow a similar statistical pattern, which supports the suitability of the samples for analysis. Among a large number of genes, 8 significant DEGs were selected after applying the p-value and fold change criteria. Cluster analysis divided the critical DEGs into two classes: class 1 (INS-IGF2, MXRA5, THBS4, BGN, COL3A1, and LOX) and class 2 (THY1 and PRND). Based on centrality analysis, the first class of critical DEGs includes two sets: the central DEGs (COL3A1, BGN, and LOX) and the remaining DEGs (INS-IGF2, MXRA5, and THBS4). Gene ontology revealed that THBS4 can be considered a prominent gene because 5 biological terms (15% of all terms) were associated with it.
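The selection of significant DEGs by p-value and fold change criteria, mentioned above, can be sketched in Python as a simple filter over a table of results. The cutoffs used here (p-value below 0.05 and absolute log2 fold change of at least 1), the row keys, and the toy rows are all hypothetical placeholders chosen for illustration; they are not the thresholds or data of the present study.

```python
def select_degs(rows, p_cutoff=0.05, min_abs_log2fc=1.0):
    """Keep rows passing both criteria, ordered by ascending p-value.

    Each row is a dict with the hypothetical keys 'gene', 'p_value', and 'log2_fc'.
    The default cutoffs are illustrative assumptions only.
    """
    selected = [
        row for row in rows
        if row["p_value"] < p_cutoff and abs(row["log2_fc"]) >= min_abs_log2fc
    ]
    return sorted(selected, key=lambda row: row["p_value"])


# Toy, made-up rows with generic gene labels.
toy_rows = [
    {"gene": "g1", "p_value": 0.001, "log2_fc": -1.5},   # passes both criteria
    {"gene": "g2", "p_value": 0.2, "log2_fc": -2.0},     # fails the p-value criterion
    {"gene": "g3", "p_value": 0.01, "log2_fc": 0.3},     # fails the fold change criterion
    {"gene": "g4", "p_value": 0.0005, "log2_fc": 1.2},   # passes both criteria
]

print([row["gene"] for row in select_degs(toy_rows)])  # ['g4', 'g1']
```

Using the absolute value of the log fold change keeps both up-regulated and down-regulated genes, so the direction of change can be examined afterwards.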
COL3A1 is a key element in various cancers. Shi et al. showed that up-regulation of COL3A1 has a prominent role in triple-negative breast cancer metastasis [23]. Shen et al. reported that up-regulation of FN1, COL3A1, FBN1, BGN, COL5A2, THBS2, COL5A1, and SPARC forms a possible central core in gastric cancer, and that COL3A1 plays a significant role as a prognostic marker in this cancer. According to this report, the COL3A1/FBN1/COL5A2/SPARC-mir-29a-3p-H19 ceRNA network has prognostic value in gastric cancer [24]. Zhang et al. reported on the role of COL3A1 in esophageal cancer: COL3A1 and POSTN are up-regulated in esophageal cancer tissues relative to healthy tissues, and high expression of these genes is accompanied by a relatively high pathologic stage of esophageal cancer in patients [25]. In another study, J Chen et al. pointed to COL3A1, together with COL1A1, COL1A2, and DCN, as potential biomarkers for detecting the progression of lung adenocarcinoma in patients who smoke [26]. All of these studies emphasize that high expression of COL3A1 is related to the progression of various types of cancer. Gene ontology analysis (see Table 2) revealed that COL3A1 is related to 36% of the determined biological terms.
The second critical gene identified as a central gene is BGN. The investigation by Zhao Liu et al. revealed that BGN overexpression is correlated with gastric cancer properties such as microvascular tumor thrombus (
The third central gene, LOX, was identified as a hub node in cluster 1. As with COL3A1 and BGN, several reports address the role of LOX in the malignancy of various types of cancer. MK Kim et al. introduced LOX as a biomarker of malignant pleural mesothelioma. S Lin et al. reported the prognostic value of LOX family genes in kidney renal clear cell carcinoma. Based on this investigation, significant up-regulation of the LOX and LOXL2 genes is correlated with poor survival of patients [30, 31]. R Mongkolrob et al. highlighted LOX polymorphism as an influence on cancer risk in a published meta-analysis [32].
THBS4 was not among the central nodes of cluster 1 but was related to 15% of the explored biological terms. Researchers have pointed out that THBS4 is a cancer-related gene. The association between THBS4 and the progression of colorectal cancer was confirmed by the research team of MS Kim [33]. Another assessment indicates that silencing of THBS4 has a regulatory effect on prostate cancer [34].
THY1 and PRND belong to cluster 2. As reported by Liu et al., positive expression of THY1 is closely associated with the condition and properties of extrahepatic cholangiocarcinoma in the studied patients [35]. Ryskalin et al. investigated the role of cellular prion protein in the progress of cell differentiation in cancer. Their investigation revealed a cancer-promoting effect of the studied prion protein [36].
All of the queried genes were down-regulated in the present assessment, whereas previous research has linked the over-expression of this gene set to the development of various types of cancer. It can therefore be concluded that cancer prevention is one of the significant outcomes of exercise.
Conclusion
Exercise markedly affects the gene expression profile of the treated samples. In the present study, COL3A1, BGN, LOX, THBS4, THY1, and PRND were found to be significant down-regulated DEGs. Since this set of genes is involved in cancer development and promotion, it can be concluded that exercise contributes significantly to cancer prevention.
Footnotes
Acknowledgments
Shahid Beheshti University of Medical Sciences supports this research.
Conflict of interest
The authors declare no conflict of interest.
Ethical considerations
Not applicable.
