Abstract
Intervertebral disc degeneration (IDD) is the major cause of low back pain. The current study was aimed to further elucidate the mechanisms underlying it. Microarray data sets GSE70362 containing Thompson degeneration grades I–V were divided into the control and the degenerative group and were analyzed. Differentially expressed genes (DEGs) were screened and clustered, followed by functional enrichment analysis. Then the protein–protein interaction (PPI) network and the microRNA (miRNA) regulatory network were constructed and integratedly analyzed. Finally, modules from the integrated network were mined, and gene ontology and pathway enrichment analysis were performed. DEGs in three clusters had the overall expression trend with the Thompson grades. The upregulated DEGs were associated with protein transport and localization, while the downregulated DEGs were enriched in membrane lipid metabolic process and endocytosis. After the integrated analysis of PPI and miRNA–target interactions, some hub genes such as HSP90B1, RPS4Y1, RPL15, and UTY, as well as hub miRNAs, including miR-124a and miR-506, were screened. Finally, modules in the integrated network were functionally associated with protein targeting, peptide processing, and metabolic process, as well as protein folding. Taken together, our data showed that the protein synthesis, targeting, and localization can be greatly changed to contribute to the progress of IDD. Besides, the related genes such as RPS4Y1 and HSP90B1 would be used as diagnostic and therapeutic targets for IDD.
Highlights
Microarray data sets GSE70362 containing degeneration grades I–V were analyzed.
Protein synthesis, targeting, and localization may contribute to degenerated intervertebral disc.
RPS4Y1 and HSP90B1 would be used as diagnostic and therapeutic targets for the disease.
1. Introduction
I
Currently, genome-wide bioinformatics analysis as a popular approach for gaining insight into the overlapping and specific genetic mechanisms of disease is widely used. For example, Zhao et al. (2016) has identified 1854 long non-coding RNAs (lncRNAs) and 2804 protein-coding genes, which are differentially expressed in IDD group compared with the spinal cord injury group using RNA sequencing. Zhu et al. (2017) screened a total of 558 differentially expressed genes (DEGs), such as IL6, VEGFA, THBS1, and ITGA4, between the degenerated and normal samples based on array data of GSE42611. In our study, we used the gene expression profile of GSE70362 to further identify the molecular mechanism of IDD through identifying the DEGs, microRNAs (miRNAs), and interactions of miRNA-targets, as well as further analyzing their functions and pathways associated with the progression of IDD. The results obtained in this study may help to prove better insights into pathogenesis and define novel systemic biomarker for IDD.
2. Materials and Methods
2.1. Gene expression profile data
Microarray data sets with the access number GSE70362 were downloaded from the Gene Expression Omnibus database (www.ncbi.nlm.nih.gov/geo). The discs with Thompson degeneration grades I–II were assigned to the nondegenerative (control) group, while those having grades III–V were assigned to the degenerative group. The tissues were obtained from 24 donors, 8 of which (4 grade I, 2 grades I–II, and 2 grade II) were subject to the control group and the others (n = 16, including 6 grade III, 6 grade IV, and 4 grade V) were the degenerative group. The platform was GPL17810 (Affymetrix Human Genome U133 Plus 2.0 Array).
2.2. Data preprocessing and mining
The original CEL files were preprocessed by the robust multiarray average algorithm (Irizarry et al., 2003) from the “affy” package (Gautier et al., 2004). The preprocessing steps involve background adjustment, quantile normalization, finally summarization and probe ID to gene symbol, and after that the expression profile matrix was extracted. When different probes were mapping to the same gene symbol, the mean value of probes represents the expression value of gene. As a result, a total of 20,545 remaining genes were obtained.
2.3. Cluster analysis for DEGs
Noise robust soft clustering of gene expression data between control and degenerative groups was conducted by the Mfuzz package (Futschik and Carlisle, 2005; Kumar and Futschik, 2007). Then the fuzzy C-means clustering algorithm was used for gene ranking and clustering based on expression values. Multiple clustering results, including up- and downregulated DEGs, were obtained based on minSTD = 0.20, acore = 0.5, and cOptimal = 9.
2.4. Enrichment analysis and functional annotation
The Database for Annotation, Visualization, and Integrated Discovery clustering algorithms (Huang et al., 2007) were used for functional and pathway enrichment analysis of DEGs. The function and pathway enrichment analyses were conducted by the gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes pathways (Kanehisa and Goto, 2000). The threshold for the number of enriched genes was count ≥2.
2.5. Prediction of protein–protein interactions
Protein–protein interaction (PPI) between proteins was analyzed using the Search Tool for the Retrieval of Interacting Genes/Proteins (Mering et al., 2003) online tool (http://string-db.org), and the threshold for protein pairs was required confidence (combined score) >0.4.
2.6. Prediction of miRNA-target gene pair
Putative miRNAs that regulate genes in PPI network were predicted using Web-based Gene Set Analysis Tool Kit (http://bioinfo.vanderbilt.edu/webgestalt; Zhang et al., 2005). The threshold for the number of enriched genes was ≥4. The interactions of adjusted p < 0.05 based on Benjamini and Hochberg (Bader and Hogue, 2003) were selected, and the miRNA regulatory network was constructed.
2.7. Integration of PPI network and miRNA regulatory network
The PPI network and the miRNA regulatory network were integrated using the Cytoscape software (Shannon et al., 2003). Most biological networks have the common feature of scale-free property. The connectivity degree distribution of individual nodes (genes) in the network was analyzed to identify network hubs.
2.8. Network module detection
The MCODE Plugin (Bader and Hogue, 2003) of Cytoscape software (www.cytoscape.org) was implied to identify modules from the integrated network (score >5). In addition, GO and pathway enrichment analysis for genes in modules were performed.
3. Results
3.1. Gene clusters
A total of nine clusters were obtained, among which three clusters exhibiting an overall downward or upward trend which was coincident with the Thompson grades were further analyzed. As shown in Table 1, there were 127, 122, and 154 genes in clusters 2, 8, and 9, respectively. Moreover, the expression of genes in cluster 2 was significantly increased, while those in clusters 8 and 9 were significantly decreased, respectively (Fig. 1).

Three clusters obtained after genetic clustering. Every curve represents a gene. The darker color of the curve corresponds to a higher compliance with the overall gene expression trend.
Gene Clustering Analysis
3.2. Functional terms enriched by up- and downregulated genes
The results of functional enrichment are shown in Table 2. The upregulated genes were mainly associated with biological process (BP) of protein transport, establishment of protein localization, protein localization, and pathway of MAPK signaling pathway, while the downregulated genes were involved in GO-BP of membrane lipid metabolic process, regulation of endocytosis and positive regulation of endocytosis, and pathways of lysosome, drug metabolism, and fatty acid elongation in mitochondria.
The Functional Enrichment Results of Up- and Downregulated Differentially Expressed Genes
BP, biological process; count, the number of genes enriched in the term; GO, gene ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes.
3.3. The prediction of miRNA–target interactions and integrated network construction
The significant miRNA–gene interactions are shown in Table 3, and among them, miR-124a, miR-145, miR-498, miR-506, and miR-96 had more targets. Next, the integrated miRNA–target gene regulatory network was constructed. As illustrated in Figure 2, the top 6 hub nodes of miRNAs in the network were miR-124a (degree = 18), miR-506 (degree = 18), miR-17-5p (degree = 14), miR-20a (degree = 14), miR-106a (degree = 14), and miR-20b (degree = 14). The top 6 hub genes were HSP90B1 (Heat Shock Protein 90 kDa Beta Family Member 1, degree = 18), FOS (FBJ Murine Osteosarcoma Viral Oncogene Homolog, degree = 17), IMPDH2 (Inosine 5′-monophosphate dehydrogenase 2, degree = 16), RPS4Y1 (Ribosomal Protein S4, Y-Linked 1, degree = 16), RPL15 (Ribosomal Protein L15, degree = 14), and UTY (Ubiquitously Transcribed Tetratricopeptide Repeat Containing, Y-Linked, degree = 14).

The integrated miRNA-target gene regulatory network. Red and green circles indicate up- and downregulated genes, respectively. The blue rhombus represents miRNA. T-shaped edges refer to the regulatory relationships between miRNAs and their targets; the others represent the interactions between proteins. The node size is in accordance with the degree. miRNA, microRNA.
The Significant MicroRNA–Target Gene Interactions
p Value was adjusted by the multiple test adjustment (adj.p). Count indicates the number of genes regulated by miRNA.
miRNA, microRNA.
3.4. Modules obtained from the integrated network
With the threshold of score >5, a total of two modules were obtained from the integrated network (Fig. 3). There were 12 nodes and 37 interaction pairs in module 1 while the module 2 consisted of 7 nodes and 17 interaction pairs. Next, enrichment analysis of the genes in two modules is listed in Table 4. And the results showed that the cluster 1 was associated with signal peptide processing, cotranslational protein targeting to membrane, translation, protein targeting to membrane and peptide metabolic process, and pathway of Ribosome. Genes in cluster 2 were mainly enriched in GO-BP terms of protein folding.

Modules obtained from the integrated network. Red and green circles indicate up- and downregulated genes, respectively. Module 1
Gene Ontology Terms and Pathways Enriched by Genes in Two Modules
4. Discussion
IDD is one of the common causes of low back pain, which affects most adults and is a major health problem in the world (Osborn and Smith, 1998). In this study, the gene expression profile generated from nondegenerative and degenerative intervertebral discs was analyzed. The results showed that the upregulated DEGs between two groups were associated with protein transport and localization, while the downregulated DEGs were enriched in membrane lipid metabolic process and endocytosis. After the integrated analysis of protein–protein and miRNA–target interactions, some hub genes such as HSP90B1, RPS4Y1, RPL15, and UTY, as well as hub miRNAs, including miR-124a and miR-506, were screened. Moreover, modules in the integrated network were functionally associated with protein targeting, peptide processing, and metabolic process, as well as protein folding. These genes and BPs may be involved in the progression of IDD, which could provide the theoretical basis to develop novel and effective treatment of IDD.
Currently, an increasing number of genes have been regarded to be involved in IDD development (Wu et al., 2015; Yan et al., 2015). In the present study, the DEGs, such as HSP90B1, RPS4Y1, RPL15, and UTY, between nondegenerative and degenerative intervertebral disc groups were identified as the key genes with the higher degrees. Actually, protein encoded by RPS4Y1 catalyzes protein synthesis, and increased protein synthesis can occur during disc degeneration (Cs-Szabo et al., 2002; Le Maitre et al., 2007), suggesting that RPS4Y1 plays an important role in IDD development. Similarly, the DEGs of HSP90B1 in cluster 2 were mainly found to be associated with protein folding, of which the encoded protein functions in the stabilizing and folding of other proteins and involves in a variety of pathogenic states (Bloor et al., 2010). In addition, an increasing number of abnormally expressed miRNAs, such as miR-193a-3p and miR-27b, serve important roles in the regulation of IDD through targeting matrix metalloproteinase (MMP)14 and MMP13, respectively (Ji et al., 2016; Li et al., 2016). In accordance with previous studies, some hub miRNAs, including miR-124a and miR-506, in our study could also be involved in the mechanism of IDD. However, little correlation between miRNAs and hub genes was identified. Thus the miRNA expression profile may be needed to integrate differential miRNA expression in gene expression in our next study.
Previous studies suggested that occlusion of vertebral end plate openings limits the transport of nutrients and metabolites and affects the loss and synthesis of matrix structural proteins, resulting in disc degeneration (Bibby et al., 2002; Benneker et al., 2005). Moreover, the site-specific localization of cathepsins was involved in the pathomechanism of IDD (Ariga et al., 2001), and the MAPK pathway plays a role in osmoregulation in intervertebral disc cells through TonEBP and its target genes (Tsai et al., 2007). The receptor-mediated endocytosis is mediated by the MAPK signaling pathway (Gong et al., 2007). In our study, the gene cluster analysis revealed that many genes, especially those in clusters 2, 8, and 9, were gradually changed in gene expression and were coincident with the degenerative grade. In the following analysis, upregulated DEGs that may be involved in IDD were enriched in protein transport and localization and MAPK signaling pathway. In addition, membrane lipid metabolic process and endocytosis enriched by downregulated DEGs may also be important for the development of IDD. Actually, even though several studies have proposed that serum lipid levels may be a risk factor for intervertebral disc pathology (Longo et al., 2011), and lipid peroxidation contributes to degeneration with aging (Ames et al., 1993), there is no direct evidence about the correlation between membrane lipid metabolism and IDD. Hence, our data could provide the theoretical basis for the related study in future.
In conclusion, protein translation, folding, transport, targeting, and localization could be greatly dysregulated during IDD. Some genes, including RPS4Y1, UTY, RPL15, and HSP90B1, were abnormally expressed and may be important for the development of this disease. However, there are some limitations in the current analysis that should be of concern, including the data generated from a single platform. Moreover, some validation experiments are still needed to confirm the results.
Footnotes
Author Disclosure Statement
The authors declare they have no competing financial interests.
Funding Information
No funding was received for this article.
