Abstract
Long noncoding RNAs (lncRNAs) have been increasingly accepted to function importantly in human diseases by serving as competing endogenous RNAs (ceRNAs). To date, the ceRNA mechanisms of lncRNAs in the progression of atherosclerosis (AS) remain largely unclear. On the basis of ceRNA theory, we implemented a multistep computational analysis to construct an lncRNA–mRNA network for AS progression (ASpLMN). The probe reannotation method and microRNA–target interactions from databases were systematically integrated. Three lncRNAs (GS1-358P8.4, OIP5-AS1, and TUG1) with central topological features in the ASpLMN were firstly identified. By using subnetwork analysis, we then obtained two highly clustered modules and one dysregulated module from the ASpLMN network. These modules, sharing three lncRNAs (GS1-358P8.4, OIP5-AS1, and RP11-690D19.3), were significantly enriched in biological pathways such as regulation of actin cytoskeleton, tryptophan metabolism, lysosome, and arginine and proline metabolism. In addition, random walking in the ASpLMN network indicated that lncRNA RP1-39G22.7 and MBNL1-AS1 may also play an essential role in the pathology of AS progression. The identified six lncRNAs from the aforementioned steps could distinguish advanced- from early-staged AS, with a strong diagnostic power for AS occurrence. In conclusion, the results of this study will improve our understanding about the ceRNA-mediated regulatory mechanisms in AS progression, and provide novel lncRNAs as biomarkers or therapeutic targets for acute cardiovascular events.
Introduction
Cardiovascular disease (CVD) has become a major threat to public health in both developed and developing countries, accounting for 31% of all-cause mortality worldwide (World Health Organization, 2017). The fundamental pathogenesis of CVD is atherosclerosis (AS), a chronic context featured by endothelial dysfunction, vascular smooth cells (VSMCs) proliferation and migration, immunoinflammatory response, lipid deposition, and the formation of atherosclerotic plaque (Brown et al., 2017). With AS progression, the erosion and rupture of atherosclerotic plaque leading to thrombosis is responsible for acute cardiovascular events, such as myocardial infarction and ischemic stroke (Fleg et al., 2012; Libby et al., 2019). These events contribute largely to the unacceptable high mortality and disability in patients with CVD (Virani et al., 2020). Thus, identification of biomarkers related to AS progression is essential for the management of CVD patients.
Recent genomic and transcriptomic advances have facilitated the discovery of noncoding RNAs in various human diseases. Long noncoding RNAs (lncRNAs) are recognized as a newly identified subclass of noncoding RNAs with exceeding 200 nucleotides in length. LncRNAs compose the main transcriptional repertoire of the mammalian genome, with >100,000 human lncRNAs recorded in public databases (Maracaja-Coutinho et al., 2019). A growing body of evidence has revealed that lncRNAs can regulate protein-coding gene expressions at epigenetic, transcriptional, and posttranscriptional levels, by acting as archetypes, including scaffolds, signals, guides, and decoys (Kornienko et al., 2013; Li et al., 2014b; Schmitz et al., 2016). The dysregulated expression of lncRNAs has been documented in a variety of disorders, especially in cancers (Mirhosseini et al., 2019). Also in CVD, accumulating experiments have found that lncRNAs exert functional roles by regulating pathological cardiac remodeling (Sun and Wang, 2019), myocardial ischemia (Wang et al., 2018), abnormal cardiac electrophysiology (Zhang et al., 2019), and so on.
The competing endogenous RNAs (ceRNA) activity has been widely accepted as an important regulatory mechanism for functional lncRNAs (Salmena et al., 2011). The ceRNA theory was first reported in plants (Franco-Zorrilla et al., 2007; Paschoal et al., 2018), in which lncRNAs could communicate with mRNAs through serving as microRNAs (miRNAs) sponges to mitigate the miRNA-targeted suppression of mRNAs. Interestingly, an increasing number of studies have demonstrated that this regulatory map is present in AS progression. For instance, lncRNA MIAT was observed to upregulate CD47 expression and inhibit efferocytosis in advanced AS through sponging miR-149-5p (Ye et al., 2019). In addition, lncRNA HOTAIR could ameliorate oxidation and inflammation response in oxidized low-density lipoprotein (LDL)-treated macrophages by binding to miR-330-5p, thereby promoting the development of AS (Liu et al., 2019). But to date, the ceRNA crosstalk involved in AS progression has not been systematically investigated.
In this study, we constructed an lncRNA–mRNA regulatory network for AS progression (ASpLMN) based on ceRNA hypothesis. Several functional lncRNAs and ceRNA regulatory mechanisms in AS progression were then identified from the network using multistep computational methods. These findings will improve our knowledge about the ceRNA regulatory mechanisms in AS progression, and further facilitate the discovery of novel lncRNAs for diagnosis or therapeutics.
Materials and Methods
Data sources
We downloaded the transcriptomic data related to AS progression from Gene Expression Omnibus database (GEO) with an accession number of GSE28829. This data set contained 16 advanced and 13 early atherosclerotic plaque tissues, and the gene expressions were profiled using Affymetrix HG-U133 Plus 2.0 array (Doring et al., 2012). With robust multichip average algorithm (Irizarry et al., 2003), the microarray data were normalized and log2 transformed for further analyses.
miRNA–target interactions
StarBase aims to decipher the RNA–RNA and RNA–protein regulatory networks from large-scale CLIP-Seq data (Li et al., 2014a). DIANA-LncBase is designed for providing an extensive collection of miRNA–lncRNA biological interactions (Paraskevopoulou et al., 2016). In this study, the experimentally supported miRNA–lncRNA and miRNA–mRNA interactions were extracted from StarBase and DIANA-LncBase. Finally, we obtained a total of 64,040 miRNA–lncRNA pairs and 423,975 miRNA–mRNA pairs, including 1135 miRNAs, 6768 lncRNAs, and 13,801 mRNAs.
Microarray reannotation
A probe reannotation strategy was applied to get lncRNAs and mRNAs transcripts from the Affymetrix HG-U133 Plus 2.0 array, as described in our previous studies (Qian et al., 2019). First, the probe sequences for the array were downloaded from the Affymetrix website and aligned to the human genome (h19) using SeqMap tool (Jiang and Wong, 2008). We only retained the probes that were uniquely matched to the genome with no mismatch. Second, the chromosomal locations of the remaining probes were remapped to the chromosomal coordinates of lncRNAs or mRNAs transcripts annotated by the GENCODE project (release 28), using BEDTools. The probes that mapped to both lncRNAs and mRNAs were discarded.
Construction of the ASpLMN
A multistep computational approach was used to construct the ASpLMN based on ceRNA theory, as depicted in Figure 1. By using the “limma” package (Ritchie et al., 2015) in R, we first identified the differentially expressed lncRNAs (DElncRNAs) and differentially expressed mRNAs (DEmRNAs) between early and advanced AS plaques. The mRNAs with |fold change| > 1.5 and p < 0.01 were treated as differentially expressed. For lncRNAs, the threshold for difference was set to |fold change| > 1.2 and p < 0.01 owning to their generally lower abundance. Second, the DElncRNAs and DEmRNAs were integrated into the miRNA–target relationships to get the DElncRNA–miRNA–DEmRNA crosstalks. Third, the statistical significance of the common miRNAs between each lncRNA–mRNA pair was detected using hypergeometric test, and the formula for p-value was as follows:
where l is the number of miRNAs that interacted with a given lncRNA, m is the number of miRNAs that regulated a given mRNA, N refers to the total number of human miRNAs in StarBase and DIANA-LncBase, and t represents the number of shared miRNAs between a given mRNA–lncRNA pair. The hypergeometric test was conducted using the “phyper” function in R, and the p-values were adjusted by false discovery rate (FDR). The lncRNA–mRNA pairs with FDR <0.01 were integrated to form the ASpLMN.

The approach to construct the lncRNA–mRNA network for AS progression based on ceRNA mechanism. First, the GPL570 platform was reannotated to get both lncRNA and mRNA transcripts using SeqMap and BEDTools, and DElncRNAs and DEmRNAs were then obtained using limma test. Second, the DElncRNAs and DEmRNAs were integrated into the miRNA–target relationships from StarBase and DIANA-LncBase to get candidate lncRNA–mRNA pairs. Third, hypergeometric test was performed to screen out the lncRNA–miRNA interactions with FDR <0.01, and the remaining lncRNA–mRNA pairs were merged to construct the lncRNA–mRNA network for AS progression. AS, atherosclerosis; ceRNA, competing endogenous RNA; DElncRNAs, differentially expressed lncRNAs; DEmRNAs, differentially expressed mRNAs; FDR, false discovery rate; lncRNA, long noncoding RNA; miRNA, microRNA; mRNA, messenger RNA. Color images are available online.
Subnetwork analyses in the ASpLMN
Previous studies have demonstrated that lncRNAs often exert their biological roles in a module (Pan et al., 2017). To identify the lncRNA-regulated modules in the ASpLMN, we first conducted a bidirectional hierarchical clustering using complete linkage approach and viewed the results by R package “pheatmap.” In addition, we also investigated the dysregulated modules in the network using Pearson's correlation coefficients (PCCs), as described in our previous studies (Qian et al., 2019). In brief, the PCC for each lncRNA–mRNA interaction in the ASpLMN was calculated in the early and advanced AS samples, respectively. The lncRNA–mRNA pairs with PCC >0.5 in the early or advanced AS plaques were selected as candidate dysfunctional pairs. To reflect the extent of dysregulation, we then calculated the change of PCC (ΔPCC) of the candidate pair in advanced compared with early AS samples, by using the following formula:
where PCCa (l, m) indicates the PCC between a given lncRNA–mRNA pair in advanced AS samples, and PCCe (l, m) indicates the PCC between a given lncRNA–mRNA pair in early AS samples. To determine the significance of ΔPCC, a 1000-time permutation for each candidate dysregulated pair was performed by randomly shuffling the AS sample labels. The p-value of ΔPCC was generated as the frequency of the ΔPCC values in random permutations, which were larger than the ΔPCC value in real condition. The candidate ceRNA pairs with |ΔPCC| > 0.5 or p < 0.05 were selected to construct the dysregulated module.
KEGG pathway enrichment analysis for the clustered or dysregulated module was performed using DAVID 6.8 tool (Huang da et al., 2009). The pathways with p < 0.05 were regarded as significantly enriched.
Random walk with restart
The random walking in a network refers to the transition of an iterative walker from a certain node to a randomly chosen neighbor that begins from a given node (node0). In this study, we carried out a random walk algorithm in the ASpLMN that had the probability of restart at node0 in every step. This algorithm was described as follows:
where r represents the restart probability, M denotes the column-normalized adjacency network matrix, P t is a vector whose length is equivalent to the count of nodes in network, and the i th element in P t stands for the probability of being at node0 at step t.
In this study, the functional genes for AS progression were identified using protein–protein interaction (PPI) network analysis. We then matched the functional genes to the ASpLMN to get the seed nodes for random walking. The initial parameter P 0 was established as follows: 1/n was allocated to each seed node (n indicates the count of seed nodes), and 0 was assigned to nonseed nodes. The probability r was set as 0.5. For the vector Pt , it would reach stable at step t if the difference between Pt and Pt +1 was <10−10. Therefore, each lncRNA node could be scored according to the corresponding value in P t. To determine the statistical significance, we computed the scores of lncRNAs in the ASpLMN after 3000 random permutations of the seed nodes, while keeping the network topological properties consistent. During permutations, the times that the score of each lncRNA node was higher than the actual one was recorded as k. The statistical p-value for every lncRNA was thus calculated as k/3000.
Diagnostic performance of the hub lncRNAs in AS
To explore the distinguished power for early and advanced AS, we conducted an unsupervised hierarchical clustering for the expressions of hub lncRNAs, which were identified from previous steps. Then, the gene expression patterns of the core lncRNAs in AS progression were validated in available independent data sets, including GSE41571 (Lee et al., 2013), GSE120521 (Mahmoud et al., 2019), and E-TABM-190 (Papaspyridonos et al., 2006). In addition, we constructed a scoring signature to assess the diagnostic performance of the hub lncRNAs for AS occurrence in data set of GSE40231 (Hagg et al., 2009). In brief, the Z-score of the hub lncRNAs across samples were summarized to build a diagnostic signature. To evaluate the signature performance in AS, receiver operation characteristic (ROC) analysis was performed and the area under the curve (AUC) was calculated using R “pROC” package. For background comparison, an equal number of lncRNAs were extracted to construct a random signature.
Results
Identification of differentially expressed genes
Using the probe reannotation pipeline, a total of 4499 probe-lncRNA pairs and 36,291 probe-mRNA relationships were documented for Affymetrix HG-U133 Plus 2.0 array. Then, the lncRNA and mRNA expression profiles were extracted from the microarray data of GSE28829, and limma test was performed to identify the differentially expressed genes. As a result, we obtained 56 DElncRNAs (Supplementary Table S1) and 507 DEmRNAs (Supplementary Table S2) between early and advanced atherosclerotic plaques.
Construction and topological feature of the ASpLMN
The ASpLMN were established based on the ceRNA hypothesis by integrating the differentially expressed genes and the miRNA–target relationships from StarBase and DIANA-LncBase. Finally, we identified 1371 lncRNA–mRNA pairs with FDR <0.01 in hypergeometric test (Supplementary Table S3). The ASpLMN were constructed by merging these pairs, which consisted of 23 lncRNAs and 254 mRNAs (Fig. 2A).

Topological feature of the lncRNA–mRNA network for AS progression.
After building the network, the topological parameters, including degree, betweenness centrality (BC), and closeness centrality (CC), were analyzed, as mentioned in our previous studies (Qian et al., 2019). We found that the degrees of all nodes in the ASpLMN showed a power-law distribution (R 2 = 0.759, Fig. 2B), suggesting this network was scale free. Wilcoxon rank-sum test also indicated that the median degree of lncRNA nodes was remarkably higher than that of mRNA nodes (p < 0.001). We then identified the nodes with important topological properties in the ASpLMN, and the top 10 ranked nodes according to degree, BC and CC were documented (Supplementary Table S4). It was found that three lncRNAs (GS1-358P8.4, OIP5-AS1, and TUG1) were all critical in each topological parameter (Fig. 2C), implying that they might play crucial regulatory roles in AS progression.
Subnetwork analyses in the ASpLMN
To further explore the ceRNA mechanisms in the network, we analyzed the lncRNA-related modules by using bidirectional hierarchical clustering. Two dense modules were obtained from the heat map (Fig. 3A), indicating some lncRNAs and mRNAs tended to form a functional module in the progression of AS. The clustered module 1 (C-module 1) consisted of 100 ceRNA interactions, including four lncRNAs (GS1-358P8.4, OIP5-AS1, TUG1, and HCG11) and 25 mRNAs (Fig. 3B). The clustered module 2 (C-module 2) contained four lncRNAs (CTC-429P9.3, RP11-677M14.3, SNHG12, and RP11-690D19.3) and 38 mRNAs, forming 134 ceRNA pairs (Fig. 3C). KEGG pathway analyses were then performed for the mRNAs in these two modules. We found that the genes in C-module 1 were mainly involved in regulation of actin cytoskeleton, whereas the genes in C-module 2 were significantly enriched in tryptophan metabolism and lysosome (Fig. 3D).

Subnetwork analysis of the lncRNA–mRNA network for AS progression.
We next investigated the dysregulated module in the ASpLMN, which has been suggested to exert a “switch” role in pathology of human disease (Shao et al., 2015). A total of 39 dysregulated ceRNA pairs were identified using the threshold of |ΔPCC| > 0.5 or p < 0.05, as described in the “Materials and Methods” section. These ceRNA interactions were then merged to construct the dysfunctional module (D-module), which consisted of 15 lncRNA and 8 mRNAs (Fig. 3E). The lncRNA nodes with top 20% of degrees in this module were considered as the key lncRNAs, including GS1-358P8.4, OIP5-AS1, and RP11-690D19.3. KEGG pathway enrichment analysis demonstrated that the mRNAs in the D-module were involved in arginine and proline metabolism (Fig. 3D). Finally, by integrating the core lncRNAs from the C-modules and D-module, we gained three common lncRNAs (GS1-358P8.4, OIP5-AS1, and RP11-690D19.3; Fig. 3F) that might function importantly in the subnetwork of ASpLMN.
Random walking in the ASpLMN
To get more functional lncRNAs related to AS progression, we implemented a random walk algorithm with restart in the ASpLMN. First, the PPI network of DEmRNAs was analyzed, and the mRNAs with top 5% of degrees in the network were selected as seed genes for random walking (Fig. 4A). As a result, 25 critical mRNAs were set as the seed nodes and were then mapped to the ASpLMN. Second, each lncRNA node was assigned a score by random walking and the statistical significance of the score was obtained using 3000-time iterations. Two lncRNA (RP1-39G22.7 and MBNL1-AS1) with p < 0.05 were finally identified (Fig. 4B and Supplementary Table S5), which were considered important in the pathology process of AS progression.

Random walking in the lncRNA–mRNA network for AS progression.
Diagnostic power of the functional lncRNAs
To assess the discriminatory ability for AS progression, an unsupervised hierarchical clustering was carried out for the six lncRNAs, which were obtained from the aforementioned steps. In Figure 5A, all advanced AS samples were grouped into Cluster 1, and all early AS plaques were grouped into Cluster 2. The result indicated that the six lncRNAs were closely associated with the disease status of AS and may serve as potential biomarkers for AS progression. Next, the expression patterns of the functional lncRNAs were validated in three independent data sets (GSE41571 [n = 11], GSE120521 [n = 8], and E-TABM-190 [n = 11]). All the lncRNAs were significantly downregulated in at least two of the data sets, with the exception of lncRNA OIP5-AS1 (Fig. 5B). Finally, we conducted ROC analysis to further confirm the importance of the functional lncRNAs in AS. The Z-score of the six lncRNAs across AS samples in GSE40231 (n = 80) were totalized to form an integrated signature (termed 6-lncRNA). The ROC curve of the 6-lncRNA for AS occurrence showed an AUC of 0.941, which was significantly higher than that of the random signature (Fig. 5C).

Diagnostic performance of the hub lncRNAs in AS.
Discussion
The progression of AS is the fundamental mechanism for the transition from stable CVD to acute cardiovascular events (Ahmadi et al., 2019), such as myocardial infarction and stroke. Therefore, identification of biomarker related to AS progression will improve the current management of CVD patients. In the past decade, great efforts have been put into the uncovering of novel pathomechanisms in AS progression. However, the majority of the studies have focused on revealing the molecular mechanisms mediated by protein-coding genes or miRNAs (Feinberg and Moore, 2016; Poznyak et al., 2020). As a novel subtype of noncoding RNAs, lncRNAs has been increasingly shown to exert critical regulatory roles in a variety of diseases, from malignancies to cardiovascular disorders. Also, animal experiments have identified some lncRNAs that could participate in the development of AS, by acting as ceRNAs to relieve the miRNA-mediated silencing of protein-coding genes (Cao et al., 2019). But until now, there is no systematical study for the ceRNA-based mechanism in AS progression to explore the functional lncRNAs.
In this analysis, we constructed a biological network (ASpLMN) for AS progression on the basis of ceRNA hypothesis by mapping differentially expressed genes to miRNA–target interactions. We first identified three lncRNAs that showed central topological properties in the ASpLMN. Then, subnetwork analyses screened out three modules that shared three functional lncRNAs. Random walking in the network was also carried out, and we obtained two lncRNA that might exert important functions in the development of AS. More importantly, the expression patterns of the six lncRNAs identified from the aforementioned steps could distinguish the disease stages of AS, with an excellent diagnostic performance for atherogenesis.
In KEGG pathway analysis, we found the lncRNA-related functional modules were significantly enriched in pathways, including regulation of actin cytoskeleton, tryptophan metabolism, lysosome, and arginine and proline metabolism. Actin cytoskeleton has been demonstrated to be involved in vascular contractility and remodeling, inflammatory cell recruitment, and cell proliferation (Zhou et al., 2011). Realignment of actin cytoskeleton results in the preferential migration of VSMCs to the thrombosis at the fibrous cap, thus increasing the vulnerability of atherosclerotic lesions (Yuen and Robinson, 2013). Accordingly, inhibition of the critical regulators (Rho-kinase) of actin cytoskeleton could reverse the arteriosclerotic plaques in porcine coronary vessels (Shimokawa et al., 2001). Changes in amino acid metabolism may affect the polarization and proliferation of immune cells during the development of AS (Nitz et al., 2019). Previous studies have indicated that the tryptophan and arginine–proline metabolism were altered in the context of AS (Martin-Lorenzo et al., 2015; Jung et al., 2018). Lysosome is a central organelle for digestion of cholesteryl esters in the core of LDL. Dysfunctional lysosome leads to the cellular deposits of cholesteryl esters, which subsequently promote the macrophages conversion to foam cells and the proliferation of VSMCs, thereby facilitating the growth of atherosclerotic plaques (Maxfield, 2014).
Among the six core lncRNAs, only TUG1 has been directly linked to development of AS in animal experiments, highlighting the novelty of our findings. Wu et al. (2020) reported that in the pathology process of AS, lncRNA TUG1 could regulate the proliferation and apoptosis of endothelial cells and VSMCs by sponging miR-148b. Although the biological functions of other lncRNAs remain largely unclear, the expression pattern of them was stable across independent data sets, as shown in Figure 5B. Of note, lncRNA OIP5-AS1 was not validated as a significantly expressed gene, which may be attributed to the small sample size of the validation data sets. Indeed, lncRNA OIP5-AS1 was shown to regulate the oxidized LDL-mediated endothelial cell injury through serving as a ceRNA of miR-320a (Zhang et al., 2020), implying that OIP5-AS1 has a potential functional role in AS. Moreover, the expressions of the six lncRNAs showed an excellent diagnostic performance for AS occurrence. These findings, together with results of relevant studies, indicated that the six lncRNAs may function importantly in the progression of AS.
Several limitations of our study should be recognized. First of all, we used a probe reannotation strategy to obtain lncRNAs transcripts due to the lack of open data set, which may filter out many lncRNAs that cannot match to the probe sequences. However, this is a widely used method in bioinformatics analyses and could help investigators reanalyze the microarray data to identify both lncRNAs and mRNAs expression profiles (Du et al., 2013). Second, the ASpLMN was built by using the miRNA–target interactions from animal experiments. Because we emphasized reliability rather than completeness, the predicted data were not considered in this study. Experimentally supported data were neither complete nor unbiased; therefore, future experiments were required to validate the identified lncRNAs and ceRNA mechanisms.
Conclusion
Taken together, the results of our study provide a regulatory network for AS progression based on the ceRNA hypothesis. We identified a total of six key lncRNAs from the network by topological analysis, subnetwork analysis, and random walking algorithm. The expression patterns of the six lncRNAs were highly correlated with the disease stage of AS and showed strong diagnostic power for atherogenesis. We also obtained some ceRNA-mediated signaling pathways for AS progression. These results will improve our knowledge about the ceRNA regulatory mechanisms involved in the progression of AS, and provide potential lncRNAs as diagnostic biomarkers or therapeutic targets.
Data availability
The transcriptomic data were obtained from the Gene Expression Omnibus database and the ArrayExpress database. The detailed sources for all data used this study were summarized in Supplementary Table S6.
Footnotes
Disclosure Statement
No competing financial interests exist.
Funding Information
This study was supported by the grant from The Doctoral Scientific Research startup Fund Project of the Affiliated Hospital of Southwest Medical University (No. 19071).
Supplementary Material
Supplementary Table S1
Supplementary Table S2
Supplementary Table S3
Supplementary Table S4
Supplementary Table S5
Supplementary Table S6
References
Supplementary Material
Please find the following supplemental material available below.
For Open Access articles published under a Creative Commons License, all supplemental material carries the same license as the article it is associated with.
For non-Open Access articles published, all supplemental material carries a non-exclusive license, and permission requests for re-use of supplemental material or any part of supplemental material shall be sent directly to the copyright owner as specified in the copyright notice associated with the article.
