Abstract
Maintenance of the pluripotent state, or differentiation of pluripotent cells into any germ layer, depends on factors that orchestrate the expression of thousands of genes through epigenetic, transcriptional, and post-transcriptional regulation. Long noncoding RNAs (lncRNAs) are implicated in the complex molecular circuitry of developmental processes. The ENCODE project has opened up new avenues for studying these transcripts by providing new datasets for lncRNA annotation and regulation. Expression studies have identified hundreds of lncRNAs differentially expressed in the pluripotent state, and many of these lncRNAs control pluripotency and stemness in embryonic and induced pluripotent stem cells or, conversely, promote differentiation of pluripotent cells. They are generally transcriptionally activated or repressed by pluripotency-associated transcription factors and function as molecular mediators of the gene expression that determines the pluripotent state of the cell. They can act as molecular scaffolds or guides that direct chromatin-modifying complexes to specific genomic loci, where the complexes impart a repressive or activating effect on gene expression, or they can regulate gene expression transcriptionally or post-transcriptionally through diverse molecular mechanisms. This review focuses on recent findings on the regulatory role of lncRNAs in two main aspects of pluripotency, namely, self-renewal and differentiation into any lineage, and describes the underlying molecular mechanisms that are now being uncovered.
Introduction
In recent years, stem cells and their therapeutic applications have emerged as a rapidly developing field of research worldwide [21]. Decades of research have yielded fragmented knowledge about stem cell biology and its therapeutic applications, and this knowledge needs to be integrated before it can be put to practical use (as discussed by Parker and colleagues [22,23]). Pluripotent stem cells are a special type of stem cell with the potential to differentiate into any of the three germ layers: endoderm, mesoderm, and ectoderm [24]. Because they can self-renew and give rise to cells of any lineage, pluripotent stem cells are thought to be the most promising cells for repairing damage by replacing injured tissues and possibly whole organs [25,26]. In mammals, these pluripotent stem cells are derived from the epiblast of early embryos and are called embryonic stem cells (ESCs) [24]. Induced pluripotent stem cells (iPSCs) are engineered pluripotent cells generated by reprogramming an adult somatic cell so that it mimics the gene expression profile of an ESC. After John Gurdon's pioneering 1958 demonstration that a frog could be cloned using intact nuclei from the somatic cells of tadpoles [27], several advances followed in the field of nuclear reprogramming [28–30]. However, reprogramming research reached a new height in 2006 with the study of Yamanaka and colleagues, who described the induction of pluripotency-associated changes after introducing four key transcription factors that are overexpressed in ESCs, namely, Oct4, Sox2, Klf4, and c-Myc, into adult mouse fibroblasts [31]. This experiment was subsequently repeated successfully in human cells [32].
However, a clear understanding of the cellular factors that maintain pluripotency is essential for successful therapeutic application of these pluripotent cells [33–37]. The transcriptional and epigenetic landscapes of the pluripotent state are very different from those of adult cells [38]. The distinctive gene expression signature of pluripotent cells is controlled by the key pluripotency-associated transcription factors acting in concert with other regulatory factors, such as noncoding RNAs [39]. This is likely because the widespread activation or repression of gene expression that defines the pluripotent state has to be coordinated by molecules that guide different regulatory components and bring them close together.
MicroRNAs are known to have regulatory roles in developmental processes [40–43]. Similarly, long noncoding RNAs have been identified as key players in development, and many of them are differentially expressed during development [44–47]. Recent studies have identified hundreds of long noncoding RNAs that are significantly highly expressed in both ESCs and iPSCs [48]. Many of these lncRNAs are transcriptionally activated or repressed by pluripotency-associated transcription factors. A recently published database [49] presents a comprehensive annotation of the transcription factor-binding sites in lncRNAs and of the transcriptional regulatory relationships between transcription factors and lncRNAs, based on chromatin immunoprecipitation followed by next-generation DNA sequencing (ChIP-Seq) data. This database includes lncRNAs whose promoters have binding sites for pluripotency-associated transcription factors. There are also lines of evidence that lncRNAs bind chromatin-modifying complexes to modulate the expression of pluripotency-associated factors, either to maintain pluripotency or to promote differentiation of pluripotent cells [40]. This review gives a brief account of pluripotency-associated lncRNAs, both those that favor the maintenance of pluripotency and those that facilitate differentiation, and of the possible molecular mechanisms underlying their regulatory effects.
Differential Expression of lncRNAs in Pluripotent Stem Cells
Most long noncoding RNAs have cell type-specific expression signatures [10]. As for protein-coding transcripts, this selective expression is attributed to epigenetic modifications such as DNA methylation, histone modification, and chromatin remodeling [50]. The study by Zhang and colleagues [51] of the methylation status and expression profiles of lncRNAs in mouse ESCs, neuronal progenitor cells, and terminally differentiated fibroblasts revealed dynamic changes in the histone modification status of the lncRNA promoters in these three cell types. Moreover, they found that EzH2-mediated modification of histone H3K27 plays a key role in silencing lncRNAs in mouse ESCs [51]. Recent studies have identified many lncRNA transcripts that are differentially expressed in pluripotent cells, both ESCs and iPSCs [39,48,52], compared to fibroblasts or neuronal progenitors. In the following section, we discuss the differential expression patterns of lncRNAs in both ESCs and iPSCs.
Differential expression in ESCs
A recent study sought to identify lncRNAs that are differentially expressed among ESCs, fibroblast-derived iPSCs, and adult fibroblasts, from a pool of 900 lncRNAs. It uncovered over 100 lncRNAs showing significant overexpression in ESCs and iPSCs compared to fibroblasts [48]. The authors also identified 104 lncRNAs significantly repressed in both ESCs and iPSCs. Previously, an RNA-seq-based study identified 226 lncRNAs predominantly expressed in mouse ESCs [53]. The lncRNA microarray expression profiling study by Ng et al. [52], which compared human ESCs (hESCs) and human neural progenitor cells (NPCs), identified 36 lncRNAs showing significant overexpression in hESCs compared to NPCs, of which three lncRNAs were exclusively expressed in human ESCs and iPSCs. Two of these lncRNAs were found to bind the epigenetic regulator SUZ12 and the transcription factor SOX2. A study of noncoding RNA expression profiles of mouse ESCs differentiating as embryoid bodies (EBs) over a 16-day time course identified many lncRNAs associated with each of three stages of differentiation: pluripotency, primitive streak formation, and mesoderm differentiation [54]. Twelve lncRNAs were found whose expression profiles correlated with the expression of the pluripotency markers Oct4, Nanog, and Sox2, while 7 lncRNAs showed expression correlated with the EB differentiation marker genes Evx1 and brachyury (T).
The X-chromosome inactivator Xist, which represses one X-chromosome in female cells, is silenced in ESCs and is expressed only at the onset of differentiation. Female ESCs are marked by the presence of two active X-chromosomes, one of which becomes silent at the onset of cell differentiation. This phenomenon is a hallmark of the initiation of differentiation and of exit from the pluripotent state in female ESCs. X-chromosome inactivation (upon differentiation) and reactivation (in the case of iPSCs) are controlled by noncoding RNAs in the X-inactivation center (Xic), Xist being the central and most widely reviewed lncRNA in this area. Apart from Xist, other lncRNAs in the Xic also take part in controlling X-inactivation (Xi), including the lncRNAs Tsix, Jpx, and RepA [55–57]. These lncRNAs are in turn regulated by pluripotency-associated transcription factors such as Oct4, Nanog, and Sox2 in ESCs and iPSCs. A detailed account of this regulatory network in mouse ESCs and iPSCs can be found in the review by Kim et al. [58].
In addition, many well-annotated lncRNAs, such as NEAT1 and Rian, are differentially expressed in ESCs [39]. The lncRNA NEAT1 was undetected in human ESCs (hESCs) and was induced only upon differentiation of ESCs [59]. NEAT1 has been found to restrict the cytoplasmic export of mRNAs containing inverted repeat Alu (IRAlu) elements by facilitating nuclear paraspeckle formation in differentiated cells. The absence of NEAT1 in pluripotent hESCs seems to be responsible for the failure of nuclear paraspeckle formation in these cells, which in turn permits the cytoplasmic export of IRAlu-containing mRNAs [59]. One such example is the transcript of the pluripotency-associated gene Lin28. Lin28 is a very important factor for maintaining pluripotency and was one of the four factors used to reprogram human somatic cells into iPSCs [32]. In contrast, NEAT1 has been reported to be one of the mouse ESC-expressed lncRNAs that interact with a number of chromatin-binding proteins and complexes in mouse ESCs [39]. Table 1 gives a list of annotated lncRNAs showing differential expression in ESCs and iPSCs.
ESCs, embryonic stem cells; iPSCs, induced pluripotent stem cells; mESC, mouse ESC.
Differential expression in iPSCs
The study by Rinn and colleagues identified 26 lncRNAs that were overexpressed in both human ESCs and iPSCs, but were present at higher levels in iPSCs than in ESCs. The authors considered these lncRNAs potential candidates important for reprogramming [48].
The lncRNA transcripts Meg3, Meg9, and Rian in the imprinted Dlk1-Dio3 gene cluster on chromosome 12qF1, which are normally expressed in ESCs [39] and interact with the polycomb repressive complex PRC2 in mouse ESCs [60], were found to be aberrantly silenced in most of the iPSC clones in a comparison of genetically identical mESCs and mouse iPSCs. These iPSCs failed to support the development of entirely iPSC-derived animals [61]. Table 1 shows lncRNAs differentially expressed in iPSCs compared to ESCs.
LncRNAs Acting in Concert with Key Transcription Factors to Aid the Controlling of Pluripotency
LncRNAs are an integral part of the pluripotency-controlling network. Some of them are implicated in maintaining pluripotency and repressing differentiation and lineage programs [39], whereas others act in the opposite way, by promoting differentiation of pluripotent cells. In the section below, we discuss how lncRNAs act with pluripotency transcription factors in an orchestrated way to control pluripotency: they are transcriptionally activated or repressed by pluripotency transcription factors and then regulate gene expression to maintain the pluripotent state or to promote differentiation of pluripotent cells. In addition, we discuss the evidence for a potential regulatory feedback loop involving transcription factors and lncRNAs.
Transcriptional activation or repression of lncRNAs by pluripotency-associated transcription factors
The transcription factors that are key to maintaining pluripotency control the gene expression signature specific to pluripotent cells by transcriptionally activating thousands of genes in ESCs and also in iPSCs. As discussed in the previous section, hundreds of lncRNAs are differentially expressed in pluripotent cells, and many of them are likely to be regulated by the pluripotency-associated transcription factors. The recently published database ChIPBase [49] lists more than 8000 mouse lncRNAs that harbor binding sites for at least one of nine key pluripotency transcription factors (Oct4, Sox2, Nanog, c-Myc, n-Myc, Klf4, Zfx, E2F1, and Smad1) in their promoter region, as identified by analysis of ChIP-Seq data. In the following section, we discuss lncRNAs activated or repressed by pluripotency-associated transcription factors in both ESCs and iPSCs.
In ESCs
A search for binding sites of the pluripotency transcription factors Oct4 and Nanog in mouse ESC-expressed lncRNAs found 10% of the Oct4-binding sites in the vicinity of lncRNA genes [62]. A further search for candidates with strong genomic conservation in their exonic sequences yielded four candidate lncRNAs, two of which, namely, Gomafu (AK028326) and AK141205 (cis-antisense to the C18ORF22 homolog), were identified as direct targets of Oct4 and Nanog, respectively. shRNA-mediated knockdown of Oct4 and Nanog in mESCs changed the expression levels of all four lncRNAs with putative binding sites for these two transcription factors. Of the two lncRNAs directly regulated by Oct4 and Nanog, AK028326 is transcriptionally activated by Oct4, and AK141205 is transcriptionally repressed by Nanog. A recent study by Lander and colleagues found that ∼75% of all 226 lncRNAs expressed in mouse ESCs have binding sites for at least one of the 9 pluripotency-associated transcription factors they studied (Oct4, Sox2, Nanog, c-Myc, n-Myc, Klf4, Zfx, Smad, and Tcf3) [39]. This observation was further validated by shRNA-mediated knockdown of 11 pluripotency-associated transcription factors, upon which 60% of mESC-expressed lincRNAs showed significant downregulation.
In the previously mentioned study by Ng et al. [52], the authors identified three human ESC-specific lncRNAs likely to be regulated by pluripotency-associated transcription factors Oct4 and/or Nanog, as these lncRNAs showed decreased expression after RNAi knockdown of Nanog and/or Oct4 in hESCs.
X-chromosome inactivation at the onset of differentiation of female ESCs, and the role of noncoding RNAs in the X-inactivation center, were introduced above. The noncoding RNAs in this region are regulated by pluripotency-associated transcription factors such as Oct4, Sox2, and Nanog. The regulator of Xist transcription, the lncRNA Tsix, and its activator, the lncRNA Xite, have been identified as direct targets of Oct4 in mESCs. A second pluripotency transcription factor, Sox2, also binds Xite while interacting indirectly with Tsix through a looping interaction [63]. Xist itself contains Nanog-binding sites in its intron 1, and co-occupancy of Nanog and Oct4 can reduce the Xist RNA level in undifferentiated mESCs, most likely in a Tsix-independent manner, as Nanog-deficient mESCs showed elevated levels of Xist but no significant change in the Tsix RNA level. Interestingly, Oct4 and Sox2 are found to be associated with Xist intron 1 in Nanog-depleted mESCs [64].
In iPSCs
Activation of lncRNAs by pluripotency-associated transcription factors was also observed in iPSCs. The transcriptional signature of the ncRNAs in the X-inactivation center undergoes extensive changes when mouse somatic cells are reprogrammed to mouse iPSCs by introducing four key transcription factors: Oct4, Sox2, Klf4, and c-Myc. Xist becomes undetectable, Tsix expression becomes biallelic, and Xite is also expressed. These changes in the expression of lncRNAs in the X-inactivation center lead to reactivation of both X-chromosomes and eventually to a pluripotent state [65,66].
In their study to identify lncRNAs that play key roles in reprogramming somatic cells toward induced pluripotency, Rinn and colleagues [48] detected three lncRNAs, overexpressed in iPSCs (compared to ESCs), that have Oct4-binding sites in their promoters. The levels of all three lncRNAs, namely, lincRNA-SFMBT2, lincRNA-VLDLR, and lincRNA-ST8SIA3, fell after knockdown of Oct4 in the iPSC cell lines, with lincRNA-ST8SIA3 and lincRNA-SFMBT2 showing the strongest response.
Regulation of gene expression by lncRNAs to maintain pluripotency
LncRNAs are an integral part of the regulatory network controlling the gene expression signature specific to pluripotency. In the lincRNA loss-of-function study of Lander and colleagues [39], knockdown of 137 of the 147 lincRNAs successfully targeted by shRNAs in mouse ESCs induced significant changes in the gene expression profile. Knockdown of each of these 137 lincRNAs affected the expression of 175 genes on average, with a range of 20–936 for individual lincRNAs. This change in the gene expression signature was comparable to that obtained by knockdown of 40 known transcriptional or chromatin regulator genes that play a key part in pluripotency (38 of them induced expression changes in 207 genes on average, with a range of 28–1187 for individual regulator genes). Among these 147 mESC-expressed lincRNAs, 26 were identified that play key parts in the maintenance of pluripotency by inducing expression of pluripotency marker transcription factors such as Nanog and Oct4. Many of the lncRNAs studied repress lineage programs in mESCs. Knockdown of 30 lincRNAs induced differentiation of mESCs into specific lineages, which suggests that these lncRNAs normally act as a barrier to the differentiation programs.
A recent RNAi screening study by Buchholz and colleagues [67] identified three candidate lncRNAs in mouse ESCs (mESCs) with a potential role in maintaining pluripotency (based on loss of expression of Oct4-driven GFP after transfection of the lnc-esiRNA library), and referred to them as pluripotency-associated noncoding transcripts 1–3, or Panct1–3. They further examined the role of Panct1 (as it exhibited the strongest phenotype) in mESC pluripotency by an RNAi-mediated loss-of-function study and found that the mRNA levels of the pluripotency markers Oct4 and Nanog were reduced after Panct1 knockdown, while expression of lineage markers such as Gata6 (endoderm) and Fgf5 (ectoderm) increased. This points to a role for the lncRNA Panct1 in maintaining pluripotency in mESCs.
Rinn and colleagues [48] found that the iPSC-enriched lincRNA-ST8SIA3, named linc-RoR (Regulator of Reprogramming), has a functional role in iPSC colony formation by negatively regulating proapoptotic pathways, such as the p53 pathway and the DNA damage and oxidative stress response pathways. This was indicated by the upregulation of genes involved in these proapoptotic pathways upon knockdown of lincRNA-ST8SIA3 in iPSCs, which resulted in reduced iPSC colony formation [48].
Regulation of gene expression by lncRNAs to promote differentiation of pluripotent stem cells
Conversely, some lncRNAs promote differentiation. This class includes the lncRNA Mistral (Mira), which directs germ layer differentiation in mESCs by transcriptional activation of the homeotic genes Hoxa6 and Hoxa7. Mira is located in the spacer region between Hoxa6 and Hoxa7 and is transcriptionally silent in ESCs. Upon retinoic acid-induced activation, Mira activates transcription of Hoxa6 and Hoxa7 by recruiting the epigenetic activator MLL1 to chromatin, which in turn induces expression of the genes involved in germ layer specification [68]. Another lncRNA, HOTAIRM1, expressed from the HOXA cluster, undergoes a dramatic change in expression level upon neuronal differentiation from iPSCs and regulates genes from that cluster in cis [69].
The lncRNA–transcription factor feedback loop in pluripotency
The circuitry controlling pluripotency in ESCs and iPSCs involves a complex interplay between pluripotency-associated factors such as transcription factors, chromatin-modifying complexes, and regulatory RNAs. These factors are interlinked by potential feedback loops that maintain gene expression signatures specific to the pluripotent state, or promote differentiation under certain conditions. LncRNAs have also been identified as part of such feedback loops with pluripotency-associated transcription factors, fine-tuning their expression in pluripotent cells (see Fig. 1). The lncRNA Gomafu/Miat (AK028326) was identified to control Oct4 expression in a regulatory feedback loop in mouse ESCs [62]. This lncRNA is a direct target of Oct4 and is activated by Oct4 in mESCs. Knockdown of this lncRNA by siRNA resulted in decreased levels of Oct4 and Nanog in mESCs, which suggests that this lncRNA is involved in an autofeedback loop with its regulator Oct4 that maintains the Oct4 level in mESCs and thereby promotes pluripotency [62].

Fig. 1. LncRNA–transcription factor regulatory feedback loop in pluripotency.
Molecular Mechanisms for Regulation of Gene Expression by lncRNAs
LncRNAs are implicated in diverse regulatory roles in the epigenetic, transcriptional, and post-transcriptional regulation of gene expression, and we are just beginning to understand the molecular mechanisms that enable lncRNA molecules to perform such roles. In this context, a very interesting hypothesis by Guttman and Rinn [70] points to an intriguing new direction for future investigations of lncRNA functionality. In this section, we discuss different modes of regulation by lncRNAs in pluripotent cells, with specific examples.
Cis- or trans-acting lncRNAs in regulation of gene expression
Whether lncRNAs preferentially regulate gene expression in cis (regulating nearby genes) or in trans (regulating genes at distal loci) has been a matter of debate (see Fig. 2a). While some earlier studies claimed that lncRNAs mainly work in cis [71,72], this conclusion was contradicted by the study of Lander and colleagues [39], who found that only a small number of lincRNAs affect the expression of neighboring genes: knockdown of only 8 lincRNAs (out of 137 mESC-expressed lincRNAs) had any effect on the expression of genes located within 300 kb of those lincRNAs. This finding contrasted with an earlier report in which 7 of the 12 lincRNAs studied were found to regulate the expression of neighboring genes residing within 300 kb of the respective lincRNAs. Lander and colleagues argued that this apparent cis-regulatory behavior may be attributed to shared upstream regulation or local transcriptional effects.
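The 300 kb proximity criterion described above can be illustrated with a minimal sketch. This is not the pipeline used in any of the cited studies: the `Locus` type, the gene names, and the coordinates are hypothetical, and measuring distance as the gap between interval edges is an assumption. As Lander and colleagues caution, proximity alone does not establish a cis mechanism.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# 300 kb neighborhood, as in the knockdown studies discussed in the text
CIS_WINDOW_BP = 300_000


@dataclass(frozen=True)
class Locus:
    """A genomic interval (hypothetical names and coordinates)."""
    name: str
    chrom: str
    start: int
    end: int


def distance_bp(a: Locus, b: Locus) -> Optional[int]:
    """Gap between two loci in bp: 0 if they overlap, None if on different chromosomes."""
    if a.chrom != b.chrom:
        return None
    return max(0, max(a.start, b.start) - min(a.end, b.end))


def classify_targets(lncrna: Locus, affected_genes: List[Locus],
                     window: int = CIS_WINDOW_BP) -> Dict[str, List[str]]:
    """Split genes affected by a lncRNA knockdown into neighboring ('cis') and distal ('trans')."""
    result: Dict[str, List[str]] = {"cis": [], "trans": []}
    for gene in affected_genes:
        d = distance_bp(lncrna, gene)
        key = "cis" if d is not None and d <= window else "trans"
        result[key].append(gene.name)
    return result


# Toy example with made-up coordinates
lnc = Locus("lincX", "chr1", 1_000_000, 1_010_000)
genes = [
    Locus("geneA", "chr1", 1_200_000, 1_220_000),  # 190 kb away
    Locus("geneB", "chr1", 5_000_000, 5_020_000),  # same chromosome, distal
    Locus("geneC", "chr7", 1_000_000, 1_010_000),  # different chromosome
]
print(classify_targets(lnc, genes))
```

In this toy case only `geneA` falls inside the window, so it is the only candidate cis target; the other two affected genes would be counted as trans effects.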

Fig. 2. Molecular mechanisms of lncRNA regulation.
Recently, these conclusions were challenged by the findings of the GENCODE consortium under the ENCODE project [9]. From a pairwise expression correlation study, they found that nearly 40% of the GENCODE-annotated lncRNA transcripts overlap protein-coding genes, and that these lncRNAs showed a significantly stronger positive correlation of expression with the intersecting protein-coding genes than the mRNA–mRNA control pairs did [10]. In particular, lncRNAs in antisense orientation to their intersecting mRNA exons showed a more significant positive correlation of expression, pointing toward the cis-antisense regulatory mechanism previously discovered for some lncRNAs. In the GENCODE release, they reported a total of 187 antisense lncRNAs whose expression strongly correlated with that of their respective intersecting genes.
Leaving aside the debate over whether lncRNAs preferentially act in cis or in trans, there are several lines of evidence for both types of interaction in pluripotent cells. In addition to the previously mentioned loss-of-function study of 137 mESC-expressed lincRNAs by Lander and colleagues [39], in which most of the lincRNAs were found to affect the expression of distal genes, other lncRNAs studied in mouse ESCs also showed trans-acting regulation of gene expression. For example, the lncRNA AK028326 (Gomafu/Miat) was found to regulate Oct4 expression [62].
Antisense lncRNAs overlapping the Oct4 promoter and coding region and the Oct4 pseudogene 5 (Oct4-pg5) coding region were found to regulate expression of these genes in a cis-regulatory manner, as depletion of the antisense lncRNAs in MCF7 cells resulted in increased expression of Oct4 and Oct4-pg5 [73].
LncRNAs emerging as effector molecules in epigenetic regulation
How differential gene expression profiles are maintained in different cell types is a matter of immense interest in biological research. Transcriptional activation or inactivation of genes is largely attributed to epigenetic modulation such as chromatin remodeling, histone modification, or DNA methylation. Beyond tissue-specific gene expression, epigenetic modifications also play an important part in many diseases, including cancer. In the pluripotent state, chromatin is more permissive, which allows expression of the genes that maintain pluripotency. The protein complexes responsible for chromatin modification depend upon other cellular factors that help guide them to their targets. In recent years, a growing number of lncRNAs have been identified that mediate such epigenetic regulation by associating with chromatin-remodeling complexes. The role of lncRNAs as guides for epigenetic modulator proteins is well studied and well reviewed [15,20,74,75]. In this section, we focus on ESC- or iPSC-expressed lncRNAs that associate with chromatin-remodeling complexes and on their roles in pluripotency and in the differentiation of pluripotent cells.
LncRNAs involved in epigenetic modulation of gene expression related to the maintenance of pluripotent cells
Lander and colleagues found many mESC-expressed lincRNAs to be physically associated with chromatin-remodeling proteins in mESCs [39]. By immunoprecipitation of RNA–protein complexes in mESCs, they identified 24 lincRNAs, among the set of 226 previously recognized mESC-expressed lincRNAs, that bind the polycomb-silencing complex, which plays a critical role in mESC pluripotency. Besides the polycomb complex, they identified 11 additional lincRNA-associated chromatin-modifying complexes, involved in reading (PRC1, Cbx1, and Cbx3), writing (Tip60/P400, PRC2, Setd8, ESET, and Suv39h1), and erasing (Jarid1b, Jarid1c, and HDAC1) histone modifications, and also a DNA-binding protein, YY1, that is associated with chromatin. In total, 74 lincRNAs were found to be associated with at least one of these 12 chromatin complexes. Moreover, many of these lincRNAs interact with multiple chromatin-modifying complexes that are functionally interlinked with one another (a reader, writer, and eraser combination). For example, 17 lincRNAs were associated with the PRC2 complex (writer of H3K27-repressive marks), the PRC1 complex (reader of H3K27-repressive marks), and the Jarid1b complex (eraser of H3K4-activating marks). These findings suggest a model in which lncRNAs act as scaffolds, interacting with two or more protein complexes via distinct binding domains to bring these functionally related complexes together for epigenetic regulation, which is consistent with the hypothesis of Guttman and Rinn [70].
LncRNAs involved in epigenetic modulation of gene expression related to differentiation of pluripotent cells
An example of an lncRNA that promotes differentiation by epigenetic activation is Mistral (Mira), which recruits the epigenetic activator MLL1 to the Mira gene locus (see Fig. 2b) and triggers transcription of Hoxa6 and Hoxa7 [68] (discussed above).
The role of the lncRNA Xist in X-chromosome inactivation in female embryos has been studied extensively [76]. Xist is expressed only from the inactive X-chromosome and coats the entire chromosome in cis to epigenetically silence gene expression from that chromosome [77]. The epigenetic silencing of one X involves recruitment of the polycomb group to place the silencing epigenetic mark H3K27me3 on histones. The polycomb-silencing complex PRC2 is recruited to the chromosome by a noncoding RNA, called RepA, that originates from the 5′-end of Xist [58]. RepA directly interacts with EzH2, a catalytic subunit of PRC2, via a secondary structure within repeat A. RepA is activated before X-chromosome inactivation and aids the activation of Xist. Among the many regulators of Xist within the X-inactivation center in pluripotent female ESCs, perhaps the best known is the lncRNA Tsix, which is transcribed from both alleles and has a 40-kb region enriched for H3K4 methylation and H4 acetylation in the active state [78]. It mediates repression of Xist by physically associating with the DNA methyltransferase Dnmt3a at the Xist promoter, leading to DNA methylation and silencing of Xist only on the active X chromosome [55,79,80]. The interplay of these lncRNAs with epigenetic modulators, along with other pluripotency factors, determines the X-chromosome inactivation state in female ESCs, which is a marker of their differentiation status [14,58].
As discussed in the previous section, an lncRNA antisense to Oct4 pseudogene 5 (asOct4-pg5) was found to regulate expression of Oct4 in MCF7 cells, and its regulatory function is possibly mediated by chromatin modification at the Oct4 promoter. A loss-of-function study examining the effect of asOct4-pg5 on chromatin structure showed decreased levels of the repressive histone marks H3K9me2 and H3K27me3, and of Ezh2, at the Oct4 promoter (measured by chromatin immunoprecipitation (ChIP) 72 h after transfection of siRNA targeting asOct4-pg5). This leads to the speculation that asOct4-pg5 may function by guiding the epigenetic silencer Ezh2 to the promoter of Oct4 [73] in MCF7 cells. Because Oct4 is a very important pluripotency factor, such findings also motivate the investigation of similar types of Oct4 regulation in pluripotent cells.
LncRNAs in transcriptional and post-transcriptional regulation of gene expression
To date, most of the identified lncRNAs have been implicated in epigenetic regulation, acting as scaffolds for chromatin-modifying complexes. However, many lncRNAs have other modes of action, such as transcriptional or post-transcriptional regulation of gene expression. These functional roles are much less studied, but our understanding of them is expanding.
Transcriptional regulation
There are several lines of evidence that lncRNAs act in transcriptional regulation by directly associating with RNA pol II or transcription factors: they can act as decoys that bind transcription factors and prevent them from binding the promoter region of a gene [81], or they can block pol II or transcription factor binding and transcription initiation [82]. Detailed examples are discussed in the reviews by Chang and colleagues [15,20]. These examples of lncRNA-mediated transcriptional regulation are not specific to pluripotent cells. However, the finding that the pluripotency marker gene Oct4 is transcriptionally regulated by the antisense lncRNA asOct4-pg5 (discussed in the previous section) suggests possible roles for some ESC- or iPSC-expressed lncRNAs in transcriptional regulation [73].
Post-transcriptional regulation
There are also examples of post-transcriptional regulation of gene expression by lncRNAs: regulating the alternative splicing of pre-mRNA transcripts, controlling mRNA export into the cytoplasm, or interfering with the miRNA pathways. The lncRNA NEAT2 (also known as MALAT1) has been found to regulate pre-mRNA alternative splicing by modulating the cellular levels of active serine/arginine (SR) splicing factors [83].
Controlling nuclear retention of mRNAs
Another mode of regulation of gene expression by lncRNAs is control of the nuclear retention of mRNAs with inverted repeats (Alu repeats in humans). In many cell types, mRNAs containing Alu elements in their 3′-UTRs are inefficiently transported to the cytoplasm and are thus retained in the nucleus, leading to their loss of expression. Nuclear retention of these mRNAs correlates with paraspeckle formation in the nucleus. Expression of the lncRNA NEAT1 was undetectable in pluripotent human ESCs and appeared only upon differentiation; this correlates with the absence of nuclear paraspeckles in hESCs, whereas differentiated cells do form paraspeckles [59]. The absence of NEAT1 in hESCs resulted in efficient export of some pluripotency-associated mRNAs, such as Lin28, that contain inverted repeated Alu elements in their 3′-UTRs (Fig. 2c).
LncRNAs interfering with miRNAs
LncRNAs can also influence post-transcriptional regulation by interfering with the miRNA pathways, acting as competing endogenous RNAs (ceRNAs). These lncRNAs carry miRNA-binding sites and limit the pool of endogenous miRNAs available for binding their target mRNAs, thus reducing the repression of these mRNAs. This class of lncRNAs has been found to be an important regulator in cell cycle control and tumor suppression (e.g., PTEN-P1 blocking miR-19b and miR-20a from binding to the PTEN tumor suppressor), as well as in developmental stages (e.g., linc-MD1 blocking miR-133 and miR-135 from binding to transcription factors involved in myogenic differentiation) [84,85]. To date, there are no reports of this type of post-transcriptional regulation by lncRNAs in pluripotent cells; a detailed study in this direction may prove beneficial.
Discussion
LncRNAs, previously regarded as junk products, are being identified as important regulatory molecules that control lineage-specific gene expression and as important players in diseases such as cancer. The role of lncRNAs in controlling the epigenetic landscape during development has been reviewed previously, and their role in the maintenance of pluripotency has recently attracted attention [86–88]. Our review goes a step beyond and elucidates the functions of lncRNAs in two quite distinct aspects of pluripotent stem cells, namely, their ability to maintain the pluripotent state and, equally importantly, their capability to differentiate into any germ layer. These two aspects taken together have profound potential in regenerative medicine. Some recent discoveries regarding lncRNA expression and its interplay with other cellular factors, such as transcription factors and epigenetic modulators, in pluripotent ESCs as well as induced pluripotent cells take us a step further in understanding the intriguing cellular processes controlling pluripotency. The extent of lncRNA-mediated regulation and its molecular mechanisms are beginning to be understood. Functionally well-annotated lncRNAs were scarce until recently, but the recent release from the ENCODE project has cast light on the functioning of many previously unrecognized and unannotated large noncoding transcripts. The GENCODE consortium under the ENCODE project has already made available a list of 14,880 manually annotated lncRNAs in humans, with their expression profiles in 31 human tissues [10]. The consortium has also analyzed these transcripts for their modes of action (cis or trans); its findings in this regard are summarized in this review.
As recent findings suggest, lncRNAs play an integral part in the circuitry controlling pluripotency, along with specific transcription factors and chromatin complexes. They are activated by pluripotency-associated transcription factors, and some of them in turn regulate expression of these transcription factors through a feedback loop. Further large-scale studies are needed to understand this kind of regulatory feedback loop in ESCs and iPSCs. The vast, recently available datasets [49] of transcription factor binding sites in lncRNA regulatory regions, identified by ChIP-Seq, can be explored and further investigated for actual regulatory relationships between pluripotency-associated transcription factors and lncRNAs in pluripotent cells. The association of lncRNAs with chromatin-modifying complexes in pluripotent cells is well studied compared with their other regulatory mechanisms, such as transcriptional or post-transcriptional regulation. More studies are needed on the potential roles of lncRNAs in transcriptional and post-transcriptional regulation of gene expression in pluripotent cells.
Footnotes
Acknowledgments
This work was supported by intramural research funds of our host institutions.
Author Disclosure Statement
All authors have read and approved the manuscript and declare that they have no competing interests.
