Abstract
With their ability to integrate their genetic material into the target cell genome, retroviral vectors (RV) of both the gamma-retroviral (γ-RV) and lentiviral vector (LV) classes currently remain the most efficient and thus the system of choice for achieving transgene retention and therefore potentially long-term expression and therapeutic benefit. However, γ-RV and LV integration comes at a cost in that transcription units will be present within a native chromatin environment and thus be subject to epigenetic effects (DNA methylation, histone modifications) that can negatively impact on their function. Indeed, highly variable expression and silencing of γ-RV and LV transgenes, especially resulting from promoter DNA methylation, is well documented and was the cause of the failure of gene therapy in a clinical trial for X-linked chronic granulomatous disease. This review will critically explore the use of different classes of genetic control elements that can in principle reduce vector insertion site position effects and epigenetic-mediated silencing. These transcriptional regulatory elements broadly divide into either those with a chromatin boundary or border function (scaffold/matrix attachment regions, insulators) or those with a dominant chromatin remodeling and transcriptional activating capability (locus control regions, ubiquitous chromatin opening elements). All these types of elements have their strengths and weaknesses within the constraints of a γ-RV and LV backbone, showing varying degrees of efficacy in improving reproducibility and stability of transgene function. Combinations of boundary and chromatin remodeling/transcriptional activating elements that do not impede vector production, transduction efficiency, or stability are most likely to meet the requirements within a gene therapy context, especially when targeting a stem cell population.
Introduction
Despite efficient gene delivery that can be achieved with γ-RV and LV, their high propensity to undergo silencing via repressive epigenetic (DNA methylation, histone modification) effects, which can severely hamper therapeutic potential, is well documented (Bestor, 2000; Chang et al., 2006; Cherry et al., 2000; Ellis, 2005; Klug et al., 2000; Minoguchi and Iba, 2008; Mok et al., 2007; Pannell et al., 2000; Parera et al., 2004; Rosenqvist et al., 2002; Stein et al., 2010; Xia et al., 2007). Epigenetic-mediated silencing of γ-RV and LV is further compounded when these vectors are used to initially target a stem cell population. Upon differentiation into their specialized progeny, stem cells will undergo large-scale changes in gene expression patterns that will be associated with major alterations in their epigenetic landscape. In this scenario initial vector integration within a transcriptionally permissive location within a stem cell may subsequently acquire a repressive epigenetic state with concomitant transgene silencing in the differentiated cell population in which therapeutic benefit is sought.
Mechanisms that can lead to therapeutic gene silencing resulting in at least a variegated gene expression pattern can be classified under two broad headings: integration site position dependent and integration site position independent. Position-dependent silencing arises from vector integration either within or near regions of heterochromatin, both centromeric and noncentromeric. In this situation the repressive heterochromatin structure would spread and encompass the integrated vector cassette and silence expression. In position-independent transgene silencing, abrogation of expression takes place even with vector integration within a transcriptionally permissive region of chromatin. In this latter scenario silencing is via DNA methylation of the transgene control elements, especially the promoter region. Given that both γ-RV and LV show a marked preference for integration within or near actively transcribed genes and their regulatory elements, silencing of gene expression from these vectors has been observed to predominantly occur via promoter DNA methylation (for example, Bestor, 2000; Ellis, 2005; Minoguchi and Iba, 2008; Stein et al., 2010).
Curtailment of therapeutic benefit that can arise from transgene silencing is perhaps no better illustrated than by the outcome of a clinical trial for X-linked chronic granulomatous disease (X-CGD). In this trial a γ-RV containing a spleen focus forming virus (SFFV) promoter-enhancer element within the vector long terminal repeat (LTR) to drive the therapeutic gp91phox cDNA was employed in an ex vivo hematopoietic stem cell (HSC) procedure to treat two adult male X-CGD patients (Ott et al., 2006). Initially the outcome was good, with both patients showing restoration of active neutrophil cell populations resulting in the clearance of pre-existing pathogen infections. However, 27 months following gene therapy treatment one of the patients died following severe bacterial sepsis after colon perforation. Analysis of the patients' peripheral blood system revealed that despite neutrophil numbers being maintained, the SFFV-gp91phox transgene had become silenced and thus was no longer providing a therapeutic effect. Silencing was found to have resulted from DNA methylation of the promoter region of the SFFV element. However, the enhancer component of the SFFV control region had remained unmethylated and hence still active and able to participate in an insertional mutagenesis event resulting in host (EVI1) gene activation, leading to premalignant expansion of myeloid progenitor cells and hence maintenance of the neutrophil cell numbers (Stein et al., 2010).
Negating Variable Transgene Expression and Silencing
Broadly speaking there are two classes of transcriptional regulatory elements that can be used to reduce or avoid γ-RV and LV transgene silencing: First, elements with a border or boundary function, which include insulators and scaffold/matrix attachment regions (S/MARs); second, locus control regions (LCRs) and ubiquitous chromatin opening elements (UCOEs), which possess a dominant chromatin remodeling and transcriptional activating capability.
Since the use of insulator and S/MAR elements in LVs has recently been extensively reviewed (Emery, 2011; Ramezani and Hawley, 2010), we will focus predominantly on the application of genetic regulatory elements that possess a dominant chromatin remodeling and transcriptional activating function.
S/MARs and Insulators
The concept behind the use of S/MARs and insulators in LVs is that by flanking a transcription unit with these elements at both its 5′ and 3′ ends, one achieves two outcomes: first, a reduction in negative host genome–mediated position effects at sites of LV integration resulting in possibly higher but at least less variable overall levels of transgene expression between transduced cells, and second, a lower propensity of enhancer or LCR type regulatory elements within the LV to disturb host gene function and thus a reduced insertional mutagenesis potential (Emery, 2011; Ramezani and Hawley, 2010).
S/MARs are AT-rich sequences of between 200 and 5000 bp in length and function as chromatin domain boundary elements, shielding active regions and protecting them from the surrounding repressive heterochromatic environment by anchoring DNA to the nuclear protein skeleton (Heng et al., 2004; Ramezani and Hawley, 2010). In addition, S/MARs are also involved in other nuclear processes such as DNA replication (Wilson and Coverley, 2013) and gene expression (Bode et al., 2003). As with other DNA boundary elements such as insulators, DNA loop formation has been proposed as a mechanism of action for S/MAR-regulated transcription (Heng et al., 2004). A number of different S/MAR elements such as those from the human interferon-β (IFNB1) and mouse immunoglobulin-κ light chain (Igkc) loci have been incorporated in LVs (Park and Kay, 2001; Ramezani et al., 2003). In addition, the immunoglobulin heavy chain gene enhancer flanked by the associated MAR has been included in LVs designed for B-cell–specific expression (Lutzko et al., 2003; Taher et al., 2008). However, in these studies, which link the S/MAR element upstream of a heterologous promoter, the element confers at most a modest increase in transduction efficiency (Park and Kay, 2001). S/MARs in this LV design are able to at best partially negate integration site position effects but do not produce expression dependent on vector copy number (Ramezani et al., 2003). Any increase in transgene expression observed can be accounted for by an increase in transduction efficiency and thus higher LV copy number per cell rather than any enhancement of expression per vector delivered (Park and Kay, 2001). Furthermore, the functional definition of S/MARs as boundary elements demands that, to be used effectively, they should flank the transgene at both 5′ and 3′ ends rather than be present as a single copy upstream of the promoter.
Surprisingly, this transgene flanking S/MAR configuration, which can be readily achieved by incorporation within the LV LTRs as has been done for insulator elements, has not been tested. Therefore, the ability of S/MAR elements to provide a flanking border function and thus potentially confer reproducible, high, and stable transgene expression within an LV context remains unknown.
A generic, bioinformatics-led search of the human (Girod et al., 2007) and mouse (Harraghy et al., 2011) genomes for novel S/MARs has resulted in the discovery of elements with an exceptionally high S/MAR boundary function, which are able to give rise to very high and stable transgene expression in stably transfected mammalian cells. Unfortunately, the size (2–3 kb) of these human 1–68 and murine S4 S/MARs and the need for flanking the transgene cassette precludes their use in any current generation LV.
Insulators are DNA–protein complexes that are classically defined by two inherent properties. First, they have the ability to generate a region or “border” of open chromatin and thus negate the spread of repressive heterochromatin, and second, they possess an enhancer blocking function (Barkess and West, 2012; Ghirlando et al., 2012). However, more recently, insulators have also been found to be involved in chromatin domain loop formation and enhancer–promoter interactions, with the transcription factor CTCF being crucial in these processes (Krivega and Dean, 2012; Yang and Corces, 2012). Insulators have been incorporated within LVs by insertion within the vector 3′ LTR in place of the U3 region. Upon reverse transcription and integration of the vector provirus within the target cell genome, this results in the transcription unit being flanked by the insulator at both 5′ and 3′ ends, thereby providing a border function. A number of excellent reviews covering the use of insulators within γ-RV and LV have recently been published, to which the interested reader is referred (Emery, 2011; Ramezani and Hawley, 2010).
However, it is worth noting the following in connection with the forthcoming section on LCRs. Most studies using insulators in LV employ the cHS4 element from the chicken β-globin gene (HBB_CHICK) LCR (Emery, 2011; Perumbeti and Malik, 2010). Insertion of the full genomic 1.2-kb cHS4 can severely inhibit LV production and therefore compromise utility (Puthenveetil et al., 2004), although a reduced sized but still functional element of 650 bp has gone a considerable way to overcoming this limitation (Arumugam et al., 2009b). Finally, cHS4 is not universal in activity being functional in some, especially erythroid (Perumbeti and Malik, 2010), but not all cell types (Sharma et al., 2012).
Finally, it is also possible to combine S/MAR and insulator elements within the same LV with positive outcomes, which may give a superior boundary function to either element alone (Ramezani et al., 2003; Ramezani and Hawley, 2010).
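The 3′ LTR insertion strategy described above for insulators can be illustrated with a minimal sketch. It assumes a highly simplified model in which a vector is just an ordered list of element labels and in which the proviral U3 at both ends is copied from the 3′ U3 of the vector during reverse transcription. The function name, the element labels, and the example plasmid promoter are hypothetical and are used for illustration only.

```python
# Simplified model: a vector is an ordered list of element labels (not sequences).

def provirus_from_vector(five_prime_u3, internal_cassette, three_prime_u3):
    """Return the element order of the integrated provirus.

    Assumption: during reverse transcription the 3' U3 region serves as the
    template for the U3 of both proviral LTRs, so whatever replaces the 3' U3
    ends up at both ends. The 5' U3 of the transfer plasmid is not inherited
    and is therefore ignored (the argument is kept only to make this explicit).
    """
    ltr = list(three_prime_u3) + ["R", "U5"]
    return ltr + list(internal_cassette) + ltr


provirus = provirus_from_vector(
    five_prime_u3=["plasmid_promoter"],      # hypothetical; lost on integration
    internal_cassette=["promoter", "transgene"],
    three_prime_u3=["insulator"],            # insulator in place of U3
)
print(provirus)
# ['insulator', 'R', 'U5', 'promoter', 'transgene', 'insulator', 'R', 'U5']
```

In this toy model a single insertion in the 3′ LTR is sufficient for the transcription unit to be flanked on both sides, which is the configuration required for a border function.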
Dominant Chromatin Remodeling and Transcriptional Activating Elements
To date, two types of genetic regulatory elements have been described whose properties fall within this category: LCRs and UCOEs.
Locus control regions
LCRs are tissue-specific regulatory elements that are functionally defined by their ability to confer upon a gene linked in cis site-of-integration independent, full physiological levels of expression that is proportional to transgene copy number, properties that distinguish this class of element from classical enhancers (Li et al., 2002). The prototypical LCR is that present within the human β-globin gene (HBB) locus (Grosveld et al., 1987). This βLCR, which is responsible for the high-level, erythroid, and developmental stage–specific regulation of the HBB family, consists of five elements marked by a high degree of DNaseI hypersensitivity (designated HS1-HS5) and distributed over a 16- to 17-kb interval starting approximately 5 kb 5′ of the human ɛ-globin gene, HBE (Fig. 1A). Elements HS1–HS4 possess a transcriptional activating function, whereas HS5 is a developmental stage–specific insulator element (Palstra et al., 2008). Each HS consists of a 200- to 400-bp core region, which contains a high density of ubiquitous and tissue-specific transcription factor binding sites particularly those for GATA-1, NF-E2, EKLF, and Sp1 (Kim and Dean, 2012; Palstra et al., 2008). However, since this initial discovery LCRs associated with numerous other genes with different cell specificities have been discovered (Dean, 2006; Li et al., 2002). This includes the T cell–specific LCRs of human CD2 (Greaves et al., 1989; Lang et al., 1991); murine T helper cell (Th)2 interleukin (IL)-4, IL-5, and IL-13 cytokines (Fields et al., 2004; Lee et al., 2003); and α/δ TCR (Diaz et al., 1994; Gomos-Klein et al., 2007) gene loci; the pituitary-specific human growth hormone gene (GHN) cluster (Jones et al., 1995); and muscle-specific human desmin gene (DES) LCR (Raguz et al., 1998; Tam et al., 2006). 
Invariably LCRs have been found to consist of multiple elements and can be located either 5′ or 3′ of and at a considerable distance from the gene or genes that they regulate (Dean, 2006; Li et al., 2002). LCR-mediated gene activation occurs via a DNA looping mechanism (Dean, 2011), which brings together the regulatory element and gene promoter to form a folded structure designated as an active chromatin hub (ACH; de Laat and Grosveld, 2003; Kim and Dean, 2012; Palstra et al., 2008). It is conceivable that some of the large number of potential genetic regulatory elements recently discovered within the vast expanse of the noncoding region of the human genome may constitute LCRs (Pennisi, 2012; Sanyal et al., 2012).
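The functional definition of an LCR given above (integration site independent expression that is proportional to transgene copy number) can be expressed as a simple numerical check. The following is a minimal sketch only: the function name is hypothetical, the clone data are invented purely for illustration and are not taken from any cited study, and the coefficient of variation of expression per copy is just one of several possible measures.

```python
from statistics import mean, pstdev


def per_copy_cv(clones):
    """Coefficient of variation of expression per transgene copy.

    clones: list of (copy_number, expression) pairs with copy_number > 0.
    A value near 0 means expression scales with copy number regardless of
    the integration site; a large value indicates position effects.
    """
    per_copy = [expression / copies for copies, expression in clones]
    return pstdev(per_copy) / mean(per_copy)


# Invented illustrative data (arbitrary expression units).
lcr_like = [(1, 10.0), (2, 20.0), (4, 40.0)]          # strictly proportional
position_dependent = [(1, 2.0), (2, 30.0), (4, 8.0)]  # site-dependent scatter

print(per_copy_cv(lcr_like))
print(per_copy_cv(position_dependent))
```

In this sketch the strictly proportional data set gives a coefficient of variation of zero, whereas the scattered data set gives a value close to one.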

Fig. 1. The human β-globin gene (HBB) locus control region (LCR) and its incorporation within lentiviral vectors.
Although very potent and with clear potential utility to be employed within expression vectors, LCRs possess a number of features, which are a drawback to their use in either γ-RV or LV, namely their multicomponent complexity, overall size, and requirement for sufficient distance between elements and gene promoters to allow effective ACH formation and efficient transgene expression. Indeed, to date only the CD2 and HBB LCRs have been incorporated within these vector systems.
The CD2 LCR is T-cell–specific in function and consists of three elements distributed over 2 kb and located within the 3′ flanking region of CD2 (Festenstein et al., 1996; Greaves et al., 1989; Lang et al., 1991). The incorporation of the complete 2-kb CD2 LCR originally within γ-RV gave disappointing results, with this element reported to be unable to either improve expression or negate transgene silencing in vivo (Kaptein et al., 1998). However, subsequent studies using both γ-RV (Indraccolo et al., 2001) and LV (Indraccolo et al., 2001; Kowolik et al., 2001) showed that the full 2-kb element was able to confer expression on heterologous (LTR, SV40) promoters that was elevated, more reproducible, stable, and proportional to transgene copy number in the T-cell lineage both in vitro and more importantly in vivo. The inclusion of the CD2 LCR compromised γ-RV but not LV titer (Indraccolo et al., 2001; Kowolik et al., 2001). These latter studies clearly demonstrated the potential utility of the CD2 LCR to provide the required long-term therapeutic gene expression within a gene therapy context. Despite these encouraging results LVs harboring genes of interest under CD2 LCR control have yet to be tested within a disease animal model system.
Gene therapy for the hemoglobinopathies (thalassemia, sickle cell anemia) requires high level (25%–50% of normal) production of globin polypeptide chains specifically within the red blood cell lineage following genetic correction of HSCs (see Higgs et al., 2012). Work with γ-RVs containing an HBB with local erythroid-specific promoter and enhancer elements but devoid of the βLCR gave rise to at best a variable level of expression of approximately 1%–4% of normal levels following ex vivo transduction and transplantation of bone marrow cells in mice, which is well below the therapeutic threshold (e.g., Dzierzak et al., 1988). It is therefore true to say that it was only the discovery of the βLCR that made gene therapy for the hemoglobinopathies feasible, because only with its inclusion can the very high level, erythroid-specific expression of globin polypeptide chains required to effect therapy, in principle, be achieved. Therefore, the βLCR has been extensively studied initially within γ-RV and subsequently in LV in an effort to develop effective vectors for these conditions. Due to space limitations for transgenes within γ-RV and LV, reduced sized “mini” versions of both HBB and also the βLCR have had to be employed. In addition, the incorporation of βLCR elements in γ-RV severely compromised vector production and proviral stability (Leboulch et al., 1994). Proviral instability could to a large degree be overcome by extensive mutagenesis of cryptic RNA splicing and polyadenylation sites within the γ-RV and mini-HBB βLCR transcription unit (Leboulch et al., 1994). However, the need to use highly reduced sized βLCR elements consisting of combinations of the core HS regions, although retaining some erythroid enhancer activity, resulted in a loss of LCR function (Leboulch et al., 1994; Sadelain et al., 1995).
The breakthrough in the field occurred with the report of the first LV (named “TNS9”) harboring a βLCR-HBB transcription unit (May et al., 2000). This study showed that the ability of LVs to accommodate more complex transcription units without compromising titer allowed the incorporation of a much larger (3.2 kb) βLCR combining the most transcriptionally active elements HS2, HS3, and HS4 linked to a mini-HBB plus its 3′ enhancer (Fig. 1B). The greater functionality of this larger βLCR element combination than was previously possible in γ-RV gave rise to a much higher average level of HBB expression, resulting in amelioration of severe β-thalassemia intermedia (May et al., 2000) and thalassemia major (Rivella et al., 2003) in mouse model systems. Since these initial reports, several other βLCR-HBB–based LVs with similar design features to TNS9 have been described (Arumugam and Malik, 2010; Dong et al., 2013). All contain a βLCR with elements HS2, HS3, and HS4, with the exception of GLOBE, which contains elements HS2 and HS3 alone (Miccio et al., 2008). In addition, all contain a mini-HBB including its two introns to allow efficient posttranscriptional processing and stable accumulation of cytoplasmic mRNA (Antoniou et al., 1998). All have reported encouraging results in mouse model systems of β-thalassemia intermedia, β-thalassemia major, and sickle cell disease with correction of the disease phenotype (Arumugam and Malik, 2010; Dong et al., 2013; Perumbeti and Malik, 2010). In addition, it was discovered that GLOBE LV genetically corrected erythroid cells have a selective growth and survival advantage over noncorrected cells and thus accumulate over time (Miccio et al., 2008). This encouragingly suggests that only partial myeloablation in connection with the ex vivo HSC transplant procedure along with relatively few LV transduced HSCs may be adequate to effect therapy.
Furthermore, genetic correction of HSCs derived from patients with β-thalassemia has resulted in restoration of HBB chain balance both in vitro (Roselli et al., 2010) and in NOD-SCID mice in vivo (Puthenveetil et al., 2004).
It is important to note that despite the use of larger fragments of the βLCR within LVs providing higher and more consistent levels of mini-HBB transgene expression, full activity of this regulatory element has not been obtained. Data from both in vitro (May et al., 2000) and in vivo (Arumugam et al., 2007) studies show a high degree of variation in expression between transduced cells, highlighting the influence of position effects at LV sites of integration. This results in only a proportion of delivered mini-HBB βLCR LVs (at best ∼50%) able to express within a therapeutic range. The addition of βLCR element HS1 to sites HS2, HS3, and HS4 can improve expression on average by approximately 50% per LV copy but again does not restore full βLCR activity capable of providing vector copy number dependent levels of transcription (Lisowski and Sadelain, 2007). Improvements in transgene expression in terms of reproducibility are more significantly enhanced by flanking the mini-HBB βLCR cassette with the cHS4 insulator element, which can reduce position effects and thereby increase the probability of a given LV expressing at a therapeutically relevant level (Arumugam et al., 2007). Other elements that have shown encouraging insulator function within mini-HBB βLCR LVs are the HS2 element from the GATA1 control region (Miccio et al., 2011) and the ankyrin 5′ HS barrier insulator (Breda et al., 2012). As in the case of cHS4, both of these elements augmented mini-HBB βLCR function by reducing site of integration position effects, providing a more reproducible level of gene expression and therapeutic benefit at lower vector copy number per cell.
However, collectively, currently available data on all mini-HBB βLCR LV designs suggest that an average of at least two vector copies per cell will in all likelihood be required for curative therapy of thalassemia major, which raises some concerns about possible insertional mutagenesis. Therefore, from both an efficacy and safety perspective, advances in LV design that can effect curative therapy for thalassemia major and sickle cell anemia at a single vector copy per cell remain an important challenge for the field.
Nevertheless, given the encouraging results obtained in preclinical studies, a number of clinical trials with various mini-HBB βLCR LV designs targeting both β-thalassemia and sickle cell anemia are planned or on-going. Thus far there has been one report on a single transfusion dependent thalassemia intermedia patient who was a compound heterozygote with β0 and βE alleles (Cavazzana-Calvo et al., 2010). Part of a trial that commenced in 2006 and has treated three individuals to date, this patient underwent an ex vivo LV transduction and transplantation of their HSCs and, controversially, with concomitant full myeloablative chemotherapy. Although this patient eventually stabilized at 9–10 g hemoglobin (Hb)/dL and became transfusion independent, the outcome constitutes only a partial success of the gene therapy. This is due to the fact that only a third of the total Hb was vector derived, with the rest equally divided between elevated fetal HBG and residual HbE chains (Cavazzana-Calvo et al., 2010). In addition, up to 50% of the LV-derived HBB chains were produced from a single clonally expanded myeloid progenitor cell. This clonal dominance resulted from an insertional mutagenesis event in which the LV had integrated within the third intron of HMGA2 and gave rise to aberrant splicing of the third exon of the host gene onto a cryptic splice acceptor site within the core cHS4 insulator elements present in the LV LTRs. This aberrant splicing resulted in production of a truncated HMGA2 mRNA devoid of let-7 miRNA regulatory target sequences and a protein product encoded by only the first three exons of the gene at highly elevated levels, with consequent downstream disturbances in gene function. Given the special nature of and outcomes in this case it is therefore not possible to deduce whether the LV used is capable of ameliorating the condition of patients with β-thalassemia major, who possess little or no HBB-like chains. 
(Note: the other two patients treated in this trial have been uninformative with regard to the efficacy of the LV and procedure used. The first patient treated failed to engraft with their LV-transduced HSCs, and the third patient at 16 months posttreatment is showing only very low levels of vector presence in their circulation and thus no therapeutic benefit.)
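The hemoglobin composition reported for the transfusion-independent patient described above can be made concrete with a short worked calculation. This is a sketch based only on the approximate figures quoted in the text (9–10 g Hb/dL in total, one third vector derived, the remainder split equally between fetal and HbE chains, and up to 50% of the vector-derived chains coming from the dominant clone). The function and key names are hypothetical, and the clonal contribution is treated as an upper bound.

```python
def hb_breakdown(total_hb_g_dl, vector_fraction=1 / 3, clone_share_of_vector=0.5):
    """Split total hemoglobin (g/dL) into the components quoted in the text.

    vector_fraction: share of total Hb that is vector derived (about one third).
    clone_share_of_vector: upper bound on the share of vector-derived Hb
    attributed to the single dominant clone (up to 50%).
    """
    vector = total_hb_g_dl * vector_fraction
    remainder = total_hb_g_dl - vector
    return {
        "vector_derived": vector,
        "fetal": remainder / 2,          # remainder divided equally
        "hbe": remainder / 2,
        "dominant_clone_max": vector * clone_share_of_vector,
    }


for total in (9.0, 10.0):
    parts = hb_breakdown(total)
    print(total, {name: round(value, 2) for name, value in parts.items()})
```

On these figures roughly 3 g/dL of hemoglobin is vector derived, of which up to about half may originate from the single expanded clone, which illustrates why the outcome is regarded as only a partial success.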
βLCR-heterologous promoter interactions
The βLCR can activate some but not all heterologous promoters within the erythroid cell compartment (Blom van Assendelft et al., 1989; Collis et al., 1990; Noordermeer et al., 2008). This property has recently been exploited to develop a novel LV design to potentially provide systemic detoxification in patients with adenosine deaminase–deficient severe combined immune deficiency (SCID-ADA) (Montiel-Equihua et al., 2012). Linkage of the βLCR consisting of elements HS2, HS3, and HS4 upstream of the short version of the human elongation factor 1-alpha 1 gene (EEF1A1) promoter (EF1α) gave rise to a greater than 20-fold higher level of expression of both an eGFP reporter and ADA therapeutic gene within the erythroid compared with other hematopoietic lineages in mice following ex vivo transduction and transplantation of HSCs. This outcome suggests that a combination of the βLCR with an appropriate housekeeping gene promoter can provide two levels of therapy in SCID-ADA: correction of the immediate T-cell deficiency by expression of the therapeutic gene within the lymphoid system via the housekeeping gene promoter, as exemplified here by EF1α, and systemic detoxification of adenosine and deoxyadenosine metabolites by high-level expression of ADA within red blood cells. Evidently this system is generic in nature and could be adapted for other diseases in which the red blood cell can be converted into a “protein factory” to secrete a missing factor into the circulation, as has been demonstrated for blood clotting factor IX (Sadelain et al., 2009).
Insertional mutagenesis potential of the βLCR
The potent transcriptional activating potential of the βLCR, including at long distances, raises the possibility that this element within LVs can result in host gene activation following proviral integration, with the potential to give rise to insertional mutagenesis events. This possibility is amplified by the fact that the βLCR has a demonstrated activation capability for heterologous promoters (Blom van Assendelft et al., 1989; Collis et al., 1990; Noordermeer et al., 2008). Indeed, βLCR-based LVs have been shown to activate host genes in erythroid cells in vitro (Arumugam et al., 2009a; Montiel-Equihua et al., 2012) and in vivo in mice (Hargrove et al., 2008) up to 100 kb from the site of vector integration. Given that the mechanism of action of the βLCR is to engage and enhance transcription from the nearest activatable promoter via ACH formation (de Laat and Grosveld, 2003; Palstra et al., 2008), this finding may initially appear to be surprising. Since the nearest promoter will always be that linked with the βLCR within the LV, it would be expected to have a very low propensity to activate host genes. There are two factors contributing to βLCR-promoter ACH formation that can lead to the observed host gene activation. First, the ACH is a dynamic structure with the βLCR engaging and disengaging a given promoter or promoters in an on-off or “flip-flop” mechanism (Wijgerde et al., 1995). Second, the close proximity of the βLCR to the linked promoter within the LV may be a hindrance to stable ACH formation. These two factors may therefore result in the βLCR separating from its linked promoter at a high frequency and becoming associated with a more distant host gene, which allows formation of a more stable ACH structure. However, the host gene activation potential of the βLCR can be reduced by the incorporation of cHS4 insulator elements (Arumugam et al., 2009a). 
In addition, the use of βLCR elements HS2 and HS3 alone as in the GLOBE LV (Miccio et al., 2008) also has a very low host gene activation potential (Roselli et al., 2010). This latter observation suggests that βLCR element HS4 is required in addition to HS2 and HS3 for most βLCR–host gene promoter interactions to take place. It is important to note that to date no insertional mutagenesis events arising from βLCR-mediated host gene activation have been reported.
Since LVs preferentially integrate within actively transcribed genes, there is also the potential of insertional mutagenesis arising from aberrant splicing (Cavazza et al., 2013). It has now been shown that fusion transcripts can occur between host gene and LV sequences via aberrant splicing onto primarily cryptic splice acceptor sites particularly within βLCR element HS3 (Cavazza et al., 2013; Moiani et al., 2012). Although readily detectable, the frequency of these aberrant fusion transcripts was far less than the correctly spliced native mRNA molecules. In addition, it is not known if any stable protein product that could potentially cause harm is actually produced from these aberrant spliced mRNA molecules.
Finally, encouragingly from a safety perspective an LV with βLCR elements HS2, HS3, and HS4 scored negative in an in vitro murine HSC transformation assay (Montiel-Equihua et al., 2012).
Ubiquitous chromatin opening elements
An investigation of the HNRPA2B1-CBX3 and TBP-PSMB1 housekeeping gene loci revealed that the regions consisting of the closely spaced, dual divergently transcribed promoters of these genes, which are encompassed by a methylation-free CpG island are able to confer reproducible and stable transgene expression including when integrated within centromeric heterochromatin (Antoniou et al., 2003). This gave rise to these genomic structures being designated as UCOEs (Fig. 2A; Williams et al., 2005). The prototypical UCOEs are those present at the HNRPA2B1-CBX3 and TBP-PSMB1 loci (Antoniou et al., 2003; Harland et al., 2002; Williams et al., 2005).

Fig. 2. The UCOE from the human HNRPA2B1-CBX3 locus.
The current working model of UCOE mechanism of action is multicomponent in nature based upon the following observations. First, the native HNRPA2B1-CBX3 locus has a distinctive epigenetic signature consisting of (i) a region of unmethylated DNA extending to twice the length (5 kb) of the methylation-free CpG island, (ii) promoter regions relatively depleted of nucleosomes, and (iii) an array of active histone modification marks at the HNRPA2B1 and CBX3 promoters and transcribed portion of these genes (Lindahl Allen and Antoniou, 2007). Second, it is known that transcription elongation possesses an inherent chromatin opening capability due to the association of RNA Pol II with nucleosome-remodeling and histone-modifying enzymes (Murawska and Brehm, 2011; Smolle et al., 2013; Winkler and Luger, 2011). It is therefore proposed that the UCOE generates a region of transcriptionally permissive chromatin at ectopic transgene integration sites by recapitulating an extended region of methylation-free DNA with its associated active histone modification marks coupled with divergent transcription from the closely spaced promoters (Antoniou et al., 2003; Lindahl Allen and Antoniou, 2007). This model predicts that for UCOE function to be preserved, a minimum length of the CpG island needs to be present as well as the two divergently transcribing promoters. Initial evidence in support of this model is the observation that transcription from the HNRPA2B1 promoter alone, that is, with the CBX3 promoter deleted, is subject to progressive silencing in stably transfected cells (Antoniou et al., 2003).
Expression vectors based on the UCOE from the HNRPA2B1-CBX3 locus (A2UCOE) have shown great success in expediting the generation of mammalian tissue culture cell lines for the production of proteins (Benton et al., 2002; Boscolo et al., 2012; Nair et al., 2011; Williams et al., 2005) including at commercial scales (John Wynne, Merck-Millipore Corporation, personal communication). In addition, transgenes under control of the A2UCOE have been shown to efficiently function within transgenic mice (Katsantoni et al., 2007).
In more recent years A2UCOE has been adapted for use within LVs. Based on previous work in stably transfected cell lines (Antoniou et al., 2003; Benton et al., 2002; Boscolo et al., 2012; Harland et al., 2002; Nair et al., 2011; Williams et al., 2005) and transgenic mice (Katsantoni et al., 2007), it is evident that A2UCOE can be used to provide stable gene expression from within two distinct configurations. First, a gene of interest can be expressed directly from the innate HNRPA2B1 promoter (Antoniou et al., 2003), and second the A2UCOE can be used to stabilize expression from a linked heterologous promoter (Williams et al., 2005). Both of these vector designs have been tested within LVs (Fig. 2B).
The first study of A2UCOE within LVs (Zhang et al., 2007) utilized the HNRPA2B1 promoter to directly drive expression of a gene of interest (Fig. 2Bi). The 2.2 kb (and slightly larger 2.5 kb) A2UCOE fragment employed extends from immediately upstream of the translational start codon within exon I of HNRPA2B1 to just upstream of the second exon of CBX3 (Fig. 2A). This 2.2UCOE therefore also contains both alternative first exons of CBX3 and preserves the dual divergent transcriptional nature of this element (Fig. 2A). The 2.2UCOE, which had previously been shown to provide complete stability of expression in stably transfected cell lines (Antoniou et al., 2003), was compared with LVs of similar design in which the gene of interest was under control of either the human cytomegalovirus (CMV) or SFFV promoter/enhancer elements. This study clearly showed that the 2.2UCOE was able to provide not only a higher level of enhanced green fluorescent protein (eGFP) reporter gene expression but also a far greater degree of reproducibility, both within all peripheral blood cell lineages and within bone marrow cells, following ex vivo transduction and transplantation in mice of lin− HSCs. In addition, detailed cellular and molecular analysis revealed that the 2.2UCOE conferred expression proportional to transgene copy number and that there was a one-to-one correlation between average LV copy number per cell and the percentage of eGFP-positive cells, suggesting that essentially all 2.2UCOE vectors delivered were active. This was in marked contrast to the SFFV- and CMV-regulated LVs, which showed a ratio of 1:9 and 1:90 between eGFP-positive cells and average vector copy number per cell, a clear indication that the vast majority of the vectors delivered had become silenced.
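The copy-number arithmetic behind this comparison can be made explicit. The following is a minimal Python sketch and is not taken from the original study: it assumes, as a simplification, that the quoted ratio of eGFP-positive cells to average vector copy number can be read directly as the fraction of integrated vector copies that remain active. The function name `active_fraction` and the dictionary labels are hypothetical.

```python
def active_fraction(positive_cells: float, copies_per_cell: float) -> float:
    """Approximate fraction of integrated vector copies still expressing.

    Assumption (illustrative only): the ratio of eGFP-positive cells to
    average vector copy number per cell is read directly as the share of
    delivered vectors that remain active.
    """
    if copies_per_cell <= 0:
        raise ValueError("copies_per_cell must be positive")
    return positive_cells / copies_per_cell


# Ratios as quoted in the text (eGFP-positive cells : vector copy number).
ratios = {"1:1": (1, 1), "1:9": (1, 9), "1:90": (1, 90)}
for label, (pos, vcn) in ratios.items():
    print(f"{label}: ~{active_fraction(pos, vcn):.1%} of vector copies active")
```

Under this reading, a 1:1 ratio corresponds to essentially all copies being active, whereas ratios of 1:9 and 1:90 correspond to roughly 11% and 1% of copies, respectively.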
Furthermore, a functional comparison was undertaken with either the 2.2UCOE or SFFV elements driving expression of the interleukin receptor common γ-chain gene (IL2RG), which is mutated in X-linked SCID (SCID-X1). This showed that the 2.2UCOE-IL2RG construct was able to completely rescue the SCID-X1 phenotype at a significantly lower LV copy number per cell than the SFFV-IL2RG vector, not only in cell lines but, more importantly, in a mouse model of this condition following ex vivo HSC transduction and transplantation. Finally, analysis in cell lines showed that the 2.2UCOE lacks classical enhancer activity, implying that it has a far lower insertional mutagenesis activation potential than enhancer-containing regulatory elements (Zhang et al., 2007).
In a follow-up investigation, the mechanisms by which the A2UCOE can resist silencing were investigated (Zhang et al., 2010). Initial expression studies in murine embryonal carcinoma P19 cells showed that expression from the 2.2UCOE-eGFP LV cassette was unaltered over a prolonged (>40 days) period of continuous culture at an average vector copy number of one per cell. In marked contrast, expression from LVs containing SFFV-eGFP was almost completely silenced within 14–21 days even at a vector copy number of eight per cell. Partial silencing was also observed with LVs harboring eGFP under control of the short version of the EF1α promoter at multiple (two to three) vector copies per cell, with almost total silencing at one vector copy. Molecular genetic analysis revealed that stability of expression from the 2.2UCOE was associated with resistance to DNA methylation. In addition, either the 1.5UCOE or 1.2UCOE subfragment was able to confer resistance to DNA methylation and thus maintain expression from a linked SFFV promoter, as had previously been shown for the 1.5UCOE-CMV promoter combination in stably transfected tissue culture cells (Williams et al., 2005). These in vitro studies were confirmed and extended in vivo by employing ex vivo transduction and transplantation of HSC in mice, where the 2.2UCOE vector maintained its nonmethylated DNA and expression status in both primary and secondary transplant recipients. On the other hand, the SFFV-eGFP LV was found to undergo progressive DNA methylation of its promoter (but not enhancer) region, as had been observed in the clinical trials of X-linked CGD (Stein et al., 2010).
A foamy virus class of RV containing transgenes under control of a 0.6-kb subfragment of the 2.2UCOE has also recently been described (Uchiyama et al., 2012). This 0.6UCOE was truncated at the CBX3 end of the element and thus had most of the CpG island deleted and lacked divergent transcription (Fig. 2Biv). Nevertheless, the authors of this study showed that the 0.6UCOE can provide stable expression of both an eGFP reporter and the Wiskott–Aldrich syndrome protein gene (WAS) in mice. This finding is surprising since other studies showed that expression from the HNRPA2B1 promoter from either an A2UCOE deletion mutant lacking the CBX3 promoter (Antoniou et al., 2003) or deleted to within the first alternative first exon of CBX3 (Knight et al., 2012) was markedly unstable in stably transfected and LV-transduced cells, respectively. The reason for this discrepancy is at present unknown.
UCOE-based LVs have also been employed to develop a system for the rapid generation of mammalian cell lines for protein production (Bandaranayake et al., 2011). The data presented suggest that a 0.7-kb subfragment of the A2UCOE from the region within the first intron of CBX3 encompassing the last 500 bp of the methylation-free CpG island, but devoid of both the CBX3 and HNRPA2B1 promoters (Fig. 2A; 0.7UCOE), is sufficient to stabilize expression from a linked CMV promoter (Fig. 2Biii). Taken at face value, these data would suggest that the dual, divergent transcriptional component of the current model of UCOE mechanism of action is unnecessary. However, it is important to note that the average 0.7UCOE LV copy number per cell was 8–10 and that, despite this, the mean fluorescence intensity had decreased by more than 50% at 4 weeks posttransduction, suggesting extensive transgene silencing. This implies that the 0.7UCOE can at best provide only partial protection against silencing when compared to the fully active 1.5UCOE in the same CHO cell environment (Williams et al., 2005). Further analysis of the 0.7UCOE at low (one to two) LV copy number in other relevant cell types is clearly required before any conclusion can be drawn about its potential utility within a gene therapy context.
In addition to conferring elevated levels and stability of expression from linked viral gene promoters (Williams et al., 2005; Zhang et al., 2010), the A2UCOE can also be used to augment the function of tissue-specific promoters without compromising specificity (Brendel et al., 2012; Talbot et al., 2010). In particular, the 1.5-kb core A2UCOE region (1.5UCOE, Fig. 2Bii) stabilizes expression from the myeloid-specific myeloid-related protein 8 gene (MRP8) promoter both in vitro and in vivo in mice and confers expression proportional to LV copy number (Brendel et al., 2012). In addition, a 1.5UCOE-MRP8-gp91phox LV construct, used in an ex vivo HSC transduction and transplantation procedure, efficiently rescued the X-CGD phenotype in a mouse model of this disease at low (0.5–2) vector copy number per cell (Brendel et al., 2012).
Most recently, the 1.5UCOE has been shown to confer stability of expression upon the short EF1α promoter within an LV context upon differentiation of induced pluripotent stem and embryonic stem cells into cell types representative of all three developmental germ layers such as those of the hematopoietic system and neurons. Again, as observed with other A2UCOE-heterologous promoter combinations, negation of silencing was associated with resistance to promoter DNA methylation (Pfaff et al., 2013).
Interestingly, although the A2UCOE conferred stability of expression upon a CMV promoter in stably transfected CHO cells when linked in either orientation (Williams et al., 2005), this has not been found to be the case within other systems. Stability of expression was achieved on the SFFV (Zhang et al., 2010) and MRP8 (Brendel et al., 2012) promoters only with the CBX3 end of the A2UCOE (Fig. 2) juxtaposed to these heterologous elements. The reasons for this orientation-dependent A2UCOE function in certain situations are not at present known but may reflect differences in levels of divergent transcription from the HNRPA2B1 and CBX3 promoters in different cell types and hence a differing ability to remodel chromatin via transcriptional elongation.
UCOE Insertional Mutagenesis Potential
The enhancer-less nature of the A2UCOE (Zhang et al., 2007) suggests it possesses a low risk of insertional mutagenesis by transgene enhancer-mediated host gene activation. However, the A2UCOE possesses other inherent features, which may result in a disturbance or disruption of host gene function at sites of LV integration. First, the divergent transcription from the A2UCOE may run through neighboring promoters at integration sites disturbing their activity. Second, the splice donor sites within the variants of the A2UCOE used in different LV configurations (Fig. 2B) can result in aberrant splicing, giving rise to mutant mRNA and protein species, which may have physiological consequences as has been observed in the LV gene therapy trial for β-thalassemia (Cavazzana-Calvo et al., 2010). Indeed, aberrant splicing between host sequences and LVs has now been shown to be a frequent occurrence due to vector propensity to integrate within actively transcribed genes (Cavazza et al., 2013).
The potential of A2UCOE-based LVs in either the 2.2UCOE or 1.5UCOE-heterologous promoter type configurations to produce aberrantly spliced transcripts has recently been reported (Knight et al., 2012). As expected, fusion transcripts were produced from either native or cryptic splice donor sites within the A2UCOE and host gene sequences. In particular, LV integration within the first intron of the growth hormone receptor gene (Ghr) in IL-3–dependent murine Bcl-15 cells resulted in its increased expression. This in turn rendered these cells independent of IL-3 in the presence of growth hormone. In addition, aberrant fusion mRNAs were formed between the A2UCOE and a number of other cellular genes in transduced human PLB-985 myelomonocytic cells. However, point mutations that abolish both native and potential cryptic splice donor sites at the CBX3 end of the A2UCOE were able to completely negate aberrant splicing onto host gene sequences while retaining full function, and thus avoid insertional mutagenesis by this mechanism (Knight et al., 2012).
Summary and Conclusions
Despite some successes in clinical trials (see Aiuti et al., 2012; Biffi et al., 2011; Cavazzana-Calvo et al., 2010, 2012), reproducibility and stability of therapeutic gene expression from within γ-RV and LV remain major problems that need to be addressed. To date, problems of variability and silencing of therapeutic gene expression from γ-RV and LV in a clinical trial setting may have been masked by the selective survival and growth advantage of genetically corrected cells, as has been observed in cases of SCID-X1 and SCID-ADA (Aiuti et al., 2012; Cavazzana-Calvo et al., 2012). In addition, silencing of the therapeutic gene leading to failure of the initially successful gene therapy has been observed in a clinical trial for X-CGD (Stein et al., 2010).
This review has focused on two types of genetic regulatory elements that can be employed to at least reduce epigenetic-mediated position effects at sites of γ-RV and LV integration and which would provide a more reproducible and stable level of transgene expression within a therapeutic range. However, any design aimed at overcoming these obstacles needs to avoid increasing the risk of insertional mutagenesis.
S/MAR and insulator elements, when used to flank the transgene at both 5′ and 3′ ends, can provide a chromatin domain boundary or border function and thus in principle confer more reproducible and stable expression as well as a reduction in insertional mutagenesis activation potential of host genes. However, important points to bear in mind when selecting and using the currently available insulators are that their incorporation within LV may compromise vector production and that they do not work uniformly in all cell types. Unfortunately, the efficacy of S/MAR elements within γ-RV and LV remains unknown because they have only been linked as a single entity upstream of a heterologous promoter rather than in a dual 5′ and 3′ flanking configuration, in which a true boundary function can be expected (see Ramezani and Hawley, 2010).
LCRs and UCOEs possess an inherent dominant chromatin remodeling and transcriptional activating function and are thus capable of conferring upon a linked gene site-of-integration independent expression that is proportional to transgene copy number. Although numerous LCRs have been identified (Li et al., 2002), only those from the CD2 (Kowolik et al., 2001; Indraccolo et al., 2001) and HBB (see Arumugam and Malik, 2010; Dong et al., 2013) loci have thus far been incorporated into γ-RVs and LVs.
The use of insulator and LCR elements is, of course, not mutually exclusive, and by combining the two, more reproducible expression can be obtained (Arumugam and Malik, 2010; Breda et al., 2012). Improvements in gene therapy efficacy with LVs may also be possible by exploiting the RNA interference (RNAi) system to knock down a mutant transcript at the same time as delivering a therapeutic gene. For example, a short hairpin RNA (shRNA) that specifically targets HBB mRNA harboring the sickle cell anemia mutation was incorporated within the second intron of a mini-HBG under control of the βLCR in an LV (Samakoglu et al., 2006). Upon transduction and erythroid differentiation of CD34+ cells from patients with sickle cell anemia, this resulted in a greater than 70% decrease in sickle mutant HBB mRNA as well as expression of HBG chains, providing a combinatorial therapeutic effect.
It is perhaps true to say that LVs containing genes under control of the UCOE from the HNRPA2B1-CBX3 housekeeping genes have to date provided the greatest degree of reproducibility and stability of therapeutic gene expression both in vitro and, more importantly, in hematopoietic cells in vivo (Zhang et al., 2007, 2010). Its relatively small size and ability to augment the function of linked ubiquitous (Pfaff et al., 2013) and tissue-specific (Brendel et al., 2012) promoters further widen its utility, including within induced pluripotent and embryonic stem cells (Pfaff et al., 2013).
In conclusion, there are a number of elements with chromatin boundary, chromatin remodeling, and dominant transcriptional activating functions available to researchers in gene therapy that can be used to at least reduce, if not completely overcome, the variable, variegated expression of γ-RV and LV, with its associated compromise of therapeutic efficacy. LCR elements are currently clearly underused and S/MAR elements remain to be fully tested. Research into optimizing combinations of both boundary and chromatin remodeling elements for a given target cell population is in all likelihood the best way forward.
Author Disclosure Statement
MA holds inventor status on patents covering the biotechnological application of UCOEs. The other authors declare no conflicts of interest.
