Abstract
The dynamics of gene transcription are tightly regulated in eukaryotes. Recent experiments have revealed various kinds of transcriptional dynamics, such as RNA polymerase II pausing, that involve regulation at the transcription initiation stage, and the choice of regulation pattern is closely related to the physiological functions of the target gene. Here we consider a simplified model of transcription initiation, a process that includes the assembly of the transcription complex and the pausing and release of RNA polymerase II. Focusing on collective behavior at the population level, we explore the potential regulatory functions this model can offer. These functions include a fast and synchronized response to environmental change and long-term memory of the transcriptional status. As a proof of concept, we also show that, by selecting different control mechanisms, cells can adapt to different environments. These findings may help us better understand the design principles of transcriptional regulation.
1. Introduction
Traditional models often treat transcription as a two-state system, i.e., the activation and inactivation of target genes (Raj et al., 2006; Shahrezaei and Swain, 2008). This is mostly true for prokaryotes. However, recent experiments using modern genetic techniques have provided evidence of extensive transcriptional regulation acting on different stages of the transcription process in eukaryotes (Sandelin et al., 2007; Sun et al., 2011). For example, pausing of RNA polymerase II (Pol II) has been found to be a general feature in mammalian cells, and the release of Pol II can be triggered by certain transcription factors such as c-Myc (Rahl et al., 2010; Young, 2011). It has been argued that Pol II pausing is energetically costly, but that it allows a fast transcriptional response (Boettiger et al., 2011). The willingness of cells to expend the extra energy on Pol II pausing suggests that a fast transcriptional response of the target gene is important. Not surprisingly, genes that exhibit Pol II pausing usually respond rapidly to signaling. The most studied example is the heat shock gene HSP90, whose Pol II pausing allows cells to act quickly (in minutes) in order to survive an abrupt temperature increase.
Another important feature of some genes is that they retain information about their previous transcriptional status despite environmental changes. For example, knocking out Brg in embryonic stem (ES) cells eventually shuts down the expression of the ES core transcription factors Oct4, Sox2, and Nanog, causing ES deficiency, but this process can take as long as several rounds of cell division after the knockout (Ho et al., 2009). Robust to noise and environmental fluctuations, this kind of self-sustained transcriptional activity is very useful during development, because it allows stem cells to maintain their identity.
The question of how the different transcriptional dynamics mentioned above are realized and encoded in the regulatory machinery has not been fully answered. Mathematical modeling can help elucidate these processes. Several mathematical models have been developed to understand the potential functions that transcriptional control can offer. In Rajala et al. (2010) the authors studied the effect of Pol II pausing on the transcriptional dynamics using a delayed stochastic model. Later, in Boettiger et al. (2011), a detailed transcription regulatory model is presented, showing that regulation in Pol II pausing can lead to fast and synchronized transcription response.
Continuing this line of work, we simplify and extend a transcription initiation model studied in Boettiger et al. (2011) and study its transcriptional dynamics in a varying environment. Using a stochastic population model, we show that the collective transcriptional dynamics can differ drastically depending on which step of the transcription process the controlling signals act on. In particular, we observe a fast and synchronized transcriptional response if the Pol II releasing step is under tight regulation, a result consistent with Boettiger et al. (2011). We also discover noise-resistant transcriptional dynamics that exhibit a long-lasting memory effect; to the best of our knowledge, this kind of behavior in transcription initiation has not been studied in detail before. Overall, this work highlights the importance of regulation during the transcription initiation process as a rich source of diversity in gene expression dynamics.
2. Results
2.1. Transcription initiation model
We divide the transcriptional initiation process into four stages as depicted in Figure 1. The initial state (I) represents the start of gene transcription in which transcription factors have not yet bound to DNA binding sites (Fig. 1B). After a promoter finds the specific DNA sequence (Fig. 1C) it will form a committed complex (C), which serves as a platform for other transcription factors and RNA Pol II to bind. A rapid start complex (R) is assembled when the Pol II is ready to be elongated for transcription (Fig. 1D). If certain conditions are met, the rapid start complex releases the elongated complex (E) and initializes the gene transcription (Fig. 1E). After ejecting the elongated complex, the remaining part of the rapid start complex on the DNA template, also referred to as the transcriptional scaffold, can function as a new committed complex. The recycling of the transcriptional scaffold makes gene transcription more efficient. With probability p, however, the scaffold will disassemble and for further transcription a new committed complex must be formed again from the initial state.

A gene transcription model with different controlling sites.
Advancement from one stage to the next is regulated by controlling signals, which act on three possible controlling sites, with signals s1, s2, and s3 (Fig. 1A). In the following, they are referred to as the distal, middle, and proximal controlling sites according to their temporal distance to the final event of Pol II elongation. Biologically, these signals may correspond to the downstream effects of rate-limiting proteins that participate in the transcription complex. Mathematically, we assume s1, s2, and s3 are variables ranging from 0 to 1, with 0 corresponding to the complete shutdown of the signal and 1 to the fully open state. We model the changes in the transcriptional state as stochastic chemical reactions. The forward reactions I → C, C → R, and R → E have reaction rates a1s1, a2s2, and a3s3, respectively, where a1, a2, and a3 are the intrinsic rates; the backward reactions C → I and R → C have rates b1 and b2.
Altogether there are six parameters in the model, and their values are important for the transcription dynamics. In the following we assume that (i) a1 ≪ a2 ≪ a3, (ii) b1 ≪ a1, b2 ≪ a2, and (iii) p is small. The heuristic behind assumption (i) is that, as the regulation becomes more precise and specific at the later stages of transcription initiation, the transcription machinery may speed up the process by using more energy. Note that assumption (iii) is essential for the long-term memory effect in transcription (see below). In our simulation, we use a1 = 0.1, a2 = 1, a3 = 10, b1 = 0.02, b2 = 0.2 (hour⁻¹), and p = 0.1. With this setting, when s1 = s2 = s3 = 1, on average it takes about 10 hours to form a committed complex from the initial state, 1 hour to go from the committed complex to the rapid start complex, and 6 minutes for the rapid start complex to release an elongated complex. These values are consistent, at least in order of magnitude, with in vitro observations of the transcription dynamics (Hawley and Roeder, 1985) (here in vitro data may be better than in vivo data, as the former reflect the intrinsic rates a1, a2, a3 rather than the regulated rates a1s1, a2s2, a3s3).
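As a quick check of the time scales quoted above, the mean waiting time of a reaction with constant rate k is 1/k. The short sketch below evaluates this for the three forward steps, assuming each regulated rate is the intrinsic rate multiplied by its signal. The dictionary and function names are illustrative and are not taken from the original model description.

```python
# Intrinsic forward rates (per hour) quoted in the text.
RATES_PER_HOUR = {"I->C": 0.1, "C->R": 1.0, "R->E": 10.0}

def mean_waiting_time_hours(rate_per_hour, signal=1.0):
    """Mean waiting time of an exponential step.

    Assumption: the regulated rate is (intrinsic rate) * (signal strength).
    """
    return 1.0 / (rate_per_hour * signal)

for step, rate in RATES_PER_HOUR.items():
    hours = mean_waiting_time_hours(rate)
    print(f"{step}: {hours:g} h ({hours * 60:g} min)")
```

Lowering a signal below 1 lengthens the corresponding waiting time in inverse proportion, which is why a nearly closed site can turn even the fastest intrinsic step into the bottleneck.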
2.2. Static properties of the model
First we change the signal strengths s1, s2, and s3 at the distal, middle, and proximal controlling sites, respectively, to see how the transcription rate at equilibrium responds. For each controlling site, say the distal one, we keep the other two open (s2 = s3 = 1) and let s1 vary from 0 to 1. For a fixed si, i = 1, 2, 3, the transcription rate is measured by the rate of production of the elongated complex,
We can see that the dependence of the transcription rate on the signal strength s takes the form of a Hill function for all three control mechanisms. However, the exact shapes of the functions are quite different (see Fig. 2). The transcription rate is almost linearly dependent on the signal strength at the middle controlling site, but for the other two the rates quickly saturate as the signal strength increases.

The dependence of the production rate of E on the controlling signal. For a fixed controlling signal s whose value ranges from 0 to 1, we determine
For the proximal controlling site s3, because the intrinsic rate a3 is very large, only a very small s3 makes this reaction step (R → E) rate-limiting. As a result, the transcription rate, which is proportional to a3s3 in this limit, is very sensitive to changes in s3. However, as s3 increases, the upstream reactions cannot supply enough rapid start complex, so the transcription rate saturates. For the distal controlling site s1, because of the recycling of the transcription scaffold, a small production of the committed complex is enough to compensate for the loss of the committed complex caused by the degradation of the scaffold (p needs to be small). As s1 keeps increasing, the downstream reactions reach their limit and the transcription rate saturates.
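The static response can be explored numerically with a few lines of code. The sketch below is not the authors' implementation: it assumes a three-state scheme in which I → C occurs at rate a1s1, C → I at b1, C → R at a2s2, R → C at b2, and release of E at rate a3s3, after which the scaffold returns to C with probability 1 − p or to I with probability p. The function name and the constants are illustrative, and the parameter values are those quoted in Section 2.1.

```python
# Parameter values quoted in the text (rates per hour); the scheme is an assumption.
A1, A2, A3, B1, B2, P = 0.1, 1.0, 10.0, 0.02, 0.2, 0.1

def equilibrium_rate(s1=1.0, s2=1.0, s3=1.0):
    """Steady-state production rate of E (per hour, per cell) under the assumed scheme."""
    k1, k2, k3 = A1 * s1, A2 * s2, A3 * s3
    if k1 == 0 or k2 == 0 or k3 == 0:
        return 0.0  # a fully closed site blocks production at steady state
    # Unnormalized occupancies from flux balance, with the weight of R set to 1.
    p_r = 1.0
    p_c = (B2 + k3) / k2             # balance at R: k2*P_C = (b2 + k3)*P_R
    p_i = (B1 * p_c + P * k3) / k1   # balance at I: k1*P_I = b1*P_C + p*k3*P_R
    total = p_i + p_c + p_r          # normalization: probabilities sum to 1
    return k3 * p_r / total

# Vary one signal at a time while keeping the other two fully open.
for site in range(3):
    for s in (0.1, 0.5, 1.0):
        signals = [1.0, 1.0, 1.0]
        signals[site] = s
        print(f"s{site + 1} = {s}: rate = {equilibrium_rate(*signals):.3f} per hour")
```

Under these assumptions, 1/rate is linear in 1/s for each site, which is the signature of a Hill function with coefficient 1. How quickly each curve saturates depends on the parameter values, in particular on how small p is relative to a1/a2.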
2.3. Dynamical properties of the model
For all living systems the ability to adjust their gene expression in response to developmental signals and environmental changes is crucial for their survival. In the following, we focus on the dynamical properties of the model when it responds to time-varying signals.
2.3.1. Pol II pausing and synchronized expression
First we introduce a regulatory signal at the proximal controlling site s3, which switches between 1 and 0 every 12 hours, while the other two controlling sites are in the ON state throughout the simulation. We simulate a population of 8,000 cells and measure the instantaneous transcription rate, which is the total production of the elongated complex within a short time interval. Initially all cells are in transcription state I. During the first cycle in which the signal at s3 is turned on, the upstream reactions (I → C and C → R) are the rate-limiting steps, and the expression level gradually increases from zero to a steady state set by the maximum rate determined by the upstream reactions and the value of p (Fig. 3, top panel). When the signal at s3 is set to OFF, the upstream reactions keep working to produce the rapid start complex, which can be recycled or degraded back to state C. Because b2 ≪ a2, most cells stay in state R with their Pol II poised, waiting to be launched. If the signal at s3 is now turned on, the paused Pol II is released in a very short period of time, causing a transient burst in the transcription level. However, this fast transcription rate cannot last because the upstream reactions have a limited supply. As a result, the transcription level quickly drops back to the equilibrium level.

Collective behavior of gene transcription under periodic signal (long period). A population of 8,000 cells is simulated over time, during which we apply an oscillating signal (T = 24 hours, bottom panel) to one of the three controlling sites of the transcription module, which is embedded in all the cells. At time t = 0, all cells are at the initial state. The transcription level is obtained by summing the total production of E in the population during a short period of time (0.1 hour).
If the same signal acts on the middle controlling site s2 instead of s3, the overall expression level drops to zero when the signal is turned off and increases to a moderate level when the signal is turned on. Because of the accumulation of committed complex formed during the signal-OFF period, the expression level reaches a small peak after the signal is turned on. However, compared with regulation at the proximal site, the response time is longer and the peak expression level is much smaller. In other words, cells independently release their Pol II after the rapid start complex is formed, and there is little synchronized expression in the population.
2.3.2. Memory effect and noise suppression
Interestingly, when the regulating signal acts on the distal controlling site s1, the transcription of the target gene lasts for a long period of time even after the signal is turned off (Fig. 3, bottom panel). This is because the transcription scaffold can be reused as a committed complex many times without the need to assemble a new one from the beginning. The time that the self-sustained transcription persists depends on the value of p. For very small p (a very stable transcription scaffold), the expression can last for days after the activation signal is gone. This kind of transcriptional behavior may explain the memory effect found in certain genes discussed earlier.
Next we apply a high frequency signal (0.5 hour ON followed by 0.5 hour OFF) at the three controlling sites respectively. As Figure 4 shows, for the distal controlling site s1, the transcription level maintains a steady state even though the signal is changing. For the other two controlling sites s2 and s3, however, the signal change causes significant oscillations during transcription. This result shows that it is possible for the transcription machinery to suppress high frequency noise in the signal.

Collective behavior of gene transcription under periodic signal (short period). A population of 8,000 cells is simulated over time, during which we apply an oscillating signal (T = 2 hours, bottom panel) to one of the three controlling sites of the transcription module embedded in the cells. At time t = 0, all cells are at the initial state. The transcription level is obtained by summing the total production of E in the population during a short period of time (0.1 hour).
2.3.3. Evolvability of the transcription module
We have shown that different transcriptional regulations can give rise to different expression dynamics. Next we demonstrate that environmental conditions can direct the adaptation of transcriptional regulation in a population of cells. We consider three types of cells whose gene of interest is under different transcription control: the target gene of type-1 cells is under distal control (signal acting on s1 while s2 and s3 are constantly ON); that of type-2 cells is under middle control (signal acting on s2 while s1 and s3 are constantly ON); and that of type-3 cells is under proximal control (signal acting on s3 while s1 and s2 are constantly ON). Here the signal represents the level of growth-promoting factors (GPFs) in the environment, which alternates between ON (high level of GPFs) and OFF (low level of GPFs) with a given period T. This setup mimics the experimental strategy used by Poelwijk et al. (2011), in which a genetic module is engineered whose expression is beneficial in one environmental condition and detrimental in another. In Poelwijk et al. (2011), a variable environment that switches every 6 hours between beneficial and detrimental conditions is used to examine the evolution of this genetic module.
Here we use the Moran population model (Moran, 1958), in which cells can either replicate or die, and investigate the dynamics of the system under competition and selective pressure. The fitness of each cell is modeled by an auxiliary variable F, which is determined by the following rules: when GPFs are abundant (ON signal), successfully producing an elongated complex increases F by a certain amount; when GPFs are low (OFF signal), producing an elongated complex does not change F. Meanwhile, every time the transcriptional state advances to the next stage, F decreases (here we assume that only the forward reactions cost energy while the backward ones do not). When the F level of a cell drops below a critical value, transcription activity is paused. When F reaches a certain threshold, the cell divides into two daughter cells. The F of the mother cell is split equally between the two daughter cells, which inherit the same regulation mechanism as their mother cell. When a cell divides, one cell in the population is chosen to be replaced according to its F level: the chance of being chosen is proportional to the inverse of F. This means that cells with smaller F are more likely to die. See the Appendix for more details about the model, the parameters, and the implementation.
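The division and replacement rule described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: cells are represented as dictionaries with hypothetical fields "type", "state", and "F"; all F values are assumed to be positive; and the cell to be replaced is drawn from the rest of the population, excluding the dividing mother. Following the Appendix, one daughter keeps the mother's transcriptional state and the other restarts from state I.

```python
import random

def divide(cells, mother_idx, rng):
    """Divide cells[mother_idx] in place and return the index of the replaced cell.

    Assumptions (not specified in the text): every F is positive, and the
    replaced cell is chosen among all cells other than the mother.
    """
    mother = cells[mother_idx]
    half = mother["F"] / 2.0  # F is split equally between the two daughters
    # One daughter keeps the mother's transcriptional state, the other restarts at I.
    keeps_state = {"type": mother["type"], "state": mother["state"], "F": half}
    restarts = {"type": mother["type"], "state": "I", "F": half}

    # Choose the cell to be replaced with probability proportional to 1/F.
    others = [i for i in range(len(cells)) if i != mother_idx]
    weights = [1.0 / cells[i]["F"] for i in others]
    victim = rng.choices(others, weights=weights)[0]

    cells[mother_idx] = keeps_state  # first daughter takes the mother's slot
    cells[victim] = restarts         # second daughter replaces the chosen cell
    return victim

rng = random.Random(0)
population = [
    {"type": 1, "state": "R", "F": 8.0},
    {"type": 2, "state": "C", "F": 1.0},
    {"type": 3, "state": "I", "F": 2.0},
]
replaced = divide(population, 0, rng)
print("replaced cell index:", replaced)
print(population)
```

Because the population size stays fixed, every division of one type comes at the expense of another cell, so a type that accumulates F faster in a given environment gradually takes over.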
The numerical results show that the performance of the three types of transcription mechanisms depends on the period T at which the environment changes. The one that best fits the environment will become dominant eventually. As Figure 5 shows, under a rapidly changing environment (small T), type-3 cells are the most fit type because they can respond quickly to rapid changes in GPFs (Fig. 5A, black: type-1 cell; red: type-2 cell; green: type-3 cell). The energy that type-3 cells spend at Pol II pausing pays off in this situation. However, as T increases, type-3 cells keep consuming energy during the long period of low GPFs, which decreases their fitness. In contrast, the other two cell-types do a better job in preserving energy during the OFF state. As a result, for T = 2 days, type-2 cells are most fit (Fig. 5B), and for even longer periods (T = 4 months), the type-1 cells dominate the population (Fig. 5C). Through this game of life we show that transcription regulation in a population of cells can adapt to meet different regulatory demands.

Evolution of a heterogeneous population over time. The population (2,000 cells in total) is composed of type-1, type-2, and type-3 cells, which are distinguished by the way transcription initiation is regulated (see the main text). Initially the proportions of the three cell types are equal. Over time, the proportion of each type is indicated by the thickness of the colored layer (black: distal control, type-1; red: middle control, type-2; green: proximal control, type-3). A growth-promoting signal is switched between ON and OFF periodically in time with period T. Each panel corresponds to a different T (top: T = 1 hour, middle: T = 2 days, bottom: T = 4 months). The results show that different regulatory mechanisms have different fitness depending on the periodicity of the signal: for fast-switching signals type-3 cells are dominant, while for slow-switching signals type-1 cells become dominant. In between, type-2 cells are the most fit.
3. Discussion
Recent experimental results on Pol II pausing demonstrate that transcriptional regulation is widespread and important in eukaryotes (Darzacq et al., 2007). Nevertheless, the design principles behind the different kinds of transcriptional dynamics have not been fully understood. To aid in closing the gap, we developed a stochastic transcription model focusing on the regulatory steps during the assembly of the transcription complex and Pol II pausing. Our simulation results showed that different control mechanisms can lead to distinct collective transcription behaviors, which offer great flexibility in regulating the transcription of the target genes.
On the one hand, regulation right before transcription, such as Pol II pausing (which corresponds to controlling the proximal site s3 in our model), can make gene expression fast and synchronized (Fig. 3, top panel). This kind of expression dynamics allows cells to respond rapidly to environmental changes or regulatory signals. Indeed, recent experiments have shown that many genes with Pol II pausing at promoter-proximal sites tend to respond rapidly to developmental and cell signaling in mammals (Core and Lis, 2008).
On the other hand, regulation early in the transcription initiation process (which corresponds to controlling the distal site s1 in our model) may give rise to noise-resistant and self-sustained expression patterns (Figs. 3 and 4). We suspect that the recycling of the transcription scaffold is a way of retaining gene transcription information. The ability of transcription to persist in the presence of fluctuating signals is very important for developmental purposes. For example, it may help pluripotent cells to maintain their stem-cell identity. Because differentiation is mostly an irreversible process during development, fate decisions need to be made with caution. In the case of Brg in ES cells mentioned earlier (Ho et al., 2009), it is possible that Brg is a signal regulating the expression of ES core transcription factors whose role is similar to that of the distal controlling signal s1. As a result, those ES core transcription factors have a long-term, self-sustained expression pattern mimicked by Fig. 3 (controlling site s1).
We showed that rich expression patterns can be achieved under our simple transcription initiation model. Thus it is possible that the transcriptional machinery could be adapted by different genes to fulfill different biological functions. Using a game of life, we showed that such evolvability is important for cells to survive in changing environments. We hope this functional analysis of the simplified transcription initiation model may help us understand the logic of transcriptional control.
4. Materials and Methods
4.1. Model
The model in Figure 1 can be described by a continuous-time Markov process with states I, C, and R connected by the following reaction rules:
This system can be simulated using Gillespie's direct method (Gillespie, 1977). The last two reactions produce an elongated complex. The total number of elongated complexes of all cells in the population within every 6 minutes is recorded and used to measure the instantaneous transcription rate, which is the value on the y-axis in Figures 3 and 4.
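A minimal sketch of such a simulation is given below. It is not the authors' code and rests on an assumed reaction scheme inferred from the Results: I → C at rate a1s1, C → I at b1, C → R at a2s2, R → C at b2, and two E-producing reactions, R → C + E at rate a3s3(1 − p) and R → I + E at rate a3s3p. The signal at one site is a square wave that starts in the ON half-period. Because waiting times are exponential and therefore memoryless, a waiting time that overshoots a signal switch is discarded and redrawn with the new rates. All function and variable names are illustrative.

```python
import random

# Parameter values quoted in the text (rates per hour).
A1, A2, A3, B1, B2, P = 0.1, 1.0, 10.0, 0.02, 0.2, 0.1

def simulate_cell(site, period, t_end, rng, dt=0.1):
    """Gillespie simulation of one cell; returns E counts per time bin of width dt.

    site: 0, 1, or 2 for the distal, middle, or proximal controlling site.
    """
    n_bins = int(round(t_end / dt))
    counts = [0] * n_bins
    state, t, k, half = "I", 0.0, 0, period / 2.0
    while t < t_end:
        seg_end = min((k + 1) * half, t_end)
        s = [1.0, 1.0, 1.0]
        s[site] = 1.0 if k % 2 == 0 else 0.0  # ON in even half-periods
        while True:
            # (next state, rate, produces E?) for reactions leaving the current state
            if state == "I":
                moves = [("C", A1 * s[0], False)]
            elif state == "C":
                moves = [("I", B1, False), ("R", A2 * s[1], False)]
            else:
                moves = [("C", B2, False),
                         ("C", A3 * s[2] * (1 - P), True),
                         ("I", A3 * s[2] * P, True)]
            total = sum(rate for _, rate, _ in moves)
            if total == 0:
                break  # nothing can fire until the signal switches
            tau = rng.expovariate(total)
            if t + tau >= seg_end:
                break  # rates change at seg_end; redraw there (memoryless)
            t += tau
            state, _, makes_e = rng.choices(moves, weights=[m[1] for m in moves])[0]
            if makes_e:
                counts[min(int(t / dt), n_bins - 1)] += 1
        t = seg_end
        k += 1
    return counts

def simulate_population(n_cells, site, period, t_end, seed=0, dt=0.1):
    """Sum E production over independent cells, all starting in state I."""
    rng = random.Random(seed)
    totals = [0] * int(round(t_end / dt))
    for _ in range(n_cells):
        for i, c in enumerate(simulate_cell(site, period, t_end, rng, dt)):
            totals[i] += c
    return totals

if __name__ == "__main__":
    trace = simulate_population(n_cells=500, site=2, period=24.0, t_end=48.0)
    print("E produced in the first bin after the signal returns:", trace[240])
```

Cells do not interact in this part of the model, so simulating them one after another is equivalent to simulating the whole population at once. The example uses 500 cells to keep the run short; the population size and bin width can be changed to match the figures.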
4.2. Equilibrium transcription rate
The transcription rate at equilibrium as a function of the strength of the controlling signal can be obtained analytically as follows. At any given time, the system must be in one of the three possible states, I, C, or R. Let PI, PC, and PR be the probabilities that the system is in state I, C, and R, respectively. Consequently, PI + PC + PR = 1.
At equilibrium, the flux into each state must be equal to the flux out of each state, which gives
Here
Solving the above equations, we obtain the equilibrium transcription rates in Equations (1)–(3).
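For reference, the calculation can be sketched under an assumed reaction scheme: I → C at rate k_1 = a_1 s_1, C → I at b_1, C → R at k_2 = a_2 s_2, R → C at b_2, and release of E at rate k_3 = a_3 s_3, after which the scaffold returns to C with probability 1 − p or to I with probability p. The shorthand k_i and the symbol v for the transcription rate are introduced here for illustration only. Normalization together with the flux balance at states I and R then gives:

```latex
\begin{align}
P_I + P_C + P_R &= 1, \\
k_1 P_I &= b_1 P_C + p\,k_3 P_R, \\
k_2 P_C &= (b_2 + k_3)\,P_R, \\
v = k_3 P_R &= \frac{k_1 k_2 k_3}{k_1 k_2 + (k_1 + b_1)(b_2 + k_3) + p\,k_2 k_3},
\qquad k_i = a_i s_i .
\end{align}
```

The balance at state C follows from the other two and adds no new constraint. Fixing any two signals at 1 and varying the third reduces v to the form V s / (K + s), a Hill function with coefficient 1, with constants V and K that depend on which site is varied.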
5. Appendix: Implementation Details of the Evolutionary Model
After replication, the two daughter cells will inherit the same regulation mechanism as the mother cell. In order to mimic the epigenetic effect, one of the daughter cells will share the same transcriptional state (I, C, or R) of its mother cell while the other is set to its initial state. We also tested other implementations, such as both daughter cells start with the initial state, or both inherit the transcriptional state of their mother cell. The results are similar.
According to the above setting, when the GPFs are abundant (signal constantly ON), the average cell division cycle is around 25 hours, which is reasonable for mammalian cells.
Acknowledgments
The authors would like to thank National Institutes of Health for partial funding through grants 1RC2CA148493-01 and P50GM76516 for a Center of Excellence in Systems Biology at the University of California, Irvine. The authors also thank the National Science Foundation, Division of Mathematical Sciences, for partial funding. Y.H. is supported by the NSFC (National Science Foundation of China) under grant #11301294 and the Tsinghua University Initiative Scientific Research Program (grant #20151080424).
Author Disclosure Statement
No competing financial interests exist.
