Abstract
Abstract
Twenty-five years ago, Lewis Wolpert, the eminent developmental biologist, asked the question, “Do We Understand Development?” He concluded that such rapid progress had been made in the preceding two decades that “It is not unreasonable to think that enough will eventually be known to program a computer and simulate some aspects of development.” This prediction has been fulfilled, at least partially, with data-driven simulations of several different developmental processes being developed in the intervening years. Nevertheless, the question remains of whether we “understand” development and if simulations are sufficient to provide an explanation of development. While in silico replications and models are undoubtedly an important tool in the investigation and dissection of developmental processes, which complement traditional experimental methods, these need to be supplemented by theory that identifies principles and provides coherent explanations. Here, I use the example of pattern formation in the vertebrate neural tube to illustrate this idea.
In Lewis Wolpert's provocative commentary “Do We Understand Development?” (Wolpert, 1994), he implied that “understanding” could be equated with programming a computational simulation of a developmental process. While previous work, including notable contributions by investigators such as Turing (1952) as well as Gierer and Meinhardt (1972), had pioneered the use of mathematical models in developmental biology, these studies had principally been conceptual and were not closely related to experimental data. What Wolpert appeared to have in mind were simulations that replicated tangible developmental processes. Over the past two decades, this goal has been actively pursued. Now many in silico models and simulations have been developed from experimental observations and used to investigate specific embryonic events (Jaeger et al., 2004; Farhadifar et al., 2007; Istrail et al., 2007; Raspopovic et al., 2014). But being able to recapitulate a developmental process does not necessarily mean that we comprehend it. For this, theory is needed that defines principles and offers insight that can lead to understanding. This is true for many, if not all, biological processes, but here, I will focus on the process of pattern formation in embryonic development.
Pattern formation in developing tissues relies on allocating naive cells into specific functional cell types in defined spatial arrangements and temporal order. This is achieved by initially uncommitted progenitors acquiring their fate in response to molecular signals that regulate the transcriptional programs that control cell functions (Sagner and Briscoe, 2017). How these programs are activated at the right time and place to produce correctly patterned tissues is a central question of developmental biology. These programs are encoded in the genome and function by controlling the activity of the genome. The theory of gene regulatory networks (GRNs), developed by Davidson and colleagues, offers a logical and formal framework in which to describe these processes (Davidson, 2010). Within this framework, a network comprises gene-regulating transcription factors (TFs), linked together by the cis-regulatory elements to which they bind. The functional output of the network is the organized expression of genes. Thus, the analysis of the architecture and dynamics of these networks offers an understanding and a rationale for the precise spatial and temporal pattern of expression of the thousands of genes necessary for tissue patterning.
Perhaps, the best demonstration of the power of this approach is the rigorous and comprehensive dissection of sea urchin endomesoderm development, undertaken by the Davidson laboratory over several decades (Peter and Davidson, 2015). This culminated with in silico models that all but completely simulate and explain the vast array of experimental observations (Peter et al., 2012). In this example, the power of experimental data with computation simulation is evident. The work illustrates the potential of the GRN approach to provide a mechanistic and causal explanation to a complex set of gene regulatory events controlling development cell fate specification.
Equally important, in terms of deriving conceptual understanding, are the GRNs that have been reconstructed from other species and tissues. From comparisons among these, general principles and common themes have begun to emerge. Two ideas appear to be central (Levine and Davidson, 2005; Davidson, 2010). First, GRNs follow a modular design structure, with subcircuits within the GRN at least partially segregated so that their operation is relatively independent of the rest of the network. Within these subcircuits, network motifs, such as feedback and feedforward loops, tend to be overrepresented (Alon, 2007; Davidson, 2010). Second, it is apparent that the regulation of individual genes within a GRN is combinatorial with the output gene expression depending on integrating the various TF inputs it receives. Thus, the cis-regulatory elements that bind the TF inputs provide the apparatus that transforms the information in the genome into the dynamic patterns of gene expression that drive development. This feat is achieved through the activity of small subnetworks of TFs, the output of which depends on the integration of inputs determined by the network structure (Levine and Davidson, 2005; Davidson, 2010). A well-studied tissue that illustrates these points is the vertebrate neural tube. In addition, it has provided insight into the type of computation GRNs perform.
The formation and cellular organization of the vertebrate spinal cord is well characterized (Dessaud et al., 2008; Sagner and Briscoe, 2017) (Fig. 1). During embryonic development, the forming neural tissue is partitioned into 14 molecular distinct domains of progenitors, each of which occupies a characteristic position along the dorsal–ventral (DV) axis (Jessell, 2000; Dessaud et al., 2008; Lu et al., 2015). Each domain of progenitors generates molecular and functionally distinct neuronal subtypes, hence different neuronal classes reside at specific positions along the DV axis of the neural tube. Importantly, the identity of a neuron generated by a progenitor is determined by the set of TFs expressed in each domain, and each domain expresses a distinct combination of TFs. Thus, a combinatorial transcriptional code determines progenitor identity and defines the fate of its progeny. In the terminology of the GRN framework, this transcriptional code is referred to as the “regulatory state” of a cell (Davidson, 2010). This strategy, of dividing a tissue into blocks of cells distinguished by their regulatory state, is observed in many developing tissues and is a hallmark of developmental pattern formation. Thus, the problem of pattern formation becomes one of understanding how spatial patterns of gene expression arise in a tissue.

Spatially restricted gene expression along the dorsal–ventral axis of the neural tube divides neural progenitors into molecular distinct domains. Each domain expresses a unique combination of transcription factors in response to gradients of extrinsic signals (Shh, BMP) emanating from the ventral and dorsal poles of the neuroepithelium. Moreover, each progenitor domain gives rise to a distinct set of molecular and functional distinguishable neuronal subtypes. BMP, bone morphogenetic protein; Shh, Sonic Hedgehog.
The gene expression pattern in the developing neural tube is established progressively over a period of 24–48 hours in mouse and chick embryos (Junker et al., 2014; Kicheva et al., 2014). During this time, the proliferation of cells results in tissue growth such that the DV length of the tissue more than doubles in size as the domains of progenitors are established. In the ventral half of the neural tube, the region that will generate motor neurons and the interneuron circuitry that controls motor output, a gradient of the secreted molecule Sonic Hedgehog (Shh) acts as the positional cue that guides the gene expression pattern formation (Jessell, 2000; Dessaud et al., 2008). Shh emanating from cells at the ventral pole of the neural tube forms a ventral to dorsal gradient; this directs the induction and repression of the TFs that make up the neural tube GRN. The majority of these TFs act as repressors and pairs of TFs cross repress each other's expression (Muhr et al., 2001), forming bistable switches, with the result that pairs are expressed in adjacent, abutting but non-overlapping, domains of progenitors.
The genetic toggle switches formed by the mutually repressing pairs of TFs have several important properties (Balaskas et al., 2012). First, the toggle switch acts as an analog-to-digital converter—it is the mechanism that translates the continuous input, provided by the gradient of Shh, into discrete all or none changes in gene expression. This is responsible for producing the unique regulatory states associated with each progenitor domain and is therefore central to pattern formation. Second, the toggle switches play a crucial role in positioning the boundaries of gene expression. In mutant embryos harboring mutations in one of the TFs, the domain of expression of its pair is altered, expanding to fill some or all the territory that would be expected to express the deleted TF (Briscoe et al., 2000; Sander et al., 2000; Vallstedt et al., 2001; Balaskas et al., 2012). Third, the bistability that results from the mutual repression contributes to the maintenance of gene expression (Balaskas et al., 2012). Once a toggle switch has been triggered, its reversal is more difficult. This feature, known as hysteresis, provides some robustness to fluctuating levels of the inducing signal and effectively acts as a memory of prior signaling. It is notable that genetic toggle switches composed of cross-repressing TFs are a recurring motif in tissue patterning GRNs (Davidson, 2010), presumably because they provide a means to produce and position distinct regulatory states in response to long-range graded signals.
Although DV patterning happens as the neural tube is growing, the Shh signaling gradient does not scale to match the changes in tissue size (Cohen et al., 2015). As a result, the levels of signaling associated with a particular progenitor domain changes with time and generally decreases after a peak of signal is attained at early developmental time points (Dessaud et al., 2007; Balaskas et al., 2012). The hysteresis generated by the genetic toggle switches begins to provide an explanation for the maintenance of gene expression as the tissue grows and signaling declines. This also suggested that the process of pattern formation could be divided into two phases (Kicheva et al., 2014). An initial stage in which gene regulation controlled by signaling changes the pattern of gene expression in individual cells within the tissue. In this phase, signaling interpreted by the GRN dynamically establishes neural tube patterning. A second phase then supervenes in which cell identity is stabilized and maintained, but differences in growth rate of individual progenitor domains explain changes in tissue pattern during this period.
A consequence of these two phases in neural tube patterning is that any errors in the pattern of gene expression introduced during the first phase would remain and potentially be amplified during the second phase. This highlighted the necessity of understanding how pattern is established accurately at the earliest stages of development. In the neural tube, in addition to a gradient of Shh in the ventral neural tube, there is a reciprocal gradient of bone morphogenetic proteins (BMPs) signaling in the dorsal neural tube (Liem et al., 2000; Mizutani et al., 2006). This arrangement, in which two sources of signals are positioned at opposite poles of a patterning axis to generate antiparallel gradients, is found in several tissues (Briscoe and Small, 2015; Sagner and Briscoe, 2017). Theoretical studies have suggested that if cells within the tissue use a combination of the two signals, this can maximize the precision of patterning (Morishita and Iwasa, 2009). This appears to be the case in the neural tube where the combination of Shh and BMP signaling could explain the precision of gene expression.
How do cells interpret the combination of Shh and BMP signaling to minimize patterning errors? Strikingly, a computational screen indicated that extending the cross-repressive interaction motifs could explain the behavior (Zagorski et al., 2017). A transcriptional network, composed of three TFs connected by genetic toggle switches and regulated by Shh and BMP signaling, produced dynamics of gene expression that replicated the observed temporal patterns of gene expression in the neural tube. A prediction from these simulations, which was confirmed by experiments, indicated that exposure to high levels of both signals resulted in the loss of cell identities characteristic of the intermediate neural tube, the location furthest from both sources of signals, and instead cells adopted a gene expression profile characteristic of either a dorsal or ventral identity. Moreover, as expected from the hysteresis observed for individual toggle switches, the three TF network motifs displayed multistability, consistent with the maintenance of gene expression of signaling at later developmental times.
Surprisingly, this GRN produced the same responses that would be expected if cells responded to the levels of Shh and BMP signaling by performing the equivalent of the statistical operation of maximum-likelihood estimation (Zagorski et al., 2017). Thus, a phenomenological model based on cells defining their positional identity using a maximum-likelihood calculation of their position, based on the combined levels of Shh and BMP signaling, predicted the same gene expression patterns as a mechanistic GRN model. This extends the earlier analyses demonstrating the importance of the GRN for the interpretation of signaling gradients and indicates how biological mechanisms can perform apparently sophisticated calculations. Taken together, the experimental data and computational modeling indicate that a GRN controlled by Shh and BMP signaling is able to establish accurately and then maintain gene expression within the developing neural tube.
To understand how these gene regulatory mechanisms are implemented at the genomic level, information on the identity and function of the cis-regulatory control regions of target genes is necessary. This is beginning to emerge from biochemical and bioinformatic studies (Vokes et al., 2007, 2008; Oosterveen et al., 2012, 2013; Peterson et al., 2012; Nishi et al., 2015; Kutejova et al., 2016). Most progress has been made in the case of gene regulation in the ventral neural tube. Binding sites for Gli proteins, the transcriptional effectors of Shh signaling, have been identified associated with many genes in the transcriptional network (Vokes et al., 2007; Peterson et al., 2012). Many of these cis-regulatory elements are associated with binding sites with SoxB proteins (Oosterveen et al., 2012; Peterson et al., 2012). This is a family of transcriptional activators that are broadly expressed in the neural tube. Moreover, if an SoxB TF is ectopically expressed within cells of the limb, signaling by Shh was sufficient to induced genes normally restricted to the neural tube (Oosterveen et al., 2013). Thus, the co-binding of SoxB and Gli proteins appears to confer Shh responsiveness and neural specificity to target genes.
In addition to binding sites for SoxB and Gli proteins, binding sites for the TFs comprising the genetic toggle switches have also been identified in the cis-regulatory elements (Vokes et al., 2007, 2008; Oosterveen et al., 2012, 2013; Peterson et al., 2012; Nishi et al., 2015; Kutejova et al., 2016). This indicates that the cross-repression that characterizes the gene regulatory mechanism responsible for neural tube patterning is most likely mediated by direct transcriptional inputs. Taken together, the data suggest a genomic mechanism that explains the genetic interactions (Nishi et al., 2015; Kutejova et al., 2016). In this view, the cis-regulatory elements associated with target genes integrate three distinct types of input. Activation provided by SoxB proteins provides the neural specificity, Shh signaling, via transcriptional input from Gli proteins provides a positional varying input that initiates pattern formation, and finally, the TFs that comprise the GRN completes the system (Briscoe and Small, 2015). Together, these three components provide the molecular basis for the spatial and temporal dynamics of gene expression in the neural tube. This is consistent with the idea that cis-regulatory elements represent the means by which TF inputs are integrated to control gene expression.
With the depth and detail of knowledge we now have of the neural tube, as well for other developing tissues, how can we extract general principles without getting lost in the molecular detail? A natural formalism to describe and investigate GRNs is dynamical systems theory. This in turn provides a connection to the field of complexity studies, which examine how simple interactions between multiple components can lead to collective and dynamical behavior (Strogatz, 2014). A GRN can be viewed as just such a complex system, in which the interactions between the individual genes are responsible for the properties of the network (Jaeger et al., 2015; Crombach et al., 2016). The properties of the GRN cannot be ascribed to individual genes and cannot readily be determined by simply inspecting the network. Instead, they are a consequence of the interactions between the components, and dynamical systems theory offers a way to combine structure and process to explain mechanism.
A significant insight from dynamical systems and complexity theory is the importance of recursive links (Strogatz, 2014). Both positive and negative feedback results in an output of the system being “recycled” back into the system. This can lead to behavior that is not intuitive and difficult to understand and predict. In addition, nonlinearity of interactions within a system can lead to sudden changes in behavior. This is often termed “criticality” and a system is said to be in a critical state if its behavior changes dramatically in response to a small input. The genetic toggle switches, comprising the mutually repressing TFs, are an example of this. The study of the sudden changes in behavior in deterministic dynamical systems is known as bifurcation theory. An early branch of which was developed by the French mathematician René Thom in the 1960s and termed catastrophe theory (Zeeman and Sussmann, 1979). This was popularized and applied to biological problems in the 1970s by Christopher Zeeman, who worked with developmental biologists including Jonathan Cooke, to investigate some of the implications for understanding development.
The use of dynamic systems theory to describe specific developmental mechanisms provides a framework in which to define simplifying abstractions and to explain principles. For example, the interactions within the GRN provide an example of multilevel behavior that explains how tissue patterns of gene expression arise from the molecular interactions of TFs in individual cells. This type of emergent property is not present in, nor reducible to, the properties of the lower level components of the system. Moreover, nonlinearity suggests that a change in a regulatory state in a cell is the result of a bifurcation. This raises the possibility of describing tissue patterning using catastrophe theory-inspired approaches (Corson and Siggia, 2012). In addition, an effect of nonlinearity is that systems tend to be sensitive to initial conditions. This can make it difficult to predict how a system will behave over time: small differences in initial conditions can result in exponential divergence over time. One crucial function for a developing embryo must, therefore, be to constrain these initial conditions. Understanding which initial conditions are necessary to constrain and how this is achieved remains an open question in the field.
Viewing GRNs as dynamical systems has several consequences. It emphasizes that a GRN is a generative process not a static descriptive plan, hence a GRN only makes sense if viewed as unraveling over time. While a GRN is composed of individual interacting components—the genes and cis-regulatory elements that control these genes—the system functions as a whole. Within this system, a change in gene regulation is a consequence of previous changes in gene activity, and recursive regulatory interactions are a characteristic feature of GRN operation. Dynamical systems theory suggests how the interactions between components can result in an increase in complexity (Strogatz, 2014). This is essential as the amount of genetic information present in an animal genome (∼20,000 protein coding genes) (Ezkurdia et al., 2014) is orders of magnitude below that which would be necessary to specify the spatial location and function of the trillions of cells that comprise an individual. Instead, the information necessary for development is distributed among genes as a GRN and the dynamics that it encodes.
Nevertheless, there are limitations to the GRN framework and challenges to be met. A common criticism is that the GRN approach is often seen as emphasizing the structure and topology of network. This underplays dynamics and quantitative aspects of a system, which is crucial when feedback and nonlinearity are involved. However, this is changing as advances in experimental techniques allow the collection and analysis of dynamic and quantitative data. Moreover, the combination of GRN analysis and dynamical systems approaches is addressing this deficiency. A more substantial criticism is that GRNs are normally perceived as sparsely connected networks, involving relatively few but strong connections between TFs (Davidson, 2010). By contrast, experimental evidence suggests that many networks appear to be densely connected (Novershtern et al., 2011; Kutejova et al., 2016). This raises the question of whether the assumption of modularity is valid and if so how mechanistic motifs remain insulated from one another. Related to this, how stochastic fluctuations, inherent to gene regulation, affect performance and are propagated through a gene regulatory remains poorly understood (Perez-Carrasco et al., 2016). Finally, the GRN approach emphasizes gene regulation and understates the role for other processes, such as cell and tissue mechanics, in the development and morphogenesis of a tissue. Taking these processes into account is a challenge for the future.
To return to Lewis Wolpert's question, “Do We Understand Development?” Clearly, substantial progress has been made toward a molecular, genetic, and cellular understanding of the development of specific tissues, the neural tube being a good example of this. But also, we now have a framework and set of tools that will enable us to more candidly answer Wolpert's question and explain the basis of our understanding. We will see what progress the next 25 years brings.
Footnotes
Acknowledgments
Work in J.B.'s laboratory is supported by the Francis Crick Institute, which receives its core funding from Cancer Research UK (FC001051), the UK Medical Research Council (FC001051), and the Wellcome Trust (FC001051), and by the European Research Council under European Union (EU) Horizon 2020 research and innovation program grant 742138.
Author Disclosure Statement
The author declares there are no competing financial interests.
