Abstract
The exploration of pathways and alternative pathways that have a specific function is of interest in numerous chemical contexts. A framework for specifying and searching for pathways has previously been developed, but a focus on which of the many pathway solutions are realizable, or can be made realizable, is missing. Realizable here means that there actually exists some sequencing of the reactions of the pathway that will execute the pathway. We present a method for analyzing the realizability of pathways based on the reachability question in Petri nets. For realizable pathways, our method also provides a certificate encoding an order of the reactions, which realizes the pathway. We present two extended notions of realizability of pathways, one of which is related to the concept of network catalysts. We exemplify our findings on the pentose phosphate pathway. Furthermore, we discuss the relevance of our concepts for elucidating the choices often implicitly made when depicting pathways. Lastly, we lay the foundation for the mathematical theory of realizability.
INTRODUCTION
Large Chemical Reaction Networks (CRNs) are fundamental to numerous scientific, industrial, and societal challenges. Applications include the analysis of metabolic networks and their regulation in health and biotechnology; optimization of chemical synthesis processes; modeling of molecular ion fragmentation in mass spectrometry; investigation of hypotheses concerning the origins of life; and environmental monitoring of pollutants. Subnetworks with specific properties, often referred to as pathways—such as synthetic routes to target molecules or metabolic subsystems—are of particular interest. Thus, the ability to define and identify pathways within a CRN is a central objective in chemical modeling, exploration, and design.
CRNs can be modeled as directed hypergraphs (Zeigarnik, 2000; Müller et al., 2022; Andersen et al., 2019, 2020), where vertices represent molecules and directed hyperedges represent reactions. By considering pathways in CRNs as sets of reactions with integer multiplicities, Andersen et al. (2019) formally defined pathways as integer hyperflows in hypergraphs. The integer hyperflow model for pathways is analogous to flux balance analysis (FBA), another method for pathway discovery. Both approaches enforce mass conservation and typically employ linear constraints to identify pathways. However, they differ in several respects; see Andersen et al. (2019) for a detailed discussion. Notably, FBA yields flux distributions, whereas integer hyperflows provide pathways as sets of reactions with integer stoichiometric coefficients, facilitating a mechanistic understanding of the pathway. Additionally, Andersen et al. (2019) introduced the concept of a chemical transformation motif in a CRN, offering a flexible framework for querying reaction networks for pathways. A chemical transformation motif specifies a pathway by prescribing the input and output compounds, allowing intermediate products that must be consumed entirely. Computationally, finding and enumerating pathways that fulfill a chemical transformation motif can be addressed via Integer Linear Programming (ILP) (Andersen et al., 2019). Although ILP is NP-hard in general and even in the restricted context of finding integer hyperflows in CRNs (Andersen et al., 2012), current ILP solvers perform well for many practically relevant networks and pathways (Andersen et al., 2019).
While integer hyperflows specify reactions and their multiplicities, they do not determine the sequence in which these reactions occur to achieve the overall chemical transformation. Indeed, a valid sequencing may not exist. Figure 6 illustrates such a scenario: No ordering of reactions e1 and e2 in the integer hyperflow renders it executable—essentially, molecules C or D must be present prior to their production. We introduce the term realizable for integer hyperflows where the corresponding chemical transformation can be executed by some sequence of constituent reactions. To address this, we develop a framework that converts integer hyperflows into corresponding Petri nets, enabling the application of Petri net methodologies to express and determine the realizability of integer hyperflows. Petri nets have been extensively employed to model various aspects of metabolic networks (Baldan et al., 2010).
For realizable integer hyperflows, we introduce the concept of a realizability certificate, which specifies an execution order for the reactions along the pathway. Determining an explicit sequence not only enhances mechanistic understanding but is also essential for studies where individual atom identities are important, such as computing atom traces (Andersen et al., 2014). We also explore methods to extend non-realizable integer hyperflows into realizable ones. One approach involves scaling the integer hyperflow, while another entails borrowing additional molecules that are subsequently returned. This latter method is closely related to the concept of a “network catalyst” (see, e.g., Braakman and Smith, 2013; Morowitz et al., 2008). An algorithmic approach to deciding realizability through borrowing thus serves as a crucial foundation for future computational treatments of higher-level chemical motifs like autocatalysis and hypercycles (Eigen, 1971; Eigen and Schuster, 1977; Szathmáry, 1988, 2013). Finally, we apply our methodology to the non-oxidative phase of the pentose phosphate pathway (PPP) to demonstrate its utility and to explore potential catalysts within the network. The PPP is a well-known example that underscores the importance of simplicity in solution finding (Noor et al., 2010; Meléndez-Hevia and Isidoro, 1985).
The primary focus of our article is the formal definition and exploration of the realizability of pathways. It is noteworthy that conventional representations of pathways in the life sciences literature often reside between the two extremes of integer hyperflows and realizability certificates. We believe that our formalization of these concepts can raise awareness of the implicit choices made when depicting pathways. This perspective is further elaborated in Section 5.
The remainder of this article is organized as follows. Section 2 presents the notation and definitions for directed hypergraphs, integer hyperflows, and Petri nets, with terminology following (Andersen et al., 2019). Section 3 defines the realizability problem, outlines our method for converting integer hyperflows into Petri nets, and introduces realizability certificates. In Section 4, we investigate methods for rendering non-realizable integer hyperflows realizable, either by scaling the hyperflow or by borrowing molecules. Section 5 discusses the implications of integer hyperflows and realizability certificates in pathway depiction. Section 6 examines the mathematical properties of pathway realizability.
PRELIMINARIES
Chemical reaction networks and pathways
In this article, we follow Andersen et al. (2019) and model CRNs as directed hypergraphs. A directed hypergraph

A directed hypergraph in
To model pathways (Andersen et al., 2019) defines the extended hypergraph. Given a hypergraph
The hypergraph

Example of an extended hypergraph. It has vertices {A, B, C, D}, edges {e1, e2, e3, e4}, and a half-edge to and from each vertex. An edge e is represented by a box with arrows to (from) each element in e− (e+).
An integer hyperflow is an integer-valued function f on the extended network,
Note in particular that

Example flow f on the extended hypergraph from Figure 2. Vertex D has been omitted as it has no in- or out-flow. Edges leaving or entering D have also been omitted as they have no flow. The flow on an edge is represented by an integer. For example, the half edge into B has flow f(
Petri nets are an alternative method to analyze CRNs. Each molecular species in the network forms a place in the Petri net and each reaction corresponds to a transition (Koch, 2010; Reddy et al., 1993, 1996). The stoichiometric matrix commonly used in chemistry has an equivalent in Petri net terminology, called the incidence matrix (Koch, 2010). In Section 3, we will describe a transformation of a flow to a Petri net. The following notation for Petri nets (with the exception of arc weights) follows Esparza (1998).
A net is a triple (P, T, W) with a set of places P, a set of transitions T, and an arc weight function W : (P × T ) ∪ (T × P) →ℕ0. A marking on a net is a function M : P → ℕ0 assigning a number of tokens to each place. With M∅ we denote the empty marking, that is, M∅( p) = 0, ∀p ∈ P. A Petri net is a pair (N, M0) of a net N and an initial marking M0. For all x ∈ P ∪ T, we define the pre-set as •x = {y ∈ P ∪ T ∣ W (y, x) > 0} and the post-set as x• = {y ∈ P ∪ T ∣ W (x, y) > 0}. We say that a transition t is enabled by the marking M if W (p, t) ≤ M ( p ), ∀p ∈ P. When a transition t is enabled it can fire, resulting in a marking

Example firing sequence. Here P = {p1, p2, p3, p4, p5}, T = {t1, t2, t3}, W = {(p1, t1) ↦1, (p2, t1) ↦ 1, (t1, p3) ↦1, (p3, t2) ↦ 1, (t2, p4) ↦ 1, (p4, t3) ↦ 1, (t3, p5) ↦ 1, (t3, p1) ↦ 1}, and the initial marking M0 = {p1↦1, p2↦1, p3↦0, p4↦0, p5↦0} which is depicted in
Andersen et al. (2019) described a method (summarized in Section 2.1) to specify pathways in CRNs and then proceeded to use ILP to enumerate pathway solutions fulfilling the specification. In this article, we focus on assessing the realizability of such a pathway solution and on determining a specific order of reactions that proves its realizability. To this end, we map flows into Petri nets and rephrase the question of realizability as a particular reachability question in the resulting Petri net.
Flows as Petri nets
We convert a hypergraph
Given a flow, we would like to constrain the Petri net such that it yields only firing sequences for that particular flow. We therefore further convert the extended hypergraph
Transitions in (N, M0) therefore can fire at most the number of times specified by the flow. Furthermore, any firing sequence

The flow from Figure 3 converted to a Petri net with its initial marking. Places are circles, transitions are rectangles, and tokens are black dots. Arrows indicate pairs of places and transitions for which the weight function W is non-zero (in this example, all non-zero weights are equal to one). The target marking is M T (AT) = 1, MT(BT) = 1, MT(CT) = 1 and MT(p) = 0 for all p ∈ P ∖{AT, BT, CT}. We have omitted the part of the net that corresponds to the omitted part of Figure 3.
We are interested in whether a given pathway, represented by a flow f on an extended hypergraph
Figure 6 shows that not all flows f on

Example of a flow which is not realizable. Observe that the flow is indeed viable as it fulfils the flow conservation constraint. Furthermore, notice that there is no input flow to neither C nor D, and therefore in the corresponding Petri net it will not be possible to fire either of e1 or e2 which is necessary for it to be realized. However, if C or D was borrowed the related flow with this borrowing would be realizable.

An example of a flow for the formose reaction which is realizable. The input compound Formald is marked with green and Glycoald which is both an input and output compound is marked with turquoise.
In order to introduce realizability certificates that describe the causal order of the reactions needed to make the pathway realizable, we need some established terminology.
1.
2.
“Occurrence net” is also defined in Genrich and Stankiewicz-Wiechno (1980) and Best and Merceron (1982), but is used with a different meaning in other sources [see, e.g., Nielsen et al. (1981)].
1. q(PK) ⊆ PN and q(TK) ⊆ TN;
2. If C := {x ∈ PK ∣ •x = ∅} then M(p) = ∣ q−1 (p) ∩ C ∣ for all p ∈ PN;
3. WN (p, q(t)) = ∣ q−1(p) ∩ •t ∣ and WN (q(t), p) = ∣ q−1(p) ∩ t• ∣ for all t ∈ TK and p ∈ PN.
A process is thus an occurrence net that maps back to a Petri net, such that it respects the transitions, places and weight function of the Petri net. Furthermore, the process starts at the marking M in the net.
A realizability certificate exists if and only if the target marking MT is reachable from the initial marking M0 (Goltz and Reisig, 1983, Theorem 3.6).
A realizability certificate can be constructed from the initial marking using an algorithm exemplified in (Goltz and Reisig, 1983). Furthermore, the Petri net tool A Low Level Analyzer (LoLA) (Schmidt, 2000) is, given a Petri net with its initial marking and a target marking, able to compute a so-called witness path, which is an object isomorphic to a realizability certificate (or tell if the target marking is unreachable and no realizability certificate exists). The computational complexity of reachability in Petri nets is a complex question in the general case (Mayr, 1981; Reutenauer, 1990). However, in practical cases, LoLA performs well—in particular, in our use cases, it normally finishes in less than 10 minutes. In this article, we used LoLA to produce the underlying certificate for our figures. For an example of a realizability certificate see Figure 8, which is a certificate for the flow in Figure 3. For a more chemical example of a realizability certificate, see Figure 9, which is for the flow in Figure 7. To draw realizability certificates more concisely, we have omitted

A realizability certificate for the flow in Figure 3.

A realizability certificate for the flow in Figure 7. The input compounds are marked with green and the output compounds are marked with blue.
A realizability certificate is a directed acyclic graph (DAG) by Def. 3.2 (1); hence it has a topological sorting (Cormen et al., 2009), that is, a linear ordering of the vertices such that for every edge (u, v), u comes before v in the ordering. Such a topological sorting of a realizability certificate produces one possible firing sequence of its transitions, which realizes the flow.
Finally, we note that a realizability certificate is formulated in the Petri net literature such that it gives an individual token interpretation, where individual tokens are distinguishable (van Glabbeek, 2005). Such a property is an advantage (actually, a necessity) if one is to do atom tracing (Andersen et al., 2014) of stable isotope atoms through the pathway.
We have demonstrated above that flows may not be realizable. In this section, we study various means by which non-realizable flows may be made realizable.
Asking if a flow f is scaled-realizable corresponds to asking if k copies of f can be realized concurrently. This is of interest as in the real world, a pathway is often not just happening once, but multiple times. Therefore, even if the flow is not realizable, it is meaningful to consider if the scaled flow is. Figure 10 is an example of such a flow which is not realizable itself but is scaled-realizable by a factor 2. The flow represents an alternative formose reaction. In order to see that this flow is indeed scaled-realizable, see the realizability certificate of the flow in Figure 11.

An example of a flow for the formose reaction which is not realizable but is scaled-realizable by a factor 2. The input compound Formald is marked with green and Glycoald which is both an input and output compound is marked with turquoise. The SMILES strings for all molecule identifiers are listed in Appendix and Appendix Table A1.

A realizability certificate for the flow in Figure 10 when scaled by a factor 2, making it scaled-realizable. The input compounds are marked with green and the output compounds are marked with blue. The SMILES strings for all molecule identifiers are listed in Appendix and Appendix Table A1.
However, not all flows are scaled-realizable. A counterexample is the flow presented in Figure 6: No integer scaling can alleviate the fact that firing e1 or e2 requires C or D to be present at the outset. We note that Thm. 3 from Sec. 6 provides an easily checkable condition which, if true, implies that a flow is not scaled-realizable.
We denote b as the borrowing function and we say that
The combinatorics underlying the non-oxidative phase of the PPP has been studied in a series of works focusing on simplifying principles that explain the structure of metabolic networks (see, e.g., Noor et al., 2010; Meléndez-Hevia and Isidoro, 1985). An example of a simple flow from the PPP that is not scaled-realizable is shown in Figure 12. Here, the production of glyceraldehyde (Glyald) is dependent of the presence of Hex-2-ulose (2Hex), which depends on fructose 1-phosphate (F1P), which in turn depends on Glyald. This cycle of dependencies by Thm. 3 implies that firing is impossible unless one of the molecules in this cycle is present at the outset, which cannot be achieved by scaling. As illustrated in Figure 13 and proven by the existence of the realizability certificate, the flow is borrow-realizable with just one borrowing, namely of the compound Glyald. Thus Glyald can be seen as a network catalyst for this pathway.

Example of a flow for the pentose phosphate pathway that is not scaled-realizable. The flow is borrow-realizable. The input compound is marked with green and the output compound is marked with blue. The SMILES strings for all molecule identifiers are listed in Appendix and Appendix Table A1.

A realizability certificate for the flow in Figure 12 where the molecule Glyald is borrowed in order to make it borrow-realizable. The input compounds are marked with green, the output compounds are marked with blue and the borrowed compound is marked with purple. The SMILES strings for all molecule identifiers are listed in Appendix and Appendix Table A1.
We have described two ways of modeling pathways: flows and realizability certificates. The realizability certificate defines a causal order in the pathway and explicitly expresses which individual molecule is used when and for which reaction. A realizability certificate uniquely determines a corresponding flow. Flows, on the other hand, do not specify the order of the reactions or which one of multiple copies of a molecule is used in which reaction. A flow therefore may correspond to multiple different realizability certificates, each representing a different mechanism.
We want to point out that commonly used representations of pathways in the life science literature fall in between these two extremes, see Figure 14 for an example. In this example, the order of reactions is not fully resolved—for instance, is F6P produced before E4P or after? Indeed, some unspecified choice of borrowing is needed to set the pathway in motion. Additionally, the semantics of a molecule identifier appearing in several places is unclear—for instance, are the three appearances of G3P interchangeable in the associated reactions or do they signify different individual instances of the same type of molecule? In the former case, the figure corresponds to a much larger number of different realizability certificates than in the latter case. The answers to these questions have important consequences for investigations where the identity of individual atoms matter, such as atom tracing.

Example of a pathway drawing for the cyclic non-oxidative glycolysis (NOG) pathway. Recreated from (Bogorad et al., 2013, Fig. 2a).
Furthermore, when there is a choice between different pathway suggestions, avoiding borrow-realizable pathways often gives simpler depictions. However, this introduces a bias among the possible pathways, which may be unwanted, as borrow-realizable solutions are usually equally simple in chemical terms. We note that the need for borrowing in pathways is usually not discussed in the literature. Additionally, there has been a lack of computational methods to systematically look for borrow-realizable pathways, even if they could equally likely form part of what happens in nature. For instance, the PPP is usually depicted in a form that gives rise to a realizable flow depicted in Figure 15, with a realizability certificate shown in Figure 16. It could just as well be described by the equally simple and chemically realistic borrow-realizable pathway depicted in Figure 12.

A flow for the pentose phosphate pathway, which is realizable. The input compound is marked with green and the output compound is marked with blue. The SMILES strings for all molecule identifiers are listed in Appendix and Appendix Table A1.

A realizability certificate for the flow in Figure 15. The input compounds are marked with green and the output compounds are marked with blue. The SMILES strings for all molecule identifiers are listed in Appendix and Appendix Table A1.
We believe that our focus on the realizability of pathways may help raise awareness of the choices one often subconsciously makes when creating pathway illustrations.
In this section, we take the first steps towards a mathematical theory of the realizability of flows. We begin with a result on realizable flows and prove that if the König representation of the flow-induced subhypergraph of the extended hypergraph and flow f does not have any cycles, then f is realizable.
In short, the König representation of a hypergraph arises simply by considering both the circles and boxes of its visualization (in the style of e.g., Fig. 2) as nodes and the arrows as edges.
When the sweepline reaches v, v has received all its tokens in the flow. Node v only needs to supply tokens after the sweepline has reached v. If v is the last node in the sources of a hyperedge, the hyperedge can fire (i.e. there are still enough tokens on every node in its source).
Here (i) and (iii) are proven together by induction on the sweepline movements and (ii) is true by construction of the firing sequence. □
There exist flows requiring arbitrary scaling factors:

A flow which is not scaled-realizable for an integer k < 4 but is for any integer k ≥ 4.
There also exist flows not scaled-realizable for any factor:

Simple flow that is not scaled-realizable.
We now give an easily checkable condition which if true implies that a flow is not scaled-realizable for any factor. In short, the condition is that at least one vertex of the flow cannot be reached during a graph traversal from its source set.
In more detail: Consider a directed hypergraph
If there exists a vertex
We remark that Thm. 3 only provides a sufficient condition for determining non-scaled-realizable flows and not a necessary condition. This follows from the flow in Figure 18: During graph traversal, this flow will have all its vertices visited, but by Thm. 2 it is not scaled-realizable for any factor.
The property of being scaled-realizable is closed under addition of the scaling factors:
The family of flows from the proof of Thm. 1 has the following interesting property.
A natural question now arises whether all scaled-realizable flows are also monotone scaled-realizable. We did a computer-based search for counter-examples, but found none.
In more detail, we generated several pseudo-random directed hypergraphs in which we found a large number of different flows using the software package MØD (Andersen et al., 2016, 2018), which has a functionality for executing flow queries for hypergraphs via ILP (Andersen et al., 2019). We tested these flows for realizability and among the flows not directly realizable, we looked at those which were scaled-realizable with a smallest scale factor k = 2 or k = 3. If the lowest factor was k = 2, we tested if the flow was also scaled-realizable for factor j = 3. If the lowest factor was k = 3, we tested if the flow was also scaled-realizable for factors j where 3 < j ≤ 5. If so, we by Thm. 4 knew that the flow was monotone scaled-realizable. If not, we would have found a counter-example. Among the 1688 scaled-realizable flows studied, we found them all to be monotone scaled-realizable.
We thus close this section with the following conjecture:
We introduced here a concept of realizability of a pathway given as a flow by converting the flow to a Petri net. The question of realizability can then be rephrased as a question of reachability in the Petri net, leading to notions of realizable, scaled-realizable, and borrow-realizable flows. The method is essential if one is interested in finding alternative realizable pathways to those already known by chemists. Reachability in Petri nets and equivalent formal systems is an active field of research [see, e.g., Alaniz et al. (2022)] and the references therein. Many of the relevant reachability problems
An interesting direction for future research is extending the framework to allow for atom tracing in CRNs. While current Petri net methods allow us to track individual tokens/molecules (van Glabbeek, 2005), full atom tracing requires enumerating all possible firing sequences, that is, all witness paths, which existing Petri net tools do not currently provide. On the other hand, atom-atom mapping, that is, how atoms rearrange during reactions, is already available through an existing graph transformation framework MØD (Andersen et al., 2016, 2018). Such a combination of witness path enumeration and atom-atom mapping is crucial for tracking isotopic labels and understanding reaction mechanisms and would significantly enhance the model’s applicability in systems chemistry, metabolic engineering, and synthetic biology.
Footnotes
AUTHORS’ CONTRIBUTIONS
J.L.A.: Conceptualization, methodology, software, writing—original draft, writing—review and editing, supervision. S.B.: Methodology, writing—original draft, writing—review and editing, visualization. R.F.: Conceptualization, methodology, writing—review and editing, supervision. C.F.: Conceptualization. D.M.: Conceptualization, methodology, writing—review and editing, supervision. P.F.S.: Conceptualization, writing—review and editing.
AUTHOR DISCLOSURE STATEMENT
The authors have no conflict of interest to declare.
FUNDING INFORMATION
This work is supported by the Novo Nordisk Foundation, grant
Appendix
1
When comparing a multiset M and a set S, we view M as a set. I.e., M⊆S holds if every element in M is an element of S.
