Abstract
Abstract
Inborn errors of metabolism (IEM) are genetic diseases caused by mutations in enzymes or transporters affecting specific metabolic reactions that cause a block in the physiological metabolic fluxes. Therapeutic treatment can be achieved either by decreasing the metabolic flux upstream of the block or by increasing the flux downstream of the block. The identification of upstream and downstream fluxes however is not trivial, since metabolic reactions are intertwined in a complex network. To overcome this problem, we propose an innovative computational workflow to model the alteration of metabolism caused by IEM and predict the metabolites and reactions that are affected by the mutation. Our workflow exploits a recent genome-scale metabolic network model of hepatocyte metabolism to identify metabolites accumulating in hepatocytes due to single gene mutations in IEM via an innovative “differential flux analysis.” We simulated 38 IEMs in the liver, and in about half of the cases, our workflow correctly identified the metabolites known to accumulate in the blood and urine of IEM patients.
1. Introduction
These disorders are generally caused by single-gene mutations (monogenic) causing a loss- or gain-of-function1 in the encoded protein (usually an enzyme or a transporter). The accepted explanation for the pathogenesis of IEM diseases is that a mutated enzyme will cause an altered metabolic flux in the same pathway of which it is a part. Therapeutic avenues include dietary restrictions or supplements, enzyme replacement therapy where possible, or substrate reduction therapy (Lanpher et al., 2006). The hypothesis underlying such therapeutic strategies is that treatment can be achieved if the block in the physiological metabolic fluxes caused by an IEM can be restored. This can be achieved either by decreasing the metabolic flux upstream of the block or by increasing the flux downstream of the block. The identification of upstream and downstream fluxes, however, is not as trivial as it may seem since metabolic reactions are intertwined in a complex network, which may give rise to unpredictable behaviors if perturbed even by a single gene mutation.1
Our work builds on the hypothesis that a genome-scale metabolic network model of hepatocyte metabolism (Gille et al., 2010) may be used to identify novel therapeutic targets whose modulation can restore the physiological metabolic fluxes in inborn errors of liver metabolism. Moreover, it may also explain the pathomechanism of IEM, which may have been missed by currently biased approaches and may allow the identification of novel disease biomarkers.
Toward this goal, we first extended the HepatoNet1 model (Gille et al., 2010), which comprises 2539 reactions for 777 metabolites to include enzymes whose function is specifically altered in liver IEM disorders such as Primary HyperOxaluria Type I and II (PH1 and PH2). We then developed an innovative computational workflow to predict the metabolites and reactions that are the most affected by a single gene gain- or loss-of-function mutation typical of inborn errors of liver metabolism (Lanpher et al., 2006).
The proposed workflow, in Figure 1, consists of four steps comprising the entire process of modeling, simulating, and in silico phenotyping of liver IEM, starting from the first genome-scale model of a comprehensive metabolic network of human hepatocytes (Gille et al., 2010). The first step concerns the genome-scale metabolic network reconstruction of liver metabolism. In the second one, flux balance analysis (Orth et al., 2010) is applied to the metabolic network to simulate physiological and pathological metabolic flux distributions. The third step involves a “differential flux analysis” (DFA), which we developed, to identify those metabolites and reactions predicted to be most affected by a gene loss- or gain-of-function. Finally, in order to evaluate the predicted in silico phenotypes, we simulated 38 IEMs affecting hepatocytes resulting from single mutation, and assessed if the metabolites identified by the DFA as the most affected include those metabolites known to be altered in the disease.

Steps of the workflow for studying the effect of gene perturbation on liver metabolism.
We achieved very promising results from the application of our workflow, thus demonstrating for the first time that these genetic disorders can be modeled computationally and that the model can be used to identify new therapeutic targets in an unbiased and inexpensive way.
2. Material and Methods
2.1. Extension of a hepatocyte-specific genome-scale metabolic network model
Current genome-scale metabolic models provide a computational platform to study in silico the metabolic fluxes in a given condition and cell type. Recently, a genome-scale reconstruction of the metabolic network of the human hepatocyte HepatoNet1 (Gille et al., 2010) has been generated. In this network, 777 metabolites and 2539 reactions are arranged in six intracellular and two extracellular compartments, thus providing a model able to simulate a large set of known metabolic liver functions.
We extended HepatoNet1 to include: i) all the reactions and metabolites known to be involved in glyoxylate metabolism, causative of two IEM disorders (Primary Hyperoxaluria Type I and II), starting from published models (Duarte et al., 2007; Ma et al., 2007), public databases (Cerami et al., 2011; Kanehisa and Goto, 2000; Matthews et al., 2009), and literature analysis (Danpure, 2006; Danpure and Jennings, 1986); and ii) transport reactions useful to balance the fluxes in the different compartments. At the end of this step, we obtained a single tissue-specific metabolic model of human hepatocytes, which can be used to study hepatocyte functions. (A list of the new reactions and metabolites added can be found in Table 1).
(c), cytosol; (m), mitochondrial matrix; (p), peroxisome.
In order to validate this extended model, we performed producibility analysis to test that the model is able to produce all the compounds in the glyoxylate metabolism, as well as flux-balance analyses to establish a flux distribution for each of the different metabolic objectives listed in Gille et al (2010).
A metabolite xi is producible by a metabolic network if the network can sustain its synthesis under the steady state and thermodynamic constraints. To test the producibility of xi, we added a reaction rj in the cytoplasmic compartment that consumes xi, and then a flux-balance problem is solved to check if the network can produce strictly positive flux through rj.
2.2. Flux balance analysis and thermodynamic constraint-base modeling
The extended Hepatonet1 network topology can be mathematically described by a stoichiometric matrix
Considering the stoichiometric matrix, the flux balance statement is given by
In the constraint-base modeling, a metabolic network is assumed to optimize a biological objective function. We consider the principle of flux-minimization (Holzhütter, 2004), which states that given the value of relevant target fluxes, the most likely distribution of stationary fluxes within the metabolic network is such that the weighted sum of all fluxes is a minimum. Employing the principle of flux minimization results in the solution of the following constrained linear optimization problem for the calculation of stationary metabolic fluxes:
where wj is the weight associated with vj (in our simulations, fluxes in the objective function are weighted equally by default unless modified in selected cases to reflect differential activity of enzymes with alternative substrates or cofactors), while the equilibrium constants
where
2.3. Simulating wild-type and loss- or gain-of-function metabolic flux distributions
The consequence of an enzyme mutation on the metabolic network can be simulated by solving a flux minimization problem (Subsection 2.2) where the flux through the affected reaction is constrained to zero (in case of a loss-of-function) or to a value greater than zero (in case of a gain-of-function).
Let rj be the reaction catalyzed by an enzyme. In order to simulate the effect of a loss-of-function mutation of this enzyme in l different metabolic conditions, we first need to solve l optimization problems of type (1) to compute the wild-type flux distributions for the l different functions. Secondly, the same flux-balance problems must be solved by constraining the fluxes through vj to zero (loss-of-function mutation), that is,
2.4. Differential flux analysis
We would like to identify those metabolites and reactions that are likely to be most affected by the loss-of-function of an enzyme or transporter. To perform “differential flux analysis,” (DFA) we first apply flux balance analysis in both wild-type and loss-of-function conditions for each of the l different metabolic functions to obtain the matrix of fluxes
with
where δi,j is the element of Δ having indexes i and j. We indicate with
Δ mean is then directly used to obtain a ranked list of fluxes (and hence reactions) arranged in ascending order. In this way, at the top and at the bottom of the list, we will find the reactions having the metabolic flux most affected by the loss-of-function (or gain-of-function) mutation. More in details, the top of the list contains the reactions for which the flux is decreased in the simulated disorder, while in the bottom of the list we find the reactions having an increased flux.
Metabolites are instead ranked by taking into account also the stoichiometry of the network. We thus estimate the impact of an enzymatic defect on a metabolite concentration xi by means of the following index:
where si,j is the element of matrix
The vector
The resulting ranked list Xord may contain the same metabolite associated with different compartments in different positions. In order to study the in silico phenotype, we would like to have only one instance for each metabolite. To this aim, we decided to keep, for each compound, the one having the maximal absolute value of ψxi across all the different compartments.
2.5. Metabolite enrichment analysis
The ranked lists resulting from the previous step allow us to analyze how a loss-of-function changes the metabolic flux distribution. In order to validate whether the in silico results for a given loss-of-function mutation resemble clinically observed phenotypes of the IEM, we applied a variant of the gene set enrichment analysis (GSEA) (Subramanian et al., 2005) to test if metabolites that are clinically known to accumulate due to a loss-of-function mutation occur toward the bottom of the list Xord and vice versa. In order to avoid confusion, we named this statistical analysis “metabolic enrichment analysis” (MEA).
We considered 760 disease-associated metabolite sets comprising 500 different diseases, from a public repository (Xia and Wishart, 2010). These sets are divided into two subcategories based on the biofluids in which they have been measured: i) 414 sets in blood and ii) 346 in urine. Let
After that, we start from the list Xord to obtain two new lists, namely, Xblood and Xurine. The first one is obtained by removing from Xord the metabolites that are not elements of the set Sord ∩ Sblood, while the second one by removing the ones that are not in the set Sord ∩ Surine, where Sord is a set containing all the metabolites in Xord.
Metabolite enrichment analysis is performed by checking whether the disease-associated metabolites for a given IEM tend to be ranked at top (or bottom) of the ranked list Xblood and Xurine obtained by simulating the IEM under investigation. More in details, let us consider the disease-associated metabolite sets in blood (the same approach is applied to disease-associated metabolite sets in urine). For each
We applied a permutation test to asses the statistical significance of
The p-value is assessed by performing a set of permutations and computing the fraction of permutation values that are at least as extreme as the test statistic from the unpermuted data. Then, the enrichment score is computed after a random shuffling of Xblood. Let
The disease sets are ranked in ascending order according to absolute values of the enrichment scores, and then the rank is pruned by removing the sets for which the p-value is above a fixed threshold, usually set as 0.05 or 0.01.
3. Results
3.1. In silico simulation of inborn errors of liver metabolism
We applied our computational workflow to simulate the metabolic phenotypes of 38 IEMs reported in Table 2. For each of the 38 IEMs, flux-balance analysis, as outlined in Subsections 2.2 and 2.3, was applied to the extended HepatoNet1 metabolic network model to compute both wild-type and loss-of-function metabolic flux distributions across 8843 different metabolic objectives, simulating different physiological functions of the hepatocyte.
The KEGG database has been used as reference for the patnwas associated with each disease.
Following FBA, we then applied differential flux analysis (Sec. 2.4 and 2.5) to identify the metabolites predicted to change the most in each of the 38 IEMs due the loss-of-function mutation.
Next, we applied Metabolite Enrichment Analysis (MEA) (Sec. 2.5) to check whether metabolites whose levels are known to be altered in blood or urine of patients are correctly identified as the most changed by the in silico simulations. MEA assigns an enrichment score and a p-value to each one of 500 distinct metabolic sets known to be altered in 500 distinct metabolic disorders (including the 38 IEMs). We deemed a simulated metobolic phenotype correct (true positive), if the p-value of the metabolic set associated with the simulated IEM was significant (i.e., below 0.01 or 0.05).
To assess the performance of our in silico workflow, we used the positive predictive value (PPV): for each of the 38 simulated disorders, we counted the number of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN), and computed the PPV as TP/(TP + FP) and sensitivity as TP/(TP + FP).
Figures 2 and 3 plot the PPV versus rank and PPV versus sensitivity for the ranked list obtained by using, respectively, p–value ≤ 0.05 and p–value ≤ 0.01 to select TPs. As shown in Figure 2, the PPV reaches a maximum of approximately 15% for disease-associated metabolite sets in blood, and of approximately 23% for disease-associated metabolite sets in urine. Moreover, the sensitivities are approximately 47% and 34%, respectively.

Workflow performances on 38 IEMs, considering p-value = 0.05 as threshold to select true positives.

Workflow performances on 38 IEMs, considering p–val = 0.01 as threshold to select true positives.
On the other hand, when the lists are filtered by using 0.01 as thresholds for the p-values, then the PPV significantly increases. In fact, we obtained a maximum of 31% for disease-associated metabolite set in blood and of approximately 38% for disease-associated metabolite sets in urine.
In Tables 3 and 4, we report the ranks of the simulated diseases that are present in the final ranked lists, considering the two p-value thresholds. Due to the fact that some disorders can have different clinical manifestations, we associated more than one rank to some of them.
4. Conclusions and Ongoing Work
In this article, we proposed a computational workflow, based on a genome-scale metabolic model of hepatocytes, to simulate in silico the changes in metabolites observed in patients affected by IEM. In this work, we considered Mendelian disorders, but we would like to point out that our method might have broader applications for the study of other aspects of the metabolism and common human diseases, such as obesity, diabetes, and cancer.
To validate our computational approach, we simulated the in silico phenotype of a representative set of inborn errors of metabolism capturing the wide spectrum of pathophysiology, clinical presentation, and clinical management of these Mendelian disorders. The results here presented prove that our workflow can be a valuable tool to simulate IEM in liver and to identify new therapeutic targets in an unbiased and inexpensive way.
As an ongoing work, we are testing the usefulness of such a system-level approach in automatically identifying the most promising therapeutic targets by focusing on Primary Hyperoxaluria Type I, which is caused by a loss-of-function mutation of the liver-specific AGT enzyme. In the peroxisomes of normal human hepatocytes, this enzyme catalyzes the transamination of the intermediary metabolite glyoxylate to glycine. This is a detoxification reaction because its dysfunction in an IEM known as Primary Hyperoxaluria Type 1 (PH1) allows glyoxylate to escape from the peroxisomes into the cytosol where it is oxidized to oxalate, catalyzed by lactate dehydrogenase, and reduced to glycolate, catalyzed by glyoxylate/hydroxypyruvate reductase (Danpure, 2006). In humans, at least, oxalate cannot be further metabolized, and its increased synthesis and urinary excretion leads to the progressive deposition of insoluble calcium oxalate (CaOx) in the kidney and urinary tract, resulting in various combinations of nephrocalcinosis (diffuse deposition throughout the renal parenchyma) and/or urolithiasis (calculi). This eventually leads to renal failure and a multi systemic disorder due to widespread tissue CaOx accumulation, following which the combined effects of increased oxalate synthesis and failure to remove it from the body results in the deposition of CaOx almost anywhere.
As shown in Table 4, the metabolites known to accumulate in the urine of PH1 patients match very well with the list of metabolites predicted to change the most by a loss-of-function mutation of AGT by our computational method (first rank in the disease sets associated to urine). To analyze in more details the simulated PH1 phenotype, we observed that in the associated ranked lists of metabolites, namely Xurine and Xord, the top two metabolites predicted to increase the most are glycolate and oxalate in the cytosolic compartment; this prediction is in agreement with the known increased conversion rate of glyoxylate to oxalate and glycolate in PH1 patients (Beck and Hoppe, 2006).
Moreover, we observed that metabolites and fluxes involded in the pathways that convert hydroxypyruvate and tryptophan to oxalate are predicted to significantly change. It is known that hydroxypyruvate is a precursor of oxalate (Gambardella and Richardson, 1978; Raghavan and Richardson, 1983), that is, it increases endogenus oxalate via glycolaldehyde → glycolate → glyoxylate → oxalate. Moreover, tryptophan has been shown to be converted to oxalate (Gambardella and Richardson, 1977).
These results show that our proposed methodology is able to perform nontrivial predictions and that it can be ultimately used to identify alternative therapeutic strategies proposing nontrivial substrate reduction therapies or enzymes whose modulation could restore physiological metabolic fluxes in IEM.
Footnotes
Acknowledgments
This work was funded by the European Community's Seventh Framework Programme [FP7/2007-2013] under grant agreement n 259743 (MODHEP) and by the Telethon Foundation Grant TGM11SB1 to DdB.
Author Disclosure Statement
The authors declare that no competing financial interests exist.
1
Gain-of-functions are extremely rare among IEMs.
2
In our model, we consider 2m reactions because a metabolic flux can be positive or negative. Therefore, to deal with non-negative variables, each reaction is decomposed into an irreversible forward one and an irreversible backward one.
