Abstract
Abstract
Markov Chain Monte Carlo (MCMC) theory and stochastic simulation techniques were incorporated to analyze the effect of different prior knowledge on quantifying parameter uncertainty and its impact on mass transport in heterogeneous aquifer. The MCMC algorithm employing the Metropolis-Hastings rule (MH-MCMC) was used to obtain the posterior distribution of log-hydraulic conductivity. Random simulation technology, Sequential Gaussian Simulation, was used to generate a spatial stochastic hydraulic conductivity field. We investigated two different assumptive prior knowledge scenarios, a uniform prior distribution and a Gaussian prior distribution. Results showed that the prior knowledge could affect the posterior distributions of parameters. When the Gaussian prior distribution was adopted, there was a better convergence of parametric posterior distribution and a decrease in the zone of uncertainty influence and the area of confidence interval on groundwater mass transport modeling. However, it was difficult to draw the conclusion that the Gaussian prior distribution was preferred because the relative influence of parameter prior distribution depended on the location, number of measurements, and methods to reflect the heterogeneity of hydraulic conductivity. Therefore, the prior distribution is a sensitive input parameter and should be defined based upon best available data.
Introduction
G
Among the mentioned uncertainties, parametric uncertainty, caused by heterogeneity of hydraulic properties, has attracted widespread attention. Many former researchers have used spatial statistical methods to characterize the heterogeneity and uncertainty of hydraulic properties in groundwater modeling applications (Feyen and Caers, 2006). Hydraulic conductivity is considered to be the most important input parameter of groundwater flow models. Its variability in space is considerably higher than other hydraulic properties relevant to groundwater flow and it can vary by orders of magnitude over a few meters (Feyen et al., 2003). Random space function models are often adopted to characterize the spatial variability of hydraulic conductivity (Franssen et al., 2003; Kerrou et al., 2008; Liang et al., 2009). Three parameters, mean (μ), sill (c), and range (h) can be used to generate spatial distribution of hydraulic conductivity based upon measured hydraulic conductivity values. Scarcity of data and lack of detailed information concerning spatial variability make assumptions in uncertainty techniques impacting variability associated with groundwater flow and transport model predictions.
Uncertainty associated with parameter estimation within this context needs to be accounted for, along with impact on the output variability. Many articles have been published on parameter estimation in the random space function. A commonly used methodology for assessing the parametric uncertainty is the Bayesian approach such as the Markov Chain Monte Carlo (MCMC) method (Neuman et al., 2012). Parameter uncertainty through MCMC method is accounted for by introducing a prior probability distribution, which represents historical or expert information before any new data is collected, and a likelihood function, which characterizes the proximity of simulated and observed data. The MCMC approach expresses parameter uncertainties in terms of a probability distribution known as the posterior distribution. The parametric prior distribution is used to generate a posterior distribution to summarize uncertainty and quantify its effects on model prediction (Freni and Mannina, 2010).
The prior distribution is generally random designated due to the deficiency of prior information and the uniform distribution is usually preferred. The prior distribution as an important component of the Bayesian inference can affect the posterior distribution and the subsequent model results. This article combines the MCMC approach with random simulation technology and simultaneously compares the performance of two different prior distributions: a uniform distribution and a Gaussian distribution to quantify the effect of the selection of the prior distribution on the groundwater flow and mass transport models.
Methodology
Groundwater flow and mass transport governing equations
The equation describing groundwater flow and mass transport in a heterogeneous confined aquifer can be written as follows (Zheng and Wang, 1999):
where K is the hydraulic conductivity; h is the hydraulic head of the aquifer; qs is the infiltration thickness per unit time; Ss is specific yield; t is time; xi is the distance along the respective Cartesian coordinate axis; i is the respective Cartesian coordinate axis.
where R is the retardation coefficient; C is the dissolved concentration; n is the porosity of the subsurface medium, dimensionless; t is time; xi is the distance along the respective Cartesian coordinate axis; Dij is the hydrodynamic dispersion coefficient tensor; ui is the seepage or linear pore water velocity; qs is the volumetric flow rate per unit volume of aquifer representing fluid; Cs is the concentration of the source or sink flux sources (positive) and sinks (negative).
Random space simulation method
Sudicky (1986) has found that hydraulic conductivity (K) is log-normally distributed in a heterogeneous aquifer. K(x) is used to stand for the stochastic hydraulic conductivity field and its log form is denoted as Y(x)=log K(x), where x indicates the spatial location. Gaussian stationary random simulation field (RSF) with constant μ and an isotropic exponential two-point covariance function are adopted to quantificationally describe the spatial continuity of Y(x) (Feyen et al., 2003):
where μ is the mean of the log-hydraulic conductivity, V(h) is the two-point covariance function of the process with lag separation vector h, the variance σ2(σ2=V(h=0)), and the correlation function ρ(h). The Equations (3)–(5) indicate that Y(x) is decided by the three parameters: θ(μ, σ2, ϕ).
MCMC approach to estimate parametric uncertainty
In this article, three parameters denoted by θ=(μ, σ2, ϕ) are inferred by the MCMC approach from limited measurements to quantify the uncertainty stemming from the imperfect prior knowledge of parameters in the simulation process.
We assume that the log form of hydraulic conductivity is
where p(Y(x)|y), the posterior distribution, is a function of the likelihood function p(Y(x)|y,θ) and the prior distribution p(θ|y).
A conventional MCMC algorithm employing the Metropolis-Hastings rule (MH-MCMC) was used in this study to solve Equation (6). The basic steps of MH-MCMC are described as follows:
(1) Initialize the first realization, i=0, (2) Generate a new realization according to the Metropolis-Hastings rule: • Draw a candidate realization • Draw the prior probability density function g, the likelihood probability density function L, the transition probability density function based on • Calculate the acceptance probability
• Generate a random number of uniform distribution, u∼U[(0,1)];
• If
(3) i=i+1, return to (2).
After enough iterations, the MCMC chains are converged. Various statistical characteristics, such as the mean and the variance, can be obtained from the parametric posterior distribution.
Uncertainty assessment
Three metrics are defined to measure parameter uncertainty from the difference between the reference and the simulated concentration (Fu and Gómez-Hernández, 2009).
where node is the number of cells, noder is the number of realizations, cm,r is the simulated value at cell m and realization r,
I1 measures the precision of the realizations since it evaluates the ensemble variance over all the cells, I2 measures the average and I3 measures a combination of bias and precision. The smaller the I1, I2, and I3 are, the smaller uncertainty of prediction.
Case study
In this article, a hypothetical steady two-dimensional flow and transport field in the groundwater constructed by Wilson and Miller (1978) were adopted to represent the reality and reflect the solute transport under the generated hydraulic conductivity field. The flow model was surrounded by constant-head boundaries on the west and east borders and no-flow boundaries on the north and south borders of the study area. The hydraulic head values at the constant-head boundaries were arbitrarily chosen so that the plume developed from the point source would not reach the boundary. There was regional flow from west to east across a 460×310 m with depth of a 10 m aquifer. The fluid moved at a Darcy velocity of 0.33 m/day, and the porosity of the porous medium was 0.3. The aquifer was assumed to have heterogeneous and isotropic properties. A point source released a small amount of fluid into the aquifer at 1 m3/day at location (x=160 m, y=160 m). The injected concentration of mass was 1,000 ppm. The aquifer was initially pristine with concentrations everywhere equal to zero. The only source of contaminant was the injection; thus, flow through the inlet had zero concentration. The period of interest was 1 year.
The main research steps were as follows:
First, the same conditioned log K(x) was used to update two kinds of assumptive prior information (a uniform distribution and a Gaussian distribution) to generate the parameter posterior distribution. Next, two hundred set θ were, respectively, chosen from their posterior distribution to yield conditional log K(x) field using sequential Gaussian simulation (sgsim) algorithm, one of the most commonly used forms of stochastic geostatistical simulation. Then, log K(x) field was transformed to K field and used as inputs of groundwater flow and transport model to generate the conditioned concentration field. Finally, the difference between the reference concentration field and the conditioned concentration field was quantified to evaluate the uncertainty of mass transport prediction with Equations (7)–(9) for each iteration of the groundwater model.
Results and Discussion
Results of the reference fields
MODFLOW-96 was used to simulate the steady groundwater flow, and MT3DMS, through the third-order total variation diminishing solution scheme, was used to simulate the mass transport.
Figure 1a shows the reference field of logK(x) generated through the unconditional SGS algorithm. The parameters used to generate the reference hydraulic conductivity were μ=2, σ=0.5, and ϕ=100. The regularly chosen logK(x) to be conditioned from the reference field is shown in Fig. 1b. The reference field of the hydraulic head and the concentration are shown in Fig. 1c–f.

Results of the MCMC approach
In this article, it was arbitrarily assumed that the uniform and Gaussian prior distributions could describe the hydraulic conductivity uncertainty and were thus used to update the parameter posterior distribution. The initial values of μ, σ2, ϕ were, respectively, set as 2, 0.5, 100. In the two scenarios, we ran 20,000 MCMC simulations with q=5 parallel sequences to compute parameter posterior distribution. The Scale Reduction score (

Scale Reduction (
Figure 3 shows the cumulative posterior probability functions (PDFs) of each parameter under different parametric prior distributions. When the uniform prior distribution was used, the posterior distributions of all parameters did not obey the uniform distribution. The posterior distributions of the parameter μ and the parameter φ were still the Gaussian distribution when the Gaussian prior distribution was used.

Cumulative posterior probability functions (PDFs) of each parameter under different parametric prior distributions.
Table 1 presents the summary statistics of the marginal parametric posterior distributions for different prior distributions. The range of the parameter was divided into 25 groups and the median of a group with the highest probability was taken as the optimal value of parameter. Percentiles, for example, 95% confidence interval was 2.5th percentile value, 97.5th percentile value, 90% confidence interval was 5th percentile value, 95th percentile value and the rest may be deduced by analogy, were used to determine the confidence interval of parameter and thus to reflect the parametric uncertainty.
The summary statistics showed that the standard deviation of θ was smaller and the optimal value was closer to the true value (μ=2, σ2=0.5, ϕ=100) when the Gaussian prior distribution was used.
Uncertainty quantification of the hydraulic head and mass transport
The variability in contaminant predictions due to parameter uncertainty of K was propagated by MCMC, and the MCMC was used to predict the concentration distributions of the contaminant. For uniform prior distributions, the simulated hydraulic head contour and the spatial variations of the mass concentration contour for t=100 day, t=200 day, t=300 day are illustrated in Fig. 4a, c, e, and g and for Gaussian prior distributions, the simulated hydraulic head contour and the spatial variations of the mass concentration contour for t=100 day, t=200 day, t=300 day are illustrated in Fig. 4b, d, f, and h. Figure 4a and b shows that the zone of 95% confidence interval of the hydraulic head field was smaller when the Gaussian prior distribution was considered. The zone of 95% confidence interval of the concentration field expanded with the increase of the simulation time along the flow direction with time increasing. Furthermore, the area impacted at the 95% confidence bound for the same simulation time was smaller when the Gaussian prior distribution was used. The values of the three metrics I1, I2, I3 calculated on the hydraulic head and the mass concentration fields are listed in Table 2. By analyzing Table 2, we also noticed that the uncertainty of the hydraulic head and the mass concentration fields was smaller when the Gaussian distribution was adopted. The uncertainty of the mass concentration fields was increased as simulation time increased. The contours and the statistic uncertainty showed that the selection of the prior distribution could impact the hydraulic head and the mass concentration, but only because of different input parameter sampling for each model iteration. There was a smaller effect on the hydraulic head. This indicated that the uncertainty of the hydraulic conductivity was transferred to mass transport and it was magnified. However, it was difficult to draw the conclusion that the Gaussian prior distribution was more suitable because the results obtained depended on many factors, such as the location and the number of the chosen measurements from the reference field, which may affect the simulation result.

Conclusions
In the article, the MCMC approach and stochastic simulation techniques were incorporated to analyze the effect of imperfect prior knowledge on parameter uncertainty (hydraulic conductivity only) and mass transport in a heterogeneous aquifer. Two prior knowledge scenarios, a uniform prior distribution and a Gaussian prior distribution, had been considered. The results showed that the prior knowledge dose affects the posterior distributions of parameters and the simulated hydraulic head and mass transport. Therefore, the prior distribution should be based on actual measurements rather than hypothesized. When the Gaussian prior distribution was adopted, there was a better convergence and a decrease in the zone of uncertainty influence on groundwater mass transport modeling. The above consideration may lead to the conclusion that the Gaussian prior distribution was preferred. However, it was difficult to draw conclusions about the relative influence of parameter prior distribution, as it depended on the location, number of the measurements, and the methods to reflect the heterogeneity of hydraulic conductivity.
Footnotes
Acknowledgments
The study was financially supported by the National Natural Science Foundation of China (51039001, 51009063), the National Foundation of the Three Gorges Project Committee under the State Council (SX2010-026), the Program for Changjiang Scholars and Innovative Research Team in University (IRT0719), and Fundamental research funds for the Central Universities.
Author Disclosure Statement
No competing financial interests exist.
