Evolutionary Algorithm for Water Storage Forecasting Response to Climate Change with Small Data Sets: The Wolonghu Wetland, China

Abstract

Genetic programming (GP), a relatively new evolutionary algorithm, was applied to a small data set to predict the water storage of the Wolonghu wetland in northeastern China in response to climate change. Fourteen years (1993–2006) of annual water storage and climatic data of the wetland were used for model training and testing. Results of simulations and predictions show a good fit between calculated and observed water storage (mean absolute percent error=9.47, r=0.99). For comparison, a multilayer perceptron (a popular artificial neural network model) and a Grey theory model were applied to the same data set. The GP technique performed better than the other two methods in both the simulation step and the prediction phase. The case study indicates that the GP method is a promising way for wetland managers to make a quick estimate of fluctuations of water storage in wetlands for which only a small data set is available.

Introduction

Wetland ecosystems depend on water levels (Dawson et al., 2003): low water storage may increase the cover of terrestrial species, which may lead to progressive degradation of the wetland (Fojt and Harding, 1995). Therefore, the ability to predict fluctuations of water storage in wetlands is of great importance for wetland management, especially in the semiarid and arid regions of China.

Grings et al. (2009) used radar remote sensing to explore the water storage of wetlands of the Paraná River Delta in Argentina. However, remote sensing and related techniques based on gamma-ray (Calder and Wright, 1986) and microwave attenuation (Bouten et al., 1996) are expensive to implement (Liu, 1998), although they are capable of monitoring the water level inside marshes on a regional scale (Bach and Mauser, 2003). Bradley (2002) used MODFLOW to simulate the annual water table dynamics of a floodplain with a large amount of weekly data. So far, no physical model has proved well suited to forecasting wetland water storage, for several reasons. The primary reason is the complexity of the contributing factors: local natural and social conditions around a wetland influence the amount of water storage, which increases the difficulty of modeling (Bradley, 2002; Kirk et al., 2004). In addition, mathematical models used for hydrological process modeling generally require a relatively long series of historical data (Trivedi and Singh, 2005). The availability of a large data set for model training and validation is not a problem in developed countries, where hydrological data banks are relatively extensive. In developing countries such as China, however, a large hydrological data set is often unavailable, especially in remote rural areas. Therefore, choosing a suitable approach for water storage prediction under the limitation of a small data set is of considerable practical value.

The artificial neural network (ANN) method, owing to its ability to identify correlations between input data and corresponding target values, has been widely used in hydrological modeling (Abrahart and Kneale, 1997; Dawson and Wilby, 1998; Zealand et al., 1999; Imrie et al., 2000; Baratti et al., 2003; Castellano-Méndez et al., 2004; Riad et al., 2004; Valença et al., 2005; Aqil et al., 2007). This method has also been shown to perform well relative to conventional models. However, a large data set is needed for training when setting up an ANN model, and a small data set usually leads to poor performance. The Grey theory model (GM), a potential tool for modeling with a small data set (as few as four data points), has been successfully used in hydrological process modeling (Trivedi and Singh, 2003; Hao et al., 2006). However, data quality often influences the performance of the method.

Genetic programming (GP) (Koza, 1992), a relatively new evolutionary algorithm, has been considered one of the best methods for modeling complex nonlinear conditions. It has been successfully applied in many fields (Whigham and Crapper, 2001; Kamal and Eassa, 2002; Duyvesteyn et al., 2005; Muttil and Lee, 2005; Lopes, 2007; Makkeasorn et al., 2008). GP can capture the relationship between the inputs and outputs automatically without requiring prior knowledge of the underlying physics, even when the data set is relatively small (Nath et al., 1997; Muttil and Lee, 2005).

In this study, we applied the GP method to estimate the fluctuations of water storage in the Wolonghu wetland, the biggest inland wetland in Liaoning Province, with a small data set. Two other popular forecasting methods, the multilayer perceptron (MLP) and the GM, were applied for comparison with the GP method. The article is organized as follows: first, a description of the study area and data is presented to allow readers to understand the background of the application. Then the key principles of GP are outlined, followed by its application to model the annual water storage of the Wolonghu wetland. Subsequently, the results from the GP model are discussed and compared with those from the MLP and the GM methods. Finally, the conclusions are drawn.

Study Site and the Data Set

The research was carried out in the Wolonghu wetland, which is located in the remote northern part of Liaoning Province in northeast China. Wolonghu wetland is the largest natural inland wetland in Liaoning Province, covering an area of ∼112 km² (Fig. 1). The major functions of the wetland are supporting biodiversity and regulating the water regime. The climate of the region is semiarid. Average annual precipitation and annual pan evaporation are 514 and 1933.2 mm, respectively. The annual mean air temperature is 7.6°C. The total available water resource of the region is 1.71×10⁸ m³, of which 0.43×10⁸ m³ is surface water and 1.28×10⁸ m³ is groundwater. Due to climate change and limited water resources, the wetland has suffered a serious loss of biodiversity in recent years, which has greatly impeded the sustainable development of the ecosystem. To protect this important ecosystem, the local management department must plan an adaptive allocation of water resources that meets the minimum water demand of Wolonghu wetland. Estimating the fluctuations of water storage in the wetland will therefore support this planning.

FIG. 1.

Study area.

In the Wolonghu wetland, precipitation is believed to be the primary source of recharge. Evapotranspiration, underground water recharge, and surface water outflow from the wetland are the main losses of the water in the wetland. In this rural area, gauge stations are sparse and the records of the underground water recharge and surface water outflow data sets needed for model setting are typically small, owing to cost difficulties. Moreover, the earliest annual water storage data of the wetland was recorded in 1993.

Genetic Programming

GP (Koza, 1992), an extension of the genetic algorithm, is a relatively new approach to automatic modeling. It conducts its search in the space of computer programs (candidate solutions of a problem) whose structures are represented by binary trees of varying size and shape. For example, the expression f(x)=x₁ ln(x₂)+cos(x₃) is represented by the program tree in Fig. 2. The computer programs consist of internal nodes F={arithmetical functions, relational functions, etc.} and terminal nodes T={numerical constants, variables, etc.}. GP starts by randomly creating a population of computer programs built from the available nodes in F and T, and each individual in the population represents a function f(x) of the given data. Then, GP genetically evolves the population by using the Darwinian principle of natural selection, in which crossover and mutation are the main operations. Thus, GP provides a way to find a suitable solution to a problem.

FIG. 2.

(a) Genetic programming (GP) tree example and mutation and (b) crossover operation.
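The tree representation described above can be illustrated with a minimal sketch. The nested-tuple encoding, the `FUNCTIONS` table, and the `evaluate` helper are assumptions made for illustration and are not taken from the authors' C program.

```python
import math

# Internal nodes (set F) map a symbol to a function; terminals (set T) are
# variable names or numeric constants. All names here are illustrative.
FUNCTIONS = {
    "+": lambda a, b: a + b,
    "*": lambda a, b: a * b,
    "ln": math.log,
    "cos": math.cos,
}

def evaluate(tree, env):
    """Recursively evaluate a tree given variable bindings in env."""
    if isinstance(tree, tuple):            # internal node: (symbol, child, ...)
        symbol, *children = tree
        return FUNCTIONS[symbol](*(evaluate(c, env) for c in children))
    if isinstance(tree, str):              # terminal: variable name
        return env[tree]
    return float(tree)                     # terminal: numeric constant

# f(x) = x1 * ln(x2) + cos(x3), the example expression from the text
example_tree = ("+", ("*", "x1", ("ln", "x2")), ("cos", "x3"))
value = evaluate(example_tree, {"x1": 2.0, "x2": math.e, "x3": 0.0})
```

With x₁=2, x₂=e, and x₃=0 the expression reduces to 2·1+1, so `value` is 3 up to floating-point rounding.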

The primary operators in GP are as follows:

(1) Selection: Pairs of parent trees are selected based on their fitness values for reproduction.

(2) Crossover: This process is performed by randomly selecting a node and then exchanging the associated subtrees to produce a pair of offspring trees (Fig. 2b).

(3) Mutation: This process is performed by replacing a node selected at random with a newly created subtree (Fig. 2a). This process can prevent the GP model from falling into the local optimum. For more information about selection, crossover, and mutation, refer to Wang and Cao (2002).
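The crossover and mutation operators listed above can be sketched on nested-tuple trees as follows. The encoding and the helper names (`paths`, `get`, `replace`) are hypothetical, and the sketch omits the depth limit that a full implementation would enforce.

```python
import random

def paths(tree, prefix=()):
    """Yield the path (tuple of child indices) to every node in the tree."""
    yield prefix
    if isinstance(tree, tuple):
        for i in range(1, len(tree)):      # index 0 holds the function symbol
            yield from paths(tree[i], prefix + (i,))

def get(tree, path):
    """Return the subtree found by following path from the root."""
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, new_subtree):
    """Return a copy of tree with the node at path swapped for new_subtree."""
    if not path:
        return new_subtree
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], new_subtree),) + tree[i + 1:]

def crossover(parent_a, parent_b, rng):
    """Exchange one randomly chosen subtree between the two parents."""
    pa = rng.choice(list(paths(parent_a)))
    pb = rng.choice(list(paths(parent_b)))
    child_a = replace(parent_a, pa, get(parent_b, pb))
    child_b = replace(parent_b, pb, get(parent_a, pa))
    return child_a, child_b

def mutate(tree, new_subtree, rng):
    """Replace one randomly chosen node with a newly created subtree."""
    return replace(tree, rng.choice(list(paths(tree))), new_subtree)

def size(tree):
    """Count the nodes in a tree."""
    return sum(1 for _ in paths(tree))

rng = random.Random(0)
a = ("+", ("*", "x1", ("ln", "x2")), ("cos", "x3"))
b = ("-", "x1", ("sin", "x2"))
child_a, child_b = crossover(a, b, rng)
mutant = mutate(a, ("exp", "x3"), rng)
```

Because crossover only swaps subtrees, the two offspring together always contain exactly as many nodes as the two parents.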

Before running a GP algorithm, five steps should be performed:

(1) Determine the terminal node set, T;

(2) Determine the internal node set, F;

(3) Determine the fitness function;

(4) Determine parameters and variables for controlling the run; and

(5) Determine criterion for terminating the GP algorithm.

Application

GP algorithm for water storage estimation

In this section, a forecasting model for the water storage of Wolonghu wetland is evolved automatically by the GP algorithm from a small data set. We used real values as the elements of the terminal node set T. Ten functions were chosen for the internal node set F so that the complicated nonlinear relationship between the inputs and the outputs could be represented. Of the set F, four are arithmetic operators (+, −, ×, ÷) and the rest are elementary functions [exp(_), ln(_), cos(_), sin(_), tan(_), cot(_)]. Appropriate GP parameters must also be selected. Good GP performance usually requires a high crossover probability, a low mutation probability, and a moderate population size (Cheng et al., 2002). The parameters used for the GP runs are given in Table 1.

Table 1.

Primary Parameters Used for Genetic Programming Running

Parameter	Value
Function set	+, −, ×, ÷, sin, cos, tan, cot, exp, ln
Terminal set	Real code
Population	200
Crossover rate	0.9
Mutation rate	0.05
Selection	Tournament with elitist strategy
The maximum depth of parse tree	6

In GP, the fitness function is used to evaluate candidate solutions to the problem. In this study, we used Equation (1) as the fitness function:

$$f = \big\| y_t^{(0)}(x) - \hat{y}_t^{(0)}(x) \big\|_2^2 \tag{1}$$

where $y_t^{(0)}(x)$ represents the measured water storage value and $\hat{y}_t^{(0)}(x)$ the simulated value. The optimal model derived by the GP method is the one with the lowest f value.
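As a small worked illustration of Equation (1), the sketch below computes the squared L2 norm of the residuals for the first three training years listed in Table 2. The function name `fitness` is a label chosen here, not one from the authors' program.

```python
def fitness(observed, simulated):
    """Squared L2 norm of the residual vector, as in Eq. (1)."""
    return sum((y - y_hat) ** 2 for y, y_hat in zip(observed, simulated))

# Water storage for 1993-1995 from Table 2 (10^6 m^3)
observed = [62.50, 65.00, 68.00]
simulated = [67.80, 62.48, 63.07]
f = fitness(observed, simulated)   # 5.3^2 + 2.52^2 + 4.93^2
```

For these three years the residuals are −5.3, 2.52, and 4.93, so f is approximately 58.75; a lower f means a closer fit.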

As in the case studies of Levy Prairie (Kirk et al., 2004) and Narborough Bog (Bradley, 2002), in this study, Equation (2) was used to describe the fluctuations of the water storage of Wolonghu wetland:

$$\Delta S = (P + SW_{in}) - (ET + GW_{out} + SW_{out}) \tag{2}$$

where ΔS is the fluctuation of water storage in the wetland, P is precipitation, SW_in is surface water flow into the wetland, ET is evapotranspiration, GW_out is groundwater flow out of the wetland, and SW_out is surface water flow out of the wetland.

Figure 3 is a plot of water storage, precipitation, and evapotranspiration data from 1993 to 2003. The figure indicates that water storage in Wolonghu wetland has declined since 1993. Water storage followed almost the same trend as precipitation and the opposite trend to evapotranspiration. From these observations, we can tentatively conclude that precipitation was the primary recharge to the wetland and that evapotranspiration was the main influence on the loss of water storage. As mentioned above, owing to cost difficulties in this remote rural region, data sets for SW_in, GW_out, and SW_out are unavailable. Therefore, we used the GP algorithm to capture the underlying relationships between water storage, P, and ET. According to the correlation coefficients between water storage and precipitation and evapotranspiration ($r(\hat{y}_t^{(0)}(x), P) = 0.531$ and $r(\hat{y}_t^{(0)}(x), |P - ET|) = 0.648$, computed with SPSS), the relationship can be mathematically expressed as Equation (3):

$$\hat{y}_t^{(0)}(x) = U(|P - ET|) \tag{3}$$

FIG. 3.

Time series of the estimated water storage and recorded evapotranspiration and precipitation.

where U is the functional relationship to be identified by the GP algorithm, t indexes the year, and P and ET represent the annual precipitation and annual evapotranspiration of year t, respectively.

The steps of GP algorithm in this study are as follows:

(1) Randomly generate an initial population of models of the water storage

(2) Run a tournament, which randomly picks two models out of the population, and apply the search operators (selection, crossover, or mutation) to produce an offspring (new model) in the following way:

(a) With Crossover Frequency, apply crossover operation to produce an offspring.

(b) With Mutation Frequency, mutate the models.

(3) Compare all the models and remove the loser based on the fitness measure function [Eq. (1)].

(4) Repeat steps 2 and 3 until the termination criterion has been satisfied.

(5) Display the model with the best fitness evolved from the GP.

The termination criterion in this study is that the lowest fitness function value of Equation (1) remains unchanged after 100 repeats of step 4. Otherwise, the program continues to perform step 4.
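The five steps and the termination rule above can be sketched as a small steady-state GP loop. This is an illustration under stated assumptions rather than the authors' C implementation: the function set is reduced to +, −, ×, sin, and cos to avoid domain errors, the training data are synthetic, each loop iteration is counted as one "repeat", and all helper names are hypothetical. Population size, crossover rate, mutation rate, and maximum depth follow Table 1.

```python
import math
import random

FUNCS = {"+": 2, "-": 2, "*": 2, "sin": 1, "cos": 1}  # reduced set (assumption)
MAX_DEPTH = 6                                          # Table 1

def random_tree(rng, level=1):
    """Grow a random tree whose depth never exceeds MAX_DEPTH."""
    if level >= MAX_DEPTH or (level > 1 and rng.random() < 0.3):
        return "x" if rng.random() < 0.5 else round(rng.uniform(-5.0, 5.0), 3)
    sym = rng.choice(sorted(FUNCS))
    return (sym,) + tuple(random_tree(rng, level + 1) for _ in range(FUNCS[sym]))

def depth(tree):
    return 1 + max(depth(c) for c in tree[1:]) if isinstance(tree, tuple) else 1

def evaluate(tree, x):
    if not isinstance(tree, tuple):
        return x if tree == "x" else tree
    args = [evaluate(c, x) for c in tree[1:]]
    if tree[0] == "+":
        return args[0] + args[1]
    if tree[0] == "-":
        return args[0] - args[1]
    if tree[0] == "*":
        return args[0] * args[1]
    return math.sin(args[0]) if tree[0] == "sin" else math.cos(args[0])

def fitness(tree, xs, ys):
    """Eq. (1): squared L2 norm of the residuals."""
    return sum((y - evaluate(tree, x)) ** 2 for x, y in zip(xs, ys))

def paths(tree, prefix=()):
    yield prefix
    if isinstance(tree, tuple):
        for i in range(1, len(tree)):
            yield from paths(tree[i], prefix + (i,))

def get(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, sub):
    if not path:
        return sub
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], sub),) + tree[i + 1:]

def random_point(tree, rng):
    return rng.choice(list(paths(tree)))

def run_gp(xs, ys, rng, pop_size=200, p_cross=0.9, p_mut=0.05,
           patience=100, max_iter=20000):
    pop = [random_tree(rng) for _ in range(pop_size)]         # step 1
    fit = [fitness(t, xs, ys) for t in pop]
    initial_best = best = min(fit)
    stall = 0
    for _ in range(max_iter):
        i, j = rng.sample(range(pop_size), 2)                 # step 2: tournament
        if fit[i] > fit[j]:
            i, j = j, i                                       # i wins, j loses
        child = pop[i]
        if rng.random() < p_cross:                            # step 2a: crossover
            donor = get(pop[j], random_point(pop[j], rng))
            child = replace(child, random_point(child, rng), donor)
        if rng.random() < p_mut:                              # step 2b: mutation
            child = replace(child, random_point(child, rng),
                            random_tree(rng, MAX_DEPTH - 1))
        improved = False
        if depth(child) <= MAX_DEPTH:                         # reject oversized trees
            f_child = fitness(child, xs, ys)
            if f_child <= fit[j]:                             # step 3: replace loser
                pop[j], fit[j] = child, f_child
                improved = f_child < best
                best = min(best, f_child)
        stall = 0 if improved else stall + 1                  # step 4: repeat
        if stall >= patience:                                 # termination rule
            break
    return pop[fit.index(best)], best, initial_best           # step 5

xs = [0.2 * k for k in range(1, 12)]      # 11 synthetic inputs, not the paper's data
ys = [x * x - 3.0 * x for x in xs]        # synthetic target values
best_tree, best_fit, initial_best = run_gp(xs, ys, random.Random(42))
```

Because an offspring replaces the tournament loser only when it is at least as fit, the best fitness in the population can never get worse, which is one simple way to realize the elitist strategy listed in Table 1.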

Water storage and meteorological data covering the years 1993–2006 were divided into two parts: 11 years of data (1993–2003) were used in the training phase and the remaining 3 years of data (2004–2006) in the testing phase. The GP program was written in C and run on a PC with a 2.4 GHz processor and 1 GB of RAM; the running time was <5 min.

The model automatically derived from the GP algorithm is Equation (4):

$$\hat{y}_t^{(0)}(x) = \tan(-443.834x - 62.24) - x \times (x + \ln|x| - 25.593) \tag{4}$$

where x is the input value of |P−ET|, and $\hat{y}_t^{(0)}(x)$ is the simulated value of the water storage in Wolonghu wetland. From Equation (4), the annual water storage of the wetland can be calculated when precipitation and evapotranspiration data are available. Results of simulated water storage using the GP model are presented in Table 2 and Fig. 4.
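Equation (4) can be transcribed directly into a short function, sketched below. The text does not state the units or any scaling applied to x=|P−ET|, so the function is shown without example inputs from the study; the name `simulated_storage` is a label chosen here.

```python
import math

def simulated_storage(x):
    """Eq. (4): tan(-443.834 x - 62.24) - x (x + ln|x| - 25.593).

    x is |P - ET| in whatever units or scaling the fitted model expects,
    which the text does not specify. x = 0 is outside the domain of ln|x|.
    """
    return math.tan(-443.834 * x - 62.24) - x * (x + math.log(abs(x)) - 25.593)
```

One property worth noting when using such an expression is that the tangent term repeats every π/443.834 ≈ 0.0071 units of x, so small changes in the input can shift the output noticeably.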

FIG. 4.

Results of simulated water storage by GP.

Table 2.

Testing of the GP Model for Simulating Water Storage in Wolonghu Wetland

	$y_t^{(0)}(x)$	$\hat{y}_t^{(0)}(x)$	$\varepsilon(x) = y_t^{(0)}(x) - \hat{y}_t^{(0)}(x)$	$\Delta(x) = |\varepsilon(x)| / y_t^{(0)}(x)$
Year	Raw water storage (10⁶ m³)	Simulated water storage (10⁶ m³)	Residual error (10⁶ m³)	Relative error (%)
1993	62.50	67.80	−5.3	8.48
1994	65.00	62.48	2.52	3.88
1995	68.00	63.07	4.93	7.25
1996	57.00	57.94	−0.94	1.65
1997	55.00	50.37	4.63	8.42
1998	72.00	66.83	5.17	7.18
1999	42.00	47.13	−5.13	12.2
2000	32.93	37.40	−4.47	13.6
2001	19.93	18.04	1.89	9.5
2002	12.70	11.68	1.02	8.0
2003	21.80	24.10	−2.3	10.6
2004	35.19	32.20	2.99	8.5
2005	60.84	55.00	5.84	9.6
2006	46.82	42.00	4.82	10.3

The values for 2004–2006 (set in boldface in the original table) were those used for model testing.

In Table 2, the minimum relative error during the training phase is 1.65% (1996) and the maximum is 13.6% (2000). The simulation curve is in substantial agreement with the measured data (Fig. 4).

Using the data of |P−ET| from 2004 to 2006 as input to Equation (4), the predicted annual water storages of Wolonghu wetland were calculated to be 32.20×10⁶ m³, 55.00×10⁶ m³, and 42.00×10⁶ m³, respectively. The relative errors are 8.50%, 9.60%, and 10.30% for 2004, 2005, and 2006, respectively.

Comparison with other methods

In this section, two other popular methods are applied to the same estimation of water storage in Wolonghu wetland. One is the MLP method, one of the most popular ANN architectures (Castellano-Méndez et al., 2004), and the other is the GM method (Deng, 1989), which has been applied successfully in hydrological modeling (Chiao et al., 1997; Trivedi and Singh, 2003; Hao et al., 2006).

For the MLP, a three-layer structure (see Fig. 5) with the back-propagation learning algorithm was applied. The number of nodes in the hidden layer was chosen from the range $2\sqrt{n}+m$ to $2n+1$ (Fletcher and Goss, 1993), which helps to prevent the MLP from overfitting (Huang and Foo, 2002), where n is the number of input nodes and m is the number of output nodes.

FIG. 5.

Three-layer feed-forward backpropagation multilayer perceptron (MLP) network.
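The hidden-layer sizing rule cited for the MLP can be written as a short helper. Rounding the lower bound up to a whole node is an assumption made here, and the example call with one input and one output node is only illustrative, since the text does not state the number of input nodes used.

```python
import math

def hidden_node_range(n_inputs, n_outputs):
    """Bounds on hidden nodes: from 2*sqrt(n) + m up to 2*n + 1."""
    lower = math.ceil(2 * math.sqrt(n_inputs) + n_outputs)  # rounded up (assumption)
    upper = 2 * n_inputs + 1
    return lower, upper

bounds_single_input = hidden_node_range(1, 1)   # (3, 3)
bounds_two_inputs = hidden_node_range(2, 1)     # (4, 5)
```

With a single input and a single output the two bounds coincide at three hidden nodes; with two inputs the rule allows four or five.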

For the GM method, the GM(1,2) model was set up for water storage forecasting. To get good performance of the GM(1,2) model, two models were set up according to the data division. One was GM(1,2) with data-1, of which data from 1993 to 2003 was used for model setting and data from 2004 to 2006 for model testing. The other was the GM(1,2) model with data-2, of which data from 1999 to 2003 (the data set is smoother than data-1 [see Fig. 3]) was used for model setting and data from 2004 to 2006 for model testing. See Hao et al. (2006) for detailed information of GM(1,2) model construction.
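For readers unfamiliar with Grey models, the sketch below follows the commonly used GM(1,N) formulation with one driver series (N=2): accumulate both series, estimate the parameters a and b by least squares, and apply the usual time-response formula. It is not necessarily the exact construction of Hao et al. (2006). The storage values are those of 1999–2003 from Table 2, but the driver values are hypothetical placeholders because the |P−ET| series is not tabulated in the text.

```python
import math
from itertools import accumulate

def gm12(x1, x2):
    """Fit GM(1,2) on target x1 (length n) with driver x2 (length >= n).

    Returns (simulated x1 for every index of x2, a, b).
    """
    n = len(x1)
    c1 = list(accumulate(x1))                     # accumulated target series
    c2 = list(accumulate(x2))                     # accumulated driver series
    u = [-0.5 * (c1[k] + c1[k - 1]) for k in range(1, n)]  # minus background values
    v = c2[1:n]
    y = x1[1:]
    # Least squares for y = a*u + b*v through the 2x2 normal equations
    suu = sum(p * p for p in u)
    svv = sum(q * q for q in v)
    suv = sum(p * q for p, q in zip(u, v))
    suy = sum(p * t for p, t in zip(u, y))
    svy = sum(q * t for q, t in zip(v, y))
    det = suu * svv - suv * suv
    a = (suy * svv - suv * svy) / det
    b = (suu * svy - suv * suy) / det
    ratio = b / a
    # Time response of the accumulated series, then inverse accumulation
    c1_hat = [x1[0]] + [
        (x1[0] - ratio * c2[k]) * math.exp(-a * k) + ratio * c2[k]
        for k in range(1, len(c2))
    ]
    sim = [c1_hat[0]] + [c1_hat[k] - c1_hat[k - 1] for k in range(1, len(c1_hat))]
    return sim, a, b

storage = [42.00, 32.93, 19.93, 12.70, 21.80]       # 1999-2003, Table 2 (10^6 m^3)
driver = [1.0, 0.8, 0.5, 0.4, 0.6, 0.9, 1.3, 1.1]   # hypothetical driver values
sim, a, b = gm12(storage, driver)
```

The first five entries of `sim` are fitted values for the training span and the remaining three are forecasts driven by the later driver values.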

Evaluation

In this study, two criteria were used to evaluate the performances of the forecasting models.

The first criterion is the mean absolute percent error (MAPE):

$$\mathrm{MAPE} = \frac{1}{n} \sum_{t=1}^{n} \left| \frac{P_t - A_t}{A_t} \right| \tag{5}$$

where P_t is the predicted value at time t, A_t is the measured value at time t, and n is the number of predictions.
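Both criteria can be checked against the testing-phase values in Table 2 with a short sketch. The factor of 100 is an assumption added so that the result is in percent, matching the magnitudes reported in Table 3; the function names are labels chosen here.

```python
import math

def mape(predicted, actual):
    """Eq. (5), multiplied by 100 to express the result in percent."""
    return 100.0 * sum(abs((p - a) / a) for p, a in zip(predicted, actual)) / len(actual)

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Testing phase, 2004-2006 (Table 2, 10^6 m^3)
actual = [35.19, 60.84, 46.82]
predicted = [32.20, 55.00, 42.00]
test_mape = mape(predicted, actual)
test_r = pearson_r(predicted, actual)
```

Computed from the unrounded table values, the MAPE comes out at roughly 9.46, which agrees with the reported 9.47 to within rounding of the yearly relative errors, and r is above 0.99.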

The second criterion is the correlation coefficient r. The performances of the models and the comparative results are presented in Table 3 and Fig. 6.

FIG. 6.

Simulated and estimated water storage by GP, MLP, and Grey theory model [GM(1,2)].

Table 3.

Performance Comparison Between GP, MLP, and GM(1,2) Models for Water Storage Simulation and Prediction

	GP	MLP	GM(1,2) with data-1	GM(1,2) with data-2
Training phase
MAPE	8.25	61.64	791.90	37.31
r	0.98	0	−0.91	0.69
Testing phase
MAPE	9.47	19.77	987.85	88.06
r	0.99	0	−0.83	−0.74

GM, Grey theory model; GP, genetic programming; MAPE, mean absolute percent error; MLP, multilayer perceptron.

In Table 3, the GP method showed the best performance during both the training phase and the testing phase, with MAPE=8.25 and r=0.98 in the training phase and MAPE=9.47 and r=0.99 in the testing phase, whereas the MLP performed with MAPE=61.64 and r=0 in the training phase and MAPE=19.77 and r=0 in the testing phase. The GM(1,2) models also performed poorly compared with the GP model. Of the two GM(1,2) models, the one with data-2 was better than the one with data-1.

In Fig. 6, the simulation and estimation curve of the GP model is in substantial agreement with the measured water storage data, whereas the curve of MLP is a straight line which had no correlation with the measured data. For the GM(1,2) models, the GM(1,2) model with data-1 was much different from the measured data, with high error of MAPE (987.85) in the testing phase. The GM(1,2) model with data-2 behaved better than both the GM(1,2) model with data-1 and the MLP model.

Discussion

The GP algorithm was applied to study the responses of water storage to climate change in Wolonghu wetland in the semiarid region of China. By using the GP algorithm, it is possible to simulate the change of water storage in Wolonghu wetland. The simulation and prediction curve matched well with the measured data under the limitation of a small data set. Based on the values of MAPE and r of the GP method (Table 2), we can conclude that the water storage in Wolonghu wetland is positively correlated with the precipitation and evapotranspiration. It therefore seems reasonable to reemphasize that climate change is likely to have a significant impact on water storage in Wolonghu wetland in the semiarid region of northeastern China.

In the comparison, the reasons for the poor performance of the MLP method and the GM model deserve discussion. For the MLP method, the reason is probably that the network did not converge, which usually occurs when the network is not well trained owing to the unavailability of large data sets for training; the straight line of the MLP in Fig. 6 is consistent with this. Moreover, the back-propagation (BP) algorithm used in the MLP is a local search method. It tends to converge to a local optimum, which causes network training to fail when a global optimum is required. Finally, the choice of an appropriate network for a given case is another problem that has not been well solved so far.

For the GM(1,2) models, the reason for poor performance is probably that the data sets contain several peaks. GM is a potential tool for modeling with a small data set (as few as four data points). However, data quality often influences the performance of the method. In other words, the GM usually performs well when the data series satisfies the Grey exponential law (Li, 1992), and peaks in the data set often lead to poor performance. We divided the data set into two parts (data-1 and data-2); in this study, data-2 was smoother than data-1. In contrast, the GP algorithm has the potential to automatically capture the relationship between the inputs and outputs without prior knowledge of the underlying physics, because it is driven by the errors and evolves through crossover and mutation, which help it to get as close to the global optimal solution as possible even with a small data set.

Conclusions

Water storage is the key factor for a wetland ecosystem (Kirk et al., 2004). In this study, the GP method provided a quick and flexible means of creating a model and was successfully applied to simulate the water storage of Wolonghu wetland in China. Taking |P−ET| as the input and annual water storage as the output, a simple specific model was automatically obtained from the GP algorithm. Using the model [Eq. (4)], the predicted annual water storage in Wolonghu wetland was 32.20×10⁶ m³, 55.00×10⁶ m³, and 42.00×10⁶ m³ in the years 2004, 2005, and 2006, respectively. The relative errors are 8.50%, 9.60%, and 10.30%, respectively. The model predictions are in substantial agreement with the measured data (MAPE=9.47, r=0.99), in contrast with MLP (MAPE=19.77, r=0), GM(1,2) with data-1 (MAPE=987.85, r=−0.83), and GM(1,2) with data-2 (MAPE=88.06, r=−0.74). The comparison indicates that the GP algorithm can be used as a cost-effective and easy-to-use alternative tool for managers to evaluate water storage variation, especially in regions where only small data sets are available. The results also indicate that climate change in the area affected the fluctuations of water storage in the wetland.

Acknowledgments

The authors would like to express cordial thanks to the National Key Project of Major Science and Technology Program for Water Pollution Control and Treatment (2009ZX07528-006-2) in China and Chinese College Research Fund on Vital Projects (No. 705011) for the financial support.

Author Disclosure Statement

No competing financial interests exist.

References

Abrahart R.J., Kneale P.E. 1997. Exploring Neural Network Rainfall-Runoff Modelling. BHS 6th National Symposium: Salford, United Kingdom, 9.35.

Aqil, Kita, Yano, Nishiyama. 2007. Analysis and prediction of flow from local source in a river basin using a neuro-fuzzy modeling tool. J. Environ. Manage., 85:215.

Bach, Mauser. 2003. Methods and examples for remote sensing data assimilation in land surface process modeling. IEEE Trans. Geosci. Remote Sens., 41:1629.

Baratti, Cannas, Fanni, Pintus, Sechi G.M., Toreno. 2003. River flow forecast for reservoir management through neural networks. Neurocomputing, 55:421.

Bouten, Schaap M.G., Aerrs, Vermetten W.M. 1996. Monitoring and modeling canopy water storage amounts in support of atmospheric deposition studies. J. Hydrol., 185:363.

Bradley. 2002. Simulation of annual water table dynamics of a floodplain wetland, Narborough Bog, UK. J. Hydrol., 261:150.

Calder I.R., Wright I.R. 1986. Gamma ray attenuation studies of interception from Sitka spruce: Some evidence for an additional transport mechanism. Water Resour. Res., 22:409.

Castellano-Méndez, González-Manteiga, Febrero-Bande, Prada-Sánchez J.M., Lozano-Calderón. 2004. Modelling of the monthly and daily behavior of the runoff of the Xallas River using Box-Jenkins and neural networks methods. J. Hydrol., 296:38.

Chiao J.H., Wang W.Y., Lu M.J. 1997. A study for applying Grey forecasting to improve the reliability of product. Second National Conference on Grey Theory and Applications, Taiwan, 202.

Dawson C.W., Wilby. 1998. An artificial neural network approach to rainfall-runoff modelling. Hydrol. Sci. J., 43:14.

Dawson T.P., Berry P.M., Kampa. 2003. Climate change impacts on freshwater wetland habitats. J. Nat. Conserv., 11:25.

Deng J.L. 1989. Introduction to Grey system theory. J. Grey Syst., 1:1.

Duyvesteyn, Kaymak. 2005. Genetic programming in economic modelling. Proceedings of the 2005 IEEE Congress on Evolutionary Computation, 1025.

Fletcher, Goss. 1993. Forecasting with neural networks: An application using bankruptcy data. Inf. Manage., 24:159.

Fojt, Harding. 1995. Thirty years of changes in the vegetation communities of three valley mires in Suffolk, England. J. Appl. Ecol., 32:561.

Grings, Salvia, Karszenbaum, Ferrazzoli, Kandus, Perna. 2009. Exploring the capacity of radar remote sensing to estimate wetland marshes water storage. J. Environ. Manage., 90:2189.

Hao Y.H., Yeh T.C.J., Gao Z.Q., Wang Y.R., Zhao. 2006. A Grey system model for studying the response to climate change: The Liulin Karst springs, China. J. Hydrol., 328:668.

Huang, Foo. 2002. Neural network modeling of salinity variation in Apalachicola River. Water Res., 36:356.

Imrie C.E., Durucan, Korre. 2000. River flow prediction using artificial neural networks: Generalisation beyond the calibration range. J. Hydrol., 233:138.

Kamal H.A., Eassa M.H. 2002. Solving curve fitting problems using genetic programming. 11th IEEE Mediterranean Electrotechnical Conference (IEEE MELECON 2002), Cairo, Egypt, May 7–9, 316.

Kirk J.A., Wis W.R., Delfino J.J. 2004. Water budget and cost-effectiveness analysis of wetland restoration alternatives: A case study of Levy Prairie, Alachua County, Florida. Ecol. Eng., 22:43.

Koza J.R. 1992. Genetic Programming: On the Programming of Computers by Means of Natural Selection, 7th. MIT Press: Cambridge, MA.

Li. 1992. Grey System Theory and Application. Scientific and Technological Document Press: Beijing (in Chinese).

Liu. 1998. Estimation of rainfall storage capacity in the canopies of cypress wetlands and slash pine uplands in north-central Florida. J. Hydrol., 207:32.

Lopes H.S. 2007. Genetic programming for epileptic pattern recognition in electroencephalographic signals. Appl. Soft Comput., 7:343.

Makkeasorn, Chang N.B., Zhou. 2008. Short-term water storage forecasting with global climate change implications—A comparative study between genetic programming and neural network models. J. Hydrol., 352:336.

Muttil, Lee J.H.W. 2005. Genetic programming for analysis and real-time prediction of coastal algal blooms. Ecol. Model., 189:363.

Nath, Rajagopalan, Ryker. 1997. Determining the saliency of input variables in neural network classifiers. Comput. Oper. Res., 24:767.

Riad, Mania, Bouchaou, Najjar. 2004. Rainfall-runoff model using an artificial neural network approach. Math. Comput. Model., 40:839.

Trivedi H.V., Singh J.K. 2005. Application of Grey theory in the development of a runoff prediction model. Biosyst. Eng., 92:521.

Valença, Ludermir, Valença. 2005. River flow forecasting for reservoir management through neural networks. Proceedings of the Fifth International Conference on Hybrid Intelligent Systems, Los Alamitos, CA.

Wang X.P., Cao L.M. 2002. Genetic Algorithm: Theory Application and Software Implement. Xian Jiaotong University Press: Xi’an (in Chinese).

Whigham P.A., Crapper P.F. 2001. Modelling rainfall-runoff using genetic programming. Math. Comput. Model., 33:707.

Zealand C.M., Burn D.H., Simonovic S.P. 1999. Short term water storage forecasting using artificial neural networks. J. Hydrol., 214:32.