Abstract
In recent years, the consumption of natural gas in China has increased dramatically: consumption in 2013 was six times that of 2003. Reasonable forecasts of China’s natural gas supply and demand are therefore important for the Chinese government in formulating energy policy. In this paper, a novel grey model named TPGM(1,1) is proposed to simulate and forecast the supply and demand of natural gas in China. Firstly, the unbiased parameter estimation method of TPGM(1,1) is derived using Cramer’s rule; secondly, a method for optimizing the initial value of TPGM(1,1) is deduced; thirdly, TPGM(1,1) models of the output and consumption of natural gas in China are built, and their simulated and predicted results are compared with those of other models using known data. Finally, the supply and demand of natural gas in China during 2014-2020 are forecasted with the novel model, and the results show that 67.61 percent of natural gas consumption in 2020 will depend on foreign imports owing to a surge in domestic demand.
Introduction
Natural gas has become increasingly popular because it burns cleaner than other fossil fuels and has a high heating value. Developing the natural gas industry has become an attractive choice for improving the human environment and promoting sustainable economic development. Natural gas already plays a significant role in global energy consumption and is an important raw material in many industries. Because both the supply and demand sides of natural gas sales must abide by the international rule of ‘take or pay’, the production scale and consumption of natural gas need to be forecasted scientifically and effectively. This is especially important for China in order to balance natural gas supply and demand, and for the authorities to draw up reasonable energy consumption plans and policy measures that promote sustainable economic development.
Common prediction methods include regression models, neural networks, Markov models and support vector machines (SVM). In a regression model, a dependent variable is forecasted by building a functional relation between the dependent variable and the independent variable [1]. Since regression is based on mathematical statistics, it requires large data samples. Moreover, the predicted value of the dependent variable depends on the independent variable, which must itself be forecasted and is therefore uncertain; the predicted value of the dependent variable thus inevitably carries even more uncertainty. Therefore, classical regression models cannot handle uncertain systems characterized by small sample sizes, weak statistical regularity and complicated structural relationships between the dependent variable and the independent variable. A neural network learns a mapping from input to output [2], but its tendency towards ‘overfitting’ often degrades its predictive ability. A Markov model performs well when predicting process states, but it is not suitable for medium- and long-term prediction of a system [3]. An SVM needs a sufficiently large data set for reliable cross-validation [4], so it is more suitable for prediction problems that involve large amounts of data.
The grey prediction model, represented by GM(1,1), is the core of grey system theory [5, 6] and a common method for uncertain prediction problems with ‘small samples and poor information’. It uses a small amount of effective data and grey uncertain information to reveal a system’s future development trend through the accumulating generation of sequences. Grey prediction models have been widely used because they place no strict requirement on sample size and do not require analysis of the complicated functional relations between the dependent variable and the independent variable [7–11]. Over the last thirty years, grey prediction models have been studied in depth from different perspectives in order to enhance their simulation precision and predictive performance. Examples include the preprocessing of modeling data [12], optimization of the model’s initial value [13, 14] and background value [15], improvement of modeling methods and combination of different models [16–22], extension of the modeling target [23], and mathematical proof of model properties and applicable scope [24, 25]. All of these measures have advanced grey prediction theory.
However, the final restored forms of both the GM(1,1) model and the DGM(1,1) model are homogeneous exponential functions, which weakens their ability to simulate non-homogeneous exponential sequences. Hence, building a grey prediction model that can simulate both homogeneous and non-homogeneous exponential sequences is of general significance. Two different types of grey prediction models for approximate non-homogeneous exponential sequences have been built by the authors [26, 27]. One applies a homogeneous transformation to the approximate non-homogeneous exponential sequence, and the other is a direct modeling method that omits the accumulating generation process; both have been shown to simulate non-homogeneous exponential sequences effectively. However, both models assume that the modeling sequence is approximately non-homogeneous, and their simulation precision is unsatisfactory when the modeling sequence is approximately homogeneous. This is the limitation of these two models. In [28, 29], the mean form of the GM(1,1) model was expanded, and two new prediction models that can simulate non-homogeneous exponential sequences were deduced from x(0) (k) + az(1) (k) = kb and x(0) (k) + az(1) (k) = kb - 0.5b + c, respectively. However, both take the parameters a, b estimated from the difference equation as the parameters a, b of their time response functions, and this approximate replacement means that they cannot simulate non-homogeneous exponential sequences exactly. In other words, both models produce simulation errors even for strictly non-homogeneous exponential sequences.
A classical GM(1,1) model contains two parameters, a and b; hence, the GM(1,1) model can be considered a dual-parameter grey prediction model. In this paper, a novel unbiased grey prediction model with three parameters a, b, c is proposed, which we call the three-parameter grey prediction model, or TPGM(1,1) model for short. In the TPGM(1,1) model, the parameters a, b, c are estimated from a difference equation, and its time response function is deduced from the same difference equation. Hence the modeling process of the TPGM(1,1) model avoids the simulation errors resulting from the approximate parameter substitution in [28, 29], and it can be used for unbiased simulation of homogeneous or non-homogeneous exponential sequences.
This paper is organized as follows. In Section 2, we define the common form of the GM(1,1) model, deduce the modeling method of the TPGM(1,1) model, and employ Cramer’s rule to estimate its parameters. In Section 3, we deduce the final restored formula and present the initial value optimization method of the TPGM(1,1) model. In Section 4, we employ the TPGM(1,1) model to simulate and forecast the output and consumption of natural gas in China, and compare its simulation and prediction precisions with those of the GM(1,1) model, ODGM(1,1) model, DDGM(1,1) model, NGM(1,1,k) model and a neural network model. The results show that the TPGM(1,1) model has the best simulation and prediction precisions. Section 5 presents the conclusions. The structure chart of this paper is shown in Fig. 1.
Model derivation
The common form of GM(1,1) model
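Definition 1. Assume that X(0) = (x(0) (1) , x(0) (2) , …, x(0) (n)) is an original sequence and X(1) = (x(1) (1) , x(1) (2) , …, x(1) (n)) is its first-order accumulating generation sequence, where x(1) (k) = x(0) (1) + x(0) (2) + … + x(0) (k), k = 1, 2, …, n; and Z(1) = (z(1) (2) , z(1) (3) , …, z(1) (n)) is the mean sequence generated by consecutive neighbors of X(1),
where z(1) (k) = 0.5 × [x(1) (k) + x(1) (k - 1)], k = 2, 3, …, n. Then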
Parameter estimation of TPGM(1,1) model
From Definition 1,
Then
Then
Then
That is
Let
Then Formula (1) will be transformed as follows,
Let x̂(1) (k) be the simulation value of x(1) (k). To minimize the simulation error, the following condition must be satisfied:
According to the ordinary least squares (OLS) method, we minimize S with respect to the parameters φ1, φ2, φ3 to obtain
From the above formulas, we can obtain Equation (3) as follows:
Parameters φ1, φ2, φ3 are all unknown in Equation (3), and we can obtain the following results according to Cramer’s rule:
If D ≠ 0, the solution of non-homogeneous Equation (3) is as follows:
We can deduce the estimated values of the parameters a, b, c from the relationship between φ1, φ2, φ3 and a, b, c in Formula (1), as follows:
Then, the formula for the simulation value x̂(1) (k) can be transformed into the following,
The restored value of x̂(1) (k) is x̂(0) (k), as shown below, where k = 2, 3, …, n.
From Formula (4), it is easy to see that there are three parameters a, b, c in Formula (5). Hence we call Formula (5) a three-parameter grey prediction model, or TPGM(1,1) for short.
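The estimation and simulation steps above can be sketched in Python. This is a minimal sketch and not the authors' MATLAB program. It assumes that the difference equation behind Formula (1) is x(1) (k) = φ1 x(1) (k - 1) + φ2 k + φ3 and that the underlying basic form is x(0) (k) + az(1) (k) = kb + c; neither equation is reproduced in this text, so both are assumptions. The function names `ago`, `det3`, `estimate_phi`, `phi_to_abc` and `simulate` are hypothetical.

```python
def ago(x0):
    """First-order accumulating generation: x1(k) = x0(1) + ... + x0(k)."""
    x1, total = [], 0.0
    for v in x0:
        total += v
        x1.append(total)
    return x1


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def estimate_phi(x0):
    """OLS estimate of (phi1, phi2, phi3) for the ASSUMED difference equation
    x1(k) = phi1*x1(k-1) + phi2*k + phi3, k = 2..n, solved by Cramer's rule."""
    x1 = ago(x0)
    n = len(x1)
    ks = list(range(2, n + 1))            # 1-based time index, as in the text
    u = [x1[k - 2] for k in ks]           # x1(k-1)
    y = [x1[k - 1] for k in ks]           # x1(k)
    t = [float(k) for k in ks]
    # Normal equations A * phi = rhs obtained by setting dS/dphi_i = 0.
    su, st = sum(u), sum(t)
    sut = sum(a * b for a, b in zip(u, t))
    A = [[sum(a * a for a in u), sut, su],
         [sut, sum(b * b for b in t), st],
         [su, st, float(n - 1)]]
    rhs = [sum(a * b for a, b in zip(u, y)),
           sum(a * b for a, b in zip(t, y)),
           sum(y)]
    D = det3(A)
    if D == 0:
        raise ValueError("D = 0: the normal equations have no unique solution")
    phi = []
    for j in range(3):                    # replace column j by rhs (Cramer's rule)
        Aj = [row[:] for row in A]
        for i in range(3):
            Aj[i][j] = rhs[i]
        phi.append(det3(Aj) / D)
    return tuple(phi)


def phi_to_abc(phi1, phi2, phi3):
    """ASSUMED mapping; valid only if the basic form is x0(k) + a*z1(k) = k*b + c."""
    s = 1.0 + phi1
    return 2.0 * (1.0 - phi1) / s, 2.0 * phi2 / s, 2.0 * phi3 / s


def simulate(x0, phi, steps_ahead=0):
    """Simulate x1 recursively from the initial value x0(1), then restore
    x0_hat(k) = x1_hat(k) - x1_hat(k-1); extra steps give forecasts."""
    phi1, phi2, phi3 = phi
    x1_hat = [x0[0]]
    for k in range(2, len(x0) + steps_ahead + 1):
        x1_hat.append(phi1 * x1_hat[-1] + phi2 * k + phi3)
    return [x1_hat[0]] + [x1_hat[i] - x1_hat[i - 1] for i in range(1, len(x1_hat))]
```

Under these assumptions, a sequence that satisfies the difference equation exactly is reproduced without error, which is the sense in which the estimation is unbiased. Cramer's rule is used here only to mirror the text; a general linear solver would give the same estimates.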
In grey prediction models, x(0) (1) is often used as the initial value to deduce the time response function of x̂(1) (k). In Formula (4), x(0) (1) will be employed to express x̂(1) (k), and the derivation is shown as follows.
When k = 2,
When k = 3
⋯
When k = t
From Formula (6), we can obtain Formula (7),
Formula (7) is simplified as the following,
According to Formula (8), we can deduce x̂(1) (k) for k = 2, 3, …, n, that is,
Rearranging Formula (9), we can obtain Formula (10) as follows.
Formula (10) is called the final restored form of the TPGM(1,1) model. It can be seen from the deduction process that Formula (10) is obtained under the condition that the initial value is x(0) (1). As a result, the fitted curve of the TPGM(1,1) model inevitably passes through the point (1, x(0) (1)) on a coordinate plane. However, according to the ordinary least squares (OLS) method, the best-fitting curve may not pass through the point (1, x(0) (1)). In other words, the theoretical foundation for taking x(0) (1) as the initial value is not sufficient. According to the OLS principle used in econometric models, the initial value should be the one that minimizes the simulation errors of the model.
Assume that the optimal initial value of TPGM(1,1) model is φ, then
Finding the initial value that minimizes the simulation errors is an optimization problem, that is,
According to OLS, to make Q the minimum, φ should satisfy
Rearrange it to get the following formula,
Then the optimal initial value of TPGM(1,1) model φ is the following:
Based on the above, the modeling flow chart of the TPGM(1,1) model is shown in Fig. 2. A MATLAB program for building a TPGM(1,1) model was developed to reduce the computational burden; it is given in the appendix of this paper.
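The initial value optimization described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' MATLAB program: the simulated accumulated values are assumed to follow the recursion x̂(1) (k) = φ1 x̂(1) (k - 1) + φ2 k + φ3 started from the initial value, and Q is assumed to sum the squared errors of the restored values over k = 2, 3, …, n. The formulas for Q are not reproduced in this text, so both points are assumptions, and the function names `restored_values`, `sse` and `optimal_initial_value` are hypothetical. Under these assumptions each restored value is linear in the initial value, so the minimizer has a closed form.

```python
def restored_values(phi, v, n):
    """Restored sequence x0_hat(1..n) for initial value v under the ASSUMED
    recursion x1_hat(k) = phi1*x1_hat(k-1) + phi2*k + phi3, x1_hat(1) = v."""
    phi1, phi2, phi3 = phi
    x1_hat = [v]
    for k in range(2, n + 1):
        x1_hat.append(phi1 * x1_hat[-1] + phi2 * k + phi3)
    return [x1_hat[0]] + [x1_hat[i] - x1_hat[i - 1] for i in range(1, n)]


def sse(x0, phi, v):
    """Q(v): sum of squared restored errors over k = 2..n (assumed range)."""
    x0_hat = restored_values(phi, v, len(x0))
    return sum((h - x) ** 2 for h, x in zip(x0_hat[1:], x0[1:]))


def optimal_initial_value(x0, phi):
    """Closed-form minimizer of Q(v).

    Writing x1_hat(k) = coef(k)*v + g(k), the restored value is
    x0_hat(k) = A_k*v + B_k with A_k = coef(k) - coef(k-1) and
    B_k = g(k) - g(k-1); dQ/dv = 0 gives v = sum(A_k*(x0(k) - B_k)) / sum(A_k^2).
    """
    phi1, phi2, phi3 = phi
    n = len(x0)
    coef, g = [1.0], [0.0]
    for k in range(2, n + 1):
        coef.append(phi1 * coef[-1])
        g.append(phi1 * g[-1] + phi2 * k + phi3)
    num = den = 0.0
    for i in range(1, n):                 # zero-based i corresponds to k = i + 1
        a_k = coef[i] - coef[i - 1]
        b_k = g[i] - g[i - 1]
        num += a_k * (x0[i] - b_k)
        den += a_k * a_k
    if den == 0.0:
        raise ValueError("phi1 = 1: restored values do not depend on the initial value")
    return num / den
```

By construction, `sse(x0, phi, optimal_initial_value(x0, phi))` is never larger than `sse(x0, phi, x0[0])`, which illustrates why fixing the initial value at x(0) (1) is not necessarily the least-squares choice.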
The present situation of output and consumption of natural gas in China
As the second largest energy consumer in the world, China has a long history of using natural gas. Natural gas is environmentally friendly and has a high thermal efficiency and a better performance-price ratio than petrol, diesel and liquefied petroleum gas (LPG). As a result, the output and consumption of natural gas in China have maintained rapid growth. The output and consumption of natural gas between 2002 and 2013 are shown in Table 1.
Based on Table 1, the trends of the output and consumption of natural gas in China over time are plotted in Fig. 3.
It can be seen from Fig. 3 that the output and consumption of natural gas have grown rapidly since 2002, and that output and consumption roughly balanced each other before 2009. After 2009, with the rapid development of the Chinese natural gas industry and the increase in civilian fuel consumption, both output and consumption continued to rise, but the growth rate of output has been significantly lower than that of consumption. Lacking an effective domestic supply of natural gas, China began to import large amounts of natural gas from other countries.
Testing method of model errors
Assume that X(0) = (x(0) (1) , x(0) (2) , …, x(0) (n)) is an original sequence. We employ model α to simulate sequence X(0), and the corresponding simulated sequence is X̂(0) = (x̂(0) (1) , x̂(0) (2) , …, x̂(0) (n)); then the residual sequence is defined as ɛ(0), that is,
The relative error sequence is defined as Δ, that is
Here Δk (k = 1, 2, …, n) is the relative simulation percentage error of the data at point k, and Δ̄ is the mean relative simulation percentage error (MRSPE) of sequence X(0) in model α. Similarly, 1 - Δk is the relative simulation percentage precision at point k, and 1 - Δ̄ is the mean relative simulation percentage precision of sequence X(0) in model α.
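The error measures defined above can be sketched in Python. This is a minimal sketch: it assumes that the residual is the difference between the original and simulated values, that the relative error at point k is the absolute residual divided by the original value and expressed in percent, and that the MRSPE is the arithmetic mean of these errors over all supplied points. The defining formulas are not reproduced in this text, so these are assumptions, and the function names are hypothetical.

```python
def relative_percentage_errors(actual, simulated):
    """Relative simulation percentage error at each point, in percent
    (assumed form: |x0(k) - x0_hat(k)| / |x0(k)| * 100)."""
    if len(actual) != len(simulated):
        raise ValueError("sequences must have the same length")
    return [abs(a - s) / abs(a) * 100.0 for a, s in zip(actual, simulated)]


def mrspe(actual, simulated):
    """Mean relative simulation percentage error over the supplied points."""
    errors = relative_percentage_errors(actual, simulated)
    return sum(errors) / len(errors)


def mean_precision(actual, simulated):
    """Mean relative simulation percentage precision, i.e. 100% minus the MRSPE."""
    return 100.0 - mrspe(actual, simulated)


# Illustrative values only; these are not data from Table 1.
example_actual = [100.0, 200.0]
example_simulated = [110.0, 190.0]
print(relative_percentage_errors(example_actual, example_simulated))
print(mrspe(example_actual, example_simulated))
```

If the first point is fixed by the initial value and therefore simulated exactly, it may be preferable to pass only the points from k = 2 onwards so that the mean is not diluted.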
The TPGM(1,1) model of China’s natural gas output and consumption, or Model 1 for short, will be built according to the statistical data between 2002 and 2012 in this subsection, and we will compute its simulation precision and forecast the output and consumption of natural gas in 2013; after that, we will compare the above predicted result with the real data.
In addition, the derivation of the TPGM(1,1) model shows that it can achieve unbiased simulation of both homogeneous and non-homogeneous exponential sequences. In this subsection, in order to test the simulation and prediction performance of the TPGM(1,1) model, two homogeneous exponential models, namely the GM(1,1) model (Model 2 for short) and the ODGM(1,1) model (Model 3 for short), two non-homogeneous exponential models, namely the DDGM(1,1) model (Model 4 for short) and the NGM(1,1,k) model (Model 5 for short), and a neural network model (Model 6 for short) will also be used to simulate and forecast the output and consumption of China’s natural gas. The simulated and predicted results will be compared and analyzed.
Simulating and forecasting China’s natural gas output
According to the statistical data of natural gas output between 2002 and 2012 in Table 1, the modeling Sequence X(0) is as follows,
TPGM(1,1) model (Model 1). The TPGM(1,1) model of China’s natural gas output and its restored form are, respectively, as follows:
GM(1,1) model (Model 2: homogeneous exponential model)
ODGM model (Model 3: homogeneous exponential model)
DDGM(1,1) model (Model 4: non-homogeneous exponential model)
NGM(1,1,k) model (Model 5: non-homogeneous exponential model)
Neural network model (Model 6). We use the neural network tool ‘nnstart’ of MATLAB R2013a, split the data in Sequence X(0) into training, validation and testing sets in the proportion 70% : 15% : 15%, and set the number of hidden neurons to 10 and the number of delays to 2. The simulation results are shown in Table 2.
The simulation values and errors of China’s natural gas output between 2003 and 2012 with six different models are shown in Table 2.
According to Table 2, the relative simulation percentage errors of China’s natural gas output with six different mathematical models are shown in Fig. 4.
The predictive values and errors of China’s natural gas output in 2013 with six different mathematical models are shown in Table 3.
Simulating and forecasting China’s natural gas consumption
According to the statistical data of China’s natural gas consumption between 2002 and 2012 in Table 1, modeling Sequence X(0) is the following,
TPGM(1,1) model (Model 1). The TPGM(1,1) model of China’s natural gas consumption and its restored form are, respectively, as follows:
GM(1,1) model (Model 2: homogeneous exponential model)
ODGM model (Model 3: homogeneous exponential model)
DDGM(1,1) model (Model 4: non-homogeneous exponential model)
NGM(1,1,k) model (Model 5: non-homogeneous exponential model)
Neural network model (Model 6). We use the neural network tool ‘nnstart’ of MATLAB R2013a, split the data in Sequence X(0) into training, validation and testing sets in the proportion 70% : 15% : 15%, and set the number of hidden neurons to 10 and the number of delays to 2. The simulation results are shown in Table 4.
The simulation values and errors of China’s natural gas consumption between 2003 and 2012 with six different models are shown in Table 4.
According to Table 4, the relative simulation percentage errors of China’s natural gas consumption with six mathematical models are illustrated in Fig. 5.
The predictive values and errors of China’s natural gas consumption in 2013 with six mathematical models can be seen in Table 5.
According to Tables 3 and 5, the comparison histogram of the predicted output and consumption in 2013 with six models is shown in Fig. 6.
According to Tables 3 and 5, the comparison histogram of the predictive percentage errors of China’s natural gas output and consumption in 2013 using six different models is shown in Fig. 7.
Forecasting China’s natural gas output and consumption from 2014 to 2020
The simulated and predicted results for China’s natural gas output and consumption from 2003 to 2012 with six models show that the TPGM(1,1) model proposed in this paper has the best simulation and prediction performance among them. Hence, in this subsection, we employ this model to forecast and analyze China’s natural gas output and consumption from 2014 to 2020.
When k = 13, 14, …, 19, the predictive models of China’s natural gas output and consumption from 2014 to 2020 are as follows:
Output:
Consumption:
According to the above models, the predictive results of China’s natural gas output and consumption are shown in Table 6.
According to Table 6, the evolutionary process and changing trend of China’s natural gas output and consumption from 2002 to 2020 can be seen in Fig. 8.
As shown in Fig. 8, China’s natural gas supply and demand were roughly balanced during 2002-2009. After 2009, however, domestic natural gas output grew only modestly while consumption increased dramatically, and the balance between supply and demand was broken. Consequently, China started to import large amounts of natural gas to meet the growing domestic demand. According to our findings, the gap between domestic supply and demand will become more prominent: China’s natural gas consumption in 2020 is expected to reach 502.80 billion cubic meters, but the predicted output for the same year is only 162.85 billion cubic meters. By then about two-thirds (67.61%) of China’s natural gas demand will need to be imported from overseas. This may be the main reason that China and Russia signed a natural gas trade agreement worth 400 billion US dollars in May 2014.
The main reasons for the sharp increase in China’s natural gas demand are twofold. Firstly, the Chinese government needs to use natural gas, a form of clean energy, to help mitigate increasingly serious environmental pollution; secondly, the rapid development of China’s natural gas chemical industry also contributes significantly to this growing demand.
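The import dependence implied by these two forecasts can be checked with a short calculation. The sketch below uses only the 2020 figures quoted above; it assumes that the whole gap between consumption and output is covered by imports, and the function name `import_dependence` is hypothetical.

```python
def import_dependence(consumption, output):
    """Share of consumption not covered by domestic output, in percent,
    assuming the whole supply gap is met by imports."""
    return (consumption - output) / consumption * 100.0


consumption_2020 = 502.80   # billion cubic meters, forecast quoted in the text
output_2020 = 162.85        # billion cubic meters, forecast quoted in the text

gap = consumption_2020 - output_2020
share = import_dependence(consumption_2020, output_2020)
print(f"gap: {gap:.2f} billion cubic meters")   # prints gap: 339.95 billion cubic meters
print(f"import dependence: {share:.2f}%")       # prints import dependence: 67.61%
```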
Analysis of model performance
In this paper, we use six different mathematical models to simulate and forecast the output and consumption of China’s natural gas, and the findings show that the TPGM(1,1) model proposed in this paper has the best simulation and prediction performance. The main reasons for this superior performance are as follows.
The parameters of the GM(1,1) and NGM(1,1,k) models are estimated from difference equations, but their time response functions are derived from differential equations. Their modeling process therefore gives rise to ‘hopping errors’ caused by the difference between difference equations and differential equations, and this is the main reason that the two models cannot achieve unbiased simulation of homogeneous exponential sequences.
The parameter estimation method of the ODGM(1,1) model, unlike that of the traditional GM(1,1) model, does not have the intrinsic weakness of a ‘hopping error’ from difference equations to differential equations. However, its final restored form is of homogeneous exponential type, and this reduces its simulation and prediction effectiveness for non-homogeneous or approximate non-homogeneous exponential sequences. The simulation and prediction performance of the ODGM(1,1) model deteriorates further when the modeling sequence fluctuates.
When the modeling sequence is already approximately exponential, the DDGM(1,1) model omits the accumulating generation process and builds a grey prediction model directly. Accumulating generation is a method employed to whitenize a grey process, and it plays an extremely important role in any grey system. Through accumulating generation, one can potentially uncover a development tendency in the process of accumulating grey quantities, so that the characteristics and laws of integration hidden in the chaotic original data can be sufficiently revealed. The precision of the DDGM(1,1) model, which omits the accumulating generation process, is therefore not ideal. Furthermore, a neural network learns a mapping from input to output, but its tendency towards ‘overfitting’ often results in decreased predictive ability.
The TPGM(1,1) model is deduced from the fundamental form of the GM(1,1) model (not from differential equations), and its parameters are estimated by OLS. This ensures consistency between the form of the model and its parameters, and solves the hopping error problem that affects NGM(1,1,k) when moving from difference equations to differential equations. In addition, the parameters of the TPGM(1,1) model are estimated from the actual data rather than fixed as constants. Using parameters that adapt to the changing external situation is another reason that the TPGM(1,1) model has better simulation and prediction performance.
Conclusions
A novel unbiased grey prediction model, TPGM(1,1) for short, is proposed in this paper. We have deduced a method for calculating its optimal initial value. We have also resolved the inconsistency between parameter estimation based on difference equations and time response functions based on differential equations.
The TPGM(1,1) model of China’s natural gas output and consumption was built, and its simulation and prediction results were compared with those of other models. The comparison shows that the proposed TPGM(1,1) model has the best simulation and prediction precisions among the models considered. The output and consumption of China’s natural gas during 2014-2020 have been forecasted, and the findings show that China’s natural gas demand will rise to 502.80 billion cubic meters in 2020, whereas the corresponding output will be only 162.85 billion cubic meters. Hence, by 2020, 67.61% of China’s natural gas will need to be imported from overseas. China will become one of the countries with the fastest-growing natural gas consumption.
Acknowledgments
Our work was supported by National Natural Science Foundation of China (Grant No. 71271226), Chongqing Frontier and Applied Basic Research Project (cstc2014jcyjA00024), Postdoctoral Science Foundation of China (Grant No. 2015T80975) and Marie Curie International Incoming Fellowship within the 7th European Community Framework Programme (Grant No. FP7-PIIF-GA-2013-629051). We would like to thank the anonymous referees for their constructive comments that helped to improve the clarity and completeness of this paper.
