Forecasting the relation of supply and demand of natural gas in China during 2015-2020 using a novel grey model

Abstract

In recent years, the consumption of natural gas in China has increased dramatically: consumption in 2013 was six times that in 2003. Hence, reasonable forecasts of the supply and demand of China’s natural gas are of great significance for the Chinese government in formulating energy policy. In this paper, a novel grey model named TPGM(1,1) is proposed to simulate and forecast the supply and demand of natural gas in China. Firstly, the unbiased parameter estimation method of TPGM(1,1) is studied using Cramer’s rule; secondly, the optimization method for the initial value of TPGM(1,1) is deduced; thirdly, the TPGM(1,1) models for the output and consumption of natural gas in China are built, and their simulated and predicted results are compared with those of other models using known data. Finally, the supply and demand of natural gas in China during 2015-2020 are forecasted by the novel model, and the results show that 67.61 percent of the consumption of natural gas in 2020 will depend on foreign imports due to a surge in demand for natural gas in China.

Keywords

Grey prediction model; parameter estimation; TPGM(1,1) model; supply and demand conditions of natural gas in China

1 Introduction

At present, natural gas has become increasingly popular as it burns cleaner than other forms of fossil fuels and has high heating power. Developing the natural gas industry has become a perfect choice for improving the human environment and promoting sustainable economic development. Hence, natural gas already plays a significant role in global energy consumption and is an important raw material in many industries. Because both the supply and demand sides of natural gas sales must abide by the international rule of ‘take or pay’, there is a need to scientifically and effectively forecast the production scale and consumption of natural gas. This is especially important for China to ensure a balance of natural gas supply and demand, and for the authorities to reasonably enact energy consumption plans and policy measures to promote sustainable development of the economy.

Common prediction methods include the regression model, neural network, Markov model and support vector machine (SVM, for short). In regression model, a dependent variable is forecasted by building the functional relation between the dependent variable and the independent variable [1]. Since regression model is based on the mathematical statistics theory, the method requires large samples of data. Meanwhile, the predictive value of the dependent variable is dependent on the independent variable, but the independent variable is also estimated by means of forecasting and the result is uncertain; thus inevitably the predictive value of the dependent variable has even more uncertainty. Therefore, classical regression models cannot handle uncertain systems with the characteristics of small sample sizes, non-obvious statistical regularity and complicated structural relationship between the dependent variable and the independent variable. Neural network accomplishes the mapping function from input to output [2], but the imperfection of ‘overfitting’ often results in the decline of its predictive ability. Markov model has a better performance when predicting process status, but it is not suitable for medium- and long-term prediction of a system [3]. As for SVM, it needs a large enough data set to accomplish a reliable crosscheck [4]; so SVM is more suitable for prediction problems that involve large amounts of data.

Grey prediction model represented by GM(1,1) is the core of grey system theory [5, 6], and is a common method which can be employed to deal with the uncertainty prediction problem of ‘small sample and poor information’. It mainly uses a small amount of effective data and grey uncertain information to reveal a system’s future developmental trend through the accumulation generation process of sequences. Grey prediction model has been widely used because it is not strict with the size of data and does not require analysis of the complicated functional relations between the dependent variable and the independent variable [7–11]. In the last thirty years, in order to enhance the simulation precision and improve the predictive performance of grey prediction model, there has been in-depth and effective study of the model from different perspectives. Examples include the preprocessing of modeling data [12], optimization of the model’s initial value [13, 14] and background value [15], reform and upgrade of modeling method and different models’ combination [16–22], expansion and extension of modeling target [23], mathematical proof of model properties and applicable scope [24, 25]. All of the above measures promote the improvement of the grey prediction theory.

However, be it GM(1,1) model or DGM(1,1) model, their latest restored forms are all shown as homogeneous exponential form, and this weakens their ability to simulate non-homogeneous exponential sequences. Hence, building a grey prediction model which can simultaneously simulate homogeneous and non-homogeneous exponential sequences has a general significance. In fact, two different types of grey prediction models for approximate non-homogeneous exponential sequences have been built by the authors [26, 27]. These are achieved via homogeneous transformation of approximate non-homogeneous exponential sequences and direct modeling method that omits the process of accumulating generation; both have been shown to effectively simulate non-homogeneous exponential sequences. However, the above two models assumed that the modeling sequences have approximate non-homogeneous characteristic, and the simulation precisions of these two models are unsatisfactory when a modeling sequence shows the characteristic of being approximate homogeneous. This shows the limitations of the above two models. In literature [28, 29], the mean form of GM(1,1) model was expanded, and two new prediction models that can accomplish the simulation of non-homogeneous exponential sequences were deduced, according to x⁽⁰⁾ (k) + az⁽¹⁾ (k) = kb and x⁽⁰⁾ (k) + az⁽¹⁾ (k) = kb - 0.5b + c, respectively. However, they both take parameters a, b estimated by the difference equation as parameters a, b of their time response functions, and this approximate replacement means they cannot achieve the complete simulation for non-homogeneous exponential sequences. In other words, the two types of models produce simulation errors in the case of rigorous non-homogeneous exponential sequences.

A classical GM(1,1) model contains two parameters, a and b; hence, the GM(1,1) model can be considered a dual-parameter grey prediction model. In this paper, a novel unbiased grey prediction model including three parameters a, b, c is proposed, and we call it a three-parameter grey prediction model, or TPGM(1,1) model for short. In TPGM(1,1) model, parameters a, b, c are estimated by a difference equation, and the time response function of the model is also deduced from the same difference equation. Hence the modeling process of TPGM(1,1) model avoids the simulation errors resulting from the approximate parameter substitution in literature [28, 29], and it can be used for unbiased simulation of homogeneous or non-homogeneous exponential sequences.

This paper is organized as follows. In Section 2, we define the common form of GM(1,1) model and deduce the modeling method of TPGM(1,1) model. Then we employ Cramer’s rule to estimate the parameters of TPGM(1,1) model. In Section 3, we deduce the final restored formula and show the initial value optimization method of TPGM(1,1) model. In Section 4, we employ TPGM(1,1) model to simulate and forecast the output and consumption of natural gas in China, and compare its simulation and prediction precisions with GM(1,1) model, ODGM(1,1) model, DDGM(1,1) model, NGM(1,1,k) model and Neural Network model. The results show that the TPGM(1,1) model has the best simulation and prediction precisions. In Section 5, some conclusions are provided. The structure chart of this paper is shown in Fig. 1.

2 Model derivation

2.1 The common form of GM (1, 1) model

Definition 1. Assume that an original non-negative sequence is X⁽⁰⁾ = (x⁽⁰⁾ (1) , x⁽⁰⁾ (2) , …, x⁽⁰⁾ (n)) and X⁽¹⁾ is the 1-AGO sequence of X⁽⁰⁾, that is, $X^{(1)} = (x^{(1)} (1), x^{(1)} (2), \dots, x^{(1)} (n)),$

where $x^{(1)} (k) = \sum_{i = 1}^{k} x^{(0)} (i)$ , k = 1, 2, …, n; and Z⁽¹⁾ is the mean sequence generated by consecutive neighbors of X⁽¹⁾, that is, $Z^{(1)} = (z^{(1)} (2), z^{(1)} (3), \dots, z^{(1)} (n)),$

where z⁽¹⁾ (k) = 0.5 × [x⁽¹⁾ (k) + x⁽¹⁾ (k - 1)], k = 2, 3, …, n. Then $x^{(0)} (k) + a z^{(1)} (k) = 0.5 (2 k - 1) b + c$ is the common form of GM(1,1) model [20].
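The two generated sequences in Definition 1 can be illustrated with a minimal Python sketch. The function names `ago` and `mean_sequence` are hypothetical helpers introduced here for illustration only, and the input is simply the first four values of the output sequence used later in Section 4.3.

```python
from itertools import accumulate


def ago(x0):
    """1-AGO sequence: x1(k) = x0(1) + ... + x0(k), for k = 1..n."""
    return list(accumulate(x0))


def mean_sequence(x1):
    """Mean sequence: z1(k) = 0.5 * (x1(k) + x1(k-1)), for k = 2..n."""
    return [0.5 * (x1[j] + x1[j - 1]) for j in range(1, len(x1))]


# Illustrative input: first four values of the output sequence in Section 4.3.
x0 = [32.7, 35.0, 41.5, 49.3]
x1 = ago(x0)            # cumulative sums of x0
z1 = mean_sequence(x1)  # one element shorter than x1, since k starts at 2
print(x1)
print(z1)
```

Note that Z⁽¹⁾ has n - 1 elements, which is why every sum in the parameter estimation below runs from k = 2.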

2.2 Parameter estimation of TPGM (1, 1) model

From Definition 1, $\left\{ \begin{matrix} x^{(0)} (k) + a z^{(1)} (k) = 0.5 (2 k - 1) b + c \\ x^{(0)} (k) = x^{(1)} (k) - x^{(1)} (k - 1) \\ z^{(1)} (k) = 0.5 (x^{(1)} (k) + x^{(1)} (k - 1)) \end{matrix} \right.$

Then $\begin{matrix} x^{(1)} (k) - x^{(1)} (k - 1) + 0.5 {ax}^{(1)} (k) \\ + 0.5 {ax}^{(1)} (k - 1) = 0.5 (2 k - 1) b + c \end{matrix}$

Then $\begin{matrix} (1 + 0.5 a) x^{(1)} (k) - (1 - 0.5 a) x^{(1)} (k - 1) \\ = 0.5 (2 k - 1) b + c \end{matrix}$

Then $\begin{matrix} (1 + 0.5 a) x^{(1)} (k) = 0.5 (2 k - 1) b \\ + c + (1 - 0.5 a) x^{(1)} (k - 1) \end{matrix}$

That is

$\begin{matrix} x^{(1)} (k) & = & \frac{1 - 0.5 a}{1 + 0.5 a} x^{(1)} (k - 1) + \frac{b}{1 + 0.5 a} k \\ & & - \frac{0.5 b - c}{1 + 0.5 a}, \quad k = 2, 3, \dots, n \end{matrix}$ (1)

Let $φ_{1} = \frac{1 - 0.5 a}{1 + 0.5 a}, φ_{2} = \frac{b}{1 + 0.5 a}, φ_{3} = - \frac{0.5 b - c}{1 + 0.5 a} = \frac{c - 0.5 b}{1 + 0.5 a}$

Then Formula (1) will be transformed as follows,

$\begin{matrix} x^{(1)} (k) & = & φ_{1} x^{(1)} (k - 1) + φ_{2} k + φ_{3} \\ k = 2, 3, \dots, n \end{matrix}$ (2)

Let ${\hat{x}}^{(1)} (k)$ be the simulation value of x⁽¹⁾ (k), to minimize the simulation error, it needs to satisfy the following condition: $\begin{matrix} S & = & min \sum_{k = 2}^{n} {[x^{(1)} (k) - {\hat{x}}^{(1)} (k)]}^{2} \\ = & min \sum_{k = 2}^{n} {[x^{(1)} (k) - {\hat{φ}}_{1} x^{(1)} (k - 1) - {\hat{φ}}_{2} k - {\hat{φ}}_{3}]}^{2} . \end{matrix}$

According to the Ordinary Least Square (OLS) method, we minimize S with respect to parameters φ₁, φ₂, φ₃ by setting the partial derivatives to zero: $\left\{ \begin{matrix} \frac{\partial S}{\partial {\hat{φ}}_{1}} = - 2 \sum_{k = 2}^{n} [x^{(1)} (k) - {\hat{φ}}_{1} x^{(1)} (k - 1) - {\hat{φ}}_{2} k - {\hat{φ}}_{3}] x^{(1)} (k - 1) = 0 \\ \frac{\partial S}{\partial {\hat{φ}}_{2}} = - 2 \sum_{k = 2}^{n} [x^{(1)} (k) - {\hat{φ}}_{1} x^{(1)} (k - 1) - {\hat{φ}}_{2} k - {\hat{φ}}_{3}] k = 0 \\ \frac{\partial S}{\partial {\hat{φ}}_{3}} = - 2 \sum_{k = 2}^{n} [x^{(1)} (k) - {\hat{φ}}_{1} x^{(1)} (k - 1) - {\hat{φ}}_{2} k - {\hat{φ}}_{3}] = 0 \end{matrix} \right.$

From the above formulas, we can obtain Equation (3) as follows: $\left\{ \begin{matrix} {\hat{φ}}_{1} \sum_{k = 2}^{n} [x^{(1)} (k - 1)]^{2} + {\hat{φ}}_{2} \sum_{k = 2}^{n} k x^{(1)} (k - 1) + {\hat{φ}}_{3} \sum_{k = 2}^{n} x^{(1)} (k - 1) = \sum_{k = 2}^{n} x^{(1)} (k) x^{(1)} (k - 1) \\ {\hat{φ}}_{1} \sum_{k = 2}^{n} k x^{(1)} (k - 1) + {\hat{φ}}_{2} \sum_{k = 2}^{n} k^{2} + {\hat{φ}}_{3} \sum_{k = 2}^{n} k = \sum_{k = 2}^{n} k x^{(1)} (k) \\ {\hat{φ}}_{1} \sum_{k = 2}^{n} x^{(1)} (k - 1) + {\hat{φ}}_{2} \sum_{k = 2}^{n} k + {\hat{φ}}_{3} (n - 1) = \sum_{k = 2}^{n} x^{(1)} (k) \end{matrix} \right.$ (3)

Parameters φ₁, φ₂, φ₃ are all unknown in Equation (3), and we can obtain the following results according to Cramer’s rule, where D is the coefficient determinant and D₁, D₂, D₃ are obtained by replacing its first, second and third column, respectively, with the right-hand side of Equation (3): $\begin{matrix} D = \left| \begin{matrix} \sum_{k = 2}^{n} [x^{(1)} (k - 1)]^{2} & \sum_{k = 2}^{n} k x^{(1)} (k - 1) & \sum_{k = 2}^{n} x^{(1)} (k - 1) \\ \sum_{k = 2}^{n} k x^{(1)} (k - 1) & \sum_{k = 2}^{n} k^{2} & \sum_{k = 2}^{n} k \\ \sum_{k = 2}^{n} x^{(1)} (k - 1) & \sum_{k = 2}^{n} k & n - 1 \end{matrix} \right| \\ D_{1} = \left| \begin{matrix} \sum_{k = 2}^{n} x^{(1)} (k) x^{(1)} (k - 1) & \sum_{k = 2}^{n} k x^{(1)} (k - 1) & \sum_{k = 2}^{n} x^{(1)} (k - 1) \\ \sum_{k = 2}^{n} k x^{(1)} (k) & \sum_{k = 2}^{n} k^{2} & \sum_{k = 2}^{n} k \\ \sum_{k = 2}^{n} x^{(1)} (k) & \sum_{k = 2}^{n} k & n - 1 \end{matrix} \right| \\ D_{2} = \left| \begin{matrix} \sum_{k = 2}^{n} [x^{(1)} (k - 1)]^{2} & \sum_{k = 2}^{n} x^{(1)} (k) x^{(1)} (k - 1) & \sum_{k = 2}^{n} x^{(1)} (k - 1) \\ \sum_{k = 2}^{n} k x^{(1)} (k - 1) & \sum_{k = 2}^{n} k x^{(1)} (k) & \sum_{k = 2}^{n} k \\ \sum_{k = 2}^{n} x^{(1)} (k - 1) & \sum_{k = 2}^{n} x^{(1)} (k) & n - 1 \end{matrix} \right| \\ D_{3} = \left| \begin{matrix} \sum_{k = 2}^{n} [x^{(1)} (k - 1)]^{2} & \sum_{k = 2}^{n} k x^{(1)} (k - 1) & \sum_{k = 2}^{n} x^{(1)} (k) x^{(1)} (k - 1) \\ \sum_{k = 2}^{n} k x^{(1)} (k - 1) & \sum_{k = 2}^{n} k^{2} & \sum_{k = 2}^{n} k x^{(1)} (k) \\ \sum_{k = 2}^{n} x^{(1)} (k - 1) & \sum_{k = 2}^{n} k & \sum_{k = 2}^{n} x^{(1)} (k) \end{matrix} \right| \end{matrix}$

If D ≠ 0, the solution of non-homogeneous Equation (3) is as follows: ${\hat{φ}}_{1} = \frac{D_{1}}{D}, {\hat{φ}}_{2} = \frac{D_{2}}{D}, {\hat{φ}}_{3} = \frac{D_{3}}{D}$

We can deduce the estimated values of parameters a, b, c from the relationship between φ₁, φ₂, φ₃ and a, b, c. Since $1 + 0.5 \hat{a} = \frac{2}{1 + {\hat{φ}}_{1}}$, $\hat{b} = {\hat{φ}}_{2} (1 + 0.5 \hat{a})$ and $\hat{c} = 0.5 \hat{b} + {\hat{φ}}_{3} (1 + 0.5 \hat{a})$, we have $\hat{a} = \frac{2 (1 - {\hat{φ}}_{1})}{1 + {\hat{φ}}_{1}}, \hat{b} = \frac{2 {\hat{φ}}_{2}}{1 + {\hat{φ}}_{1}}, \hat{c} = \frac{{\hat{φ}}_{2} + 2 {\hat{φ}}_{3}}{1 + {\hat{φ}}_{1}} .$
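The estimation procedure of this subsection can be sketched in plain Python as follows. This is an illustrative sketch, not the MATLAB program mentioned in the paper; the function names `det3`, `estimate_phi` and `phi_to_abc` are hypothetical. The conversion to a, b, c assumes that the constant term of the recursion is -(0.5b - c)/(1 + 0.5a), as written in Formula (4). The synthetic check at the end builds a sequence that satisfies the recursion exactly and confirms that the parameters are recovered.

```python
from itertools import accumulate


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def estimate_phi(x0):
    """Solve the normal equations (3) for (phi1, phi2, phi3) by Cramer's rule."""
    x1 = list(accumulate(x0))  # 1-AGO sequence
    n = len(x1)
    # One tuple (k, x1(k-1), x1(k)) per k = 2..n.
    rows = [(k, x1[k - 2], x1[k - 1]) for k in range(2, n + 1)]
    s_pp = sum(p * p for _, p, _ in rows)
    s_kp = sum(k * p for k, p, _ in rows)
    s_p = sum(p for _, p, _ in rows)
    s_kk = sum(k * k for k, _, _ in rows)
    s_k = sum(k for k, _, _ in rows)
    s_cp = sum(cur * p for _, p, cur in rows)
    s_kc = sum(k * cur for k, _, cur in rows)
    s_c = sum(cur for _, _, cur in rows)
    coeff = [[s_pp, s_kp, s_p], [s_kp, s_kk, s_k], [s_p, s_k, n - 1]]
    rhs = [s_cp, s_kc, s_c]
    big_d = det3(coeff)
    if big_d == 0:
        raise ValueError("D = 0: Cramer's rule does not apply")
    phis = []
    for col in range(3):
        replaced = [row[:] for row in coeff]
        for r in range(3):
            replaced[r][col] = rhs[r]  # swap column `col` for the right-hand side
        phis.append(det3(replaced) / big_d)
    return tuple(phis)


def phi_to_abc(phi1, phi2, phi3):
    """Map (phi1, phi2, phi3) back to (a, b, c).

    Assumption: phi3 = (c - 0.5*b) / (1 + 0.5*a), i.e. the sign used in Formula (4).
    """
    a = 2 * (1 - phi1) / (1 + phi1)
    b = 2 * phi2 / (1 + phi1)
    c = (phi2 + 2 * phi3) / (1 + phi1)
    return a, b, c


# Synthetic check with made-up parameters (not data from the paper).
a_true, b_true, c_true = -0.1, 2.0, 5.0
den = 1 + 0.5 * a_true
phi_true = ((1 - 0.5 * a_true) / den, b_true / den, (c_true - 0.5 * b_true) / den)
x1_exact = [10.0]
for k in range(2, 9):
    x1_exact.append(phi_true[0] * x1_exact[-1] + phi_true[1] * k + phi_true[2])
x0_exact = [x1_exact[0]] + [x1_exact[j] - x1_exact[j - 1] for j in range(1, len(x1_exact))]
phi_hat = estimate_phi(x0_exact)
abc_hat = phi_to_abc(*phi_hat)
print(phi_hat)
print(abc_hat)
```

Cramer's rule is used here only to mirror the text; for longer or noisier sequences a standard least-squares solver would be a reasonable substitute.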

Then, the calculation formula of simulation value of ${\hat{x}}^{(1)} (k)$ can be transformed to the following,

$\begin{matrix} {\hat{x}}^{(1)} (k) & = & \frac{1 - 0.5 \hat{a}}{1 + 0.5 \hat{a}} {\hat{x}}^{(1)} (k - 1) + \frac{\hat{b}}{1 + 0.5 \hat{a}} k \\ - \frac{0.5 \hat{b} - \hat{c}}{1 + 0.5 \hat{a}} . k = 2, 3, \dots, n \end{matrix}$ (4)

The restored value ${\hat{x}}^{(0)} (k)$ is obtained from ${\hat{x}}^{(1)} (k)$ by first-order differencing, as shown below, where k = 2, 3, …, n. ${\hat{x}}^{(0)} (k) = {\hat{x}}^{(1)} (k) - {\hat{x}}^{(1)} (k - 1)$ (5)

From Formula (4), it is easy to note that there are three parameters $\hat{a}$ , $\hat{b}$ , $\hat{c}$ in Formula (5). Hence we call Formula (5) a three-parameter grey prediction model, or TPGM(1,1) for short.

3 Optimization of initial values of TPGM (1, 1) model

In grey prediction model, $x^{(0)} (1) = x^{(1)} (1) = {\hat{x}}^{(1)} (1)$ is often used as the initial value to deduce the time response function of ${\hat{x}}^{(1)} (k)$ . In Formula (4), x⁽⁰⁾ (1) will be employed to express ${\hat{x}}^{(1)} (k)$ , and the derivation is shown as follows.

When k = 2, $\begin{matrix} {\hat{x}}^{(1)} (2) = & (\frac{1 - 0.5 \hat{a}}{1 + 0.5 \hat{a}}) x^{(0)} (1) \\ + (\frac{2 \hat{b}}{1 + 0.5 \hat{a}} - \frac{0.5 \hat{b} - \hat{c}}{1 + 0.5 \hat{a}}) \end{matrix}$

When k = 3, $\begin{matrix} {\hat{x}}^{(1)} (3) = & {(\frac{1 - 0.5 \hat{a}}{1 + 0.5 \hat{a}})}^{2} x^{(0)} (1) \\ & + \frac{1 - 0.5 \hat{a}}{1 + 0.5 \hat{a}} (\frac{2 \hat{b}}{1 + 0.5 \hat{a}} - \frac{0.5 \hat{b} - \hat{c}}{1 + 0.5 \hat{a}}) \\ & + (\frac{3 \hat{b}}{1 + 0.5 \hat{a}} - \frac{0.5 \hat{b} - \hat{c}}{1 + 0.5 \hat{a}}) \end{matrix}$

⋯

When k = t,

$\begin{matrix} {\hat{x}}^{(1)} (t) = & x^{(0)} (1) {(\frac{1 - 0.5 \hat{a}}{1 + 0.5 \hat{a}})}^{t - 1} \\ & + (\frac{2 \hat{b}}{1 + 0.5 \hat{a}} - \frac{0.5 \hat{b} - \hat{c}}{1 + 0.5 \hat{a}}) {(\frac{1 - 0.5 \hat{a}}{1 + 0.5 \hat{a}})}^{t - 2} \\ & + (\frac{3 \hat{b}}{1 + 0.5 \hat{a}} - \frac{0.5 \hat{b} - \hat{c}}{1 + 0.5 \hat{a}}) {(\frac{1 - 0.5 \hat{a}}{1 + 0.5 \hat{a}})}^{t - 3} \\ & + \dots \\ & + (\frac{t \hat{b}}{1 + 0.5 \hat{a}} - \frac{0.5 \hat{b} - \hat{c}}{1 + 0.5 \hat{a}}) \end{matrix}$ (6)

From Formula (6), we can obtain Formula (7),

$\begin{matrix} {\hat{x}}^{(1)} (k) = & x^{(0)} (1) {(\frac{1 - 0.5 \hat{a}}{1 + 0.5 \hat{a}})}^{k - 1} \\ & + \sum_{i = 0}^{k - 2} [(k - i) \frac{\hat{b}}{1 + 0.5 \hat{a}} - \frac{0.5 \hat{b} - \hat{c}}{1 + 0.5 \hat{a}}] {(\frac{1 - 0.5 \hat{a}}{1 + 0.5 \hat{a}})}^{i} \end{matrix}$ (7)

Formula (7) is simplified as the following, ${\hat{x}}^{(1)} (k) = {\hat{φ}}_{1}^{k - 1} x^{(0)} (1) + \sum_{i = 0}^{k - 2} [(k - i) {\hat{φ}}_{2} + {\hat{φ}}_{3}] {\hat{φ}}_{1}^{i}$ (8)

According to Formula (8), we can deduce that ${\hat{x}}^{(0)} (k) = {\hat{x}}^{(1)} (k) - {\hat{x}}^{(1)} (k - 1)$ when k = 2, 3, …, that is,

$\begin{matrix} {\hat{x}}^{(0)} (k) & = & {\hat{φ}}_{1}^{k - 1} x^{(0)} (1) + \sum_{i = 0}^{k - 2} [(k - i) {\hat{φ}}_{2} + {\hat{φ}}_{3}] {\hat{φ}}_{1}^{i} \\ & & - {\hat{φ}}_{1}^{k - 2} x^{(0)} (1) - \sum_{i = 0}^{k - 3} [(k - 1 - i) {\hat{φ}}_{2} + {\hat{φ}}_{3}] {\hat{φ}}_{1}^{i} \end{matrix}$ (9)

Rearranging Formula (9), we can obtain Formula (10) as follows.

$\begin{matrix} {\hat{x}}^{(0)} (k) & = & [x^{(0)} (1) ({\hat{φ}}_{1} - 1) + (2 {\hat{φ}}_{2} + {\hat{φ}}_{3})] {\hat{φ}}_{1}^{k - 2} \\ & & + \sum_{i = 0}^{k - 3} {\hat{φ}}_{2} {\hat{φ}}_{1}^{i} . \end{matrix}$ (10)

Formula (10) is called the final restored form of TPGM(1,1) model. It can be seen from the deduction process that Formula (10) is obtained under the condition that the initial value is x⁽⁰⁾ (1). As a result, the fitted curve of TPGM(1,1) model inevitably passes through the point (1, x⁽⁰⁾ (1)) on a coordinate plane. However, under the Ordinary Least Square (OLS) criterion, the best-fitting curve does not necessarily pass through the point (1, x⁽⁰⁾ (1)). In other words, there is no sufficient theoretical basis for taking $x^{(0)} (1) = x^{(1)} (1) = {\hat{x}}^{(1)} (1)$ as the initial value. According to the principle of OLS in econometric models, the initial value should instead be chosen so that the model produces the minimum simulation errors.
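As a numerical check of Formula (10) with the conventional initial value x⁽⁰⁾ (1), the following sketch compares the closed form with the values obtained by iterating the recursion of Formula (2) and differencing as in Formula (5). The function names and the parameter values are illustrative assumptions, not quantities from the paper.

```python
def restored_closed_form(k, x01, phi1, phi2, phi3):
    """Formula (10): restored value for k >= 2, with initial value x01."""
    head = (x01 * (phi1 - 1) + 2 * phi2 + phi3) * phi1 ** (k - 2)
    tail = sum(phi2 * phi1 ** i for i in range(k - 2))  # i = 0..k-3
    return head + tail


def restored_by_recursion(n, x01, phi1, phi2, phi3):
    """Iterate x1(k) = phi1*x1(k-1) + phi2*k + phi3, then difference the result."""
    x1 = [x01]
    for k in range(2, n + 1):
        x1.append(phi1 * x1[-1] + phi2 * k + phi3)
    # Element j - 1 of the returned list is the restored value for k = j + 1.
    return [x1[j] - x1[j - 1] for j in range(1, n)]


# Made-up parameters for illustration only.
params = dict(x01=30.0, phi1=0.97, phi2=10.0, phi3=-8.0)
n_points = 8
by_recursion = restored_by_recursion(n_points, **params)
by_formula = [restored_closed_form(k, **params) for k in range(2, n_points + 1)]
max_gap = max(abs(r - f) for r, f in zip(by_recursion, by_formula))
print(max_gap)
```

The two computations should agree up to floating-point rounding, which is one way to confirm the index ranges in Formulas (9) and (10).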

Assume that the optimal initial value of TPGM(1,1) model is φ; replacing x⁽⁰⁾ (1) in Formula (10) with φ gives $\begin{matrix} {\hat{x}}^{(0)} (k) & = & [φ ({\hat{φ}}_{1} - 1) + (2 {\hat{φ}}_{2} + {\hat{φ}}_{3})] {\hat{φ}}_{1}^{k - 2} \\ & & + \sum_{i = 0}^{k - 3} {\hat{φ}}_{2} {\hat{φ}}_{1}^{i} . \end{matrix}$

To solve the least simulation errors of ${\hat{x}}^{(0)} (k)$ is an optimization problem, that is, $\begin{matrix} Q & = & min \sum_{k = 2}^{n} {[x^{(0)} (k) - {\hat{x}}^{(0)} (k)]}^{2} \\ Q & = & min \sum_{k = 2}^{n} [x^{(0)} (k) - φ (φ_{1} - 1) {\hat{φ}}_{1}^{(k - 2)} \\ - (2 φ_{2} + φ_{3}) {\hat{φ}}_{1}^{(k - 2)} - \sum_{i = 0}^{k - 3} {\hat{φ}}_{2} {\hat{φ}}_{1}^{i}]^{2} \end{matrix}$

According to OLS, to make Q the minimum, φ should satisfy $\begin{matrix} \frac{dQ}{d φ} & = & - 2 \sum_{k = 2}^{n} [x^{(0)} (k) - φ ({\hat{φ}}_{1} - 1) {\hat{φ}}_{1}^{k - 2} - (2 {\hat{φ}}_{2} + {\hat{φ}}_{3}) {\hat{φ}}_{1}^{k - 2} \\ & & - \sum_{i = 0}^{k - 3} {\hat{φ}}_{2} {\hat{φ}}_{1}^{i}] ({\hat{φ}}_{1} - 1) {\hat{φ}}_{1}^{k - 2} = 0 \end{matrix}$

Rearrange it to get the following formula, $\begin{matrix} φ ({\hat{φ}}_{1} - 1)^{2} \sum_{k = 2}^{n} {\hat{φ}}_{1}^{2 (k - 2)} & = & ({\hat{φ}}_{1} - 1) \sum_{k = 2}^{n} x^{(0)} (k) {\hat{φ}}_{1}^{k - 2} \\ & & - (2 {\hat{φ}}_{2} + {\hat{φ}}_{3}) ({\hat{φ}}_{1} - 1) \sum_{k = 2}^{n} {\hat{φ}}_{1}^{2 (k - 2)} \\ & & - ({\hat{φ}}_{1} - 1) \sum_{k = 2}^{n} [{\hat{φ}}_{1}^{k - 2} \sum_{i = 0}^{k - 3} {\hat{φ}}_{2} {\hat{φ}}_{1}^{i}] \end{matrix}$

Then the optimal initial value of TPGM(1,1) model φ is the following: $\begin{matrix} φ & = & \frac{(φ_{1} - 1) \sum_{k = 2}^{n} x^{(0)} (k) {\hat{φ}}_{1}^{(k - 2)}}{(φ_{1} - 1)^{2} \sum_{k = 2}^{n} {\hat{φ}}_{1}^{2 (k - 2)}} \\ - \frac{(2 φ_{2} + φ_{3}) (φ_{1} - 1) \sum_{k = 2}^{n} {\hat{φ}}_{1}^{2 (k - 2)}}{(φ_{1} - 1)^{2} \sum_{k = 2}^{n} {\hat{φ}}_{1}^{2 (k - 2)}} \\ - \frac{(φ_{1} - 1) \sum_{k = 2}^{n} [{\hat{φ}}_{1}^{(k - 2)} \sum_{i = 0}^{k - 3} {\hat{φ}}_{2} {\hat{φ}}_{1}^{i}]}{(φ_{1} - 1)^{2} \sum_{k = 2}^{n} {\hat{φ}}_{1}^{2 (k - 2)}} \end{matrix}$
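The closed-form expression for φ can be sketched in Python as below. The helper names `restored`, `optimal_initial_value` and `sse`, and all numeric values, are illustrative assumptions; the sketch also assumes φ₁ ≠ 1, since the expression divides by (φ₁ - 1). The example perturbs an exact sequence and checks that the computed φ gives a smaller squared error Q than nearby initial values.

```python
def restored(k, init, phi1, phi2, phi3):
    """Restored value for k >= 2 when the initial value is `init`."""
    head = (init * (phi1 - 1) + 2 * phi2 + phi3) * phi1 ** (k - 2)
    return head + sum(phi2 * phi1 ** i for i in range(k - 2))


def optimal_initial_value(x0, phi1, phi2, phi3):
    """Initial value minimizing Q = sum_{k=2..n} (x0(k) - restored(k))^2."""
    if phi1 == 1:
        raise ValueError("phi1 = 1: the closed form divides by (phi1 - 1)")
    num = 0.0
    den = 0.0
    for k in range(2, len(x0) + 1):
        w = phi1 ** (k - 2)
        tail = sum(phi2 * phi1 ** i for i in range(k - 2))
        num += (x0[k - 1] - (2 * phi2 + phi3) * w - tail) * w
        den += w * w
    return num / ((phi1 - 1) * den)


def sse(x0, init, phi1, phi2, phi3):
    """Squared simulation error Q over k = 2..n for a given initial value."""
    return sum(
        (x0[k - 1] - restored(k, init, phi1, phi2, phi3)) ** 2
        for k in range(2, len(x0) + 1)
    )


# Made-up parameters and perturbations, for illustration only.
phi = (1.1, 2.0, 3.0)
clean = [12.5] + [restored(k, 12.5, *phi) for k in range(2, 8)]
noise = [0.0, 0.3, -0.2, 0.1, -0.4, 0.2, -0.1]
noisy = [v + e for v, e in zip(clean, noise)]

init_clean = optimal_initial_value(clean, *phi)  # exact data: recovers 12.5
init_noisy = optimal_initial_value(noisy, *phi)
print(init_clean, init_noisy)
print(sse(noisy, init_noisy, *phi), sse(noisy, noisy[0], *phi))
```

Because Q is a quadratic function of φ, the stationary point found this way is its global minimum whenever φ₁ ≠ 1.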

Based on the above derivation, the modeling flowchart of TPGM(1,1) model is shown in Fig. 2. A MATLAB program for building a TPGM(1,1) model was developed to reduce the computational burden, and it can be found in the appendix of this paper.

4 Forecasting the output and consumption of natural gas in China

4.1 The present situation of output and consumption of natural gas in China

As the second largest energy consumer in the world, China has a long history of using natural gas. Natural gas has the advantages of being environmentally friendly, having high thermal efficiency and higher performance-price ratio compared to petrol, diesel and liquid petroleum gas (LPG). As a result the output and consumption of natural gas in China have maintained a rapidly growing trend. The output and consumption of natural gas between 2002 and 2013 are shown in Table 1.

Based on Table 1, the trends of the output and consumption of natural gas in China are plotted in Fig. 3.

It can be seen from Fig. 3 that the output and consumption of natural gas have grown rapidly since 2002, and that output and consumption roughly balanced each other before 2009. After 2009, with the rapid development of Chinese natural gas industries and the increase of civilian fuel consumption, both the output and consumption continued to rise, but the growth rate of output has been significantly slower than that of consumption. As there is a lack of effective domestic supply of natural gas, China began to import a large amount of natural gas from other countries.

4.2 Testing method of model errors

Assume that X⁽⁰⁾ = (x⁽⁰⁾ (1) , x⁽⁰⁾ (2) , …, x⁽⁰⁾ (n)) is an original sequence, we employ model α to simulate sequence X⁽⁰⁾, and its corresponding simulative sequence is ${\hat{X}}^{(0)} = ({\hat{x}}^{(0)} (1), {\hat{x}}^{(0)} (2), \dots, {\hat{x}}^{(0)} (n))$ , then the residual sequence is defined as ɛ⁽⁰⁾, that is, $\begin{matrix} ɛ^{(0)} & = & (ɛ (1), ɛ (2), \dots, ɛ (n)) \\ = & (x^{(0)} (1) - {\hat{x}}^{(0)} (1), x^{(0)} (2) - {\hat{x}}^{(0)} (2), \\ \dots, x^{(0)} (n) - {\hat{x}}^{(0)} (n)) . \end{matrix}$

The relative error sequence is defined as Δ, that is, $\begin{matrix} Δ & = & (Δ_{1}, Δ_{2}, \dots, Δ_{n}) \\ & = & (| \frac{ɛ (1)}{x^{(0)} (1)} | \times 100\%, | \frac{ɛ (2)}{x^{(0)} (2)} | \times 100\%, \dots, | \frac{ɛ (n)}{x^{(0)} (n)} | \times 100\%) . \end{matrix}$

Here Δ_k (k = 1, 2, …, n) is the relative simulation percentage error of data ${\hat{x}}^{(0)} (k)$ at point k, and $\bar{Δ} = \frac{1}{n} \sum_{k = 1}^{n} Δ_{k}$ is the mean relative simulation percentage error (MRSPE) of sequence X⁽⁰⁾ in model α. Similarly, 1 - Δ_k is the relative simulation percentage precision at point k, and $1 - \bar{Δ}$ is the mean relative simulation percentage precision of sequence X⁽⁰⁾ in model α.
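The error measures defined above can be computed with a few lines of Python. The function names and the two-point example are illustrative assumptions and are not taken from the tables of this paper.

```python
def relative_errors(actual, simulated):
    """Relative simulation percentage errors: |eps(k) / x0(k)| * 100 for each k."""
    return [abs((a - s) / a) * 100 for a, s in zip(actual, simulated)]


def mrspe(actual, simulated):
    """Mean relative simulation percentage error (MRSPE)."""
    errors = relative_errors(actual, simulated)
    return sum(errors) / len(errors)


# Made-up values for illustration only.
actual = [100.0, 200.0]
simulated = [90.0, 210.0]
errors = relative_errors(actual, simulated)  # 10% and 5%
mean_error = mrspe(actual, simulated)        # 7.5%
mean_precision = 100 - mean_error            # 92.5%, i.e. 1 minus the MRSPE
print(errors, mean_error, mean_precision)
```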

The TPGM(1,1) model of China’s natural gas output and consumption, or Model 1 for short, will be built according to the statistical data between 2002 and 2012 in this subsection, and we will compute its simulation precision and forecast the output and consumption of natural gas in 2013; after that, we will compare the above predicted result with the real data.

In addition, it is known from the derivation of TPGM(1,1) model that it can achieve unbiased simulation of both homogeneous and non-homogeneous exponential sequences. In this subsection, in order to test the simulation and prediction performance of TPGM(1,1) model, two homogeneous exponential models, namely GM(1,1) model (Model 2 for short) and ODGM model (Model 3 for short), two non-homogeneous exponential models, namely DDGM(1,1) model (Model 4 for short) and NGM(1,1,k) model (Model 5 for short), and a neural network model (Model 6 for short) will be used to simulate and forecast the output and consumption of China’s natural gas. The simulated and predicted results will be compared and analyzed.

4.3 Simulating and forecasting China’s natural gas output

According to the statistical data of natural gas output between 2002 and 2012 in Table 1, the modeling Sequence X⁽⁰⁾ is as follows, $\begin{matrix} X^{(0)} & = & (32.7, 35.0, 41.5, 49.3, 58.6, 69.2, \\ 80.3, 85.2, 94.8, 103.1, 107.2) . \end{matrix}$

TPGM(1,1) model (Model 1)

The TPGM(1,1) model of China’s natural gas output, in its restored form, is as follows: $\begin{matrix} {\hat{x}}^{(0)} (k) & = & 32.82 \times 0.9727^{k - 2} + \sum_{i = 0}^{k - 3} 10.3535 \times 0.9727^{i}, \\ & & k = 2, 3, \dots, n \end{matrix}$

GM(1,1) model (Model 2: homogeneous exponential model) $\begin{matrix} {\hat{x}}^{(0)} (k) & = & 37.1707 \times e^{0.1133 (k - 1)}, \\ & & k = 2, 3, \dots, n \end{matrix}$

ODGM model (Model 3: homogeneous exponential model) $\begin{matrix} {\hat{x}}^{(0)} (k) & = & 36.8899 \times 1.1198^{k - 1}, \\ & & k = 2, 3, \dots, n \end{matrix}$

DDGM(1,1) model (Model 4: non-homogeneous exponential model) $\begin{matrix} {\hat{x}}^{(0)} (k) & = & 752.0368 \times 1.0095^{k - 1} - 719.3368, \\ & & k = 2, 3, \dots, n \end{matrix}$

NGM(1,1,k) model (Model 5: non-homogeneous exponential model) $\begin{matrix} {\hat{x}}^{(0)} (k) & = & - 159.3816 e^{- 0.0966 (k - 1)} \\ + 164.5435, k = 2, 3, \dots, n \end{matrix}$

Neural network model (Model 6)

We use the neural network package ‘nnstart’ of MATLAB R2013a, deploy the data in Sequence with Training: Validation: Testing=70% :15% :15% in proportion, and set Number of Hidden Neurons=10, Number of delays=2; Simulation results are shown in Table 2.

The simulation values and errors of China’s natural gas output between 2003 and 2012 with six different models are shown in Table 2.

According to Table 2, the relative simulation percentage errors of China’s natural gas output with six different mathematical models are shown in Fig. 4.

The predictive values and errors of China’s natural gas output in 2013 with six different mathematical models are shown in Table 3.

4.4 Simulating and forecasting China’s natural gas consumption

According to the statistical data of China’s natural gas consumption between 2002 and 2012 in Table 1, modeling Sequence X⁽⁰⁾ is the following, $\begin{matrix} X^{(0)} & = & (29.2, 33.9, 39.7, 46.8, 56.1, 69.5, \\ 80.7, 87.5, 107.5, 131.3, 147.1) \end{matrix}$

TPGM(1,1) model (Model 1)

The TPGM(1,1) model of China’s natural gas consumption, in its restored form, is as follows: $\begin{matrix} {\hat{x}}^{(0)} (k) & = & 33.4749 \times 1.1593^{k - 2} + \sum_{i = 0}^{k - 3} 1.2601 \times 1.1593^{i}, \\ & & k = 2, 3, \dots, n \end{matrix}$

GM(1,1) model (Model 2: homogeneous exponential model) $\begin{matrix} {\hat{x}}^{(0)} (k) & = & 29.4746 \times e^{0.1621 (k - 1)}, \\ & & k = 2, 3, \dots, n \end{matrix}$

ODGM model (Model 3: homogeneous exponential model) $\begin{matrix} {\hat{x}}^{(0)} (k) & = & 29.4281 \times 1.1763^{k - 1}, \\ & & k = 2, 3, \dots, n \end{matrix}$

DDGM(1,1) model (Model 4: non-homogeneous exponential model) $\begin{matrix} {\hat{x}}^{(0)} (k) & = & 42.6037 \times 1.1444^{k - 1} - 13.4037, \\ & & k = 2, 3, \dots, n \end{matrix}$

NGM(1,1,k) model (Model 5: non-homogeneous exponential model) $\begin{matrix} {\hat{x}}^{(0)} (k) & = & 203.0529 e^{0.0491 (k - 1)} - 196.9735, \\ & & k = 2, 3, \dots, n \end{matrix}$

Neural network model (Model 6)

We use the neural network package ‘nnstart’ of MATLAB R2013a, deploy the data in Sequence with Training: Validation: Testing = 70% : 15% :15% in proportion, and set Number of Hidden Neurons=10, Number of delays=2. Simulation results are shown in Table 4.

The simulation values and errors of China’s natural gas consumption between 2003 and 2012 with six different models are shown in Table 4.

According to Table 4, the relative simulation percentage errors of China’s natural gas consumption with six mathematical models are illustrated in Fig. 5.

The predictive values and errors of China’s natural gas consumption in 2013 with six mathematical models can be seen in Table 5.

According to Tables 3 and 5, the comparison histogram of the predicted output and consumption in 2013 with six models is shown in Fig. 6.

According to Tables 3 and 5, the comparison histogram of the predictive percentage errors of China’s natural gas output and consumption in 2013 using six different models is shown in Fig. 7.

4.5 Forecasting China’s natural gas output and consumption from 2014 to 2020

The earlier simulated and predicted results of China’s natural gas output and consumption from 2003 to 2013 with six models show that the TPGM(1,1) model proposed in this paper has the best simulation and prediction performance among them. Hence, in this subsection, we employ the model to forecast and analyze China’s natural gas output and consumption from 2014 to 2020.

When k = 13, 14, …, 19, the predictive models of China’s natural gas output and consumption from 2014 to 2020 are as follows:

Output: $\begin{matrix} {\hat{x}}^{(0)} (k) & = & 32.82 \times 0.9727^{k - 2} + \sum_{i = 0}^{k - 3} 10.3535 \times 0.9727^{i}, \\ & & k = 13, 14, \dots, 19 \end{matrix}$

Consumption: $\begin{matrix} {\hat{y}}^{(0)} (k) & = & 33.4749 \times 1.1593^{k - 2} + \sum_{i = 0}^{k - 3} 1.2601 \times 1.1593^{i}, \\ & & k = 13, 14, \dots, 19 \end{matrix}$

According to the above models, the predictive results of China’s natural gas output and consumption are shown in Table 6.
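The two predictive models can be evaluated directly, as in the sketch below. It uses only the rounded coefficients printed in the formulas above, so the results may differ slightly from the tabulated values; the function name `tpgm_restored`, the dictionary layout and the mapping year = 2001 + k (k = 1 corresponding to 2002) are assumptions made for this illustration.

```python
def tpgm_restored(k, head, phi1, phi2):
    """Restored TPGM(1,1) value: head * phi1^(k-2) + sum_{i=0}^{k-3} phi2 * phi1^i."""
    return head * phi1 ** (k - 2) + sum(phi2 * phi1 ** i for i in range(k - 2))


# Coefficients as printed in the output and consumption formulas above.
OUTPUT = dict(head=32.82, phi1=0.9727, phi2=10.3535)
CONSUMPTION = dict(head=33.4749, phi1=1.1593, phi2=1.2601)

forecasts = {}
for k in range(13, 20):      # k = 13..19
    year = 2001 + k          # assumes k = 1 is the year 2002
    forecasts[year] = (
        tpgm_restored(k, **OUTPUT),
        tpgm_restored(k, **CONSUMPTION),
    )

for year, (out, cons) in forecasts.items():
    print(year, round(out, 2), round(cons, 2))
```

For 2020 (k = 19) the sketch should land close to the output and consumption figures discussed in this subsection, up to rounding of the published coefficients.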

According to Table 6, the evolutionary process and changing trend of China’s natural gas output and consumption from 2002 to 2020 can be seen in Fig. 8.

As shown in Fig. 8, China’s natural gas output and consumption were able to maintain a balance of supply and demand during 2002-2009. However, after 2009, there was limited growth of domestic natural gas output, while consumption increased dramatically; the supply and demand balance was broken. Consequently the Chinese government started to import a large amount of natural gas to meet the growing domestic demand for natural gas. According to our research findings, the supply and demand disparity of domestic natural gas will become more prominent: China’s natural gas consumption in 2020 is expected to reach 502.80 billion cubic meters, but the predicted output will be only 162.85 billion cubic meters for the same time period. By then more than half of China’s natural gas demand will need to be imported from overseas. This may be the main reason that China and Russia signed a huge trade agreement of natural gas worth 400 billion US dollars in May 2014.
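As a quick arithmetic check on the two 2020 figures quoted above, and under the assumption that the whole gap between consumption and domestic output is covered by imports, the implied import dependence is $\frac{502.80 - 162.85}{502.80} \times 100\% = \frac{339.95}{502.80} \times 100\% \approx 67.61\%$, which matches the share stated in the abstract and conclusions.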

The main reasons for the sharp increase of China’s natural gas demand are twofold: Firstly the Chinese government needs to use natural gas which is a form of clean energy to help to mitigate increasingly serious environmental pollution problems; secondly, the rapid development of China’s natural gas chemical industry also significantly contributes to this growing demand.

4.6 Analysis of model performance

In this paper, we use six different mathematical models to simulate and forecast the output and consumption of China’s natural gas, and the findings show that the TPGM(1,1) model proposed in this paper has the best simulation and prediction performance. The main reasons for this superior performance are explained as follows.

Parameters of GM(1,1) and NGM(1,1,k) model are all estimated based on the modeling foundation of difference equations, but their time response functions are drawn through differential equations, hence their modeling process gives rise to ‘hopping errors’ due to distinction between difference equations and differential equations, and this is the main reason that the two models cannot achieve unbiased simulation of homogeneous exponential sequences.

The parameter estimation method of ODGM(1,1) model, unlike the traditional GM(1,1) model, does not have the intrinsic weakness of ‘hopping error’ from difference equations to differential equations. However, its final restored form is of homogeneous exponential type, and this reduces the simulation and prediction effectiveness of ODGM(1,1) for non-homogeneous or approximate non-homogeneous exponential sequences. The simulation and prediction performance of ODGM(1,1) model deteriorates further when the modeling sequence has fluctuating features.

When the modeling sequence is already approximately exponential, the modeling method of DDGM(1,1) model omits the accumulating generation process and directly builds a grey prediction model. Accumulating generation is a method employed to whitenize a grey process. It plays an extremely important role in any grey system. Through accumulating generation, one can potentially uncover a development tendency existing in the process of accumulating grey quantities so that the characteristics and laws of integration hidden in the chaotic original data can be sufficiently revealed. So the precision of DDGM(1,1) model, which omits the accumulating generation process, is not ideal. Furthermore, a neural network accomplishes the mapping function from input to output, but the imperfection of ‘overfitting’ often results in decreased predictive ability.

The TPGM(1,1) model is deduced from the fundamental form of the GM(1,1) model rather than from differential equations, and its parameters are estimated by OLS. This ensures consistency between the form of the model and its parameters, and avoids the hopping error that NGM(1,1,k) incurs when moving from difference equations to differential equations. In addition, the parameter used in the TPGM(1,1) model is not a constant but is estimated according to the actual situation. Employing a variable parameter that adapts to changing external conditions is another reason why TPGM(1,1) has better simulation and prediction performance.

5 Conclusions

A novel unbiased grey prediction model, TPGM(1,1) for short, is proposed in this paper. A method for calculating the optimal initial value was deduced, and the inconsistency between a parameter estimation method based on difference equations and time response functions based on differential equations was resolved.

The TPGM(1,1) models of China’s natural gas output and consumption were built, and their simulation and prediction results were compared with those of other models. The comparison shows that the proposed TPGM(1,1) model has the best simulation and prediction precision among the models compared. The output and consumption of China’s natural gas during 2015-2020 were then forecasted. The forecasts show that by 2020 China’s natural gas demand will rise to 502.80 billion cubic meters, whereas the corresponding output will be only 162.85 billion cubic meters. Hence, by 2020, 67.61% of China’s natural gas consumption will need to be met by imports. China will become one of the countries with the fastest-growing natural gas consumption.
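The import share quoted above follows directly from the two forecast figures. A short check, assuming import dependence is defined as the gap between demand and domestic output divided by demand (the variable names are illustrative):

```python
# Forecast values for 2020 quoted in the text (billion cubic meters)
demand_2020 = 502.80
output_2020 = 162.85

# Share of consumption not covered by domestic output
import_share = (demand_2020 - output_2020) / demand_2020
print(f"{import_share:.2%}")  # prints 67.61%
```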

Acknowledgments

Our work was supported by the National Natural Science Foundation of China (Grant No. 71271226), the Chongqing Frontier and Applied Basic Research Project (cstc2014jcyjA00024), the Postdoctoral Science Foundation of China (Grant No. 2015T80975) and a Marie Curie International Incoming Fellowship within the 7th European Community Framework Programme (Grant No. FP7-PIIF-GA-2013-629051). We would like to thank the anonymous referees for their constructive comments, which helped to improve the clarity and completeness of this paper.
