Abstract
The gray fuzzy prediction model is suitable for small-sample-size prediction. Using the real per capita disposable income of urban residents in Hebei Province as an example and selecting samples 3–35 in length, this study investigated the influence of sample length on the prediction performance of the GM (1,1) model. Sample length presents a nonlinear relationship with the predicted relative error of the model. Compared with large samples with lengths of more than 15, small samples with lengths below 15 are more suitable for establishing the gray fuzzy prediction model. Small samples with lengths of 8–13 are applicable to three-step prediction. Sample lengths suitable for modeling are proposed, and the conclusions provide a theoretical foundation and guidance for future research on and application of gray fuzzy prediction.
Introduction
Gray system theory, which was proposed by Chinese Professor Julong Deng in 1982, defines a gray system as a system containing both known and unknown information [1], thereby laying a theoretical foundation for the subsequent construction of the gray model (GM). A gray fuzzy prediction model has better prediction performance for small-sized data with unknown distribution than other prediction models. The traditional GM (1,1), that is, the first-order univariate gray fuzzy prediction model, has been extensively applied in various fields, such as economics, sociology, ecology, and medicine. However, this model has some limitations for constantly changing data. Thus, to improve prediction accuracy, scholars have developed new models, including mixed gray fuzzy, gray fuzzy dynamic, and gray fuzzy neural network models, by combining the traditional gray fuzzy prediction model with other theories or models and considering the data features of different fields.
Prediction techniques based on gray fuzzy system theory have been widely used in many fields, such as economics and natural science. Most studies have indicated that, compared with other models, the gray fuzzy prediction model suits data with small sample size and unknown distribution. However, many studies have focused on method improvement and ignored data size, few have verified the suitability of the gray fuzzy prediction model for small-sample-size prediction, and the sample lengths used differ from study to study. To address these limitations, this study used the real per capita disposable income of urban residents in Hebei Province as the study object. GM (1,1) models were constructed for samples with different lengths to compare their prediction effects and determine the effects of sample length on model prediction performance. A proper sample length was then selected to establish a model and predict the future variation trend of real per capita disposable income of urban residents in Hebei Province. The results can provide a theoretical basis for studies on the gray fuzzy prediction model and serve as a guide in future applications.
Literature review
Gray fuzzy prediction is a prediction method used for systems containing certain and uncertain information. This method, which was developed by Professor Julong Deng, has been broadly applied to various fields, such as medicine, economics, and human capital [2–5]. Many scholars have indicated the limitations of the traditional model. Thus, gray fuzzy prediction theory has been combined with other methods or theories to derive a new model and improve the prediction accuracy.
Wang et al. presented a dynamic bootstrap gray fuzzy prediction model by combining gray fuzzy theory and the bootstrap method to effectively solve the problem of small information quantity [6]. Wu et al. proposed a time-varying weighted gray fuzzy model to mitigate the effects of data fluctuation [7]. Yang et al. developed a three-parameter background value construction method to improve the smoothing effect of the background value, thereby reducing the effects of extreme data on prediction performance [8]. The gray fuzzy modeling method DGGM (1,1) [9], which targets seasonal fluctuation series, and a new GM (1,1) power model [2], which incorporates the self-memorization principle, have been used to predict unimodal and fluctuating series.
Therefore, the gray fuzzy prediction method has evolved from the traditional single model into a complex methodology system containing multiple models. Scholars have improved the prediction accuracy of the model, and the derived gray fuzzy prediction models can effectively solve the small-sample-size problem. However, scholars have used different sample lengths in their empirical studies. Wu et al. used samples with lengths of 4, 10, and 14 to construct models in their case study [7]. Hu used samples with lengths of 4 for rolling prediction [4]. Yin and Tang used sample lengths of 22 [5], and Wang et al. used 11-year quarterly data for model fitting, where the sample size reached 43 (first-quarter data in the first year were missing) [9]. Thus, most existing studies have focused on the fitting degree and prediction accuracy for one specific sample and ignored the suitability of the selected sample length. For lack of a scientific basis and theoretical support, the sample length in empirical studies is mostly chosen by the subjective judgment of scholars or dictated by data accessibility.
Sample length influences model prediction ability, and both excessively large and excessively small sample sizes affect model prediction accuracy. Wu et al. used matrix perturbation theory to explain the feasibility of gray fuzzy prediction on small sample sizes, used samples with lengths of 4–19 and 14 to construct models in different cases, and, by comparing the prediction results, found that samples with a length of four have a favorable prediction effect, indicating that the small-sample-size GM (1,1) model has high prediction accuracy [10]. In that study, however, the selected sample lengths were small, and samples with other lengths were not discussed. Therefore, in the present study, the real per capita disposable income of urban residents in Hebei Province was used as the study object, and the range of sample lengths was enlarged to investigate its effect on the prediction performance of the gray fuzzy prediction model GM (1,1). Samples with suitable lengths were then used to construct models and predict the future variation trend of real per capita disposable income of urban residents in Hebei Province.
The main contributions of this study are as follows. First, R software is used to write custom functions for the GM (1,1) model, so that multiple GM (1,1) models can be constructed within a short time, improving modeling efficiency and overcoming the limitations of existing modeling software for gray fuzzy prediction. Second, GM (1,1) models are constructed for samples 3–35 in length, and the relationship between model prediction performance and sample length is determined by analyzing the prediction results of the 33 models. With the increase in sample length, the relative prediction error exhibits periodic fluctuation rather than a rising or declining tendency. Third, judging from the fluctuation amplitude and the prediction results, the GM (1,1) model is suitable for small-sample-size prediction, whereas it has a poor prediction effect on large samples. This finding provides evidence and a theoretical foundation for the judgment made in previous studies that the gray fuzzy prediction model is suitable for small samples. Fourth, the sample length suitable for constructing GM (1,1) models is selected by comparing the multistep prediction results of different samples, and suggestions for future applications of the gray fuzzy prediction model are provided.
Methodology
GM (1,1) and evaluation method
GM (1,1) is a first-order univariate gray fuzzy time series prediction method that seeks regularities in the original data and constructs a gray fuzzy differential equation for the generated data. The first parameter in the bracket represents the order, and the second parameter indicates the number of variables. This model does not need to consider the effects of related factors on the system variation trend, and the modeling process is simple [11]. Gray fuzzy system theory is applied to this model, and the series generated by accumulating the nonnegative original data can relieve the effects of random factors. Therefore, a model can be constructed for the generated data. Assume that the original data form a time series of length m:
The elements in Eq. (1) are accumulated to obtain a generated series, which can be expressed as:
Every two neighboring elements in Eq. (2) are averaged to generate a neighbor mean series, which can be expressed as:
The gray differential equation of GM (1,1) model using Eqs. (1) and (3) is expressed as follows:
Equation (4) is a typical one-variable linear regression model when X(0)(k) and Z(1)(k) are known. Unknown parameters a and b can be solved using the least squares method, and the calculation process is shown in Eqs. (5) and (6):
The whitenization equation of gray differential Eq. (4), obtained when k is treated as a continuous variable t, is expressed as:
Equation (7) is solved, and continuous variable t is transformed into the original discrete variable k to obtain:
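Given that X(1) is a series generated through one-time accumulation of series X(0), the prediction series of X(0) is obtained by applying the inverse operation, that is, successive differencing, to the prediction series of X(1):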
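The displayed forms of Eqs. (1)–(9) are not reproduced in this text. As a hedged reconstruction, the standard GM (1,1) derivation described above reads as follows; the symbol B and Y for the regression matrices and the assignment of equation numbers (5) and (6) are assumptions rather than the authors' confirmed notation.

```latex
\begin{align}
X^{(0)} &= \bigl(X^{(0)}(1), X^{(0)}(2), \ldots, X^{(0)}(m)\bigr) \tag{1}\\
X^{(1)}(k) &= \sum_{i=1}^{k} X^{(0)}(i), \quad k = 1, \ldots, m \tag{2}\\
Z^{(1)}(k) &= \tfrac{1}{2}\bigl(X^{(1)}(k) + X^{(1)}(k-1)\bigr), \quad k = 2, \ldots, m \tag{3}\\
X^{(0)}(k) + a\,Z^{(1)}(k) &= b \tag{4}\\
(\hat{a}, \hat{b})^{T} &= (B^{T}B)^{-1}B^{T}Y \tag{5}\\
B &= \begin{pmatrix} -Z^{(1)}(2) & 1\\ \vdots & \vdots\\ -Z^{(1)}(m) & 1 \end{pmatrix},
\quad
Y = \begin{pmatrix} X^{(0)}(2)\\ \vdots\\ X^{(0)}(m) \end{pmatrix} \tag{6}\\
\frac{dX^{(1)}(t)}{dt} + a\,X^{(1)}(t) &= b \tag{7}\\
\hat{X}^{(1)}(k+1) &= \Bigl(X^{(0)}(1) - \frac{b}{a}\Bigr)e^{-ak} + \frac{b}{a} \tag{8}\\
\hat{X}^{(0)}(k+1) &= \hat{X}^{(1)}(k+1) - \hat{X}^{(1)}(k)
 = (1 - e^{a})\Bigl(X^{(0)}(1) - \frac{b}{a}\Bigr)e^{-ak}, \quad k = 1, 2, \ldots \tag{9}
\end{align}
```

In this form, a is the development gray number and b is the endogenous control gray number; values of k beyond m − 1 in Eq. (9) give out-of-sample predictions.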
The evaluation of the GM (1,1) model is divided into internal and external evaluations, where the former refers to the evaluation indicators of a single fitted model, and the latter indicates the comparison of evaluation indicators of different models. Many indicators, including relevancy, average relative error, ratio mean square error, and small error probability, are found in internal evaluation.
Average relative error is calculated using the fitted value of the model and the actual value. The absolute error series of the model is solved, which can be expressed as:
Equation (10) is divided by the original series to obtain the relative error series, which can be expressed as:
The mean value of the relative error series, namely, the average relative error, is then calculated:
The ratio of mean square error refers to the ratio of the standard deviation of the absolute error series to that of the original series and reflects the distribution characteristics of residual errors. The standard deviations of the original and absolute error series are calculated:
The ratio of two standard deviations in Eq. (13) is calculated to obtain the ratio of mean square error, which can be expressed as:
The small error probability is calculated from the absolute error series in Eq. (10) and the standard deviation of the original series in Eq. (13):
The small error probability is a common test index used for the GM model, and the greater the probability, the better the results will be.
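The three indicators above can be sketched in Python as follows. This is a minimal illustration, not the authors' R code: it assumes that the absolute error is |actual − fitted|, that population standard deviations are used, and that the small error probability uses the conventional 0.6745 multiplier of gray model testing. The function name `internal_metrics` and the sample numbers are hypothetical.

```python
import numpy as np

def internal_metrics(actual, fitted):
    """Average relative error, ratio of mean square error C, small error probability p."""
    actual = np.asarray(actual, dtype=float)
    fitted = np.asarray(fitted, dtype=float)
    abs_err = np.abs(actual - fitted)      # absolute error series (assumed form of Eq. 10)
    rel_err = abs_err / actual             # relative error series (Eq. 11)
    avg_rel_err = rel_err.mean()           # average relative error (Eq. 12)
    s_orig = actual.std()                  # std of the original series (population form, assumed)
    s_err = abs_err.std()                  # std of the absolute error series
    c = s_err / s_orig                     # ratio of mean square error: smaller is better
    # Share of errors whose deviation from the mean error is below 0.6745 * s_orig
    p = float(np.mean(np.abs(abs_err - abs_err.mean()) < 0.6745 * s_orig))
    return avg_rel_err, c, p

# Hypothetical numbers, for illustration only
avg_rel_err, c, p = internal_metrics([100, 110, 121], [100, 111, 120])
```

Some formulations use the signed residual instead of its absolute value when computing the standard deviation and the small error probability; the choice affects C and p slightly and should follow the definition actually used in Eqs. (10)–(15).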
Relevancy is usually used to analyze the similarity degree between different series for determining the tightness of their relationship [12]. Therefore, relevancy can be used to determine the fitting degree of the model prediction series. The relevancy coefficient between the original series and corresponding elements of the fitted series is first calculated, and the formula is expressed as follows:
The mean value of the above-mentioned relevancy coefficient series is calculated, and the relevancy is obtained as follows:
A small average relative error and a small ratio of mean square error, together with a large relevancy and a large small error probability, indicate good results. Under normal circumstances, the accuracy grade of the GM (1,1) model can be evaluated in terms of the ratio of mean square error and small error probability, as shown in Table 1.
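A hedged sketch of the relevancy calculation is given below. It assumes the relevancy coefficient follows Deng's gray relational coefficient with a distinguishing coefficient of 0.5, which is the common default; the text does not state the value used, and the function name `relevancy` is hypothetical.

```python
import numpy as np

def relevancy(actual, fitted, rho=0.5):
    """Mean gray relational coefficient between the original and fitted series.

    rho is the distinguishing coefficient; 0.5 is an assumed default.
    """
    actual = np.asarray(actual, dtype=float)
    fitted = np.asarray(fitted, dtype=float)
    delta = np.abs(actual - fitted)          # pointwise distance between the series
    d_min, d_max = delta.min(), delta.max()
    if d_max == 0:                           # identical series: perfect relevancy
        return 1.0
    coeff = (d_min + rho * d_max) / (delta + rho * d_max)   # relevancy coefficients
    return float(coeff.mean())               # relevancy = mean of the coefficients

# Hypothetical numbers: distances are 0, 0.5, 1, so coefficients are 1, 1/2, 1/3
r = relevancy([1, 2, 3], [1, 2.5, 2])
```

Because the largest deviation enters every coefficient, a single poorly fitted point lowers the relevancy of the whole series, which is one possible reason this indicator can look modest even when the other indicators are satisfactory.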
Accuracy grade of the GM (1,1) model
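The grading rule can be sketched as a small lookup. The thresholds below are the ones commonly quoted for GM (1,1) accuracy grading and are an assumption here: the text itself confirms only that a ratio of mean square error below 0.35 together with a small error probability of 1 corresponds to the superior grade. The function name, grade labels, and boundary conventions (≤ and ≥) are hypothetical.

```python
def accuracy_grade(c, p):
    """Map ratio of mean square error c and small error probability p to a grade.

    Thresholds are commonly cited values, assumed rather than taken from Table 1.
    """
    grades = [
        ("grade 1 (superior)", 0.35, 0.95),
        ("grade 2 (qualified)", 0.50, 0.80),
        ("grade 3 (barely qualified)", 0.65, 0.70),
    ]
    for name, c_max, p_min in grades:
        # A model earns a grade only if both indicators meet that grade's bounds
        if c <= c_max and p >= p_min:
            return name
    return "grade 4 (unqualified)"

best = accuracy_grade(0.236, 1.0)
```

Requiring both indicators to pass means the weaker of the two determines the grade.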
External evaluation is used to compare the prediction performance of different GM (1,1) models. The mean absolute percentage error (MAPE) of each model can be calculated based on the prediction result of each model. MAPE is the average relative error expressed as a percentage. To distinguish the two, the average relative error used in this study is the in-sample error, calculated from the fitted values of the model and the actual values, whereas MAPE is the out-of-sample error, calculated from the predicted values of the model and the actual values. The relative errors from Eq. (11) are expressed as percentages, and their average value, that is, MAPE, is calculated.
The data used in this study are derived from the 2018 Hebei Economic Yearbook and the 2018 Statistical Bulletin of National Economy and Social Development of Hebei Province. The influence of price fluctuation was excluded by calculating the real per capita disposable income of urban residents in Hebei Province from its index (1978 = 100), which yielded the real per capita disposable incomes of urban residents with 1978 as the base period (Table 2).
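The out-of-sample calculation can be sketched as follows. The sketch assumes that the MAPE for an h-step prediction is the mean of the percentage errors over the first h predicted periods; the function name `mape_by_step` and the numbers are hypothetical.

```python
import numpy as np

def mape_by_step(actual, predicted):
    """Return MAPE (%) for 1-step, 2-step, ..., h-step prediction horizons."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    pct_err = np.abs(actual - predicted) / actual * 100.0   # relative errors in percent
    steps = np.arange(1, len(pct_err) + 1)
    return np.cumsum(pct_err) / steps                       # running mean over the horizon

# Hypothetical test-set values: percentage errors are 10, 5, 0
mape = mape_by_step([100, 200, 400], [110, 190, 400])
```

Under this definition a low multistep MAPE can hide a large error in one particular period, so inspecting the per-period relative errors alongside MAPE is advisable.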
Real per capita disposable incomes of urban residents (1978 as the base period)
The gray fuzzy prediction model is commonly applied to short-term prediction [3, 8]. Thus, the prediction performance of the GM (1,1) model over steps 1–5 was investigated. For the convenience of comparison, the data were divided into two parts: the data during 1978–2012 constituted the training set, and those during 2013–2017 formed the test set. GM (1,1) models were established by extracting subsets with different lengths from the training set to predict the data during 2013–2017. The evaluation indicators of each model were calculated, and the fitting degree and prediction performance of the models were compared. Finally, the data during 1978–2017 were used as the training set, and samples with proper lengths were selected to predict the future variation trend of real per capita disposable income.
Limited by data accessibility, the maximum sample length was set to 35, that is, all data during 1978–2012 were used for prediction. An excessively small sample length makes it difficult to establish the prediction equation. Thus, the minimum sample length was set to three, that is, the data during 2010–2012 were used for prediction. The sample length in this study therefore ranged from 3 to 35, and GM (1,1) models were constructed for the data during 2010–2012, 2009–2012, 2008–2012, ..., 1978–2012 to predict the real per capita disposable incomes of urban residents during 2013–2017. Columns 3 and 6 in Table 2 list the stepwise ratio series of the real income series, that is, the results obtained by dividing the real income in each year by the real income in the previous year. The elements of the stepwise ratio series were similar, ranging within 1.00–1.37, indicating that the real per capita disposable income of urban residents in Hebei Province grew approximately exponentially during 1978–2017.
The sample length in this study ranged from 3 to 35, and 33 GM (1,1) models were established. R software can greatly reduce repeated operations and rapidly produce results. Thus, R software was used to write custom functions for GM (1,1) models using Formulas (1–16) and to calculate the fitting results of different samples (Table 3). The absolute value of development gray number a was within 0.106–0.123 and smaller than 0.3, indicating that the GM (1,1) model was significant and could be used for prediction. With the increase in sample length, the endogenous control gray number declined greatly, from 14,125.6 to 537.0, a decrease of 96.20%. The comparison of the ranges of the development and endogenous control gray numbers showed that sample length had a great impact on the endogenous control gray number of the model, whereas it had a minor effect on the development gray number.
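The sample construction and the stepwise ratio check can be sketched as follows. The helper names are hypothetical, and the sketch assumes that every sample ends in 2012 and extends backward, as described above.

```python
import numpy as np

def stepwise_ratios(series):
    """Ratio of each value to the previous one: x(k) / x(k-1)."""
    x = np.asarray(series, dtype=float)
    return x[1:] / x[:-1]

def training_windows(years, min_len=3, max_len=None):
    """Map each sample length n to the (first, last) year of the most recent n years."""
    max_len = len(years) if max_len is None else max_len
    return {n: (years[-n], years[-1]) for n in range(min_len, max_len + 1)}

years = list(range(1978, 2013))          # training years 1978..2012
windows = training_windows(years)        # lengths 3..35, each window ending in 2012
ratios = stepwise_ratios([100, 110, 121])  # hypothetical values, not the Hebei data
```

Here `windows[3]` covers 2010–2012 and `windows[35]` covers 1978–2012, which gives 33 samples in total.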
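The authors' R functions are not shown in the text. As a rough Python analogue, the sketch below fits the traditional GM (1,1) by least squares and refits it for each sample length. All function names are hypothetical, and the demonstration series is synthetic (10% growth per period), not the Hebei income data.

```python
import numpy as np

def gm11_fit(x0):
    """Estimate the development gray number a and endogenous control gray number b."""
    x0 = np.asarray(x0, dtype=float)
    x1 = np.cumsum(x0)                           # accumulated series
    z1 = 0.5 * (x1[1:] + x1[:-1])                # neighbor mean series
    B = np.column_stack([-z1, np.ones_like(z1)])
    (a, b), *_ = np.linalg.lstsq(B, x0[1:], rcond=None)   # least squares solution
    return a, b

def gm11_predict(x0, a, b, steps=0):
    """Fitted values for the sample followed by `steps` out-of-sample predictions."""
    x0 = np.asarray(x0, dtype=float)
    k = np.arange(len(x0) + steps)
    x1_hat = (x0[0] - b / a) * np.exp(-a * k) + b / a     # time response of X(1)
    return np.concatenate([[x0[0]], np.diff(x1_hat)])     # difference back to X(0)

def forecast_by_length(train, lengths, steps=5):
    """Refit on the last n training points for each n and keep the predictions."""
    out = {}
    for n in lengths:
        window = train[-n:]
        a, b = gm11_fit(window)
        out[n] = gm11_predict(window, a, b, steps)[-steps:]
    return out

# Synthetic, hypothetical series: 35 periods growing 10% per period
demo = 100.0 * 1.1 ** np.arange(35)
forecasts = forecast_by_length(demo, range(3, 36))
```

On a purely exponential series the least squares fit of the gray differential equation is exact, yet the predictions still carry a small bias because the continuous-time response uses exp(−a) in place of the true growth ratio. Real data add noise on top of this, which is why the in-sample and out-of-sample errors need to be examined separately.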
Fitting results of different samples
The prediction results of the models were then compared. Because of the large data size, the data were presented visually to better reflect their distribution characteristics. The relative error series of each model (expressed as a percentage) was calculated using Formula (11) combined with the predicted and actual values during 2013–2017. The relationship between the predicted relative error of each element in this series and sample length was determined, as shown in Fig. 1.

Relationship between the predicted relative error and sample length.
The horizontal axis in Fig. 1 denotes the sample length, and the vertical axis denotes the predicted relative error in each year (unit: %, similarly hereinafter). Different symbols represent relative errors within a certain range: a circle indicates that the relative error is within (0%, 2%), a triangle denotes that it is within (2%, 5%), and a plus sign indicates that it is 5% or above. For the convenience of explanation, small samples refer to samples with lengths of less than 15, and large samples are those with lengths of more than 15. In each year, the predicted relative error presented a nonlinear relationship with sample length. Compared with those for large samples, the predicted relative errors of GM (1,1) models established for small samples were stable. Concretely, the predicted relative error fluctuated as the sample length increased and rose sharply when the sample length exceeded 30.
The fitting degree of each model was evaluated using its internal evaluation indicators. The relevancy was generally low, ranging within 0.557–0.699, and exceeded 0.6 when the sample length was 5–8, 10–14, or 26–35. The other indicators were satisfactory. With the increase in sample length, the ratio of mean square error rose from 0.007 to 0.236 but remained below 0.35, and the small error probability was 1 throughout. According to Table 1, the accuracy grade of the GM (1,1) models was therefore superior. The above analysis showed that the GM (1,1) models established for different samples are reasonable.
A comparison of the average relative errors of the models showed that the average relative error increased markedly with the increase in sample length. The average relative error was within 1% when the sample length was less than 7, and it did not exceed 5% for samples with lengths of 8–15. The average relative error was greater than 10% when the sample length was greater than 24 and reached 39.97% when the sample length was 35. The fitting results thus showed that a small sample length results in a low in-sample error and a high fitting degree.
The bottom point (trough) of each fluctuation was then observed. The trough appeared at different locations in different years. In terms of the predicted relative errors in individual years, the errors of samples with lengths of less than 15 and of those with lengths of approximately 29 were small during 2013–2015. The predicted relative errors of samples with length of 17–26 were small. For samples with lengths of less than 15, the relative errors during 2013–2015 varied gently, and those in 2014 and 2015 were low, whereas the crest appearing during 2016–2017 was high and exceeded the 10% limit. For samples with lengths of 15–25, the crest was remarkably reduced, and the change amplitude decreased over the years; in particular, most relative errors in 2016 and 2017 were less than 5%. As the years progressed, the trough moved toward shorter sample lengths (though still around 25 or above): it appeared at a length of 30 in 2013 and at a length of 25 in 2016.
MAPEs under different predicted step sizes were calculated using the predicted relative error series of the models under different sample lengths. The relationship between MAPE and sample length was obtained for each step size, as shown in Fig. 2. The symbols in Fig. 2 have the same meanings as those in Fig. 1. As the sample length increased, the change in MAPE at each step size was consistent with the trend in Fig. 1, and MAPE increased abruptly when the sample length exceeded 30. For samples with lengths of less than 30, the MAPE curve became gentler with the increase in step size. Among the samples with MAPE below 5% in multistep prediction, the prediction effects of steps 3 and 4 were favorable, followed by that of step 2, and steps 1 and 5 had poor prediction effects. Compared with samples of other lengths, samples with lengths of less than 15 showed gentle MAPE changes at each step and a satisfactory prediction effect. For samples with lengths of more than 15, MAPE changed considerably, although the prediction effect was good for samples with lengths of approximately 29. Thus, samples with lengths of less than 15 were suitable for establishing a gray fuzzy prediction model.

Relationship between MAPEs under different predicted step sizes and sample length.
Figures 1 and 2 show that the relationship between the prediction performance of the GM (1,1) model and the sample length was nonlinear. The model prediction performance fluctuated, and the fluctuation amplitude increased markedly with sample length, indicating that the GM (1,1) model has better prediction ability for small samples than for large samples.
The prediction effect of the models established using samples with lengths of approximately 29 was favorable. For these samples, with lengths slightly less than 30, the MAPEs predicted in steps 1 and 2 fell below 2%, and the prediction performance exceeded that for some small samples (Fig. 2). However, Table 3 shows that the in-sample errors of these models, that is, their average relative errors, were large and exceeded 10%, indicating that the models had a poor fitting effect on the original series and that the extrapolation ability of a model is not positively correlated with its fitting degree. A model with good extrapolation ability might not have a good fitting effect on the original series. As shown in Fig. 1, the predicted relative errors of the models also changed considerably from year to year within this interval. For instance, the relative errors of samples with lengths of 29 were low during 2013–2015, whereas those in 2016 and 2017 were high.
Although the GM (1,1) model could obtain a favorable effect for some large samples, its effect was unstable. Therefore, this model was unsuitable for the prediction of large samples and had a relatively better prediction effect for small samples with lengths of less than 15. The most suitable length for GM (1,1) model construction was found by comparing Figs. 1 and 2. First, the GM (1,1) model was suitable for three-step prediction. As shown in Fig. 1, the prediction effect for small samples was good during 2013–2015, whereas it was poor during 2016–2017. As shown in Fig. 2, the prediction effects of steps 2 and 3 were good. Although the MAPEs predicted through four steps were low, the relative error was large in 2016. Second, further observation indicated that, for samples with lengths of 8–13, the MAPEs predicted using steps 1, 2, and 3 were low in Fig. 2, and the predicted relative errors in 2013 and 2014 were small, as shown in Fig. 1. The predicted relative error in 2015 was higher than that of samples with lengths of less than 8, but the errors within this length interval were still less than 5%. Thus, the GM (1,1) model is suitable for three-step prediction using samples with lengths of 8–13.
As discussed in Section 4, sample lengths of 8–13 are suitable for three-step prediction. In particular, the three-step prediction effect for samples with a length of nine is good, as shown in Figs. 1 and 2. Therefore, a sample with a length of nine was selected to predict the real incomes of urban residents in Hebei Province during 2018–2020. The resulting solution of the gray fuzzy differential equation is expressed as:
The absolute value of the development gray number in the model was 0.097, which was smaller than 0.3, indicating that the GM (1,1) model was significant. The internal evaluation indicators of the model showed that the relevancy was 0.535, the ratio of mean square error was 0.067, the small error probability was 1, and the in-sample average relative error was 3.22%. According to Table 1, the model had a superior accuracy grade. For the three-step prediction, Formula (18) was evaluated at k = 9–11, and the obtained prediction result was the real per capita disposable income with 1978 as the base period. This value was then converted into the value with 2017 as the base period. The results are listed in Table 4. With 2017 as the base period, the real per capita disposable income of urban residents will continuously increase in the next three years, the annual growth rate will exceed 10%, and per capita income will exceed RMB 40,000 by 2020. The growth rate is high in 2018, reaching 13.50%, and then stabilizes at approximately 10%.
Prediction result of the real per capita disposable income of urban residents in Hebei Province during 2018–2020 (yuan)
Figure 3 shows the model prediction results. The curve is constituted by the fitted and predicted values of the model, that is, the exponential curve expressed by Formula (18). The deviation of each point from the curve is small, indicating that the GM (1,1) model fits the changing trend of the per capita disposable income of urban residents.

Real per capita disposable income of urban residents in Hebei Province during 1978–2020.
The gray fuzzy prediction model is a method that finds generated data with strong regularity through data processing in a system containing uncertain components and establishes a differential equation to predict future trends. This method is applicable to data with small sample size and unknown distribution. Most scholars have focused on improving the methods to increase the prediction accuracy, and few have concentrated on the selection of sample length. Therefore, the real per capita disposable income of urban residents in Hebei Province was used as an example, and samples 3–35 in length were selected. The effects of sample length on the prediction performance of the GM (1,1) model were investigated by comparing the prediction results of GM (1,1) models established using these samples. Samples with proper lengths were then selected to predict the variation trend of real per capita disposable incomes of urban residents in Hebei Province after 2017. The conclusions are summarized as follows:
Sample length has a minor influence on development gray number in the gray fuzzy differential equation, whereas it has a major effect on endogenous control gray number.
Sample length presents a nonlinear relationship with the predicted relative error of the model. The model-predicted relative error fluctuated with sample length, and the fluctuation amplitude increased markedly as the sample length grew.
Compared with large samples with lengths of more than 15, small samples with lengths of less than 15 are more suitable for establishing the gray fuzzy prediction model, as shown by the relationship between MAPE at each step size and sample length.
Some large samples have prediction performance as good as that of small samples. However, the in-sample errors of the models built on large samples are large, exceeding 10%, and their prediction errors are unstable across years. Therefore, the GM (1,1) model is unsuitable for the prediction of large samples.
Small samples with lengths of 8–13 are suitable for three-step prediction with the GM (1,1) model.
With 2017 as the base period, the real per capita disposable income of urban residents in Hebei Province will continuously increase in the next three years, the annual growth rate will exceed 10%, and per capita income will exceed RMB 40,000 by 2020.
The above conclusions address the limitations of existing studies by considering the effects of sample length on prediction performance. Custom functions for the GM (1,1) model were written in R to calculate the results of 33 models. The nonlinear relationship between sample length and gray fuzzy prediction performance was determined on the basis of visualized model results, providing evidence for the conclusion drawn in existing studies that the gray fuzzy prediction model is suitable for small-sample data, and sample lengths suitable for modeling were proposed. This study has some limitations: only the traditional GM (1,1) model was examined, and other derivative models in the gray fuzzy model system were not analyzed. The above conclusions provide a theoretical foundation and guidance for future research on and application of gray fuzzy prediction.
