Abstract
Grey system models have proven effective in diverse fields and play an important role in decision science. Among the various approaches of grey theory, the fractional-order grey model is fundamental and extends the cumulative generation method used in grey theory. The fractional-order cumulative generating operator offers significant benefits, especially for forecasting educational funding, which is often influenced by economic policies. However, the computational complexity of fractional-order operators complicates their generalization to real-world scenarios. In this paper, an enhanced fractional-order grey model is proposed based on a new fractional-order accumulated generating operator. The model estimates its parameters by the method of least squares and determines its order with a metaheuristic algorithm. Monte Carlo simulations and practical case analyses show that the proposed model outperforms both existing grey prediction models and machine learning models in small-sample environments. Moreover, our experiments indicate that the proposed model has a simpler structure than previously developed grey models while achieving greater prediction accuracy.
Introduction
Forecasting results should enable decision-makers to make rational and scientific decisions. To support resource allocation and risk avoidance, researchers have developed numerous forecasting methods, each with its own advantages and limitations, and these methods are constantly being improved as mathematical modelling advances. Statistical methods [1], fuzzy mathematics [2], and grey system theory [3] are the primary methods of forecasting. More recently, with the advent of big data, deep learning has become widely used in time series forecasting [4]. However, practical applications often require predictions from small data, i.e., when only a limited amount of data is available for analysis. This has been an ongoing research topic over the past few years.
Grey system theory is primarily concerned with modelling problems under small-sample conditions. After decades of development, it has been successfully applied in various industries, addressing problems that are difficult to solve with machine learning or physics-based models when samples are small [5]. Within grey system theory, scholars have focused their attention on predictive models [6]. Through the work of numerous scholars, grey prediction has evolved into a discipline that addresses many practical prediction problems in fields such as energy [7], management [8] and economics [9]. For example, Zhang et al. [10] established a framework to differentiate valid and invalid grey information based on their impact on model performance, and put forward a probabilistic accumulation grey forecasting model that utilizes the Bernoulli distribution. Sapnken [11] introduced a hybrid multivariate grey model optimized by a genetic algorithm (GA) to forecast oil product consumption in Cameroon; the model employed a sequential selection forecasting mechanism for enhanced accuracy. Tulkinov [12] employed two intelligent grey forecasting models, namely the optimized discrete grey forecasting model DGM(1,1,α) and the optimized even grey forecasting model EGM(1,1,α,θ), to forecast electricity production. Duman et al. [13] proposed an optimized multivariate grey model to forecast electronic equipment waste, taking into account numerous input factors. Islam et al. [14] developed a particle swarm optimization-based GM(1,1) model to estimate a warehouse’s primary performance characteristics given limited historical data. Hamzacebi et al. [15] predicted the annual electricity consumption of Turkey using an optimal grey model, implemented through both direct and iterative approaches. To forecast conventional energy usage, Kumar et al. [16] utilized three time-series models, namely the Grey-Markov model, the grey model with rolling mechanism, and singular spectrum analysis (SSA).
Fractional-order grey prediction models integrate grey system theory with fractional-order calculus. Researchers introduced a fractional-order cumulative generating operator, which enhanced the degrees of freedom and precision of the grey prediction model by extending the order from integer to fractional values [17]. For instance, Huang et al. [18] utilized a hybrid optimization approach comprising particle swarm optimization (PSO) and a genetic algorithm to determine the optimal order for their fractional-order cumulative grey model (FAGM). In the fractional-order grey prediction model, the fractional-order accumulation generating operator is a discrete variant of the fractional-order integral. Therefore, fractional-order grey prediction models and fractional-order calculus are intimately related.
With the development of fractional-order calculus [19], two types of fractional-order grey prediction models have emerged: one that pairs fractional-order accumulation with an integer-order derivative, in difference-equation or differential-equation form [20], and one that uses both fractional-order derivatives and fractional-order accumulation [21]. Integrating discrete and continuous fractional operators in a grey prediction model can potentially enhance prediction accuracy, but it also increases the computational complexity of the model. Depending on the type of derivative used, fractional-order grey prediction models may rely on non-local fractional-order derivatives [22] or on local fractional-order derivatives [23]. Each type has its own merits, with the local fractional-order grey prediction model being computationally simpler. Several grey models with local fractional-order derivatives have proven useful in real-world applications. The conformable fractional-order derivative [24] and the Hausdorff fractional-order derivative [25] are two local fractional-order derivatives of simple form and are therefore commonly employed in the study of grey systems. Wang proposed a novel Hausdorff fractional NGMC(p,n) grey prediction model based on the NGMC(1,n) model; the hyperparameters of the model were optimized using the grey wolf optimizer (GWO), and the experimental results demonstrated a high degree of accuracy [26]. Chen presented a novel fractional Hausdorff discrete grey model and applied it to renewable energy consumption for the years 2021 to 2023 in three different areas; the experimental results indicated that the model is capable of accurate prediction [27]. Li developed a novel grey prediction model using support vector regression and Hausdorff derivatives [28].
Moreover, the fractional Hausdorff grey prediction model and its enhanced variants have been used to assist decision-makers and address industry challenges concerning the environment [29], the economy [30], and human resources [31]. In this study, we present an enhanced fractional-order Hausdorff model, whose primary objective is to reconstruct the model by combining the revised fractional-order accumulation and difference operators. An introduction to the grey prediction model can be found in Fig. 1, and a summary of the relevant symbols can be found in Table 1.

Fig. 1. Introduction to the relevant concepts of the grey system model.
Table 1. Meaning of main symbols.
In comparison with the previous literature, the main contributions of this study can be summarized as follows:
(1) We develop a new fractional-order grey prediction model based on Hausdorff difference and accumulation operators and verify its validity using numerical examples and Monte Carlo simulation.
(2) The new model is used to forecast education funding in four provincial-level regions of China. Compared with the benchmark models, the new model predicts education funding more accurately and can be applied to real-life cases.
(3) Based on the validated model, we forecast education funding in the four regions over the next several years and make recommendations regarding future educational resource allocation in those regions.
The remainder of this paper is organized as follows. Section 2 presents basic material, including the theory of fractional-order calculus and the fractional-order grey prediction model. Section 3 discusses the computation of prediction values, the estimation of parameters, the assessment of model performance, and the optimization of hyperparameters. Section 4 presents the datasets, empirical findings, and related analyses. The last section summarizes the paper.
Preliminaries
This subsection provides some preparatory information. Throughout this paper,
Aside from the previously defined discrete Hausdorff derivative and integral, another definition can be introduced. The effectiveness of these new operators will be verified through experiments in the following section.
The basic definition of the optimized grey model
Following the above definitions, we will provide an optimized fractional order grey prediction model in conjunction with equations (9). Set
Because a difference approximates a differential, the derivative of the differentiable function $x^{(\alpha)}(t)$ can be approximated by a difference [36],
Formula (15) is used to estimate the model’s parameters $[\beta_1, \beta_2]^T$.
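To make the least-squares step concrete, the following minimal sketch estimates the two parameters of the classical integer-order GM(1,1) model and then applies its time response function. It is not the OFHGM(1,1) model: the fractional Hausdorff operators are not reproduced here, and ordinary first-order accumulation is swapped in. The function name `gm11_fit_predict` and the parameter names `a` and `b` are illustrative assumptions.

```python
import numpy as np

def gm11_fit_predict(x0, horizon=0):
    """Classical GM(1,1): return (a, b, fitted values followed by `horizon` forecasts)."""
    x0 = np.asarray(x0, dtype=float)
    n = len(x0)
    x1 = np.cumsum(x0)                    # first-order accumulation (1-AGO)
    z1 = 0.5 * (x1[1:] + x1[:-1])         # background values
    # Least squares for the grey equation x0(k) + a * z1(k) = b, k = 2..n
    B = np.column_stack([-z1, np.ones(n - 1)])
    y = x0[1:]
    (a, b), *_ = np.linalg.lstsq(B, y, rcond=None)
    # Time response function of the whitening equation (k starts at 0)
    k = np.arange(n + horizon)
    x1_hat = (x0[0] - b / a) * np.exp(-a * k) + b / a
    # Inverse accumulation restores the original-scale series
    x0_hat = np.concatenate([[x1_hat[0]], np.diff(x1_hat)])
    return a, b, x0_hat

# Usage on a short, made-up growth series with two out-of-sample steps
a, b, x0_hat = gm11_fit_predict(100 * 1.1 ** np.arange(6), horizon=2)
```

A fractional-order variant would replace the cumulative sum and the final differencing with the corresponding fractional accumulation and its inverse, while the least-squares step keeps the same form.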
The relevant literature describes this proof process [9, 37]. Following the calculation of the parameters and the discrete form of the response function, the fractional Hausdorff accumulation series can be predicted as follows:
Once the fitted and predicted values have been obtained, it is necessary to evaluate the model’s effectiveness. To evaluate the prediction performance of the OFHGM(1,1) model, four evaluation criteria were selected based on this research: the mean absolute percentage error (MAPE), the mean absolute error (MAE), the root mean square error (RMSE), and the mean square error (MSE), which are characterized as follows:
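For reference, the four criteria can be computed as in the sketch below, which assumes their standard textbook definitions (MAPE expressed in percent, errors taken as actual minus predicted). The function name `forecast_errors` is a hypothetical helper, not part of the original study.

```python
import numpy as np

def forecast_errors(actual, predicted):
    """Return MAPE (in percent), MAE, RMSE and MSE for two equal-length series."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    err = actual - predicted
    mse = float(np.mean(err ** 2))
    return {
        "MAPE": float(np.mean(np.abs(err / actual)) * 100.0),  # requires nonzero actuals
        "MAE": float(np.mean(np.abs(err))),
        "RMSE": float(np.sqrt(mse)),
        "MSE": mse,
    }

# Usage with made-up values
errors = forecast_errors([100.0, 200.0], [110.0, 180.0])
```

MAPE is scale-free, which makes it convenient for comparing regions with different funding levels, whereas MAE, RMSE and MSE are expressed in the units of the data (or their square).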
PSO for optimizing order of fractional order grey prediction model
1: Initialize a swarm of particles with random positions and velocities
2: Evaluate the fitness of each particle (i.e., the performance of the fractional order grey prediction model with the order represented by the particle)
3: While the stopping criterion is not met, repeat steps 4 to 8:
4: For each particle, update the velocity according to Eq. (28)
5: For each particle, update the position according to Eq. (29)
6: Evaluate the fitness of each particle
7: Update the personal best position of each particle if its current fitness is better
8: Update the global best position if the current particle’s fitness is better
9: End of loop
10: Return the order represented by the global best position
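The steps above can be sketched as a one-dimensional PSO over the model order. This is a minimal illustration under stated assumptions: Eqs. (28) and (29) are taken to be the standard inertia-weight velocity and position updates, the values of `w`, `c1`, `c2`, the swarm size, the iteration count and the clipping to the search bounds are illustrative choices, and the quadratic fitness is a stand-in for the fitting error of the grey model.

```python
import numpy as np

def pso_minimize(fitness, lower, upper, n_particles=20, n_iter=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize a scalar function of one variable on [lower, upper] with PSO."""
    rng = np.random.default_rng(seed)
    span = upper - lower
    pos = rng.uniform(lower, upper, n_particles)             # step 1: random positions
    vel = rng.uniform(-span, span, n_particles) * 0.1        # step 1: random velocities
    fit = np.array([fitness(p) for p in pos])                # step 2: initial fitness
    pbest, pbest_fit = pos.copy(), fit.copy()
    g = int(np.argmin(fit))
    gbest, gbest_fit = pos[g], fit[g]
    for _ in range(n_iter):                                  # step 3: fixed iteration budget
        r1, r2 = rng.random(n_particles), rng.random(n_particles)
        # step 4: inertia + cognitive + social terms (assumed form of the velocity update)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lower, upper)               # step 5: move, stay in bounds
        fit = np.array([fitness(p) for p in pos])            # step 6
        improved = fit < pbest_fit                           # step 7: personal bests
        pbest[improved], pbest_fit[improved] = pos[improved], fit[improved]
        g = int(np.argmin(pbest_fit))                        # step 8: global best
        if pbest_fit[g] < gbest_fit:
            gbest, gbest_fit = pbest[g], pbest_fit[g]
    return float(gbest), float(gbest_fit)                    # step 10

# Stand-in fitness with a known minimum at 0.6; in the setting of this paper
# it would be the fitting MAPE obtained for a given order.
best_order, best_fit = pso_minimize(lambda a: (a - 0.6) ** 2, 0.0, 1.0)
```

Because only a single hyperparameter is searched, a dense grid search over the order would be a simple cross-check on the value returned by the swarm.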
Selecting suitable hyperparameters can improve the predictive accuracy of a model, so an optimization model is needed to determine them. In this study, the only hyperparameter of the model is its order, and we only need to identify the order with the smallest fitting error during the fitting phase. The fitting error in this study is MAPE, and the corresponding optimization model is as follows:
Because of its nonlinear nature, equation (27) cannot be solved using conventional analytical methods, as numerous studies have demonstrated. In this article, we use the PSO technique to search for the optimal order. PSO is a population-based optimization algorithm for solving practical optimization problems. It has been used in various fields owing to its simple rules, rapid convergence, and relatively few control parameters. The specific steps of the computation are as follows. To begin, formulate the fitness function as
Second, assume that m particles move in a D-dimensional search space, where
In this subsection, Monte Carlo simulation is used to examine and validate the predictive stability of the new model. Four kinds of random sequences with distinct distributions are considered (normal, chi-square, exponential, and F distributions). Random sequences of length 12 following each of the four distributions were generated in Matlab, with the distribution parameters (degrees of freedom, standard deviation, and mean) set in turn to 5, 10, 15, and 20. For each parameter value, 100 sequences were generated. The OFHGM(1,1), CFGM(1,1) and FHGM(1,1) models were then fitted to the generated sequences separately, and the resulting errors were measured using MAPE. As can be seen from Fig. 2, except for the normal distribution, the proposed model performs better than FHGM(1,1) and CFGM(1,1) on the sequences following the other three distributions, i.e., it yields smaller errors and fewer outliers. This suggests that the proposed grey prediction model has some advantages when fitting data sets that follow certain distributions. We can also observe that the fitting errors of the three models are relatively close for the normally distributed series, which indicates that all three models fit normally distributed data comparably well.

Fig. 2. Boxplots of the fitting error (MAPE) of OFHGM(1,1), FHGM(1,1) and CFGM(1,1) on data following four distributions, with the distribution parameter set to 5, 10, 15 and 20.
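The data-generation step of such a simulation can be sketched as follows, using NumPy in place of Matlab. How the single parameter maps onto each distribution is an assumption of this sketch: it is used as the standard deviation of the normal distribution, the degrees of freedom of the chi-square distribution, the mean of the exponential distribution, and both degrees of freedom of the F distribution. The mean of the normal distribution (`normal_mean`) is not stated in the text and is a hypothetical argument, as is the function name `simulate_sequences`.

```python
import numpy as np

def simulate_sequences(param, n_series=100, length=12, normal_mean=0.0, seed=0):
    """Draw n_series random sequences of the given length from each distribution."""
    rng = np.random.default_rng(seed)
    size = (n_series, length)
    return {
        "normal": rng.normal(loc=normal_mean, scale=param, size=size),
        "chi-square": rng.chisquare(df=param, size=size),
        "exponential": rng.exponential(scale=param, size=size),
        "F": rng.f(dfnum=param, dfden=param, size=size),
    }

# One batch of 100 sequences per parameter value used in the experiment
batches = {p: simulate_sequences(p, seed=p) for p in (5, 10, 15, 20)}
```

Each row of a batch would then be fitted by the models under comparison, and the per-sequence MAPE values collected for the boxplots.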
Water is indispensable for human survival. Accurate water resource prediction helps to allocate water rationally and can provide early warnings of water scarcity. However, monitoring water resource data often requires considerable time and effort, making it difficult to obtain a large amount of data for modelling. In such cases, grey prediction models can provide better results, because many existing machine learning models require vast amounts of data. To validate the effectiveness of our model, we conducted further analysis using water supply data¹ (measured in 100 million cubic meters) from five regions in China. Specifically, we used data from 2004 to 2017 as the training set to establish the model, and data from 2018 to 2021 as the testing set to verify the model’s accuracy. Table 2 reports the prediction errors on the testing set; the OFHGM(1,1) model achieved the lowest prediction error among the compared fractional-order models on all five regional datasets. Taking Fujian as an example, the MAPE of OFHGM(1,1) is 1.9889%, while the MAPE values of CFGM(1,1) and FHGM(1,1) are 4.5678% and 4.9448%, respectively. This indicates that the new model can achieve superior results in practical applications.
Table 2. Comparative analysis of MAE, RMSE, MSE, and MAPE across different forecasting models.
Grey models are suited to problems of “small sample, poor information”. Education data sets are usually small and are affected by economic [38], social and other uncertain factors, so the grey system is well suited to predicting them. In this section, the OFHGM(1,1) model is used to forecast the funding for education (unit: 10,000 yuan) in four provincial-level regions of China (Inner Mongolia, Heilongjiang, Gansu and Ningxia). The data for the four cases come from the official website of the National Bureau of Statistics of China. The efficacy of OFHGM(1,1) is evaluated by comparing it with FHGM(1,1), CFGM(1,1), DGM(1,1) [39], LSTM (Long Short-Term Memory) [40], GRU (Gated Recurrent Unit) [40], MLP (Multilayer Perceptron) [41] and GM(1,1) [5]. Among these models, LSTM, GRU, and MLP are classical machine learning techniques. LSTM and GRU are recurrent neural networks that address the long-term dependence problem of the ordinary RNN (recurrent neural network). MLP, a feedforward artificial neural network, can model nonlinear dependence and is robust to noise. We build each model on data from 2013 to 2017 and then evaluate its predictions against data from 2018 to 2019. The prediction performance of the models is compared using MAPE together with the other error indicators. According to Table 3, the OFHGM(1,1) model shows the best prediction performance on all four education funding data sets, achieving the minimum value of all four error indicators among the compared models, which also suggests good generalizability.
For example, when predicting education expenditure in Gansu, the MAPE of the OFHGM(1,1) model is 1.5663%, whereas that of the FHGM(1,1) model is 3.9222%, showing that the revised operators improve prediction performance. Notably, the machine learning models predict relatively poorly, particularly the deep learning models with complex structures. Because these models have many parameters and require a large amount of data to train, they are often difficult to apply to small data sets. This example demonstrates that grey prediction models have certain advantages on small samples, and that the enhanced model yields still more accurate predictions.
Table 3. Results of MAE, RMSE, MSE and MAPE for different forecasting models.
a In Inner Mongolia, Heilongjiang, Gansu, and Ningxia, the orders are 0.72852, 0.73893, 0.61627 and 0.67223, respectively.
b The activation function is ReLU, the optimizer is Adam, and the number of units in the first hidden layer is 50.
c The activation function is ReLU, the optimizer is Adam, and the number of units in the first hidden layer is 50.
d The activation function is ReLU, the optimizer is Adam, and there is a single hidden layer consisting of 50 neurons.
e In Inner Mongolia, Heilongjiang, Gansu, and Ningxia, the orders are 0.54342, 0.50025, 0.37666 and 0.79434, respectively.
f In Inner Mongolia, Heilongjiang, Gansu, and Ningxia, the orders are 0.69298, 0.66797, 0.58712 and 0.7499, respectively.
Finally, because the new model offers advantages for projecting education funding in the four regions, we use it to provide policymakers with better-grounded recommendations. Using the OFHGM(1,1) model, we predict the education funding for these four regions over the next eight years. According to Fig. 3, by 2027, education funding in Inner Mongolia and Heilongjiang will remain stable and will not differ significantly from 2020, whereas education funding in Gansu and Ningxia will continue to increase, with Gansu achieving the highest growth rate among the four regions. Policymakers should allocate educational resources in light of these projections in order to support the high-quality and sustainable development of education.

Fig. 3. Prediction results for Inner Mongolia (top left), Heilongjiang (top right), Gansu (lower left) and Ningxia (lower right) from 2020 to 2027.
A precise forecast of educational financing aids the rational allocation of resources. However, because of economic factors, only a limited amount of data can be used for modelling purposes. To further reduce the complexity of the fractional-order grey model, we propose a novel Hausdorff fractional-order grey prediction model, also known as a local fractional-order grey model. In this study, we develop an enhanced fractional-order Hausdorff grey prediction model and a technique for estimating the model’s parameters. The model is evaluated using data on education funding from four Chinese regions. Compared with other grey prediction models and machine learning models, OFHGM(1,1) provides more accurate predictions. The numerical results show that the proposed model outperforms current grey prediction models and machine learning models in forecasting education funding, suggesting that the proposed approach can effectively reveal patterns hidden in small data sets. There are, however, still some tasks to be completed: (1) Even though the OFHGM(1,1) model yields excellent fitting accuracy, overfitting may occur; methods for preventing overfitting remain to be investigated. (2) The optimization technique will be enhanced to increase the accuracy of the predictions.
Declaration of competing interest
The authors declare no conflicts of interest regarding the publication of this paper.
Acknowledgements
This research was supported by the National Natural Science Foundation of China (No. 62007020) and the Project funded by China Postdoctoral Science Foundation (No. 2022M711883).
Footnotes
1. The samples were collected from the China Statistical Yearbook, which can be downloaded at https://www.stats.gov.cn/
