Abstract
Short-term load prediction plays an increasingly important part in power system administration, load dispatch, and energy transfer scheduling. However, building a model that improves the accuracy of load forecasts remains both an extremely challenging problem and a pressing concern for the power market. In particular, individual models often pay little attention to data selection, data preprocessing, and model optimization, so they cannot always satisfy the requirements of time series forecasting. Taking these neglected factors into account, and aiming to enhance prediction accuracy and reduce computational complexity, this study proposes a novel and robust method for multi-step forecasting that combines data selection, data preprocessing, an artificial neural network, a rolling mechanism, and an artificial intelligence optimization algorithm. Case studies using electricity load data from New South Wales, Australia, are used to evaluate the performance of the developed model. The experimental results demonstrate that the proposed model significantly increases the accuracy of load prediction in all quarters. The proposed method is not only simple but also achieves significant improvement compared with the other forecasting models, and can be an effective tool for power load forecasting.
Introduction
Precise short-term load prediction plays an increasingly significant role in electricity market operation. A highly precise prediction approach is one of the most crucial tools for advancing electrical power system development [1]. Effective prediction helps the relevant departments formulate economic policies and reduce management risk [2], and it plays an essential role in electricity risk management [3]. Conversely, inaccurate electrical load forecasts lead to considerable losses for electric power systems [4]. A 1% reduction in load prediction error, measured by the mean absolute percentage error (MAPE), reduces the variable generation cost by between $0.6 and $1.6 million annually for a 10,000 MW utility with a MAPE of approximately 4% [5]. Major blackouts have occurred many times around the world, such as the 2003 U.S.-Canada blackout [6] and the Indian power grid collapse [7], severely affecting the production of enterprises and the lives of millions of people. If an accurate forecasting algorithm could provide early warning of such events, accidents of this kind could be avoided.
Accordingly, various statistical models have been widely applied to load forecasting, including the autoregressive model [8], autoregressive moving average model [9, 10], autoregressive integrated moving average model [11, 12], regression model [13], multiple linear regression [14], general exponential smoothing [15], and the Kalman filtering method [16]. Nevertheless, the related literature shows that conventional statistical methods have an inevitable drawback: they cannot properly explain the complicated relationship between the electric power load and the random variables that cause highly uncertain changes in electricity demand [17]. Furthermore, traditional statistical models adapt poorly to changing data series: they predict well only for time series with fixed variation patterns and cannot adequately handle the nonlinear characteristics of power load data. In addition, because of their simple structure, their prediction accuracy for power load data is not high, especially when the volatility of the series is high. Hence, artificial intelligence algorithms, such as the Elman neural network [18, 19] and support vector machines [20, 21], are regarded as powerful prediction tools with strong robustness and fault tolerance for short-term load prediction and related problems. Zhang et al. [22] combine a two-layer decomposition approach, an extreme learning machine, and a differential evolution algorithm to achieve short-term electricity load forecasting; Australian and Spanish power data demonstrate that the developed model performs well. Afrasiabi et al. [23] employ a deep learning strategy for power market forecasting and show that it is a viable method.
An advanced hybrid method, utilizing modified variational mode decomposition to decompose the power load data and employing a machine learning technique optimized by a sparrow search algorithm to predict power demand, is proposed in [24]. Yang et al. [25] combine adaptive parameter-based variational mode decomposition, an optimal kernel-based extreme learning machine model, and a chaotic sine cosine algorithm to predict electricity prices. Iwabuchi et al. [26] conduct electricity price prediction based on switching mother wavelets, wavelet transform, and long short-term memory, and achieve good forecasting results. Meng et al. [27] employ empirical wavelet transform, a long short-term memory network, and the crisscross optimization algorithm to predict power prices. Khan et al. [28] introduce a convolutional neural network to solve the problem of power load prediction and achieve satisfactory results. Özen and Yıldırım [29] propose a Bootstrap aggregation approach to power price prediction. Du et al. [30] combine a grey model with fractional order accumulation, seasonal factors, a sine cosine algorithm, and an error correction strategy to predict power consumption, yielding a model with strong adaptability and usefulness. In general, machine learning techniques have achieved very effective results in power load prediction and related fields, and have become one of the main research directions in power load prediction.
The back propagation neural network (BPNN), a typical machine learning technique, is capable of modeling nonlinear systems. However, the performance of BPNN can be unstable on account of the random allocation of input weights and hidden biases. To address this problem, this paper puts forward a novel prediction model that combines data selection, the wavelet de-noising (WD) algorithm, BPNN, a rolling mechanism (RM), and a modified artificial bee colony (MABC) algorithm, denoted by WDRMABCBP. Specifically, data selection is designed to select suitable training datasets to build the model, while the WD data preprocessing algorithm is employed to reduce the interference in the original data. In addition, the RM is developed to achieve multi-step forecasting and improve forecast accuracy, while the global search technique MABC is used to find the optimal initial weights and biases for the BPNN model. To measure the effectiveness of the proposed model, electricity load data from New South Wales, Australia, are used as the case study, and two experiments are conducted. The experimental results demonstrate that the proposed model performs better than the other algorithms.
The main contributions of this paper can be summarized as follows: (1) An innovative multi-step forecasting model is developed for electrical load forecasting, which combines the merits of data selection, data preprocessing, an artificial neural network, a modified artificial intelligence optimization algorithm, and a rolling mechanism. Its forecasting ability is validated in the Australian electricity market. (2) Unlike some previous studies, data selection is considered in the modeling process, which can further improve forecasting performance by providing training datasets with the same properties as the testing datasets. (3) A modified artificial intelligence optimizer is employed to search for the main parameters and develop the optimal forecasting model, while data preprocessing is designed to remove noise from the original electrical power load data. Experimental results indicate that data preprocessing and intelligent optimization enhance the forecasting model’s performance. (4) A rolling mechanism is designed to achieve multi-step forecasting, which can improve the overall accuracy of the developed model for multi-step forecasting. Experimental results show that the rolling mechanism can be considered a viable option for multi-step forecasting.
The rest of this paper is arranged as follows. A general depiction of the essential methods of the novel approach is displayed in
Methodology
In this part, the essential methods of the novel approach are described in detail.
Data preprocessing
The wavelet de-noising algorithm, as a data preprocessing tool, is employed to remove the interferences from the original data. The relevant literature suggests that several decomposed wavelet details are related to the mean value of noise, whereas others are connected with the ‘clean’ series [31]. Hence, if we remove the ‘unimportant’ noise information, the original signal can be reconstructed without losing any significant details of the original signal [32–34]. The wavelet function ‘wden’ is the general form for denoising in practice [35]. Hence, the function is performed to denoise the original signal in this paper.
The choice of parameter settings plays a significant part in achieving satisfactory de-noising results. The relevant reference suggests that the best results are obtained with the Coif, Daubechies, and Symlet wavelets [33]. Therefore, only these wavelet basis functions are considered in this study. In addition, the choice of the decomposition level and the thresholding type (soft or hard) is also important for obtaining an ideal de-noising effect. Four threshold selection algorithms, as shown in Table 1, are considered in this paper.
Four threshold selection algorithms [33]
The noise removing algorithm is given as follows:
The algorithm aims to suppress the interference e and thereby recover the raw signal. To achieve the best result, three threshold rescaling options (‘one’, ‘sln’, and ‘mln’) can be employed in this process. The option ‘one’ is based on the white-noise model shown in Equation (1), while ‘sln’ is related to the fundamental model with unscaled white noise and uses a single estimate of the noise level. The option ‘mln’ corresponds to the model with non-white noise. The choice of threshold rescaling option depends on the estimated noise level.
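The thresholding idea described above can be illustrated with a minimal sketch. It is not the ‘wden’ routine used in this paper: for brevity it assumes a one-level Haar transform (rather than the Coif, Daubechies, or Symlet wavelets), soft thresholding, a universal threshold of sigma * sqrt(2 ln N), and a single noise-level estimate from the detail coefficients. All function names are hypothetical, and the input length is assumed to be even.

```python
import math
from statistics import median

def haar_dwt(signal):
    """One-level Haar transform: returns (approximation, detail) coefficients."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuilds the signal from both coefficient sets."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out

def soft_threshold(coeffs, thr):
    """Shrink every coefficient toward zero by thr; small ones become zero."""
    return [math.copysign(max(abs(c) - thr, 0.0), c) for c in coeffs]

def denoise(signal):
    """Soft-threshold the high-frequency (detail) part and reconstruct."""
    approx, detail = haar_dwt(signal)
    # Assumed noise-level estimate: median absolute detail coefficient / 0.6745.
    sigma = median(abs(d) for d in detail) / 0.6745
    # Assumed threshold rule: the universal threshold sigma * sqrt(2 ln N).
    thr = sigma * math.sqrt(2.0 * math.log(len(signal)))
    return haar_idwt(approx, soft_threshold(detail, thr))
```

With a threshold of zero the reconstruction returns the input unchanged, so the amount of smoothing is controlled entirely by the threshold rule and the noise estimate.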
The BPNN model, which is trained by the back-propagation learning algorithm, is one of the typical artificial neural network models. It is composed of three types of layers: the input layer, the hidden layer, and the output layer. The architecture of a BPNN is determined by the number of nodes in each layer. Each node is a neuron that computes the inner product of the input vector and the weight vector and passes it through a nonlinear transfer function to obtain a scalar result.
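The node computation and one back-propagation update can be sketched as follows. This is a generic illustration, not the configuration used in this paper: the tanh transfer function, the linear output node, the learning rate, and the class name TinyBPNN are all assumptions.

```python
import math
import random

class TinyBPNN:
    """Three-layer network: inputs -> tanh hidden nodes -> one linear output."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # Randomly allocated initial weights and biases.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                         for _ in range(n_hidden)]
        self.b_hidden = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b_out = 0.0

    def forward(self, x):
        """Each hidden node: inner product of inputs and weights, then tanh."""
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w_hidden, self.b_hidden)]
        y = sum(w * h for w, h in zip(self.w_out, hidden)) + self.b_out
        return hidden, y

    def train_step(self, x, target, lr=0.05):
        """One gradient-descent update on the error 0.5 * (y - target)^2."""
        hidden, y = self.forward(x)
        err = y - target
        for j, h in enumerate(hidden):
            # Error propagated back to hidden node j (tanh' = 1 - h^2),
            # computed before the output weight is changed.
            delta_h = err * self.w_out[j] * (1.0 - h * h)
            self.w_out[j] -= lr * err * h
            for i, xi in enumerate(x):
                self.w_hidden[j][i] -= lr * delta_h * xi
            self.b_hidden[j] -= lr * delta_h
        self.b_out -= lr * err
        return 0.5 * err * err
```

Because the starting point is random, two runs with different seeds can converge to different solutions, which is the instability that motivates optimizing the initial weights and biases.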
The BPNN algorithm can be depicted as Algorithm 1:
Artificial intelligence optimization algorithm
In this part, the principle of the original artificial bee colony algorithm is introduced, and then the modified optimization algorithm is described in detail.
Original artificial bee colony algorithm
Karaboga proposed the artificial bee colony (ABC) algorithm, an artificial intelligence method inspired by the behavior of honeybees foraging for nectar, to solve practical problems [36]. The ABC algorithm is composed of three groups of honeybees: employed bees, onlookers, and scouts. The honeybee foraging behavior shown in Fig. 1 helps to explain the ABC algorithm [37].

Behavior of honeybee foraging for nectar.
The population of the ABC algorithm is composed of SN D-dimensional vectors of decision variables, and each solution
In the beginning, each solution
Each employed bee is responsible for modifying the position of its food source by randomly selecting a neighboring food source. A new food source
Each onlooker bee selects a food source from among those of the employed bees, favoring better sources. The onlooker bee chooses a source by applying the roulette wheel selection method, which is represented as
If an individual cannot be further improved after a given number of trials (denoted by limit), then the individual is discarded. If
The relevant literature suggests that the search equation given by Equation (4) is good at exploration but poor at exploitation [38, 39]. To overcome this weakness of the original ABC, a modified search equation is developed as:
The MABC algorithm can be generalized as Algorithm 2:
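The employed-bee, onlooker, and scout phases described in this section can be sketched as follows. This is a minimal illustration of the original ABC only, assuming the standard neighbor update v_ij = x_ij + phi * (x_ij - x_kj) with phi drawn uniformly from [-1, 1] and the usual fitness transform 1 / (1 + cost). It is not the MABC of Algorithm 2, and the modified search equation is not used here. The function name, parameter names, and default values are hypothetical.

```python
import random

def abc_minimize(cost, dim, bounds, n_sources=20, limit=20, cycles=300, seed=0):
    """Minimize cost(vector) with a basic artificial bee colony search."""
    rng = random.Random(seed)
    lo, hi = bounds

    def new_source():
        return [rng.uniform(lo, hi) for _ in range(dim)]

    def fitness(c):
        # Standard transform so that lower cost means higher fitness.
        return 1.0 / (1.0 + c) if c >= 0 else 1.0 + abs(c)

    sources = [new_source() for _ in range(n_sources)]
    costs = [cost(s) for s in sources]
    trials = [0] * n_sources
    best_cost = min(costs)
    best = list(sources[costs.index(best_cost)])

    def try_neighbor(i):
        # Perturb one random dimension using a random partner k != i.
        k = rng.choice([m for m in range(n_sources) if m != i])
        j = rng.randrange(dim)
        cand = list(sources[i])
        cand[j] += rng.uniform(-1.0, 1.0) * (sources[i][j] - sources[k][j])
        cand[j] = min(max(cand[j], lo), hi)
        c = cost(cand)
        if c < costs[i]:                      # greedy selection
            sources[i], costs[i], trials[i] = cand, c, 0
        else:
            trials[i] += 1

    for _ in range(cycles):
        for i in range(n_sources):            # employed bees
            try_neighbor(i)
        fits = [fitness(c) for c in costs]
        for _ in range(n_sources):            # onlookers: roulette wheel
            i = rng.choices(range(n_sources), weights=fits, k=1)[0]
            try_neighbor(i)
        for i in range(n_sources):            # scouts replace exhausted sources
            if trials[i] > limit:
                sources[i] = new_source()
                costs[i] = cost(sources[i])
                trials[i] = 0
        i_best = min(range(n_sources), key=costs.__getitem__)
        if costs[i_best] < best_cost:         # memorize the best solution so far
            best, best_cost = list(sources[i_best]), costs[i_best]
    return best, best_cost
```

The best solution is memorized separately because a scout may discard a source that has stopped improving, even if it is currently the best one.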
Recent data can be used to improve prediction accuracy [40]. The rolling mechanism, as a metabolism method, updates the input by discarding the oldest data in each loop, and it is normally employed to improve forecast performance [41]. In each loop, the next forecasting step always uses the most recent data. Here, ‘recent data’ refers not only to true values but also to predicted values. In this way, the rolling mechanism can be combined with a forecasting model to achieve multi-step forecasting.
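The rolling update described above can be sketched as follows. The function name and the predict_one callable are hypothetical; any one-step model, such as a trained BPNN, could be passed in. The window of three values is an illustrative default.

```python
def rolling_forecast(history, predict_one, steps, window=3):
    """Multi-step forecast: feed each prediction back in as the newest input."""
    buffer = list(history[-window:])
    forecasts = []
    for _ in range(steps):
        y_hat = predict_one(buffer)
        forecasts.append(y_hat)
        # Discard the oldest value and append the newest (predicted) one.
        buffer = buffer[1:] + [y_hat]
    return forecasts

# Usage with a toy one-step model that extrapolates the last difference.
print(rolling_forecast([1.0, 2.0, 3.0], lambda w: 2 * w[-1] - w[-2], steps=3))
# prints [4.0, 5.0, 6.0]
```

Because later steps are computed from earlier predictions rather than true values, errors can accumulate as the horizon grows.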
The BPNN model with rolling mechanism, denoted as RBP algorithm, can be depicted as Algorithm 3:
Fig. 2 illustrates the forecasting process of the RBP algorithm, which is established to make multi-step predictions.

An example of the forecasting procedure by RBP.
The proposed model will be elaborated in this section, and the flowchart is displayed in Fig. 3. It is composed of four parts.

The flowchart of the proposed model.
The neural network is very sensitive to historical data, and bad data disturb both its training and its prediction [42]. On the one hand, training data are crucial for a model’s forecasting performance. Therefore, in this study, data selection (see Fig. 4 Part b) is designed to select suitable training datasets to build the model, which can further improve the forecasting performance by providing training datasets with the same properties as the testing datasets. On the other hand, power load signals are generally noisy; hence, the wavelet de-noising algorithm is employed to reduce the interference in the original data and alleviate the influence of noise. Compared with the low-frequency component, the high-frequency component is more random and has less influence on future electrical load data. As a result, the sequence with the highest frequency is eliminated and the other sequences are reconstructed, so that the major fluctuations of the load sequence are retained. The reconstructed series is then input into the load prediction model as the independent variables.

The overview of our experiments.
Selecting suitable network inputs and outputs plays a vital role in establishing the network structure because the dynamic behavior of the network is highly dependent on the selected inputs and outputs. Many researchers choose the load data of the preceding few periods as inputs and some recent load data as outputs. When there is more than one input and output, and especially when the number of outputs is large, forecast performance may be weakened and accuracy may be harder to improve, because the large number of weights and biases makes the structure of the BPNN complex. In this paper, we found that the loads of the previous three periods are the prevailing candidate inputs, with an important influence on the characteristics of the load profile. Therefore, in this paper, taking one-step forecasting as an example, the input number is 3, meaning that the electrical power load forecasting value
Part three: Determine the optimization techniques
Many studies have applied optimization techniques to neural networks. Nevertheless, directly optimizing the basic BPNN model is time-consuming, because the structure of the model becomes more complicated as the number of nodes increases. As mentioned above, our BPNN model is of moderate size, so a well-performing model can be obtained with artificial intelligence optimization methods. In this study, the MABC algorithm is adopted as the optimization technique. This method improves the performance of the model and mitigates its tendency to fall into local minima, because it searches globally and does not require the error function to be differentiable. The algorithm shows great strength in determining connection weights and improving performance. Therefore, the MABC is suitable for optimizing the BPNN model.
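One common way to let a population-based optimizer tune a network is to encode all weights and biases as one flat vector and use the training error as the objective. The following sketch shows that encoding under stated assumptions: three inputs, four hidden nodes (a hypothetical value, not the setting in Table 4), tanh hidden nodes, a linear output, and mean squared error as the objective. The function names are hypothetical.

```python
import math

N_IN, N_HIDDEN = 3, 4
# Hidden weights + hidden biases + output weights + output bias.
N_PARAMS = N_HIDDEN * N_IN + N_HIDDEN + N_HIDDEN + 1

def decode(vector):
    """Split a flat parameter vector into the network's weights and biases."""
    assert len(vector) == N_PARAMS
    pos = N_HIDDEN * N_IN
    w_hidden = [vector[j * N_IN:(j + 1) * N_IN] for j in range(N_HIDDEN)]
    b_hidden = vector[pos:pos + N_HIDDEN]
    w_out = vector[pos + N_HIDDEN:pos + 2 * N_HIDDEN]
    b_out = vector[pos + 2 * N_HIDDEN]
    return w_hidden, b_hidden, w_out, b_out

def predict(vector, x):
    """Forward pass of the network defined by the flat vector."""
    w_hidden, b_hidden, w_out, b_out = decode(vector)
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out

def objective(vector, samples):
    """Mean squared error over (inputs, target) pairs; lower is better."""
    return sum((predict(vector, x) - t) ** 2 for x, t in samples) / len(samples)
```

The optimizer only needs to evaluate this objective for candidate vectors, so no gradient is required. The best vector found can then serve as the initial weights and biases for back-propagation training.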
Part four: Building the model and evaluation
In this section, the optimal parameters found by MABC are used to train the BPNN model. Combined with the rolling mechanism, this yields the proposed model. The model is then used to produce 1-step and 6-step load forecasts. After the predicted values of the models are acquired, the different models are compared to evaluate and demonstrate the superiority of the proposed model for multi-step power load forecasting.
Experiments and analysis
For the sake of assessing the performance of the proposed model in conducting load forecasting, two experiments are conducted in this part with the load data of the State of New South Wales, Australia. The schematic diagram of our experiments is displayed in Fig. 4 Part a.
Data selection
The selection of datasets is one of the vital steps in prediction. To improve forecasting performance and verify the adaptability of the novel model to short-term load prediction, the influence of seasonal factors on the power load data must also be taken into account. As can be seen from Fig. 4 Part b, we first classify the data by quarter, and then identify weekdays and weekends [43] within each quarter. The original dataset is therefore divided into eight types of electrical power data, and the electrical power of each type is predicted separately.
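The eight-way split described above can be sketched as follows. The sketch assumes calendar quarters (January to March as the first quarter) and treats Saturday and Sunday as the weekend; the paper's exact quarter boundaries and any holiday handling are not specified here. The function names are hypothetical, and the load values in the usage example are placeholders, not real data.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def group_key(ts):
    """Return (quarter, day type) for a timestamp."""
    quarter = (ts.month - 1) // 3 + 1
    day_type = "weekend" if ts.weekday() >= 5 else "weekday"
    return quarter, day_type

def split_by_quarter_and_day_type(records):
    """Group (timestamp, load) records into quarter x day-type subsets."""
    groups = defaultdict(list)
    for ts, load in records:
        groups[group_key(ts)].append((ts, load))
    return dict(groups)

# Usage: one record per day of 2021 with placeholder loads.
start = datetime(2021, 1, 1)
records = [(start + timedelta(days=d), 0.0) for d in range(365)]
groups = split_by_quarter_and_day_type(records)
print(len(groups))  # prints 8
```

Each subset can then be used to train and test a separate model, so that the training data share the seasonal and day-type properties of the testing data.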
The performance metric
To evaluate the predictive performance and characterize the models more comprehensively, three performance metrics are adopted, as shown in Table 2. Smaller metric values represent better forecast performance.
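As a minimal sketch, the three metrics reported in the experiments (MAE, MAPE, and RMSE) can be computed as follows, assuming their standard definitions and MAPE expressed in percent. The function names are hypothetical.

```python
import math

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))
```

MAE and RMSE are in the units of the load, whereas MAPE is scale-free, which makes it convenient for comparing quarters with different load levels.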
Three metric rules
Besides, to evaluate the improvement percentages of accuracy between the two algorithms, the decreased relative error (RE) is proposed in the study, as presented in Table 3.
The decreased relative error (RE)
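The RE can be illustrated with a short sketch. The formula below, the error reduction relative to the benchmark error in percent, is an assumed form; it reproduces the first-quarter one-step and six-step MAPE reductions reported for Experiment I. The function name is hypothetical.

```python
def decreased_relative_error(benchmark_error, proposed_error):
    """Percentage by which the proposed model reduces the benchmark's error."""
    return (benchmark_error - proposed_error) / benchmark_error * 100.0

# First-quarter weekday MAPE values (%) from Experiment I.
print(round(decreased_relative_error(0.99, 0.60), 4))  # RBP, one-step: prints 39.3939
print(round(decreased_relative_error(0.91, 0.60), 4))  # RMABCBP, one-step: prints 34.0659
```

A positive RE means the proposed model has the smaller error, and a larger RE indicates a larger improvement over the benchmark.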
To assess the performance of the proposed model, two experiments are conducted in this paper. As previously mentioned, the original dataset is divided into eight types of electrical power data; each power series from the New South Wales load data is predicted separately, and the performance of the different models is compared for each. The testing sample consists of one day’s and five days’ data for the weekday case studies, and one day’s data for the weekend case studies. Although both one-day and five-day testing samples are used for weekdays and all metric values are reported for both, only the figures for the five-day testing sample are shown. The one-day testing sample is adopted only to verify that the proposed model can be applied effectively to load forecasting without retraining the network on recent data, which indicates that the proposed model generalizes well. Retraining the network can be expected to improve forecasting accuracy further where conditions permit; however, retraining is not the main focus of this study. Therefore, we adopt a simple method that achieves the desired accuracy without retraining the network on new data.
Moreover, a trial-and-error approach is employed to determine the best experimental parameters of wavelet de-noising, BPNN, and MABC, such as the number of decomposition levels, the number of nodes, and the number of follower bees. Table 4 shows the experimental parameters selected by this method.
The experiment parameters in this study
To estimate the effectiveness of the novel model, three models are established to make one-step and six-step predictions. Tables 5 and 6 display the comparison of the four quarterly prediction results of each model. MAE, MAPE, RMSE, and TIME are adopted to measure the forecasting performance. In addition, the average value of each index over the four quarters is calculated to show each model’s average performance over one year. Fig. 5 shows the one-step and six-step-ahead prediction results achieved by the different models on the five-day samples. On both weekdays and weekends, the proposed model clearly performs better than the others.

The forecasting results of the developed model and other models (Experiment I).
The results of the proposed model and other models (Experiment I)
(1) Table 5, Part (1) shows the comparison of the different models using half-hourly data from weekdays, with a testing sample composed of one day’s data. Comparing the three prediction models shows that the proposed model is robust and obtains better performance, as indicated by its smaller MAPE, MAE, and RMSE values. For instance, in the first quarter, the one-step and six-step forecasting MAPE values of the proposed model are 0.60% and 2.24%, respectively, compared with 0.99% and 3.03% for the RBP model and 0.91% and 2.63% for the RMABCBP model. For weekday one-step forecasting in the first quarter, the developed model therefore reduces MAPE by 39.3939% and 34.0659% compared with RBP and RMABCBP, respectively. For six-step forecasting, the corresponding reductions are 26.0726% and 14.8289%. For the other three quarters, the RE values are presented in Table 6, Part (1).
The decreased relative error (RE, %) between different models (Experiment I)
(2) From Table 5, Part (2), it can also be concluded that the proposed model forecasts the half-hourly load data effectively and obtains the best prediction results among all models. Comparing Table 5, Parts (1) and (2), we can conclude the following: (a) If the test results improve as the number of test samples increases, the model can be said to have a strong ability to forecast the power load. For example, in the first and fourth quarters, the evaluation index values in Part (2) are smaller than those in Part (1). These results indicate that the proposed model can be applied effectively to electricity load forecasting, at least in the first and fourth quarters; (b) In the second and third quarters, the results in Part (2) are worse than those in Part (1). However, this does not mean that the proposed model forecasts poorly, because the results in Part (1) show that the developed model can achieve high-precision forecasting, and the accuracy shown in Part (2) is still acceptable to some extent; (c) In our view, the results in Part (2) are worse because the load data in these quarters show weak regularity. We expect that adding new data to the training sample at the end of each day and then retraining the model would improve performance.
(3) From Table 5, Part (3) and Table 6, Part (3), the following can be observed: (a) The proposed model has the best performance among all models, indicating that it can capture the characteristics of the load data series and achieve good load forecasting performance. Except for the second quarter, the MAPE value of the proposed model is below 3% in the six-step forecasting experiment, and the MAPE values of the one-step and six-step predictions are all within a reasonable range. The one-step and six-step average MAPE values for the year are 0.61% and 2.71%, respectively; (b) Compared with the original model, the proposed model improves performance. The average MAPE improvements of the proposed model over the individual RBP model in one-step and six-step predictions are 41.9048% and 30.6905%, respectively; (c) Compared with the RMABCBP model, the proposed model also improves performance considerably. The MAPE improvements of the proposed model over the RMABCBP model in one-step and six-step predictions are 42.4528% and 17.3780%, respectively; (d) The reason for the results in (a)-(c) is that the combination of the WD and MABC algorithms effectively enhances the forecasting capacity of the individual RBP model. Because the WD algorithm reduces the abrupt fluctuations in the original load data and the MABC algorithm selects the best initial weights and biases for the RBP model, the optimized BPNN model can achieve high prediction accuracy.
In summary, we can reach our conclusion that the proposed model can forecast the one-step and six-step half-hourly load data effectively on weekdays and weekends.
In this section, we compare the performance of the different models in six-step forecasting based on hourly load data, divided into weekday data and weekend data. The weekday prediction precision measurements of the four quarters are calculated and compared in this experiment, and the results are displayed in Tables 7–8. Fig. 6 shows the six-step-ahead prediction results achieved by the different models on the five-day samples. On both weekdays and weekends, the proposed method clearly performs better than the other models.

The forecasting results of the developed model and other models (Experiment II).
The results of the proposed model and other models (Experiment II)
(1) Table 7, Part (1) shows the comparison of the different models using hourly data from weekdays, with a testing sample composed of one day’s data. Comparing the three prediction models shows that the proposed model is robust and obtains better performance, as indicated by its smaller MAPE, MAE, and RMSE values. For instance, in the fourth quarter, the six-step forecasting MAPE of the proposed model is 1.98%, compared with 3.31% for the RBP model and 3.04% for the RMABCBP model. The proposed model’s average six-step MAPE is 2.75%.
(2) The proposed model performs better than both the RBP model and the RMABCBP model. For example, in the first quarter, the developed model reduces the six-step MAPE by 38.8235% and 25.0000% compared with the RBP and RMABCBP models, respectively. This demonstrates that the de-noising method is powerful for data preprocessing.
(3) As shown in Table 7, Part (1), except for the first quarter, the MAPE value of the proposed model is below 3% in the six-step forecasting experiment, and the six-step MAPE values are all within a reasonable range. The six-step MAPE values of the four quarters are 3.12%, 2.91%, 2.98%, and 1.98%, respectively, giving an average of 2.75% for the year.
(4) To further validate the model’s weekday load forecasting capability, more data were selected as the testing sample. Table 7, Part (2) and Table 8, Part (2) show the results of the different models using hourly weekday data (5 days as the test sample) and the corresponding RE values. According to these comparisons, the proposed algorithm is superior to the other algorithms and performs satisfactorily in six-step forecasting. Therefore, we can conclude that the data preprocessing method and the optimization algorithm are effective, and that the developed model can predict the load data effectively on weekdays.
The decreased relative error (RE, %) between different models (Experiment II)
(5) The weekend prediction precision measurements of the four quarters are also calculated and compared in this experiment, and the results are displayed in Table 7, Part (3) and Table 8, Part (3). As these tables show, the proposed model achieves the highest precision compared with the RBP and RMABCBP models. More precisely, in six-step forecasting for the first quarter, the developed model achieves reductions of 33.9806% and 24.0223% in MAPE in comparison with the RBP and RMABCBP models, respectively. The corresponding decreases in MAE are 37.9420% and 25.0044%, and the decreases in RMSE are 35.4361% and 20.9629%. For the other three quarters, the decreases in MAPE and MAE are presented in Table 8, Part (3), and the results of the different models using hourly weekend data are listed in Table 7, Part (3).
In summary, combining the results shown in Tables 7-8 and Fig. 6, we can conclude that the developed model forecasts the six-step hourly load data effectively on weekdays and weekends.
In this study, an innovative model based on data selection, data preprocessing, an artificial neural network, a modified artificial intelligence optimization algorithm, and a rolling mechanism is developed for multi-step electrical load forecasting. Based on the case study in the Australian electricity market, the following findings and contributions can be summarized: (1) the developed model improves electrical power load forecasting performance compared with the benchmark models; (2) data selection ensures that the developed model can be trained using datasets with the same properties as the testing datasets; (3) the modified optimization algorithm improves the model from the perspective of parameter optimization, while data preprocessing reduces the difficulty of modeling from the perspective of data characteristics; (4) the rolling mechanism can be combined with a forecasting technique to achieve multi-step forecasting, which can be considered a promising method for future study.
In addition, the main focus of this study is developing an innovative multi-step forecasting model based on historical electrical power load data, which leaves some research directions for future study. On the one hand, from the perspective of experimental data, variables such as weather conditions can be considered to take full advantage of influencing factors; on the other hand, from the perspective of modeling technique, a more powerful time series forecasting model can be established based on deep learning methods and other techniques. Furthermore, in future studies, the modeling framework and idea can be extended to other forecasting fields, such as wind speed forecasting [44], grey forecasting models [45], tourism demand forecasting [46], oil price forecasting [47], bending force forecasting in the hot strip rolling process [48], air quality prediction [49], and building energy consumption prediction [50].
Acknowledgments
This work is supported by the National Natural Science Foundation of China (Grant No. 72101138), Humanities and Social Science Fund of Ministry of Education of the People’s Republic of China (Grant No. 21YJCZH198); Shandong Provincial Natural Science Foundation, China (Grant No. ZR2021QG034, ZR2022QG036); Social Science Planning Project of Shandong Province (Grant No. 22DJJJ24); and Special Support for Post-doc Creative Funding in Shandong, China (Grant No. 202103018).
