Abstract
Electric load forecasting (ELF) is a key area of modern power system (MPS) applications and of virtual power plant (VPP) analysis. ELF is prominent in distinct MPS and VPP applications such as real-time analysis of energy storage systems, distributed energy resources, demand side management and electric vehicles. To manage real-time challenges and map a stable power demand at different time steps, ELF is evaluated on a yearly, monthly, weekly, daily or hourly basis. In this study, an intelligent load predictor that is able to forecast the electric load for the next month, day or hour is proposed. The proposed approach is a hybrid model combining empirical mode decomposition (EMD) and a neural network (NN) for multi-step ahead load forecasting. The model performance is demonstrated by using historical datasets collected from GEFCom2012 and GEFCom2014. To demonstrate the performance, three case studies are analyzed in two categories. The results show the high acceptability of the proposed approach with respect to the standard value of MAPE (mean absolute percent error).
Keywords
Introduction
Generally, in the MPS there are five main types of forecasting mechanism, distinguished by time horizon (TH), which are used in distinct power system applications [1–9]: 1) 1 year to 10 years, 2) 1 week to 1 year, 3) 1 minute to 1 week, 4) milliseconds to seconds and 5) nanoseconds to microseconds. Each time horizon has its own application area: forecasting between 1 year and 10 years is used for power system planning; forecasting between 1 week and 1 year is used for maintenance scheduling; forecasting between 1 minute and 1 week is used for UCA (unit commitment analysis), ELD (economic load dispatch), OPF (optimal power flow) and automatic generation control; forecasting between milliseconds and seconds is used for MPS dynamic analysis; and forecasting between nanoseconds and microseconds is used for MPS transient analysis [3].
ELF is an urgent requirement for MPS planning and is needed for applications such as: 1) capacity planning, 2) network planning, 3) generation and transmission capital investment planning, 4) financial forecasting, 5) efficient power procurement, 6) selling of excess power, 7) fuel ordering planning, 8) optimal supply and scheduling, 9) renewable planning, and 10) fuel mix selection planning. Moreover, ELF has numerous benefits: it ensures the availability of electricity supply; it provides a means of avoiding over- and underutilization of generating capacity; it supports the best possible use of capacity; it saves unnecessary capital expenditure; and it prevents economic growth of OPF. Apart from these benefits, ELF faces several challenges and uncertainties: the future need for electricity must be forecast; electricity production and distribution are highly capital intensive; project lead times are very long because of the large size of projects; forecasting is not an isolated activity; the role of electrical energy in society should be highlighted; national policy and strategy are key points; the forecast may be affected by policies and public perception; DSM and conservation policies place additional demands on ELF; high precision is required; a single ELF may not be reliable; and uncertainties are high. ELF therefore relies heavily on numerous key factors such as land, city, industrial and community development plans, alternative energy resources, load density, population growth, historical data and geographical factors.
At present, researchers have developed different types of forecasting models, such as: 1) statistical models (SMs), 2) artificial intelligence (AI)/machine learning models (AIM-MLMs), and 3) hybrid models (HMs) [1–9].
SMs are mathematical models based on a set of statistical assumptions and hypotheses. They represent the mathematical relationship between variables. Examples of SMs are AR (autoregressive), MA (moving average), ARMA (Autoregressive Moving Average), ARIMA (Autoregressive Integrated Moving Average), ARFIMA (Autoregressive Fractionally Integrated Moving Average), SARIMA (Seasonal Autoregressive Integrated Moving Average), ARMAX, ARIMAX, NAR (Non-linear Autoregressive), NMA (Nonlinear Moving Average), TAR (Threshold Autoregressive), ARCH (Autoregressive Conditional Heteroskedasticity), GARCH (Generalized ARCH), EGARCH (Exponential Generalized ARCH), the Kalman filter model, ES (exponential smoothing), and GM (Grey model). These SMs are used to forecast the electric load through linear and non-linear time series analysis. Apart from SMs, AIM-MLMs are used for ELF because of their superiority over the conventional methods. These methods include NN, SVM (support vector machine), ELM (extreme learning machine), fuzzy logic, wavelets, GA (genetic algorithm), expert systems, GEP (gene expression programming) and their associated hybrid models.
This paper is organized into five sections: the introduction in section-1, the study area and data collection in section-2, the proposed hybrid approach in section-3, results and discussion in section-4 and finally the conclusion in section-5.
Study area and data collection
An hourly electrical load dataset is collected from GEFCom2012 [10], which includes historical data for twenty zones from 2004 to 2008. Based on this dataset, three case studies are performed in this paper: 1) month-ahead forecast, 2) day-ahead forecast and 3) hour-ahead forecast. The historical dataset is rearranged and preprocessed according to these case studies. Moreover, a few statistical characteristics are analyzed to show the internal properties of the recorded dataset. These characteristics are shown in Figs. 1 to 4 for the min (minimum), max (maximum), mean and STD (standard deviation), respectively. The hourly recorded data are shown in Figs. 5 to 8 for the time intervals “1 to 6 hour”, “7 to 12 hour”, “13 to 18 hour” and “19 to 24 hour”, respectively.

Monthly minimum value of the historical recorded dataset.

Monthly maximum value of the historical recorded dataset.

Monthly mean value of the historical recorded dataset.

Monthly STD of the historical recorded dataset.

Per day hourly-value of the historical recorded dataset (1 to 6 hour).

Per day hourly-value of the historical recorded dataset (7 to 12 hour).

Per day hourly-value of the historical recorded dataset (13 to 18 hour).

Per day hourly-value of the historical recorded dataset (19 to 24 hour).
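The monthly summary statistics of the kind plotted in Figs. 1 to 4 can be computed directly from the hourly records. The following is a minimal sketch, assuming the records are available as (year, month, load) tuples. The function name `monthly_stats`, the record layout and the sample values are hypothetical and are not taken from the GEFCom2012 data, and the use of the population standard deviation is also an assumption.

```python
from collections import defaultdict
from statistics import mean, pstdev


def monthly_stats(records):
    """Group hourly load records by (year, month) and summarize each group.

    `records` is an iterable of (year, month, load) tuples (assumed layout).
    Returns {(year, month): {"min": ..., "max": ..., "mean": ..., "std": ...}}.
    """
    groups = defaultdict(list)
    for year, month, load in records:
        groups[(year, month)].append(load)
    return {
        key: {
            "min": min(vals),
            "max": max(vals),
            "mean": mean(vals),
            "std": pstdev(vals),  # population STD; an assumption
        }
        for key, vals in groups.items()
    }


# Hypothetical toy records, not real measurements
sample = [(2004, 1, 10.0), (2004, 1, 14.0), (2004, 2, 20.0), (2004, 2, 20.0)]
stats = monthly_stats(sample)
```

If the sample standard deviation is preferred, `statistics.stdev` can be substituted for `pstdev`.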
The proposed hybrid model for multi-step ahead electric load forecasting is presented in Fig. 9. It combines EMD and NNs. The proposed approach comprises five sub-parts: 1) historical dataset collection, 2) data pre-processing and decomposition, 3) development and training of the different NN models, 4) unification of the trained models' outputs and 5) testing and model adoptability checking.

Flowchart of the proposed hybrid model.
In sub-part-1, historical records of the electric load (EL) dataset are collected; in sub-part-2 they are pre-processed with the EMD method to extract features. The resulting intrinsic mode functions (IMFs) are then used as predictor variables for the NN models. In sub-part-3, different NN models are developed according to the forecasting step. In this paper, forecasting is performed up to three steps ahead for all three case studies (i.e., month-, day- and hour-ahead forecasting). After the NN models are properly trained, the testing phase is performed, in which the adoptability of each trained model is checked to decide whether it is acceptable or must be retrained to enhance its performance and accuracy. Thereafter, the outputs of all developed models are unified, which gives the final output of the ELF. The key features of the proposed approach are: 1) real-time analysis, 2) data pre-processing without high memory capacity, 3) multi-step testing and forecasting in a continuous manner, 4) self-adoptability check with respect to the step-ahead forecasting, 5) forecasting with and/or without feature extraction to maintain the processing time limit, and 6) each forecasting step is correlated with the others, so there is no global trapping problem with the NNs.
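One common way to prepare training data when a separate model is built for each forecasting step is to pair a window of past values with the value that lies a given number of steps ahead. The sketch below illustrates this arrangement; it is an assumption about how such data could be organized and is not stated to be the authors' exact procedure. The function name `make_supervised`, the window length and the toy series are hypothetical.

```python
def make_supervised(series, n_lags, step):
    """Build (inputs, target) pairs for a `step`-ahead forecast.

    Each input is a window of `n_lags` consecutive values; the target is the
    value `step` positions after the end of that window.
    """
    if n_lags < 1 or step < 1:
        raise ValueError("n_lags and step must be positive")
    pairs = []
    last_start = len(series) - n_lags - step
    for start in range(last_start + 1):
        window = list(series[start:start + n_lags])
        target = series[start + n_lags + step - 1]
        pairs.append((window, target))
    return pairs


# Hypothetical toy series standing in for a load sequence
load = [1, 2, 3, 4, 5, 6]
one_step = make_supervised(load, n_lags=3, step=1)
three_step = make_supervised(load, n_lags=3, step=3)
```

With this layout, the one-step, two-step and three-step models would each be trained on their own set of pairs, and a series that is too short for the requested window simply yields an empty list.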
Several data pre-processing methods are available in the literature for decomposing linear and/or non-linear data samples, and EMD is one of them. The EMD method has several key features: 1) it decomposes multi-frequency signals into a series of IMFs, 2) any complicated dataset can be decomposed into a finite number of components, 3) it is adaptive, 4) it is highly efficient, and 5) the number of IMFs can be controlled by the user.
The procedure for implementing EMD to generate the IMFs is shown in Fig. 10. A function is defined as an IMF if it satisfies the following two conditions: 1) over the whole data record, the number of extrema and the number of zero-crossings must either be equal or differ by at most one, and 2) at any point, the mean value of the envelopes defined by the local maxima and the local minima is zero.

EMD method implementation procedure.
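The first IMF condition, which compares the number of extrema with the number of zero-crossings, can be checked with a few lines of code. The sketch below covers only that counting condition; the envelope-mean condition requires fitting upper and lower envelopes (typically with splines) and is not attempted here. The function names and the two test signals are hypothetical.

```python
def count_extrema(x):
    """Count strict local maxima and minima among the interior samples of x."""
    count = 0
    for prev, cur, nxt in zip(x, x[1:], x[2:]):
        if (cur > prev and cur > nxt) or (cur < prev and cur < nxt):
            count += 1
    return count


def count_zero_crossings(x):
    """Count sign changes between consecutive samples."""
    return sum(1 for a, b in zip(x, x[1:]) if a * b < 0)


def satisfies_count_condition(x):
    """IMF condition 1: the two counts are equal or differ by at most one."""
    return abs(count_extrema(x) - count_zero_crossings(x)) <= 1


oscillation = [1, -1, 1, -1, 1]  # oscillates symmetrically about zero
offset_wave = [3, 1, 3, 1, 3]    # same shape shifted up, never crosses zero

ok_oscillation = satisfies_count_condition(oscillation)  # True
ok_offset = satisfies_count_condition(offset_wave)       # False
```

The shifted wave fails the check because it has three extrema but no zero-crossings, which is the kind of component that the sifting procedure keeps refining.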
After EMD is applied to the historical dataset, the generated IMFs are shown in Fig. 11 for the electric load pattern of the year 2004. Although IMFs have been generated for the whole recorded dataset, only those for 2004 are presented in this section because of the page limit and for better visualization. The IMFs generated for the rest of the dataset are used for forecasting as described in the proposed approach. In this study, the following types of IMFs are generated for the whole historical dataset: 1) IMFs for monthly forecast, 2) IMFs for daily forecast, and 3) IMFs for hourly forecast.

IMFs representation for hour#1 of 2004 dataset.
Fig. 11 shows the IMFs generated for the hourly forecast, only for the first hour of the 24 hours of a day. To visualize the IMFs generated for the remaining hours, the evaluated energy magnitude is shown in Fig. 12 for all hourly datasets of the year 2004.

Energy magnitude representation for generated IMFs of hour#1 to hour#24 of 2004 dataset.
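The text does not define how the energy magnitude of an IMF is evaluated. A minimal sketch is given below under the assumption that the energy of a component is the sum of its squared samples, which is a common convention in signal processing. The function names and the toy components are hypothetical and are not IMFs computed from the paper's data.

```python
def imf_energy(imf):
    """Energy of one component, taken here as the sum of squared samples."""
    return sum(v * v for v in imf)


def energy_profile(imfs):
    """Return the energy of each component and its share of the total energy."""
    energies = [imf_energy(imf) for imf in imfs]
    total = sum(energies)
    shares = [e / total if total else 0.0 for e in energies]
    return energies, shares


# Hypothetical components used only to exercise the functions
toy_imfs = [
    [1.0, -1.0, 1.0, -1.0],
    [2.0, 2.0, -2.0, -2.0],
    [0.0, 0.0, 0.0, 0.0],
]
energies, shares = energy_profile(toy_imfs)
```

Comparing the shares across components gives a compact view of which IMFs carry most of the signal, which is useful when 24 hourly series cannot all be plotted.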
In this study, the NN architecture used is a single-hidden-layer MLP (multilayer perceptron), shown in Fig. 13. Assume that i1 and i2 are input1 and input2, respectively, of the training dataset I (i.e., I = [x1, x2, ..., xn]).

Flowchart of the NNs model’s implementation.
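A forward pass through a single-hidden-layer MLP of the general kind described here can be sketched as follows. This is a generic illustration and not the authors' implementation: the tanh hidden activation, the linear output neuron, the random weight initialization and the names `init_mlp` and `forward` are all assumptions.

```python
import math
import random


def init_mlp(n_inputs, n_hidden, seed=0):
    """Create small random weights for one hidden layer and one output neuron."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                for _ in range(n_hidden)]
    b_hidden = [0.0] * n_hidden
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b_out = 0.0
    return w_hidden, b_hidden, w_out, b_out


def forward(x, params):
    """Forward pass: inputs -> tanh hidden layer (assumed) -> linear output."""
    w_hidden, b_hidden, w_out, b_out = params
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


# Hypothetical usage with two inputs and three hidden neurons
params = init_mlp(n_inputs=2, n_hidden=3)
y = forward([0.2, -0.1], params)
```

Training (for example by backpropagation) is omitted; the sketch only shows how an input vector is mapped to a single forecast value.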
The mathematical modeling for the NN architecture is as follows:
Step1) evaluation at input layer:
Step2) evaluation at hidden layer:
Step3) evaluation at output layer:
Where,
The number of hidden-layer neurons is evaluated as:
In forecasting and prediction research, several standard indices are used to analyze the performance of forecasting approaches and models, among them MAE, MAPE and RMSE (root mean square error). In this study, MAPE is used to evaluate the performance of the proposed forecasting models.
Mean absolute error (MAE):
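The three indices named above have standard textbook definitions. For N actual values y_i and forecasts ŷ_i, MAE is the mean of |y_i − ŷ_i|, MAPE is 100 times the mean of |y_i − ŷ_i| / |y_i|, and RMSE is the square root of the mean of (y_i − ŷ_i)^2. A minimal sketch of these definitions follows; the function names and the sample values are hypothetical.

```python
import math


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percent error, in percent. Actual values must be nonzero."""
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


def rmse(actual, forecast):
    """Root mean square error."""
    total = sum((a - f) ** 2 for a, f in zip(actual, forecast))
    return math.sqrt(total / len(actual))


# Hypothetical values used only to exercise the functions
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 400.0]
scores = {
    "MAE": mae(actual, forecast),
    "MAPE": mape(actual, forecast),
    "RMSE": rmse(actual, forecast),
}
```

Because MAPE divides by the actual value, it is undefined when any actual load is zero, and it weights errors on small loads more heavily than MAE or RMSE do.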
The results are presented in two categories: 1) one-step ahead forecasting and 2) multi-step ahead forecasting. Both categories are further analyzed at three levels: a) monthly basis, b) daily basis and c) hourly basis. The dataset used for the forecasting models is described in Table 1, Table 2 lists the total number of NN models developed in this study, and the best results obtained from each case study are given in Table 3.
Dataset used for NNs model formation for ELF
Developed NNs models for multi-step forecasting
NNs performance for ELF: MAPE value
These three case studies (i.e., monthly, daily and hourly forecasting) are formulated in consideration of their distinct and valuable applications in the power system, such as system planning, maintenance scheduling, unit commitment, economic load dispatch management, OPF (optimal power flow) analysis and automatic generation control.
Month-ahead forecasting
Month-ahead forecasting is very useful for maintenance scheduling in the power system network. In this section, one-step ahead results for monthly forecasting are presented in Table 4 for the training and testing phases. According to [21], a forecasting model whose MAPE is less than 10% is highly acceptable. In this case study, the minimum and maximum MAPE for the training phase are 3.66 and 5.93, respectively, and the average MAPE over all months of a year is 4.42. In the testing phase, the minimum and maximum MAPE are 4.7 and 7.6, respectively. Therefore, Table 4 shows that the NN model developed for one-step ahead forecasting is acceptable and can be used for further analysis in multi-step ahead forecasting. Fig. 14 shows the results of one-step ahead forecasting on a monthly basis.
ANN model performance for case study#1: MAPE for Month-ahead LF

Performance measures for single step ahead monthly forecasting.
In this section, the day-ahead forecasting results for the testing phase are shown in Figs. 15 and 16 for the test datasets of 2006 and 2007, respectively. The test results show that the average MAPE values for Monday to Sunday are 5.63, 6.26, 5.98, 7.42, 9.23, 5.95 and 7.32, respectively, for 2006, and 6.41, 6.84, 9.5, 8.5, 2.99, 10.9 and 9.78, respectively, for 2007. The weekly average MAPE values for 2006 and 2007 are 6.83 and 7.85, respectively, which are highly acceptable for forecasting problems.

Performance measures for single-step ahead daily forecasting (Test#1 : 2006 dataset).

Performance measures for single-step ahead daily forecasting (Test#2 : 2007 dataset).
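The weekly averages quoted for the day-ahead case can be reproduced from the day-wise values given in the text. The short sketch below does this arithmetic and also lists any individual day whose average MAPE exceeds the 10% acceptability level cited from [21]. The function name `weekly_summary` and the use of a strict greater-than comparison are assumptions made for illustration.

```python
# Day-wise average test MAPE values (Monday to Sunday) as quoted in the text
mape_2006 = [5.63, 6.26, 5.98, 7.42, 9.23, 5.95, 7.32]
mape_2007 = [6.41, 6.84, 9.5, 8.5, 2.99, 10.9, 9.78]

ACCEPT_THRESHOLD = 10.0  # MAPE level treated as highly acceptable in the text


def weekly_summary(values, threshold=ACCEPT_THRESHOLD):
    """Return the rounded weekly mean and the indices of days above threshold."""
    mean_value = round(sum(values) / len(values), 2)
    above = [i for i, v in enumerate(values) if v > threshold]
    return mean_value, above


avg_2006, above_2006 = weekly_summary(mape_2006)
avg_2007, above_2007 = weekly_summary(mape_2007)
```

The computed means are 6.83 for 2006 and 7.85 for 2007, matching the reported weekly averages. For 2007 the list of days above the threshold contains index 5 (Saturday, 10.9), so the weekly average stays below 10 even though one individual day does not.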
In this section, the hour-ahead forecasting results for the testing phase are shown in Fig. 17 (in four consecutive panels) for the test dataset of 2007. The test results show that the average MAPE values over hour#1 to hour#24 (for each day) for Monday to Sunday are 9.72, 9.78, 9.64, 10.38, 9.82, 10.42 and 9.68, respectively, for 2007. The average over all 24 hours in a week is 9.92, which is less than 10. Hence, the NN models developed for hour-ahead forecasting are highly acceptable.

Performance measures for single-step ahead hourly forecasting (Test#2 : 2007 dataset).
Based on the performance of the NN models developed for one-step ahead forecasting, multi-step ahead forecasting is performed for two and three steps for all three cases (i.e., monthly, daily and hourly load forecasting), as explained in the subsequent sections.
Month-ahead forecasting
In this section, two-step ahead and three-step ahead results for monthly load forecasting are presented; they are tabulated in Tables 5 and 6, respectively, for the training and testing phases of the models.
Two-step ahead MAPE for monthly forecasting
Three-step ahead MAPE for monthly forecasting
In the two-step ahead load forecasting case study, the minimum and maximum MAPE for the training phase are 3.03 and 5.3, respectively, and the average MAPE over all months of a year is 3.79. In the testing phase, the MAPE ranges from 4.11 to 6.95, and its average values for 2006 and 2007 are 5.71 and 5.55, respectively.
Similarly, in the three-step ahead forecasting case study, the minimum and maximum MAPE for the training phase are 3.66 and 5.93, respectively, and the average MAPE over all months of a year is 4.42. In the testing phase, the MAPE ranges from 4.7 to 7.6, and its average values for 2006 and 2007 are 6.36 and 6.19, respectively.
Therefore, Tables 5 and 6 show that the NN models developed for two-step ahead and three-step ahead forecasting are acceptable and can be used for further implementation in real-world applications.
This section presents the day-ahead load forecasting results for two-step ahead and three-step ahead forecasting, which are listed in Tables 7 to 10 for the testing phase. Tables 7 and 9 present the testing phase results for the historical dataset of 2006 for two-step and three-step ahead forecasting, respectively. Similarly, Tables 8 and 10 present the testing phase results for the historical dataset of 2007 for two-step and three-step ahead forecasting, respectively.
Two-step ahead MAPE for daily EL forecasting (Test#1 : 2006 Dataset)
Two-step ahead MAPE for daily EL forecasting (Test#2 : 2007 Dataset)
Three-step ahead MAPE for daily EL forecasting (Test#1 : 2006 Dataset)
Three-step ahead MAPE for daily EL forecasting (Test#2 : 2007 Dataset)
In the two-step ahead daily-load forecasting case study, the average MAPE for the testing phase ranges from 5.85 to 7.65 (for the 2006 test dataset) and from 6.63 to 9.72 (for the 2007 test dataset), and the overall average MAPE over all days of the months is 6.92 (for the 2006 test dataset) and 7.87 (for the 2007 test dataset).
Similarly, in the three-step ahead daily-load forecasting case study, the average MAPE for the testing phase ranges from 6.03 to 7.84 (for the 2006 test dataset) and from 6.81 to 9.9 (for the 2007 test dataset), and the overall average MAPE over all days of the months is 7.1 (for the 2006 test dataset) and 8.05 (for the 2007 test dataset).
Therefore, Tables 7 to 10 show that the NN models developed for two-step ahead and three-step ahead daily load forecasting are acceptable and can be implemented in real-world applications.
In this section, the hour-ahead load forecasting results for two-step ahead and three-step ahead forecasting are presented; they are listed in Tables 11 and 12 for the testing phase. Table 11 presents the testing phase results for the historical dataset of 2007 for two-step ahead forecasting. Similarly, Table 12 presents the testing phase results for the historical dataset of 2007 for three-step ahead forecasting.
Two-step ahead MAPE for hourly EL forecasting (Test#2 : 2007 Dataset)
Three-step ahead MAPE for hourly EL forecasting (Test#2 : 2007 Dataset)
In the two-step ahead hourly-load forecasting case study, the 24-hour average MAPE for the testing phase ranges from 9.63 to 10.4 (for the 2007 test dataset), and the overall average MAPE over all hours of all 7 days is 9.93.
Similarly, in the three-step ahead hourly-load forecasting case study, the 24-hour average MAPE for the testing phase ranges from 9.97 to 10.83 (for the 2007 test dataset), and the overall average MAPE over all 24 hours of all 7 days is 10.31.
Hence, Tables 11 and 12 show that the NN models developed for 2-step-ahead and 3-step-ahead hourly load forecasting are acceptable, that their performance is highly reliable, and that the proposed approach can be implemented in future real-world applications.
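The overall-average test MAPE values quoted in the text for the daily and hourly cases can be lined up by forecasting step to see how the error changes as the horizon grows. The sketch below uses only those quoted figures; the dictionary layout and the function name `step_increase` are hypothetical.

```python
# Overall-average test MAPE for one-, two- and three-step ahead forecasting,
# as quoted in the text
reported = {
    "daily 2006": [6.83, 6.92, 7.1],
    "daily 2007": [7.85, 7.87, 8.05],
    "hourly 2007": [9.92, 9.93, 10.31],
}


def step_increase(values):
    """Absolute MAPE increase from the one-step to the last-step forecast."""
    return round(values[-1] - values[0], 2)


increases = {name: step_increase(v) for name, v in reported.items()}

# True when the MAPE never decreases as the number of steps grows
non_decreasing = all(
    all(a <= b for a, b in zip(v, v[1:])) for v in reported.values()
)
```

In all three series the MAPE grows with the number of steps, by 0.27, 0.2 and 0.39 percentage points respectively, and the three-step hourly average of 10.31 is the only one of these overall averages that lies slightly above 10.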
In this study, an intelligent load predictor that is able to forecast the electric load for the next month, next day or next hour has been proposed. The proposed approach is a hybrid model that combines EMD and NN for multi-step ahead ELF. The performance of the proposed model is demonstrated experimentally by using historical datasets collected from GEFCom2012 and GEFCom2014. To demonstrate the performance, three case studies (monthly, daily and hourly) have been analyzed in two categories (one-step ahead and multi-step ahead). In total, twenty-four NN models have been developed and demonstrated, as presented in section-4. The overall average MAPE of the proposed models during the training phase is 3.14 for month-ahead forecasting (MAF), 4.57 for day-ahead forecasting (DAF) and 6.5 for hour-ahead forecasting (HAF). During the testing phase, the overall average MAPE is 5.05 (MAF), 6.83 (DAF) and 9.15 (HAF) for test dataset#1 of 2006, and 4.88 (MAF), 7.85 (DAF) and 9.92 (HAF) for test dataset#2 of 2007. The results show the high acceptability of the proposed approach with respect to the standard value of MAPE: all of these overall averages are less than 10.
Hence, the proposed approach may be utilized for online ELF for the different MPS applications.
Acknowledgment
The authors extend their appreciation to the Researchers Supporting Project at King Saud University, Riyadh, Saudi Arabia, for funding this research work through the project number RSP-2020/278.
