Midterm Power Load Forecasting Model Based on Kernel Principal Component Analysis and Back Propagation Neural Network with Particle Swarm Optimization

Abstract

To improve the accuracy of midterm power load forecasting, a forecasting model is proposed that combines kernel principal component analysis (KPCA) with a back propagation neural network. First, the dimension of the input space is reduced by KPCA; the reduced data set is then fed to the neural network model, which is optimized by particle swarm optimization. The monthly average of daily peak loads is forecasted and used to modify the daily forecast values, and the daily peak loads are output in the end. When the model is tested on the data provided by the European Network on Intelligent Technologies, the mean absolute percent error of the load forecasting model is only 1.39%, which demonstrates the feasibility and validity of the model.

Introduction

Load forecasting has become the basic work of many power system departments such as planning, power consumption, and dispatching. Improving the level of load forecasting has always been a hot topic both at home and abroad.¹ According to the timescale, load forecast can be roughly divided into three categories: short-term load forecast, midterm load forecast (MTLF), and long-term load forecast (LTLF).²

Midterm load forecasting refers to the prediction of load values over the next few weeks to several months. Improving its accuracy supports planned power management: it helps to arrange the operation mode of the power system rationally, reduce power generation costs, and formulate reasonable power supply construction plans.

Short-term load forecasting refers to daily and weekly load forecasting. Power dispatchers use recent load trends and the load in the same period of the previous year to predict the load value for the next day or week.

Long-term load forecasting is used to predict the power consumption after a long period of time in the future, generally up to 3–5 years or even longer. It is mainly used for medium- and large-scale power grid planning such as power grid capacity expansion. MTLF is usually used for planning purposes, such as adjusting midterm plans and resource allocation.³

Over the past 50 years, dozens of different load forecasting methods have been used and recorded. However, it is evident that MTLF is usually ignored because of the error accumulation. Xia et al.⁴ reported the difficulty in accurate load forecasting due to the unstable randomness of various factors such as the development of the national economy. There are many methods used for MTLF such as time sequence method,⁵,⁶ linear regression method,⁷ gray model,⁸,⁹ support vector machine (SVM),¹⁰,¹¹ and neural network.¹²⁻¹⁵

In literature,⁵ Warrington et al. incorporated the time sequence reserve into the existing control scheme to adapt to the main technical and economic practices of the time series reserve in the power market. Participants were required to be able to respond to prediction errors in a predetermined manner. Their power output is corrected by scaling with the prediction error achieved during successive trading intervals while still stabilizing the grid frequency.

In literature,⁶ the aim was to develop and use an evolving linear model to predict synchronized time series, by assuming that there are multiple synchronized time sequences related to the time series being predicted. The idea is to use the information contained in the relevant series to make predictions. Although the time sequence model can describe a random process, prediction by time series modeling alone is often not very good, as there may be autoregressive phenomena in the forecasting residuals when multiple factors are not considered.

In literature,⁷ a new technology for predicting power load is proposed. This technique is based on fuzzy linear regression. The linear regression fuzzy model was developed using load impact factors such as load, population, and annual growth factors for the previous year. The results show that the average absolute error of the forecast weekly average daily load does not exceed 3.68% of the annual actual load. The linear regression method is a relatively mature algorithm. Using the model to statistically analyze the load data, the load change law can be accurately identified and a more accurate prediction can be made. However, there are many factors affecting the load, and these factors are not clearly reflected in the linear regression model.

The principle of additional prediction of the same dimension is discussed based on the gray theory in literature.⁸ It only requires a smaller amount of information for easy calculation. In addition, it has higher modeling accuracy. The model is used for midterm load forecasting in a region with an error rate of around 3%. In literature,⁹ a medium-term forecast was given based on the power consumption of Ningxia Autonomous Region in the next 5 years. The gray model has the characteristics of less data and less information, and it reveals the evolution of the entity. The gray model has many advantages, such as a small sample of data, without considering the distribution and trends, simple principles, convenient operation, high accuracy, and strong testability, and is widely used. However, the gray model is an exponential function and is suitable for a situation when the load is growing rapidly.

A multiple least-squares SVM-based midterm electricity market clearing price forecasting model is proposed in literature.¹⁰ The data classification and price forecasting module is designed to preprocess input data into corresponding price regions first and then predict electricity prices. Compared with the prediction model using a single least-square SVM, the proposed model shows improved prediction accuracy on the peak price and the overall system.

In literature,¹¹ the cointegration technique was used to analyze the influencing factors, and the cointegration correlation formula was used to predict the medium- and long-term power load. To improve the prediction accuracy, the SVM was used to correct the prediction error of the cointegration method. Compared with the traditional prediction method, the SVM method has a great improvement, but the prediction result of the SVM method has hysteresis relative to the actual value, and the error at the inflection point is large, which affects the prediction accuracy.

In the literature,¹² two techniques for midterm load forecasting of the Al-Dakhiliya distribution system, based on linear regression and neural networks, have been developed. The models use historical monthly load data, temperature, humidity, and wind speed. The simulation results show that the nonlinear neural network model is superior to the traditional multiple linear regression model and that its results are more reasonable and satisfactory. In the literature,¹³ neural network techniques and fuzzy logic were used to develop a two-stage medium-term load forecaster. The first stage consists of a neural network that is trained in a supervised manner on readily available historical data. The load forecast generated in the first stage is then passed to the second stage, a temperature-sensitive module based on fuzzy logic. Fuzzy logic is well suited to characterize the uncertainty that occurs in load behavior due to weather changes. Because the neural network has a strong self-learning ability and a complex nonlinear function fitting capability, it is very suitable for power load forecasting.

The back propagation neural network (BPNN) is one of the commonly used neural networks. It has a simple structure and high fitting accuracy. The literature¹⁴ uses wavelet analysis and BPNN to realize the midterm forecast of monthly load and obtains good prediction results. However, the initial values of the network weights and thresholds are random, so training easily falls into a local minimum. Since the power load is affected by various complicated factors, the input space has a large dimension and there is a strong correlation between factors such as temperature and season. Therefore, it is necessary to decouple the input space for dimensionality reduction.

In literature,¹⁵ the principal component analysis (PCA) algorithm is used to extract linearly independent input variables from the original input space, which effectively simplifies the structure of the model. However, this commonly used linear dimensionality reduction method has a shortcoming: it is based on the premise that the subspace embedded in the high-dimensional data space is linear or approximately linear.

The literature¹⁶ proposes a method to modify the daily load forecasting value using weekly load forecasting. The mean absolute percentage error (MAPE) was reduced from 1.78% to 1.59%. The prediction accuracy has been effectively improved, but the weekly load forecasting still accumulates error in MTLF and LTLF, which affects the accuracy of load forecasting. The essence of BPNN lies in the forward flow of the signal and the back propagation of the error: the weights and thresholds of each neuron are adjusted continually until the error satisfies the accuracy condition, and the learning result is memorized by storing the weights. BPNN is therefore suitable for power load forecasting. However, the initial selection of weights and thresholds in BPNN is random, so training easily falls into a local minimum, which increases the prediction error.

Analysis of the existing literature reveals three problems in midterm power load forecasting: the initial selection of weights and thresholds in BPNN is random, the input space of the neural network is too complex, and errors accumulate in rolling prediction. To solve these problems, a forecasting model is proposed that combines kernel principal component analysis (KPCA) with BPNN, which can improve the accuracy of MTLF. First, in midterm power load forecasting, the amount of data is very large, and the computation and storage costs may be prohibitive, so dimension reduction is very important.

To analyze and process large amounts of data efficiently and overcome the problem of excessive dimensionality, the complexity of the forecast model can be reduced by the dimensionality reduction of multidimensional historical data input through KPCA. The dimension-reduced historical data are then input to BPNN, which is optimized by particle swarm optimization (PSO). The monthly average of daily peak loads is forecasted to modify the daily forecast values. The daily peak loads of a month are output in the end. Based on KPCA-PSOBP, this article studies the MTLF. Thanks to the competition organized by EUNITE, the historical data provided by EUNITE are used to demonstrate the feasibility and validity of the model.

The rest of this article is organized as follows. In the BPNN Model section, the model of BPNN is presented. The proposed improvement of forecasting model is formulated in the Improvement of Forecasting Model section. In the Experiment section, a case study on the proposed model is presented. Finally, conclusions are given in the Results and Discussion section.

BPNN Model

The traditional BPNN is a three-layered feedforward network model with input layer, hidden layer, and output layer. It contains the forward propagation of the work signal and the reverse propagation of the error signal.

Suppose the input layer contains $n$ neurons, the hidden layer contains $h$ neurons, and the output layer contains $m$ neurons. If the input is $\boldsymbol{x} \in \boldsymbol{R}^n$, $\boldsymbol{x} = (x_0, x_1, \ldots, x_{n-1})^{\mathrm{T}}$, then the output of the hidden layer is $\boldsymbol{y} \in \boldsymbol{R}^h$, $\boldsymbol{y} = (y_0, y_1, \ldots, y_{h-1})^{\mathrm{T}}$, with
$$y_j = f_1\left(\sum_{i=0}^{n-1} \omega_{ij} x_i - \theta_j\right) \tag{1}$$

$\omega_{ij}$ is the weight between the input layer and the hidden layer, $\theta_j$ is the threshold of the hidden layer, and $f_1$ is the transfer function for the hidden layer, $j = 0, 1, \ldots, h-1$.
The output of the output layer is $\boldsymbol{o} \in \boldsymbol{R}^m$, $\boldsymbol{o} = (o_0, o_1, \ldots, o_{m-1})^{\mathrm{T}}$, with
$$o_l = f_2\left(\sum_{j=0}^{h-1} \omega_{jl} y_j - \theta_l\right) \tag{2}$$

$\omega_{jl}$ is the weight between the hidden layer and the output layer, $\theta_l$ is the threshold of the output layer, and $f_2$ is the transfer function for the output layer, $l = 0, 1, \ldots, m-1$. The above is the forward propagation of the work signal. The error function is then calculated:
$$e = \frac{1}{2} \sum_{l=0}^{m-1} \left(s_l - o_l\right)^2 \tag{3}$$
where $s_l$ is the expected output of the $l$-th output neuron.

The weights and thresholds of the model are adjusted according to the back propagation of the error signal until the expected error or number of iterations is reached.
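The forward propagation in equations (1) and (2) and the error in equation (3) can be sketched in a few lines. This is a minimal illustration in Python with NumPy rather than the MATLAB used in the experiments; the sigmoid hidden transfer function, the linear output transfer function, and all names (`forward`, `squared_error`, `w_ih`, `w_ho`) are assumptions made for the example, since the text does not fix $f_1$ or $f_2$.

```python
import numpy as np

def sigmoid(z):
    """Assumed hidden-layer transfer function f1."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, w_ih, theta_h, w_ho, theta_o):
    """Forward pass of a three-layer network.

    x: input of length n; w_ih: (n, h) weights; theta_h: h thresholds;
    w_ho: (h, m) weights; theta_o: m thresholds.
    """
    y = sigmoid(x @ w_ih - theta_h)  # eq. (1), f1 assumed to be a sigmoid
    o = y @ w_ho - theta_o           # eq. (2), f2 assumed to be linear
    return y, o

def squared_error(s, o):
    """Eq. (3): half the sum of squared differences between target s and output o."""
    return 0.5 * float(np.sum((s - o) ** 2))

# Hypothetical sizes and random parameters, for illustration only.
n, h, m = 4, 3, 1
rng = np.random.default_rng(0)
x = rng.normal(size=n)
w_ih = rng.normal(size=(n, h))
theta_h = rng.normal(size=h)
w_ho = rng.normal(size=(h, m))
theta_o = rng.normal(size=m)

y, o = forward(x, w_ih, theta_h, w_ho, theta_o)
e = squared_error(np.array([1.0]), o)
print(y.shape, o.shape, e)
```

Back propagation would then adjust `w_ih`, `theta_h`, `w_ho`, and `theta_o` along the negative gradient of `e`; that step is omitted here to keep the sketch short.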

Improvement of Forecasting Model

Based on the forecasting model, the accuracy of model prediction has been improved in three aspects: reduction and reconstruction of input data space, optimization of BPNN algorithm, and error correction of output data.

Reduce the dimension of input space based on KPCA

The inputs of the neural network forecasting model should be the factors that have a strong impact on the results and little influence on each other. Therefore, while preserving the effective information, it is necessary to reconstruct the input space rationally and improve the model's learning ability and generalization ability. PCA is one of the most common dimensionality reduction methods. The input space is reduced according to the contribution rate of each principal component.

Let $\boldsymbol{X}_1, \boldsymbol{X}_2, \ldots, \boldsymbol{X}_N$, $\boldsymbol{X}_i \in \boldsymbol{R}^d$, be the data set and $\boldsymbol{Y}_i \in \boldsymbol{R}^k$ the principal components. The steps to solve the principal components of the data set are as follows:

(1) Calculate the covariance matrix of the sample set:
$$\mathrm{Cov}(\boldsymbol{X}) = \left(c_{ij}\right)_{d \times d} \tag{4}$$

(2) Obtain the objective function:
$$\begin{aligned} &\max\; \boldsymbol{a}^{\mathrm{T}}\, \mathrm{Cov}(\boldsymbol{X})\, \boldsymbol{a} \\ &\mathrm{s.t.}\; \left\vert \boldsymbol{a} \right\vert = 1 \end{aligned} \tag{5}$$
Calculate the eigenvalues of $\mathrm{Cov}(\boldsymbol{X})$; $\boldsymbol{a}_{d \times k}$ is made up of the eigenvectors corresponding to the $k$ largest eigenvalues.

(3) Calculate the set after dimension reduction:
$$\boldsymbol{Y}_{k \times N} = \boldsymbol{a}^{\mathrm{T}} \boldsymbol{X}_{d \times N} \tag{6}$$

The contribution rate of the $i$-th principal component is $\lambda_i / \sum_{j=1}^{d} \lambda_j$, and the cumulative contribution rate of the first $k$ principal components is $\sum_{i=1}^{k} \lambda_i / \sum_{j=1}^{d} \lambda_j$, where $\lambda_i$ is the $i$-th largest eigenvalue of the covariance matrix $\mathrm{Cov}(\boldsymbol{X})$.
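The PCA steps and the contribution rates can be illustrated with a short NumPy sketch. It is only an illustration: samples are stored one per row, so the projection is the row-wise transpose of equation (6), and the function name `pca` and the 90% cumulative-contribution threshold are assumptions, not values taken from the model.

```python
import numpy as np

def pca(X, threshold=0.9):
    """Reduce X (N samples x d features) by eigendecomposition of its covariance.

    Returns the projected data (N x k), the contribution rate of every
    component, and the number k of components kept.
    """
    Xc = X - X.mean(axis=0)                  # centre each feature
    cov = np.cov(Xc, rowvar=False)           # eq. (4): d x d covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]        # sort in descending order
    eigvals = np.clip(eigvals[order], 0.0, None)  # remove tiny negative round-off
    eigvecs = eigvecs[:, order]
    rates = eigvals / eigvals.sum()          # contribution rate of each component
    cumulative = np.cumsum(rates)            # cumulative contribution rate
    k = min(int(np.searchsorted(cumulative, threshold)) + 1, X.shape[1])
    a = eigvecs[:, :k]                       # d x k matrix of leading eigenvectors
    return Xc @ a, rates, k                  # row-wise form of eq. (6)

# Hypothetical data: the second feature is an exact multiple of the first.
t = np.arange(10.0)
X_demo = np.column_stack([t, 2.0 * t, np.ones(10)])
Y_demo, rates_demo, k_demo = pca(X_demo)
print(k_demo, np.round(rates_demo, 3))
```

Because the demo features are perfectly correlated, a single component carries all of the variance, which is the situation in which linear PCA works best.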

Because the input space of the load forecasting model does not strictly satisfy the prerequisite of linear feature extraction, the nonlinear PCA method based on kernel function is introduced into the forecasting model.

The method of KPCA based on kernel function is to map the samples in the original input space nonlinearly to the high-dimensional linear space D, and then use the PCA in the feature space.

According to the Mercer theorem, there exists a nonlinear mapping $\phi : \boldsymbol{R}^d \to \boldsymbol{R}^D$ such that $K\left(\boldsymbol{X}_i, \boldsymbol{X}_j\right) = \phi\left(\boldsymbol{X}_i\right)^{\mathrm{T}} \phi\left(\boldsymbol{X}_j\right)$, where $D \gg d$. The original data set is mapped to the linear space of dimension $D$ by this nonlinear mapping. Then, applying PCA in this linear space, the covariance matrix $\boldsymbol{C}$ is obtained.

It is difficult to solve the eigenvalues and eigenvectors of the covariance matrix $\boldsymbol{C} = \frac{1}{N} \sum_{i=1}^{N} \psi\left(\boldsymbol{X}_i\right) \psi\left(\boldsymbol{X}_i\right)^{\mathrm{T}}$ directly. Let $\boldsymbol{\nu}$ be an eigenvector of $\boldsymbol{C}$ and $\lambda$ the corresponding eigenvalue; $\boldsymbol{\nu}$ can be written as
$$\boldsymbol{\nu} = \sum_{i=1}^{N} a_i\, \psi\left(\boldsymbol{X}_i\right) \tag{7}$$

where $\psi\left(\boldsymbol{X}_i\right) = \phi\left(\boldsymbol{X}_i\right) - \frac{1}{N} \sum_{j=1}^{N} \phi\left(\boldsymbol{X}_j\right)$. Since $\boldsymbol{C}\boldsymbol{\nu} = \lambda\boldsymbol{\nu}$, it can be simplified as follows:
$$\lambda N \boldsymbol{a} = \overline{\boldsymbol{K}}\, \boldsymbol{a} \tag{8}$$

where $\overline{K}_{ij} = \psi\left(\boldsymbol{X}_i\right)^{\mathrm{T}} \psi\left(\boldsymbol{X}_j\right)$ is the centered kernel matrix and $\boldsymbol{a} = (a_1, a_2, \ldots, a_N)^{\mathrm{T}}$; then the projection of a sample $\boldsymbol{X}$ onto $\boldsymbol{\nu}$ is:
$$\boldsymbol{\nu}^{\mathrm{T}} \psi\left(\boldsymbol{X}\right) = \sum_{i=1}^{N} a_i\, \psi\left(\boldsymbol{X}_i\right)^{\mathrm{T}} \psi\left(\boldsymbol{X}\right) \tag{9}$$

Finally, determine the input space after KPCA based on the contribution rate of the principal component and the cumulative contribution rate.
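A compact way to see the kernel trick at work is to solve the eigenproblem of equation (8) numerically. The sketch below is an illustration under stated assumptions: the Gaussian (RBF) kernel, its width `gamma`, the 90% cumulative-contribution threshold, and the names `rbf_kernel` and `kpca` are all choices made for the example, because the text does not specify the kernel or its parameters.

```python
import numpy as np

def rbf_kernel(X, gamma):
    """Gaussian kernel matrix K_ij = exp(-gamma * ||X_i - X_j||^2) (assumed kernel)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))

def kpca(X, gamma=1.0, threshold=0.9):
    """Kernel PCA on X (N samples x d features); returns projections, rates, k."""
    N = X.shape[0]
    K = rbf_kernel(X, gamma)
    one = np.full((N, N), 1.0 / N)
    # Centred kernel matrix: inner products of the centred feature vectors.
    K_bar = K - one @ K - K @ one + one @ K @ one
    # Eigenvalues of K_bar equal lambda * N in eq. (8).
    mu, A = np.linalg.eigh(K_bar)
    order = np.argsort(mu)[::-1]
    mu, A = mu[order], A[:, order]
    keep = mu > 1e-10                        # drop numerically zero directions
    mu, A = mu[keep], A[:, keep]
    rates = mu / mu.sum()                    # contribution rates
    k = min(int(np.searchsorted(np.cumsum(rates), threshold)) + 1, len(rates))
    # Rescale coefficients so each feature-space eigenvector has unit length.
    A_k = A[:, :k] / np.sqrt(mu[:k])
    return K_bar @ A_k, rates, k             # N x k projected inputs

# Hypothetical data, for illustration only.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(30, 5))
Y_demo, rates_demo, k_demo = kpca(X_demo, gamma=0.1)
print(Y_demo.shape, k_demo)
```

The feature map itself is never evaluated; only the N x N kernel matrix is needed, which is why the method stays tractable even when the feature space has very high dimension.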

Optimization of BPNN based on PSO

In general, the weights and thresholds of the BPNN are randomly selected. PSO updates the position and velocity of the particles according to their fitness in order to approach the global optimal solution. It can optimize the selection of the initial values, reduce the possibility that the traditional neural network falls into a local optimum, and improve convergence speed and precision.

Suppose that in an $N$-dimensional target search space, there are $M$ particles forming a community, where the $m$-th particle position is
$$\bar{x}_m = \left(x_{m1}, x_{m2}, \cdots, x_{mN}\right), \quad m = 1, 2, \cdots, M \tag{10}$$

Whether $\bar{x}_m$ is the best solution can be assessed from the fitness of the particle; if it is not, the search continues to the next candidate position.

The steps of optimizing the neural network are as follows:

(1) Set the weights and thresholds of the neural network as the initial position of PSO;

(2) set the training error of the neural network as the fitness of PSO;

(3) update the particle velocity and position until the end of the iteration, looking for the individual particle optimal value and the global optimal value; and

(4) finally, take the global optimal value obtained by PSO as the initial weights and thresholds of the neural network algorithm.
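The four steps above can be sketched as follows. This is a minimal illustration, not the configuration used in the experiments: the velocity update rule with inertia weight `w` and acceleration coefficients `c1` and `c2` is the standard PSO form, which the text does not spell out, and the swarm size, iteration count, network sizes, toy data, and all function names are assumptions.

```python
import numpy as np

def unpack(p, n, h, m):
    """Split a flat particle position into network weights and thresholds."""
    i = 0
    w_ih = p[i:i + n * h].reshape(n, h); i += n * h
    th_h = p[i:i + h]; i += h
    w_ho = p[i:i + h * m].reshape(h, m); i += h * m
    th_o = p[i:i + m]
    return w_ih, th_h, w_ho, th_o

def training_error(p, X, S, n, h, m):
    """Step (2): fitness of a particle is the mean squared-error loss of the network."""
    w_ih, th_h, w_ho, th_o = unpack(p, n, h, m)
    z = np.clip(X @ w_ih - th_h, -50.0, 50.0)   # clip to avoid overflow in exp
    Y = 1.0 / (1.0 + np.exp(-z))                # sigmoid hidden layer (assumed)
    O = Y @ w_ho - th_o                         # linear output layer (assumed)
    return 0.5 * float(np.mean(np.sum((S - O) ** 2, axis=1)))

def pso(fitness, dim, n_particles=20, iters=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise `fitness` over a dim-dimensional space with a basic PSO."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(-1.0, 1.0, size=(n_particles, dim))  # step (1): initial positions
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_val = np.array([fitness(p) for p in pos])
    g = int(np.argmin(pbest_val))
    gbest, gbest_val = pbest[g].copy(), pbest_val[g]
    for _ in range(iters):                                  # step (3): iterate
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        vals = np.array([fitness(p) for p in pos])
        improved = vals < pbest_val                         # individual optimum
        pbest[improved] = pos[improved]
        pbest_val[improved] = vals[improved]
        g = int(np.argmin(pbest_val))
        if pbest_val[g] < gbest_val:                        # global optimum
            gbest, gbest_val = pbest[g].copy(), pbest_val[g]
    return gbest, gbest_val

# Hypothetical toy regression problem, for illustration only.
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(30, 2))
S = (np.sin(X[:, 0]) + 0.5 * X[:, 1])[:, None]
n, h, m = 2, 4, 1
dim = n * h + h + h * m + m
best_position, best_error = pso(lambda p: training_error(p, X, S, n, h, m), dim)
# Step (4): use the global best position as the initial weights and thresholds.
w_ih0, th_h0, w_ho0, th_o0 = unpack(best_position, n, h, m)
print(best_error)
```

The returned parameters are only a starting point; gradient-based back propagation would continue training from `w_ih0`, `th_h0`, `w_ho0`, and `th_o0`.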

Modify the daily peak load

According to the principle that recent data matter more than distant data, the known load values of a day have a greater influence on the prediction accuracy of the subsequent period of that day, so load forecasting is usually carried out by rolling prediction, which causes cumulative error and greatly affects the accuracy of the forecasting model. The power load also has a certain periodicity. Therefore, a method that uses the monthly load to revise the daily load is proposed. Monthly load forecasting does not require rolling prediction, so it has no cumulative error and covers all the forecast dates, which makes up for the shortcomings of daily load forecasting.

The steps of revising the daily load forecast value based on the forecasted average daily peak load of a month are as follows: (1)

Forecast W_y based on the history value, where W_y is the average daily peak load of a month that is forecasted and

(2)

obtain \documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\usepackage{upgreek}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document} $$W_y^{ \rm{ \prime }}$$ \end{document} based on the daily peak loads of a month that will be corrected, where \documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\usepackage{upgreek}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document} $$W_y^{ \rm{ \prime }}$$ \end{document} is the average daily peak load of a month that is calculated.

(3)

The correction value can be obtained:

$$\Delta W_y = W_y^{\prime} - W_y \tag{11}$$

(4)

Finally, the final daily peak load:

$$Y = Y^{\prime} + \omega \times \Delta W_y \tag{12}$$
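The four correction steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' MATLAB code: the function name `correct_daily_peaks` is hypothetical, the signs follow Equations (11) and (12) exactly as printed, and the value of ω is not given in this section, so it is left as a caller-supplied parameter.

```python
import numpy as np

def correct_daily_peaks(daily_forecast, monthly_forecast, omega):
    """Revise daily peak-load forecasts with a monthly forecast.

    daily_forecast   : uncorrected daily peak forecasts Y' for one month
    monthly_forecast : forecasted monthly average of daily peaks, W_y
    omega            : correction weight (value not stated in the text)
    """
    daily_forecast = np.asarray(daily_forecast, dtype=float)
    w_calc = daily_forecast.mean()          # W_y': average of the daily forecasts
    delta = w_calc - monthly_forecast       # Eq. (11): Delta W_y = W_y' - W_y
    return daily_forecast + omega * delta   # Eq. (12): Y = Y' + omega * Delta W_y

# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
corrected = correct_daily_peaks([700.0, 720.0, 740.0], monthly_forecast=710.0, omega=0.5)
print(corrected)  # [705. 725. 745.]
```

Every day in the month receives the same shift, ω × ΔW_y, so the correction changes the level of the daily forecasts but not their day-to-day shape.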

Experiment

Experimental configuration

The model is implemented in MATLAB, which provides built-in analysis functions and is well suited to matrix computation and plotting. The PC configuration is an Intel Core i5-3230 processor, 8 GB of RAM, and a 500 GB hard disk.

In 2001, EUNITE organized a load forecasting competition. The given information includes load data for the past 2 years, temperature data for the past 4 years, and holiday events. The task of the competitors is to predict the maximum daily values of electrical load for January 1999. Submissions are evaluated mainly by the mean absolute percentage error (MAPE) and the maximum error (ME).

To analyze and compare the prediction results of KPCA-PSOBP, the BP, PSOBP, and PCA-PSOBP models are also built, and each model is additionally evaluated after its output is revised by the forecasted average daily peak load.

The comparison is divided into two steps:

Step 1: The original sequence is predicted using four methods, respectively, and the prediction result is compared with the actual load data.

Step 2: The prediction error is evaluated by MAPE and ME.
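The two evaluation indicators can be computed as follows. This is a small sketch using the standard definitions of MAPE and ME; the function names are illustrative, and the sample values are the first three rows of Table 4 rather than the full month.

```python
import numpy as np

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    a = np.asarray(actual, dtype=float)
    p = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(a - p) / np.abs(a)) * 100.0)

def max_error(actual, predicted):
    """Maximum absolute error, in the unit of the inputs (MW here)."""
    a = np.asarray(actual, dtype=float)
    p = np.asarray(predicted, dtype=float)
    return float(np.max(np.abs(a - p)))

# January 1-3 from Table 4 (actual and predicted daily peak load, MW).
actual = [751, 703, 677]
predicted = [728.5827, 709.9944, 680.076]
print(round(mape(actual, predicted), 2))       # 1.48
print(round(max_error(actual, predicted), 4))  # 22.4173
```

MAPE summarizes the average accuracy over the month, while ME reports the single worst day, so the two indicators together describe both the level and the spread of the error.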

Description of data

Power load is mainly affected by historical load, temperature, and type of date. If the temperature is considered, the daily temperature of the next stage (i.e., the daily temperature of the next month) must also be predicted. The accuracy of such a temperature prediction is not guaranteed, and the complexity of the prediction model increases. Because including the temperature may not improve the accuracy, and may even harm the stability and accuracy of the model, the temperature is not taken into account in this article. The input and output of the model are shown in Table 1.

Table 1.

The input and output of forecasting model

	Variable	Variable name
Input variables	X₁ ∼ X₂₁	Daily peak loads, type of week, and type of date of the previous week
	X₂₂ ∼ X₂₃	Daily peak load and type of week of the same date last year
	X₂₄ ∼ X₂₅	Type of week and type of date of the forecasted day
Output variables	Y ₁	Daily peak load

The week and date types are quantified: the day of the week is coded from 1 to 7; for the date type, Christmas and New Year's Day are set to 2, the remaining holidays are set to 1, and ordinary days are set to 0. The data of January and February of 1997 and 1998 are used as the training set of the model.
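To make the layout of Table 1 concrete, the sketch below assembles one 25-dimensional input sample. The function name `build_input_vector`, the ordering of values inside the vector, and all numeric values are assumptions made for illustration; the article specifies only which quantities make up X₁ to X₂₅.

```python
import numpy as np

def build_input_vector(prev_week, same_date_last_year, forecast_day):
    """Assemble the 25 inputs listed in Table 1 for one forecast day.

    prev_week           : 7 tuples (peak_load, week_type, date_type) -> X1..X21
    same_date_last_year : (peak_load, week_type)                     -> X22..X23
    forecast_day        : (week_type, date_type)                     -> X24..X25
    """
    if len(prev_week) != 7:
        raise ValueError("prev_week must contain exactly 7 days")
    values = [v for day in prev_week for v in day]   # 7 days x 3 values
    values += list(same_date_last_year)
    values += list(forecast_day)
    x = np.asarray(values, dtype=float)
    if x.shape != (25,):
        raise ValueError("expected 25 input variables")
    return x

# Hypothetical week: loads in MW, week type 1..7, date type 0/1/2.
prev_week = [(700.0 + 5 * d, d + 1, 0) for d in range(7)]
x = build_input_vector(prev_week, same_date_last_year=(690.0, 3), forecast_day=(1, 0))
print(x.shape)  # (25,)
```

In practice the load values would normally be scaled before being passed to the network, a step that this sketch leaves out.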

First, KPCA is used to reconstruct the input space. The kernel function chosen by this prediction example is the Gaussian kernel function. The contribution rate of the principal component and the cumulative contribution rate obtained by KPCA are shown in Table 2.

Table 2.

Contribution rates of principal components and accumulated contribution rate

Number	Eigenvalue	Contribution rate, %	Cumulative contribution rate, %
1	0.2015	26.0186	26.0186
2	0.1972	25.4710	51.4896
3	0.1490	19.2397	70.7293
4	0.0472	6.0927	76.8220
5	0.0335	4.3287	81.1507
6	0.0261	3.3720	84.5227
7	0.0202	2.6066	87.1294
8	0.0187	2.4159	89.5452
9	0.0166	2.1425	91.6877
10	0.0150	1.9339	93.6216
11	0.0110	1.4254	95.0470
…	…	…	…
25	0	0	100

The greater the contribution rate of a principal component, the more of the combined information of the original variables it carries. A cumulative contribution rate above 85% is usually considered appropriate, because it limits the loss of information while reducing the number of variables and simplifying the problem. Table 2 shows that the cumulative contribution rate reaches 95% with the first 11 principal components, so these 11 components replace the original 25-dimensional input space. The network model structure is thus simplified while the effective information is retained.
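The selection of the number of principal components can be illustrated with a short NumPy sketch. It assumes the common Gaussian kernel form exp(−‖x − y‖² / (2σ²)) with a hypothetical width `sigma`, since the kernel parameter is not reported here, and it is not the authors' MATLAB implementation. In this sketch the number of eigenvalues equals the number of samples.

```python
import numpy as np

def gaussian_kernel_matrix(X, sigma):
    """Pairwise Gaussian kernel values for the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def kpca_contribution_rates(X, sigma=1.0):
    """Contribution and cumulative contribution rates (%) of the kernel PCs."""
    n = X.shape[0]
    K = gaussian_kernel_matrix(X, sigma)
    one = np.full((n, n), 1.0 / n)
    Kc = K - one @ K - K @ one + one @ K @ one    # centre the data in feature space
    eigvals = np.linalg.eigvalsh(Kc)[::-1]        # eigenvalues, largest first
    eigvals = np.clip(eigvals, 0.0, None)         # remove tiny negative round-off
    rates = 100.0 * eigvals / eigvals.sum()
    return rates, np.cumsum(rates)

def n_components_for(cumulative, threshold):
    """Smallest number of components whose cumulative rate reaches threshold."""
    idx = int(np.searchsorted(cumulative, threshold, side="left")) + 1
    return min(idx, len(cumulative))

# Cumulative contribution rates of the first 11 components from Table 2.
table2 = np.array([26.0186, 51.4896, 70.7293, 76.8220, 81.1507, 84.5227,
                   87.1294, 89.5452, 91.6877, 93.6216, 95.0470])
print(n_components_for(table2, 85.0))  # 7
print(n_components_for(table2, 95.0))  # 11
```

With the values of Table 2, the 85% threshold would already be met by 7 components, while the 95% level used in this article requires 11.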

The reconstructed input space is used as the input of the PSO-BP model. The initial parameters of the model are shown in Table 3.

Table 3.

Parameters of particle swarm optimization-back propagation model

	Parameter name	Set-point
PSO	Initial weight, ω	0.9
	Learning rate c₁, c₂	2
	Number of particles, M	50
	Number of iterations, T	1000
BP	Input layer neurons	11
	Hidden layer neurons	5
	Output layer neurons	1
	Expected error, ɛ	0.01
	Number of iterations, t	1000

BP, back propagation; PSO, particle swarm optimization.

The model is trained until the iteration ends or the expected error is reached. Then, the prediction sample is input into the trained prediction model, and the load forecast value is output.
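The training procedure can be sketched as follows: PSO searches for network parameters, and the global best particle is then used as the starting point for BP. The layer sizes, ω = 0.9, c₁ = c₂ = 2, 50 particles, 1000 iterations, and the expected error of 0.01 come from Table 3. Everything else is an assumption made for this sketch: the tanh hidden layer, the MSE fitness, the velocity clamp `v_max`, the BP learning rate `lr`, the constant inertia weight, and all function names.

```python
import numpy as np

N_IN, N_HID, N_OUT = 11, 5, 1                       # layer sizes from Table 3
DIM = N_IN * N_HID + N_HID + N_HID * N_OUT + N_OUT  # 66 weights and thresholds

def unpack(theta):
    """Split a flat parameter vector into weights and thresholds."""
    a = N_IN * N_HID
    b = a + N_HID
    c = b + N_HID * N_OUT
    return (theta[:a].reshape(N_IN, N_HID), theta[a:b],
            theta[b:c].reshape(N_HID, N_OUT), theta[c:])

def forward(theta, X):
    """Hidden activations and output (tanh hidden layer is an assumption)."""
    W1, b1, W2, b2 = unpack(theta)
    H = np.tanh(X @ W1 + b1)
    return H, (H @ W2 + b2).ravel()

def mse(theta, X, y):
    return float(np.mean((forward(theta, X)[1] - y) ** 2))

def pso_search(X, y, n_particles=50, n_iter=1000, w=0.9, c1=2.0, c2=2.0,
               v_max=0.5, seed=0):
    """Search for network parameters with PSO; fitness is the training MSE."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(-1.0, 1.0, (n_particles, DIM))
    vel = np.zeros_like(pos)
    pbest, pbest_fit = pos.copy(), np.array([mse(p, X, y) for p in pos])
    g = int(np.argmin(pbest_fit))
    gbest, gbest_fit = pbest[g].copy(), float(pbest_fit[g])
    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        vel = np.clip(vel, -v_max, v_max)            # velocity clamp (assumption)
        pos = pos + vel
        fit = np.array([mse(p, X, y) for p in pos])
        better = fit < pbest_fit                     # update individual bests
        pbest[better], pbest_fit[better] = pos[better], fit[better]
        g = int(np.argmin(pbest_fit))
        if pbest_fit[g] < gbest_fit:                 # update global best
            gbest, gbest_fit = pbest[g].copy(), float(pbest_fit[g])
    return gbest, gbest_fit

def bp_train(theta, X, y, lr=0.01, n_iter=1000, target=0.01):
    """Gradient-descent BP from theta; returns the best parameters seen."""
    theta = theta.copy()
    best, best_fit = theta.copy(), mse(theta, X, y)
    for _ in range(n_iter):
        if best_fit <= target:                       # expected error reached
            break
        W1, b1, W2, b2 = unpack(theta)
        H, pred = forward(theta, X)
        d_out = (2.0 / len(y)) * (pred - y)[:, None]  # dMSE/d(output)
        d_hid = (d_out @ W2.T) * (1.0 - H ** 2)       # back through tanh
        grad = np.concatenate([(X.T @ d_hid).ravel(), d_hid.sum(axis=0),
                               (H.T @ d_out).ravel(), d_out.sum(axis=0)])
        theta -= lr * grad
        fit = mse(theta, X, y)
        if fit < best_fit:
            best, best_fit = theta.copy(), fit
    return best, best_fit
```

A typical call would be `theta0, _ = pso_search(X_train, y_train)` followed by `theta, err = bp_train(theta0, X_train, y_train)`, where `X_train` holds the 11 KPCA components per sample and `y_train` the scaled daily peak loads. Starting BP from the PSO result, rather than from random weights, is the point of the combination: the swarm explores the parameter space globally and gradient descent then refines the best candidate locally.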

Finally, the daily peak loads of the forecasted month are corrected based on the forecasted average daily peak load of the month. Figure 1 shows that the monthly load has strong seasonal and annual periodicity, so the monthly forecast model takes the monthly load values of the past 3 months as input and outputs the average daily peak load of the forecasted month. The predicted value is 756.47 MW (the actual value is 749.26 MW). Equations (11) and (12) then give the final result.

FIG. 1.

The average daily peak load in each month from the year 1997 to 1998.

Results and Discussion

The results of the daily peak load of KPCA-PSOBP model are shown in Table 4. The results of the four different models are shown in Figure 2, and the evaluation indicators are shown in Table 5.

FIG. 2.

Results of daily peak load forecasting in different models.

Table 4.

Results of the forecasting model

Date	Actual, MW	Prediction, MW	Relative error, %
January 1	751	728.5827	2.985
January 2	703	709.9944	0.9949
January 3	677	680.076	0.4544
January 4	718	712.5554	0.7583
January 5	738	733.49	0.6111
January 6	709	712.9401	0.5557
January 7	745	737.2402	1.0416
January 8	749	724.5447	3.2651
January 9	734	718.9402	2.0517
January 10	679	681.4596	0.3622
January 11	748	752.8791	0.6523
January 12	739	754.7849	2.136
January 13	756	757.0542	0.1394
January 14	763	752.3399	1.3971
…	…	…	…
January 25	789	783.5077	0.6961
January 26	798	791.121	0.862
January 27	791	798.4818	0.9459
January 28	776	802.7446	3.4465
January 29	792	787.2691	0.5973
January 30	763	756.0328	0.9131
January 31	743	718.8328	3.2527
MAPE = 1.39%
ME = 33.96 MW

MAPE, mean absolute percentage error; ME, maximum error.

Table 5.

The comparison of four different forecasting models

	MAPE (%), ME (MW) of four different forecasting models
	BP	PSOBP	PCA-PSOBP	Proposed
Results	MAPE = 2.13	MAPE = 1.94	MAPE = 1.75	MAPE = 1.47
Results	ME = 48.17	ME = 45.03	ME = 38.12	ME = 36.98
Output modified	MAPE = 1.94	MAPE = 1.90	MAPE = 1.51	MAPE = 1.39
Output modified	ME = 41.17	ME = 40.78	ME = 37.43	ME = 33.96

The results in Table 4 and Figure 2 show that the MTLF predictions track the actual load, which supports the feasibility of the model. Compared with BP, PSOBP, and PCA-PSOBP, the KPCA-PSOBP model has smaller MAPE and ME and a shorter running time, reflecting the effectiveness of the model. PSO improves the performance of BP. Compared with models without input dimension reduction, reconstructing the input space at a lower dimension removes redundancy and simplifies the prediction model. Compared with linear dimensionality reduction (PCA), KPCA overcomes the limitations of the linear method and yields more accurate predictions. In addition, Table 5 shows that revising the daily load with the monthly load effectively compensates for the cumulative error of midterm daily load prediction: the MAPE and ME of every model decrease after the output is modified.

Since the launch of the load forecasting competition, many scholars have built different models on the load data provided by the organizers, such as SVM,¹⁷ self-organizing fuzzy neural network,¹⁸ empirical mode decomposition-PSO-SVM,¹⁹ and fuzzy support vector machines.²⁰ The comparison of prediction accuracy is shown in Table 6. The proposed method has the lowest MAPE and the lowest ME among the compared models, indicating high accuracy and small volatility.

Table 6.

Accuracy comparison of different models

	Models	MAPE, %	ME, MW
1.	Proposed	1.39	33.96
2.	SVM	1.95	50–60
3.	SOFNN	1.59	41.95
4.	EMD and PSO-SVM	1.78	45.71
5.	FSVM	2.11	42.48

EMD, empirical mode decomposition; FSVM, fuzzy support vector machines; SOFNN, self-organizing fuzzy neural network; SVM, support vector machine.

Conclusion

The MTLF model is proposed based on the combination of KPCA and improved neural network, and it can improve the accuracy of model prediction in three aspects: reconstruction of input data space, optimization of prediction algorithm, and error correction of output data.

(1)

The KPCA method based on kernel function is introduced into MTLF. It can realize the dimensionality reconstruction of the input space, make up for the insufficiency of the linear dimension reduction method, remove the redundancy, and simplify the model structure.

(2)

The particle swarm algorithm is used to optimize the neural network algorithm, and the performance of the algorithm has been improved.

(3)

A correction method is proposed in which the forecasted monthly average of daily peak loads is used to modify the daily forecast values; it effectively reduces the cumulative error of MTLF and improves its accuracy.

(4)

The prediction results show that the KPCA-PSOBP model can accurately predict the power load and is an effective method for MTLF.

Acknowledgments

The research presented in this article was supported by the National Natural Science Foundation of China, and, in part, by the Jiangsu Province Science and Technology Support Plan project. The authors acknowledge the National Natural Science Foundation of China (Grant: 51507086), the Jiangsu Province Natural Science Fund (Grant: BK20150839), and the Jiangsu Province Natural Science Fund (Grant: BK20170841).

Authors' Contributions

Z.L. is the main writer of this article. He proposed the main idea, completed the model, and analyzed the results. X.S. introduced the kernel principal component analysis method in MTLF. S.W. used the particle swarm algorithm to optimize the neural network algorithm, M.P. forecasted the monthly average of daily peak loads to modify the daily forecast values, Y.Z. used the data provided by European Network on Intelligent Technologies to test the model, and Z.J. gave some important suggestions for KPCA-PSOBP. All authors read and approved the final article.

Author Disclosure Statement

No competing financial interests exist.

References

1. Khuntia, Rueda, van der Meijden MAMM. Forecasting the load of electrical power systems in mid- and long-term horizons: A review. IET Gener Transm Dis. 2016; 10:3971–3977.

2. Willis, Northcote-Green JED. Spatial electric load forecasting: A tutorial review. Proc IEEE. 1983; 71:232–253.

3. Christoforidis, Aganagic, Awobamise, et al. Long-term/mid-term resource optimization of a hydrodominant power system using interior point method. IEEE Trans Power Syst. 1996; 11:287–294.

4. Xia, Wang, McMenemy. Short, medium and long term load forecasting model and virtual load forecaster based on radial basis function neural networks. Int J Elec Power Energy Syst. 2010; 32:743–750.

5. Warrington, Mariéthoz, Morari. Time-sequence reserve products for electricity markets. In: 10th International Conference on the European Energy Market. Stockholm, Sweden: IEEE Computer Society, 2013, pp. 1–7.

6. Jahandari, Kalhor, Nadjar Araabi. Online forecasting of synchronous time series based on evolving linear models. IEEE Trans Syst Man Cybernet Syst. 2018; Early Access:1–12.

7. Al-Hamadi HM. Long-term electric power load forecasting using fuzzy linear regression technique. IEEE Power Eng Autom Conf, 2011; 3:96–99.

8. …, Bai, Fan. Study of the medium-long load forecasting based on the identical dimension addition grey model. In: Second International Conference on Mechanic Automation and Control Engineering. Inner Mongolia, China: IEEE Computer Society, 2011, pp. 700–703.

9. Niu, Sun, Wei. Medium-term power load forecasting based on grey model in Ningxia Autonomous Region. In: International Conference on E-Business and E-Government. Shanghai, China: IEEE Computer Society, 2011, pp. 1–3.

10. Yan, Chowdhury. Mid-term electricity market clearing price forecasting using multiple least squares support vector machines. IET Gener Transm Dis. 2014; 8:1572–1582.

11. Niu, Li. Middle-long Electric Power Load Forecasting Based on Cointegration and Support Vector Machine. In: Third International Conference on Natural Computation. Hainan, China: Inst. of Elec. and Elec. Eng. Computer Society, 2007, pp. 596–600.

12. Feilat, Bouzguenda. Medium-term load forecasting using neural network approach. In: IEEE PES Conference on Innovative Smart Grid Technologies—Middle East. Jeddah, Saudi Arabia: IEEE Computer Society, 2011, pp. 1–5.

13. Davlea, Teodorescu. A neuro-fuzzy algorithm for middle-term load forecasting. In: International Conference and Exposition on Electrical and Power Engineering. Iasi, Romania: Institute of Electrical and Electronics Engineers Inc., 2016, pp. 5–9.

14. …, Lu. Forecasting monthly runoff using wavelet neural network model. In: International Conference on Mechatronic Science, Electric Engineering and Computer. Jilin, China: IEEE Computer Society, 2011, pp. 2177–2180.

15. Liu, Shang. Power system load forecasting by improved principal component analysis and neural network. In: IEEE International Conference on High Voltage Engineering and Application. Chengdu, China: Institute of Electrical and Electronics Engineers Inc., 2016, pp. 1–4.

16. Chen, Chang, Lin. Load forecasting using support vector machines: A study on EUNITE competition 2001. IEEE Trans Power Syst. 2004; 19:1821–1830.

17. Niu, Li, Cheng, Gu. Mid-term load forecasting based on dynamic least squares SVMS. In: 2008 International Conference on Machine Learning and Cybernetics. Kunming, China: IEEE Computer Society, 2008, pp. 800–804.

18. Mao, Zeng, Leng, et al. Short-term and midterm load forecasting using a bilevel optimization model. IEEE Trans Power Syst. 2009; 24:1080–1090.

19. Duo, Qi, Lina, Xu. A short-term traffic flow prediction model based on EMD and GP SO-SVM. In: 2017 IEEE 2nd Advanced Information Technology, Electronic and Automation Control Conference. Chongqing, China: Institute of Electrical and Electronics Engineers Inc., 2017, pp. 2554–2558.

20. Jiang, Liu, Gao. Hybrid load forecasting method based on fuzzy support vector machine and linear extrapolation. In: Proceedings of the 10th World Congress on Intelligent Control and Automation. Beijing, China: Institute of Electrical and Electronics Engineers Inc., 2012, pp. 2431–2435.