Abstract
Smart metering is a relatively new topic that has grown in importance all over the world, and it appears to be a remedy for rising electricity prices. Forecasting electricity usage is an important task for providing intelligence to the smart grid. Accurate forecasting enables a utility provider to plan resources and to take control actions to balance electricity supply and demand. Customers benefit from metering solutions through a greater understanding of their own energy consumption and future projections, allowing them to better manage the costs of their usage. In this proof-of-concept paper, our contribution is twofold: (1) we deal with short-term electricity load forecasting for 24 hours ahead, not on the aggregate but on the individual household level, which fits into the stream of Residential Power Load Forecasting (RPLF) methods; (2) we utilize a set of household behavioral data, which significantly improved forecast accuracy.
Introduction
Smart metering systems are expected to play an important role in reducing overall energy consumption and increasing users' energy awareness. One of the most important aims of smart metering is to encourage users to use less electricity by keeping them better informed about their consumption patterns. Leveraging smart metering to support energy efficiency at the individual user level poses novel research challenges in monitoring usage and providing accurate load forecasts.
This paper is a continuation of a study published in [10]. In comparison to our previous work, we extended it by including behavioral usage data for forecasting and by introducing additional modelling techniques. We believe our research contributes to the effort to generate added value for individual customers within the stream of Residential Power Load Forecasting (RPLF) methods.
Forecasting usage gives customers the means to link current usage behavior with future costs. Customers may therefore benefit from forecasting solutions through a greater understanding of their own energy consumption and future projections, allowing them to better manage the costs of their usage. With smart meter technology it would be possible to benefit from demand flexibility and better choices of tariff plans. Making energy consumption and future projections more visible would make it easier for customers to understand how much they are actually using and how it would affect their budget in the future. Of course, technology alone will not be enough to change the way people consume energy, but it provides the means to use energy in a deliberate and conscious way.
In this paper, we study an approach to forecasting the hourly electricity loads of an individual consumer for 24 hours ahead, taking into account electricity consumption and household behavioral data. It should be noted, however, that forecasting the loads of an individual smart meter is not common practice, since the volatility of the system is high and results in high error rates.
The paper begins with the state of the art. The characteristics of the smart meter data, as provided by a self-designed metering infrastructure in one experimental household, are described in the third section. The empirical analysis and a comparison of the algorithms' outcomes are presented in the fourth section. Conclusions are given in the last section.
State of the art
Recently, with advances in communication infrastructure for remote and automated data reading, there has been increasing interest in RPLF. Load forecasting at the individual household level is a challenging task because of the extreme system volatility that results from a dynamic process composed of many individual components. Typical home loads are between 1 and 3 kWh and can be influenced by a number of factors, such as devices’ operational characteristics, users’ behaviors, economic factors, time of the day, day of the week, holidays, weather conditions, geographic patterns and random effects. With the appearance of novel technologies, demand response programs, and changes in lifestyle and energy consumption patterns, it becomes necessary to use alternative modelling techniques to capture the factors responsible for accurate short-term forecasting in smart metering applications [9, 28].
Different methods have been developed for forecasting the electric load demand in the last decades. Some of the most popular include time series analyses with autoregressive integrated moving average (ARIMA) method [7], fuzzy logic [22], neuro-fuzzy method [23], artificial neural network (ANN) [4, 24] and support vector machines (SVM) [16].
The basic quantity of interest in load forecasting is typically the hourly total electric load. However, load forecasting is also concerned with the prediction of hourly, daily, weekly and monthly values of the system load and the peak loads. When classifying load forecasting by the duration of the time horizon, we can therefore distinguish short-term load forecasting (STLF) for up to 1 day, medium-term load forecasting (MTLF) for 1 day to 1 year, and long-term load forecasting (LTLF) for between 1 and 10 years. For larger loads, such as a regional or national grid, forecasting is achieved with relatively high accuracy [9, 26]. For smaller populations, such as an individual household or a building, the load dynamics change so drastically that standard STLF tools require certain re-adjustments [2, 25]. Therefore, to forecast such a micro system we need to look at the STLF modelling tools and data preprocessing transformations in order to deliver a scalable solution with high forecasting accuracy.
For instance, Sevlian and Rajagopal [19] proposed a simple empirical scaling law that describes load forecasting accuracy at different levels of aggregation. The model was justified by a simple decomposition of individual consumption patterns. The results showed that, for different forecasting methods and time horizons, aggregating more customers improves the relative forecasting performance. The consumption pattern of a single customer generally has little structure to be exploited. Aggregating more customers smooths the signal so that it becomes more predictable.
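The smoothing effect of aggregation can be illustrated with a small simulation. The sketch below is not the scaling law of [19]; it only assumes hypothetical, independent households with a constant base load and occasional appliance spikes, and compares the coefficient of variation of one household with that of the sum of 100 households.

```python
import random
import statistics

def coefficient_of_variation(series):
    """Standard deviation divided by the mean: a scale-free volatility measure."""
    return statistics.pstdev(series) / statistics.fmean(series)

def simulate_household(rng, hours):
    """Hypothetical household: 0.2 kWh base load plus occasional appliance spikes."""
    return [0.2 + (rng.uniform(0.5, 2.0) if rng.random() < 0.15 else 0.0)
            for _ in range(hours)]

rng = random.Random(0)            # fixed seed so the run is repeatable
hours = 24 * 30                   # thirty days of hourly loads
households = [simulate_household(rng, hours) for _ in range(100)]

single = households[0]
aggregate = [sum(h[t] for h in households) for t in range(hours)]

cv_single = coefficient_of_variation(single)
cv_aggregate = coefficient_of_variation(aggregate)
print(f"single household CV: {cv_single:.2f}, aggregate CV: {cv_aggregate:.2f}")
```

Under these assumptions the aggregate series is markedly less volatile relative to its mean, which is the intuition behind the better predictability of aggregated loads.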
Aung et al. [3] proposed an SVM system to forecast the daily peak loads of individual smart meters. The method achieves an average accuracy of about 98% and can be applied when a longer forecasting horizon is considered.
Ghofrani et al. [11] showed that residential load can be represented as the sum of a deterministic component and a random Gaussian perturbation. They used Kalman filtering to predict the residential load for different time periods and forecasting horizons. The main conclusion was that a higher sampling rate of the measurement data substantially improves the accuracy of the forecast but also increases the computational cost. Thus, achieving the desired prediction accuracy while limiting the volume of data used requires careful selection of the sampling rate.
Several modelling techniques are typically used for energy load forecasting. These techniques can be classified into nine categories [1]: (1) multiple regression, (2) exponential smoothing, (3) iterative reweighted least-squares, (4) adaptive load forecasting, (5) stochastic time series, (6) ARMAX models based on genetic algorithms, (7) fuzzy logic, (8) artificial neural networks and (9) expert systems.
Based on the literature, we can conclude that time series analysis techniques neither scale to higher dimensions nor are effective on highly volatile data [5]. For this reason, time series methods such as regression models, ARIMA models, GARCH and hybrid models such as a combination of ARIMA and GARCH using the wavelet transform are not considered for short-term forecasting [14, 27] (see Table 1).
Therefore, for the forecasting experiments we used the following techniques: artificial neural networks, support vector machines and regression trees.
Techniques such as artificial neural networks (ANN), through their hidden layers and ability to learn, seem to be more capable of solving the forecasting problem. They are able to identify hidden trends in time series and use them to produce an accurate forecast. Several features of artificial neural networks make them very popular and attractive for practical applications. Firstly, they are able to generalize even if the data are incomplete or noisy. Secondly, neural networks are a non-parametric method, which means that they do not require any a priori assumptions about the distribution of the data. Thirdly, they are good approximators, capable of modelling any continuous function to any desired accuracy. Multi-layer perceptrons (MLP) and radial basis function (RBF) networks are the two most commonly used types of feed-forward neural networks. The main difference between the two types is the way in which hidden units aggregate the values at their inputs. MLP networks mainly use sigmoid functions, while RBF networks use radial basis functions as the activation functions.
The main problem in applying neural networks is to find the correct values of the weights between the input and output layers using a learning paradigm called supervised learning (training). To train the network we use data for which the correct output is known. Starting with random weights, an input pattern is presented to the network to make an initial forecast. During the training process, the difference between the forecast made by the network and the correct output value is calculated, and the weights are changed in order to minimize the error. As a result, we want the algorithm to find those properties of the input data that are most relevant for modelling the target function.
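The training loop described above can be sketched for the simplest possible case: a single logistic neuron updated by plain gradient descent. This is only an illustration of the principle; the toy inputs, targets, learning rate and function names are assumptions made here, and the experiments reported later rely on a different optimizer.

```python
import math
import random

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, xs):
    return logistic(sum(w * x for w, x in zip(weights, xs)) + bias)

def sum_of_squares(weights, bias, inputs, targets):
    return sum((t - predict(weights, bias, xs)) ** 2
               for xs, t in zip(inputs, targets))

def train(inputs, targets, epochs=200, learning_rate=0.5, seed=0):
    rng = random.Random(seed)
    weights = [rng.uniform(-0.5, 0.5) for _ in inputs[0]]  # random starting weights
    bias = 0.0
    errors = [sum_of_squares(weights, bias, inputs, targets)]
    for _ in range(epochs):
        for xs, t in zip(inputs, targets):
            out = predict(weights, bias, xs)
            # Derivative of 0.5 * (out - t)^2 with respect to the neuron's net input.
            delta = (out - t) * out * (1.0 - out)
            weights = [w - learning_rate * delta * x for w, x in zip(weights, xs)]
            bias -= learning_rate * delta
        errors.append(sum_of_squares(weights, bias, inputs, targets))
    return weights, bias, errors

# Toy, made-up data: two scaled inputs and a scaled target load per pattern.
inputs = [[0.1, 0.2], [0.4, 0.3], [0.8, 0.9], [0.9, 0.7]]
targets = [0.2, 0.3, 0.8, 0.75]
weights, bias, errors = train(inputs, targets)
print(f"error before training: {errors[0]:.4f}, after: {errors[-1]:.4f}")
```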
The other method used in our experiments was support vector machines (SVM). The technique is characterized by the use of kernels, the absence of local minima, the sparseness of the solution and capacity control obtained by acting on the margin or on the number of support vectors. The capacity of the system is controlled by parameters that do not depend on the dimensionality of the feature space. The non-linear function is learned by a linear learning machine that maps inputs into a high-dimensional kernel-induced feature space. SVM is motivated by finding and optimizing the generalization bounds given for regression [20]. These bounds rely on defining a so-called epsilon-insensitive loss function, which ignores errors that lie within a certain distance of the true value.
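As a minimal sketch, the epsilon-insensitive loss can be written as a one-line function. The function name is hypothetical, and the default value of 0.01 simply mirrors the ɛ setting reported in the experimental section.

```python
def epsilon_insensitive_loss(observed, forecast, epsilon=0.01):
    """Zero inside the epsilon tube around the true value, linear outside it."""
    return max(0.0, abs(observed - forecast) - epsilon)

# A forecast within 0.01 kWh of the observed load costs nothing;
# a larger error is penalized only by the part exceeding epsilon.
inside = epsilon_insensitive_loss(1.0, 1.005)
outside = epsilon_insensitive_loss(1.0, 1.2)
```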
For the regression tree we used the CART (Classification and Regression Trees) algorithm [6, 18] to model the continuous response variable. The predicted value at a node is the sample mean or sample median of the response values of the learning-sample members corresponding to that node. In the tree-growing process, the split selected at each stage is the one that gives the greatest reduction in the sum of squared differences between the response values of the learning-sample cases in a node and their sample mean, or the greatest reduction in the sum of absolute differences between those response values and their sample median.
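The least-squares variant of this splitting rule can be sketched for a single explanatory variable as follows. The function names and the toy values are assumptions for illustration; a full CART implementation repeats this search over all variables and recursively over the resulting nodes.

```python
def sum_squared_deviation(values):
    """Sum of squared differences between the values and their mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(feature, response):
    """Threshold on one feature that minimizes the total within-node squared error."""
    pairs = sorted(zip(feature, response))
    best_threshold, best_cost = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold separates identical feature values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [r for _, r in pairs[:i]]
        right = [r for _, r in pairs[i:]]
        cost = sum_squared_deviation(left) + sum_squared_deviation(right)
        if cost < best_cost:
            best_threshold, best_cost = threshold, cost
    return best_threshold, best_cost

# Made-up example: previous-hour load (feature) against current-hour load (response).
previous_load = [0.1, 0.2, 0.3, 1.0, 1.1, 1.2]
current_load = [0.2, 0.25, 0.2, 1.0, 1.1, 0.9]
threshold, cost = best_split(previous_load, current_load)
```

In this toy sample the chosen threshold falls between the low-load and high-load cases, since separating them leaves the smallest squared error around the two node means.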
Smart metering data
Electricity measurement data were collected using a Mieo HA104 meter installed in one of the households in Warsaw, Poland, for the purpose of the SMEPI project (SMEPI: Smart Metering Poland, a Hi-Tech project to develop smart metering solutions, partially financed by the National Centre for Research and Development (NCBiR) and led by Vedia S.A. in cooperation with GridPocket and the Faculty of Applied Mathematics and Informatics at Warsaw University of Life Sciences). The household consisted of two adults and a child. It occupied a flat equipped with various home appliances, including a washing machine, refrigerator, dishwasher, iron, electric oven, two TV sets, audio set, pot, coffee maker, desk lamps, computer and a couple of light bulbs.
The data were gathered over 60 days, from 29 August until 27 October 2012. For the analysis, however, we extracted the 44 days for which we had also gathered user behavioral information, such as the operational characteristics of devices in the household. These data were produced by a reference system constructed to collect binary data about the on-off states of the devices. The reference data were collected individually for the washing machine (WM), dishwasher (DW), tumble dryer (TD), kettle (KE) and microwave oven (MO); see Fig. 1 for details. The original dataset contained the smart meter's electricity usage readings at every second, every minute and every hour. From these readings, we extracted the hourly loads (in kWh) for the purpose of short-term load forecasting. Data characteristics for the analyzed period are illustrated in Fig. 2.
To analyze the volatility of our data we prepared a box-and-whisker plot (see Fig. 3) for each of the 24 hours, using load data over all 44 days. The whiskers show the minimum and maximum values in a given hour, and the box encloses 50% of the data (the top edge represents the 75th percentile, the bottom edge the 25th percentile, and the line in the middle is the median). The results show that the volatility is rather high (especially during daytime hours), which can affect forecasting.
In our research, we focused on forecasting the electricity usage of a particular household for 24 hours ahead. In order to forecast the load we constructed a feature vector with the attributes presented in Table 2. These 43 attributes were derived empirically. The individual, average, minimum, maximum and range load information was obtained from the hourly load time series. Each day was divided into five periods, namely morning, noon, afternoon, evening and night. Moreover, a holiday indicator was prepared to mark public holidays in Poland.
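To make the construction of such attributes concrete, the sketch below derives the average, minimum, maximum and range of the load over a short window of preceding hours, together with a part-of-day label. It is only an illustration: the three-hour window, the hour boundaries of the five periods and all names are assumptions made here, since the exact definitions behind Table 2 are not reproduced in the text.

```python
from statistics import fmean

def part_of_day(hour):
    """Hypothetical hour boundaries for the five periods of the day."""
    if 6 <= hour < 11:
        return "morning"
    if 11 <= hour < 14:
        return "noon"
    if 14 <= hour < 18:
        return "afternoon"
    if 18 <= hour < 23:
        return "evening"
    return "night"

def lag_features(loads, t, window=3):
    """Summary statistics of the `window` hourly loads preceding index t."""
    if t < window:
        raise ValueError("not enough history before index t")
    recent = loads[t - window:t]
    return {
        "avg_load": fmean(recent),
        "min_load": min(recent),
        "max_load": max(recent),
        "range_load": max(recent) - min(recent),
    }

hourly_loads = [0.2, 0.5, 0.3, 0.9]        # made-up hourly loads in kWh
features = lag_features(hourly_loads, 3)    # attributes for forecasting hour index 3
features["part_of_day"] = part_of_day(3)
```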
In addition to the attributes that describe previous electricity usage, a set of behavioral features describing the household's habits was prepared. This required some data transformation aimed at creating an analytical table (see Table 3) with decoded input variables for each hour, including the appliances’ operational characteristics in the household.
The table presents the electricity readings for each day and each hour, with input variables such as the day of the week, the holiday indicator, the part of the day (morning, noon, afternoon, evening, night) and the number of ON-OFF states for each appliance. Based on the analytical table, an additional 30 attributes describing appliance usage patterns were prepared, as presented in Table 4.
Forecasting experiments
Limitations of the study
In this study we are aware of some limitations due to the nature of the problem and its complexity. First of all, we did not apply time series analysis techniques to our data, since we observed high data volatility. Instead, we used neural networks, support vector machines and regression trees, which seem to be more capable of solving this kind of forecasting problem.
Secondly, although we possess potentially useful behavioral variables, including devices’ operational characteristics in the household, in practical applications such data may be accessible only if the end user undertakes the effort to help the system gather the reference data about the operating devices. This is a manual process aimed at identifying which appliances are being switched on and off by the household members.
At present, we possess data from only one smart meter. We therefore treat this experiment as a proof of concept, and the main research question is whether the proposed short-term load forecasting models can work efficiently for forecasting electricity usage in individual households.
Accuracy measures
To assess the model performance for forecasting, we used three measures: precision, resistant MAPE error and accuracy [14].
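Precision is the measure of how close the model's forecast is to the actual load. To measure precision we used the mean squared error (MSE), given by:

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(W_{hi} - P_{hi}\right)^{2}$$

where $W_{hi}$ is the observed load in hour i, $P_{hi}$ is the forecasted load in hour i and n is the number of forecasted hours.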
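For illustration, the mean squared error between observed and forecasted hourly loads can be computed as follows. The function name and the sample values are assumptions made here, not part of the study.

```python
def mean_squared_error(observed, forecast):
    """Average squared difference between observed and forecasted loads."""
    if len(observed) != len(forecast) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((w - p) ** 2 for w, p in zip(observed, forecast)) / len(observed)

# Made-up loads in kWh for two hours.
mse = mean_squared_error([1.0, 2.0], [1.0, 1.0])
```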
According to the National Research Council (1980), any summary measure of error must meet five basic criteria: measurement validity, reliability, ease of interpretation, clarity of presentation, and support of statistical evaluation. In an attempt to meet these criteria, the mean absolute percentage error (MAPE) is most often used as the summary measure of forecast error.
However, MAPE does not meet the validity criterion, because the distribution of the absolute percentage errors is usually skewed to the right, with outlier values present. In such cases, MAPE can be strongly influenced by a few very bad instances, which can overshadow otherwise quite good forecasts. In this article, we propose an alternative index, called resistant MAPE or r-MAPE, based on the calculation of the Huber M-estimator, which helps to overcome this limitation [17].
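An M-estimator of the location parameter μ, which generalizes the maximum likelihood (ML) estimator, is defined as a solution θ to

$$\sum_{i=1}^{n}\rho\!\left(\frac{x_i - \theta}{\sigma}\right) = \min_{\theta}$$

or

$$\sum_{i=1}^{n}\varphi\!\left(\frac{x_i - \theta}{\sigma}\right) = 0,$$

where $\varphi = \rho'$, $x_i = |W_{hi} - P_{hi}| / W_{hi}$ is the absolute percentage error in hour i, $W_{hi}$ is the observed load in hour i, $P_{hi}$ is the forecasted load in hour i and σ is the scale parameter. For a given positive constant k, the Huber [13] estimator is defined by the following function φ:

$$\varphi(x) = \begin{cases} x, & |x| \le k \\ k\,\operatorname{sign}(x), & |x| > k \end{cases}$$

where k is a tuning constant determining the degree of robustness, set at 1.5. This function is known as metric Winsorizing and brings extreme observations in to μ ± kσ. In reality σ is not known, thus a robust MAD (median absolute deviation) estimator was used:

$$\hat{\sigma} = \operatorname*{median}_{i}\left|x_i - \operatorname*{median}_{j}(x_j)\right|$$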
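One possible way to compute such a resistant MAPE is sketched below. It is not the authors' implementation: it assumes that the Huber location estimate is obtained by iterated Winsorizing around the current estimate, that the scale is the raw MAD held fixed during the iterations (no consistency constant), and that errors are expressed in percent. All function names are hypothetical.

```python
import statistics

def huber_location(values, k=1.5, tol=1e-8, max_iter=100):
    """Huber M-estimate of location with the scale fixed at the MAD."""
    mu = statistics.median(values)
    sigma = statistics.median([abs(v - mu) for v in values])  # raw MAD (assumption)
    if sigma == 0:
        return mu  # all central values coincide; the median is the estimate
    for _ in range(max_iter):
        low, high = mu - k * sigma, mu + k * sigma
        # Metric Winsorizing: pull extreme observations in to mu +/- k*sigma.
        winsorized = [min(max(v, low), high) for v in values]
        new_mu = statistics.fmean(winsorized)
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu

def r_mape(observed, forecast, k=1.5):
    """Resistant MAPE: Huber location of the absolute percentage errors (in %)."""
    apes = [abs(w - p) / w * 100.0 for w, p in zip(observed, forecast)]
    return huber_location(apes, k=k)

# Made-up percentage errors with one gross outlier.
errors = [10.0, 12.0, 11.0, 13.0, 500.0]
resistant = huber_location(errors)
ordinary = statistics.fmean(errors)
```

With these toy errors the ordinary mean is dominated by the single outlier, whereas the Huber estimate stays close to the bulk of the errors, which is the behavior the r-MAPE index is meant to provide.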
Accuracy is the measure of how many correct forecasts the model makes, where correctness is defined by the user. This can be done by defining a correct forecast as a value within a percentage range of the actual load. However, for low loads a percentage range may become insignificant. For a load of 0.1 kWh, a 15% range would be 0.085–0.115 kWh, and a forecast of 0.2 kWh would be considered wrong, although in practice such a forecast would be acceptable. To overcome this false loss of accuracy we set two scales to measure accuracy. In this study, we set a 15% range of error for accuracy, but if the load was smaller than 1 kWh we considered a range of ±0.15 kWh as the range of acceptable forecasts. Therefore, accuracy for hour i was given as:

$$\mathrm{Acc}_i = \begin{cases} 1, & W_{hi} \ge 1 \ \text{and}\ |P_{hi} - W_{hi}| \le 0.15\,W_{hi} \\ 1, & W_{hi} < 1 \ \text{and}\ |P_{hi} - W_{hi}| \le 0.15 \\ 0, & \text{otherwise} \end{cases}$$
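The two-scale rule can be sketched in a few lines of Python. The function names are hypothetical, and the default tolerances follow the values stated in the text (15% for loads of at least 1 kWh, ±0.15 kWh below that).

```python
def is_correct(observed, forecast, relative_tol=0.15, absolute_tol=0.15, threshold=1.0):
    """True if the forecast falls inside the acceptable range around the observed load."""
    allowed = absolute_tol if observed < threshold else relative_tol * observed
    return abs(forecast - observed) <= allowed

def accuracy(observed, forecast):
    """Share of hours whose forecast is counted as correct."""
    hits = sum(is_correct(w, p) for w, p in zip(observed, forecast))
    return hits / len(observed)

# The low-load example from the text: 0.2 kWh forecast for a 0.1 kWh load is accepted,
# while 2.4 kWh forecast for a 2.0 kWh load exceeds the 15% range.
share = accuracy([0.1, 2.0], [0.2, 2.4])
```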
The analysis setup was constructed in the following way. In the first experiment the models using only historical consumption data were trained. In the second experiment the historical data enriched with household behavioral data were used for modeling.
The problem considered in this section is the correct forecasting of the electricity load for 24 hours ahead. First, we computed Spearman’s correlation coefficients between the observed load and the explanatory variables, as presented in Table 5.
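For readers unfamiliar with Spearman's coefficient, the sketch below computes it as the Pearson correlation of the ranks, with tied values sharing their average rank. The function names and sample values are assumptions for illustration only.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# A monotone but non-linear relationship still yields a perfect rank correlation.
rho = spearman([1, 2, 3, 4], [1, 4, 9, 16])
```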
The observations in Table 5 suggest that efficient forecasting might be possible using, for instance, the usage over the last three hours and the variables derived from it, that is, the average, minimum, maximum and range of the load observed in the last three hours. This finding might be important for data storage and for reducing the volume of data to be transmitted by the smart meter.
Before estimating the ANN, SVM and CART models, we randomly split the data into two samples. The training (calibration) sample, comprising 90% of the observations, was used to estimate the models, while the test sample, comprising the remaining 10%, was used to validate them and assess their generalization.
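A random 90/10 split of this kind can be sketched with the standard library alone. The function name, the fixed seed and the use of 44 daily observations as the sample are assumptions made for illustration; the text does not state how the split was implemented.

```python
import random

def train_test_split(rows, test_fraction=0.10, seed=0):
    """Randomly assign about `test_fraction` of the rows to the test sample."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(rows) * test_fraction)
    test_indices = set(indices[:n_test])
    train = [row for i, row in enumerate(rows) if i not in test_indices]
    test = [row for i, row in enumerate(rows) if i in test_indices]
    return train, test

days = list(range(44))                 # placeholder for 44 daily observations
train_days, test_days = train_test_split(days)
```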
The calculations were performed in R and Statistica ver. 10. For each algorithm we built 24 models, one for each hour of the day.
Firstly, a three-layer back-propagation neural network was trained. As the loss function we chose the least squares estimator. In the most general terms, least squares estimation minimizes the sum of squared deviations of the observed values of the dependent variable from those forecasted by the model. Technically, the least squares estimator is obtained by minimizing the SOS (sum of squares) function:

$$\mathrm{SOS} = \sum_{i=1}^{n}\left(W_{hi} - P_{hi}\right)^{2}$$

where $W_{hi}$ is the observed load in hour i and $P_{hi}$ is the forecasted load in hour i.
For training the neural networks we used the BFGS (Broyden-Fletcher-Goldfarb-Shanno) algorithm, which belongs to the broad family of quasi-Newton optimization methods. This method performs significantly better than traditional algorithms such as gradient descent, but it is more demanding in terms of memory and computation.
To select the best model, we used the multiple correlation coefficient, which measures the correlation (linear dependence) between a linear combination of the independent variables and the dependent variable (in our case, hourly electricity load demand).
In the experiment we tried several neural network structures for each hour to obtain the best result. The network finally used has one hidden layer. The input layer consisted of 43 perceptrons (model without behavioral features) or 73 perceptrons (model taking household habits into account). The hidden layer consisted of 38 perceptrons and the output layer consisted of one perceptron. All perceptrons were activated by the logistic function:

$$f(x) = \frac{1}{1 + e^{-x}}$$
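The forward pass of a network with this shape (43 inputs, 38 hidden units, one output, logistic activations throughout) can be sketched as follows. The weights here are random placeholders rather than trained values, the input vector is synthetic, and all names are assumptions; since the output unit is logistic, the forecast lies in (0, 1), so the target load is presumably scaled to that range.

```python
import math
import random

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_layer(rng, n_inputs, n_units):
    """Small random weights per unit; the last entry of each row is the bias."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_inputs + 1)]
            for _ in range(n_units)]

def layer_forward(layer, inputs):
    """Weighted sum of the inputs plus bias, passed through the logistic function."""
    return [logistic(sum(w * x for w, x in zip(row, inputs)) + row[-1])
            for row in layer]

rng = random.Random(0)
hidden_layer = init_layer(rng, 43, 38)   # 43 inputs -> 38 hidden perceptrons
output_layer = init_layer(rng, 38, 1)    # 38 hidden -> 1 output perceptron

features = [rng.random() for _ in range(43)]   # placeholder for a scaled feature vector
hidden = layer_forward(hidden_layer, features)
forecast = layer_forward(output_layer, hidden)[0]
```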
The number of neurons in the hidden layer was determined by a numerical procedure. We started training with a small number of hidden units and successively increased the number of neurons until no significant improvement in model performance was observed.
It is well known that the generalization performance of support vector machines depends on a good setting of the global parameters C and ɛ and of the kernel function. The problem of optimal parameter selection is further complicated by the fact that SVM model complexity depends on all of these parameters. For this reason, we chose the values of these parameters arbitrarily and tried several different configurations. The final setting was as follows. The parameter ɛ, which controls the width of the insensitive zone, was set to 0.01. The capacity coefficient C, which determines the trade-off between model complexity and the degree to which deviations larger than ɛ are tolerated in the optimization formulation, was set to 10. As the kernel we used radial basis functions with parameter γ equal to 0.2. This function is by far the most popular kernel type because of its localized and finite response across the entire range of the real x-axis.
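For reference, a radial basis function kernel with γ = 0.2 can be sketched as below. The parametrization exp(-γ‖x − y‖²) is the common convention and is assumed here; the text does not state which form the software used, and the function name is hypothetical.

```python
import math

def rbf_kernel(x, y, gamma=0.2):
    """Gaussian RBF kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

# Identical points give the maximum similarity of 1; it decays with distance.
same = rbf_kernel([1.0, 2.0], [1.0, 2.0])
apart = rbf_kernel([0.0, 0.0], [1.0, 0.0])
```

The localized response mentioned in the text corresponds to this decay: points far from a support vector contribute almost nothing to the forecast.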
The CART algorithm is currently one of the most popular decision tree learning algorithms. It uses the splitting criterion described in the state-of-the-art section. In this research we used binary splits and cost-complexity pruning with 10-fold cross-validation. The final results obtained by ANN, SVM and CART, aggregated over all hours, are presented in Table 6.
For the training sample without behavioral information, the accuracy, which measures how many correct forecasts the model makes, is 63.9% for ANN, 68.3% for SVM and 50% for CART. The precision, which measures how close the forecast is to the actual load (MSE), is 0.069, 0.108 and 0.127 for ANN, SVM and CART, respectively. Moreover, r-MAPE was 52.9% for ANN, 46.6% for SVM and 75.5% for CART.
The results on the test set are as follows: ANN obtained an accuracy of 52.4%, r-MAPE of 68.2% and MSE of 0.083. SVM obtained an accuracy of 63.4%, r-MAPE of 52.4% and MSE of 0.138, while CART obtained an accuracy of 49.3%, r-MAPE of 78.1% and MSE of 0.117.
For the training sample enriched with behavioral variables, the accuracy increased by 10 percentage points (pp.) for ANN, nearly 5 pp. for SVM and nearly 2 pp. for CART in comparison to the results obtained on the dataset without behavioral variables. The MSE decreased by 0.019 for ANN, by 0.019 for SVM and by 0.005 for CART. The r-MAPE also decreased, by 15.3 pp. for ANN, by 9.4 pp. for SVM and by 2.1 pp. for CART.
The results on the test set are also better than those obtained without behavioral patterns. For this sample, ANN gained 11 pp. of accuracy, with a decrease of 16.8 pp. in r-MAPE and of 0.022 in MSE. SVM reported an accuracy gain of 2.9 pp., a decrease of 0.018 in MSE and a decrease of 4.5 pp. in r-MAPE, while CART gained 0.3 pp. of accuracy, with a decrease of 0.004 in MSE and a slight increase of 1 pp. in r-MAPE.
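Because the changes above are reported in percentage points, it may help to see how they translate into relative changes. The short calculation below uses only the test-set ANN figures quoted in the text (accuracy 52.4%, r-MAPE 68.2%, and the reported changes of 11 pp. and 16.8 pp.); the function and variable names are hypothetical.

```python
def relative_change(before, after):
    """Change expressed as a percentage of the starting value."""
    return (after - before) / before * 100.0

rmape_before = 68.2                      # ANN test-set r-MAPE without behavioral data
rmape_after = rmape_before - 16.8        # reported decrease of 16.8 pp.
accuracy_before = 52.4                   # ANN test-set accuracy without behavioral data
accuracy_after = accuracy_before + 11.0  # reported gain of 11 pp.

rmape_reduction = -relative_change(rmape_before, rmape_after)
accuracy_gain = relative_change(accuracy_before, accuracy_after)
print(f"relative r-MAPE reduction: {rmape_reduction:.1f}%")
print(f"relative accuracy gain: {accuracy_gain:.1f}%")
```

Rounded to whole numbers, these relative changes are about 25% and 21%, which matches the figures quoted for the ANN model in the conclusions.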
Model results for randomly selected test day are shown in Figs. 4 and 5.
We can observe that some hours can be forecasted with relatively high accuracy, while others are affected by rather high errors. For instance, night hours between 22:00 and 04:00 can be modeled with high accuracy, whereas afternoon and evening hours are less predictable. As depicted in Fig. 4, a better fit is provided by the models that used behavioral variables. We can observe that the load forecast curve for ANN follows the real load curve quite well.
Additionally, to give a graphical view of the performance of the proposed models, the results obtained for the datasets with and without behavioral features are shown in Fig. 5. From this figure we can observe, for each technique separately, how the forecasts follow the real load curve. The best projections are produced by SVM and ANN: SVM produces smooth forecasts, while ANN tends to capture the peak values. In general, the trend is followed well enough but, as expected, household behavior and other unmeasurable influences cause some deviations between the forecasts and the real load.
Although the results presented above are promising, we should bear in mind that forecasting at the individual household level is a difficult task, since daily household behavior may change drastically with circumstances, e.g. using home appliances depending on weather conditions (lights and TV on rainy days), going on trips or holidays, or inviting guests. In larger populations, smaller loads tend to offset one another and produce a stable time series, but for an individual home the volatility of the load time series is quite extreme, so accurate forecasting becomes a challenging task.
Conclusions
In this paper, we presented an approach to forecasting electricity load at the individual household level, which can potentially provide greater intelligence to smart meters and added value for individual customers. The results of the CART, SVM and MLP neural network models used for 24-hours-ahead short-term load forecasting show that they perform well and that reasonable prediction accuracy was achieved. The forecasting capabilities were evaluated by computing the accuracy measures between the observed and predicted values.
We showed through experiments that a combination of historical usage data and household behavioral data can greatly enhance forecasting of an individual consumer’s load when used in RPLF systems. This richer data set can reduce the r-MAPE error by up to 25% and increase average accuracy by up to 21% in some cases, as observed on the test set for the ANN model. As future work we see the following direction. The electricity consumption of a household changes over time depending on the operation of the individual appliances used by the family. We therefore aim to propose an optimal structure of the dataset that captures variability across appliances and supports accurate forecasting. We are going to explain how appliance-level data affords numerous benefits, and to prove that using the algorithms in conjunction with smart meters is a cost-effective and scalable solution for both appliance recognition and forecasting in smart metering systems applied at the household level.
Acknowledgments
This research was financed by VEDIA S.A. leading a project partially supported by National Centre for Research and Development in Poland (NCBiR).
The study is co-funded by the European Union from the resources of the European Social Fund. Project PO KL “Information technologies: Research and their interdisciplinary applications”, Agreement UDA-POKL.04.01.01-00-051/10-00.
