k-NN and ANN based deterministic and probabilistic wind speed forecasting intelligent approach

Abstract

The development of wind power, alongside the growth of renewable energy resources, plays a principal role in a developing country like India because of its critical locations. Long-term wind speed prediction has become a key research area for distinct applications (i.e., energy management, optimal design of wind farms, restructuring of electricity markets, load shedding and load forecasting). However, accurately forecasting wind speed for the installation of wind turbines is very difficult because wind has both deterministic and probabilistic characteristics. The technique presented in this study may bridge the research gap in long-term wind speed forecasting and resolve the problems indicated above. To this end, two fundamentally distinct techniques, the k-nearest neighbors (kNN) algorithm and the artificial neural network (ANN), have been implemented to forecast the monthly wind speed of Indian cities. The novelty of this paper is the use of the kNN algorithm to predict the wind speed of the coming month in a common form. A dataset of recorded wind speed samples from 168 cities of India is used to train and test the proposed approach. The results obtained with the proposed approach have been validated using the ANN technique and show a very small mean squared error (MSE).

Keywords

Decision tree, multilayer perceptron, wind speed prediction, artificial neural network

1 Introduction

Wind power forecasting plays an essential role in power management, both for the government of a country and for a utility. Reliable and highly accurate forecasting supports good planning and control of power, such as excess power storage, load scheduling and transmission [1]. When wind power cannot be forecast directly, forecasting of wind speed (WS) becomes more significant, because WS is directly related to wind power. Wind speed forecasting may be short-term (ST), medium-term (MT) or long-term (LT), and each type may be used by the power industry and/or power market for trading and management purposes. ST wind speed prediction is necessary to avoid damage to wind turbines due to storms, whereas LT wind speed forecasting is useful for planning power production [2].

Wind power and/or WS prediction models can be categorized into two types (i.e., statistical and physical models). In physical models, WS forecasting is performed using meteorological datasets. Numerical weather prediction is an example of a physical prediction model [3]. In the available literature, statistical models for WS and wind power forecasting include linear-time-series (LTS) based models and the persistence model. ANN and ANFIS models are also statistical models. The persistence approach is simple, whereas LTS based models are widely used.

When the forecasting horizon exceeds six hours, LTS and physical models are not suitable because of the probabilistic characteristics of wind. To overcome this problem, numerous nonlinear methods (i.e., ANNs and ANFIS) [4–8] are used for WS forecasting in the literature, some of which are reviewed in the next section [9–21].

In this paper, a comparison of kNN with state-of-the-art ANN models is presented, whose inputs are measured meteorological datasets of 168 cities of India. The performance comparison is shown for 25 different cities and represented graphically.

The rest of the paper is organized as follows: the state of the art for WS prediction using ANN models is presented in Section 2, and the datasets used are explained in Section 3. The methodology developed in this study is presented in Section 4, and the results obtained are described in detail in Section 5. Finally, a summary of the work is given in Section 6 as a conclusion.

2 Related work

2.1 Review of literature

Based on the available online databases, considerable research has been done in the area of wind speed (WS) forecasting. Some of this work is described below:

S. Buhan et al. implemented a model based on a statistical approach. The authors applied both ANN and SVM models to predict the WS of 25 power plants in Turkey. The mean absolute error (MAE) of the presented approach is in the range of 1.5% to 5.2%. The MAEs are 12.63% and 16.85% for the multistage and single-stage cases respectively, which is comparatively very high [9].

A. Kusiak and Z. Zhang presented a case study of short-term WS forecasting using different AI models (i.e., ANN, SVM, random forest, boosting tree) to identify wind behaviour, together with a comparison of the models. The prediction accuracy is 90% at a 60 s time period. Among the AI models, SVM predicts WS closest to the measured value for 1-hour prediction, whereas ANN gives good results for 4-hour prediction, which means that ANN is not suitable for short-term WS prediction for this problem [10].

T.G. Barbounis et al. presented long-term WS prediction models based on a recurrent neural network (RNN) and an MLP. The performance indexes are measured in terms of MAE and RMS [11].

S. Li et al. designed an ANN with 4 inputs for the prediction of WS and then of wind power. The reported performance of the proposed method is 1%–2%, but it takes a long time for multiple inputs [12].

Ronay Ak et al. implemented two different hybrid models (ANN with GA and ELM with kNN) to forecast time-series WS in Canada. The prediction accuracy of NN with CP is 90% and that of ELM with CP is 90.6% [13].

C.S. Ioakimidis et al. implemented probabilistic estimation of wind power potential through hourly WS prediction using ANN on a dataset from the south of Portugal. The prediction accuracy is approximately 75% [14], which is very poor for the installation of wind turbines.

T.G. Barbounis and J.B. Theocharis designed a fuzzy-based recurrent ANN. The proposed approach is used for multi-step wind prediction from remote locations. The proposed model achieves reasonable accuracy compared with ANN and fuzzy models [15].

J. Wang et al. developed a hybrid approach using RBF for real-time WS identification in the Hexi Corridor, China. The reported performance of the proposed approach is low, in the range of 12%–16%, which is comparatively higher than that of the existing MLP, ESM, RBFN, SAM-ESM, SAM-RBFN and ESM-RBFN. The main disadvantage of the proposed approach is that it predicts only hourly WS [16].

P. Ramasamy et al. developed an ANN approach for WS identification using a measured dataset from western Himachal Pradesh, India. The wind power obtained through the proposed approach varies from 773.6 to 5329.8 W, which is comparably higher [17].

H. Shao et al. [18] conducted a study on wind speed forecasting by proposing wavelet transformation (WT) and AdaBoosting neural networks. The data provided by the state grid include air pressure, WS and air temperature. The AdaBoosting NN is used to enhance the prediction accuracy. The forecasting performance is measured by RMSE, MAE and RMAE. The average forecasting accuracy evaluated by TRD is higher than that of TRA.

H. Borhan et al. [19] proposed a method for WS estimation based on pattern recognition (PR) using ANNs. The proposed model uses a NARX (nonlinear autoregressive network with exogenous inputs) on 10 years of wind data. The proposed method is evaluated on two different real datasets and compared with other methods. MAE is used as the performance measure, and the results show that the MAE is lowest for the proposed scheme.

2.2 Findings

After a critical review of the literature, the following findings related to wind speed forecasting were identified:

Datasets for training, testing and validation of the ANN model are chosen based on deterministic WS.

The architecture configuration (input layer neurons, hidden layer neurons, output layer neurons) of the ANN model is chosen based on WS conditions.

There is considerable research scope in WS prediction (i.e., ST, MT and LT) for wind farm installation.

The prediction accuracy of WS varies with the input variables, so exact values should be recorded for accurate prediction by the ANN model.

Prediction accuracy needs to be enhanced, as conventional artificial intelligence techniques are not suitable for this problem.

Because the accuracy changes with the input variables, the selection of input variables to enhance the forecasting accuracy of WS is another major finding for this problem.

To bridge these research gaps, kNN and ANN have been implemented in this paper.

3 Data set used

Credible datasets are collected from two different sources in this study. Dataset#1 has been collected from data recorded by CWET for 168 cities (Table 1) in distinct regions of India [20], whereas dataset#2 has been collected from freely available online data provided by NASA [21]. Four different data files have been prepared from these datasets. The training, testing, validation and prediction files contain the datasets of 93, 25, 25 and 25 cities respectively, selected randomly. The four files are distinct from each other. Thirteen input variables are used for WS prediction in this study (i.e., Lat.-Latitude, Long-Longitude, RH-Relative humidity, EL-Elevation, AP-Atmospheric pressure, SR-Daily solar radiation, AT-Air temperature, ET-Earth temperature, CD-Cooling degree-days, HD-Heating degree-days, CDT-Cooling design temperature, HDT-Heating design temperature and ET-Earth temperature amplitude). The dataset matrices are shown in Equations 1 to 4.

Table 1
Recorded Datasets of 168 Indian cities utilized for training, testing, validation and one month ahead prediction

City	Lat.	Lon.	City	Lat.	Lon.	City	Lat.	Lon.	City	Lat.	Lon.
93 Cities for training datasets
Vapi	20.385	72.912	Manmad	20.252	74.438	Thanesar	29.962	76.818	Sunam	30.13	75.8
Mahuva	21.092	71.77	Ner	20.492	77.868	Zirakpur	30.643	76.817	Malout	30.192	74.498
Surat	21.17	72.831	Malegaon	20.561	74.525	Panchkula	30.695	76.854	Patiala	30.34	76.38
Porbandar	21.641	69.606	Khamgaon	20.712	76.566	Sonepat	28.929	77.091	Nabha	30.374	76.145
Jhagadia	21.72	73.151	Shegaon	20.793	76.694	Kurukshetra	29.97	76.878	Sri Muktsar Sahib	30.48	74.518
Upleta	21.741	70.281	Umred	20.861	79.318	Sangli	16.868	74.57	Rajpura	30.484	76.594
Khambhat	22.318	72.619	Paratwada	21.273	77.523	Kavathe Mahankal	17.006	74.865	Khanna	30.703	76.22
Kalol	22.606	73.463	Morshi	21.324	78.014	Solapur	17.66	75.906	Moga	30.816	75.172
Nadiad	22.7	72.87	Shahada	21.546	74.47	Pandharpur	17.675	75.324	Rahon	31.052	76.118
Morbi	22.812	70.824	Malvan	16.067	73.467	Madha	18.034	75.522	Phagwara	31.224	75.771
Viramgam	23.126	72.057	Kolhapur	16.691	74.245	Osmanabad	18.186	76.042	Pathankot	32.266	75.647
Kadi	23.298	72.331	Kosumb	17.115	73.592	Udgir	18.393	77.113	Madhopur	32.365	75.597
Mansa	23.427	72.657	Satara	17.691	74.001	Lonavala	18.748	73.407	Sirhind	30.644	76.394
Junagadh	21.515	70.456	Pune	18.517	73.856	Nanded	19.17	77.32	Mandi Gobindgarh	30.683	76.294
Rajkot	22.306	70.82	Khalapur	18.834	73.289	Thane	19.218	72.978	Kharar	30.751	76.637
Vadodara	22.311	73.193	Parli Vaijnath	18.871	76.536	Mira Bhayandar	19.295	72.854	Rupnagar	30.975	76.527
Vadodara	22.311	73.193	Parbhani	19.258	76.774	Newasa	19.551	74.928	Nangal Dam	31.385	76.375
Jamnagar	22.471	70.058	Manwath	19.3	76.5	Sangamner	19.581	74.205	Khodal 2	26.372	71.188
Palanpur	24.179	72.427	Nalasopara	19.432	72.774	Palghar	19.694	72.766	Barkheri Bazar	24.571	77.731
Narnaul	28.066	76.101	Virar	19.466	72.806	Pusad	19.913	77.567	BIDDA	33.12	74.82
Rohtak	28.896	76.607	Partur	19.596	76.211	Wani	20.067	78.958	Robertsganj	24.685	92.564
Panipat	29.399	76.977	Kopargaon	19.883	74.483	Washim	20.1	77.15	Warora	20.23	79
Sirsa	29.537	75.026	Nashik	19.997	73.79	Mehkar	20.15	76.574
Narwana	29.593	76.119	Nagpur	21.147	79.089
25 Cities for testing datasets
Amalsad	20.818	72.955	Ambala City	30.378	76.777	Arvi	20.634	79.14	Bhadgaon	20.667	75.233
Bardoli	21.125	73.113	Bhiwani	28.799	76.134	Akola	20.707	77.003	Bathinda	30.211	74.945
Amreli	21.603	71.222	Ahmednagar	19.101	74.741	Amravati	20.937	77.779	Dhuri	30.37	75.87
Anand	22.554	72.949	Andheri	19.117	72.862	Amalner	21.042	75.064	Batala	31.823	75.205
Ahmedabad	23.034	72.585	Aurangabad	19.901	75.353	Akot	21.1	77.06	Amritsar	31.634	74.872
Bahadurgarh	28.68	76.92	Buldana	20.537	76.181	Bhandara	21.17	79.65	Balesar	26.397	72.479
Baramati	18.151	74.577
Cities for validation datasets
Bharuch	21.706	72.998	Gurgaon	28.458	77.026	Dahanu	19.991	72.744	Buti Bori	20.928	79.004
Botad	22.17	71.668	Hansi	29.102	75.966	Chalisgaon	20.464	74.997	Fazilka	30.404	74.028
Dahod	22.836	74.256	Gangakhed	18.966	76.748	Daryapur	20.922	77.327	Faridkot	30.678	74.74
Bhuj	23.242	69.667	Georai	19.263	75.752	Gondia	21.46	80.195	Garhshankar	31.214	76.144
Deesa	24.259	72.191	Devlali	19.906	73.824	Chimur	20.501	79.38	Ferozepur	30.923	74.61
Fatehabad	29.512	75.455	Chandrapur	19.97	79.303	Dhule	20.903	74.775	Kanod	26.083	71.783
Chiplun	17.532	73.518	Chiplun

$Train = \begin{bmatrix} Lat, Long, RH, EL, AP, SR, AT, \\ ET, CD, HD, CDT, HDT, ET \end{bmatrix}_{1116 \times 9}$ (1)

$Test = \begin{bmatrix} Lat, Long, RH, EL, AP, SR, AT, \\ ET, CD, HD, CDT, HDT, ET \end{bmatrix}_{300 \times 9}$ (2)

$Valid. = \begin{bmatrix} Lat, Long, RH, EL, AP, SR, AT, \\ ET, CD, HD, CDT, HDT, ET \end{bmatrix}_{300 \times 9}$ (3)

$Pred. = \begin{bmatrix} Lat, Long, RH, EL, AP, SR, AT, \\ ET, CD, HD, CDT, HDT, ET \end{bmatrix}_{300 \times 9}$ (4)

4 Methodology

4.1 k-nearest neighbors algorithm (k-NN) [22]

In data mining, k-NN is a non-parametric approach used for both regression and classification. In k-NN, the input consists of the k closest training samples, whereas the output depends on the application of k-NN (regression or classification).

For classification: the output is a class assigned by a majority vote of the neighbors (i.e., the class most common among the k nearest samples).

For regression: the output is the average of the values of the k nearest neighbors.

The k-NN algorithm is a type of lazy or instance-based learning, in which the function is only approximated locally, and it is among the simplest of all data mining algorithms. k-NN can also assign weights to the contributions of the neighbors, so that nearer neighbors contribute more than distant ones.
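The two output rules described above can be sketched as follows. This is a minimal illustration in Python rather than the MATLAB implementation used in this study; the function names and the toy samples are assumptions introduced only for the example, and the Euclidean distance is assumed as the metric.

```python
import math
from collections import Counter


def k_nearest(train_x, query, k):
    """Indices of the k training samples closest to `query` (Euclidean distance)."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    return order[:k]


def knn_regress(train_x, train_y, query, k):
    """Regression: average of the target values of the k nearest neighbors."""
    idx = k_nearest(train_x, query, k)
    return sum(train_y[i] for i in idx) / k


def knn_classify(train_x, train_y, query, k):
    """Classification: majority vote among the labels of the k nearest neighbors."""
    idx = k_nearest(train_x, query, k)
    return Counter(train_y[i] for i in idx).most_common(1)[0][0]


# Illustrative one-feature samples (not taken from the paper's dataset).
toy_x = [(0.0,), (1.0,), (2.0,), (10.0,)]
toy_speed = [2.0, 3.0, 4.0, 9.0]            # hypothetical target values
toy_class = ["low", "low", "low", "high"]   # hypothetical class labels
```

Because k-NN is a lazy learner, there is no training step beyond storing `toy_x` and the targets; all work happens at query time.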

Let us assume a data set $(x, y), (x_1, y_1), \ldots, (x_n, y_n)$, where $y$ is the target value of the data sample $x$, so that $x \mid y = r \sim p_r$ for $r = 1, 2$, where $p_r$ is a probability distribution.

In the training phase, the k-NN algorithm stores the feature vectors and the corresponding target labels. The distance metric for continuous variables is typically the Euclidean distance, whereas for discrete variables the overlap metric (or Hamming distance) is used, as given by Equations 5 to 8, where the sums run over the components of the vectors $x$ and $y$.

$\text{Euclidean}: \sqrt{\sum_{i=1}^{k} (x_i - y_i)^2}$ (5)

$\text{Manhattan}: \sum_{i=1}^{k} |x_i - y_i|$ (6)

$\text{Minkowski}: \left( \sum_{i=1}^{k} |x_i - y_i|^q \right)^{1/q}$ (7)

$\text{Hamming}: D_H = \sum_{i=1}^{k} |x_i - y_i|$, where $x = y \Rightarrow D = 0$; $x \neq y \Rightarrow D = 1$ (8)
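Equations (5) to (8) translate almost directly into code. The sketch below is an illustration in Python, not the code used in this study; the function names are assumptions, and the Hamming distance is written as a count of mismatching components, which is what Eq. (8) yields when each term is 0 for equal and 1 for unequal values.

```python
import math
from typing import Sequence


def euclidean(x: Sequence[float], y: Sequence[float]) -> float:
    """Eq. (5): square root of the summed squared component differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def manhattan(x: Sequence[float], y: Sequence[float]) -> float:
    """Eq. (6): sum of absolute component differences."""
    return sum(abs(a - b) for a, b in zip(x, y))


def minkowski(x: Sequence[float], y: Sequence[float], q: float) -> float:
    """Eq. (7): generalizes Eq. (6) (q = 1) and Eq. (5) (q = 2)."""
    return sum(abs(a - b) ** q for a, b in zip(x, y)) ** (1.0 / q)


def hamming(x: Sequence, y: Sequence) -> int:
    """Eq. (8): number of positions at which discrete values differ."""
    return sum(0 if a == b else 1 for a, b in zip(x, y))
```

For the vectors (0, 0) and (3, 4), the Euclidean distance is 5 and the Manhattan distance is 7, which the Minkowski form reproduces for q = 2 and q = 1 respectively.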

The accuracy of k-NN can be enhanced if the distance metric is learned by LMNN (large margin nearest neighbor) or by NCA (neighbourhood components analysis). A major drawback of basic k-NN arises when the class distribution is skewed. To overcome this problem, the class/target value of each of the k nearest points (samples) is multiplied by a weight proportional to the inverse of the distance from that point to the test point.
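The inverse-distance weighting described above can be sketched for the regression case as follows. This is a hedged illustration in Python: the function name and the small constant `eps`, added to avoid division by zero when a query coincides with a training sample, are assumptions and are not specified in this paper.

```python
import math


def weighted_knn_regress(train_x, train_y, query, k, eps=1e-12):
    """k-NN regression in which each neighbor is weighted by 1 / distance."""
    # Pair each target with its distance to the query and keep the k nearest.
    nearest = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))[:k]
    weights = [1.0 / (d + eps) for d, _ in nearest]
    weighted_sum = sum(w * y for w, (_, y) in zip(weights, nearest))
    return weighted_sum / sum(weights)
```

With training points at 0 and 2 carrying targets 1 and 3, a query at 0.5 yields 1.5 rather than the unweighted mean of 2, because the nearer point dominates.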

Parameter selection: the choice of k depends upon the data. A larger value of k reduces the effect of noise. When k = 1, the predicted value is that of the closest training sample. A suitable k can be selected by other optimization techniques.

The main logic of the k-NN algorithm is to find the set of k objects in the training data set that are closest to the test sample. The basic idea is to compute the distance from the test sample to every training sample, select the k nearest samples, and combine their target values.
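One common way to choose k from the data is leave-one-out cross-validation over a set of candidate values. The sketch below illustrates that idea in Python; it is an assumption for illustration and not necessarily the selection procedure used in this study. The names `loo_mse` and `select_k` are hypothetical.

```python
import math


def knn_regress(train_x, train_y, query, k):
    """Plain k-NN regression: mean target of the k nearest training samples."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    return sum(train_y[i] for i in order[:k]) / k


def loo_mse(xs, ys, k):
    """Leave-one-out mean squared error of k-NN regression for a given k."""
    total = 0.0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]   # hold out sample i
        rest_y = ys[:i] + ys[i + 1:]
        total += (knn_regress(rest_x, rest_y, xs[i], k) - ys[i]) ** 2
    return total / len(xs)


def select_k(xs, ys, candidates):
    """Return the candidate k with the smallest leave-one-out error."""
    return min(candidates, key=lambda k: loo_mse(xs, ys, k))


# Illustrative noise-free linear data (not from the paper's dataset).
xs = [(float(i),) for i in range(10)]
ys = [2.0 * i for i in range(10)]
```

On this toy data `select_k(xs, ys, [1, 2, 3, 9])` picks k = 2, since averaging the two neighbors on either side of an interior point reproduces a linear target exactly.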

4.2 Multilayer perceptron (MLP) [23 –28]

There are two types of perceptron NN (i.e., single layer perceptron (SLP) and multilayer perceptron (MLP)). The SLP is used for linear classification, whereas the MLP is used for nonlinear problems. Multilayer perceptrons, also called feed forward networks (FFN), were introduced by Werbos in 1974 and revised by Rumelhart, McClelland and Hinton in 1986. Usually, a perceptron evaluates a discontinuous function, which is represented as:

$\vec{m} \to g_{step}(w_0 + \langle \vec{w}, \vec{m} \rangle)$ (9)

where $\vec{m}$ is the input vector, $\vec{w}$ is the weight vector and $w_0$ is the bias. A smoothed version of this function is represented as:

$\vec{m} \to g_{log}(w_0 + \langle \vec{w}, \vec{m} \rangle)$ (10)

with

$g_{log}(k) = \frac{1}{1 + e^{-k}}$ (11)
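The difference between Eqs. (9) and (10) can be seen in a short sketch. The code below is illustrative Python, not the implementation used in this study; the threshold convention of the step function (output 1 for non-negative input) is an assumption, since the paper does not state it.

```python
import math


def g_step(k: float) -> float:
    """Step activation of Eq. (9); the threshold at 0 is an assumed convention."""
    return 1.0 if k >= 0 else 0.0


def g_log(k: float) -> float:
    """Logistic activation of Eq. (11)."""
    return 1.0 / (1.0 + math.exp(-k))


def perceptron(m, w, w0, g):
    """Apply activation g to the bias plus the inner product <w, m>."""
    return g(w0 + sum(wi * mi for wi, mi in zip(w, m)))
```

For the same weights, `g_step` jumps between 0 and 1, whereas `g_log` varies smoothly and is therefore differentiable, which gradient-based training requires.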

The function $g_{log}$ in Eq. (11) is called the logistic function. Logistic functions are monotonically increasing, continuous and differentiable. A multilayer perceptron is an acyclic graph whose nodes are neurons with logistic activation. The training process of an MLP is the same as that of an SLP. The weights between neurons in the MLP are varied to minimize the error, given either by the squared error of Eq. (12) or by the cross-entropy of Eq. (13):

$E = \frac{1}{2} \sum_{p} \sum_{j} \left( tg_j^p - out_j^{(N)p} \right)^2$ (12)

$E = - \sum_{p} \sum_{j} \left[ tg_j^p \cdot \log\left(out_j^{(N)p}\right) + \left(1 - tg_j^p\right) \cdot \log\left(1 - out_j^{(N)p}\right) \right]$ (13)

where $tg_j^p$ is the target value and $out_j^{(N)p}$ is the model output of the j-th neuron of the output layer $N$ for pattern $p$.
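The two error functions of Eqs. (12) and (13) can be written compactly as below. This is an illustrative Python sketch with assumed function names; targets and outputs are given as one list per pattern p, each holding one value per output neuron j, and the cross-entropy assumes outputs strictly between 0 and 1.

```python
import math


def squared_error(targets, outputs):
    """Eq. (12): half the summed squared difference over patterns and outputs."""
    return 0.5 * sum(
        (tg - out) ** 2
        for tg_p, out_p in zip(targets, outputs)
        for tg, out in zip(tg_p, out_p)
    )


def cross_entropy(targets, outputs):
    """Eq. (13): cross-entropy error; requires 0 < out < 1."""
    return -sum(
        tg * math.log(out) + (1.0 - tg) * math.log(1.0 - out)
        for tg_p, out_p in zip(targets, outputs)
        for tg, out in zip(tg_p, out_p)
    )
```

For two patterns with targets 1 and 0 and outputs 0.8 and 0.4, the squared error is 0.5 × (0.04 + 0.16) = 0.1.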

The weights between two neurons may be updated through a series of rules (weight updating rules), which are represented as:

$\Delta w_{uv}^{(m)} = - \eta \frac{\partial E(w_{ij}^{(n)})}{\partial w_{uv}^{(m)}}$ (14)

for the training set

$\{ in_i^p, tg_j^p : i = 1, \ldots, n_{inputs},\; j = 1, \ldots, n_{outputs},\; p = 1, \ldots, n_{patterns} \}$ (15)

where $\eta$ is the learning rate. Only the output $out_j^{(N)p}$ appears in the output error function. Finally, the output of the output layer depends on the weight values and on the output of the hidden layer. The complete training procedure of the MLP is represented step by step in Fig. 1.
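The update rule of Eq. (14) can be illustrated on the simplest possible case, a single logistic neuron trained on the squared error of Eq. (12). This Python sketch is deliberately not a full multilayer network and does not reproduce the procedure of Fig. 1; the function name, the zero initial weights, the learning rate of 0.5, the epoch count and the logical-OR toy data are all assumptions chosen for the example.

```python
import math


def g_log(k):
    """Logistic activation, Eq. (11)."""
    return 1.0 / (1.0 + math.exp(-k))


def train_neuron(inputs, targets, eta=0.5, epochs=5000):
    """Batch gradient descent, w <- w - eta * dE/dw, for one logistic neuron."""
    n = len(inputs[0])
    w0, w = 0.0, [0.0] * n
    for _ in range(epochs):
        grad0, grad = 0.0, [0.0] * n
        for x, tg in zip(inputs, targets):
            out = g_log(w0 + sum(wi * xi for wi, xi in zip(w, x)))
            # Chain rule: dE/dnet = (out - tg) * out * (1 - out) for squared error.
            delta = (out - tg) * out * (1.0 - out)
            grad0 += delta
            for i, xi in enumerate(x):
                grad[i] += delta * xi
        w0 -= eta * grad0
        w = [wi - eta * gi for wi, gi in zip(w, grad)]
    return w0, w


def predict(x, w0, w):
    return g_log(w0 + sum(wi * xi for wi, xi in zip(w, x)))


# Toy training set: logical OR (linearly separable, so one neuron suffices).
or_inputs = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
or_targets = [0.0, 1.0, 1.0, 1.0]
```

In an MLP the same rule is applied to every weight, with the partial derivatives for hidden-layer weights obtained by propagating the output error backwards through the layers.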

Fig.1

Training Multilayer Perceptron.

4.3 Implementation of k-NN and MLP

For implementing the k-NN and MLP models, data are collected from two sources: 1) NASA and 2) measured values from CWET for 168 cities of India. The datasets of these 168 cities have been divided randomly into four groups. Group 1 contains the datasets of 93 cities and is used for designing and training the model. Group 2 contains the datasets of 25 cities and is used for testing the designed and trained model. Group 3 contains the datasets of another 25 cities and is used for validating the trained and tested model. Finally, group 4 contains the datasets of the remaining 25 cities, which are used for future forecasting. These four groups are completely distinct from each other. The training, testing, validation and prediction data include 1116, 300, 300 and 300 samples respectively, each with 13 variables.
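The random division into four disjoint groups can be sketched as follows. This Python illustration is not the procedure actually used in this study: the function name, the fixed seed and the placeholder city names are assumptions. The factor of 12 samples per city is an inference from the stated sample counts (1116 / 93 = 300 / 25 = 12), consistent with one sample per month.

```python
import random


def split_cities(cities, sizes=(93, 25, 25, 25), seed=0):
    """Shuffle the cities and cut them into disjoint groups of the given sizes."""
    if sum(sizes) != len(cities):
        raise ValueError("group sizes must add up to the number of cities")
    shuffled = list(cities)
    random.Random(seed).shuffle(shuffled)   # seeded for reproducibility
    groups, start = [], 0
    for size in sizes:
        groups.append(shuffled[start:start + size])
        start += size
    return groups


# Placeholder names standing in for the 168 cities of Table 1.
cities = [f"city_{i}" for i in range(168)]
train, test, valid, pred = split_cities(cities)

SAMPLES_PER_CITY = 12   # inferred: one sample per month
```

Splitting by city rather than by individual sample keeps all samples of a city in the same group, so no city contributes to both training and evaluation.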

5 Results and discussion

Both the k-NN and MLP approaches have been designed and run on the training, testing and validation datasets using MATLAB R2014a (8.3.0.532, win32/64) on a Lenovo Y700 PC. The PC has a 2.6-GHz Intel Core i7-6700HQ processor with 16GB of RAM and a 1TB, 5,400-rpm HDD with a 128GB SSD. The results obtained during all operating phases are shown graphically in Figs. 2–4. Analysis of the plots (Figs. 2–4) shows that k-NN provides WS values very close to the WS recorded at real sites in Indian cities. For comparison, the MLP method has been implemented on the same dataset, which shows that the results obtained from kNN are much better than those from MLP. Moreover, the wind power of a location depends upon the WS of that location, as represented by Equation (16). So, before the installation of any wind turbine/farm, correct identification of WS without installing a meteorological substation becomes more practical. In that scenario, the kNN model will play an important role in predicting the actual WS without a meteorological substation. The WS forecasting accuracy of the MLP method is 94.0%, whereas that of the kNN method is 99.1%.

$Energy \propto v^{2}$ (16)

Fig.2

Training phase results.

Fig.3

Testing Phase results.

Fig.4

Validation Phase Results.

Figure 2 represents the training phase output as well as the error variation with respect to the number of samples for the kNN and MLP methods. Here, the red line represents the kNN performance, whereas the green line represents the MLP performance.

The number of data samples in the training, testing and validation phases is very large. Therefore, it is not easy to distinguish between the measured values and the values predicted by the model. To overcome this problem, the training, testing and validation phase results have been plotted again using the “rose” plotting method, as shown in Fig. 5. With this method, the predicted and measured values of wind speed can be distinguished easily.

Fig.5

Angle histogram plot for a) Training phase, b) testing phase and c) validation phase.

5.1 Comparison of Result obtained from k-NN and MLP Approaches

This paper presents a comparative study of two approaches, kNN and MLP, and the results obtained from these approaches are further validated by forecasting the WS for a new, unknown dataset of 25 cities in Indian states. The predicted results have been compared with the values recorded by the meteorological stations of the CWET centre of India and are represented graphically in Fig. 6.
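A comparison between predicted and recorded wind speeds is usually summarized by error measures such as the MSE, MAE and RMSE mentioned earlier in this paper. The sketch below shows these standard definitions in Python; it is an illustration only, the function names and the sample values are assumptions, and it does not reproduce how the accuracy percentages reported in this study were computed.

```python
import math


def mse(measured, predicted):
    """Mean squared error between measured and predicted values."""
    return sum((m - p) ** 2 for m, p in zip(measured, predicted)) / len(measured)


def mae(measured, predicted):
    """Mean absolute error."""
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)


def rmse(measured, predicted):
    """Root mean squared error."""
    return math.sqrt(mse(measured, predicted))


# Hypothetical wind speeds in m/s (not values from this study).
measured_ws = [3.0, 4.0, 5.0]
predicted_ws = [2.5, 4.0, 6.0]
```

For these illustrative values the errors are 0.5, 0 and 1 m/s in magnitude, giving an MAE of 0.5 m/s; applying the same functions to each model's predictions allows a like-for-like comparison.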

Fig.6

Comparison of Predicted Wind speeds by k-NN and MLP methods of 25 cities mentioned in Table 4.

6 Conclusion

In this paper, k-NN and MLP models are developed for the prediction of long-term wind speed (one-month-ahead wind speed). Data for 168 cities of India, collected from NASA together with measured data from CWET, are used in this study. Thirteen input variables are used in this paper (i.e., Lat., Long., RH, EL, AP, SR, AT, ET, CD, HD, CDT, HDT and ET). On comparing these two models, it is found that k-NN gives better results than the MLP model.

The future scope is to implement the proposed kNN approach at a real site for forecasting WS under deterministic and probabilistic environmental conditions. Moreover, selecting the most relevant input variables for WS forecasting is another future scope of this study, which will be addressed in forthcoming research.

References

1. Barbounis T.G., et al., Long-term wind speed and power forecasting using local recurrent neural network models, IEEE Transactions on Energy Conversion 21(1) (2006), 273–284.

2. Kariniotakis G.N., Stavrakakis G.S. and Nogaret E.F., Wind power forecasting using advanced neural networks models, IEEE Trans. Energy Convers 11(4) (1996), 762–767.

3. and Deo M.C., Forecasting wind with neural networks, Marine Structures 16 (2003), 35–49.

4. Parul, Malik and Sharma, Wind speed forecasting model for northern-western region of india using decision tree and multi layer perceptron neural network approach, Interdisciplinary Environmental Review 19(1) (2018), 13–30. doi: 10.1504/IER.2018.089766.

5. Savita, M.A. Ansari, Pal N.S. and Malik, Wind Speed and Power Prediction of Prominent Wind Power Potential States in India using GRNN, in: Proc. IEEE ICPEICES-2016 (2016), pp. 1–6. doi: 10.1109/ICPEICES.2016.7853220.

6. Malik and Savita, Application of Artificial Neural Network for Long Term Wind Speed Prediction, in: Proc. IEEE CASP-2016 (2016), pp. 217–2229–11. doi: 10.1109/CASP.2016.7746168.

7. Azeem, Kumar and Malik, Artificial Neural Network Based Intelligent Model for Wind Power Assessment in India, in: Proc. IEEE PIICON-2016 (2016), pp. 1–6, 25–27. doi: 10.1109/POWERI.2016.8077305.

8. Azeem, Kumar and Malik, Application of Waikato Environment for Knowledge Analysis Based Artificial Neural Network Models for Wind Speed Forecasting, in: Proc. IEEE PIICON-2016 (2016), pp. 1–6, 25–27. doi: 10.1109/POWERI.2016.8077352.

9. Buhan and Çadirci, Multistage wind-electric power forecast by using a combination of advanced statistical methods, IEEE Transactions on Industrial Informatics 11(5) (2015), 1231–1242.

10. Kusiak and Zhang, Short-horizon prediction of wind power: A data-driven approach, IEEE Transactions on Energy Conversion 25(4) (2010), 1112–1122.

11. Barbounis T.G., Theocharis J.B., Alexiadis M.C. and Dokopoulos P.S., Long-term wind speed and power forecasting using local recurrent neural network models, IEEE Transactions on Energy Conversion 21(1) (2006), 273–284.

12. Li S., Wunsch D.C., O’Hair E.A. and Giesselmann M.G., Using neural networks to estimate wind turbine power generation, IEEE Transactions on Energy Conversion 16(3) (2001), 276–282.

13. Ronay A.K., Fink and Zio, Two machine learning approaches for short-term wind speed time-series prediction, IEEE Transactions on Neural Networks and Learning Systems 27(8) (2016), 1734–1747.

14. Ioakimidis C.S., Oliveira L.J. and Genikomsakis K.N., Wind power forecasting in a residential location as part of the energy box management decision tool, IEEE Transactions on Industrial Informatics 10(4) (2014), 2103–2111.

15. Barbounis T.G. and Theocharis J.B., Locally recurrent neural networks for long-term wind speed and power prediction, Neurocomputing 69(4-6) (2006), 466–496.

16. Wang, Zhang, Wang, Han and Kong, A novel hybrid approach for wind speed prediction, Information Sciences 273 (2014), 304–318.

17. Ramasamy, S.S. Chandel and Yadav A.K., Wind speed prediction in the mountainous region of India using an Artificial neural network model, Renewable Energy 80 (2015), 338–347.

18. Shao, Wei, Deng and Xing, Short-term wind speed forecasting using wavelet transformation and AdaBoosting neural networks in Yunnan wind farm, IET Renew Power Gener 11(4) (2017), 374–381. doi: 10.1049/iet-rpg.2016.0118.

19. Azad H.B., Mekhilef and Ganapathy V.G., Long-term wind speed forecasting and general pattern recognition using neural networks, IEEE Transactions On Sustainable Energy 5(2) (2014), 546–553. doi: 10.1109/TSTE.2014.2300150.

20. Timeseries data and Publications for Sale. http://niwe.res.in/NIWE_OLD/html/departments_ps.html, [accessed 5.04.2016].

21. https://eosweb.larc.nasa.gov/sse/RETScreeii/ [accessed 5.04.2016]

22. Malik, Wavelet and hilbert huang transform based wind turbine imbalance fault classification model using k-nearest neighbor algorithm, Int J Renewable Energy Technology 9(1/2) (2018), 66–83. doi: 10.1504/IJRET.2018.090105

23. Malik and Mishra, Artificial neural network and empirical mode decomposition based imbalance fault diagnosis of wind turbine using turb sim, FAST and simulink, IET Renewable Power Generation 11(6) (2017), 889–902. doi: 10.1049/iet-rpg.2015.0382.

24. Yadav A.K., Malik and Chandel S.S., Application of rapid miner In ANN based prediction of solar radiation for assessment of solar energy resource potential of 76 sites in northwestern India, Renewable and Sustainable Energy Reviews 52 (2015), 1093–1106. doi: 10.1016/j.rser.2015.07.156

25. Yadav A.K., Malik and Chandel S.S., Selection of most relevant input parameters using WEKA for artificial neural network based solar radiation prediction models, Renewable and Sustainable Energy Rev 31 (2014), 509–519. doi: 10.1016/j.rser.2013.12.008

26. Malik and Sharma, EMD and ANN based intelligent fault diagnosis model for transmission line, Fuzzy Systems 32(4) (2017), 3043–3050. doi: 10.3233/JIFS-169247

27. Yadav A.K. and Malik, Comparison of different artificial neural network techniques in prediction of solar radiation for power generation using different combinations of meterological variables, in: Proc. IEEE Int. Conf. on Power Electronics, Drives and Energy Systems (PEDES) 2014, pp. 1–5. doi: 10.1109/PEDES.2014.7042063

28. Saad and Malik, Selection of Most Relevant Input Parameters Using WEKA for Artificial Neural Network Based Concrete Compressive Strength Prediction Model, in: Proc. IEEE PIICON-2016, 2016, pp. 1–6. doi: 10.1109/POWERI.2016.8077368.