Abstract
This paper proposes a new hybrid performance evaluation model for wind turbines (WTs) based on the Morlet wavelet neural network (MWNN), the self-organizing map (SOM) neural network and the Markov chain. All data are collected from a supervisory control and data acquisition (SCADA) system. First, a WT power prediction model is built on the MWNN, and the number of input variables, the learning rates and the number of hidden-layer nodes are examined to obtain the optimal configuration. The prediction deviations of the optimal model are then calculated. Next, the wind power deviations are clustered by the SOM neural network and classified by the Markov chain. Two wind turbine anomaly indices (AI), denoted AI1 and AI2, are proposed to analyze WT performance. The results show that the proposed indices can accurately evaluate the operating state of the WTs and provide a useful reference for improving the operation and maintenance efficiency of the wind farm concerned.
Keywords
1. Introduction
In the wake of rising global energy demand and the environmental degradation caused by non-renewable fossil fuels, extensive research has been devoted to the engineering application of renewable resources such as solar, geothermal and wind energy. Because wind energy is clean, abundant, inexhaustible and environmentally friendly, it has become the fastest growing renewable energy source in both developed and developing countries (Celiktas and Kocar, 2012). However, the strong dependence on wind speed makes wind power highly variable (Sun et al., 2016). At the same time, long-term operation of a WT can lead to many problems, such as degraded power generation performance and an increased failure rate. The growing demand for wind energy therefore requires WTs to be kept in a better operating state. Fortunately, the supervisory control and data acquisition (SCADA) system provides hundreds of measurements, such as wind speed, WT operation parameters and energy conversion parameters, which can be used effectively to address the problems above (Zhao et al., 2015).
For WT performance evaluation, WTs can be classified into different types, for example small WTs (SWTs), horizontal axis WTs (HWTs), vertical axis WTs (VWTs), offshore WTs (OWTs) and distributed WTs (DWTs), and the methods of performance analysis differ between types. For SWTs, five major EU countries (France, Germany, Italy, Spain, and The Netherlands) examined the technical and economic feasibility of SWTs (Ozgener, 2006). To use wind power within a green building approach, Onder et al. presented an energy analysis and an aerodynamic performance analysis of SWT systems (SWTS) (Li et al., 2016; Wekesa et al., 2016). For HWTs, Nobari et al. (2016) studied how different blade tip plates improve HWT performance. A three-dimensional numerical code based on the finite volume method was independently developed to solve the governing equations, using the Reynolds-averaged Navier-Stokes equations and the SST k-ω turbulence model (Nobari et al., 2016). For VWTs, Wakui and Yokoyama used a dynamic simulation model for numerical analysis and developed a wind speed sensorless performance monitoring method for a stand-alone VWT. This method focused on improving the response speed of the rotor when the VWT deteriorated during constant blade tip speed operation (Wakui and Yokoyama, 2013). Yao et al. (2012) numerically simulated the two-dimensional unsteady flow field of a VWT with different turbulence models, using the FLUENT software and the SIMPLEC algorithm. Wang et al. combined the characteristics of horizontal and vertical axis wind turbines (HAWT and VAWT) in a newly designed small WT, the cross-axis WT (CAWT), whose power performance was examined experimentally at low wind speeds, with Reynolds numbers of Re = 42900, 57100 and 71400. The results were compared with a conventional straight-bladed VAWT (Wang et al., 2018). OWTs are typically supported on large-diameter monopiles and subjected to cyclic loads such as wind and waves.
For DWTs, availability and reliability are the top priorities for distributed generation (DG) system deployment, especially when operating in harsh environments. Condition monitoring (CM) can meet these requirements, but because of the large number of sensors deployed, it is challenging to process the resulting data in real time. Wang et al. (2016) proposed a sensor optimization method based on principal component analysis (PCA) for condition monitoring of a wind power DG system.
Every part of the WT has a significant impact on power generation performance. To account for the centrifugal pumping effect caused by the radial flow along the blades (the Himmelskamp effect), Arramach et al. established a model using the aerodynamic coefficients of the pre-stall and post-stall regions. A numerical calculation program was used to predict the WT aerodynamic forces and power (Arramach et al., 2017). Because the minimum WT flow velocities of standard airfoils were about 7 m/s at stall angles of attack, Yavuz et al. (2015) chose a NACA4412-NACA6411 slat-airfoil arrangement to investigate potential performance improvements. Bontempo and Manna (2014) analyzed the aerodynamic performance of a ducted WT using a nonlinear, semi-analytical actuator disk model. The aerodynamic performance model was based on the unsteady Reynolds-averaged Navier-Stokes (URANS) equations combined with the finite element method (FEM) in a loosely coupled manner (Bontempo and Manna, 2014). Dai et al. (2017) studied the relationship among the average power, thrust and yaw angle of a WT. Lee et al. (2016) used an incompressible unsteady Reynolds-averaged Navier-Stokes (k-ε RNG) model to understand the performance and shape characteristics of a helical Savonius WT at various helical angles. Sağlam (2018) carried out a performance and fault analysis of a wind farm's wind turbine generators (WTGs) to find the reasons for the farm's low efficiency. Herp et al. (2016) proposed a data-driven model for wind farm performance monitoring based on a Bayesian classifier and multivariable power curve analysis. Lapira et al. (2012) proposed a systematic framework that used a multi-regime modeling approach to account for the dynamic working conditions of the WT. Jia et al. (2016) proposed a novel similarity metric for machine performance curves and presented a framework to evaluate WT health condition based on principal component analysis and the power curve.
This paper studies WT power generation performance in a certain area of China and combines a power prediction model with performance analysis to evaluate that performance. The logical relationship between the parts is shown in Figure 1.

Framework of the WTs’ performance evaluation model based on ANN.
2. Modeling of WT power based on Morlet wavelet neural network
The data used in this paper come from two operational WTs located in Jilin Province, China: the No. 2 WT and the No. 3 WT. Their specifications and operating characteristics are as follows: (1) rated power 1500 kW, (2) designed air density 0.9 kg/m³, (3) cut-in speed 3.0 m/s, (4) rated wind speed 3.0 m/s, (5) cut-out speed 25 m/s. An advanced data acquisition system connected directly to the turbine controllers was used to gather data over approximately 3 months. Data from the No. 2 WT are used to build the ANN-based prediction model.
2.1. Database description
For each turbine, over 50 parameters are measured, including generator power (GP), generator speed (GS), wind speed (WS), average gear oil temperature per second (GOT), average gearbox DE end bearing temperature per second (G_DET), average gearbox NDE end bearing temperature per second (G_NDET), average generator DE end bearing temperature per second (GE_DET), average generator NDE end bearing temperature per second (GE_DENT), average generator cold air temperature per second (GE_CWT), average generator stator winding U temperature per second (GE_UT), average generator stator winding V temperature per second (GE_VT), and average generator stator winding W temperature per second (GE_WT). To determine the major independent variables, the Pearson correlation analysis method is used to calculate the correlation coefficients between the measured parameters and GP. The coefficient is defined as follows:
$$r = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}}$$
where $x_i$ is the $i$-th sample of a measured parameter, $y_i$ is the corresponding GP sample, $\bar{x}$ and $\bar{y}$ are their sample means, and $n$ is the number of samples.
Generally, if the absolute value of the correlation coefficient r is greater than 0.95, the correlation is significant. If it lies between 0.8 and 0.95, the correlation is high. If it lies between 0.5 and 0.8, the correlation is moderate. If it lies between 0.3 and 0.5, the correlation is low. If it is less than 0.3, the correlation is weak and is usually treated as no correlation. If the distribution of the variable values is non-normal or unclear, the discrete data should first be sorted, or the ranks of the continuous variable computed, and the correlation coefficient then calculated on the ranks. Here, we select the process variables whose correlation coefficients are greater than 0.50; the analysis results are shown in Table 1. Accordingly, GS, WS, GOT, G_DET, G_NDET, GE_DET, GE_DENT, GE_CWT, GE_UT, GE_VT, and GE_WT are chosen to build the model. For the whole database, the ranges of the parameters are shown in Table 2.
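The screening step described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names `pearson_r` and `select_inputs` and the toy data are assumptions made here, and only the 0.5 threshold on |r| comes from the text.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))

def select_inputs(columns, target, threshold=0.5):
    """Keep the parameter names whose |r| with the target exceeds the threshold."""
    return [name for name, values in columns.items()
            if abs(pearson_r(values, target)) > threshold]

# Toy data (made up for illustration, not SCADA measurements).
rng = np.random.default_rng(0)
ws = rng.uniform(3.0, 25.0, 500)             # wind speed, m/s
gp = 60.0 * ws + rng.normal(0.0, 50.0, 500)  # power loosely tied to wind speed
unrelated = rng.normal(0.0, 1.0, 500)        # a signal with no relation to GP

selected = select_inputs({"WS": ws, "UNRELATED": unrelated}, gp)
print(selected)
```

With real SCADA data, `columns` would hold one array per measured parameter and `target` the GP series.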
Pearson correlation for the No. 2 WT.
The ranges for the whole database.
2.2. Data pre-processing
When the wind speed varies within its low-value region, it may cause the WT to start up or shut down frequently. In addition, the collected data have different stochastic characteristics and should be filtered. Here, samples with wind speeds below 3 m/s (the cut-in wind speed) or above 25 m/s (the cut-out wind speed) are removed. Moreover, only samples with GP greater than 0 kW are used to train the ANN model.
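The filtering rule above amounts to a Boolean mask over the samples. The sketch below assumes the data are held in NumPy arrays; the function name `valid_mask` and the five example samples are illustrative only.

```python
import numpy as np

CUT_IN = 3.0    # m/s, cut-in wind speed from the turbine specification
CUT_OUT = 25.0  # m/s, cut-out wind speed from the turbine specification

def valid_mask(ws, gp):
    """True for samples inside the operating wind range with positive power."""
    ws = np.asarray(ws, dtype=float)
    gp = np.asarray(gp, dtype=float)
    return (ws >= CUT_IN) & (ws <= CUT_OUT) & (gp > 0.0)

# Illustrative samples: wind speed in m/s and generator power in kW.
ws = np.array([2.5, 3.0, 8.0, 26.0, 12.0])
gp = np.array([0.0, 15.0, 600.0, 0.0, -5.0])

mask = valid_mask(ws, gp)
print(ws[mask], gp[mask])
```

The same mask would be applied to every input column so that the rows stay aligned.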
2.3. ANN selection
Here, the BP neural network (Xue et al., 2019), the RBF neural network (Wen et al., 2018), the Gaussian wavelet neural network and the Morlet wavelet neural network (Wen et al., 2017) are selected to build the prediction models. Each model has 11 input nodes, 6 hidden nodes and 1 output node. The learning rates between the input, hidden and output layers are all 0.3, and the weight vectors between the layers are initialized randomly. From the No. 2 WT, 9000 samples are selected as training data and 1053 as test data. The root mean square error (RMSE), the mean absolute percentage error (MAPE) and the regression coefficient (R2) are used to analyze the performance of the prediction models:
where
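A minimal sketch of the three metrics follows, assuming their standard definitions. These are an assumption here: MAPE is returned as a fraction rather than a percentage, and R2 is taken as one minus the ratio of residual to total sum of squares. The example values are made up.

```python
import numpy as np

def rmse(y, y_hat):
    """Root mean square error."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def mape(y, y_hat):
    """Mean absolute percentage error as a fraction; requires y != 0."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return float(np.mean(np.abs((y - y_hat) / y)))

def r2(y, y_hat):
    """1 - SS_res / SS_tot (assumed definition of R2)."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Illustrative measured and predicted power values in kW.
y_true = [100.0, 200.0, 400.0]
y_pred = [110.0, 190.0, 400.0]
print(rmse(y_true, y_pred), mape(y_true, y_pred), r2(y_true, y_pred))
```

Because the pre-processing keeps only samples with GP greater than 0 kW, the division in `mape` is well defined on the filtered data.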
The prediction results are shown in Figure 2. For the No. 2 WT, the Morlet wavelet neural network (MWNN) model has the highest prediction accuracy for both training and test samples. For the training samples, its MAPE converges to 0.083, its RMSE converges to 24.07, and its R2 value is 0.992. For the test samples, its MAPE converges to 0.046, its RMSE converges to 18.36, and its R2 value is 0.996. Table 3 gives the statistical results. Ranked from strongest to weakest generalization ability, the models are MWNN, Gaussian wavelet network, BP and RBF. The MWNN is therefore selected to build the power prediction model. Its architecture is shown in Figure 3. The MWNN has three layers: the input layer contains 11 neurons, the hidden layer contains n neurons and the output layer contains 1 neuron. The wavelet neural network is built on the framework of the BP neural network. It replaces the sigmoid function with the Morlet wavelet function and combines a translation factor and a scaling factor to construct the wavelet basis. The translation factor plays the role of the adjustable threshold in the BP neural network, which shifts the weighted input horizontally. The scaling factor adjusts the response at different scales. Because of the combination of these two factors, the MWNN can approximate the objective function at different scales.
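The role of the translation and scaling factors can be seen in a forward pass through a network of this shape. The sketch below is an illustration under stated assumptions, not the authors' code. The Morlet mother wavelet is not written out in the text, so the form cos(1.75 z) exp(-z²/2) commonly used in wavelet neural networks is assumed, and the names `w_in`, `a`, `b` and `w_out` are hypothetical.

```python
import numpy as np

def morlet(z):
    """Assumed Morlet mother wavelet: cos(1.75 z) * exp(-z^2 / 2)."""
    return np.cos(1.75 * z) * np.exp(-0.5 * z ** 2)

def mwnn_forward(x, w_in, a, b, w_out):
    """One forward pass of a three-layer wavelet network.

    x     : (n_in,)          input vector
    w_in  : (n_hidden, n_in) input-to-hidden weights
    a, b  : (n_hidden,)      scaling and translation factors
    w_out : (n_hidden,)      hidden-to-output weights
    """
    z = (w_in @ x - b) / a          # shift by b, then rescale by a
    return float(w_out @ morlet(z))

rng = np.random.default_rng(0)
n_in, n_hidden = 11, 3              # 11 inputs; 3 hidden nodes as an example
x = rng.uniform(0.0, 1.0, n_in)     # normalized inputs (made up)
w_in = rng.normal(0.0, 0.1, (n_hidden, n_in))
a = np.ones(n_hidden)
b = np.zeros(n_hidden)
w_out = rng.normal(0.0, 0.1, n_hidden)

y_hat = mwnn_forward(x, w_in, a, b, w_out)
print(y_hat)
```

Training would update `w_in`, `w_out`, `a` and `b` by gradient descent, as in a BP network; that part is omitted here.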

Prediction results based on different models.
Comparisons of different prediction methods.

Architecture of MWNN.
2.4. Modeling based on MWNN
The number of hidden-layer nodes is key to keeping the model simple while ensuring prediction accuracy. Figure 4 shows the RMSEs and MAPEs, averaged over multiple training runs, as the number of hidden-layer nodes ranges from 1 to 10.

Results when the number of hidden-layer nodes ranges from 1 to 10.
Figure 4 shows that when the number of hidden-layer nodes is 1 or 2, both the MAPE and the RMSE are large. In other words, these prediction models have lower accuracy. Figure 5 shows the RMSEs and MAPEs when the number of hidden-layer nodes ranges from 3 to 10. From Figure 5, we conclude that the optimal number of hidden-layer nodes is 3, because both the MAPE and the RMSE reach their minimum there.

Updating RMSE and MAPE of different hidden-layer nodes.
Next, we analyze the influence of the learning rates on the model. From the MWNN topological structure and the network convergence algorithm, two learning-rate coefficients need to be tuned: α and β.
Here, the learning rates α and β are examined by the orthogonal method. Both range from 0.01 to 0.09. In each trial, each of the two learning rates takes a value from this range, and the pair is used to train the MWNN model. Figure 6 shows three-dimensional contour maps of the RMSEs and MAPEs, averaged over multiple training runs. The color scale from dark blue to deep red indicates values from small to large. Table 4 lists several minimum points in Figure 6. According to Figure 6 and the training results, the point (0.02, 0.01) has the minimum RMSE and MAPE.
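The search over learning rates can be organized as a grid with repeated runs averaged per point. The sketch below is a generic illustration and makes several assumptions: it scans the full 9 × 9 grid, the callable `train_and_score` is hypothetical, and `toy_score` is a stand-in objective that does not train any network. Its minimum is placed at (0.02, 0.01) only so that the example has a known answer; it is not evidence for that result.

```python
import itertools
import numpy as np

def grid_search(train_and_score, rates, n_repeats=5):
    """Average the score over repeated runs for every (alpha, beta) pair.

    train_and_score(alpha, beta, seed) -> error value (hypothetical callable
    that would train the model once and return, e.g., its RMSE).
    """
    results = {}
    for alpha, beta in itertools.product(rates, rates):
        scores = [train_and_score(alpha, beta, seed) for seed in range(n_repeats)]
        results[(alpha, beta)] = float(np.mean(scores))
    best = min(results, key=results.get)
    return best, results

rates = [k / 100 for k in range(1, 10)]   # 0.01, 0.02, ..., 0.09

def toy_score(alpha, beta, seed):
    """Stand-in objective with a known minimum; ignores the seed."""
    return (alpha - 0.02) ** 2 + (beta - 0.01) ** 2

best, results = grid_search(toy_score, rates)
print(best, len(results))
```

In practice `train_and_score` would train the MWNN with the given learning rates and a seed-dependent random initialization, which is why the scores are averaged.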

Prediction results. (a) RMSE of different learning rates after updating. (b) MAPE of different learning rates after updating.
Several of the better RMSE and MAPE points in Figure 6(a) and 6(b).
3. WT performance analysis
3.1. Deviation clustering based on SOM
Once the optimal prediction model has been created, it can be used in real time to predict the value expected for normal behavior under the current working conditions. Any incipient failure mode that appears during WT operation produces a deviation between the expected value and the measured value. Following this reasoning, a failure mode can in general be regarded as a function of its symptoms, characterized by deviations from the expected normal behavior, which suggest that the failure mode is present or at least that contextual circumstances favor it (Sun et al., 2016). Here the “Deviation” (Dev) of the WT performance indicator is defined as follows (Du et al., 2018):
Where,
Here, the self-organizing map (SOM), an unsupervised clustering algorithm, is used to cluster the deviations.
The deviations can be formulated as:
The target type to be divided into can be formulated as:
Therefore, SOM is a mapping process from S to V. The SOM used here has 9 neurons, and the mean square error (MSE) is used as the performance indicator. The topological structure and clustering results are shown in Figures 7 and 8. Based on the results, neurons with the same or similar color are grouped into one cluster. Accordingly, the 1st, 2nd and 4th neurons are classified as state 1, the 7th, 8th and 9th neurons as state 2, the 3rd and 5th neurons as state 3, and the 6th neuron as state 4. All the clusters of the No. 2 WT are thus classified into four categories based on color, which correspond to four different performance/operating states of the WT. The clustering results for all the deviations are shown in Figure 9. To determine whether the WT is operating normally in each performance state, the actual parameters are compared with the fault records. The comparison shows that no fault samples come from state 1 or state 2, some fault samples come from state 3, and almost all fault samples come from state 4. The fault probability thus increases with the deviation, so the deviation can be used to evaluate the operating state of the WT. This supports dividing the deviation sets into four categories.
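The clustering step can be illustrated with a small SOM written directly in NumPy. This is a minimal sketch under assumptions, not the configuration used in the paper. The 3 × 3 grid layout, the row-major neuron numbering, the training schedule and the toy deviations are all chosen here for illustration. The lookup table `NEURON_TO_STATE` copies the grouping reported for the No. 2 WT and is meaningful only for that trained map; for any other map the grouping has to be redone from the neuron weights.

```python
import numpy as np

def train_som(data, grid=(3, 3), n_iter=2000, lr0=0.5, sigma0=1.0, seed=0):
    """Train a small SOM with a Gaussian neighborhood; returns neuron weights."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float).reshape(len(data), -1)
    rows, cols = grid
    # Grid coordinates of each neuron, numbered row by row.
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)], float)
    # Initialize the weights from randomly chosen samples.
    weights = data[rng.integers(0, len(data), rows * cols)].copy()
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                  # decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 1e-3     # shrinking neighborhood
        x = data[rng.integers(0, len(data))]
        bmu = np.argmin(((weights - x) ** 2).sum(axis=1))
        d2 = ((coords - coords[bmu]) ** 2).sum(axis=1)
        h = np.exp(-d2 / (2.0 * sigma ** 2))
        weights += lr * h[:, None] * (x - weights)
    return weights

def assign(data, weights):
    """Index (0-based) of the best-matching neuron for every sample."""
    data = np.asarray(data, dtype=float).reshape(len(data), -1)
    d = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

# Grouping of neurons (1-based) into states as reported for the No. 2 WT.
NEURON_TO_STATE = {1: 1, 2: 1, 4: 1, 7: 2, 8: 2, 9: 2, 3: 3, 5: 3, 6: 4}

# Toy deviations (made up): mostly small, a few large.
rng = np.random.default_rng(1)
dev = np.concatenate([rng.normal(0, 5, 300),
                      rng.normal(60, 10, 60),
                      rng.normal(200, 20, 15)])

weights = train_som(dev)
bmu = assign(dev, weights)
states = np.array([NEURON_TO_STATE[int(k) + 1] for k in bmu])
print(weights.ravel(), np.bincount(states, minlength=5)[1:])
```

The resulting `states` array is the time-ordered label sequence that a Markov model can then be estimated from.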

The structural diagram of neurons in the SOM of the No. 2 wind turbine.

The neuron weight value distribution.

The deviations and corresponding states of the No. 2 WT.
3.2. Markov model
Figure 10 shows samples selected from state 4, covering an operating period of about 3 h (19 consecutive data points). Figure 10(b) shows that the wind speed around the No. 2 WT did not change during this period. However, between points 14 and 16 the GP and GS dropped abruptly (Figure 10(a)), followed by a sharp fall in temperature (Figure 10(c) and (d)). The historical data contain a fault record for this period. Under normal operating conditions, the GS should be stable and the GP varies with the GS; if the GS fluctuates over a wide range, a larger power error results. The accumulated deviation over a period of time is defined as the total deviation observed during that time. In some cases the total deviation may stem from a failure mode and in others from particular working conditions, but in either case the deviation is accumulated continuously according to the observed values. This approach accounts for accumulating stress. To explain the relationship between the clustering results and to provide useful information on WT performance, this paper proposes a performance index and a calculation model based on the Markov chain.

Changes in parameters during the 3-h operation in State 4 of the No. 2 wind turbine. (a) Curves of Dev, GP, GS and the prediction results of Morlet NN. (b) Curves of WS, GOT, G-NDET and G-DET. (c) Curves of GE-DET, GE-NDET and GE-CWT. (d) Curves of GE-VT, GE-WT and GE-UT.
Here, a discrete Markov model is proposed based on a finite state space. Based on the SOM clustering results above, the state transition (ST) probability matrix and the initial state probability (IS) of the wind turbine are defined as:
$$ST = \begin{bmatrix} p_{11} & \cdots & p_{14} \\ \vdots & \ddots & \vdots \\ p_{41} & \cdots & p_{44} \end{bmatrix}, \qquad IS = \begin{bmatrix} c_1 & c_2 & c_3 & c_4 \end{bmatrix}$$
where $p_{ij} = P(V_j \mid V_i)$ is the transition probability from state $V_i$ to state $V_j$ given that the system is in state $V_i$, and $c_i = P(V_i)$ is the probability of state $V_i$. ST gives the probability of transition from one state to another, and IS gives the initial probability of each state. The memorylessness of the Markov model is expressed as follows:
$$P\left(V_j^{t+1} \mid V_i^{t}, V_k^{t-1}, \ldots, V_l^{1}\right) = P\left(V_j^{t+1} \mid V_i^{t}\right) \tag{9}$$
For each $V_i^t$, the subscript $i$ is the current state of the system and the superscript $t$ is time. To quantify the abnormal level of prediction errors, Vieira and Sanz-Bobi (2013) defined an anomaly level index, which provides interval threshold references for the parameters during WT operation.
In order to calculate the abnormal index, here, the probability of state transition (
For each
Equation (11) can be obtained from the conditional probability.
Through (11), the probability of each state transition path can be calculated as follows:
Because (9) expresses memorylessness, (12) can be further simplified by using (9):
Where
Table 5 shows the transition probability matrix of the No. 2 WT. According to the analysis above, most observations in State 4 correspond to the abnormal (fault) state of the WT. The transition probabilities from the other states to State 4 are shown in Figure 11. Among the transitions into State 4 from other states, the one from State 3 has the highest probability. This means that most observations in State 4 arrive from State 3. Therefore, when the WT deviation values enter State 3, close attention should be paid to the operating state to avoid failure risk. In addition, the transition probability from State 4 to itself is 0.3360, which means that once a fault occurs, it is difficult for the WT to return from the fault state to the normal state by itself. A Markov model for the No. 3 WT has also been established.
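A transition matrix of this kind can be estimated by counting consecutive state pairs in the time-ordered label sequence. The sketch below is illustrative only. The short sequence is made up, the function names are hypothetical, and taking IS as the empirical frequency of each state is an assumption. The path probability uses the chain rule with the Markov property, that is, the initial probability multiplied by the one-step transition probabilities along the path.

```python
import numpy as np

def estimate_markov(states, n_states=4):
    """Estimate ST (row-normalized counts) and IS (state frequencies).

    states : time-ordered labels in 1..n_states
    """
    s = np.asarray(states) - 1
    counts = np.zeros((n_states, n_states))
    for i, j in zip(s[:-1], s[1:]):
        counts[i, j] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows with no observed transitions stay at zero.
    st = np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)
    is_ = np.bincount(s, minlength=n_states) / len(s)
    return st, is_

def path_probability(path, st, is_):
    """P(path) = c_first * product of one-step transition probabilities."""
    p = is_[path[0] - 1]
    for i, j in zip(path[:-1], path[1:]):
        p *= st[i - 1, j - 1]
    return float(p)

# Made-up label sequence for illustration.
seq = [1, 1, 2, 1, 3, 4, 4, 3, 1, 1]
st, is_ = estimate_markov(seq)
print(st)
print(is_)
print(path_probability([3, 4], st, is_))
```

With the SOM state labels of a turbine as `seq`, the row of `st` for State 3 gives the estimated probability of moving on to State 4.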
Transition probability matrix for the No. 2 WT.

Transition probability from one state to another (No. 2 WT).
3.3. Abnormal state probability
Because State 3 contains some fault samples, State 4 contains almost all fault samples, and the highest transition probability into State 4 from another state is that from State 3, the transition probability from State 3 to State 4 is defined as anomaly index 1 (AI1). Similarly, if the WT stays in State 1 for at least 2 h and then transfers directly to State 4, the WT has fallen into a severe abnormal state; the probability of this transition from State 1 directly to State 4 is defined as anomaly index 2 (AI2).
According to formula (10), AI1 can be expressed as:
AI2 can be expressed as:
AI1 and AI2 correspond to two different safety levels for the WTs. Different safety levels correspond to different WT risk levels, and the priorities of the measures to be taken differ accordingly. Table 6 shows the initial state probabilities calculated for the No. 2 and No. 3 WTs, and Table 7 shows the calculated values of AI1 and AI2. WTs are at risk of failure in State 3, whereas no anomalies are expected in State 1. By definition, the risk associated with AI2 is therefore much greater than that associated with AI1. If the AI2 value is larger than the AI1 value, the WT failure risk is very high, and the WT must be repaired immediately to eliminate the risk. If the AI1 value is near a given value, maintenance and repair are required. Table 7 shows that the AI1 values of the two WTs are much larger than their AI2 values, which indicates that the two WTs are in good health. Although their risk levels are relatively low, alarm event records should still be investigated and scheduled maintenance carried out to a high-risk standard, to ensure reliable operation of the WTs and better utilization of wind resources. In addition, the AI1 value of the No. 2 WT is smaller than that of the No. 3 WT, but its AI2 value is greater. Therefore, the failure risk of the No. 2 WT is greater than that of the No. 3 WT. A comparison of their alarm and maintenance data also shows that the No. 2 WT had many more alarms and repairs than the No. 3 WT during this period.
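The decision logic described above can be written as a small rule. This is one possible reading, offered as a sketch and not as the authors' procedure. The function name `assess_risk`, the parameter `ai1_limit` (standing for the "given value", which the text does not specify), the reading of "near" as "at or above", and the example numbers are all assumptions.

```python
def assess_risk(ai1, ai2, ai1_limit):
    """Map the two anomaly indices to a maintenance action.

    ai1_limit is a hypothetical, operator-chosen reference value for AI1.
    """
    if ai2 > ai1:
        # A direct jump from the normal state to the fault state dominates.
        return "repair immediately"
    if ai1 >= ai1_limit:
        # Assumed reading of "AI1 near a given value".
        return "schedule maintenance"
    return "routine monitoring"

# Made-up index values for illustration.
print(assess_risk(ai1=0.05, ai2=0.10, ai1_limit=0.20))
print(assess_risk(ai1=0.25, ai2=0.01, ai1_limit=0.20))
print(assess_risk(ai1=0.05, ai2=0.01, ai1_limit=0.20))
```

A tolerance band around `ai1_limit` would be a natural refinement if "near" is meant to include values slightly below the reference.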
Initial state probability of the two WTs.
Abnormal index of the two WTs.
4. Conclusion
In this paper, a WT performance evaluation method is proposed based on SCADA data, Morlet NN, SOM and Markov model.
First, a WT power prediction model is proposed based on the Morlet NN and SCADA data. Compared with the other neural network models considered, the training and testing results show that the proposed model has higher prediction accuracy. Second, the power deviations obtained from the optimal Morlet wavelet network model are clustered by SOM, and the Markov model is used to analyze and evaluate the abnormal state of the WT. A new index called AI is proposed to quantify the WT abnormal level. Two case studies of onshore wind farms in northern China are analyzed, and the results show that the proposed AI can evaluate the WTs’ current power generation performance effectively and accurately.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The authors are thankful to Jilin City outstanding young talents training program (20190104156), the support of the science and technology projects by Jilin Province Department of Education (JJKH20190709KJ), the KEY Scientific and Technological Project of Jilin Province of China (20180201001SF) and National Natural Science Foundation of China (51476025).
