Abstract
This paper addresses the identification of anomalies in wind turbine (WT) gearboxes using a temperature trend analysis approach. Support vector regression (SVR) is adopted to build two models for forecasting the operating temperature of a WT gearbox. One model is trained with historical supervisory control and data acquisition (SCADA) data from the normal state, and the other is trained with abnormal-state data. The prediction accuracy of the two models is compared, and the sequences of relative error (SRE) for the two models are calculated. Then two trend cloud models, namely a normal cloud and an abnormal cloud, are built with an improved inverse normal cloud generator: the SRE are used as inputs of the generator, and the parameters of the trend cloud models are obtained as outputs. The closeness degree of the current state to the normal or abnormal cloud is calculated from the current SCADA data, and the principle of maximum closeness degree is adopted to judge the anomaly. The proposed approach has been used to analyze a real gearbox failure that occurred in a 1.5 MW WT. The results confirm the feasibility and efficiency of the proposed approach.
Introduction
The gearbox is one of the core components of WTs. According to statistics on faults and shutdown times of wind turbines, the average downtime of the gearbox is about 12 days per year [1]. Although the failure rate of the gearbox is not higher than that of other components, the total costs of maintenance, replacement and downtime are prohibitive [2–4]. It is therefore of great significance to raise an alarm on WT gearbox anomalies at an early stage to avoid serious accidents. In [5], the common failure modes of WT gearboxes were summarized, and it was pointed out that gearbox temperatures, which are recorded by the SCADA system of the WT, are often used to diagnose gearbox failures. In [6], an artificial neural network (ANN) was used to forecast the operating temperature of the gearbox; when the difference between the forecast value and the measured value exceeds a given value, an early failure warning is issued. However, an ANN often converges to local optima and its training process is time consuming, which limits its application. SVR is a machine learning algorithm based on statistical learning theory. It follows the structural risk minimization principle and has better generalization ability, so it is suitable for noisy, nonlinear regression problems [7, 8]. Recently, SVR has been used for the prediction of wind power with positive results [9]. In [10], a temperature trend analysis approach was presented to diagnose early failures of the WT generator, in which the nonlinear state estimation method was adopted to improve the robustness of the diagnosis. However, the temperature boundary is always fuzzy and uncertain during the deterioration of the gearbox, so determining the alarm threshold remains a difficult problem. Cloud models are effective tools for transforming between qualitative concepts and their quantitative expressions [11]. 
They can capture the fuzziness and gentleness of human cognition and are more applicable and universal in representing uncertain notions [12, 13]; they have been adopted to evaluate the status of WTs with better results [14]. In this paper, a trend cloud model is proposed to identify WT gearbox anomalies. First, SVR is used to build prediction models of the WT gearbox temperature, and the time series of relative errors between the predicted and measured values are calculated, which quantitatively describe the deterioration process of the gearbox. Using the improved inverse normal cloud generator, the parameters of two state clouds, a normal cloud and an abnormal cloud, are extracted from the time series of relative errors. When the gearbox has a potential fault, the closeness degree of the current state to the abnormal cloud exceeds its closeness degree to the normal cloud, so the gearbox anomaly can be identified at an early stage.
This paper is organized as follows: Section 2 introduces the basic ideas of the temperature trend analysis method and the normal cloud model. The trend cloud-based anomaly identification is discussed in Section 3. In Section 4, the proposed approach is applied to a real gearbox failure that occurred in a 1.5 MW WT to demonstrate its effectiveness for early failure warning. Section 5 provides a discussion and conclusions, including suggestions for further research.
Related works
In this section, the basic concepts of the normal cloud model and the SVR method are presented.
Identification of WT Gearbox Failures using temperature trend analysis
The identification of WT gearbox failures using temperature trend analysis based on SCADA data was proposed by Peng Guo et al. [10]. In that method, behavior models of the WT gearbox temperature were built to predict the variation trend of the gearbox temperature, and an alarm was triggered when the predicted trend deviated from the measured temperature variation. Two points are key to the method: the accuracy of the prediction models, and the way the allowable range of variation is decided.
Normal cloud model
Let U be the universe of discourse and C a qualitative concept in U. If x ∈ U is a random instantiation of the qualitative concept C that satisfies x ∼ N(Ex, En′²) with En′ ∼ N(En, He²), and the degree of certainty of x belonging to concept C satisfies y(x) = exp[−(x − Ex)²/(2En′²)], then the distribution of x on the domain U is called the normal cloud.
The normal cloud model is characterized by three parameters: the expectation Ex, which is the most representative value of the qualitative concept; the entropy En, which reflects the fuzziness of the qualitative concept; and the hyper entropy He, which reflects the randomness with which samples representing the concept occur and reveals the relationship between fuzziness and randomness. A typical normal cloud model is shown in Fig. 1.

A normal cloud model (Ex = 5, En = 2, He = 0.2).
In the cloud model, the cloud generators are used to convert between qualitative concepts and quantitative values, which include the forward cloud generator, the backward cloud generator, and the condition cloud generator.
The forward cloud generator converts qualitative concepts to quantitative values. When the three parameters (Ex, En, He) of the cloud and the number of cloud droplets N are given, the cloud droplets are obtained as described in Table 1 [11].
The algorithm of the forward cloud generator
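The forward generator follows directly from the definition of the normal cloud given above. The following Python sketch shows one way it could be written; the function name `forward_cloud` and the `seed` argument are illustrative choices, not taken from Table 1.

```python
import math
import random


def forward_cloud(ex, en, he, n, seed=None):
    """Return n cloud droplets (x, y) for a normal cloud (Ex, En, He).

    Illustrative sketch based on the definition of the normal cloud:
    En' ~ N(En, He^2), x ~ N(Ex, En'^2), y = exp(-(x - Ex)^2 / (2 En'^2)).
    """
    rng = random.Random(seed)
    drops = []
    for _ in range(n):
        # Draw the perturbed entropy En'; redraw in the (very unlikely)
        # case of an exact zero, which would make y undefined.
        en_prime = rng.gauss(en, he)
        while en_prime == 0.0:
            en_prime = rng.gauss(en, he)
        # Draw the droplet position around Ex with spread |En'|.
        x = rng.gauss(ex, abs(en_prime))
        # Certainty degree of x belonging to the concept.
        y = math.exp(-((x - ex) ** 2) / (2.0 * en_prime ** 2))
        drops.append((x, y))
    return drops


if __name__ == "__main__":
    # Parameters of the cloud shown in Fig. 1.
    for x, y in forward_cloud(5.0, 2.0, 0.2, 5, seed=0):
        print(f"x = {x:.3f}, certainty = {y:.3f}")
```

Plotting the droplets for Ex = 5, En = 2, He = 0.2 should give a bell-shaped cloud whose thickness grows with He.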
The backward cloud generator converts quantitative data into qualitative concepts; that is, the qualitative concept, represented by the parameters (Ex, En, He), is extracted from sample data belonging to that concept. In practical use, only the sample data are given, and the certainty information of each sample is unknown. It is therefore of high practical value to estimate the parameters with a backward cloud generator that does not require certainty degrees [16]. The algorithm of the backward cloud generator is described in Table 2.
The algorithm of backward cloud generator
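As an illustration, a commonly used form of the backward generator without certainty degrees estimates Ex from the sample mean, En from the first-order absolute central moment, and He from the gap between the sample variance and En². The Python sketch below assumes this form; it may differ in detail from Table 2, and the function name `backward_cloud` is hypothetical.

```python
import math
import statistics


def backward_cloud(samples):
    """Estimate (Ex, En, He) from samples without certainty degrees.

    Assumed form (may differ from Table 2):
      Ex = sample mean
      En = sqrt(pi / 2) * mean(|x - Ex|)
      He = sqrt(S^2 - En^2), with S^2 the sample variance
    """
    data = list(samples)
    ex = statistics.fmean(data)
    en = math.sqrt(math.pi / 2.0) * statistics.fmean([abs(x - ex) for x in data])
    s2 = statistics.variance(data)
    he_squared = s2 - en ** 2
    # S^2 - En^2 can be negative for some samples. Clamping to zero is a
    # placeholder chosen for this sketch only; it is NOT the improved
    # algorithm of Table 3.
    he = math.sqrt(he_squared) if he_squared > 0.0 else 0.0
    return ex, en, he


if __name__ == "__main__":
    print(backward_cloud([1.0, 2.0, 3.0, 4.0, 5.0]))
    print(backward_cloud([0.0, 0.0, 1.0, 1.0]))  # S^2 - En^2 < 0 here
```

The second call shows a sample for which S² − En² is negative, so the square root has no real value without some correction.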
However, in the above algorithm, the hyper entropy He is calculated as
The algorithm of improved backward cloud generator
When the parameters (Ex, En, He) are obtained, the certainty degree of a given datum x = x_i is calculated by the X-condition normal cloud generator [11]; the calculation process is described in Table 4.
The algorithm of X condition normal cloud generator
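A minimal Python sketch of the X-condition generator is given below, assuming it draws En′ ∼ N(En, He²) and evaluates the certainty formula of the normal cloud at the given x. The function name and the optional `rng` argument are illustrative, not taken from Table 4.

```python
import math
import random


def x_condition_certainty(x, ex, en, he, rng=None):
    """Certainty degree of a given value x for the cloud (Ex, En, He).

    Sketch: draw En' ~ N(En, He^2), then y = exp(-(x - Ex)^2 / (2 En'^2)).
    Because En' is random, repeated calls return different y for He > 0.
    """
    rng = rng or random.Random()
    en_prime = rng.gauss(en, he)
    while en_prime == 0.0:  # avoid division by zero
        en_prime = rng.gauss(en, he)
    return math.exp(-((x - ex) ** 2) / (2.0 * en_prime ** 2))


if __name__ == "__main__":
    rng = random.Random(1)
    for x in (5.0, 6.0, 9.0):
        print(x, round(x_condition_certainty(x, 5.0, 2.0, 0.2, rng), 3))
```

With He = 0 the result is deterministic and reduces to an ordinary Gaussian membership function centred on Ex.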
For a given data set of the (x_i, y_i) ∈
where σ is the Gauss kernel function width.
When Equation (1) is solved, the hyper-plane function can be found, and a future value can be predicted employing Equation (3)
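To make the prediction step concrete, the sketch below evaluates a trained SVR in its usual dual form, f(x) = Σ β_i K(x_i, x) + b, with a Gaussian kernel. The kernel is assumed here to be K(u, v) = exp(−‖u − v‖²/(2σ²)); the exact form of Equations (1) to (3) is not reproduced, and the support vectors, coefficients and bias in the usage example are made-up placeholders, not values from this study.

```python
import math


def gaussian_kernel(u, v, sigma):
    """Gaussian kernel, assuming K(u, v) = exp(-||u - v||^2 / (2 sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def svr_predict(x, support_vectors, dual_coefs, bias, sigma):
    """Evaluate f(x) = sum_i beta_i * K(x_i, x) + b.

    beta_i stands for the difference of the two Lagrange multipliers of
    support vector x_i obtained when the SVR training problem is solved.
    """
    return bias + sum(
        beta * gaussian_kernel(sv, x, sigma)
        for sv, beta in zip(support_vectors, dual_coefs)
    )


if __name__ == "__main__":
    # Hypothetical toy model with two support vectors.
    svs = [[0.0, 0.0], [1.0, 1.0]]
    betas = [0.5, -0.25]
    print(svr_predict([0.5, 0.5], svs, betas, bias=60.0, sigma=0.354))
```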
Framework for anomaly identification
The flowchart of the proposed methodology is shown in Fig. 2:

Flowchart of anomaly identification for a WT gearbox.
The framework of the anomaly identification of WT gearbox is as follows.
Based on the historical data of the SCADA system, an SVR prediction model of the gearbox temperature is built. The normal and abnormal monitoring data are used as inputs of the model to predict the temperature, and the relative errors e_ijk (i = 1, 2; j = 1, 2, 3, 4; k = 1, ⋯, t) are calculated, where i = 1 denotes the normal state, i = 2 denotes the anomaly state, j = 1, ⋯, 4 denotes the temperature item (the temperatures of the input shaft, the output shaft, the oil and the bearing), and k denotes the prediction step number. The improved inverse normal cloud generator is used to extract the parameters of the state clouds from the relative error sequences, and then the correlation degree k_j and the contribution degree w_j between the different state clouds are calculated. Finally, k_j and w_j are used to calculate the closeness of the current state to the normal cloud (ρ1) and to the abnormal cloud (ρ2). If ρ1 ⩽ ρ2, an anomaly alarm is raised.
In the proposed approach, the definition of the closeness degree ρ is used to describe the closeness of the current state to different state clouds, while ρ1 stands for the normal state cloud C1 with parameters
The calculation algorithm of the closeness degree
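Once the closeness degrees ρ1 and ρ2 are available, the decision step reduces to a comparison. The Python sketch below illustrates only this final rule (alarm when ρ1 ⩽ ρ2); it does not reproduce the closeness degree calculation itself, and the function names and sample values are hypothetical.

```python
def identify_state(rho_normal, rho_abnormal):
    """Apply the maximum closeness degree rule: anomaly when rho1 <= rho2."""
    return "abnormal" if rho_normal <= rho_abnormal else "normal"


def first_alarm_index(rho_normal_series, rho_abnormal_series):
    """Return the 1-based time point of the first anomaly, or None."""
    pairs = zip(rho_normal_series, rho_abnormal_series)
    for k, (rho1, rho2) in enumerate(pairs, start=1):
        if rho1 <= rho2:
            return k
    return None


if __name__ == "__main__":
    # Made-up closeness curves: the normal cloud dominates at first,
    # then the abnormal cloud takes over.
    rho1 = [0.9, 0.8, 0.7, 0.3, 0.1]
    rho2 = [0.1, 0.1, 0.2, 0.5, 0.8]
    print(first_alarm_index(rho1, rho2))
```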
An actual gearbox failure that occurred in a WT at the Jingzhou wind farm on March 21, 2012 is analyzed with the proposed approach. The SCADA system in this wind farm collects data at 60-s intervals. Because of thermal inertia, the gearbox temperature rises slowly, so 60-s resolution is not necessary and 10-min SCADA data are more suitable for this study; the data were therefore averaged into 10-min data sets, which were used to build the SVR models. Furthermore, the SCADA system includes 47 continuous monitoring items, not all of which are necessary for predicting the gearbox temperature. To reduce redundancy, a correlation analysis between the predicted target and the monitoring items was first carried out using SPSS 19.0 [15], and only the monitoring items whose Pearson coefficients exceed 0.8 were selected as inputs of the SVR model. Taking the temperature prediction model of the gearbox input shaft as an example, the inputs of the SVR model are the output shaft temperature of the gearbox, the oil temperature of the gearbox, the rotational speed of the WT, the impeller speed, and the bearing temperature of the gearbox. The historical SCADA data of the WT were collected and divided into two groups (normal and abnormal data), which were used to train the normal and anomaly SVR models, respectively. In this paper, the SVR models are trained and tested in Matlab 2009a by calling the LibSVM toolbox on a PC with a 2.6 GHz CPU and 20 GB of memory. LibSVM is a library for support vector machines [17]; its goal is to promote SVM as a convenient tool. The best parameters of the SVR model, namely the complexity penalization term C and the width of the Gauss kernel function σ, were chosen by ten-fold cross-validation; the results are C = 16 and σ = 0.354.
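The two preprocessing steps described above, averaging 60-s records into 10-min values and keeping only the monitoring items whose Pearson coefficient with the target exceeds 0.8, can be sketched in Python as follows. The study itself used SPSS for the correlation analysis; the function names, item names and numbers below are hypothetical.

```python
import math


def block_average(values, block=10):
    """Average consecutive non-overlapping blocks of `block` samples.

    Ten 60-s samples give one 10-min value; a trailing partial block
    is dropped.
    """
    return [
        sum(values[i:i + block]) / block
        for i in range(0, len(values) - block + 1, block)
    ]


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0  # constant series: treat as uncorrelated
    return cov / math.sqrt(var_a * var_b)


def select_inputs(target, candidates, threshold=0.8):
    """Keep the monitoring items whose coefficient exceeds the threshold."""
    return [
        name for name, series in candidates.items()
        if pearson(series, target) > threshold
    ]


if __name__ == "__main__":
    target = [60.0, 61.0, 62.5, 64.0]          # made-up target temperatures
    items = {
        "oil_temp": [55.0, 56.1, 57.4, 59.0],  # made-up monitoring items
        "ambient_temp": [12.0, 11.0, 12.5, 11.5],
    }
    print(select_inputs(target, items))
```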
Temperature prediction of gearbox in different states
The temperature of the gearbox input shaft is taken as an example to analyze the characteristics of the operating temperature trend in different states. The relative error e_REi is used to describe the evolution process of gearbox operating state, which is calculated from Equation (5).
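For illustration, the sketch below computes a relative error sequence under the assumption that the relative error is the absolute difference between the predicted and measured temperature divided by the measured temperature, expressed in percent. This is the usual definition and may differ in detail from Equation (5); the temperatures in the usage example are made up.

```python
def relative_errors(predicted, measured):
    """Relative error sequence in percent.

    Assumed definition: |predicted - measured| / |measured| * 100.
    """
    return [
        abs(p - m) / abs(m) * 100.0
        for p, m in zip(predicted, measured)
    ]


if __name__ == "__main__":
    measured = [60.0, 61.0, 62.0, 63.0]   # made-up measurements
    predicted = [60.3, 60.8, 62.4, 62.9]  # made-up SVR outputs
    errors = relative_errors(predicted, measured)
    print("average: %.2f%%" % (sum(errors) / len(errors)))
    print("maximum: %.2f%%" % max(errors))
```

The average and maximum of such a sequence are the two summary figures quoted for the normal and abnormal states.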
The prediction results for the normal state are depicted in Fig. 3: Fig. 3(a) shows the predicted results, and Fig. 3(b) the relative errors. The average relative error is 0.45% and the maximum relative error is 1.19%, which shows that the prediction accuracy of the model is high enough to reflect accurately the operating temperature trend of the gearbox input shaft.

The gearbox input shaft temperature prediction results for the normal state.
The SCADA system of the wind turbine gave a gearbox fault alarm at 2:41 on March 21, 2012. The prediction results for the abnormal state are depicted in Fig. 4: Fig. 4(a) shows the prediction results for the 70 minutes before the alarm, and Fig. 4(b) the relative errors. The average relative error is 18.07% and the maximum relative error is 24.44%, so the prediction errors increase sharply compared with the normal state.

The gearbox input shaft temperature prediction results for the abnormal state.
The sequences of relative errors are used as the input of the improved backward normal cloud generator, and the parameters of trend clouds are calculated by adopting algorithm 3. The calculation results are listed in Table 6.
The parameters of different trend clouds
The closeness degrees of the current state to the normal and abnormal clouds are calculated using algorithm 5. The SCADA system of the WT gave a gearbox fault alarm on March 21, 2012 at 14:27. In this paper, the SCADA data from the 30 hours before the fault occurred are collected to test the effectiveness of the proposed approach. The curves of the closeness degree to the normal and abnormal clouds are shown in Fig. 5.

Closeness degree curve of the normal cloud model.
As seen in Fig. 5, in the first half of the period the closeness degree to the normal cloud fluctuates but remains significantly greater than the closeness degree to the abnormal cloud. In the latter part, the closeness degree to the normal cloud decreases rapidly to near zero, while that to the abnormal cloud increases rapidly to a large value. The proposed model can therefore describe the transition from the normal state to the abnormal one. According to the principle of maximum closeness degree, the results of online identification are shown in Fig. 6. The proposed approach identifies the anomaly at time point 105, whereas the SCADA system raised the over-temperature alarm at time point 180. Since the sampling period of the data sets is 10 minutes, the proposed approach raises the alarm (180 − 105) × 10 = 750 minutes ahead of the SCADA system. In [10], based on standard deviation threshold crossing, the early failure detection time is 510 minutes before the first over-temperature alarm of the SCADA system, so the proposed approach gives an earlier warning.

Gearbox state online assessment results.
A novel approach to the online anomaly identification of WT gearboxes based on SCADA data is proposed. Selected monitoring items of the SCADA system are used to construct SVR models that predict the operating temperatures of the gearbox, and the relative error sequences between the predicted and measured values under different states are calculated using the trained SVR models. On this basis, an improved uncertainty normal cloud generator is used to extract the cloud parameters from the relative error sequences, and both normal and abnormal clouds are constructed from the extracted parameters. The concept of closeness degree is introduced to establish the anomaly identification model of the gearbox. Finally, the effectiveness of the proposed approach is confirmed on a real gearbox fault. Further research on applying the approach to other parts of the WT, such as the generator and bearings, is planned. Furthermore, the cloud model proposed in this paper is one-dimensional and cannot be applied to multidimensional decision problems, so a multidimensional cloud model needs to be studied for early warning of the overall WT unit.
Acknowledgments
This study was supported by the Major Technology Plan Project of Xiamen (3502Z2011008).
