Abstract
Supervisory control and data acquisition data, which include comprehensive signal information, have been widely applied to fault diagnosis. However, because of the complex operational conditions of wind turbines, supervisory control and data acquisition data are complicated and abstract to study. This article proposes a pitch fault diagnosis method for wind turbines in multiple operational states using supervisory control and data acquisition data. According to the performance of characteristic parameters in nine operational states of wind turbines, Gaussian mixture model clustering and the analysis of normal performance curves are applied to model the relationship of pitch angle, rotor speed, and wind speed. Four cases have been studied to demonstrate the feasibility of the proposed method. The advantages of the proposed approach are as follows: (1) simplifying the analysis of supervisory control and data acquisition data by dividing the data into nine parts; (2) detecting pitch faults earlier than the supervisory control and data acquisition monitoring system; (3) visualizing the abnormal behavior of the pitch system; and (4) improving the interpretability of the method by incorporating domain knowledge.
Introduction
In recent years, wind energy, as a form of “green energy,” has become the fastest growing renewable energy source, driven by the shortage of fossil energy and concern over global warming. In 2016, the wind industry added 54,642 MW of capacity, for a total of 486,790 MW at the end of the year (Global Wind Energy Council, 2016). As the cumulative installed capacity of wind turbines increases, it is vital to reduce operation and maintenance (O&M) costs to enhance the competitiveness of wind farms. Traditional reactive maintenance, which does not meet the need for cost savings, has been replaced by preventive maintenance. To check fault levels and prevent catastrophic faults, preventive maintenance shortens the time intervals between inspections. These periodic inspections are labor intensive and expensive, and the associated downtime reduces power generation. Condition-based maintenance (CBM) is a maintenance strategy that recommends maintenance actions based on the information collected through condition monitoring and fault diagnosis of wind turbines. It schedules maintenance at an optimum time according to the condition of the wind turbines, maximizing uptime while minimizing maintenance costs. Therefore, CBM has received much attention in recent years.
Most downtime of wind turbines is the result of subsystem faults. Monitoring the subsystems and scheduling maintenance accordingly is an effective way to improve the reliability of wind turbines. Many condition monitoring and fault diagnosis systems exist for the main bearing, generator, and gearbox, which are the highest cost subcomponents of a wind turbine. Several works (An et al., 2012; Lei et al., 2012; Sheng, 2016; Yang et al., 2008) have studied vibration analysis and oil monitoring. The pitch system plays a key role in capturing wind energy. It regulates the blade angle not only to maintain the efficiency of wind energy conversion, but also to protect wind turbines from high wind speeds and emergency situations. Wilkinson et al. (2010) noted, based on a statistical analysis of 31,500 downtime events, that the electrical pitch system accounted for 15.5% of failures and 20% of total downtime. The sub-assemblies with the highest failure rates and downtime in the public domain surveys reported by Chen et al. (2015) are shown in Table 1, in descending order of significance. The pitch system has the highest failure rate and ranks fourth in downtime. Therefore, condition monitoring and fault diagnosis of the pitch system can improve the reliability of wind turbines and substantially reduce downtime and O&M costs.
Wind turbine sub-assemblies’ failure rate and downtime.
The supervisory control and data acquisition (SCADA) system is a standard installation on large wind turbines. It provides comprehensive signal information including operational and state data, environmental information, alarms, and detailed fault logs. In addition to providing large collections of historical data, the SCADA system is also a condition monitoring system which can detect early faults and take simple actions to protect wind turbines, such as stop-restart. As condition monitoring and fault diagnosis using SCADA data is a potentially low-cost solution requiring no additional sensors, there have been many studies (Purarjomandlangrudi et al., 2013; Tautz-Weinert and Watson, 2017) on condition monitoring of wind turbines based on SCADA data. Chen et al. (2013) used the SCADA data of six known wind turbine pitch faults to train an a priori knowledge-based adaptive neuro-fuzzy inference system (ANFIS). The trained system was tested in a wind farm containing 26 wind turbines, and the results show a strong feasibility for wind turbine pitch fault prognosis. Kusiak and Verma (2010) built a data-mining-based prediction model to diagnose the blade angle asymmetry fault and the blade angle implausibility fault. The genetic programming algorithm gave the best accuracy and was selected to perform prediction at different time stamps.
A wind turbine is a combined mechanical and electrical system with complicated operational conditions. As the conditions change, SCADA data become complicated and abstract to analyze. Therefore, when SCADA data are used to monitor the condition of wind turbines, it is essential to study the multiple operational states of wind turbines. Kusiak and Verma (2012) used the rotor speed–wind speed curve, the power–wind speed curve, and the blade angle–wind speed curve to monitor a wind farm’s performance. The performance curve data are grouped into several clusters by k-means clustering for better identification of outliers. Yesilbudak (2017) developed a partitional clustering-based outlier detection approach to optimize the power curve, in which the local Mahalanobis distance of each data point to its cluster centroid is computed for detecting outliers. Ye et al. (2010) defined multiple states to distinguish different working conditions, including complete shutdown, under-performing states, abnormally frequent default states, and normal working states, and found that combinations of some states are strong indicators for fault detection. Nielsen et al. (2014) analyzed pitch motor torque and current as indicators of the condition of the pitch system and compared them with a theoretical model. The potential influencing factors, classified according to six operational states of wind turbines, are studied in detail. Their results should be taken into account whenever the pitch motor current or torque is used as an indicator of pitch system health. Bi et al. (2016) presented a detailed study of the power–generator speed curve and the pitch angle–generator speed curve according to the operational states of pitch-regulated wind turbines, in which the criteria for pitch fault detection change with the operational state. However, the operational states are segmented directly by wind speed, so deviations are likely in the transitional regions.
This article provides a novel pitch fault detection procedure in multiple states using SCADA data, based on a full understanding of wind turbine operational states and the pitch system working principle. According to the performance of characteristic parameters in the nine operational states of wind turbines, Gaussian mixture model (GMM) clustering and the analysis of normal performance curves are applied to model the relationship of pitch angle, rotor speed, and wind speed. The cases studied in this article demonstrate that the proposed method is an effective technique for diagnosing pitch faults and that it raises alarms earlier than the SCADA system.
This article is organized as follows. In section “Operational states of wind turbines,” nine operational states of wind turbines are analyzed. In section “Pitch control strategy,” the pitch control strategy is introduced. Section “Analysis of characteristic parameters in nine states” presents a description of the blade angle–rotor speed curves in different states. The diagnosis method of the pitch system is proposed in section “Fault diagnosis methods in multiple operational states.” In section “Cases study and discussions,” four cases from a wind farm in China are studied. Finally, the conclusions are presented in section “Conclusion.”
Operational states of wind turbines
According to the power generation process, the programmable logic controller (PLC) of the wind turbine control system defines nine operational states. The nine states are explained in Table 2. Each operational state has its own control strategy. As the working condition of a wind turbine changes, the state changes according to the following rules:
When the wind turbine is generating power normally, the state changes in the order four, five, six as the wind speed increases;
If the SCADA system detects a fault, the state changes in the order seven, one, two;
When a scheduled maintenance task is carried out, the state changes in the order seven, nine, two;
The wind turbine can restart automatically to clear potential failures, in which case the state changes from eight to three.
Nine operational states of wind turbines.
Pitch control strategy
With the development of large-scale wind turbines, the variable-speed and constant-frequency (VSCF) pitch-controlled wind turbine, which captures the maximum wind energy and delivers stable output, is increasingly becoming the mainstream design (Girsang and Dhupia, 2013). The control diagram of the VSCF pitch-controlled wind turbine is shown in Figure 1. The pitch control works as follows: When the wind speed is less than the rated speed, the blade pitch angle remains at

The control diagram of variable-speed and constant-frequency (VSCF) pitch-controlled wind turbine.
Analysis of characteristic parameters in nine states
Three characteristic parameters which are used to detect the pitch faults in this study are blade angle, rotor speed, and wind speed. In order to analyze the normal performance curves of the parameters, 1-s interval SCADA data are collected for 1 month from a healthy direct-drive wind turbine. Figure 2 shows the performance curve of blade angle and rotor speed, which is obviously piecewise. According to the discussion of wind turbine operational states in section “Operational states of wind turbines,” the SCADA data were divided into nine parts using the state code of wind turbines from the PLC system. The divided performance curves of blade angle and rotor speed in different states are shown in Figure 3. A further study of the performance curve in different operational states is discussed in the subsequent sections.
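The division of the SCADA data by PLC state code can be sketched as follows. This is a minimal illustration only: the record layout, the field names (`state`, `blade_angle`, `rotor_speed`, `wind_speed`), and the sample values are assumptions and do not reflect the actual SCADA export format.

```python
from collections import defaultdict

def split_by_state(records):
    """Group SCADA records by the PLC state code (1 to 9)."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["state"]].append(rec)
    return dict(groups)

# Hypothetical 1-s records; field names and values are illustrative assumptions.
records = [
    {"state": 6, "blade_angle": 0.0, "rotor_speed": 12.1, "wind_speed": 7.4},
    {"state": 6, "blade_angle": 3.2, "rotor_speed": 17.0, "wind_speed": 12.9},
    {"state": 5, "blade_angle": 16.0, "rotor_speed": 2.8, "wind_speed": 4.1},
    {"state": 2, "blade_angle": 86.0, "rotor_speed": 0.0, "wind_speed": 2.0},
]
by_state = split_by_state(records)
```

Each entry of `by_state` can then be plotted or modeled separately, which is what allows a different criterion to be applied in each operational state.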

Blade angle–rotor speed curve in all states.

Blade angle–rotor speed curves in the nine operational states.
State six
Wind turbines generate power only in state six. The three-dimensional (3D) performance curve of blade angle, rotor speed, and wind speed is shown in Figure 4. When the wind speed is low and the rotor speed is lower than the rated value, the blade angle will remain at 0°. The rated rotor speed is the value at which wind turbines produce rated power output. When the wind speed is higher and the rotor speed reaches the rated value, blade angle will be adjusted between 0° and 20° to keep power output around the rated value. State six is the key working state of the pitch system. In the next section, an outlier detection method based on GMM clustering used in state six is proposed.

Blade angle–rotor speed–wind speed curve in state six.
State five and state eight
In state five, wind turbines rotate the blades to capture wind energy and drive the generator for power generation. First, the blade angle decreases from 86° to 16°. Then, if the rotor speed is lower than 3.5 r/min, the blade angle remains constant at 16°. If the rotor speed is between 3.5 and 6.5 r/min, the blade angle decreases to 0°. When the rotor speed is higher than 6.5 r/min, state five converts directly to state six. In state eight, the blade angle increases from 0° to 86° for shutdown. The halt in state eight is for self-examination and maintenance, so it does not count as a fault shutdown. In both states, the blade angle changes continuously and monotonically, and the movement is not influenced by wind speed.
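The rotor speed thresholds described for state five can be written as a small lookup. This is only a sketch of the rule as stated in the text: the function name is hypothetical, the behavior exactly at 3.5 and 6.5 r/min is an assumption, and the initial move from 86° to 16° is taken as already completed.

```python
def state_five_target_angle(rotor_speed_rpm):
    """Blade angle (degrees) expected in state five after the initial
    86-to-16 degree move, or None when the turbine should hand over
    to state six. Boundary handling is an assumption."""
    if rotor_speed_rpm < 3.5:
        return 16.0   # rotor too slow: hold the blades at 16 degrees
    if rotor_speed_rpm <= 6.5:
        return 0.0    # intermediate speed: pitch down to 0 degrees
    return None       # above 6.5 r/min: state five converts to state six

targets = [state_five_target_angle(v) for v in (2.0, 5.0, 7.0)]
```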
Other states
In state one, wind turbines are shut down because of faults. Blade angle may be adjusted by manual maintenance. Hence, the data in state one cannot be used to detect faults.
In state two, state three, and state four, the blade angle stays at around 86°.
The blade angle increases from 0° to 86° in state seven and state nine. The curves shown in Figure 3 are from a healthy wind turbine, so there are no data points in state seven and state nine. When the SCADA system detects a fault, wind turbines are shut down in state seven. The normal curve in state seven is illustrated in Figure 11. In state nine, the shutdown is manual. The normal curve in state nine is illustrated in Figure 9.
Fault diagnosis methods in multiple operational states
In this article, the performance of characteristic parameters is respectively monitored in different operational states of wind turbines. The SCADA data used in this article come from six direct-drive wind turbines of the 2-MW class in a wind farm located in China. Data are collected at 1-s intervals from June 2016 until June 2017. Two of the wind turbines are known to be healthy, the data of which are chosen as the training set and validation set, respectively. The other wind turbines once had failures in the pitch system. The fault cases of the pitch system will be discussed in the case study.
Fault diagnosis method in state six
GMM clustering
The goal of cluster analysis is to find structures of individuals characterized by the greatest similarity within the same cluster and the greatest dissimilarity between different clusters. GMM (Nguyen and Wu, 2013) clustering is a model-based method. Each of the clusters is represented by a Gaussian distribution, the parameters of which are optimized. Assume that an n-dimensional random variable
where
where
In order to determine the parameters
The expectation-maximization (EM) algorithm (Lei and Jordan, 1996; Yang et al., 2012) is employed to compute the parameters in this article. Each iteration of the EM algorithm consists of two steps:
E-step. Estimate the probability that the data are generated by each component. By Bayes’ theorem, the a posteriori probability of the ith component is as follows
M-step. Update the parameters
The two steps are repeated until the value of the likelihood function converges.
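The two steps can be illustrated with a compact NumPy sketch. It uses the standard textbook E-step and M-step updates for a GMM with full covariances, which are assumed here to correspond to the update equations of this section. The initialization scheme, the small regularization term, the function names, and the synthetic data are illustrative assumptions, not details taken from this study.

```python
import numpy as np

def gaussian_pdf(X, mean, cov):
    """Multivariate normal density evaluated at each row of X."""
    d = X.shape[1]
    diff = X - mean
    solved = np.linalg.solve(cov, diff.T).T
    expo = -0.5 * np.sum(diff * solved, axis=1)
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(expo) / norm

def fit_gmm_em(X, k, n_iter=200, tol=1e-6):
    n, d = X.shape
    weights = np.full(k, 1.0 / k)
    # Assumed initialization: means spread along the first coordinate,
    # isotropic covariances scaled by the average variance of the data.
    order = np.argsort(X[:, 0])
    means = X[order[np.linspace(0, n - 1, k).astype(int)]]
    covs = np.array([np.eye(d) * X.var(axis=0).mean() for _ in range(k)])
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point.
        dens = np.column_stack(
            [weights[i] * gaussian_pdf(X, means[i], covs[i]) for i in range(k)]
        )
        total = dens.sum(axis=1, keepdims=True)
        resp = dens / total
        # M-step: update weights, means, and covariances from the posteriors.
        nk = resp.sum(axis=0)
        weights = nk / n
        means = (resp.T @ X) / nk[:, None]
        for i in range(k):
            diff = X - means[i]
            covs[i] = (resp[:, i, None] * diff).T @ diff / nk[i] + 1e-6 * np.eye(d)
        # Log-likelihood of the parameters used in this E-step.
        ll = np.log(total).sum()
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return weights, means, covs, ll

# Synthetic two-cluster data, for illustration only.
rng = np.random.default_rng(1)
X = np.vstack([
    rng.normal([0, 0], 0.5, size=(200, 2)),
    rng.normal([5, 5], 0.5, size=(200, 2)),
])
weights, means, covs, ll = fit_gmm_em(X, k=2)
```

The sketch evaluates densities directly rather than in log space, so it is suitable only for small, well-scaled examples; a library implementation would be preferable for the full SCADA data set.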
Normal performance curve in state six
State six is the key phase of all the states. In this phase, the pitch system is activated by high wind speed to control the power output. The 3D curve of blade angle, rotor speed, and wind speed is shown in Figure 4. In this study, GMM clustering is used to fit the normal performance curve in state six. The proposed fitting procedure is illustrated in Figure 5.

The fitting procedure of normal performance curve in state six.
First, data preprocessing is needed. In order to obtain stable data in state six and to delete the outliers coming from the SCADA system, the data in the first 120 s and the last 120 s of state six are removed. Before the GMM is selected, the data are centered and normalized to improve the computation.
Second, the parameters of GMM clustering are selected. Mixture modeling is a powerful statistical technique for unsupervised density estimation. GMM clustering divides the entire data space into several regions, and each region is modeled by a probability density which is usually chosen from a class of similar parametric distributions. Two important issues in GMM clustering are the selection of the number of mixture components and the covariance type. In this study, we use the Bayesian information criterion (BIC) to select the model parameters. The BIC measure is formulated as
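The preprocessing step can be sketched as below. The sketch assumes that the 120 s are trimmed from both ends of every contiguous run of state six in the 1-s data, and that "centralized and normalized" means scaling each variable to zero mean and unit variance; both are interpretations, and the function names are hypothetical.

```python
import numpy as np

def trim_state_runs(state, target=6, trim=120):
    """Mask that keeps samples of `target` state while dropping the first
    and last `trim` samples of each contiguous run (1-s data assumed)."""
    state = np.asarray(state)
    in_state = state == target
    keep = np.zeros(len(state), dtype=bool)
    # Pad with False so that runs touching either end are closed.
    padded = np.concatenate(([False], in_state, [False]))
    edges = np.diff(padded.astype(int))
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)  # exclusive end index of each run
    for s, e in zip(starts, ends):
        if e - s > 2 * trim:
            keep[s + trim:e - trim] = True
    return keep

def standardize(X):
    """Scale each column to zero mean and unit variance."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0)

# Illustrative state sequence: one long and one short run of state six.
state = np.array([5] * 10 + [6] * 300 + [8] * 10 + [6] * 100)
keep = trim_state_runs(state)
Z = standardize([[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]])
```

In this example the 300-s run contributes 60 samples, while the 100-s run is discarded entirely because it is shorter than the two trimmed margins combined.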
where
The BIC scores of four covariance types for different numbers of mixture components are shown in Figure 6. The full covariance and the diagonal covariance perform much better than the tied covariance and the spherical covariance. The BIC value is smallest when the covariance type is full and the number of components is 7, so the full covariance with seven components is adopted in this study. The clustering result of the data in state six is shown in Figure 7. The data in the same cluster are marked in the same color. The red symbol “x” in Figure 7(a) indicates the mean centroid of each cluster.
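A model search of this kind can be sketched with scikit-learn, whose `GaussianMixture` class offers the same four covariance types (`full`, `tied`, `diag`, `spherical`) and a `bic` method. The text does not state which software was used, so this is an assumption, and the synthetic data and search range below are purely illustrative.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def select_gmm_by_bic(X, max_components=10, seed=0):
    """Fit a GMM for every covariance type and component count and
    return (bic, covariance_type, n_components, model) with the lowest BIC."""
    best = None
    for cov_type in ("full", "tied", "diag", "spherical"):
        for k in range(1, max_components + 1):
            gmm = GaussianMixture(
                n_components=k, covariance_type=cov_type, random_state=seed
            ).fit(X)
            score = gmm.bic(X)
            if best is None or score < best[0]:
                best = (score, cov_type, k, gmm)
    return best

# Synthetic three-cluster data standing in for the normalized state-six data.
rng = np.random.default_rng(0)
X = np.vstack(
    [rng.normal(c, 0.3, size=(100, 2)) for c in ([0, 0], [3, 0], [0, 3])]
)
score, cov_type, k, gmm = select_gmm_by_bic(X, max_components=4)
labels = gmm.predict(X)
```

The cluster means of the selected model are available as `gmm.means_`, which play the role of the centroids marked in the clustering figure.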

The BIC scores of four covariance types for different numbers of mixture components.

The clustering result of data in state six: (a) the clustering result shown in the two-dimensional figure and (b) the clustering result shown in the three-dimensional figure.
The International Standard IEC 61400-12-1 (2005) provides a standard methodology for measuring the power performance characteristics of a single wind turbine. The measured power curve is determined by applying the “method of bins” for the normalized data sets, using 0.5 m/s bins and by the calculation of the mean values of normalized wind speed and normalized power output for each wind speed bin. This article divides the wind speed range into 0.5 m/s contiguous bins referring to this standard to fit the performance curve of the pitch system. The two-dimensional (2D) scatter plots of rotor speed and blade angle in different wind speed ranges are shown in Figure 8. There are several clusters in each wind slice. The mean Euclidean distances between individual data points and the corresponding cluster centroids in each wind speed slice are computed and represented by
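The binning step can be sketched as follows. The sketch assumes that each data point is compared with the centroid of the cluster it was assigned to, and that bins are aligned to multiples of 0.5 m/s; the function name and the toy values are hypothetical.

```python
import numpy as np

def mean_distance_per_bin(wind_speed, points, labels, centroids, bin_width=0.5):
    """Mean Euclidean distance between points and their assigned cluster
    centroids, computed separately for each wind speed bin.
    Returns a dict mapping the lower bin edge (m/s) to the mean distance."""
    wind_speed = np.asarray(wind_speed, dtype=float)
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    centroids = np.asarray(centroids, dtype=float)
    dist = np.linalg.norm(points - centroids[labels], axis=1)
    bin_index = np.floor(wind_speed / bin_width).astype(int)
    result = {}
    for b in np.unique(bin_index):
        result[float(b * bin_width)] = float(dist[bin_index == b].mean())
    return result

# Toy example: four (rotor speed, blade angle) points in two wind speed bins.
wind_speed = [3.1, 3.4, 3.6, 3.9]
points = [[0.0, 0.0], [3.0, 4.0], [1.0, 1.0], [1.0, 3.0]]
labels = [0, 0, 1, 1]
centroids = [[0.0, 0.0], [1.0, 1.0]]
per_bin = mean_distance_per_bin(wind_speed, points, labels, centroids)
```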

The blade angle–rotor speed curves in different wind speed ranges.
Evaluation indicator of faults
After fitting the normal performance curve of blade angle, rotor speed, and wind speed, the sum of relative deviation between the healthy data and the measured data D is applied to detect pitch faults. The fault diagnosis of the measured data takes place on an hourly basis in this article. The mean Euclidean distance of the measured data is denoted as
The normal threshold of D is determined using the validation set. All the values of D computed from the normal data in the validation set are less than 14.5, so the normal threshold of D is set to 14.5 in this study.
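The comparison against the threshold can be sketched as below. Because the exact expression for D is not reproduced here, the sketch assumes one plausible form, namely the sum over wind speed bins of the absolute deviation of the measured mean distance from the healthy mean distance, divided by the healthy value. Only the threshold of 14.5 comes from the text; the function name and the sample values are hypothetical.

```python
THRESHOLD_D = 14.5  # normal threshold reported in the text

def relative_deviation_sum(healthy, measured):
    """Assumed form of the indicator: sum over shared wind speed bins of
    |d_measured - d_healthy| / d_healthy."""
    total = 0.0
    for wind_bin, d_healthy in healthy.items():
        if wind_bin in measured and d_healthy > 0:
            total += abs(measured[wind_bin] - d_healthy) / d_healthy
    return total

# Hypothetical mean distances per wind speed bin for one hour of data.
healthy = {3.0: 0.5, 3.5: 0.4}
measured = {3.0: 1.0, 3.5: 0.4}
D = relative_deviation_sum(healthy, measured)
alarm = D > THRESHOLD_D
```

Since diagnosis takes place on an hourly basis, a function of this kind would be evaluated once per hour of state-six data.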
Fault diagnosis method in state five and state eight
In state five and state eight, the blade angle varies continuously and monotonically. The track of the blade angle between 20° and 80° is fitted with a straight line using the least squares method.
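The line fit can be sketched with `numpy.polyfit`. The 20° to 80° window comes from the text; the pitch rate of 2°/s in the synthetic track, the function name, and the use of the root-mean-square residual as a quality measure are illustrative assumptions.

```python
import numpy as np

def fit_pitch_track(t, angle, lo=20.0, hi=80.0):
    """Least squares line through the blade angle samples that lie
    between `lo` and `hi` degrees. Returns (slope, intercept, rms residual)."""
    t = np.asarray(t, dtype=float)
    angle = np.asarray(angle, dtype=float)
    mask = (angle >= lo) & (angle <= hi)
    slope, intercept = np.polyfit(t[mask], angle[mask], deg=1)
    residual = angle[mask] - (slope * t[mask] + intercept)
    rms = float(np.sqrt(np.mean(residual ** 2)))
    return float(slope), float(intercept), rms

# Synthetic state-five track: blade angle falling from 86 degrees at 2 deg/s.
t = np.arange(0, 44)
angle = 86.0 - 2.0 * t
slope, intercept, rms = fit_pitch_track(t, angle)
```

A track that stalls or jumps would show up as a slope far from the healthy value or as a large residual, which is one way such a fit could be used to flag abnormal pitch movement.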
Fault diagnosis method in state two, state three, and state four
In state two, state three, and state four, the blade angle is limited by the limit switch. The normal range of the blade angle is 85°–89°. If the blade angle falls outside this range, the diagnosis system raises an alarm.
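This range check is simple enough to state directly in code. The bounds are those given in the text; the function name and the treatment of the bounds as inclusive are assumptions.

```python
def blade_angle_alarm(angle_deg, low=85.0, high=89.0):
    """True when the blade angle lies outside the normal range for
    states two, three, and four (bounds assumed inclusive)."""
    return not (low <= angle_deg <= high)

flags = [blade_angle_alarm(a) for a in (86.0, 84.2, 90.0)]
```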
Fault diagnosis method in state seven and state nine
The blade angle increases from 0° to 86° to slow down the rotor in state seven and state nine. Under normal conditions, when the rotor speed decreases continuously, the pitch angle increases continuously. The average rotor speed and the average pitch angle of the first 5-s data in the two states are, respectively, denoted as
Cases study and discussions
In order to validate the feasibility of the proposed fault diagnosis method, four cases of pitch system faults have been detected and analyzed in this section. Results were compared with SCADA alarm information.
Case 1
In the first case, a wind turbine raised a pitch system fault alarm and entered the shutdown state at 12:03 am on 28 January 2017. The maintenance personnel found that the angle encoder connection of blade 3 was loose. This fault decreases the accuracy of angle control and causes a serious position deviation of the blades during moving and braking. The SCADA data collected for 3 days from 26 January 2017 to 28 January 2017 are shown in Figure 9. There are obvious outliers in state seven, which are outlined by a red rectangle. The abnormal data distribution in state six cannot be seen directly.

The scatter plot of SCADA data collected from 26 January 2017 to 28 January 2017 in case 1.
The alarms from the proposed diagnosis system are marked as 1. The alarms from the SCADA system are marked as 0.5, as shown in Figure 10. The graph’s horizontal axis shows the time range (26–28 January 2017). There are alarms in state six and state seven. The earliest alarm from state six which is denoted by the green square is 5 h earlier than that from the SCADA system. The alarm from state seven which is denoted by a purple circle and that from the SCADA system ring at the same hour.

The comparison of alarms in case 1.
Case 2
In the second case, a turbine was shut down at 7:36 pm on 18 June 2017 due to a polluted slip ring, which made the communication between the pitch actuators and the main controller unstable. The pitch angle could not change because the pitch actuators did not receive any command from the control system.
The SCADA data collected from 16 June 2017 to 18 June 2017 are shown in Figure 11. Abnormal data in state five and state six are outlined by red rectangles. The comparison of alarms is illustrated in Figure 12. The first two alarms, from state five and state six, occur at the same time. The earliest alarm is 18 h earlier than the alarm from the SCADA system.

The scatter plot of SCADA data collected from 16 June 2017 to 18 June 2017 in case 2.

The comparison of alarms in case 2.
Case 3
In the third case, a driver failure of blade 3 of a turbine occurred at 9:10 am on 26 April 2017. The angle of blade 3 could not follow the set value. After maintenance on blade 3, no further fault was alarmed. The SCADA data collected from 24 April 2017 to 26 April 2017 are shown in Figure 13. In state five, state six, and state seven, outliers are outlined by red rectangles. The comparison of alarms is shown in Figure 14. The earliest alarm from state six is 16 h earlier than that from the SCADA system. The alarm from state five is 12 h earlier than that from the SCADA system. The alarm from state seven and that from the SCADA system occur at the same time.

The scatter plot of SCADA data collected from 24 April 2017 to 26 April 2017 in case 3.

The comparison of alarms in case 3.
Case 4
In the fourth case, a fault was alarmed at 2:21 am on 30 August 2016 by the SCADA system. Inspection showed that a limit switch had been jammed by a towel. The limit switch is installed on the rotor to control the extreme position of the blades. The SCADA data collected from 28 August 2016 to 30 August 2016 are shown in Figure 15. The outliers in state seven are outlined by two red rectangles. The alarms are illustrated in Figure 16. The proposed method alarms only in state seven, at the same time as the SCADA system.

The scatter plot of SCADA data collected from 28 August 2016 to 30 August 2016 in case 4.

The comparison of alarms in case 4.
Discussions
The proposed fault diagnosis method of the pitch system detects the faults in all four cases. In cases 1 to 3, its earliest alarm comes 5, 18, and 16 h earlier than the SCADA alarm, respectively; in case 4, both alarm at the same time. The fault types and the alarm model of the above cases are presented in Table 3. The communicational fault always affects state six. The actuator fault has the most extensive influence on the data. The mishandling fault can be detected easily. These statistics provide some guidance for diagnosing pitch system faults and distinguishing the fault types, although more fault cases need to be analyzed.
Fault type and alarm model statistics of the four cases.
Conclusion
The aim of this research was to develop a simpler and more visual fault diagnosis method for the pitch system using SCADA data. Based on a full understanding of wind turbine operational states and the pitch system working principle, three characteristic parameters of the pitch system were analyzed in the nine operational states of wind turbines. GMM clustering and the analysis of normal performance curves were applied to model the normal behavior of the pitch system in multiple states. The proposed method proved effective on four faults of three types and, in three of the four cases, raised alarms hours earlier than the existing SCADA monitoring system. In addition, compared with the data mining method, the proposed approach can visualize the abnormal behavior of the pitch system during different operational states. The incorporation of the pitch control strategy improves the interpretability of the diagnosis method.
The four cases studied suggest that the operational states in which a fault appears provide some guidance for identifying the type of pitch system fault. More pitch failures are needed to confirm this. In future work, anomalies in multiple operational states should be studied for fault classification, and more characteristic parameters should be considered.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Nature Science Foundation of China (No. 61573046).
