Condition monitoring and fault diagnosis of hydropower generator based on LSTM correction model

Abstract

Hydropower station is vital for the stable growth of the national economy. How to timely warn the possible faults of hydropower stations has become an increasingly popular research topic. The traditional detection model is difficult to detect the small abnormal changes in the data, and these abnormal changes are often the precursor of faults. To improve the sensitivity of the traditional detection model, this study introduced a weight factor into the traditional LSTM detection model. By using the correction mechanism, the LSTM correction model makes the prediction model never deviate from the normal track following the appearance of abnormal data. This ensures that the model can generate large residuals after abnormal data occur so that we can detect these abnormal data in time. Finally, this paper puts forward two factors related to equipment health and integrates these two factors to form a health index. The results show that the LSTM correction model based on the health index can not only detect small changes that cannot be detected by traditional detection models but also knows the wear and tear of equipment during operation based on the changes in health indicators.

Keywords

Hydropower station LSTM correction mechanism anomaly detection health factors

1 Introduction

The normal operation of hydropower plants is the guarantee for the rapid growth of the national economy. With the rapid development of hydropower plant automation equipment and the continuous expansion of the capacity of hydropower station generator sets, it is of great significance for power plants to use big data to analyze the degradation degree of equipment components and make a timely prediction of unit faults [1]. A variety of sensors are installed on the generator set of the hydropower plant, and researchers take various measures to ensure the accuracy of these sensors data [2,3, 2,3]. Those sensors monitor the running status of the equipment in real-time, monitoring data can not only reflect the characteristics and state of the hydropower plant but also provide a maintenance basis for maintenance personnel. Maintenance personnel can effectively prevent faults by monitoring sensors data.

In recent years, researchers around the world have devoted themselves to the research of equipment fault warning. The relevant status is summarized as follows:

Fault diagnosis and early warning are often considered as classification issues. Zhao [4] put forward the deep residual shrinkage network. Different from fixed threshold classification, this algorithm noticed unimportant features through the attention mechanism and set them to zero by the soft threshold function. And Wang [5] collected the spindle current signal into the particle swarm optimization support vector machine model and then evaluated the performance of the spindle. Plakias et al. [6] proposed a focused dense convolutional neural network (ADCNN). This algorithm combined dense convolution blocks with attention mechanisms and can effectively detect fault data. Song et al. [7] proposed a fault detection method based on standardized knn (SKNN). The method developed a standardized distance to characterize the distance between data and its neighbors, taking into account both inter-pattern and inter-pattern scale information. Through the experiment, the result showed that this algorithm had high superiority in fault diagnosis. Wu et al. [8] proposed a semi-supervised fault diagnosis method called hybrid classification autoencoder. This algorithm modifies the mean square error (MSE) commonly used in unsupervised autoencoders, and it can obtain fairly high diagnosis accuracies. Mo et al. [9] proposed a new weighted cyclic harmonic-to-noise ratio (WCHNR) index, this index can be used to directly evaluate the quality and quantity of harmonics in the square envelope spectrum of bearing characteristic frequencies, so that fault harmonics can be detected. Selak et al. [10] applied a support vector machine to data detection of hydropower station generators and effectively realized fault classification. Zhu et al. [11] proposed a monitoring model based on nuclear independent component analysis (NICA) and principal component analysis (PCA), which was applied to the environment of hydropower units to detect the occurrence of faults more effectively. Liu et al. [12] developed a potentially optimized stable generative adversarial network for fault data with a small number of samples and verified that the proposed model can perform fault diagnosis through fault diagnosis experiments of bearings. These algorithms have high detection accuracy when detecting statistical anomalies, and does not need to annotate the data set in advance. But these algorithms don’t take into account the effect of time on the data.

Other common fault detection algorithms are to set a fixed threshold, once the monitored data exceeds this threshold, it will be labeled as abnormal data. Hundman et al. [13] adopted the LSTM dynamic detection algorithm, they exploited historical operating data under normal conditions to predict running data for the next period. The threshold is then set based on the mean and standard deviation of the residual. Li et al. [14] proposed an enhanced weighted voting (EWV) combination strategy with a specific class threshold to achieve selective ensemble learning. Finally, the BAS algorithm is used to adaptively select the optimal class-specific threshold. Experimental bearing data verify the effectiveness of the proposed method. Tojeiro et al. [14] proposed the implementation of a fault detection scheme for photovoltaic systems based on neural fuzzy model identification. The residuals are evaluated to determine which threshold is more favorable for robust fault detection, experiments show that this algorithm is highly detectable. Youssef et al. [16] proposed an algorithm of Kullback-Leibler divergence (KLD) analysis based on the probability distribution function. Santis et al. [17] detected abnormal points before the occurrence of hydropower station accidents by applying the isolated forest algorithm to the abnormal detection of hydropower stations. These algorithms set the threshold according to prior knowledge and can perform the fault diagnosis well.

However, this kind of algorithm relies too much on the setting of the threshold value, once the threshold is not accurate, it will inevitably affect the monitoring process and may lead to misdiagnosis. In order to solve this problem, researchers at home and abroad have begun to study dynamic indicators to judge whether the equipment fails. Chen et al. [18] proposed a threshold self-setting HCM scheme (DCGAN) based on a deep convolution generative adversarial network to define a self-setting threshold to monitor wind turbine bearings. It can automatically create the threshold model of HCM through the G network output in DCGAN, thus solving the difficult problem of threshold setting. Yang et al. [19] proposed a new signal reconstruction modeling technique based on the support vector regression model. In addition, three statistical indicators (deviation index, volatility index, and significance) are proposed to quantify the degree of deviation from normal state to abnormal state. Lin et al. [20] used the improved global empirical mode decomposition (MEEMD) scalar index to obtain the bearing state. Moreover, they used the improved health index (MHI) to test different fault modes, which overcame the shortcomings of the health index as a single fault mode classifier and deep neural networks (DNNs) as a multi-fault mode classifier.

In addition, in order to conduct real-time monitoring of equipment and effective fault warning, researchers introduced the time series model to fault diagnosis and equipment life prediction. Zhang et al. [21] proposed an adaptive method based on KF-EM-RTS to predict battery RUL, and experimental results show that the algorithm can accurately predict battery degradation. Huang et al. [22] presented a prediction method based on BILSTM. This method can integrate multiple sensors data with operational conditions data for RUL prediction of engineered systems. Zhang et al. [23] proposed a new BiGRU-TSAM network, by assigning different weights to different time sequences, this method can predict the life of equipment very well. Zhang et al. [24] established a single-cell dynamic degradation model based on UPF, and introduced the wilcoxon rank sum test to determine the capacity regeneration point, so as to reduce the prediction error, and achieved good results in the remaining life experiment of 18650 lithium-ion batteries.

Due to the excellent performance of LSTM in time sequence modeling, many anomaly detection methods based on LSTM have been proposed. Kim et al. [25] used CNN+LSTM detection to detect abnormal network data. Their experiment proves that their method achieves almost perfect anomaly detection performance for network traffic data. In order to realize fault detection of the Internet of things, Yao et al. [26] adopted the BILSTM algorithm to analyze abnormal data and finally detected abnormal data and fault types. Fernando et al. [27] proposed an LSTM+Soft+Hardwired mechanism to improve the speed and accuracy of fault detection. Bian et al. [28] applied particle swarm optimization (LSTM-Attention) to electric quantity anomaly detection. Their model uses an attention mechanism to give different weights to the hidden state of LSTM so that it can detect abnormal electric quantities more accurately. These algorithms have good sensitivity to data with sequential characteristics and are often used to predict the life of the equipment and monitor the operation status of the equipment.

Although LSTM has a strong predictive ability in time series, it also causes a problem. The appearance of abnormal data will make LSTM fit with abnormal data and deviate from the normal model track. We want the predicted data to be quite different from the abnormal data so that our detection model can produce a large residual after the occurrence of abnormal data and we can detect these abnormal data in time. In addition, most of the current fault warning methods are based on the residual to set the fault threshold but do not convert the residual into the equipment fault probability model. In order to solve all the above situations, this paper proposes the LSTM correction model based on the health index. The contribution of this paper is listed as follows:

The weight mechanism is introduced into the LSTM model so that our prediction model will not be affected by abnormal data and deviate from the normal track. When abnormal data appears, this model can generate a larger residual, which makes it easier to exceed the set threshold, so as to realize the function of fault warning.

Compared with other detection models, our model can identify small abnormal fluctuations in the early stages. Thus, the model can better carry out fault warning and avoid unnecessary losses.

This paper converts the residual between the actual data and the predicted data into a probability model for assessing equipment fault and presents health indicators related to the running state of the equipment. By observing the changing trend of equipment health factors, maintenance personnel can know the running state of equipment.

2 LSTM correction model

2.1 LSTM model

Because the data of hydropower stations is collected according to time, it is more accurate to use the network with time memory for fault warning. RNN model can be used for the prediction of various time data, but the problem of vanishing gradient or gradient explosion may occur in the process of prediction. LSTM model is improved based on RNN. LSTM has a forgetting gate, input gate, and output gate, which overcomes the vanishing gradient problem [29]. Therefore, this paper chooses LSTM as the prediction model. The LSTM unit structure is illustrated in Fig. 1:

Fig. 1

LSTM unit structure.

LSTM transmission unit can be expressed as follows:

$f_{t} = σ [w_{f} (h_{t - 1}, x_{t}) + b_{f}]$ (1) $i_{t} = σ [w_{i} (h_{t - 1}, x_{t}) + b_{i}]$ (2) $\tilde{C_{t}} = Re lu [w_{c} (h_{t - 1}, x_{t}) + b_{c}]$ (3) $C_{t} = f_{t} * C_{t - 1} + i_{t} * \tilde{C_{t}}$ (4) $o_{t} = σ [w_{o} (h_{t - 1}, x_{t}) + b_{o}]$ (5) $h_{t} = o_{t} * Re lu (C_{t})$ (6)

As shown in Fig. 1 and the Equations (1)—(6), t represents the time, the subscript t represents the value of the current moment, and the subscript is the value of a moment before in the t-1. f_t represents the forgetting gate used to determine how much information about output h_t-1 and input x_t is forgotten. The input gate i_t indicates how much information is retained, $\tilde{C_{t}}$ and C_t are the immediate and long-term states respectively, σ and Relu are sigmoid function and linear rectifier activation function respectively. The weights w and deviation b of different subscripts are the parameters to be trained by the model. The output gate o_t determines which states to output, and h_t represents the final output value of the LSTMmodel.

2.2. Correction mechanism

LSTM has a strong predictive ability in time series [30], but it also causes a problem. The appearance of abnormal data will make the LSTM model fit with the abnormal data, so that the generated residual is below the threshold line, resulting in the failure to detect these abnormal data in time. Therefore, we want that the value predicted by the improved model will always be within an acceptable range so that large residuals can be generated when abnormal data appear. By considering the time factor and correction mechanism, the improved detection model can more effectively detect the equipment and fault warning. Inspired by literature [31, 32], the process of the LSTM correction model is shown in the figurebelow:

As shown in Fig. 2, the procedure of the LSTM correction model algorithm is as follows:

Fig. 2

LSTM correction model.

Because the LSTM correction model is used to detect whether abnormal data exists in real-time data, we first need to train the LSTM model with the training set data, and then the trained LSTM model is weighted to form the LSTM correction model.

Step1: Divide the data of the hydroelectric generator into two parts. One part is represented as the data without fault, which is used as the training set. The other part is the data about 12 hours before the fault as a test set.

Step2: Use the data of the training set to train the model and get a well-trained LSTM model

Step3: Use the trained LSTM model to predict the test set data, and weighted correction of the predicted value according to formula (7) and formula (8). So as to ensure that when abnormal data occurs, the weighted predicted value will produce a large residual difference from the abnormal data.

Step4: Subtract the actual data from the corrected forecast data and take the absolute value to obtain the residual data.

The LSTM correction formula can be expressed as follows: $h_{correct} = α {\hat{h}}_{t} + β x_{t - 1} + λ μ$ (7) $u = \frac{\sum_{j = 1}^{t - 1} x_{j}}{t - 1}$ (8) where ${\hat{h}}_{t}$ represents the value of the current moment predicted by LSTM at the previous moment h_t-1, u is the mean value of the data used in the training set (where the training set is the data before no anomaly occurs), h_correct represents the value of the corrected data, and the sum of α, β and λ is 1.

The residual is obtained by subtracting the actual data from the corrected forecast data: $e_{t} = | h_{correct} - x_{t} |$ (9)

The generated residuals will provide a calculation basis for subsequent health evaluation indicators.

3 LSTM correction model based on the health index

In order to know the wear condition of equipment during operation, two indicators related to equipment health are set to evaluate the equipment condition. The flow of health indicator production is shown in Fig. 3.

Fig. 3

Health indicator generation framework.

As shown in Fig. 3, the health indicator generation framework is as follows:

The health indicators are used to judge the probability of device failure in this paper, we extract three new statistical indexes from the residual distribution and transform them into the failure probability indexes of equipment.

Step 1: Calculate the residuals generated from the normal operation data of the equipment by Equations (9).

Step 2: Calculate the mean and standard deviation of residuals during normal operation of the device, and the mean and standard are denoted as u₀ and σ₀ respectively.

Step 3: Calculate the mean and standard deviation of the test set residuals, and denoted as u₁ and σ₁ respectively.

Step 4: According to the literature [33], if a potential fault occurs, the running data of the equipment may deviate from the normal track. In order to represent the deviation of data from water, we define the deviation index (DI), and the formula of the deviation index is as follows: $DI = (| μ_{1} - μ_{0} |) / σ_{0}$ (10)

When a fault is about to occur, the device data usually increases. In order to judge whether the increase in data is accidental or caused by faults, we define the kurtosis index (KI). The formula of the kurtosis index is as follows:

K I = \frac{\frac{1}{n} {\sum x_{i}^{4}}_{i = 1}^{n}}{μ^{4}}

Step5: Use the corresponding formula to calculate the probability value of equipment fault.

PDI is the probability of DI. Since the distribution of deviation index (DI) follows the probability distribution, the cumulative distribution function is selected to calculate the probability of deviation index. The calculation formula is as follows: $PDI = 2 * Φ (DI * α) - 1$ (12)

Where Φ (x) is the cumulative distribution function of the standard normal distribution, and α is the ratio parameter set by the user, it represents in this formula the scaling factor which can be regarded as the significance index.

PKI is the probability of the Kurtosis index (KI) using the tanh function. The tanh function is selected as the activation function of the significance index because when the input of this function is larger than a certain value, its increase becomes slower and eventually approaches 1. Therefore, when the PKI increases significantly, we can conclude that the equipment is or will be in a state of fault. The formula is as follows: $PKI = tanh (KI * β)$ (13)

In this formula, β represents the specific factor set by the user, which is used to help the user discover the abnormal situation of the device earlier.

The health index (HI) can measure the health status of a device in real time. It takes into account both PDI and PKI indicators. The smaller the HI index is, the healthier the device becomes. Health indicators are defined as follows: $HI = PDI * PKI$ (14)

To sum up, the LSTM correction model based on the health index framework of this paper is shown in Fig. 4, including two steps:

Fig. 4

LSTM correction model based on the health index framework.

Step 1: Establish the LSTM correction model.

Step 2: Use generated residual to establish the health indicators.

4 Case study

The LSTM correction model based on the health index proposed in this paper is verified by real-time monitoring data from a small hydropower station in southern Brazil. The plant has an installed capacity of 22.6 MW and provides energy for the local population. The main single Hydro-generator unit records data every five minutes. The detection equipment monitors a total of six parts. These are: generator apparent power, bearing hydraulic lubrication unit (HLU) inflow, and different positions of the four bearings vibration: axial, vertical radial, horizontal radial, and coupled.

In this experiment, the monitoring variable of generator apparent power is selected as the research object. The research time is selected from mid-January 2019, early February 2019, and early April 2019 respectively. During these periods, the apparent power data is shown in the following figures:

4.1 Data standardization

In order to facilitate the calculation of the model, we normalized the normal operating data to [0, 1], and processed the data of the test set in the sameway. $x_{train} = \frac{x - min (x)}{max (x) - min (x)}$ (15)

Where x is an original value, x_train is the normalized value for training.

4.2 Experimental comparison of the LSTM detection model and LSTM correction detection model

This chapter selects the normal data 1—3 days before the fault as the training set, and the data on the day when the fault occurs as the test set. We use the LSTM model and the LSTM correction model respectively to detect the above data, to see which model can realize the fault warning as soon as possible.

In order to compare the detection effect of the LSTM model and LSTM correction model, the 3-Sigma threshold is used for both model thresholds.

As shown in Fig. 6, this paper uses the LSTM model to detect whether the data of apparent power in mid-January 2019 has abnormal data. The green line represents the data predicted by the LSTM model, the red line represents the training set and test set data, and the blue line represents the residual, which in this paper represents the difference between the predicted data and the training set data, as well as the difference between the red and green lines in the figure. The black cross indicates the time when the fault occurs, and the red dashed line represents the threshold in this article. In the figure, the abscissa represents the time and the ordinate represents the data of apparent power.

Fig. 5

Apparent power monitoring char.

Fig. 6

LSTM detection map in mid-January 2019.

In Fig. 7, the LSTM model is first trained by using the data of the training set and ensure that the precision and accuracy of the LSTM model are within the acceptable range. Secondly, this paper uses the trained LSTM model to predict the data of the training set and uses the green line to indicate the predicted value. Then the trained LSTM model is weighted to obtain the LSTM correction model. Finally, the LSTM correction model is used to predict the test set data, and the predicted value of the test set is represented by the light blue line.

Fig. 7

LSTM correction model detection in mid-January 2019.

The LSTM and LSTM correction models are used to train the data of apparent power in early February 2019. The detection results are shown in Figs. 8 and 9 respectively.

Fig. 8

LSTM detection map in early February 2019.

Fig. 9

LSTM correction model detection in early February 2019.

The LSTM and LSTM correction models are used to train the data of apparent power in early April 2019. The detection results are shown in Figs. 10 and 11 respectively.

Fig. 10

LSTM detection map in early April 2019.

According to the monitoring data of the hydropower station, the power station failed separately at 12 : 08 : 47 on January 18, 2019, 23 : 57 : 53 on February 3, 2019, and 09 : 06 : 14 on April 4, 2019.

According to the data from mid-January and early February 2019, it can be found that the LSTM model will follow the abnormal data to fit, and cannot give a timely warning of faults. However, the LSTM correction model can detect that the residual tends to increase before the fault occurs, and the residual exceeds the threshold in a period of time before the fault occurs. The improved model will help maintenance personnel to warn in time and avoid greater losses.

According to the data in early April 2019, when the data is less volatile, both LSTM and LSTM correction models do not give early warning before faults, which may be related to the small weight setting of our LSTM correction model. However, with help of the health indicators below, we can give timely warning before faults according to the fluctuations of health indicators.

The above cases prove that the LSTM correction model can overcome the insensitivity of the traditional LSTM detection model. In the early stages of the fault, it can detect small abnormal changes in the data.

4.3 Experimental verification of health index

In this section, we use the LSTM correction model to obtain the residual of the equipment fault day in January 2019, the residual of the equipment fault day in February 2019, and the residual of the equipment fault day in April 2019 respectively. These residuals also represent the residuals of the test sets in Figs. 7, 9, and 11 in the 4.2 section, which are plotted separately in this section as follows:

Fig. 11

LSTM correction model detection in early April 2019.

This paper uses Equations (10)—(14) to process the residual data mentioned above and obtain the health indicators of the equipment, the results are plotted in the following figures. Where the abscissa represents the time and the ordinate represents the probability of equipment fault.

The PDI, PKI, and HI indices are applied to the residual data, the threshold is set based on a confidence interval of 0. 99 for data without failure, and the HI index threshold is taken as 0.5. And the results are shown in the figures below:

The PDI index is sensitive to data deviation, so it can well reflect the fluctuation of data. According to analyze the PDI in Fig. 13(a), we can discover that at 0 o ’clock on the 18th, the index increased greatly, and the possibility of fault exceeds 0.5.

Fig. 12

Prediction residuals of apparent power monitoring using LSTM correction model.

Fig. 13

Indicator chart of January 18 residual data.

The PKI can avoid data offset caused by some noise, so analyzing PKI can avoid the possibility of false positives. Based on the analysis of the PKI in Fig. 13(b), it can also be found that the possibility of fault exceeds 0.5 at about 1 o ’clock on the18th.

The HI index combined with the characteristics of the above index, it can accurately estimate the probability of equipment fault. This index in Fig. 13(c) exceeds 0.5 at 08 : 37 on the 18th, and it can give a warning four hours before the equipment fault occurs. If effective interference is carried out on the equipment at this time, the fault can beavoided.

The PDI, PKI, and HI indices are applied to the residual data from early February 2019, the results are shown in the figures below:

By analyzing Fig. 14, we can know the equipment fault occurred at 23.52 on February 3, while the HI index began to increase continuously around 14 : 30 and exceeded 0.5 around 15 : 45, allowing for an early warning 11 hours before the fault.

Fig. 14

Indicator chart of February 3 residual data.

The PDI, PKI, and HI indices are applied to the residual data from early April 2019, and the results are shown in the figures below:

Fig. 15

Indicator chart of April 2 residual data.

The PDI and PHI analyze the probability of fault of hydropower stations from different aspects, and we only need to consider the combination of the two (HI index), we can accurately get the probability of equipment fault.

Fig. 16

BILSTM detection map in early February 2019.

Fig. 17

CNN+LSTM detection map in early February 2019.

Figure 15(c) shows that the HI index increased slightly at around 03 : 00, 04 : 00, and 05 : 00 on the day of the accident. And this index kept increasing after 07 : 30, exceeding 0.5 at around 08 : 30. If effective interference is carried out on the equipment at this time, the fault can be avoided.

By comparing the detection results from April 2 of the LSTM correction model (in Fig. 11) and LSTM correction model based on the health index (in Fig. 15(c)) respectively. We can find that the LSTM correction model considering health indicators has higher sensitivity to the early minor abnormal data, and can also play a more important role in the early warning of equipment fault.

4.4 Algorithm comparison

In this paper, the LSTM correction model based on the health index is used to give a timely warning of possible faults, and the result is compared with the LSTM model, LSTM correction model, CNN+LSTM model, BILSTM, and extended isolation forests algorithm.

Since the fault detection algorithm of the LSTM model has been mentioned above, and the extended isolation forests algorithm has been applied in literature [17], this chapter only discusses CNN+LSTM and BILSTM algorithms.

Taking early February 2019 and early April 2019 as examples, we respectively use CNN+LSTM and BILSTM models to verify the results, and the results are as follows:

Fig. 18

BILSTM detection map in early April 2019.

Fig. 19

CNN+LSTM detection map in early April 2019.

Finally, in this chapter, all the above algorithms are applied to the apparent power of hydropower stations in January, February, and April 2019, and the results are summarized in the following table. In the following table, the first column represents the name of the algorithm, the second column represents the time when the equipment fault occurred, and the third column represents the time when the algorithm detected the fault. NULL in the following table indicates that the algorithm did not detect abnormal changes in the data before the fault occurred and the following table shows the algorithm of this paper in bold.

Table 1

Early warning of data in mid-January by each model

Algorithm	Fault time	Detect fault time
LSTM detect model	01-18 12 : 08	NULL
LSTM correction model	01-18 12 : 08	01-18 8 : 57
BILSTM model	01-18 12 : 08	NULL
CNN+LSTM model	01-18 12 : 08	NULL
extended isolation forests algorithm [17]	01-18 12 : 08	01-18 10 : 30
LSTM correction model based on the health index	01-18 12 : 08	01-18 8 : 37

Table 2

Early warning of data in early February by each model

Algorithm	Fault time	Detect fault time
LSTM detect model	02-03 23 : 52	NULL
LSTM correction model	02-03 23 : 52	02-03 15 : 47
BILSTM model	02-03 23 : 52	02-03 21 : 27
CNN+LSTM model	02-03 23 : 52	02-03 17 : 17
extended isolation forests algorithm [17]	02-03 23 : 52	02-03 16 : 00
LSTM correction model based on the health index	02-03 23 : 52	02-03 15 : 47

Table 3

Early warning of data in early April by each model

Algorithm	Fault time	Detect fault time
LSTM detect model	04-02 9 : 06	NULL
LSTM correction model	04-02 9 : 06	NULL
BILSTM model	04-02 9 : 06	NULL
CNN+LSTM model	04-02 9 : 06	NULL
extended isolation forests algorithm [17]	04-02 9 : 06	04-02 8 : 50
LSTM correction model based on the health index	04-02 9 : 06	04-02 8 : 35

It can be seen from the above table that compared with other algorithms, the LSTM correction model based on the health index can provide earlier and more effective fault warning. When abnormal data occurs, the residual error generated by the LSTM correction model can exceed the set threshold faster, so it can realize more effective fault warning. The HI index proposed in this paper is calculated according to the residual error generated by the LSTM correction model and it represents the probability of potential device faults. So the LSTM correction model based on the health index not only has the advantages of the LSTM correction model, but also can monitor the health status of the equipment in real-time, and help the maintenance personnel observe the running status of the equipment.

5 Conclusions

Equipment faults are usually not periodic, which makes it impossible for traditional detection models to learn periodic features from data. In this paper, we weighted the traditional LSTM model to form the LSTM correction model. This model can produce large residuals in the presence of abnormal data, which facilitates timely fault warning. So it is superior to the traditional LSTM model in anomaly detection and maintains the same excellent prediction ability as the LSTM model. In the case of small abnormal data, the traditional model will often follow these abnormal data for fitting, thus unable to produce a large residual error and lose the role of early warning. The model in this paper can detect the abnormal trend of data in time and give early warning of fault in advance. On this basis, two factors related to equipment health are proposed: PDI and PKI, and combine these two indicators to form the health indicators (HI) of this paper. Finally, based on fully considering the characteristics of hydropower plant data time series, this paper uses the LSTM correction model based on the health index to verify the data of hydropower station. The experimental results show that: this paper proposes the model can not only know the health status of the device in real-time but also has good sensitivity, which can detect abnormal changes in the early stage of the fault and give a timely warning.

References

Kaldellis

J.K.

, The contribution of small hydropower stations to the electricity generation in Greece: Technical and economic considerations, Energy Policy 35 (2007), 2187–2196.

Dai

, Li

, Xu

, Meng

, Wu

and Fan

, Microseismic Monitoring of the Left Bank Slope at the Baihetan Hydropower Station, Rock Mechanics & Rock Engineering 50 (2017), 225–232.

You

X.X.

, Dian

S.Y.

, Liu

, Guo

, Xiang

and Zhu

Y.Q.

, Command filter-based adaptive fuzzy finite-time tracking control for uncertain fractional-order nonlinear systems, IEEE Transactions on Fuzzy Systems 31 (2022), 226–240.

Zhao

, Zhong

, Fu

, Tang

and Pecht

, Deep Residual Shrinkage Networks for Fault Diagnosis, IEEE Transactions on Industrial Informatics 16 (2019), 4684–4690.

Wang

H.J.

, Zou

A.N.

and Zuo

Y.B.

, Evaluation Method of Spindle Performance Degradation Based on Current, Transaction of Beijing Institute of Technology 39 (2019), 22–27.

Plakias

and Boutalis

Y.S.

, Fault detection and identification of rolling element bearings with Attentive Dense CNN, Neurocomputing 405 (2020), 208–217.

Song

, Tan

, Shi

H.B.

and Zhao

, Fault detection and diagnosis via standardized k nearest neighbor for multimode process, Journal of the Taiwan Institute of Chemical Engineers 106 (2020), 1–8.

X.Y.

, Zhang

, Cheng

C.M.

and Peng

Z.K.

, A hybrid classification autoencoder for semi-supervised fault diagnosis in rotating machinery, Mechanical Systems and Signal Processin 149 (2020), 107327.

Z.L.

, Wang

J.L.

, Zhang

and Miao

, Weighted Cyclic Harmonic-to-Noise Ratio for Rolling Element Bearing Fault Diagnosis, IEEE Transactions on Instrumentation and Measurement 69 (2018), 432–442.

10.

Selak

, Butala

and Sluga

, Condition monitoring and fault diagnostics for hydropower plants, Computers in Industry 65 (2014), 924–936.

11.

Zhu

, Zhou

, Xia

, Li

, Xiao

and Zhang

, A novel KICA– PCA fault detection model for condition process of hydroelectric generating unit, Measurement 58 (2014), 197–206.

12.

Liu

, Chen

J.L.

, Qu

, Hou

R.J.

, Lv

H.X.

and Pan

T.Y.

, LOSGAN: latent optimized stable GAN for intelligent fault diagnosis with limited data in rotating machinery, Measurement Science and Technology 32 (2021), 045101.

13.

Hundman

K.L.

, Constantional

, Laporte

, Colwell

and Soderstrom

, Detecting Spacecraft Anomalies Using LSTMs and Nonparametric Dynamic Thresholding, Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, (available at: https://dl.acm.org/doi/abs/10.1145/3219819.3219845.

14.

X.Q.

, Jiang

H.K.

and Niu

M.G.

, An enhanced selective ensemble deep learning method for rolling bearing fault diagnosis with beetle antennae search algorithm, Mechanical Systems and Signal Processing 142 (2020), 106754.

15.

Tojeiro

D.O.

, Caneza

R.T.

and Potts

A.S.

, Fault detection based on Neuro-Fuzzy models and residual evaluation with fuzzy thresholds applied to a photovoltaic system, IFAC-PapersOnLine 54 (2021), 717–722.

16.

Youssef

Abdulrahman

, An Optimal Fault Detection Threshold For Early Detection Using Kullback-Leibler Divergence For Un-known Distribution Data,, Signal Processing 120 (2016), 266–279.

17.

Santis

R.B.

and Costa

M.A.

, Extended Isolation Forests for Fault Detection in Small Hydroelectric Plants, Sustainability 2020, 12. (available at: https://www.mdpi.com/2071-1050/12/16/6421).

18.

Chen

, Li

, Wang

K.S.

, Zuo

M.J.

, Heyns

P.S.

and Baggeröhr

, A threshold self-setting condition monitoring scheme for wind turbine generator bearings based on deep convolutional generative adversarial networks, Measurement 167 (2021), 108234.

19.

Yang

C.Z.

, Liu

J.Q.

, Zeng

Y.Y.

and Xie

G.Y.

, Real-time condition monitoring and fault detection of components based on machine-learning reconstruction model, Renewable Energy 133 (2018), 433–441.

20.

Lin

M.Y.Z.

, Shan

M.Y.

, Zhou

and Pan

Y.J.

, A Data-Driven Fault Diagnosis Method Using Modified Health Index and Deep Neural Networks of a Rolling Bearing, Computing and Information Science in Engineering 22 (2021), 021005.

21.

Zhang

, Jiang

, Li

, Huo

, Luo

and Yin

, An adaptive remaining useful life prediction approach for single battery with unlabeled small sample data and parameter uncertainty , Reliability Engineering & System Safety 222 (2022), 108357.

22.

Huang

C.G.

, Huang

H.Z.

and Li

Y.F.

, A bidirectional LSTM prognostics method under multiple operational conditions, IEEE Transactions on Industrial Electronics 66(11) (2019), 8792–8802.

23.

Zhang

, Jiang

, Wu

, Li

, Luo

and Yin

, Prediction of remaining useful life based on bidirectional gated recurrent unit with temporal self-attention mechanism, Reliability Engineering & System Safety 221 (2022), 108297.

24.

Zhang

, Jiang

, Li

, Luo

, Yin

and Kaynak

, Remaining useful life prediction of lithium-ion battery with adaptive noise estimation and capacity regeneration detection, IEEE/ASME Transactions on Mechatronics 2022.

25.

Kim

T.Y.

and Cho

S.B.

, Web traffic anomaly detection using C-LSTM neural networks, Expert Syst Appl 106 (2018b), 66–76.

26.

Jiahao

, Jiang

, Wang

, Jiang

and Yu

, SVM-BiLSTM: A fault detection method for the gas station IoT system based on deep learning,, IEEE Access 8 (2022), 203712–203723.

27.

Fernando

, Denman

, Sridharan

and Fookes

, Soft + Hardwired attention: An LSTM framework for human trajectory prediction and abnormal event detection,, Neural Networks 108 (2018), 466–478.

28.

Bian

, Wang

, Scherer

, Woźniak

, Zhang

and Wei

, Abnormal Detection of Electricity Consumption of User Based on Particle Swarm Optimization and Long Short Term Memory With the Attention Mechanism, IEEE Access 9 (2021), 47252–47265.

29.

Gers

F.A.

, Schraudolph

N.N.

and Schmidhuber

, Learning Precise Timing with LSTM Recurrent Networks, Journal of Machine Learning Research 3 (2003), 115–143.

30.

Hochreiter

and Schmidhuber

, Long Short-Term Memory[J], Neural Computation 9 (1997), 1735–1780.

31.

Liu

, Bai

, Li

and Sun

, Improved LSTM-Based Abnormal Stream Data Detection and Correction System for Internet of Things, IEEE Transactions on Industrial Informatics 18 (2022), 1282–1290.

32.

Hou

, Jiang

, Wan

, Yi

, Gao

, Ding

and Xue

, Deep learning and data augmentation based data imputation for a structural health monitoring system in multi-sensor damaged state, Measurement 196 (2022), 111206.

33.

Wang

, Du

, Lu

, Duan

and Wu

, Improved lstm-based time-series anomaly detection in rail transit operation environments, IEEE Transactions on Industrial Informatics, (2022). (available at: https://ieeexplore.ieee.org/abstract/document/9748023).