Abstract
Traditional fault diagnosis methods for rotating machinery usually rely on shallow learning to characterize the complex mapping between vibration signals and the state of the rotor system. To address this deficiency, a deep neural network (DNN) based on the stacked denoising autoencoder (SDAE) is proposed and applied to the fault diagnosis of rotating machinery. In the proposed method, the frequency-domain information of the vibration signal is used as the input, and the deep neural network is obtained by layer-by-layer feature extraction with denoising autoencoders (DAEs). The dropout method is then used to adjust the network parameters and reduce over-fitting. In addition, principal component analysis is used to extract fault features. The experimental results show that the proposed method effectively extracts the hidden features in the vibration signals of rotating machinery.
Introduction
Rotating machinery is widely used in many engineering fields, such as power systems, petroleum and aviation. It generally operates at high speed and is often key equipment in an enterprise, for example aero-engines, large rolling mills, turbines and large centrifugal compressor units. Once a fault occurs in such equipment, it can cause huge economic losses and may even lead to catastrophic accidents. For rotating machinery, the transition from the normal state to a fault state is a cumulative process [1–3]; effective condition monitoring is therefore very important for ensuring the safe and reliable operation of large rotating machinery.
The working environment of rotating machinery is very harsh, the failure mechanism is complex, and the fault features are not very obvious in the early stage of a fault, so it is difficult to extract the fault features effectively with traditional fault diagnosis methods. In recent years, in the field of machine learning, the stacked denoising autoencoder (SDAE) has been used to construct deep models that learn the hidden features of data by simulating the cognitive process of the human brain. SDAE has been applied in fields such as image processing and speech recognition [4–8]. Compared with shallow machine learning methods, the SDAE has an intuitive structure: each layer applies a transformation from its input to its output, so each layer learns a new representation from the output of the previous layer, and the multi-layer stacked structure can learn more robust features from the raw input. Deep learning methods may therefore lead to a profound evolution in the field of mechanical maintenance, including fault monitoring, diagnosis and prediction. Here, the stacked denoising autoencoder is introduced into the fault diagnosis of rotating machinery, and a new fault diagnosis method based on stacked denoising autoencoder theory is proposed. In the proposed method, deep neural networks are established to adaptively extract fault features from the vibration signal in the frequency domain. The proposed method has been applied to the fault diagnosis of a rotor system, and the experimental results verify its effectiveness.
Stacked denoising autoencoder
The denoising autoencoder (DAE), as a training module in deep learning architectures, has a good ability to learn the features of a data set, like the restricted Boltzmann machine and the autoencoder (AE) [9, 10]. A stacked denoising autoencoder can be formed by stacking multiple DAEs [11, 12]. In the SDAE training process, unsupervised learning and data corruption help the network learn the characteristics and structure of the data set. The implicit representations obtained in this way are more suitable for supervised classification. Ref. [13] shows that SDAE performs better than the deep belief network (DBN) in most cases. Moreover, unlike DBN, SDAE does not require Gibbs sampling during training.
Denoising autoencoder
The autoencoder (AE) is a multi-layer feed-forward neural network [14] proposed by Bengio in 2007. The denoising autoencoder was proposed by Vincent in 2008 [15, 16]. Compared with the AE, the DAE improves the model’s robustness and anti-interference ability. An autoencoder is divided into two parts, an encoder and a decoder. The encoder maps the input data to a hidden representation, and the decoder reconstructs the original input data from that hidden representation. The structure of the autoencoder is shown in Fig. 1.

Structure diagram of autoencoders.
The input layer and the hidden layer constitute the encoding network, and the hidden layer and the output layer constitute the decoding network. In an autoencoder, the input layer and the output layer have the same dimension, denoted by n, and the hidden layer dimension is denoted by m. Given an unlabeled sample set of inputs x^m, the encoding network maps each sample to a hidden representation and the decoding network maps that representation to an output y.
Let L(x, y) denote the reconstruction error between the input data x and the output data y. The autoencoder completes the network training by minimizing L(x, y).
The autoencoder can thus reconstruct the input data x^m. However, in engineering practice, the working environment of rotating machinery is complex and changeable, and the sampled data are easily disturbed by the environment, which changes the nature of the samples. For this reason, constraints are usually added to the autoencoder so that the learned features are robust [17].
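The encoder, decoder and reconstruction error described above can be illustrated with a minimal NumPy sketch. The sigmoid activations, the squared-error form of L(x, y), the random initialization and all function names are assumptions made for illustration; the source does not specify them.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_autoencoder(n, m, rng):
    """Small random weights for an n -> m -> n autoencoder (assumed initialization)."""
    return {
        "W_enc": rng.normal(0.0, 0.1, size=(m, n)),
        "b_enc": np.zeros(m),
        "W_dec": rng.normal(0.0, 0.1, size=(n, m)),
        "b_dec": np.zeros(n),
    }

def encode(params, x):
    """Encoding network: input (dimension n) -> hidden representation (dimension m)."""
    return sigmoid(params["W_enc"] @ x + params["b_enc"])

def decode(params, h):
    """Decoding network: hidden representation -> output (dimension n)."""
    return sigmoid(params["W_dec"] @ h + params["b_dec"])

def reconstruction_error(x, y):
    """L(x, y), taken here as the squared error (an assumed choice)."""
    return float(np.sum((x - y) ** 2))

rng = np.random.default_rng(0)
n, m = 8, 3                      # toy dimensions, not the paper's
params = init_autoencoder(n, m, rng)
x = rng.random(n)                # one toy input sample
h = encode(params, x)
y = decode(params, h)
loss = reconstruction_error(x, y)
```

Training would adjust the four parameter arrays to reduce `loss` over the whole sample set; the optimizer is left out of this sketch.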
The denoising autoencoder (DAE), an improvement of the autoencoder, can reconstruct the data from noisy sampling data and thereby overcomes this deficiency of the AE. The structure of the DAE is shown in Fig. 2. The process can be described as follows: in the encoding network, noise with a certain statistical characteristic is added to the sampling data; in the decoding network, the original noise-free form of the data is estimated from the noisy data according to the statistical characteristics of the added noise.

Structure diagram of DAE (• is set to 0).
In the training process of the DAE, random noise that satisfies the binomial distribution q_D [18] is added to the samples to obtain noisy samples. The network training is then completed by minimizing the reconstruction error.
The principle of the DAE is similar to the sensory system of the human body. For example, when a person sees an object, the person can still identify it even if a small part of it is covered. Similarly, by adding noise, the DAE can effectively reduce the influence of stochastic factors, such as mechanical conditions and environmental noise, on the extraction of health status information, and improve the robustness of the feature expression.
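The corruption step can be sketched as masking noise: each input component is kept or set to zero by an independent binomial draw, and the reconstruction error is still measured against the clean sample. The corruption level of 0.3, the squared-error loss and the function names are assumptions for illustration only.

```python
import numpy as np

def corrupt(x, corruption_level, rng):
    """Masking noise: zero each component independently with probability corruption_level."""
    keep_mask = rng.binomial(1, 1.0 - corruption_level, size=x.shape)
    return x * keep_mask

def dae_loss(x_clean, x_reconstructed):
    """Mean squared reconstruction error, measured against the CLEAN input."""
    return float(np.mean(np.sum((x_clean - x_reconstructed) ** 2, axis=1)))

rng = np.random.default_rng(0)
# Toy samples, shifted to be strictly positive so that zeros come only from corruption.
x = rng.random((4, 10)) + 0.1
x_noisy = corrupt(x, 0.3, rng)

# A DAE would encode x_noisy and decode it; the target of the loss stays x.
# Using x_noisy itself as a stand-in "reconstruction" shows the error that
# the decoder has to remove.
baseline_error = dae_loss(x, x_noisy)
```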
To obtain the deep neural network, this paper stacks multiple DAEs in an unsupervised manner to form the hidden layers of the network. First, a large number of unlabeled vibration-signal samples x^m from rotating machinery are used to initialize the parameters of the model. During initialization, the multi-layer encoder model is obtained by layer-by-layer DAE training, and the trained DAEs are connected to form an SDAE: the output of each lower-layer DAE is the input of the layer above it. The DAEs of all layers constitute the hidden-layer structure of the DNN. The pre-training process of the DNN is shown in Fig. 3. After the parameters are initialized, the network may still over-fit. In this paper, the dropout method [19] is used to reduce over-fitting and to avoid the redundant features produced by co-adaptation between hidden-layer neurons of the DAEs.

The pre-training process of DNN.
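The layer-by-layer pre-training can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: a sigmoid encoder, a linear decoder, squared error, plain full-batch gradient descent, and the hyperparameters `corruption`, `lr` and `epochs` are all chosen here for brevity.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_dae(X, n_hidden, rng, corruption=0.2, lr=0.1, epochs=100):
    """Train one DAE on X and return its encoder parameters (W, b)."""
    n_samples, n_in = X.shape
    W1 = rng.normal(0.0, 0.1, size=(n_in, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, size=(n_hidden, n_in))
    b2 = np.zeros(n_in)
    for _ in range(epochs):
        # Corrupt the input with masking noise, then reconstruct the clean input.
        mask = rng.binomial(1, 1.0 - corruption, size=X.shape)
        Xn = X * mask
        H = sigmoid(Xn @ W1 + b1)
        Y = H @ W2 + b2
        # Gradients of (1 / 2N) * sum ||Y - X||^2.
        dY = (Y - X) / n_samples
        dZ = (dY @ W2.T) * H * (1.0 - H)
        W2 -= lr * (H.T @ dY)
        b2 -= lr * dY.sum(axis=0)
        W1 -= lr * (Xn.T @ dZ)
        b1 -= lr * dZ.sum(axis=0)
    return W1, b1

def pretrain_sdae(X, hidden_sizes, rng):
    """Greedy stacking: the output of each trained DAE feeds the next one."""
    layers = []
    representation = X
    for n_hidden in hidden_sizes:
        W, b = train_dae(representation, n_hidden, rng)
        layers.append((W, b))
        representation = sigmoid(representation @ W + b)  # clean forward pass
    return layers, representation

rng = np.random.default_rng(0)
X = rng.random((40, 16))            # toy stand-in for unlabeled spectra
layers, features = pretrain_sdae(X, [8, 4], rng)
```

The decoder of each DAE is discarded after its layer is trained; only the encoder weights are kept as the initialization of the corresponding hidden layer.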
In the implementation of the dropout method, the outputs of the hidden-layer DAE neurons are randomly set to zero in a certain proportion during training. The weights of the zeroed neurons are retained but do not participate in that propagation pass of the network. In the network test phase, to keep the network balanced, all neurons participate in forward propagation and their output values are attenuated in proportion to the dropout rate.
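The two phases of dropout can be sketched as below. The reading that test-time outputs are scaled by the retention probability (1 minus the dropout rate) is an interpretation of the description above, and the rate of 0.5 and the function name are assumptions.

```python
import numpy as np

def dropout_forward(h, drop_rate, rng, training):
    """Apply dropout to hidden activations h.

    Training: each output is zeroed independently with probability drop_rate.
    Testing: every neuron is active and outputs are scaled by (1 - drop_rate).
    """
    if training:
        mask = rng.binomial(1, 1.0 - drop_rate, size=h.shape)
        return h * mask
    return h * (1.0 - drop_rate)

rng = np.random.default_rng(0)
h = np.ones((1000, 50))             # toy hidden-layer outputs
train_out = dropout_forward(h, 0.5, rng, training=True)
test_out = dropout_forward(h, 0.5, rng, training=False)
```

With this scaling, the expected output of a neuron at test time matches its expected output during training, which is the balance the text refers to.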
Based on the advantages of SDAE, this paper proposes a fault diagnosis method for rotating machinery based on SDAE. Considering the similarity between the fault states of complex rotating machinery parts and the heterogeneity of the data in high-dimensional fault feature classification, the SDAE method may show great potential in the fault diagnosis of rotor systems owing to its training mechanism and the advantages of the deep learning architecture.
To apply the SDAE to the diagnosis of rotating machinery, first, the frequency-domain information of the vibration signal is used as the input of the neural network [20]. Second, the number of hidden layers N of the deep neural network is determined, and unsupervised training is performed to obtain a DNN consisting of N DAE layers. After the N DAEs have been trained, the dropout method is used to adjust the network parameters and complete the training of the deep neural network. Finally, the trained deep neural network is used to diagnose faults of rotating machinery. The proposed algorithm is shown in Fig. 4.

The fault diagnosis method of rotating machinery based on SDAE.
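The first step of the method, turning a vibration record into a frequency-domain input vector, might look like the sketch below. The source does not say how the spectrum is computed or scaled, so the full two-sided FFT amplitude spectrum (which keeps the input length at 1024) and the normalization by the peak value are assumptions, as is the synthetic test signal.

```python
import numpy as np

def spectrum_features(signal):
    """Amplitude spectrum of one record, scaled so that the largest value is 1."""
    amplitude = np.abs(np.fft.fft(signal)) / len(signal)
    peak = amplitude.max()
    return amplitude / peak if peak > 0 else amplitude

fs = 5000.0                 # sampling frequency in Hz, as in the experiment
n = 1024                    # sampling length, as in the experiment
t = np.arange(n) / fs

# Synthetic stand-in for a vibration record: one sinusoid placed exactly on
# frequency bin 50 (about 244 Hz) so that the peak location is unambiguous.
f0 = 50 * fs / n
signal = np.sin(2 * np.pi * f0 * t)

features = spectrum_features(signal)
```

For a real-valued signal the second half of this spectrum mirrors the first, so a one-sided spectrum of length 512 would carry the same information; the two-sided form is used here only to match an input layer of 1024 neurons.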
To verify the effectiveness of the proposed method, five typical faults of the rotor system, i.e. misalignment, unbalance, rubbing, pedestal looseness and oil film whirl, are sampled on the rotor experiment kit shown in Fig. 5. In the experiment, the sampling frequency is 5 kHz and the sampling length is 1024. The time-domain waveforms of the five typical faults are shown in Fig. 6.

The rotor experiment kit.

Time domain waveform of five typical faults.
In the experiment, 100 groups of data are collected for each fault, each with a sampling length of 1024. The first 50 groups are used for network training and the remaining 50 groups for network testing. The number of network layers is determined according to Ref. [21]. The structure of the DNN is chosen as 1024-512-256-128-8, i.e. the network has 5 layers with 1024, 512, 256, 128 and 8 neurons, respectively.
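As a quick check of the scale of this setup, the data split and the number of trainable parameters can be worked out directly. The parameter count assumes fully connected layers with one bias per neuron, which the source does not state explicitly.

```python
layer_sizes = [1024, 512, 256, 128, 8]   # network structure from the text
n_fault_types = 5
groups_per_fault = 100
train_per_fault = 50

# Sample counts for training and testing over all five fault types.
n_train = n_fault_types * train_per_fault
n_test = n_fault_types * (groups_per_fault - train_per_fault)

# Weights plus biases for each pair of consecutive layers
# (assumption: dense layers with a bias per neuron).
params_per_layer = [
    n_in * n_out + n_out
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]
total_params = sum(params_per_layer)

print(n_train, n_test)        # 250 250
print(params_per_layer)       # [524800, 131328, 32896, 1032]
print(total_params)           # 690056
```

Under these assumptions the network has roughly 690,000 parameters against 250 training samples, which gives some context for the use of unsupervised pre-training and dropout.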
After the network training, to verify the ability of the proposed method to diagnose faults, t-distributed stochastic neighbor embedding (t-SNE) [21] is used to visualize the results of fault diagnosis. The distribution is shown in Fig. 7.

The recognition rate of the 30 repeat trials.
From Fig. 7, the accuracy of the proposed method is 100% in all 30 repeat trials, which indicates that it can overcome the interference of the working conditions and accurately identify the different faults. For comparison, a multi-hidden-layer back propagation (BP) neural network with the same structure as the DNN and a support vector machine (SVM) are used to diagnose the same data set; their test results are also shown in Fig. 7. The accuracy of the multi-hidden-layer BP neural network ranges from 16.45% to 92.36% over the 30 repeat trials, so its fluctuation range is large and its precision is unstable. The accuracy of the SVM lies between 82.35% and 94.52%. The diagnostic accuracy and stability of the BP and SVM methods are both lower than those of the proposed method. Table 1 gives the average diagnostic accuracy and standard deviation over the 30 repeat trials. From Table 1, the proposed method has the highest accuracy of the three recognition methods, with an average diagnostic accuracy of 100%. The multi-hidden-layer BP method is less stable, with a standard deviation of diagnostic accuracy of 22.9%, and the diagnostic accuracy of the SVM is 92.51%, which is lower than that of the proposed method.
The average diagnostic accuracy and standard deviation of the 30 repeat trials
To verify the feature extraction ability of the proposed method, principal component analysis (PCA) is used to extract the first two principal components of the fault features. The feature map of the five typical faults is shown in Fig. 8.

Feature map of five typical faults.
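Projecting learned features onto their first two principal components can be sketched with a singular value decomposition. The two synthetic clusters below merely stand in for the features of two fault types; they are not data from the experiment, and the function name is an assumption.

```python
import numpy as np

def pca_2d(features):
    """Project the rows of `features` onto the first two principal components."""
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:2].T
    explained = singular_values[:2] ** 2 / np.sum(singular_values ** 2)
    return scores, explained

rng = np.random.default_rng(0)
# Synthetic stand-ins for the 8-dimensional features of two fault classes.
cluster_a = rng.normal(0.0, 0.1, size=(50, 8))
cluster_b = rng.normal(1.0, 0.1, size=(50, 8))
features = np.vstack([cluster_a, cluster_b])

scores, explained = pca_2d(features)   # scores has one 2-D point per sample
```

Plotting the two columns of `scores` against each other, colored by fault type, gives a feature map of the kind shown in Fig. 8.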
From Fig. 8, with the proposed method, samples of the same fault type of the rotor system are gathered together and different fault types are separated effectively, whereas with the BP and SVM methods the different states cannot be discerned effectively. The deep neural network composed of DAEs can therefore adaptively learn the characteristics of the vibration signal from many input vibration samples and has a strong modeling and learning ability. A small number of samples do overlap, for example between the misalignment and pedestal looseness faults, because the difference between these two types of fault samples is very small and their characteristics are highly similar. The experimental results show that the proposed method can effectively extract the hidden features of the vibration signals of rotating machinery and discern the different types of fault.
The deep neural network based on SDAE can effectively characterize the deep features in the data, and the proposed method is strongly robust. The experimental results show that SDAE can effectively learn the characteristics hidden in the input data through unsupervised training and adaptively extract the characteristic information of different faults in the rotor system. The proposed method can overcome the deficiency of traditional fault diagnosis methods for rotating machinery, which usually rely on shallow learning to characterize the complex mapping between vibration signals and the rotor system.
Acknowledgments
This work was supported by grants from the National Natural Science Foundation of China (Nos. 51675258, 51075372, 51265039), the State Key Laboratory of Mechanical Transmissions of Chongqing University (SKLMT-KFKT-201514) and the Science and Technology Project of Jiangxi Provincial Education Department (No. GJJ150699).
