Abstract
A planetary gearbox is a crucial but failure-prone component in rotating machinery. This paper therefore proposes an intelligent, integrated approach based on impulsive signals, deep belief networks (DBNs) and feature uniformation to achieve real-time and accurate fault diagnosis. Since gear faults usually generate repetitive impulses, an integrated approach using the optimized Morlet wavelet transform, the kurtosis index and soft-thresholding is applied to extract impulse components from the original signals. Time-domain and frequency-domain features are then calculated from both the original signals and the impulsive signals, and probability density functions are applied to study the sensitivities of the features to the faults. The extracted features are fed into DBNs to identify the fault types; the results show that the DBN-based fault diagnosis method is feasible and that the impulsive signals help improve the accuracies. Finally, using the mean value of various signals under multiple load conditions, uniformed time-domain features are constructed to reduce the interference of loads, and the experimental results validate that feature uniformation improves the accuracy and robustness of the intelligent fault diagnosis approach.
Introduction
Compared with ordinary parallel-axis gearboxes, planetary gearboxes achieve more efficient and stable power transmission in a compact package, and are therefore widely used in the mechanical transmission systems of various equipment [1]. However, their complex structures and severe operating conditions give rise to a higher probability of failure. Worse still, an unexpected fault will most likely halt the whole power transmission chain and may even cause enormous economic losses and casualties. Hence it is necessary to detect planetary gear faults efficiently, accurately and in a timely manner [2].
Fault signatures of planetary gearboxes are reflected by symptoms in vibratory, acoustic, thermal, electric and oil-based signals [3], among which vibration signals are the most commonly used for fault diagnosis owing to their low cost and simple implementation [4, 5]. As failure symptoms are usually obscured by noise and interference, researchers have proposed various signal processing techniques, such as the Hilbert-Huang transform, energy operator demodulation (EOD), empirical mode decomposition (EMD) and the wavelet transform, which have promoted the development of fault diagnosis technology [6, 7]. For instance, Bozchalooi and Liang used the Teager energy operator (TEO) to obtain amplitude, phase and frequency modulations for gear fault diagnosis [8]. However, signal processing methods require expertise and experience to identify the types of faults. Moreover, such manual judgements may not be efficient enough, and the diagnostic accuracy is largely limited by the operators’ knowledge and experience. Therefore, convenient and intelligent diagnosis methods are urgently needed.
In the past few years, machine learning techniques such as decision trees [9], naive Bayes networks [10], backpropagation neural networks [11] and support vector machines [12] have attracted increasing attention and have been successfully applied to fault diagnosis. For example, Li et al. [13] proposed a multimodal deep support vector classification (MDSVC) approach using separation-fusion to diagnose gearbox faults, which achieved a satisfactory classification rate. Shen et al. [2] applied a transductive support vector machine (TSVM) to diagnose gear faults, and Hu et al. [14] developed a dynamic Bayesian network model based on the functional Hazard and Operability (HAZOP) for fault diagnosis of process plants. Nevertheless, the above methods have difficulty with complex classification because of their shallow architectures. The concept of deep learning, proposed by Geoffrey Hinton in 2006 [15], provides an elegant solution to this problem. The deep belief network (DBN) proposed by Hinton in [16] is one of the most classical deep learning models. A DBN is a generative model with a multi-layer architecture, whose adaptability and mapping capability are helpful in classification [1]. Owing to its strong modeling ability and efficient learning algorithm, the DBN has been widely applied in many fields, including fault diagnosis. Yin et al. [17] applied a DBN, k-nearest neighbor (KNN) and an artificial neural network with backpropagation (ANN-BP) to fault diagnosis for the vehicle on-board equipment (VOBE) of high-speed trains, and the DBN performed best. Tamilselvan et al. [18] proposed a multi-sensor health diagnosis method based on DBNs for aircraft, which performed better than the existing methods. Tran et al. [19] presented an approach based on the Teager–Kaiser energy operator (TKEO) and DBNs for fault diagnosis of the valves in reciprocating compressors.
These studies indicate that DBNs are feasible for fault diagnosis, so DBNs are applied in this study.
In particular, when a gear fault occurs, vibration transients are excited at a specific rate. Compared with the original signals, the impulsive signals contain more abundant information on early faults, and their impulsiveness or cyclostationarity can make up for the deficiencies of the original features [3]. In this study, the impulse components of the original signals are therefore extracted by the weak transient feature extraction approach proposed in [20] to improve the diagnosis. The diagnosis results show that the features extracted from the impulsive signals do play a positive role in the diagnosis of planetary gearboxes, and that the accuracies of the proposed method basically meet the usage requirements.
However, these experiments also reveal that load conditions severely affect the distributions of the features, which makes diagnosis under multiple load conditions more difficult. To reduce the interference of load conditions, a specific parameter MEAN (the mean value of various signals under mixed working conditions) is proposed to uniform the time-domain features extracted from the original signals. With the uniformed features, the DBNs reach test accuracies of 100%, and the experimental results verify that the proposed method substantially reduces the interference of loads and enhances the robustness of the DBNs.
In a nutshell, the three main contributions of this study are that (1) DBNs are employed in the diagnosis of planetary gearboxes, (2) impulsive features are applied to improve the diagnostic accuracy and (3) a method named feature uniformation is proposed to strengthen the robustness of the DBNs. The flow diagram of the diagnosis approach is shown in Fig. 1. Following Fig. 1, the rest of this paper is organized as follows. Section 2 briefly presents DBNs, RBMs and their learning rules. Vibration signal acquisition and impulsive signal extraction are described in Section 3. Section 4 introduces feature extraction and sensitivity analysis. Section 5 discusses the fault diagnosis and its results. Section 6 introduces feature uniformation in detail. Finally, conclusions are drawn in Section 7.

Flow diagram of the diagnosis for planetary gearboxes based on DBNs.
RBMs and contrastive divergence learning
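Restricted Boltzmann machines (RBMs), a kind of energy-based stochastic neural network, are the main building blocks of DBNs. An RBM is a bipartite undirected graphical model [21] with a two-layer architecture, in which the visible units $v = (v_1, \ldots, v_m)$ are connected to the hidden units $h = (h_1, \ldots, h_n)$, with no connections between units in the same layer, as shown in Fig. 2.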

The architecture of a RBM with n hidden neurons and m visible neurons.
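The energy function of an RBM in state $\{v, h\}$ is defined as:

$$E(v, h) = -\sum_{i=1}^{n}\sum_{j=1}^{m} w_{ij} h_i v_j - \sum_{j=1}^{m} b_j v_j - \sum_{i=1}^{n} c_i h_i \tag{1}$$

where $v_j$ and $h_i$ are the states of the visible and hidden units, $w_{ij}$ is the weight between hidden unit $h_i$ and visible unit $v_j$, and $b_j$ and $c_i$ are the biases of the visible and hidden units, respectively.
The joint probability distribution of state $\{v, h\}$ is given by:

$$p(v, h) = \frac{1}{Z} e^{-E(v, h)}, \qquad Z = \sum_{v, h} e^{-E(v, h)} \tag{2}$$

The marginal probability and posterior probability can be calculated by (3) and (4).

$$p(v) = \frac{1}{Z} \sum_{h} e^{-E(v, h)} \tag{3}$$

$$p(h \mid v) = \prod_{i=1}^{n} p(h_i \mid v), \qquad p(v \mid h) = \prod_{j=1}^{m} p(v_j \mid h) \tag{4}$$

The conditional probability of a single neuron can be interpreted as the activation rate of a stochastic neuron, as expressed in (5):

$$p(h_i = 1 \mid v) = \sigma\Big(\sum_{j=1}^{m} w_{ij} v_j + c_i\Big), \qquad p(v_j = 1 \mid h) = \sigma\Big(\sum_{i=1}^{n} w_{ij} h_i + b_j\Big) \tag{5}$$

where $\sigma(x) = 1/(1 + e^{-x})$ is the logistic sigmoid function.
The derivative of the log-likelihood with respect to the weight $w_{ij}$ can be obtained by:

$$\frac{\partial \ln p(v)}{\partial w_{ij}} = \langle h_i v_j \rangle_{\mathrm{data}} - \langle h_i v_j \rangle_{\mathrm{model}} \tag{6}$$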
Here, the second term of (6) is the expectation over the distribution defined by the model. Approximating this term by samples from the RBM distribution would require running a Markov chain until the stationary distribution is reached, which is computationally intractable. To solve this problem, Hinton proposed in [22] an algorithm called contrastive divergence (CD) to approximate the RBM log-likelihood gradient.
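The CD learning algorithm proceeds as follows. Starting from a training sample $v^{(0)}$, the hidden states $h^{(0)}$ are sampled from $p(h \mid v^{(0)})$; a reconstruction $v^{(1)}$ is then sampled from $p(v \mid h^{(0)})$, and the hidden probabilities $p(h \mid v^{(1)})$ are computed from it. Repeating this alternation $k$ times gives a $k$-step Gibbs chain.
Then the parameters can be calculated by (7), (8) and (9).

$$\Delta w_{ij} = \varepsilon \left( \langle h_i v_j \rangle_{\mathrm{data}} - \langle h_i v_j \rangle_{\mathrm{recon}} \right) \tag{7}$$

$$\Delta b_j = \varepsilon \left( \langle v_j \rangle_{\mathrm{data}} - \langle v_j \rangle_{\mathrm{recon}} \right) \tag{8}$$

$$\Delta c_i = \varepsilon \left( \langle h_i \rangle_{\mathrm{data}} - \langle h_i \rangle_{\mathrm{recon}} \right) \tag{9}$$

where $\varepsilon$ is the learning rate, $w_{ij}$ is the weight between hidden unit $h_i$ and visible unit $v_j$, $b_j$ and $c_i$ are the visible and hidden biases, and $\langle \cdot \rangle_{\mathrm{data}}$ and $\langle \cdot \rangle_{\mathrm{recon}}$ denote averages over the training data and over the reconstructions produced by the Gibbs chain, respectively.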
Usually one step of the above-mentioned Gibbs chain is enough to yield satisfactory performance. After training, the output of the current RBM is taken as the input training data for the next RBM, which gives a stacked structure.
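To make the update rules concrete, a single one-step CD update on one mini-batch can be sketched in NumPy as follows. This is an illustrative sketch rather than the implementation used in this study: the function name `cd1_update`, the learning rate and the use of probabilities instead of sampled binary states in the reconstruction are assumptions, and the weight matrix is assumed to hold hidden unit i in row i and visible unit j in column j.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, b, c, lr=0.1, rng=None):
    """One CD-1 update for a Bernoulli RBM (hypothetical helper).

    v0: (batch, m) visible data scaled to [0, 1]
    W:  (n, m) weights, b: (m,) visible biases, c: (n,) hidden biases
    """
    rng = np.random.default_rng() if rng is None else rng
    # Positive phase: hidden probabilities given the data, then sampled states.
    ph0 = sigmoid(v0 @ W.T + c)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # One Gibbs step: reconstruct the visible layer and recompute the hidden probabilities.
    pv1 = sigmoid(h0 @ W + b)
    ph1 = sigmoid(pv1 @ W.T + c)
    batch = v0.shape[0]
    # Data statistics minus reconstruction statistics, averaged over the mini-batch.
    dW = (ph0.T @ v0 - ph1.T @ pv1) / batch
    db = (v0 - pv1).mean(axis=0)
    dc = (ph0 - ph1).mean(axis=0)
    return W + lr * dW, b + lr * db, c + lr * dc

rng = np.random.default_rng(0)
m, n = 25, 50                                  # visible and hidden layer sizes (example values)
W = 0.01 * rng.standard_normal((n, m))
b, c = np.zeros(m), np.zeros(n)
batch = rng.random((25, m))                    # synthetic stand-in for one normalized mini-batch
W_new, b_new, c_new = cd1_update(batch, W, b, c, rng=rng)
```

Stacking then amounts to feeding `sigmoid(data @ W_new.T + c_new)` to the next RBM as its visible data.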
A deep belief network consists of multiple RBMs stacked with each other and a top-layer classifier acting as the output layer, and the DBN structure is shown in Fig. 3.

A DBN with three hidden layers.
After the last RBM has been trained, the state values of its hidden layer are fed into the top-layer classifier for classification. The transfer function of the top-layer classifier can be a linear, logistic or softmax function. A linear classifier is used for linear problems, whereas the latter two are suitable for nonlinear classification. The logistic classifier is used when the classes are not mutually exclusive, and the softmax classifier is more effective when the classes are mutually exclusive.
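The difference between the logistic and softmax output layers can be illustrated with a short sketch. The activation values below are hypothetical and only serve to show that softmax outputs form a single probability distribution over the five gear conditions, whereas logistic outputs are scored independently.

```python
import numpy as np

def softmax(z):
    z = z - np.max(z, axis=-1, keepdims=True)   # subtract the maximum for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical top-layer activations for the five gear conditions.
scores = np.array([2.0, 0.5, -1.0, 0.0, 1.0])
p_softmax = softmax(scores)      # sums to 1: exactly one class is assumed to be present
p_logistic = logistic(scores)    # each class scored on its own: several can be "on" at once
predicted_class = int(np.argmax(p_softmax))
```

Since each gear in this study has exactly one health state, the classes are mutually exclusive, which is the situation the softmax output is suited to.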
The DBN learning process includes two stages. The first is the greedy layer-wise unsupervised learning of the RBMs, which initializes the parameters of the DBN; the second is a supervised fine-tuning process based on the back-propagation algorithm. A DBN is therefore a semi-supervised model. During fine-tuning, labeled samples are fed into the network, and the error between the desired output and the actual output is used to fine-tune all the parameters, which is similar to the learning process of classical backpropagation neural networks.
Experimental setup
As shown in Fig. 4, the test rig for planetary gearbox fault diagnosis consists of a driving motor, a two-stage planetary reducer, a two-stage parallel-shaft reducer and a magnetic powder brake. It is a drivetrain diagnostics simulator (DDS) designed by SpectraQuest Inc. specifically to simulate industrial drivetrains for experimental purposes. By adjusting the motor speed and the torque of the brake, different working conditions can be achieved. This study focuses on the secondary sun gear, because the load imposed on it is larger than that on the first sun gear and on the other components in the same stage, which leads to a higher probability of failure. Surface wear, crack tooth, chipped tooth and missing tooth are the four most common gear faults in engineering practice and are the ones discussed in this paper; a normal gear is additionally used for comparison. The four faulty gears are shown in Fig. 5.

The test rig.

The fault gears. (a) surface wear, (b) crack tooth, (c) chipped tooth, (d) missing tooth.
In the experiment, the rotating frequency of the motor is set to 25 Hz, from which the other important parameters follow: the rotating frequency of the secondary sun gear is 4.17 Hz, its meshing frequency is 91.14 Hz and its fault characteristic frequency is 13.02 Hz. To study the influence of different load conditions on fault recognition, the torque of the brake is set to 0 Nm, 1.4 Nm, 2.8 Nm and 25.2 Nm in turn, so that vibration signals are collected under four different load conditions. To obtain sufficient fault information, the sampling frequency of all experiments is set to 5120 Hz, and the sampling time of each experiment is 7,000 seconds. In this way 35,000 samples are obtained under each load condition, and 140,000 samples are available in total.
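The length of one sample is not stated explicitly, but it can be bounded from the figures above. The following arithmetic sketch considers two possible readings of "each experiment"; both readings, and the assumption that the recordings are cut into non-overlapping segments, are ours and are not confirmed by the text.

```python
sampling_rate_hz = 5120
duration_s = 7000             # sampling time of each experiment
samples_per_load = 35_000
n_loads = 4
n_gear_conditions = 5         # normal, surface wear, crack tooth, chipped tooth, missing tooth
fault_frequency_hz = 13.02    # fault characteristic frequency of the secondary sun gear

points_per_experiment = sampling_rate_hz * duration_s

# Reading A (assumption): one experiment per load, so 35,000 samples share one recording.
points_per_sample_a = points_per_experiment / samples_per_load

# Reading B (assumption): one experiment per gear condition and load,
# so five recordings together yield the 35,000 samples of one load.
points_per_sample_b = points_per_experiment * n_gear_conditions / samples_per_load

# Number of fault periods covered by one sample under each reading.
fault_periods_a = points_per_sample_a / sampling_rate_hz * fault_frequency_hz
fault_periods_b = points_per_sample_b / sampling_rate_hz * fault_frequency_hz

total_samples = samples_per_load * n_loads
print(points_per_sample_a, points_per_sample_b, total_samples)  # 1024.0 5120.0 140000
```

Under either reading a sample spans more than one fault period, so each sample can contain at least one fault-induced impulse.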
The time-domain waveforms, frequency spectrums and envelope spectrums of the original signals collected under the load of 1.4 Nm are shown in Fig. 6.

The time-domain waveforms, frequency spectrums and envelope spectrums of the original signals collected under the load of 1.4 Nm.
Planetary gearbox faults can excite the frequency resonances at a specific rate, so the healthy conditions and the defective patterns can be detected by analyzing the impulsive signals [23]. To extract the impulsive signals, an integrated approach proposed in [20] will be employed.
Since the shape of the Morlet wavelet is similar to the impulsive signal, it has been widely applied to detect the faults of rotating components. The first step of the proposed method is Morlet wavelet transform, whose formula can be defined as follows.
After the Morlet wavelet transform, the next step is to use the kurtosis index to separate the impulse components from all the other signal components, because kurtosis is sensitive to sharply varying structures such as impulses. The kurtosis index can be calculated by:
Then, to enhance the impulsive signals, the adaptive soft-thresholding method is applied to denoise the wavelet coefficients, which are polluted by strong noise. Finally, the impulsive signals are reconstructed from the denoised wavelet coefficients. The time-domain waveforms, frequency spectrums and envelope spectrums of the impulsive signals under the load of 1.4 Nm are shown in Fig. 7.

The time-domain waveforms, frequency spectrums and envelope spectrums of the impulsive signals collected under the load of 1.4 Nm.
Both Figs. 6 and 7 show that there are differences between signals of fault gears, and these differences will be learned by DBNs to diagnose faults.
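The three steps of the impulse extraction (Morlet wavelet filtering, kurtosis-guided selection and soft-thresholding) can be sketched as follows. This is a simplified illustration under stated assumptions, not the method of [20]: the wavelet parameters are chosen by a small grid search instead of the optimization used there, the threshold is the universal threshold with a median-based noise estimate instead of the adaptive rule, the filtered coefficients at the selected parameters are taken directly as the impulsive signal, and all function names and the synthetic test signal are hypothetical.

```python
import numpy as np

def morlet_filter(x, fs, f0, sigma):
    """Filter x with a complex Morlet wavelet centred at f0 (Hz); sigma (1/s) sets the bandwidth."""
    half = int(4.0 / sigma * fs)                       # support of about +/- 4 envelope widths
    t = np.arange(-half, half + 1) / fs
    wavelet = np.exp(-(sigma * t) ** 2 / 2.0) * np.exp(2j * np.pi * f0 * t)
    wavelet /= np.sum(np.abs(wavelet))                 # keep amplitudes comparable across parameters
    return np.convolve(x, wavelet, mode="same")

def kurtosis(x):
    d = np.asarray(x, dtype=float) - np.mean(x)
    return np.mean(d ** 4) / np.mean(d ** 2) ** 2

def soft_threshold(c, thr):
    return np.sign(c) * np.maximum(np.abs(c) - thr, 0.0)

def extract_impulses(x, fs, centre_freqs, sigmas):
    """Pick the wavelet parameters with maximum kurtosis, then denoise the coefficients."""
    best = None
    for f0 in centre_freqs:
        for sigma in sigmas:
            coeffs = morlet_filter(x, fs, f0, sigma).real
            k = kurtosis(coeffs)
            if best is None or k > best[0]:
                best = (k, f0, sigma, coeffs)
    k, f0, sigma, coeffs = best
    noise = np.median(np.abs(coeffs - np.median(coeffs))) / 0.6745   # robust noise estimate
    thr = noise * np.sqrt(2.0 * np.log(coeffs.size))                 # universal threshold
    return soft_threshold(coeffs, thr), f0, sigma, k

# Synthetic example: impulses repeating at 13.02 Hz excite a 1500 Hz resonance in white noise.
fs = 5120
t = np.arange(fs) / fs
rng = np.random.default_rng(0)
x = 0.3 * rng.standard_normal(t.size)
for t0 in np.arange(0.05, 1.0, 1.0 / 13.02):
    tau = t - t0
    x += 3.0 * (tau >= 0) * np.exp(-800.0 * np.clip(tau, 0.0, None)) * np.sin(2 * np.pi * 1500.0 * tau)

impulsive, f0, sigma, k = extract_impulses(x, fs, [500, 1000, 1500, 2000], [300, 600])
```

In this sketch the kurtosis criterion is expected to select the band around the excited resonance, and soft-thresholding leaves a sparse signal in which only the impulses survive.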
Feature extraction
It can be seen from Fig. 6 that when gear faults occur, the faulty vibration signals differ from the normal vibration signals and from the vibration signals generated by other faults, so the time-domain and frequency-domain features extracted from the signals also differ. These features, which may carry fault information, can therefore be used as the input data of the DBNs. As shown in Table 1, 25 features are calculated: t1 - t17 are time-domain features extracted from the original signals, and f1 - f4 are frequency-domain features extracted from the frequency spectrums and the envelope spectrums, respectively (17 + 4 + 4 = 25). The same features are calculated from the impulsive signals, so that 50 features are obtained in total, and these features are then grouped into multidimensional samples.
Statistical features
The features t1 - t7 mainly characterize the amplitude and energy in the time domain. Since gear failures usually increase the amplitude and energy of the signals, these features are sensitive to the fault types. The features t8 and t9 reflect the degree of fluctuation of the signals. The features t10 and t11 are the third- and fourth-order cumulants of the signals, which represent their asymmetry and impulsiveness, respectively. The above dimensional features are easily affected by the working conditions and even by the sensitivities of the instruments. To solve this problem, the non-dimensional features t12 - t17 were proposed. The kurtosis factor, peak factor and impulsive factor all represent the sharpness of the signal, but they have different characteristics. The peak factor reveals information about early failures, such as tiny surface wear and slight crack tooth, while the kurtosis factor and impulsive factor are more sensitive to impulses. The skewness factor describes the degree of asymmetry, so friction and collision may increase the skewness factor. The clearance factor reflects the fullness of the waveform, and the waveform factor expresses the excursion and aberrance of the signal. The features f1 - f4 present the frequency-domain properties of the signals [24] and are sensitive to changes in both frequency and amplitude. The above analysis shows that these features reflect the fault information of the planetary gearboxes comprehensively, from multiple aspects. They are grouped into training samples, from which the DBNs extract information layer by layer to achieve fault recognition.
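As an illustration of the distinction between dimensional and non-dimensional features, the sketch below computes a few of them for one sample. The formulas are the conventional textbook definitions and are assumed here, since Table 1 is not reproduced in this text; the dictionary keys and the function name are hypothetical and do not follow the t1 - t17 numbering.

```python
import numpy as np

def time_domain_features(x):
    """A subset of common time-domain features (conventional definitions, assumed)."""
    x = np.asarray(x, dtype=float)
    abs_x = np.abs(x)
    mean_abs = abs_x.mean()
    rms = np.sqrt(np.mean(x ** 2))
    smr = np.mean(np.sqrt(abs_x)) ** 2          # square mean root
    peak = abs_x.max()
    centred = x - x.mean()
    std = centred.std()
    return {
        # Dimensional features: scale with the signal amplitude.
        "peak": peak,
        "rms": rms,
        "smr": smr,
        # Non-dimensional features: ratios that cancel the amplitude scale.
        "kurtosis_factor": np.mean(centred ** 4) / std ** 4,
        "skewness_factor": np.mean(centred ** 3) / std ** 3,
        "peak_factor": peak / rms,
        "impulse_factor": peak / mean_abs,
        "clearance_factor": peak / smr,
        "waveform_factor": rms / mean_abs,
    }

# Ten full periods of a unit sine wave, and the same wave amplified tenfold.
t = np.arange(1000) / 1000.0
sine = np.sin(2 * np.pi * 10 * t)
features = time_domain_features(sine)
features_scaled = time_domain_features(10.0 * sine)
```

For the sine wave the peak factor is the square root of 2 and the kurtosis factor is 1.5; amplifying the signal changes `rms` tenfold but leaves the non-dimensional factors unchanged, which is why they are less affected by instrument sensitivity.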
Compared with traditional scatter diagrams, the probability density functions (PDFs) can reflect the distribution of features more intuitively and comprehensively. For different gear faults, the PDFs of features are plotted in one figure, so that we can easily compare the feature sensitivities to different faults.
For simplicity, we will take the signals under the load of 1.4 Nm as an example to analyze the distributions of the features. The PDFs of the features obtained by the original signals are shown in Fig. 8, and the PDFs of the features obtained by the impulsive signal are shown in Fig. 9. It can be seen from Figs. 8 and 9 that the feature distributions of different faults are different, and these differences will be learned by DBNs to identify the faults. As is shown in Fig. 8, there is less overlapping area between distribution density curves of some features, such as Mean, MS (mean square) and MF (mean frequency), which means those features are more sensitive to different faults.

Probability density functions of original features under the load of 1.4 Nm.

Probability density functions of impulsive features under the load of 1.4 Nm.
Comparing Fig. 8 with Fig. 9, some impulsive features, such as Peak and MS, have stronger classification capability than their counterparts extracted from the original signals. For example, the impulsive MS and RMS (root mean square) are very sensitive to surface wear failure, whereas the original features of surface wear failure are severely mixed with those of crack tooth and missing tooth. The impulsive features are therefore helpful in identifying surface wear failure.
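The visual comparison of overlapping areas can also be quantified. The sketch below estimates the PDF of a feature for each fault class with a Gaussian kernel density estimate and integrates the pointwise minimum of two PDFs. The estimator, the bandwidth rule (Silverman's rule of thumb) and the synthetic feature values are assumptions, since the text does not state how the PDFs in Figs. 8 and 9 were obtained.

```python
import numpy as np

def gaussian_kde_pdf(values, grid, bandwidth=None):
    """Gaussian kernel density estimate of `values`, evaluated on `grid`."""
    values = np.asarray(values, dtype=float)
    if bandwidth is None:
        bandwidth = 1.06 * values.std(ddof=1) * values.size ** (-0.2)   # Silverman's rule
    z = (grid[:, None] - values[None, :]) / bandwidth
    kernel = np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)
    return kernel.sum(axis=1) / (values.size * bandwidth)

def overlap_area(a, b, n_grid=512):
    """Area under min(pdf_a, pdf_b): near 0 for separated classes, near 1 for identical ones."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    lo, hi = min(a.min(), b.min()), max(a.max(), b.max())
    pad = 0.1 * (hi - lo)
    grid = np.linspace(lo - pad, hi + pad, n_grid)
    step = grid[1] - grid[0]
    shared = np.minimum(gaussian_kde_pdf(a, grid), gaussian_kde_pdf(b, grid))
    return float(np.sum(shared) * step)

# Synthetic values of one feature for three hypothetical fault classes.
rng = np.random.default_rng(0)
wear = rng.normal(1.00, 0.1, 500)
crack = rng.normal(1.05, 0.1, 500)      # almost indistinguishable from wear
missing = rng.normal(1.60, 0.1, 500)    # clearly separated from wear
overlap_close = overlap_area(wear, crack)
overlap_far = overlap_area(wear, missing)
```

A feature with a small overlap for every pair of classes is sensitive to the faults in the sense used above.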
In addition, the feature Mean of original signal is taken as an example to check the distributions of the features under different loads. The PDFs of feature Means under four loads are illustrated in Fig. 10. From this figure, we can see that the feature Means under loads of 1.4 Nm and 2.8 Nm are more sensitive than those under loads of 0 Nm and 25.2 Nm. It follows that the identification capacities of the same feature under different loads are different, i.e. the DBNs trained by the features under different loads have different detection accuracies even if they have the same architectures and parameters, which will be verified in the next section. Thus it is necessary to investigate a new method to process the features, so that they can consider different working conditions simultaneously.

Probability density functions of original feature Mean under the loads of 0 Nm, 1.4 Nm, 2.8 Nm and 25.2 Nm.
First, 25-dimension feature samples consisting of the 25 original features are used to train the DBN, and then 50-dimension feature samples composed of the 25 original features and the 25 impulsive features are applied to study the auxiliary effect of the impulsive signals on classification. Training subsets containing 30,000 samples collected under each of the four loads (the remaining 5,000 samples are used to test the trained DBNs) are used to train the DBNs separately, in order to discuss the influence of load conditions. Samples under different loads are then selected randomly and grouped into mixed training subsets, and mixed subsets containing 30,000 and 120,000 samples, respectively, are used to train DBNs for comparison.
DBN design and data preprocessing
For more accurate and faster convergence, all the samples are grouped into “mini-batches” of 25 samples each, so that matrix-matrix multiplications can be used, which are efficient on GPU boards or in MATLAB [25, 26]. To facilitate the fitting, the training data are normalized to [0,1] by the following min-max normalization formula [27]:

$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$

where $x$ is the original feature value, $x_{\min}$ and $x_{\max}$ are the minimum and maximum values of that feature, and $x'$ is the normalized value.
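In code, min-max normalization and mini-batching might look like the following sketch. The helper names are hypothetical, the data are synthetic, and two details are assumptions rather than statements from this study: the minimum and maximum are taken per feature from the training subset only, and the samples are shuffled before being cut into batches.

```python
import numpy as np

def fit_min_max(train):
    """Per-feature minimum and maximum of the training subset."""
    return train.min(axis=0), train.max(axis=0)

def min_max_scale(x, x_min, x_max):
    span = np.where(x_max > x_min, x_max - x_min, 1.0)   # guard against constant features
    return (x - x_min) / span

def mini_batches(x, batch_size=25, rng=None):
    """Yield shuffled mini-batches of `batch_size` rows."""
    rng = np.random.default_rng() if rng is None else rng
    order = rng.permutation(x.shape[0])
    for start in range(0, x.shape[0], batch_size):
        yield x[order[start:start + batch_size]]

rng = np.random.default_rng(0)
train = rng.normal(size=(30_000, 25))     # synthetic stand-in for one 25-dimension training subset
x_min, x_max = fit_min_max(train)
scaled = min_max_scale(train, x_min, x_max)
batches = list(mini_batches(scaled, 25, rng))
```

With 30,000 training samples and 25 samples per batch, one epoch consists of 1,200 mini-batches. If test samples are scaled with the training minimum and maximum, a few values may fall slightly outside [0,1].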
The key parameters of the DBNs are designed as follows. First, a proper topology is needed. Too many layers easily lead to overfitting, while too few layers may lead to underfitting and longer training time. In this study, following some empirical formulas proposed in [28, 29], a number of networks with different structures are tested, and the one with minimum error and maximum accuracy is chosen as the optimal network topology. For the 25-dimension sample subsets, the DBN of ‘25-50-23-5’ (i.e. the DBN has 4 layers consisting of 25, 50, 23 and 5 nodes, respectively) is the best choice, while for the 50-dimension sample subsets the optimal network structure is ‘50-100-52-26-5’. Meanwhile, the learning rate is adjusted adaptively according to the classification error [30, 31]: when the error increases, the learning rate is reduced to decrease the update step size. The initial learning rate is set to 0.1, and the adjustment factor is 0.95 [1]. In addition, to avoid unstable oscillations and speed up training, the momentum is set to 0.9 according to [26, 32].
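The adaptive learning rate and the momentum term can be illustrated with a minimal sketch. The values 0.1, 0.95 and 0.9 are taken from the text; the rule of comparing the error with that of the previous epoch, the function names and the error sequence are assumptions made for illustration.

```python
def adapt_learning_rate(lr, error, previous_error, factor=0.95):
    """Shrink the learning rate when the classification error has increased."""
    if previous_error is not None and error > previous_error:
        return lr * factor
    return lr

def momentum_step(weight, velocity, gradient, lr, momentum=0.9):
    """Gradient descent with momentum: the velocity accumulates past updates."""
    velocity = momentum * velocity - lr * gradient
    return weight + velocity, velocity

lr = 0.1
previous = None
history = []
for error in [0.40, 0.30, 0.32, 0.25, 0.27, 0.20]:   # hypothetical per-epoch classification errors
    lr = adapt_learning_rate(lr, error, previous)
    history.append(lr)
    previous = error

w, v = momentum_step(weight=1.0, velocity=0.0, gradient=2.0, lr=0.1)
```

In this example the error rises twice, so the learning rate is multiplied by 0.95 twice and ends at 0.1 × 0.95² = 0.09025.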
The average diagnostic accuracies of 30 repeated runs for all of the subsets are compared in Table 2, and the boxplots of accuracies for 120,000 mixed samples are plotted in Fig. 11, and the variation curves of classification errors are shown in Figs. 12 and 13.
Accuracy comparison of different feature subsets

Boxplots of the accuracies of DBNs trained with 120,000 mixed 25-dimension samples and 50-dimension samples.

Classification error curves of DBNs trained with 25-dimension sample subsets.

Classification error curves of DBNs trained with 50-dimension sample subsets.
As shown in the first four columns of Table 2, the test accuracies under the four single loads are 97.52%, 96.26%, 97.32% and 93.30%, respectively, for the 25-dimension sample subsets, and 99.52%, 98.82%, 98.65% and 95.82% for the 50-dimension sample subsets. Comparing these four columns, the DBNs trained with subsets under single loads of 0 Nm, 1.4 Nm and 2.8 Nm perform better than those trained with sample subsets under the load of 25.2 Nm. This is mainly because the vibration caused by normal rotation becomes more severe under heavy load, which masks part of the impulse components caused by faults.
The difference between the training accuracy and the testing accuracy represents the generalization capability of the trained DBN over unseen data. As shown in the first four columns of Table 2, the accuracy differences are 0.90%, 1.96%, 1.58% and 4.15%, respectively, for the 25-dimension sample subsets under single loads, and 0.40%, 1.12%, 0.68% and 2.46% for the 50-dimension subsets under single loads. The differences under the load of 25.2 Nm are clearly larger than the others, which confirms from another angle that load conditions affect the sensitivities of the features to different faults.
Additionally, looking at the entire table, for all the subsets under various loads the accuracies for 50-dimension samples are higher than those for 25-dimension samples. Furthermore, from a statistical point of view, Fig. 11 shows that the DBN trained with 50-dimension samples performs better than the DBN trained with 25-dimension samples. Comparing Fig. 12 with Fig. 13, the DBNs trained with 50-dimension samples also have lower classification errors and converge faster. These analyses show, in terms of both accuracy and convergence, that the impulsive features make up for some disadvantages of the original features and play a positive role in the fault diagnosis.
Comparing the first four columns with the fifth column in Table 2, with the same number of training samples, the DBNs trained with mixed sample subsets yield lower classification accuracies than the DBNs trained with sample subsets under single loads. Moreover, both Figs. 12 and 13 show that the DBNs trained with 30,000 mixed samples have larger classification errors and more fluctuations during learning. This is because multiple load conditions cause higher inter-cluster similarity in the mixed sample subsets. Reducing the interference of load conditions is therefore an important way to improve the fault diagnosis, and a method that achieves this by processing the features is proposed in the next section.
Focusing on the last two columns, the DBNs trained with mixed subsets composed of 120,000 samples outperform those trained with mixed subsets composed of 30,000 samples, not only in the training and testing accuracies but also in the accuracy differences. Besides, Figs. 12 and 13 show that the DBNs trained with 120,000 mixed samples have faster convergence speeds and fewer fluctuations. The main reason for these is that with more samples available for training, DBNs can fit the distribution of the input data more adequately. Therefore, to create more samples is an effective way to improve diagnosis.
In addition, a support vector machine (SVM) and a deep neural network (DNN) are used for comparison to validate the superiority of the DBN over other machine learning techniques. The relevant parameters of both the SVM and the DNN are optimized, and the 120,000 mixed 25-dimension and 50-dimension samples are used to train and test them. The average accuracies of 30 repeated runs are shown in Table 3. The accuracies of the DBNs are higher than those of the DNNs and SVMs by roughly 8% and 24%, respectively, which indicates that DBNs outperform the other classifiers on this task.
Accuracy comparison of different classifiers
The discussion in Section 5 shows that load conditions have a severe influence on the sensitivities of the features, so that the DBNs have unsatisfactory robustness. An approach called feature uniformation, which reduces the interference of loads, is therefore presented in this section to improve the fault diagnosis.
In practical applications, only a limited number of samples under certain typical load conditions can be obtained to train the DBNs, whereas real gear faults may occur under various and unpredicted loads. To simulate this practical condition, mixed sample subsets under three of the loads are used to train the DBNs in this study, and the sample subset under the remaining load is used to test the trained DBNs. The experiments are first carried out with the 50-dimension samples, and then the features processed by feature uniformation are applied to validate the effect of the proposed method.
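The train/test protocol described above, in which one load is held out entirely, can be sketched as follows. The array names, the helper function and the synthetic data are hypothetical; only the four load values and the number of classes come from the text.

```python
import numpy as np

LOADS_NM = (0.0, 1.4, 2.8, 25.2)

def split_by_held_out_load(features, labels, loads, held_out):
    """Train on every sample whose load differs from `held_out`, test on the rest."""
    test_mask = np.isclose(loads, held_out)
    train = (features[~test_mask], labels[~test_mask])
    test = (features[test_mask], labels[test_mask])
    return train, test

# Synthetic stand-in data: 400 samples, 50 features, 5 gear conditions.
rng = np.random.default_rng(0)
n = 400
loads = rng.choice(LOADS_NM, size=n)
features = rng.random((n, 50))
labels = rng.integers(0, 5, size=n)

(train_x, train_y), (test_x, test_y) = split_by_held_out_load(
    features, labels, loads, held_out=1.4
)
```

Because no sample recorded under the held-out load is seen during training, the test accuracy measures how well the classifier tolerates an unseen load, not how well it fits the loads it was trained on.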
The results of the experiments using the 50-dimension samples are shown in the first two rows of Table 4. As shown in the first row, the DBN trained with the mixed subset under loads of 0 Nm, 2.8 Nm and 25.2 Nm has a training accuracy of 99.62%, but the test accuracies are generally lower. Worse still, the test accuracy for the subset under the load of 1.4 Nm is only 50.57%, far lower than the others, since this load has not been learned by the DBN. The same is true for the mixed sample subset under loads of 0 Nm, 1.4 Nm and 25.2 Nm, which is shown in the second row. This is mainly because load conditions severely affect the distributions of the features, and the trained DBNs have difficulty tolerating samples under unseen loads.
Accuracy comparison among DBNs trained with original features and uniformed features
1 The features extracted from original signals under 0 Nm, 1.4 Nm and 25.2 Nm are mixed with each other to test the trained DBNs.
2 The DBNs are trained by features extracted from original signals under 0 Nm, 2.8 Nm and 25.2 Nm.
Four typical features, namely Peak, RMS (root mean square), SMR (square mean root) and CF (clearance factor), are taken as examples to study the influence of load conditions on the distributions of the features, and the results are illustrated in Figs. 14 and 15. Fig. 14 shows that there is a large overlapping area among the curves, which means that the inter-cluster similarities of those features are too high, i.e. those features under mixed loads are not sensitive to faults. Fig. 15, in contrast, shows that the curves are clearly separated from each other, which means that the intra-cluster similarities are too low, i.e. those features are sensitive to load conditions. Both factors seriously hinder fault diagnosis. Therefore, an approach named feature uniformation is developed to improve these features by minimizing the inter-cluster similarities and maximizing the intra-cluster similarities.

Probability density functions of original features under mixed loads of 0 Nm, 1.4 Nm, 2.8 Nm, 25.2 Nm.

Probability density functions of original features under four single loads.
Through attempts, we find that a parameter MEAN (It has the same formula with the feature Mean, but it will be written as MEAN to differentiate itself from the feature Mean.) calculated by original signals has a significant impact on the distributions of time-domain features, which can be calculated by:
Briefly, feature uniformation is to divide the time-domain features by the average of all MEANs that are extracted from several original signals under different loads. The concrete procedure is given as follows.
In this study, the MEANs calculated from the original signals are shown in Table 5, and the last two rows of Table 5 show the uniformation factors F_MEAN calculated from the MEANs under the corresponding loads. These factors are then used to uniform the 16 time-domain features (the feature Mean is discarded, because its values are roughly equal to MEAN, which makes the uniformed feature Mean useless), and the distributions of the uniformed features are shown in Figs. 16 and 17.
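The division step of feature uniformation can be sketched as follows. This is a sketch under loudly stated assumptions, not the procedure of this study: the MEAN formula is not reproduced in this text, so the mean absolute amplitude is used as a stand-in, and the choice of which signals are grouped to form one F_MEAN is left to the caller. All function names are hypothetical.

```python
import numpy as np

def mean_parameter(signal):
    # ASSUMPTION: MEAN is taken as the mean absolute amplitude of the signal;
    # the formula used in the paper is not available in this text.
    return float(np.mean(np.abs(signal)))

def uniformation_factor(signals_under_training_loads):
    """F_MEAN: average of the MEANs of several original signals under different loads."""
    return float(np.mean([mean_parameter(s) for s in signals_under_training_loads]))

def uniform_features(time_features, f_mean):
    """Divide the time-domain features by the uniformation factor."""
    return np.asarray(time_features, dtype=float) / f_mean

# Synthetic signals standing in for recordings under three training loads.
rng = np.random.default_rng(0)
signals = [scale * rng.standard_normal(1024) for scale in (1.0, 1.5, 3.0)]
f_mean = uniformation_factor(signals)

# Hypothetical time-domain feature vector of one sample (16 values).
sample_features = rng.random(16)
uniformed = uniform_features(sample_features, f_mean)
```

Whatever the exact definition of MEAN, the factor is computed once from the training signals and then applied to every sample, so test samples under an unseen load are scaled with a factor that does not depend on that load.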
The MEANs and F_MEANs of original signals
1 The F_MEAN is calculated by feature Means extracted from original signals under 0 Nm, 1.4 Nm and 25.2 Nm.

Probability density functions of uniformed features under mixed loads of 0 Nm, 1.4 Nm, 2.8 Nm, 25.2 Nm.

Probability density functions of uniformed features under four single loads.
Comparing Fig. 16 with Fig. 14, we find that the inter-cluster similarities of uniformed features become lower. In Fig. 17, the curves almost overlap with each other, which means that the intra-cluster similarities become higher. The above analysis shows that the interferences of the load conditions are reduced successfully.
Then the 16-dimension samples consisting of the 16 time-domain features that have been uniformed will be used to train and test the DBNs to further validate the effectiveness of feature uniformation. The results are shown in the last two rows in Table 4. It is worth noting from Table 4 that the test accuracies for all the subsets of uniformed features are 100%, no matter what loads the test samples are under and no matter whether the loads have been used to train.
Therefore, through the above analysis, it can be concluded that the proposed method can efficiently reduce the interference of load conditions. It can be speculated that with more samples under more different load conditions, the trained DBNs can be adequate for more complex diagnosis tasks. In further work, more complex operating conditions containing different motor speeds and different brake torques will be taken into consideration.
To achieve accurate and efficient fault diagnosis for planetary gearboxes, an intelligent fault diagnosis method integrating impulsive signals, deep belief networks and feature uniformation is proposed in this study. In the experiment, the vibration signals of each faulty gear are collected under four different load conditions. Since gear faults usually generate repetitive impulses, these impulses are extracted from the original signals by a weak transient feature detection method to improve classification. Probability density functions are applied to represent the sensitivities of the gear fault features, which is more intuitive than traditional scatter diagrams. The features are then grouped into samples to train and test the DBNs, and the experimental results demonstrate that diagnosis based on DBNs is feasible and that the impulsive signals make up for some shortcomings of the original signals and improve the diagnostic accuracies. Moreover, as the sensitivities of the features are severely influenced by the load conditions, an approach based on MEAN is developed to uniform the time-domain features and reduce the interference of load conditions. With the improved features, the DBNs perform the fault diagnosis tasks with accuracies of 100%, which demonstrates the superiority of the new features over the traditional features. The proposed method therefore has great potential and practical value in diagnosing various faults of planetary gearboxes under mixed working conditions.
Acknowledgments
The work described in this paper was supported by the National Natural Science Foundation of China (no. 51675065), Chongqing Research Program of Basic Research and Frontier Technology (no. cstc2017jcyjAX0459) and the project for industrial transformation and foundation enhancement funded in 2014.
