Abstract
To guarantee the performance and security of complex systems, this paper studies fault diagnosis and fault prediction for such systems. The proposed fault diagnosis and prediction system is made up of three parts: 1) data preprocessing, 2) degradation state detection, and 3) fault diagnosis. We use the wavelet transform correlation filter to extract features for fault diagnosis and prediction; in particular, the direct spatial correlations of the wavelet transform contents are used to locate edges. To improve the performance of the hidden Markov model (HMM), we propose an HMM-based semi-nonparametric method built on the probabilistic transition frequency profile matrix and the average probabilistic emission matrix. The modified HMM is then used to find the training sequence that is most similar to a given sequence. Experimental results show that the proposed algorithm improves the accuracy of equipment fault diagnosis and equipment state recognition.
Introduction
With the growing emphasis on science and technology, high technology has entered all aspects of human life and has made engineering systems in industry, communications, aviation and other fields more complex, integrated and intelligent [1, 2]. Meanwhile, with the rapid development of these complex engineering systems, development and production costs keep rising, and reliability and safety requirements have increased significantly as well [5]. When such systems fail, the failure may cause serious economic losses and environmental pollution, and may sometimes endanger personal safety or cause catastrophic destruction [6, 7].
Because one cannot know in advance when a system failure will occur, there are two main maintenance strategies: 1) regular maintenance and 2) corrective maintenance. Regular maintenance not only costs a lot of money, but the maintenance work itself may also introduce other risks to the system [8]. Therefore, to enhance the reliability, safety and economy of complex engineering systems, Prognostics and Health Management (PHM) technology has become more and more important [9, 10].
The concept of PHM was first proposed for military equipment and has since been applied to aircraft, spacecraft and other complex systems. PHM technology consists of two main elements: 1) failure prediction and 2) health management [11]. Failure prediction uses the current and historical information of the system to make a pre-diagnosis of its health and to estimate its remaining life. Health information management takes a diagnosis or prognosis of the system as input and then makes the appropriate decisions. Health information management can effectively enhance the security and reliability of the system [12].
Reliability and safety analysis of complex systems is well known to be one of the key problems in reliability engineering. In this paper, we focus on fault diagnosis and prediction for complex systems. The rest of the paper is organized as follows. Section 2 reviews related work on the hidden Markov model. In Section 3, the framework of the fault diagnosis and prediction method for the complex system is given. Section 4 proposes the fault diagnosis and prediction method that can be used in the complex system. Experimental results and related analysis are given in Section 5. Finally, conclusions are drawn in Section 6.
Related works
A hidden Markov model (HMM) is a statistical Markov model in which the system is assumed to be a Markov process with hidden states; an HMM can be regarded as a simple dynamic Bayesian network. In a hidden Markov model, the state is not directly visible, but the output, which depends on the state, is visible. Each state has a probability distribution over the possible output tokens. Hence, the sequence of tokens produced by an HMM provides information about the sequence of states. In the following, we list related work on the hidden Markov model.
Roth et al. proposed a computationally efficient method for mobile subscriber position estimation in wireless networks. The approach aims to improve accuracy over the traditional case in which no data scaling is used, and it is evaluated in a simulated environment under changing channel conditions [13].
Kostiou et al. aimed to detect and classify members of the four distinct families, plus the G beta and the G gamma subunits, of G-proteins from sequence alone. In particular, this work constructed six specific profile Hidden Markov Models (pHMMs), which were applied to ten proteomes and then used to identify all known G-proteins [14].
Benoit et al. proposed an analytical likelihood method in which the hidden state is modeled as a three-state CTMC model, allowing for some observed states to be possibly misclassified. Covariate effects of the hidden process and misclassification probabilities of the hidden state are computed without using information from a 'gold standard' as comparison [15].
Glas et al. proposed the double-stranded HMM, which is used to analyze strand-specific genomic processes. In particular, this paper applied the double-stranded HMM to yeast using strand-specific transcription data, nucleosome data, and protein binding data for a set of 11 factors associated with the regulation of transcription [16].
Dammeier et al. utilized an automatic classification approach with hidden Markov models to discover rockslide signals in seismic data from two stations in central Switzerland [17].
Gassiat et al. proved that finite state space nonparametric HMMs are identifiable as soon as the transition matrix of the latent Markov chain has full rank and the emission probability distributions are linearly independent. Moreover, this work demonstrated that the general result allows the use of semi- or nonparametric emission distributions [18].
Apart from the above research works, the hidden Markov model has been used in other domains, such as localizing the latent structure canonical uncertainty [19], action recognition [20], cell outage detection in 5G HetNets [21], apnea bradycardia detection [22], and human behavior recognition [23].
Overview of fault diagnosis and prediction method for the complex system
As shown in Fig. 1, we propose a structure of fault diagnosis and prediction for the complex system.

Fig. 1. Structure of fault diagnosis and prediction for the complex system.
From Fig. 1, it can be seen that our proposed fault diagnosis and prediction system is made up of three modules: 1) Data preprocessing, 2) Degradation state detection, and 3) Fault diagnosis. Particularly, the data preprocessing step is discussed in this section.
We use the wavelet transform correlation filter to extract features for fault diagnosis and prediction of the complex system. The direct spatial correlation $Cr_l(m, n)$ of the wavelet transform contents is used to locate edges by the following equation.
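The equation itself is not reproduced in this text. As a minimal sketch, assuming the commonly used definition in which the direct spatial correlation is the product of the wavelet coefficients at position $n$ over $l$ adjacent scales starting at scale $m$, i.e. $Cr_l(m, n) = \prod_{i=0}^{l-1} W(m+i, n)$, the computation could look as follows. The function name `spatial_correlation`, the coefficient layout `W[scale][position]` and the toy values are illustrative assumptions, not taken from the paper.

```python
from math import prod

def spatial_correlation(W, m, n, l=2):
    """Assumed definition: Cr_l(m, n) = prod_{i=0}^{l-1} W[m + i][n].

    W holds one row of wavelet coefficients per scale; all rows have the
    same length (as with an undecimated transform).
    """
    if m + l > len(W):
        raise ValueError("need l scales starting at scale index m")
    return prod(W[m + i][n] for i in range(l))

# Toy coefficients (hypothetical): the edge at position 2 persists across
# scales, while the small noise coefficients do not line up.
W = [
    [0.1, -0.2, 3.0, 0.3, -0.1],  # finest scale
    [0.0,  0.1, 2.5, 0.2,  0.0],
    [0.1,  0.0, 2.0, 0.1,  0.1],
]
corr = [spatial_correlation(W, 0, n, l=2) for n in range(len(W[0]))]
# The position with the largest correlation magnitude is taken as the edge.
edge_position = max(range(len(corr)), key=lambda n: abs(corr[n]))
```

Multiplying across scales sharpens features that are present at every scale and suppresses coefficients that are large at one scale only, which is why the product is useful for locating edges.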
Based on the above definition, the Wavelet transform correlation filter based feature extraction method is illustrated as follows.
Afterwards, the time sequence signal $x(n)$ is obtained by the wavelet transform correlation filter, and the scale wavelet coefficients $\{D_1, D_2, \cdots, D_j\}$ and the scale coefficient $C_j$ can be obtained.
Exploiting the information entropy theory, Wavelet correlation feature scale entropy is defined as follows.
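The defining formula is not reproduced in this text. The sketch below assumes one plausible reading: for each scale, the squared coefficients are normalized into a probability distribution and their Shannon entropy is taken as the feature of that scale. The function names and the toy input are hypothetical, and the authors' exact definition may differ.

```python
from math import log

def scale_entropy(coeffs):
    """Shannon entropy of the normalized coefficient energies of one scale.

    Assumption: p_i = c_i^2 / sum_k c_k^2 and H = -sum_i p_i * ln(p_i).
    """
    energies = [c * c for c in coeffs]
    total = sum(energies)
    if total == 0.0:
        return 0.0  # an all-zero scale carries no information
    probs = [e / total for e in energies if e > 0.0]
    return -sum(p * log(p) for p in probs)

def entropy_feature_vector(scales):
    """One entropy value per scale, giving a feature vector of the signal."""
    return [scale_entropy(row) for row in scales]

# Hypothetical two-scale example: evenly spread energy versus concentrated energy.
features = entropy_feature_vector([[1.0, 1.0], [0.0, 2.0]])
```

Under this assumption, a scale whose energy is spread evenly gets a high entropy, while a scale dominated by a single coefficient gets an entropy close to zero.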
In this section, we will discuss how to tackle the problem of fault diagnosis and prediction of complex system utilizing the Hidden Markov model. Assume that
Subject to
We define
Subject to
where $\pi_i = P(Z_1 = S_i)$. Suppose that for a state evolution $Q = \{q_1, q_2, \cdots, q_T\}$ and a set of symbols $V = \{v_1, v_2, \cdots, v_T\}$, the sequence probability is estimated as follows.
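The estimation formula is not reproduced in this text. A standard way to evaluate the probability of an observation sequence under an HMM is the forward algorithm, sketched below for discrete symbols. The argument names (`pi` for the initial distribution, `A` for the transition matrix, `B` for the emission matrix) and the numeric values are illustrative assumptions.

```python
def sequence_probability(pi, A, B, obs):
    """Forward algorithm: P(obs | pi, A, B) for a discrete-emission HMM.

    pi[i]   : probability of starting in state i
    A[i][j] : probability of moving from state i to state j
    B[i][k] : probability of emitting symbol k in state i
    obs     : list of symbol indices
    """
    n_states = len(pi)
    # alpha[i] = P(o_1..o_t, state at time t = i)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n_states)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n_states)) * B[j][o]
            for j in range(n_states)
        ]
    return sum(alpha)

# Hypothetical two-state, three-symbol model.
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
obs = [0, 1, 2]
p = sequence_probability(pi, A, B, obs)
```

For long sequences the products underflow, so practical implementations rescale `alpha` at each step or work with log-probabilities.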
To improve the performance of the hidden Markov model, we propose an HMM-based semi-nonparametric method that builds on the log-likelihood. The parameter estimation process is the same as in the traditional HMM-based method. When the expectation-maximization training process ends, the trained HMM is applied to the training sequences.
Using probabilistic inference, two matrices are calculated and saved for every training sequence: 1) the probabilistic transition frequency profile matrix (denoted as F), and 2) the average probabilistic emission matrix (denoted as E). The matrices F and E represent a specific sequence under the HMM in the state space and in the observation domain, respectively.
The probability distribution of the state values at time t is defined as follows.
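The definition is not reproduced in this text. The usual quantity with this meaning is the state posterior $\gamma_t(i) = P(Z_t = S_i \mid O)$, which the forward-backward procedure computes. The sketch below assumes discrete emissions; the function name `state_posteriors` and the toy parameters are hypothetical.

```python
def state_posteriors(pi, A, B, obs):
    """Forward-backward: gamma[t][i] = P(state at time t = i | obs)."""
    N, T = len(pi), len(obs)
    alpha = [[0.0] * N for _ in range(T)]
    beta = [[1.0] * N for _ in range(T)]
    # Forward pass.
    for i in range(N):
        alpha[0][i] = pi[i] * B[i][obs[0]]
    for t in range(1, T):
        for j in range(N):
            alpha[t][j] = sum(alpha[t - 1][i] * A[i][j] for i in range(N)) * B[j][obs[t]]
    # Backward pass.
    for t in range(T - 2, -1, -1):
        for i in range(N):
            beta[t][i] = sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(N))
    likelihood = sum(alpha[T - 1])
    return [[alpha[t][i] * beta[t][i] / likelihood for i in range(N)] for t in range(T)]

# Hypothetical two-state, three-symbol model.
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
gamma = state_posteriors(pi, A, B, [0, 1, 2])
```

Each row of `gamma` is a probability distribution over the hidden states at one time step.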
For a particular observation sequence, the average probabilistic emission matrix E is calculated as follows.
On the other hand, probabilistic transition frequency profile matrix F is calculated as follows.
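The formulas for E and F are not reproduced in this text, so the following is only a sketch under explicit assumptions: F is taken to be the time average of the expected transition probabilities $\xi_t(i, j)$, and E is taken to be the time average of the state posteriors $\gamma_t(i)$ accumulated per observed symbol. Both choices, and all function names, are assumptions made for illustration; the authors' definitions may differ.

```python
def forward_backward(pi, A, B, obs):
    """Return (alpha, beta, likelihood) for a discrete-emission HMM."""
    N, T = len(pi), len(obs)
    alpha = [[0.0] * N for _ in range(T)]
    beta = [[1.0] * N for _ in range(T)]
    for i in range(N):
        alpha[0][i] = pi[i] * B[i][obs[0]]
    for t in range(1, T):
        for j in range(N):
            alpha[t][j] = sum(alpha[t - 1][i] * A[i][j] for i in range(N)) * B[j][obs[t]]
    for t in range(T - 2, -1, -1):
        for i in range(N):
            beta[t][i] = sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(N))
    return alpha, beta, sum(alpha[T - 1])

def transition_profile(pi, A, B, obs):
    """Assumed F: xi_t(i, j) averaged over t. Requires len(obs) >= 2."""
    N, T = len(pi), len(obs)
    alpha, beta, lik = forward_backward(pi, A, B, obs)
    F = [[0.0] * N for _ in range(N)]
    for t in range(T - 1):
        for i in range(N):
            for j in range(N):
                xi = alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] / lik
                F[i][j] += xi / (T - 1)
    return F

def emission_profile(pi, A, B, obs):
    """Assumed E: gamma_t(i) accumulated per observed symbol, averaged over t."""
    N, T, M = len(pi), len(obs), len(B[0])
    alpha, beta, lik = forward_backward(pi, A, B, obs)
    E = [[0.0] * M for _ in range(N)]
    for t in range(T):
        for i in range(N):
            E[i][obs[t]] += alpha[t][i] * beta[t][i] / lik / T
    return E

# Hypothetical two-state, three-symbol model.
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
obs = [0, 1, 2, 2]
F = transition_profile(pi, A, B, obs)
E = emission_profile(pi, A, B, obs)
```

With these assumed definitions, the entries of F sum to one, as do the entries of E, so sequences of different lengths give profiles on the same scale.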
Utilizing the above definitions, we aim to obtain the training sequence which is the most similar to a particular sequence according to the values of probabilistic transition frequency profile and average probabilistic emission in one class. Afterwards, the highest similarity value is calculated from each class.
To implement the fault diagnostics task, the values of probabilistic transition frequency profile and average probabilistic emission are calculated using each hidden Markov model for a particular class.
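The decision procedure described above can be sketched as follows: for each class, the query profile computed under that class's HMM is compared with the stored profiles of the class's training sequences, the highest similarity per class is kept, and the class with the overall highest value is chosen. The similarity is passed in as a function; the `negative_distance` used in the example is a hypothetical placeholder, not the similarity function of this paper, and the profile vectors are toy values.

```python
from math import sqrt

def classify(query_profiles, training_profiles, similarity):
    """Pick the class whose most similar training sequence scores highest.

    query_profiles    : dict label -> profile of the query under that class's HMM
    training_profiles : dict label -> list of profiles of that class's training sequences
    similarity        : function (profile, profile) -> float, larger means more similar
    """
    best_per_class = {}
    for label, profiles in training_profiles.items():
        q = query_profiles[label]
        best_per_class[label] = max(similarity(q, p) for p in profiles)
    best_label = max(best_per_class, key=best_per_class.get)
    return best_label, best_per_class

def negative_distance(p, q):
    """Hypothetical placeholder similarity: negative Euclidean distance."""
    return -sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Toy flattened profiles (hypothetical values) for the three classes.
training = {
    "BRG": [[0.9, 0.1], [0.8, 0.2]],
    "UBR": [[0.5, 0.5]],
    "HTY": [[0.1, 0.9]],
}
query = {"BRG": [0.85, 0.15], "UBR": [0.8, 0.2], "HTY": [0.7, 0.3]}
label, scores = classify(query, training, negative_distance)
```

In practice each profile would combine the F and E matrices of a sequence, flattened or compared matrix by matrix, depending on the similarity function in use.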
The similarity function is defined as follows.
where
In this section, we test the performance of the proposed method in two parts: 1) fault diagnosis, and 2) fault prediction (equipment degradation state recognition).
Fault diagnosis
In this part, we use two common fault modes of synchronous motors to test the performance of our method: 1) unbalanced rotor and 2) bearing faults. A SpectraQuest fault simulator machine is used in this experiment. The collected data comprise sequences for the bearing fault (BRG), the unbalanced rotor fault (UBR), and the healthy machine (HTY), with 180 sequences for each class; 10 sequences are saved at each machine operating speed in the range [15 Hz, 32 Hz].
For performance comparison, healthy machine sequences are used as well. Figures 2 to 4 show three sample vibration sequences under different conditions, that is, 1) bearing fault, 2) unbalanced rotor and 3) healthy conditions at 23 Hz.

Fig. 2. Samples from bearing fault.

Fig. 3. Samples from unbalanced rotor.

Fig. 4. Samples from healthy conditions.
In this experiment, 150 samples are used for training, and 100 samples are utilized for performance testing.
As shown in Table 1, the average fault diagnosis rate over the three classes is 94%.
Table 1. Equipment fault diagnosis results
To test the performance of the proposed approach in fault prediction for the complex system, we analyze the running states of a roller bearing from the normal state to the abnormal state on a roller bearing test bed. The test rig and the test data come from the electrical engineering laboratory of Case Western Reserve University, USA. In the test rig, a three-phase induction motor with a power of 1.5 kW is connected to a power meter and a torque sensor through a self-calibrating coupling. The experimental data are collected using vibration acceleration sensors.
In this experiment, we define five levels to describe the equipment running states: 1) L1: normal state, 2) L2: degradation state 1, 3) L3: degradation state 2, 4) L4: degradation state 3, and 5) L5: degradation state 4. In particular, L1 denotes the normal state, and L2, L3, L4 and L5 refer to the degradation states with fault depths of 0.18 mm, 0.36 mm, 0.53 mm and 0.71 mm, respectively. For each state, we collect 60 samples, of which 20 are used as training samples and the remaining 40 as testing samples. The sampling frequency is set to 12 kHz.
The wavelet correlation feature scale entropy eigenvectors are represented as $T = \{W_{c1}, W_{c2}, W_{c3}, W_{c4}, W_{c5}\}$, and the rescaled $W_c$ eigenvectors are represented as follows.
Afterwards, because the information of the vibration signal lies in the high-frequency data, we simplify the $W_c$ eigenvectors to $T_s = \{W_{c1}, W_{c2}\}$.
Table 2 shows that the values of $W_{c1}$ and $W_{c2}$ are higher when the faults are more serious. The $W_c$ eigenvectors are used as the input eigenvectors of the hidden Markov model for roller bearing fault diagnosis and prediction. The iterative steps of each model and the log-likelihood value at each iterative step are shown in Fig. 5.
Table 2. Values for different equipment running states

Fig. 5. Training curve of the proposed method.
Figure 5 shows that the iterative curves of the L1, L2, L3, L4 and L5 models all reach the error criterion during training, and the training of all five models finishes within thirty steps. Therefore, the training speed of the proposed method is fast. The state recognition results are shown in Table 3, in which 40 testing samples are used.
Table 3. Equipment state recognition results
As shown in Table 3, the average accuracy of equipment state recognition achieved by our proposed approach is 89.5%.
Conclusions
This paper addresses fault diagnosis and fault prediction in the complex system. Firstly, the proposed fault diagnosis and prediction system contains three parts: 1) data preprocessing, 2) degradation state detection, and 3) fault diagnosis. Secondly, we use the wavelet transform correlation filter to extract features for the complex system, and the direct spatial correlations of the wavelet transform contents are used to locate edges. Thirdly, we propose an HMM-based semi-nonparametric method to obtain the most similar training sequence. Finally, experimental results demonstrate the effectiveness of the proposed algorithm.
Acknowledgments
This work is supported by the National Natural Science Foundation of China (Grant Nos. 71332003 & 71501007). The study is also sponsored by the Aviation Science Foundation of China (Grant No. 2014ZG51075) and the Technical Research Foundation.
