Sparse coding based RUL prediction and its application on roller bearing prognostics

Abstract

Roller bearings are among the most frequently encountered components in rotating machines. Thus, prognostics and health management (PHM) of roller bearings plays an important role in keeping machines in good working condition. Remaining useful life (RUL) prediction is one of the keys to applying PHM in practice. The collected bearing vibration signals are generally non-linear and non-stationary. However, auto-regression model based methods are only suitable for predicting linear and stationary time series. Moreover, most existing machine learning based techniques require considerable training and parameter tuning, which are time consuming and difficult for practical applications. To overcome these issues, a novel RUL prediction method for rolling bearing prognostics is proposed in this work, based on sparse coding and a sparse linear auto-regression model, without training or parameter tuning. Sparse coding is formulated as a basis pursuit L₁-norm problem, in which a sparse set of weights is estimated for each test vector. Sparse local linear and neighbor embedding are employed to construct the proposed weight constraint sparse coding method. Two experimental validations are conducted to demonstrate the effectiveness and robustness of the proposed method for RUL prediction of bearings using root-mean-square, peak-to-peak and kurtosis indicators in the time domain.

Keywords

Prognostic and health management, trend prediction, remaining useful life, sparse coding, roller bearing

1 Introduction

Prognostic and health management (PHM) has recently attracted substantial attention due to its importance in increasing the maintainability and reliability of machines. PHM mainly involves condition monitoring [1], fault diagnosis [2], remaining useful life (RUL) prediction [3] and health management. If a fault can be predicted early enough, catastrophic consequences and economic losses can be avoided [4]. Thus, machine health management via RUL prediction has attracted much research interest in the past decades.

A large number of data-driven methods have been proposed for machine PHM [5]. Classical linear prediction techniques apply different autoregressive and moving average (ARMA) models [6]. Moreover, machine learning based techniques, such as fuzzy artificial neural networks [7], support vector machines (SVMs) [8, 9] and back-propagation neural networks (BPNN) [10], have also been used to analyze the degradation process of a machine. Tse et al. [11] proposed a method based on recurrent neural networks to forecast the RUL of a defective gearbox in a compressor and a bearing in a cooling tower fan. Recently, the RUL of rolling element bearings was predicted using SVM [12]. Huang et al. [13] employed a self-organizing map and BPNN to predict the trend of rolling element bearings. However, machine learning based techniques usually take much time in the training stage, and their prediction accuracy depends greatly on the historical data. Consequently, it is difficult to use them for practical online applications.

Recently, dictionary-based L₁-norm sparse coding (SC) has been used for image super resolution [14], pattern classification, nonlinear time series analysis, etc. Unlike traditional machine learning methods, the SC algorithm [15] is an unsupervised learning method that looks for an over-complete set of basis vectors to efficiently represent the sample data. Two dictionaries are constructed in an SC model for time series prediction: one contains the predictive training vectors, and the other contains the corresponding target values [16]. Thus, a test series can be constructed from a linear combination of the atoms of the first dictionary and sparse weights. The main advantage of SC approaches is that they need neither a training process nor time-consuming parameter tuning [17].

Consequently, this paper proposes a novel approach to predict the RUL of rolling bearings using an SC model combined with local linear embedding and sparse neighbor embedding techniques. The rest of this paper is organized as follows. Section 2 presents the theory of the SC prediction algorithm. In Section 3, the effectiveness of the proposed method is demonstrated using data from accelerated degradation tests on rolling element bearings. Conclusions are drawn in Section 4.

2 The RUL prediction using sparse coding

SC-based time series prediction models have been successfully applied in many fields. An improved version of SC is adopted in this work for fault prediction of rolling bearings. This section describes the proposed method in detail.

2.1 Dictionaries of SC

In the case of one-step-ahead time series prediction, the training data is described by prediction window vectors of length L and their corresponding scalar targets. Given a sample series X, a training vector with a predictive window is given by:

$X_{nL} = [x (n - 1), x (n - 2), x (n - 3), \dots, x (n - L)]$ (1) where x (n) is the next sample target, and L is the corresponding window length. The training data is used to estimate the parameters of the model.

SC is considered as an unsupervised method to efficiently represent samples via an over-complete dictionary. This prediction model needs two dictionaries: one is applied for the prediction vectors in the form of the matrix ψ with the dimension L × N, $ψ = [X_{1 L}, X_{2 L}, X_{3 L}, \dots, X_{NL}]$ (2)

The other dictionary is introduced by targets in the form of a vector X_N with the dimension N × 1, which can be written by: $X_{N} = [x (1), x (2), x (3), \dots, x (N)]$ (3)

In this paper, the sample series X_N is the fault characteristic vector that represents the fault trend of a rolling element bearing. A test vector can then be written as:

$X_{mL} = [x (m - 1), x (m - 2), x (m - 3), \dots, x (m - L)]$ (4)

2.2 Normalization of the training and testing vector

Normalization of the data is applied to improve its sparsity and dimensional homogeneity, which plays an important role in data mining and image processing. Thus, normalization is also adopted in the SC technique. The standard deviation G (denoted L₁-norm in this work) of a vector Y with length J is used for normalization in this paper, which is given by $G (Y) = \sqrt{\sum_{j = 1}^{J} {(Y (j) - \bar{Y})}^{2}}$ (5) where $\bar{Y} = \frac{1}{J} \sum_{j = 1}^{J} Y (j)$ . The training vector is normalized via $X_{nL}^{r} = X_{nL} / G (X_{nL})$ (6)

The target vector is normalized in the same way, $X_{N}^{r} = X_{N} / G (X_{nL})$ (7)

However, the test vector X_mL is normalized by its own norm: $X_{mL}^{r} = X_{mL} / G (X_{mL}) .$ (8)

Moreover, the dictionary ψ can also be normalized using Equation (6).
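A minimal Python sketch of the normalization in Equations (5) to (8) is given below. The function names are hypothetical, and reading Equation (7) as "each target is divided by the G value of its own window" is an assumption made for this sketch.

```python
import math


def g_norm(vector):
    """Equation (5): root of the summed squared deviations from the mean."""
    mean = sum(vector) / len(vector)
    return math.sqrt(sum((v - mean) ** 2 for v in vector))


def normalize_window(window):
    """Equations (6) and (8): divide a window by its own G value."""
    g = g_norm(window)
    if g == 0.0:
        raise ValueError("constant window: G is zero, normalization undefined")
    return [v / g for v in window]


def normalize_target(target, window):
    """Equation (7), as read here: scale a target by the G value of its window."""
    g = g_norm(window)
    if g == 0.0:
        raise ValueError("constant window: G is zero, normalization undefined")
    return target / g
```

One consequence worth noting is that two windows that differ only by a positive scale factor, such as [1, 2, 3] and [10, 20, 30], map to the same normalized vector, which is what lets atoms from different amplitude levels of the degradation trend be compared directly.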

2.3 Weight constraint sparse coding

Many methods can be used to solve the weight estimation problem of SC, such as basis pursuit (BP), basis pursuit denoising (BPDN) and LASSO. In this work, BPDN is used for the SC approach, and the linear programming problem is solved with the primal-dual algorithm. The over-complete dictionary is modified for each test vector to include only its K-nearest neighbors [21]. Moreover, sparse vector reconstruction problems can be solved via several equivalent methods [19]. In this paper, sparse vector reconstruction is conducted by

$\min {∥ w ∥}_{1} \quad \text{subject to} \quad ψ^{r} \cdot w = X_{mL}^{r}$ (9) where ∥w∥₁ is the L₁ norm of the weight vector and ψ^r = ψ / G (ψ) is the normalized dictionary.
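As a hedged aside, a standard textbook way to see why Equation (9) can be handed to a linear programming solver is to split the weight vector into non-negative parts. The auxiliary vectors u and v are introduced here only for illustration and do not appear in the original formulation; the solver used in this work may rely on a different but equivalent form. Writing w = u - v with u ≥ 0 and v ≥ 0 gives

$\min_{u, v} 1^{T} \cdot (u + v) \quad \text{subject to} \quad ψ^{r} \cdot (u - v) = X_{mL}^{r}, \quad u \geq 0, \quad v \geq 0$

At an optimum, u and v cannot both be positive in the same entry (otherwise both could be reduced without violating the constraint), so $1^{T} \cdot (u + v) = {∥ w ∥}_{1}$ and both the objective and the constraints are linear in (u, v).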

Subsequently, constrained reconstruction algorithms, namely locally linear embedding (LLE) [20] and sparse neighbor embedding, are employed in this work. Combining these with Equation (9), for any given test vector the sparse weight, referred to as weight constraint sparse coding (WCSC), is obtained from

$\min {∥ w ∥}_{1} \quad \text{subject to} \quad {∥ ψ_{knn}^{r} \cdot w - X_{mL}^{r} ∥}_{2}^{2} \leq ɛ^{2}, \quad 1^{T} \cdot w = 1$ (10) where ∥w∥₁ is the L₁-norm of the weight vector, and $ψ_{knn}^{r}$ is the dictionary of normalized atoms that contains the K-nearest neighbors of the test vector $X_{mL}^{r}$ . If the dictionary and its constraints are not imposed in Equation (10), the algorithm reduces to the least squares autoregressive model, which performs well on linear and stationary signals.
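The reduction of the dictionary to the K-nearest neighbors of a test vector can be sketched as follows. This is an illustration only: the function name `k_nearest_atoms` is hypothetical, and the use of the Euclidean distance is an assumption, since the distance measure is not specified in the text.

```python
import math


def k_nearest_atoms(atoms, test_vector, k):
    """Return the indices of the k atoms closest to test_vector.

    Assumption: closeness is measured with the Euclidean distance.
    """
    if not 1 <= k <= len(atoms):
        raise ValueError("k must satisfy 1 <= k <= number of atoms")
    distances = [math.dist(atom, test_vector) for atom in atoms]
    # sorted() is stable, so ties keep their original dictionary order.
    order = sorted(range(len(atoms)), key=lambda i: distances[i])
    return order[:k]
```

The returned indices select both the columns of the normalized dictionary and the matching normalized targets, so the weights solved for in Equation (10) only involve those K atoms.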

2.4 RUL prediction based on sparse linear AR model

In this work, the time series is forecast via over-complete dictionary coding using a sparse linear prediction model. Given a training series with N samples, the test vector $X_{mL}^{r}$ is estimated by a weighted sum of the training vectors [22] $X_{mL}^{r} = \sum_{n = 1}^{N} w_{n} * X_{nL}^{r} + ɛ_{m}$ (11) where w_n, n = 1 : N, is estimated by minimizing the reconstruction error ɛ_m for each new test vector. The weights can be obtained by solving a regularized least squares optimization problem, $w = \arg \min_{w} {∥ ψ_{knn}^{r} \cdot w - X_{mL}^{r} ∥}_{2}^{2} + τ \cdot R_{N} (w)$ (12)

in which R_N (w) is a regularization function on the weight vector corresponding to a Lagrange coefficient τ. The weights can be estimated using Equation (10) if τ = 0. Then the estimated weights are used to calculate the predicted value of a test vector in terms of a linear superposition of (not necessarily orthogonal) basis functions [15], $x (m) = \sum_{n = 1}^{N} w_{n} * x (n)$ (13)

Then, Equation (13) can be further written as follows: $x (m) = w * X_{N}$ (14)

Combined with Equation (7), the predicted value is thus given by $x (m) = G (X_{nL}) * \sum_{n = 1}^{N} w_{n} * x^{r} (n)$ (15)
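The prediction step in Equations (13) to (15) can be illustrated with the sketch below. Two points are assumptions rather than statements from the text: each target is divided by the G value of its own window, and the final rescaling uses the G value of the test window. The names `predict_next` and `query_window` are hypothetical, and the weights are taken as given (for example, from Equation (10)).

```python
import math


def g_norm(vector):
    """Equation (5): root of the summed squared deviations from the mean."""
    mean = sum(vector) / len(vector)
    return math.sqrt(sum((v - mean) ** 2 for v in vector))


def predict_next(windows, targets, weights, query_window):
    """Weighted sum of normalized targets, rescaled to the query window."""
    if not (len(windows) == len(targets) == len(weights)):
        raise ValueError("windows, targets and weights must have equal length")
    # Assumed reading of Equation (7): target divided by its own window's G.
    normalized_targets = [t / g_norm(w) for t, w in zip(targets, windows)]
    # Equation (13) applied to the normalized targets.
    normalized_prediction = sum(
        c * t for c, t in zip(weights, normalized_targets)
    )
    # Assumption: undo the normalization with the G value of the query window.
    return g_norm(query_window) * normalized_prediction
```

Under these assumptions, a purely geometric trend is reproduced exactly by any weights that sum to one. For the series 1, 2, 4, 8, 16 with L = 2, the windows [2, 1], [4, 2], [8, 4] with targets 4, 8, 16 and the query window [16, 8] give a prediction of 32, because every normalized window and every normalized target is identical.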

Moreover, the optimal weight vector is also estimated in the process of solving a convex optimization problem.

In this research, BPDN is used for the SC approach, and the corresponding linear programming problem is solved with the primal-dual algorithm. The prediction error parameter is set to ɛ = 0.001 for all datasets [16] in this work. Because the degradation of roller bearings is non-stationary and non-linear, a prognostic technique needs good robustness for practical applications. The proposed method can handle these problems well, as demonstrated in the following section.

3 Experimental validation

In this section, data collected from two experimental test-rigs are used to verify the effectiveness of the proposed method. The first experimental validation mainly demonstrates the trend prediction performance of the proposed model with different step lengths, while the second further verifies the robustness and accuracy of the RUL prediction using several different datasets.

Moreover, the performance of the proposed method is compared with existing techniques, namely support vector regression combined with particle swarm optimization (PSO-SVR) and sparse nearest neighbor embedding (SNNE). PSO-SVR does not need any prior knowledge due to the analytic property of its generalization performance measure, so it can determine multiple hyper-parameters at the same time [23]. It should be noted that BP is used for SNNE and the dictionary is modified for each test vector. The L₁-magic library, which uses the primal-dual algorithm for linear programming, is utilized in the BP algorithm. Herein, PSO is applied to optimize the three hyper-parameters of support vector regression, γ, ɛ and c, in the cost function. Each approach uses the same parameter settings for all datasets in this work.

3.1 Time-domain feature and evaluation criterion

Three time-domain features, i.e., the peak-to-peak value F_p, kurtosis F_k and root mean square F_rms, are used to indicate the degradation of the bearing in this paper. Their definitions are given in Table 1.

Table 1

Time-domain features used in this work

Feature	Formula
Peak-to-Peak	$F_{p} = \max |x (n)|$
Root mean square	$F_{rms} = \sqrt{\sum_{n = 1}^{N} x (n)^{2} / N}$
Kurtosis	$F_{k} = \sum_{n = 1}^{N} x (n)^{4} / (N \cdot F_{rms}^{4})$
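The three features of Table 1 can be computed per vibration record as in the sketch below. The function names are hypothetical, the peak-to-peak value follows the definition printed in Table 1 (largest absolute sample), and the root mean square uses the standard definition with squared samples.

```python
import math


def peak_to_peak(x):
    """F_p as defined in Table 1: the largest absolute sample value."""
    return max(abs(v) for v in x)


def root_mean_square(x):
    """F_rms: square root of the mean of the squared samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def kurtosis(x):
    """F_k: mean fourth power divided by F_rms to the fourth power."""
    rms = root_mean_square(x)
    return (sum(v ** 4 for v in x) / len(x)) / rms ** 4
```

Applying these three functions to every recorded file yields one feature sample per file, which is how a vibration record of thousands of points is reduced to a single point on each degradation trend.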

Since the normalized mean square error (NMSE) is independent of the length of the time series, it is well suited for evaluating the performance of a time series prediction approach. NMSE is given by $NMSE = \frac{\sum_{i = 1}^{M} (y_{i} - y_{d})^{2}}{\sum_{i = 1}^{M} (y_{i} - {\bar{y}}_{i})^{2}}$ (16) where y_i and y_d are the real and the forecasted values, ${\bar{y}}_{i}$ is the mean of y_i, given by ${\bar{y}}_{i} = \frac{1}{M} \sum_{i = 1}^{M} y_{i}$ , and M is the number of forecasted samples.
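Equation (16) translates directly into code. The sketch below is a plain-Python illustration with a hypothetical function name `nmse`; it pairs each real value with the forecast at the same index.

```python
def nmse(actual, forecast):
    """Normalized mean square error of Equation (16)."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    numerator = sum((a - f) ** 2 for a, f in zip(actual, forecast))
    denominator = sum((a - mean_actual) ** 2 for a in actual)
    if denominator == 0:
        raise ValueError("actual series is constant: NMSE is undefined")
    return numerator / denominator
```

A perfect forecast gives 0, and a forecast that always returns the mean of the real values gives exactly 1, so an NMSE above 1 indicates a prediction that is worse than simply predicting the mean.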

3.2 Case I

3.2.1 Introduction to the experimental system and vibration data

The performance of the three methods was evaluated using rolling element bearing run-to-failure tests [24]. The bearing test platform and its sketch are both shown in Fig. 1. In this accelerated bearing life test, power was transmitted from an AC motor to a shaft by a belt. The shaft was supported by four double-row bearings of type Rexnord ZA-2115 [25]. A radial load of 26648.16 N was applied to bearings No. 2 and No. 3 by a spring mechanism, and high-sensitivity accelerometers were installed on the bearing housings. The rotating speed of the shaft was set to 2000 RPM. The bearings ran for more than 100 million revolutions, which far exceeded their designed lifetime. The vibration data from the housing of bearing No. 1 was recorded at intervals of ten minutes. In total, 984 data files were collected over the bearing life cycle at a sampling frequency of 20 kHz, and each file contains 20480 sample points. Since the three features in Table 1 can be computed for each file, 984 samples of each time-domain feature are used in this work.

Fig.1

The bearing test-rig of the IMS.

Fig.2

The bearing vibration signal and its time-domain features.

It can be seen in Fig. 2 that the first 700 sample points are relatively stable, after which the magnitude of the feature samples fluctuates in a nonstationary way. Therefore, for Case I, 800 samples in each data set are used for training (training is only needed by the competitor methods), while the following 180 samples are used for testing.

3.2.2 RUL prediction

Figure 3 compares the forecasted and actual trends of the three feature sets and shows their NMSEs for different prediction steps. Fig. 3(a), (c) and (e) clearly show that the proposed method not only predicts the failure trend well but also captures the nonlinear and nonstationary details. The prediction step length is initially set to 1 and is increased by one for each of the following 30 predictions. The NMSE results are illustrated in Fig. 3(b), (d) and (f), where the NMSE gradually increases as the step length grows.
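The text does not state how predictions with a step length greater than one are produced. One common option, shown here purely as an assumption, is to apply a one-step-ahead predictor recursively and feed each prediction back into the window. The names `forecast_recursive` and `predict_one_step` are hypothetical; `predict_one_step` stands for any function that maps a window (most recent sample first) to the next value.

```python
def forecast_recursive(history, L, steps, predict_one_step):
    """Forecast `steps` values by feeding each prediction back as input."""
    if L < 1 or L > len(history):
        raise ValueError("window length L must satisfy 1 <= L <= len(history)")
    extended = list(history)
    forecasts = []
    for _ in range(steps):
        # Most recent sample first, as in Equation (4).
        window = [extended[-k] for k in range(1, L + 1)]
        value = predict_one_step(window)
        forecasts.append(value)
        extended.append(value)
    return forecasts
```

With this scheme, errors made in early steps propagate into later windows, which would be consistent with an NMSE that grows with the step length.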

Fig.3

Results of bearing trend prediction and their corresponding NMSEs using three different features.

3.2.3 Discussion

A series of experiments has been carried out to further evaluate the proposed method using different training and testing strategies. In all data sets, feature values are recorded every 10 minutes. Figures 4–6 compare the actual data with the predicted results for the three feature sets mentioned above. Four different time scales, namely 10 min, 1 h, 3 h and 5 h, are used as the prediction step in these experiments. As shown in these figures, the results of the proposed WCSC method are much better than those of SNNE and PSO-SVR. More specifically, the enlarged local parts in Fig. 4 clearly show that the proposed method has a distinct advantage during bearing fault deterioration. The enlarged views marked in Figs. 5 and 6 also show that WCSC maintains stable predictive performance even in the fault transition phase.
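Because one feature value is available every 10 minutes, the four time scales can be expressed as numbers of samples ahead. The conversion below is a small sketch under the assumption that a time scale simply corresponds to that many 10-minute samples; the names are hypothetical.

```python
SAMPLE_INTERVAL_MIN = 10  # one feature value every 10 minutes (Case I)


def minutes_to_steps(minutes, interval=SAMPLE_INTERVAL_MIN):
    """Number of samples ahead that corresponds to a horizon in minutes."""
    if minutes <= 0 or minutes % interval != 0:
        raise ValueError("horizon must be a positive multiple of the interval")
    return minutes // interval


prediction_scales = {"10-min": 10, "1-h": 60, "3-h": 180, "5-h": 300}
steps = {name: minutes_to_steps(m) for name, m in prediction_scales.items()}
```

Under this assumption the four scales correspond to 1, 6, 18 and 30 samples ahead, and the 33-h, 67-h, 100-h and 133-h training lengths reported in the tables would correspond to roughly 200, 400, 600 and 800 samples.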

Fig.4

Predicted results of P2P data using three methods and four different time scales.

Fig.5

Predicted results of kurtosis data using three methods and four different time scales.

Fig.6

Results of RMS data using three methods for four different time scale predictions.

Tables 2–4 summarize the NMSE results for the P2P, RMS and kurtosis feature sets, respectively. WCSC outperforms the other two methods (SNNE and PSO-SVR) in fault prediction, except on the P2P feature set with 33-h and 67-h training data, where SNNE gives a lower NMSE. The reason is that the training data is not sufficient for WCSC in these cases. For the RMS and kurtosis feature sets, the results of WCSC are always better than those of SNNE and PSO-SVR. Overall, the SC technique shows clear advantages over traditional machine learning methods such as PSO-SVR.

3.3 Case II

3.3.1 Introduction to the experimental system and vibration data

Table 2

NMSE of three methods with four different time scales on P2P feature set

Time	Method	Training step
		33-h	67-h	100-h	133-h
10-min	PSO-SVR	0.2943	0.3345	0.3270	0.2911
	SNNE	0.2526	0.2261	0.2401	0.2599
	WCSC	0.2765	0.2387	0.2235	0.1972
1-h	PSO-SVR	0.2976	0.3326	0.3308	0.3145
	SNNE	0.2555	0.2287	0.2429	0.2629
	WCSC	0.2797	0.2414	0.2261	0.2004
3-h	PSO-SVR	0.3040	0.3344	0.2998	0.3222
	SNNE	0.2605	0.2327	0.2474	0.2678
	WCSC	0.2854	0.2461	0.2303	0.2031
5-h	PSO-SVR	0.3095	0.3026	0.3430	0.3024
	SNNE	0.2633	0.2350	0.2500	0.2707
	WCSC	0.2885	0.2487	0.2327	0.2023

Table 3

NMSE of three methods with four different time scales on RMS feature set

Time	Method	Training step
		33-h	67-h	100-h	133-h
10-min	PSO-SVR	0.1959	0.1991	0.1997	0.2155
	SNNE	0.2224	0.1986	0.2094	0.2044
	WCSC	0.1892	0.1881	0.1802	0.1813
1-h	PSO-SVR	0.2235	0.2037	0.2043	0.2256
	SNNE	0.2278	0.2034	0.2148	0.2093
	WCSC	0.1937	0.1926	0.1846	0.1858
3-h	PSO-SVR	0.2099	0.2133	0.2137	0.2153
	SNNE	0.2391	0.2135	0.2254	0.2196
	WCSC	0.2033	0.2021	0.1936	0.1947
5-h	PSO-SVR	0.2144	0.2178	0.2192	0.2179
	SNNE	0.2443	0.2080	0.2300	0.2241
	WCSC	0.2076	0.2064	0.1978	0.1989

Table 4

NMSE of three methods with four different time scales on kurtosis feature set

Time	Method	Training step
		33-h	67-h	100-h	133-h
10-min	PSO-SVR	0.7082	0.7093	0.7087	0.7089
	SNNE	0.5969	0.9451	0.5682	0.7057
	WCSC	0.5824	0.5570	0.5337	0.6382
1-h	PSO-SVR	0.7122	0.7124	0.7124	0.7129
	SNNE	0.6007	0.9512	0.5718	0.7103
	WCSC	0.5860	0.5605	0.5371	0.6423
3-h	PSO-SVR	0.7197	0.7196	0.7197	0.7186
	SNNE	0.6067	0.9610	0.5774	0.7174
	WCSC	0.5919	0.5661	0.5422	0.6487
5-h	PSO-SVR	0.7213	0.7200	0.7201	0.7189
	SNNE	0.6063	0.9604	0.5768	0.7174
	WCSC	0.5915	0.5650	0.5411	0.6479

Fig.7

Overview of the experimental system.

The experimental data was provided by the IEEE Reliability Society and the Franche-Comté Electronics Mechanics Thermal Science and Optics-Science and Technologies (FEMTO-ST) Institute [4]. The experimental system is illustrated in Fig. 7. The tested bearings are of type NSK 6804RS. To conduct accelerated degradation testing, a horizontal force equal to the bearings' maximum dynamic load of 4000 N was applied to the tested bearings. The rotating speed was set to 1800 r/min. The load and the speed were accurately controlled by the pressure regulator and the speed controller of the motor, respectively, and both were kept constant during the experiments. Therefore, the influence of their variations on the RUL prediction can be ignored in this paper [26]. The sampling frequency was 25.6 kHz, and 2560 samples were recorded every 10 seconds. To avoid damage to the test-rig, the end of life was declared when the amplitude of the acquired vibration signal exceeded 20 g.
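The stopping criterion described above can be sketched as a simple scan over the recorded files. This is an illustration only: the function name `end_of_life_index` is hypothetical, each record is assumed to be a sequence of acceleration samples in g, and "amplitude" is taken to mean the largest absolute sample in a record.

```python
EOL_THRESHOLD_G = 20.0  # stopping criterion stated in the text


def end_of_life_index(records, threshold=EOL_THRESHOLD_G):
    """Index of the first record whose peak absolute amplitude exceeds
    the threshold, or None if no record does."""
    for index, record in enumerate(records):
        if max(abs(v) for v in record) > threshold:
            return index
    return None
```

Since one record is stored every 10 seconds, multiplying the returned index by 10 gives the approximate end-of-life time in seconds from the start of the test.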

Fig.8

The temporal vibration signal of data Bearing1_7 (Top) followed by three features.

Fig.9

The temporal vibration signal of data Bearing3_3 (Top) followed by three features.

Bearings were tested 7, 7 and 3 times under the three operating conditions, respectively. The recorded data are accordingly named Bearing1_1 to Bearing1_7, Bearing2_1 to Bearing2_7 and Bearing3_1 to Bearing3_3 [27]. The first two data sets in every group were regarded as the training set and the others were used as the testing sets. Figures 8 and 9 show the original vibration signals and the three feature sets of Bearing1_7 and Bearing3_3. It can be seen that the amplitude of the vibration signals rises as the bearing performance degrades.

Table 5

Description of all the datasets

Data	Training sampling	Test sampling
	(10 s)	(10 s)
Bearing1_3	1801	573
Bearing1_4	1138	339
Bearing1_5	2301	161
Bearing1_6	2301	146
Bearing1_7	1501	757
Bearing2_3	1201	753
Bearing2_4	611	139
Bearing2_5	2001	309
Bearing2_6	571	129
Bearing2_7	171	58
Bearing3_3	351	82

Fig.10

Results of kurtosis feature of the data Bearing1_7 using three methods.

Fig.11

Results of P2P feature of the data Bearing1_7 using three methods.

Fig.12

Results of RMS feature of the data Bearing1_7 using three methods.

3.3.2 RUL prediction

The training and test data used in this experiment are listed in Table 5. Figures 10–12 show the actual and predicted trends of the three feature sets for Bearing1_7, and the RUL prediction results of Bearing1_7 with the three features are given in Fig. 13. Similarly, Figs. 14–16 show the trends of the three feature sets for Bearing3_3, and the RUL prediction results of Bearing3_3 with the three features are given in Fig. 17. It can be seen that the proposed method not only predicts the failure trend well but also has good robustness. Bearing RUL can be predicted well using all three features; however, the prediction results using the RMS feature are much better than those achieved using the P2P and kurtosis features. Figure 18 shows the fault trend prediction and the RUL results for Bearing2_4.
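The text does not spell out how the RUL is derived from a predicted trend. One simple possibility, given here only as an assumption, is to take the time until the predicted feature first reaches a failure threshold. The function name `remaining_useful_life` and the threshold value used in the example are hypothetical.

```python
RECORD_INTERVAL_S = 10  # one feature value every 10 seconds (Case II)


def remaining_useful_life(predicted_trend, failure_threshold,
                          interval_s=RECORD_INTERVAL_S):
    """Seconds until the predicted feature first reaches the threshold.

    predicted_trend[0] is the forecast one step ahead of the current time.
    Returns None if the threshold is not reached within the forecast.
    """
    for steps_ahead, value in enumerate(predicted_trend, start=1):
        if value >= failure_threshold:
            return steps_ahead * interval_s
    return None
```

For a predicted trend of 1.0, 1.5, 2.2, 3.1 and a hypothetical threshold of 2.0, the threshold is first reached three steps ahead, which corresponds to an RUL of 30 seconds.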

Fig.13

The RUL prediction results of three features of the data Bearing1_7 using WCSC.

Fig.14

The comparison of three models of Bearing3_3 on Kurtosis Data.

Fig.15

The comparison of three models of Bearing3_3 on P2P Data.

Fig.16

The comparison of three models of Bearing3_3 on RMS Data.

Fig.17

The RUL prediction results of the data Bearing3_3 using three features and WCSC.

Fig.18

Results of the predicted trend and RUL of data Bearing2_4 using three features and WCSC.

3.3.3 Discussion

A series of experiments has been carried out to further evaluate the proposed method using different bearing datasets. In all datasets, feature values are recorded at 10-second intervals. Figures 10–12 compare the actual data with the predicted results for the three features of the Bearing1_7 dataset. As shown in Fig. 10, the results of WCSC and PSO-SVR are both much better than those of SNNE, and the enlarged local part shows that the proposed method performs best. It can be seen in Figs. 11 and 12 that the results of the proposed method and SNNE are relatively similar and are both much better than those of PSO-SVR; WCSC is still superior to SNNE, as illustrated in the enlarged local view. The RUL prediction results of the three features are shown in Fig. 13. The predicted RUL almost coincides with the actual RUL, with RMS giving the best performance. Likewise, the RUL prediction result of Bearing3_3 with the RMS feature is better than that with P2P, as shown in Fig. 17. This implies that RMS is much more suitable for RUL prediction than kurtosis and P2P. The same observation holds for the RUL prediction result of Bearing2_4, as illustrated in Fig. 18.

Table 6

The comparison of NMSE results using three methods for all datasets

Feature	Method	Bearing1_3	Bearing1_4	Bearing1_5	Bearing1_6	Bearing1_7	Bearing2_3	Bearing2_4	Bearing2_5	Bearing2_6	Bearing2_7	Bearing3_3
Kurtosis	PSO-SVR	1.2038	1.3396	1.1054	1.1591	0.9177	6.3769	0.5226	0.3847	NAN	3.4824	0.2231
	SNNE	NAN	2.0160	1.1044	1.3238	0.9506	NAN	0.4456	0.3794	0.6651	1.1771	0.2621
	WCSC	2.6537	1.0679	1.0696	1.0641	0.6926	1.4017	0.3899	0.3834	0.6685	1.9282	0.1761
P2P	PSO-SVR	0.0861	1.2673	0.0271	0.0410	0.0331	0.6662	0.1599	0.5788	0.0469	0.4523	0.1043
	SNNE	0.0831	1.5036	0.0453	0.0235	0.0299	0.5698	0.1047	0.3828	0.0976	0.7653	0.1562
	WCSC	0.0331	0.0579	0.0344	0.0136	0.0161	0.5406	0.1660	0.3181	0.0526	0.5790	0.1148
RMS	PSO-SVR	5.3408	4.6558	0.0657	0.0391	0.0393	0.5646	0.2582	0.2077	0.1256	0.5017	0.0940
	SNNE	0.9499	1.2729	0.0680	0.0401	0.0371	2.6441	0.2399	0.2275	0.0978	1.3647	0.1077
	WCSC	0.8794	1.1522	0.0663	0.0289	0.0371	0.6548	0.1605	0.1846	0.0729	0.6203	0.0789

Moreover, Figs. 14–16 clearly show that the proposed WCSC method gives the best results in comparison with the PSO-SVR and SNNE methods. More specifically, the prediction of PSO-SVR in Fig. 15 performs better than that of WCSC in the stationary stages. However, WCSC has a distinct advantage once the fault becomes serious, as shown in the enlarged local part of Fig. 15. These results show that the proposed model performs well in RUL prediction of bearings.

To further verify the robustness of the model, all of the datasets listed in Table 5 were used for validation. The NMSE results for the three features are given in Table 6, where the proposed model has the lowest NMSE for the majority of datasets. For the kurtosis feature, for example, the proposed WCSC method obtains the best NMSE for seven bearing datasets, while PSO-SVR achieves the best result only for Bearing1_3 and SNNE gets the best results for three datasets (Bearing2_5, Bearing2_6 and Bearing2_7). For the P2P feature, WCSC gives the best NMSE for six bearing datasets, while PSO-SVR and SNNE obtain the best results for only some of the remaining datasets. As for the RMS feature, the proposed approach achieves good results for almost all bearing datasets. In general, most of the NMSE results obtained with the proposed method across the three time-domain feature sets are the best in comparison with the PSO-SVR and SNNE techniques. Thus, the proposed sparse coding based prediction method exhibits better robustness than PSO-SVR and SNNE for bearing fault prediction. These experiments further verify that the WCSC technique has clear advantages over traditional machine learning methods. Hence, the proposed WCSC method is more suitable for bearing fault trend prediction.

4 Conclusions

An RUL prediction method for rolling element bearings of rotating machines is proposed based on the WCSC technique and the sparse linear AR model. The main advantage of the proposed approach is that it does not require any training, which makes it well suited for practical applications. The fault trend of a rolling bearing is indicated via three time-domain metrics, namely RMS, P2P and kurtosis. Data collected from two different bearing run-to-failure test-rigs are used to evaluate its performance. The results show that, with one-step-ahead prediction, the proposed method is more effective than the SNNE and PSO-SVR methods for the three time-domain features. The results from the two cases also demonstrate the good robustness of the proposed method. Moreover, the remaining useful life (RUL) of a bearing can be further computed from the predicted fault trend.

It should also be noted that, owing to the WCSC algorithm, the proposed method can be adjusted to different operating conditions.

Acknowledgments

The financial sponsorship from the National Natural Science Foundation of China (51475098 and 61463010) and the Guangxi Natural Science Foundation (2016GXNSFFA380008) is gratefully acknowledged. This work was also sponsored by the Guangxi Key Laboratory of Manufacturing System & Advanced Manufacturing Technology (1514030001Z, 1638012004Z) and the Innovation Project of Guangxi Graduate Education (YCSW2017136).

References

1. Wei, Wang, et al., A novel intelligent method for bearing fault diagnosis based on affinity propagation clustering and adaptive feature selection, Knowledge-Based Systems 116 (2017), 1–12.

2. Y.X. Wang, J.W. Xiang, Markert, et al., Spectral kurtosis for fault detection, diagnosis and prognostics of rotating machines: A review with applications, Mechanical Systems and Signal Processing 66–67 (2016), 679–698.

3. Lee, F.J. Wu, W.Y. Zhao, Ghaffari, L.X. Liao and Siegel, Prognostics and health management design for rotary machinery systems: Reviews, methodology and applications, Mechanical Systems and Signal Processing 42 (2014), 314–334.

4. Lei, Li and Lin, A new method based on stochastic process models for machine remaining useful life prediction, IEEE Transactions on Instrumentation and Measurement 65 (2016), 2671–2684.

5. M.S. Kan, A.C.C. Tan and Mathew, A review on prognostic techniques for non-stationary and non-linear rotating systems, Mechanical Systems and Signal Processing 62–63 (2015), 1–20.

6. J.G. De Gooijer and R.J. Hyndman, 25 years of time series forecasting, International Journal of Forecasting 22 (2006), 443–473.

7. Khashei and Bijari, Fuzzy artificial neural network (p, d, q) model for incomplete financial time series forecasting, Journal of Intelligent & Fuzzy Systems 26 (2014), 831–845.

8. …, Yin, Cai and Zheng, Fault diagnosis for rotating machinery based on Local Mean Decomposition morphology filtering and Least Square Support Vector Machine, Journal of Intelligent & Fuzzy Systems 32 (2017), 2061–2070.

9. Song, Niu, Qiu, Xiao and Ma, Improved short-term load forecasting based on EEMD, Guassian disturbance firefly algorithm and support vector machine, Journal of Intelligent & Fuzzy Systems 31 (2016), 1709–1719.

10. Kavousi-Fard, Niknam and Golmaryami, Short term load forecasting of distribution systems by a new hybrid modified FA-backpropagation method, Journal of Intelligent & Fuzzy Systems 26 (2014), 517–522.

11. P.W. Tse and D.P. Atherton, Prediction of machine deterioration using vibration based fault trends and recurrent neural networks, Journal of Vibration 121 (1999), 355–362.

12. Soualhi, Medjaher and Zerhouni, Bearing health monitoring based on Hilbert–Huang transform, support vector machine, and regression, IEEE Transactions on Instrumentation and Measurement 64 (2015), 52–62.

13. Huang, Xi, et al., Residual life predictions for ball bearings based on self-organizing map and back propagation neural network methods, Mechanical Systems and Signal Processing 21 (2007), 193–207.

14. Yang, Wright, T.S. Huang and Ma, Image super-resolution via sparse representation, IEEE Transactions on Image Processing 19 (2010), 2861–2873.

15. B.A. Olshausen and D.J. Field, Emergence of simple-cell receptive field properties by learning a sparse code for natural images, Nature 381 (1996), 607–609.

16. M.W. Fakhr, Sparse locally linear and neighbor embedding for nonlinear time series prediction, International Conference on Computer Engineering & Systems (2015), 371–377.

17. M.W. Fakhr, Online nonstationary time series prediction using sparse coding with dictionary update, International Conference on Information and Communication Technology Research (2015), 112–115.

18. Gordo, Perronnin, Gong and Lazebnik, Asymmetric distances for binary embeddings, IEEE Transactions on Pattern Analysis and Machine Intelligence 36 (2011), 33–47.

19. Becker, Bobin and E.J. Candes, NESTA: A fast and accurate first-order method for sparse recovery, SIAM J Imaging Sciences 4 (2009), 1–39.

20. S.T. Roweis and L.K. Saul, Nonlinear dimensionality reduction by locally linear embedding, Science 290 (2000), 2323–2326.

21. Elhamifar and Vidal, Sparse subspace clustering: Algorithm, theory, and applications, IEEE Transactions on Pattern Analysis and Machine Intelligence 35 (2013), 2765–2781.

22. Giacobello, M.G. Christensen, M.N. Murthi, et al., Sparse linear prediction and its applications to speech processing, IEEE Transactions on Audio, Speech, and Language 20 (2012), 1644–1657.

23. C.X. Guo, H.J. Yang, G.C. Wang and C.Y. Liang, A novel LS-SVMs hyper-parameter selection based on particle swarm optimization, Neurocomputing 71 (2008), 3211–3215.

24. Qiu, Lee and Lin, Wavelet filter-based weak signature detection method and its application on roller bearing prognostics, Journal of Sound and Vibration 289 (2006), 1066–1090.

25. Lin and Dou, A novel method for condition monitoring of rotating machinery based on statistical linguistic analysis and weighted similarity measures, Journal of Sound and Vibration 390 (2017), 272–228.

26. Guo, Li, Jia, Lei and Lin, A recurrent neural network based health indicator for remaining useful life prediction of bearings, Neurocomputing 240 (2017), 98–109.

27. Lei, Li, Gontarz, Lin, Radkowski and Dybala, A model-based method for remaining useful life prediction of machinery, IEEE Transactions on Reliability 65 (2016), 1314–1326.
