Abstract
Based on phase space reconstruction (PSR) and hybrid VNS-SVR model, a remaining useful life (RUL) prediction method for aircraft engines is proposed. The proposed hybrid model combines support vector regression (SVR), which has been successfully adopted for regression problems, with the variable neighborhood search (VNS). First, the phase space reconstruction is used to transform the selected one-dimensional performance sequences of aircraft engines into matrix forms, which increases the data information and improve the learning efficiency of the model effectively. Then, SVR is used to construct the prediction model. Meanwhile, a VNS algorithm is proposed to optimize the kernel parameters. Finally, the hybrid model is used to RUL prediction of the aircraft engines. The experimental results show that the method has a good prediction performance.
Keywords
Introduction
The remaining useful life (RUL) refers to the operating time of the machine before maintenance or replacement, which is a key component in predictive and health management (PHM) system [1, 2]. As the core component of the aircraft, the realization of aircraft engines safe operation is very important [3–5]. At present, there is a relatively complete PHM system in aerospace field, which has a significant advantage in fault prospecting, automated management and decision making for maintenance. RUL prediction of aircraft engines is the core technology of the aero-engine PHM system. Predicting the RUL of aircraft engines effectively and accurately based on monitoring parameters is a prerequisite for rational maintenance planning and scientific management.
The aircraft engine is a complex system, the performance of which relates to the health status of each component, as well as the environment, load, and operation. With the rapid development of sensors and monitoring technology, airlines have accumulated more and more engines performance monitoring data during the use of aircraft engines and established a data warehouse to store these data, which facilitates the extraction of useful information for decision support. The major engine manufacturers in the world have successively developed corresponding engine health monitoring software, such as the SAGE system of General Motors, the EHM system of Pratt & Whitney, and the COMPASS system of Rolls-Royce. Figure 1 shows the diagram of main components for engine model [6]. Engine performance monitoring mainly includes the following three aspects:

Diagram of main components for engine model.
Gas circuit performance monitoring, including engine exhaust temperature, rotor speed, pressure, fuel flow, etc. Oil monitoring, including oil abrasive monitoring, oil temperature, and pressure monitoring, average oil consumption rate monitoring, and oil quality monitoring. Vibration monitoring, including vibration offset of high and low-pressure rotors, etc.
The commonly used RUL prediction methods include reliability-based methods, failure mechanism-based methods, and data driven-based methods. The aircraft engine is a highly reliable device, its operating conditions are complex and variable, and different individuals vary widely. In practice, it is hard to collect plenty of failure data. Therefore, traditional reliability-based RUL prediction methods proves to be ineffective. Furthermore, the aircraft engine is a type of complex and high-precision equipment, which makes it challenging to find a mechanism-based model to characterize the working state. Thus, failure mechanism-based RUL prediction methods are difficult to be implemented. The data-driven approach attempts to predict the device RUL directly from the online monitoring data. The establishment of the data-driven RUL prediction model only needs to collect enough performance data, which makes it become the most widely used modeling method currently. Gertsbakh and Kordonskiy [7] first used the performance degradation data to assess the reliability of equipment and gave a simple linear degradation model with random parameters of intercept and slope. From then on, a series of research on RUL prediction with data-driven methods began. Dong and He [8] built a prediction model for diagnostics and prognostics based on semi-Markov method. Peng et al. [9] proposed a RUL estimation approach by modifying the echo state network. Chen and Tsui [10] applied the degradation signals to predict the RUL. Moghaddass and Zuo [11] diagnosed the fault based on the multistate deterioration process. Other studies on data-driven RUL prediction include support vector machine (SVM) [12–16], Gaussian process regression [17], hidden Markov models (HMM) [18], and various neural networks (NN) [19–23], etc.
The degradation process data sequence of aircraft engines exhibit strong non-linear and non-stationary characteristics, hence, the RUL prediction for aircraft engine is a typical non-linear time series prediction problem. Some recent references on RUL prediction of aircraft engines are listed in Table 1. Machine learning methods have performed well in solving prediction problems in recent years. Zhang and Xu [24, 25] predicted the solubility and solidification cracking through LSBoost. Ignacio et al. [26] used machine learning & uncertainty modelling to predict the ransformer lifetime. For other studies on prediction using machine learning, see [27–29]. Table 2 shows the comparison of common machine learning methods. As shown in Table 2, compared with other models, support vector regression (SVR) is capable of handling non-linear and high-dimensional data and has high generalization ability. Consider the non-linearity and non-stationarity of the degradation process data sequence of aircraft engines, SVR is selected to solve the RUL prediction problem of aircraft engines. Based on the advantages of SVR, it has been widely used in many fields to get better accuracy. Pai and Hong [36] established a recurrent SVR model to predict the rainfall. Li et al. [37] built a manufacturing knowledge model based on SVR and the manufacturing experience. SVR model also was used to solve the traffic flow prediction problem [38] and the electric load prediction problem [39].
RUL prediction problems of aircraft engines in recent years
Comparison of common machine learning methods
The prediction accuracy of a SVR model is not only concerned with the input features, but also closely relates to the selection of model parameters. The traditional parameter selection methods are grid search method and empirical method. The former method is time-consuming, and the latter method is too subjective. In addition, the above two parameter selection methods only adjust the parameters one by one, and can not achieve the overall unified adjustment. Recently, more and more intelligent optimization algorithms have been applied to parameter selection of SVR model, such as GA [40, 41], PSO [42, 43].
Chaos is the most commonly used model for characterizing non-linear dynamic systems. In a chaos system, enough information can be extracted from current state variables to forecast the future state [44]. The system state is not only caused by external factors, but also relates to the internal motion process determined by certain rules of the system itself. The values of the state variables in a time series result from the complex interaction among the system elements [45]. In order to extract the characteristics of a chaos system, phase space reconstruction (PSR) is widely used to process the time series data. Baydaroğlu and Koçak [46] proposed a SVR-based evaporation prediction method combined with chaotic approach. Yang et al. [47] predicted the electric load based on PSR and an learning SVR. In this paper, PSR is employed to preprocess the aircraft engine time series data.
Predicting remaining useful life through the health condition can help airlines solve engine management problems and improve their management level. Due to the importance of the RUL prediction in aircraft engines, extensive research has been undertaken to obtain higher prediction accuracy, as shown in Table 1. Although the prediction accuracy has improved in recent years, it is not enough for policymakers. Based on previous studies and the analysis on RUL prediction, this paper aims to obtain a higher prediction accuracy.
The main work of this paper concerns the performance of the SVR coupling with an intelligent algorithm for parameter optimization of RUL prediction in aircraft engines. The innovations of this paper are as follows: 1) First, PSR is used to select the features adaptively. The aircraft engine is a power system, and the phase space reconstruction of the engine performance data can help to restore the real operation process of aircraft engines. It can increase the useful information and improve the learning efficiency of the model. 2) Second, the VNS algorithm is used to optimize the parameters of the SVR model. Different from other algorithm, the VNS designed in this paper can use a variety of combinational optimization modes to find the optimal scheme of parameters, which improves the speed of finding the optimal solution. 3) Finally, the model is performed on the degradation processes data sequence of aircraft engines. The results show that the proposed method can be successfully applied to the RUL prediction of aircraft engines.
The rest of the paper is organized as follows. In section 2, phase space reconstruction algorithm is introduced and the hybrid VNS-SVR model is designed for the RUL prediction of aircraft engines. In section 3, a real example of RUL prediction for aircraft engines is used to illustrate the proposed model. The conclusions are shown in section 4.
This section will introduce the necessary background knowledge and our proposed model. First, PSR and the selection methods of delay time and embedding dimension are elaborated. Then the proposed VNS-SVR model is described in detail. The flowchart of the proposed scheme is illustrated in Fig. 2. The detailed steps are as follows:

The flow chart of proposed method in this paper.
Step 1. Divide the sample data into training data and predicted data.
Step 2. Use C-C method to calculate the delay time τ and embedding dimension m for both training data and testing data.
Step 3. Reconstruct the phase space with the delay time τ and embedding dimension m.
Step 4. Apply Wolf method to calculate the maximum Lyapunov index λ.
Step 5. Train and generate a SVR model based on the reconstructed training data.
Step 6. Use the designed VNS algorithm to optimize the parameters C, ɛ and γ.
Step 7. Use the trained model to predict the RUL of the testing data and output the results.
Affected by the coupling of varieties of factors, the performance degradation time series of aircraft engines have typical chaotic characteristic. To improve the RUL prediction accuracy of aircraft engines, based on the chaos characteristics, this paper establishes chaotic systems of performance degradation time series.
For univariate chaotic time series x (t) , t = 1, 2, ⋯ , T, Packard and Takens [48] proposed the PSR based on delay-coordinate method,
Then the reconstructed phase space X is obtained by using the delay sequence X
t
, with delay time τ and embedding dimension m, as shown in Equation (2).
Next, we will discuss the phase space reconstruction of multivariate time series. Set the N-dimensional multivariate time series X1, X2, ⋯ , X
N
, where
A key step of the phase space reconstruction is to select appropriate values for m and τ. Theoretically, there is no limit to the choice of delay time τ for the time series without noise. However, in actual situation, due to the limitation of the length of time series and noise signal, the determination methods of τ and m are of great significance to the quality of the reconstruction.
The selection of delay time τ is crucial. If τ is too small, the trajectory of phase space will be in the same location, and then redundant errors will be generated. If τ is too large, the shape dynamics of the adjacent moments will change dramatically and the system signal will be distorted. For the selection of embedding dimension m, in general, as long as m is large enough, the singular attractors of chaotic systems can be appropriately described, and the memory motion law of chaotic systems can be revealed. However, if m is too large, it will increase the computation. According to Takens theorem [48], when m ⩾ 2D + 1, where D is the correlation dimension, the reconstructed phase space is equivalent to the original dynamic system. In this paper, C-C method [49] is applied to calculate the τ and m.
Chaos characteristic judgment
Only time series with chaotic characteristics can be predicted accurately by using relevant models of chaotic theory. Therefore, to predict time series using various models of chaos theory based on phase space reconstruction, we need to prove the chaotic characteristic of time series first. The maximum Lyapunov index λ shows the divergence of two adjacent orbital spaces and measures the sensitivity of a chaotic system to the initial value, which is an important basis for judging chaotic characteristics. Based on the definition of Lyapunov index, Wolf et al. [50] proposed Wolf method to extract Lyapunov index directly according to the evolution process of the phase space. In this paper, Wolf method is used to calculate the maximum Lyapunov index λ. If λ is positive, the system has initial sensitivity and presents a chaotic state.
Support vector regression
SVR is a regression prediction method. Suppose (x1, y1) , ⋯ , (x
n
, y
n
), x
i
∈ R
m
, y
i
∈ R, are the sample data. A linear function, namely SVR function, is given by
The problem of solving the optimal hyperplane can be treated as the following optimization problem.
To solve a quadratic problem with linear constraints, the Kalush-Kun-Tuck (KKT) optimal conditions are necessary and sufficient. Then the solution can be obtained by a combination of support vectors (s.v.),
Kernel selection is one of the key technologies to improve the ability of SVR. This paper uses the radial basis function as shown in the following equation:
The prediction accuracy and the generalization performance of the SVR model depends on the settings of the hyperparameters C, ɛ, and the kernel parameter γ. Next, we will introduce the VNS algorithm to optimize SVR parameters. Figure 3 shows the process of the combining SVR with VNS.

Flowchart of VNS-SVR model.
The performance of SVR depends greatly on the parameters C, ɛ and γ in the model. Therefore, how to select proper parameters according to the training set to ensure a good performance is a key step in designing SVR model. In this paper, a VNS algorithm is proposed to optimize the above three parameters. The pseudocode of VNS is displayed in Algorithm 1.
Initial solution
First, an initial solution needs to be generated. Obviously, the initial solution has a positive effect on the performance of the VNS algorithm. A good initial solution can not only improve the accuracy of the algorithm but also reduce the running time. In this paper, we determine the initial solution of the parameters C, ɛ and γ according to Cherkassky and Ma [51].
Selection of ɛ. For a training data set { (x1, y1) , ⋯ , (x
n
, y
n
) }, the parameter ɛ can be calculated by the following prescription.
Selection of C. The initial value of parameter C can be calculated as follows:
Selection of γ. RBF kernel function is used in this paper, and the parameter γ should be selected to reflect the input range of the training data. For multivariate d-dimensional problems, γ is set to make γ d ∈ (0.1, 0.5), which proves to yield a good SVR performance for various regression data sets.
The main purpose of the shaking procedure is to find a better neighborhood of the current solution. In the shaking phase, a new solution is randomly selected in the N k (s) structure of the current best solution s. The pseudo-code is given in Algorithm 2.
Neighborhood structures
Considering the characteristics of VNS, we design a six-digit integer coding (an integer between 0 and 106) in this paper. Given the range of parameters, the values of parameters can be obtained by multiplying the coding with the corresponding coefficients. In addition, VNS explores the solution space exhaustively by sampling increasingly larger neighborhoods. Considering the above two points, we set the neighborhood structure of VNS as follows:
Where {10a(i), 10b(i)} is the range of the kth element in the code. m (k) reflects the size of the kth neighborhood. A larger k indicates a bigger size. Pa,b (c) is a function to guarantee that the solution after neighborhood transformation can be mapped to the solution space [52].
As a matter of fact, the neighborhood structures N k , k = 1, ⋯ , k max , contain as many possible cases in the solution space as possible. The local search stage finds the optimal solution according to the fitness function. Finally, we obtain the optimal combination of parameters (C, ɛ, γ) by VNS algorithm, which contributes to improving the performance of SVR.
In this paper, the coefficient of determination R2 is used as the main criteria to measure the accuracy of algorithm. Let y
i
be the observed values and
The larger the R2, the better the prediction rusults.
In addition, the scoring function (S) [53] is considered as the criterion to measure the algorithm performance in this research. The score S can be calculated by:
The lower the score S is, the higher performance the prediction presents.
The dataset used in the following experimental study is from NASA’s C-MAPSS turbofan engine, as seen in Table 3. Every subset consists of 3 operating parameters and 21 sensor measurements at different cycling time of the engines. Each subset of multivariate time series reflects the life cycle of each engine from inception to failure.
The C-MAPSS dataset
The C-MAPSS dataset
As shown in Table 3, time series training set and time series test set represent the number of engines in the training sample and the test sample in each subset, respectively. There is only one operation condition for FD001 and FD003. FD002 and FD004 have 6 operation conditions, which makes it more difficult to predict than FD001 and FD003. FD001 and FD002 have one fault condition, and FD003 and FD 004 have two fault conditions. Maximum life span (Cycles) shows the maximum running time of engine in each subset. See [53, 54] for a deeper understanding of the dataset.
The time each engine begins to degrade is unknown and the life cycle of each engine varies greatly. In addition, the major degradation occurs in the second half of the life cycle. To show the engine performance degradation more accurately, inverse life is defined as follows:
Before carrying out the experiment, we find that the values of variables Temp_fan, Pressure_fan, Pressure_bypass, Engine_pressure_ratio, Burner_fuel-air_ratio, Demanded_corrected_fan_speed, and Demanded_fan_speed keep constant in the life cycle. Therefore, we choose another 14 variables as the input features for dataset.
To eliminate the influence of different dimensions on the numerical values, further normalization of data is required. The normalization formula is given by
Then, the phase space is reconstructed for FD001 dataset. The time series consisting of the mean value of selected sensor measurements in a certain cycle (which is represented as red lines in Fig. 4) is used to calculate the parameters τ and m of the reconstructed phase space. Table 4 shows the results of the C-C method and Wolf algorithm.

The mean value of selected sensor measurements for 100 engines.
From Table 4, the maximum Lyapunov index of each feature is greater than 0, which indicates that each time series is chaotic, allowing us to reconstruct the phase space.
The phase space reconstruction parameters and the maximum Lyapunov index of each feature
Before the experiment, the dataset is divided into two parts. The first 80 engines are training set and the last 20 engines are testing set. We conduct an experiment to evaluate the proposed algorithm in predicting RUL for aircraft engines.
The experiment is run on Intel Core i7-7700 8 GB, the Microsoft Windows 10 operating system, and the development environment of Python 3.6.6, PyCharm 2018.3, and 500 iterations are executed. The parameter settings are shown in Table 5.
List of preset parameters in proposed model
Allowable error ɛ, penalty parameter C and kernel width γ are the hyper-parameters that need to be optimized by VNS algorithm. The ranges of ɛ, C and γ are [10-3, 10-1], [10-3, 103] and [10-6, 101], respectively. According to above specified parameters, 500 iterations have been performed based on the training set and testing set, and the results are reported in Table 6.
Optimal hyper-parameters of the PSR-VNS-SVR model
From the Table 6, we can know that the PSR-VNS-SVR has a good effect on predicting the RUL of aircraft engines. Figure 5 visually shows the prediction performance of the method. From Fig. 6, we find that almost all R2 for these predictions and for each engine are above 80% (the 99th is 78.53%). Figure 7 describes the degree of fluctuation of errors between the observed and predicted values As shown in Fig. 8, through Pearson’s correlation coefficient, we can see a strong correlation is exist between the observed and predicted values. From the comprehensive analysis of Figs. 5, 6, 7, and 8, we can conclude that the proposed method has a good performance on predicting RUL of aircraft engines.

Comparison between the actual RUL and the predicted RUL.

Coefficient of determination (R2) for the RUL prediction for each turbine by the proposed method.

Boxplot representing the difference between the actual RUL and the predicted RUL.

The correlation between the actual RUL and the predicted RUL.
As shown in Fig. 7, almost half of the boxplots have significant outliers. This is because the prediction results on some engines are not very good, such as engines 81, 84, 86, 88, 92, 95, 96, 98. At the high point of engine RUL, the predicted results are very poor. Moreover, the closer the RUL gets to 0, the better the predicted results become. From Fig. 6, the predicted effect of engine 90 is better than that of engine 99. However, in Fig. 7, the predicted effect of engine 90 is worse than that of engine 99. This situation also occurs in previous research, refer to engine 11 and engine 24 of group1 in Nieto et al. [6]. The reason may be that the predicted value of engine 90 is better than that of engine 99 in some stages, whereas the predicted value of other stages is not accurate and the difference between the observed and predicted values is large. The predicted value of engine 99 is not accurate overall, but the difference between the observed and predicted values is very small.
Many scholars have reported studies on all sub-datasets C-MAPSS datasets using different methods. To demonstrate the superiority of the proposed method, four neural network models Convolutional Neural Networks (CNN), Multilayer Perceptron (MLP), Convolutional Neural Networks Gated Recurrent Units (CNN-GRU) and Convolutional and Recurrent Neural Networks (CNN-RNN) and five SVR related models are selected to compare with the proposed method. Table 7 recapitulates the results of the proposed method and the comparative methods on the RUL estimation problem. We find that the proposed hybrid model outperforms the selected comparative methods.
Score comparison with other methods on C-MAPSS dataset
Score comparison with other methods on C-MAPSS dataset
Table 7 shows the score (S) comparison with the other methods on C-MAPSS dataset. Score function represents that the penalty grows exponentially with increasing error [53]. For engine degradation, late predictions are penalized more severely than early ones. Therefore, the lower the score S is, the higher performance the prediction presents. As shown in Table 7, the proposed method in this paper indicates substantially improved S on all subsets compared with other methods. The result is mainly due to the following two reasons.
First, the aircraft engine is a power system, and the phase space reconstruction of the engine performance data can help to restore the real operation process of aircraft engines. Second, a VNS algorithm is designed to optimize the SVR parameters. The superiority of combination of SVR and VNS is reflected in the following three aspects: 1) The initial solution generated method in VNS can not only increase the speed of convergence to the optimal parameters, but also get better parameters under the same number of iterations. 2) The shaking procedure of VNS can find a better neighborhood of the current solution. 3) In local search stage, VNS explores the solution space exhaustively by sampling increasingly larger neighborhoods. 4) We can obtain the optimal combination of parameters (C, ɛ, γ) by VNS algorithm, which contributes to improving the performance of SVR.
Although the proposed model performs well among many models, it still has some limitations. VNS optimizes the parameters of SVR in each subset by training individuals on the training set and evaluating the scores on the testing set. The average running time for each subset is 87 hours. The proposed model trades time for to accuracy a large extent. More computational-efficient SVR and faster parameter tuning mechanism should be explored to reduce the running time. With the development of computer technology, the number of layers of neural networks that can handle is increasing, and the performance of deep learning methods has surpassed machine learning in many fields. To improve the performance of SVR, it is necessary to improve the objective function, constraint conditions and kernel function of the SVR model based on the problem itself. This is what we will consider in future study.
Considering the non-linear and non-stationary characteristics of degradation processes data sequence of aircraft engines, a RUL prediction method for aircraft engines is proposed based on PSR and hybrid VNS-SVR in this paper. VNS is used to optimize the parameters of the SVR model. Experiments show that the proposed model has a better prediction accuracy of RUL prediction of aircraft engines. The main innovations of this research work can be summarized as follows: To avoid insufficient information of the input for SVR, the features are adaptively selected by the phase space reconstruction, and the SVR predictor is trained by using the phase points as input features. It can increase the useful information and improve the learning efficiency of the model. VNS algorithm is used to select the parameters of SVR model. The automatic optimization of model parameters selection can improve the generalization ability and training speed of prediction model. Finally, the model is performed on the degradation processes data sequence of aircraft engines. The results show that the proposed method can be successfully applied to the RUL prediction of aircraft engines.
There are also some problems in this study. Although some features were removed before the experiment, the running time is still very long because of the phase space reconstruction of the data set. In the next study, parallel technology and methods will be utilized to improve the learning performance and reduce the computational cost.
Footnotes
Acknowledgments
This work is supported by The National Key Research and Development Program of China (2019YFB1705300), the Fundamental Research Funds for the Central Universities (Nos. JZ2020HGTB0035), the National Natural Science Foundation of China (Nos. 71801071, 71922009, 72071056, 71871080, 71601065, 71690235, 71501058, 71601060), and the Decision-making in the Manufacturing Process of Complex Product (111 project).
