Track vibration sequence anomaly detection algorithm based on LSTM

Abstract

Subway structure monitoring systems acquire structural data in real time, and the resulting subway track vibration sequences exhibit clear time series characteristics. Anomaly detection on these sequences is therefore difficult not only because anomalous data are scarce and diverse, but also because the volume of sample data is large. In this paper, an anomaly detection method based on long short-term memory (LSTM) was applied to detect anomalous subway track vibration sequences. First, the subway track vibration signals were preprocessed, and an approach to extract subway track vibration sequences was proposed. In this approach, the mean square error was used to determine whether the collected data contained train running data, and track vibration sequences were then extracted from the original data with an adaptive threshold to obtain subway track vibration samples. Savitzky-Golay filtering was then performed to smooth the extracted sequences, and the wavelet transform was applied to denoise the signal. Second, an anomaly detection algorithm based on the LSTM was employed to detect anomalous subway track vibration sequences. Finally, the LSTM algorithm was compared with three other methods, namely the local outlier factor (LOF), the one-class support vector machine (OCSVM) and the BP neural network (BPNN). On the subway track vibration dataset with a small anomaly proportion, the LSTM algorithm performed better than the other three methods. With a large proportion of anomalies in the signal, its detection performance was close to that of the BPNN and superior to that of the LOF and OCSVM. The results indicated that the LSTM-based sequence anomaly detection algorithm attained a satisfactory detection effect for subway track vibration sequences. The anomaly detection algorithm can be applied to subway structure monitoring systems to monitor subway track vibration signals in real time and determine whether these signals are anomalous, helping to ensure the safe operation of subway structures.

Keywords

subway track vibration sequence; anomaly detection; deep learning; LSTM; data processing

Introduction

Due to the complexity of subway lines, problems such as tunnel wall leakage, ground collapse and pipe rupture may occur during construction. In addition, problems such as excessive water, human intrusion and loose track screws may occur during subway operation. If these problems are not detected in time, major safety accidents and economic losses become more likely. To prevent safety accidents, it is necessary to determine whether there are safety risks in subway operation and whether they can be effectively mitigated. It is thus of considerable practical significance to monitor the subway structure closely and to detect anomalies that indicate safety risks.

Subway structure monitoring systems observe subways at all times and in all areas, with a large number of sensors and a high demodulation frequency, which results in massive amounts of data. The subway track vibration sequence data generated by the sensors therefore exhibit time series characteristics, high dimensionality and large volume, and anomaly detection of subway track vibration sequences is a time series anomaly detection problem. The difficulty lies not only in the scarcity and diversity of anomalous data, which are common to anomaly detection, but also in the large amount of sample data. Traditional anomaly detection methods, such as classification-, statistics- and machine-learning-based methods, essentially ignore the temporal order of the data and require additional feature engineering. In addition, the detection efficiency and classification accuracy of these methods on large-scale datasets are low. It is therefore extremely difficult to apply these traditional anomaly detection methods to subway track vibration sequences.

In contrast to traditional machine learning algorithms, deep learning models usually contain multiple hidden layers. During training, feature recognition and selection are performed automatically, which removes the need for additional hand-crafted features, avoids errors in manual feature selection and improves the recognition and classification accuracy of the model. The recurrent neural network (RNN) (Medesker and Jain, 1999) is a commonly adopted deep learning model. It is usually employed to process sequence data, since it exploits the correlations within a sequence and mines its hidden features. Although the RNN is well suited to sequence data, it has a notable limitation: it cannot resolve the problem of long-term dependence. To overcome this difficulty, the long short-term memory (LSTM) network (Hochreiter and Schmidhuber, 1997) was proposed as an improvement on the RNN. The LSTM addresses the vanishing gradient problem of the RNN by adding a cell state to the hidden layer and using a dedicated gate structure to control what is added to and removed from the cell state. Through this gate structure, the LSTM captures the long-term dependence of sequences better than the RNN does and is well suited to sequences of unknown length.

The LSTM is widely applied in a variety of domains, including text recognition, time series forecasting, natural language processing, computer vision, and image and video captioning (Van Houdt et al., 2020). The LSTM has also been applied in various structural health monitoring (SHM) studies. Li et al. (2021) combined field monitoring from a structural health monitoring system with the LSTM network to propose a data-driven approach for modelling bridge buffeting response in the time domain. Wang et al. (2022) proposed a method for condition assessment and timely warning of suspension bridges, in which the LSTM was used to detect damage states by tracking feature changes in time-series deflection and temperature data. Sharma and Sen (2022) adopted a two-step deep learning approach powered by LSTM networks to develop a real-time damage detection and localization method that is robust to ambient temperature changes. Based on the variability of modal frequency in long-term SHM of a steel plate girder bridge, Jiang et al. (2022) proposed an anomaly detection method that combines the residuals of one-step-ahead LSTM predictions with the Mann-Whitney U test. Dutta and Nath (2022) trained an LSTM network to predict the strain of different components of a railway bridge, and the LSTM remained effective in training and prediction despite large amounts of noise in the experimental data. Sony et al. (2022) proposed a windowed LSTM network method for damage detection and localization of civil structures, and demonstrated that a simple LSTM architecture was capable of classifying time-series signals into multiclass and multidamage levels with high accuracy.

The above literature indicates that the LSTM has achieved good results in structural health monitoring. Given the high efficiency, strong processing ability and good generalization ability of the LSTM network on nonlinear time series data, the LSTM can be employed for anomaly detection of subway track vibration sequences. In this study, the LSTM network was used as the hidden layer of the network. With a processed normal dataset of subway track vibration sequences as the model input, the reconstruction error of the model was reduced with the backpropagation through time algorithm, and the model parameters were optimized through repeated iterations over the dataset. In this way, an LSTM model that fits the normal data was constructed and used to determine whether subway track vibration sequence signals were anomalous.

Preprocessing of subway track vibration signals

The vibration signal samples considered in this section were obtained from the fibre Bragg grating (FBG) subway structure monitoring system developed for a section of Line 7 of the Wuhan subway. There were two optical cables in the tunnel, of which one was the tunnel wall line along the tunnel wall and the other was the track bed line along the railway track. These two parallel cables spanned three stations with a total length of 2635 m and carried a total of 527 measurement points deployed at 5-m intervals. To acquire real track vibration signals, the FBG vibration sensors operated in high-frequency acquisition mode at a sampling frequency of 1000 Hz. For the subsequent analysis, 2 h of data recorded between 16:00 and 18:00 on December 18, 2018 were taken from the original data generated by the subway structure monitoring system. Each bin file in the data sample contained one minute of monitoring data for all grating measurement points.

Before subway structure measurement data are used, they must be analysed and preprocessed to reduce their adverse influence on the later construction of a model based on vibration sequences. Constructing vibration signal samples requires normal train running data, but not all the collected data were train running data. The collected data also included data recorded with no train passing and anomalous data resulting from external construction, external interference and sensor damage. Common statistical features of all measurement points within 1 minute of train passage, namely the mean, median, maximum, standard deviation (SD) and mean square error, were compared, as shown in Figure 1. The mean square error of the measurement points during train passage was much larger than that of the measurement points not subject to train passage. By contrast, the median and maximum values changed consistently, and the mean value and SD did not differ significantly, which made it difficult to determine from these features whether a train had passed. Therefore, the mean square error was used to distinguish subway track vibration signals with and without train passage in the original data samples. Since the intensity of the vibration signal under train passage is much higher than that of other vibration signals, the subway track vibration sequence at a given measurement point can be extracted from the original data by setting an adaptive threshold.

Figure 1.

Comparison of different statistical features.

Through the aforementioned analysis, a method was proposed to extract subway track vibration sequences. The method consisted of the following two parts:

1. A measurement point was selected for vibration sequence extraction. Based on the mean square error of that measurement point in each data collection file, it was determined whether the file contained train running data for the measurement point. If so, the train passing time and the current file name were recorded, with the final result shown in Figure 2. According to the figure, trains passed approximately every 5 min, which is consistent with the timetable provided by the subway company.

2. The source file from which the data were collected was located by the file name recorded together with the train passing time. Then, using the adaptive threshold, the running vibration signal of the measurement point was retrieved from the source file. Finally, the subway track vibration sequence was extracted, as shown in Figure 3.
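
The two steps above can be sketched roughly as follows. This is a minimal illustration, not the algorithm of Appendix 1: the mean square error is assumed to be the mean squared deviation of a one-minute record from its own mean, and the function names, the limit `mse_limit` and the adaptive threshold rule (a multiple `k` of the median absolute amplitude) are hypothetical choices made for the example.

```python
import numpy as np

def mean_square_error(x):
    """Mean squared deviation of a record from its own mean (assumed definition)."""
    x = np.asarray(x, dtype=float)
    return float(np.mean((x - x.mean()) ** 2))

def has_train_passage(x, mse_limit):
    """Step 1: flag a one-minute record as containing train running data."""
    return mean_square_error(x) > mse_limit

def extract_vibration_sequence(x, k=8.0):
    """Step 2: cut out the span whose amplitude exceeds an adaptive threshold.

    The threshold is k times the median absolute amplitude of the record.
    This rule is illustrative only; the source does not state its rule here.
    """
    x = np.asarray(x, dtype=float)
    amplitude = np.abs(x - np.median(x))
    threshold = k * np.median(amplitude)
    active = np.flatnonzero(amplitude > threshold)
    if active.size == 0:
        return x[:0]  # no train passage found in this record
    return x[active[0]:active[-1] + 1]

# Synthetic one-minute records at 1000 Hz: background noise, one with a 10 s burst.
rng = np.random.default_rng(0)
quiet = rng.normal(0.0, 0.01, 60_000)
record = rng.normal(0.0, 0.01, 60_000)
record[20_000:30_000] += np.sin(2 * np.pi * 50 * np.arange(10_000) / 1000.0)

record_flag = has_train_passage(record, mse_limit=0.01)
quiet_flag = has_train_passage(quiet, mse_limit=0.01)
sequence = extract_vibration_sequence(record)
```

In this synthetic case only the record with the burst is flagged, and the extracted span covers the burst. Real records would need a limit and a threshold rule calibrated on the monitoring data.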

Figure 2.

Collected data files of train passing time.

Figure 3.

Extracted vibration sequence of the subway track.

Moreover, to improve the efficiency of subway track vibration sequence extraction, the subway track vibration sequence extraction algorithm was written as a multithreaded program. Through the distributed platform, the train track vibration sequences of all measurement points were extracted in parallel. The subway track vibration sequence extraction algorithm is presented in Appendix 1.

To eliminate the impact of interference signals on the subsequent results, Savitzky-Golay (S-G) filtering (Savitzky and Golay, 1964) was adopted to smooth the subway track vibration signals and improve the signal-to-noise ratio. The S-G algorithm has two hyperparameters, the size M of the moving window and the order d of the fitted polynomial. Different hyperparameter values were compared experimentally to determine the window size and polynomial order that gave the best smoothing of the subway track vibration signal. The filtered results are shown in Figure 4. After S-G filtering, the original subway track vibration signal became significantly smoother, while the signal shape and width barely changed. The basic information in the signal was retained, and the smoothing effect was evident.
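
As an illustration of how the two hyperparameters enter the filter, the following sketch implements S-G smoothing as a least-squares polynomial fit in a moving window. The function name `savgol_smooth`, the reflective edge handling and the example values (window size 31, order 3) are assumptions for the example. The values of M and d selected in the experiments are not stated in this section.

```python
import numpy as np

def savgol_smooth(x, window_size, poly_order):
    """Savitzky-Golay smoothing: least-squares polynomial fit in a moving window."""
    if window_size % 2 == 0 or poly_order >= window_size:
        raise ValueError("window_size must be odd and larger than poly_order")
    half = window_size // 2
    offsets = np.arange(-half, half + 1)
    # Row 0 of the pseudo-inverse of the Vandermonde matrix holds the weights
    # that evaluate the fitted polynomial at the window centre.
    design = np.vander(offsets, poly_order + 1, increasing=True)
    weights = np.linalg.pinv(design)[0]
    # Mirror the signal at both ends so the output keeps the input length.
    padded = np.pad(np.asarray(x, dtype=float), half, mode="reflect")
    return np.convolve(padded, weights[::-1], mode="valid")

# Noisy 5 Hz sine sampled at 1000 Hz.
t = np.arange(1000) / 1000.0
clean = np.sin(2 * np.pi * 5 * t)
rng = np.random.default_rng(0)
noisy = clean + rng.normal(0.0, 0.2, t.size)
smoothed = savgol_smooth(noisy, window_size=31, poly_order=3)
```

A larger window or a lower order smooths more strongly but flattens peaks, which is why the two values are usually compared experimentally, as described above.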

Figure 4.

Smoothed vibration sequence of the subway track.

Smoothing filters out the high-frequency components of the signal while retaining the low-frequency components. When S-G filtering was applied to the subway track vibration signal, the low-frequency component of the signal was fitted. Noise in the high-frequency component was therefore filtered out, whereas noise in the low-frequency component was retained. However, the noise in the subway track vibration signal was spread across the whole frequency range and was present in both the low-frequency and high-frequency components. Therefore, noise reduction was required in addition to smoothing. Herein, the wavelet transform (Donoho and Johnstone, 1994; Donoho, 1995; Grossmann et al., 2009) was employed to denoise the subway track vibration signal, with the wavelet basis function and threshold function chosen through experiments. After processing, the interference of noise with the signal was suppressed, and the original shape of the signal was retained so that the signal characteristics remained clear.
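
Wavelet threshold denoising can be sketched as follows. The wavelet basis and threshold function used in the study were chosen experimentally and are not named here, so this example substitutes the Haar basis, soft thresholding and the universal threshold of Donoho and Johnstone as stand-ins. All function names are hypothetical.

```python
import numpy as np

def haar_forward(x, levels):
    """Multi-level orthonormal Haar transform of a 1-D signal."""
    approx = np.asarray(x, dtype=float)
    if approx.size % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    details = []  # details[0] is the finest level
    for _ in range(levels):
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))
        approx = (even + odd) / np.sqrt(2.0)
    return approx, details

def haar_inverse(approx, details):
    """Invert haar_forward, from the coarsest level back to the finest."""
    for detail in reversed(details):
        out = np.empty(2 * approx.size)
        out[0::2] = (approx + detail) / np.sqrt(2.0)
        out[1::2] = (approx - detail) / np.sqrt(2.0)
        approx = out
    return approx

def soft_threshold(coeffs, limit):
    """Shrink coefficients towards zero by limit; small ones become zero."""
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - limit, 0.0)

def wavelet_denoise(x, levels=4):
    """Threshold the detail coefficients and reconstruct the signal."""
    approx, details = haar_forward(x, levels)
    # Noise level estimated from the finest details (median absolute deviation).
    sigma = np.median(np.abs(details[0])) / 0.6745
    limit = sigma * np.sqrt(2.0 * np.log(len(x)))  # universal threshold
    details = [soft_threshold(d, limit) for d in details]
    return haar_inverse(approx, details)

# Slow sine with additive noise, 1024 samples.
n = np.arange(1024)
clean = np.sin(2 * np.pi * 2 * n / 1024)
rng = np.random.default_rng(0)
noisy = clean + rng.normal(0.0, 0.3, n.size)
denoised = wavelet_denoise(noisy, levels=4)
```

Unlike smoothing alone, thresholding acts on every decomposition level, so noise in the lower-frequency detail bands is reduced as well.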

After smoothing processing and wavelet threshold denoising, the ultimate subway track vibration signal was produced, as shown in Figure 5. With the basic signal characteristics maintained, not only was the signal smooth, but the low-frequency components were also effectively denoised.

Figure 5.

Effect of the processed subway track vibration sequence.

Anomaly detection algorithm based on LSTM

Anomaly detection algorithm framework

An anomaly sequence detection algorithm based on the LSTM was proposed, which only required model training on normal sequences. Then, the anomaly score was estimated via the reconstruction error between the model input sequence and output sequence. Finally, it was determined whether the input sequence could be categorized as an anomaly sequence through comparison to the anomaly threshold value. Figure 6 shows the process of anomaly detection with the algorithm.

1. To reduce the impact of noise on the model performance, the collected subway track vibration signal dataset was preprocessed in two steps, namely smoothing and wavelet denoising. In addition, all data were normalized to avoid errors caused by differences in scale between feature values.

2. The basic LSTM model was constructed, with the processed normal dataset as model input. Then, the backpropagation through time (BPTT) (Werbos, 1990) algorithm was applied to reduce the reconstruction error of the model. The model parameters were optimized through continuous iterations of the dataset, based on which an LSTM model suitable for normal data fitting was established.

3. Each test sequence in the test set was input into the trained LSTM model to obtain a reconstructed sequence. The anomaly score of the input sequence was then computed from the reconstruction error between the input and output sequences of the model. Finally, the score was compared with the preset threshold to determine whether the original sequence was anomalous.

Figure 6.

Frame diagram of sequence anomaly detection algorithm based on LSTM.

Data preparation

In the LSTM training phase, the model was trained on normal data to learn the pattern of the time series, and the model parameters were optimized according to the reconstruction error. The dataset was split into four subsets, namely, a training set N with normal samples, a verification set V_N with normal samples, a verification set V_A with normal and anomalous samples, and a test set T with normal and anomalous samples. Training set N was used to train the model. Verification set V_N was used for early stopping, to prevent the model from overfitting the training data. Verification set V_A was used to determine the anomaly score threshold. Test set T was used to assess the performance of the model. After the dataset was split, the data were preprocessed: the subway track vibration signal was smoothed and denoised as described in the previous section, and all datasets were normalized. Because the dataset contained outliers and the maximum and minimum sample values could not be determined, z-score normalization was adopted.
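
A minimal sketch of z-score normalization followed by cutting the signal into fixed-length input sequences is given below. Two points are assumptions rather than details from the text: the mean and standard deviation are estimated on the training data and reused for the other subsets, and the windows do not overlap. The helper names are hypothetical, and the window length of 8 simply mirrors the initial sequence length in Table 1.

```python
import numpy as np

def fit_zscore(train):
    """Return the mean and standard deviation of the training samples."""
    train = np.asarray(train, dtype=float)
    mean, std = float(train.mean()), float(train.std())
    if std == 0.0:
        raise ValueError("training data are constant; z-score is undefined")
    return mean, std

def apply_zscore(x, mean, std):
    """Map samples to zero mean and unit variance using the given statistics."""
    return (np.asarray(x, dtype=float) - mean) / std

def make_windows(signal, length):
    """Cut a 1-D signal into consecutive, non-overlapping windows."""
    signal = np.asarray(signal, dtype=float)
    count = signal.size // length
    return signal[:count * length].reshape(count, length)

rng = np.random.default_rng(1)
train_signal = rng.normal(3.0, 2.0, 4000)
test_signal = rng.normal(3.0, 2.0, 1000)

mean, std = fit_zscore(train_signal)
train_windows = make_windows(apply_zscore(train_signal, mean, std), length=8)
test_windows = make_windows(apply_zscore(test_signal, mean, std), length=8)
```

Reusing the training statistics keeps the verification and test sets on the same scale as the data the model was fitted to.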

Model training

After the whole dataset was standardized, the LSTM model architecture (Hochreiter and Schmidhuber, 1997; Gers et al., 2000) was adopted and trained on the training set N. The Adam algorithm was applied as the optimization algorithm during model training (Kingma and Ba, 2015; Wu et al., 2020). The initial architecture of the model and the basic parameter settings are shown in Table 1.

Table 1.

Initial LSTM model settings.

Network architecture	Sequence length	Batch size	Epochs	Adam optimizer
Input: {1}; Dropout: 0.2; Hidden: {64}; Dropout: 0.2; Output: {1}; Linear activation	8	128	1	Learning rate: 0.05; Decay: 0.99

The mapping relationship between the input data x_t originating from the input layer and the current state h_t of the hidden layer is expressed as Equation (1):

h_{t} = \mathrm{linear}(x_{t}, h_{t-1})

(1)

where linear denotes the activation function, x_t denotes the input of the input layer at the current time, and h_{t-1} denotes the value of the hidden layer at the previous moment.

The conversion phase between each neuron node in the hidden layer can be expressed as Equation (2):

h_{1,1} = 0, \quad L = 64

h_{t,i} = \mathrm{linear}(h_{t,i-1}, h_{t-1}), \quad i = 2, \dots, L

(2)

where h_{1,1} denotes the first node of the hidden layer, and L denotes the number of hidden layer nodes. The mapping process from the hidden layer to the output layer can be expressed as Equation (3):

o_{t} = \mathrm{linear}(h_{t,L}, h_{t-1})

(3)

The reconstruction error of the input and output sequences of the model is treated as the loss function of the model, as expressed in Equation (4):

\mathrm{loss}(x^{(i)}) = \sum_{t = 1}^{T} ‖ x_{t}^{(i)} - o_{t}^{(i)} ‖

(4)
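
As a worked instance of Equation (4), the sketch below sums the per-step norm of the difference between an input sequence and its reconstruction. The norm is assumed to be Euclidean, which reduces to the absolute value for the one-dimensional input of Table 1. The function name is hypothetical.

```python
import numpy as np

def reconstruction_loss(x, o):
    """Equation (4): sum over time steps t of the norm of x_t - o_t.

    x and o have shape (T,) for scalar samples or (T, m) for m features.
    """
    x = np.asarray(x, dtype=float)
    o = np.asarray(o, dtype=float)
    if x.shape != o.shape:
        raise ValueError("input and reconstruction must have the same shape")
    if x.ndim == 1:
        x, o = x[:, None], o[:, None]
    return float(np.linalg.norm(x - o, axis=1).sum())

x = np.array([1.0, 2.0, 3.0])   # input sequence, T = 3
o = np.array([1.0, 1.0, 5.0])   # reconstructed sequence
loss = reconstruction_loss(x, o)  # |0| + |1| + |-2| = 3.0
```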

Anomaly detection

Since the LSTM anomaly detection algorithm relies on a model trained on normal sequences, the reconstruction error of a normal sequence is relatively small. The characteristics of anomalous sequences, however, differ markedly from those of normal sequences, so their reconstruction error is relatively large, which makes it possible to distinguish normal sequences from anomalous ones. In this section, the subway track vibration sequences were divided into a normal training set and a validation set containing artificially introduced anomalies for comparative experiments. First, the training set was input into the previously constructed LSTM model, and the reconstruction error at each point between the output and input sequences was calculated. Then, the validation set was input into the LSTM model, and the reconstruction errors of the normal and anomalous data at each sequence point were calculated. Finally, the reconstruction errors of the training and validation sets were compared in line charts, as shown in Figure 7.

Figure 7.

Contrast diagram of reconstruction error between training set data and validation set data.

Figure 7(a) compares the reconstruction error of the training set data with that of the normal data in the validation set. The difference between the two is mostly small, and the two curves follow largely the same trend. The overall model performance is therefore relatively satisfactory, and the reconstruction error obtained from the model is considered acceptable. Figure 7(b) compares the reconstruction error of the training set data with that of the anomalous data in the validation set. Here the two curves differ markedly: the reconstruction error of the anomalous data is much larger than that of the training set data, and the trends of the two curves are starkly different. These results suggest that the model fits normal data well but fits anomalous data poorly. The difference in reconstruction error between normal and anomalous sequences can therefore be used for anomaly detection.

Different normal sequences yield different reconstruction errors, and different anomalous sequences likewise lead to distinct reconstruction errors. Consequently, a reconstruction error-based anomaly detection method was proposed, which consists of two steps. In the first step, maximum likelihood estimation (MLE) was adopted to estimate the anomaly score of a sequence, assuming that the reconstruction error obtained when the input sequence was passed through the LSTM model followed a Gaussian distribution over time. In the second step, the anomaly scores were used to determine the anomaly threshold, which was estimated from a large number of samples and then used to identify abnormal vibration sequences.

1. Anomaly score via maximum likelihood estimation

After the input sequence was reconstructed with the LSTM model, the reconstruction errors between the reconstructed and original sequences were assumed to follow a Gaussian distribution (Bock and Aitkin, 1981), denoted as e ∼ N(μ, Σ). The probability that a point in the sequence is a normal data point is expressed as Equation (5):

p_{i} = \frac{1}{{(2 π)}^{m / 2} {| Σ |}^{1 / 2}} \exp (- \frac{1}{2} {(e_{i} - μ)}^{T} Σ^{- 1} (e_{i} - μ))

(5)

e_{i} = ‖ x_{t}^{(i)} - o_{t}^{(i)} ‖

(6)

where e_i, defined in Equation (6), is the reconstruction error between the input sequence and the output sequence of the model at this point. MLE was then applied to estimate the parameters μ and Σ from the input dataset. Taking the logarithm of both sides of Equation (5) and summing over all points gives Equation (7):

H (μ, Σ) = \sum_{e_{i} \in V_{A}} \ln (p (e_{i} | μ, Σ))

(7)

Setting the partial derivatives of H(μ, Σ) with respect to μ and Σ to zero gives Equation (8):

\frac{\partial H (μ, Σ)}{\partial μ} = 0, \quad \frac{\partial H (μ, Σ)}{\partial Σ} = 0

(8)

With the above equations solved, the values of μ and Σ can be obtained with Equations (9) and (10), respectively:

μ = \frac{1}{n} \sum_{i = 1}^{n} e_{i}

(9)

Σ = \frac{1}{n} \sum_{i = 1}^{n} (e_{i} - μ) {(e_{i} - μ)}^{T}

(10)

Let the anomaly score of a certain point in the sequence be expressed as Equation (11) (Malhotra et al., 2015; Dou et al., 2019). Then, the anomaly score a_i can be calculated according to the values of μ and Σ as obtained with Equations (9) and (10), respectively:

a_{i} = {(e_{i} - μ)}^{T} Σ^{- 1} (e_{i} - μ)

(11)

Adopting the average value of the anomaly scores of all data in the sequence as the anomaly score of the whole sequence data, Equation (12) can be obtained as follows:

a = \frac{1}{n} \sum_{i = 1}^{n} a_{i}

(12)
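
Equations (9) to (12) can be checked numerically with the following sketch. The function names are hypothetical, and the reconstruction errors are passed in as an array with one row per point. For scalar errors, pass a 1-D array.

```python
import numpy as np

def _as_rows(errors):
    """Arrange errors as an (n, m) array, one row per point."""
    e = np.asarray(errors, dtype=float)
    return e[:, None] if e.ndim == 1 else e

def fit_error_distribution(errors):
    """Equations (9) and (10): maximum likelihood mean and covariance."""
    e = _as_rows(errors)
    mu = e.mean(axis=0)
    centered = e - mu
    sigma = centered.T @ centered / e.shape[0]
    return mu, sigma

def point_scores(errors, mu, sigma):
    """Equation (11): a_i = (e_i - mu)^T Sigma^{-1} (e_i - mu) for each point."""
    centered = _as_rows(errors) - mu
    solved = np.linalg.solve(sigma, centered.T).T  # Sigma^{-1} (e_i - mu)
    return np.einsum("ij,ij->i", centered, solved)

def sequence_score(errors, mu, sigma):
    """Equation (12): mean of the point scores over the sequence."""
    return float(point_scores(errors, mu, sigma).mean())

reference_errors = np.array([1.0, 2.0, 3.0])  # errors used for fitting
mu, sigma = fit_error_distribution(reference_errors)
reference_score = sequence_score(reference_errors, mu, sigma)
shifted_score = sequence_score(np.array([10.0, 11.0, 12.0]), mu, sigma)
```

Here μ = 2 and Σ = 2/3, so the errors used for fitting score 1.0 on average, while the shifted errors score 122.5. A gap of this kind is what the threshold in the next step separates.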

2. Application of the anomaly threshold to evaluate anomalous data

Herein, the F_β score was adopted to estimate the anomaly threshold α on the verification set V_A. The F_β score is a commonly applied evaluation metric in machine learning (Malhotra et al., 2015; Goutte and Gaussier, 2005) and is calculated as Equation (13):

F_{β} = \frac{(1 + β^{2}) PR}{β^{2} P + R}

(13)

P and R are the precision and recall, respectively, and β controls the relative emphasis placed on precision and recall. The value of α that maximized the F_β score was chosen as the anomaly threshold, which is expressed as Equation (14):

α = \underset{α_{j} \in \mathbb{R}}{\arg\max} \, F_{β}

(14)

After the anomaly threshold was determined, it was compared with the anomaly scores of the sequences in the verification set V_A to evaluate the feasibility of the chosen threshold. For a > α, the input sequence was classified as anomalous. For a ≤ α, the sequence was deemed normal. Based on the above analysis, a sequence anomaly detection algorithm based on the LSTM is presented in Appendix 2.
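
The threshold search of Equations (13) and (14) can be sketched as below. Restricting the candidate thresholds α_j to the anomaly scores observed on V_A is an assumption made to keep the search finite. The function names and the toy scores are hypothetical.

```python
import numpy as np

def f_beta(labels, predictions, beta=1.0):
    """Equation (13) from boolean ground truth and boolean predictions."""
    labels = np.asarray(labels, dtype=bool)
    predictions = np.asarray(predictions, dtype=bool)
    tp = int(np.sum(predictions & labels))
    fp = int(np.sum(predictions & ~labels))
    fn = int(np.sum(~predictions & labels))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return (1 + beta ** 2) * precision * recall / (beta ** 2 * precision + recall)

def select_threshold(scores, labels, beta=1.0):
    """Equation (14): pick the candidate alpha with the largest F_beta.

    A sequence is classified as anomalous when its score a satisfies a > alpha.
    """
    scores = np.asarray(scores, dtype=float)
    best_alpha, best_f = None, -1.0
    for alpha in np.unique(scores):  # candidates in ascending order
        f = f_beta(labels, scores > alpha, beta)
        if f > best_f:
            best_alpha, best_f = float(alpha), f
    return best_alpha, best_f

# Toy anomaly scores for six sequences; True marks an anomalous sequence.
scores = np.array([0.2, 0.5, 0.9, 3.0, 4.1, 0.7])
labels = np.array([False, False, False, True, True, False])
alpha, best_f = select_threshold(scores, labels, beta=1.0)
```

With these toy scores the search returns α = 0.9, the largest normal score, which separates the two groups exactly. A value of β above 1 would weight recall more heavily than precision.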

Experimental results and analysis

Experimental dataset

Anomalous data are scarce and diverse. They may be abnormal vibration signals caused by knocking or by the impact of falling walls, foreign bodies or equipment in subway tunnels, or abnormal vibration signals caused by the sudden intrusion of foreign bodies or people. The dataset employed in this study was the subway track vibration dataset constructed in Section 2, which consisted of normal sequence data. Anomalies were therefore added artificially. First, knocking and human intrusion experiments were conducted separately at the site where the fibre optic cable was laid. Three kinds of anomalous data were collected, namely knocking, human intrusion, and simultaneous knocking and human intrusion. Then, the collected data were labelled, preprocessed and added to the subway track vibration dataset as anomalous data to obtain the simulation dataset. From the dataset containing 31602 samples, two-fifths of the normal sequence data were randomly selected as the training set N, one-fifth as the verification set V_N, one-fifth as the verification set V_A, and the remaining one-fifth as the test set T. Random samples drawn from the anomalous data were then added to the verification set V_A and the test set T in a given proportion. To assess the performance of the algorithm, the ratio of anomalous data ρ was set to eight different values: 0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35 and 0.4.
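
The split described above might be implemented along the following lines. The sketch assumes that ρ is the share of anomalous samples in the mixed set, so that a set with n normal samples receives ρn/(1 − ρ) anomalous ones. The text does not define ρ precisely, and the function name and the synthetic arrays are hypothetical.

```python
import numpy as np

def split_dataset(normal, anomalous, rho, seed=0):
    """Split normal samples 2:1:1:1 and mix anomalies into V_A and T."""
    rng = np.random.default_rng(seed)
    normal = rng.permutation(np.asarray(normal))  # shuffled copy
    anomalous = np.asarray(anomalous)
    fifth = len(normal) // 5
    train_n = normal[:2 * fifth]
    valid_n = normal[2 * fifth:3 * fifth]

    def mix(normal_part):
        # Number of anomalies so that they make up a fraction rho of the set.
        n_anom = int(round(rho * len(normal_part) / (1.0 - rho)))
        if n_anom > len(anomalous):
            raise ValueError("not enough anomalous samples for this ratio")
        picked = rng.choice(len(anomalous), size=n_anom, replace=False)
        samples = np.concatenate([normal_part, anomalous[picked]])
        labels = np.concatenate([np.zeros(len(normal_part), dtype=bool),
                                 np.ones(n_anom, dtype=bool)])
        return samples, labels

    valid_a = mix(normal[3 * fifth:4 * fifth])
    test = mix(normal[4 * fifth:])
    return train_n, valid_n, valid_a, test

# Synthetic stand-ins: 1000 normal and 500 anomalous sequences of length 8.
normal = np.zeros((1000, 8))
anomalous = np.ones((500, 8))
train_n, valid_n, (va_x, va_y), (t_x, t_y) = split_dataset(normal, anomalous, rho=0.2)
```

With ρ = 0.2, each mixed set in this example holds 200 normal and 50 anomalous sequences.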

Hyperparameter sensitivity tuning

To improve the detection performance of the model, key hyperparameters were selected for tuning, namely the sequence length, batch size and number of hidden layers. Since the Adam optimization algorithm adjusts the learning rate automatically during training, the initial learning rate was not specifically assessed. The F1 score and the area under the curve (AUC) were used as the evaluation metrics of the algorithm. The experiment was conducted on the subway track vibration test set with an anomaly ratio of 0.2. The hyperparameters of the LSTM network were first set to initial values, and the sensitivity of the network to each hyperparameter was then compared. Initially, the number of hidden layers of the LSTM network was set to 1, the sequence length to 8, the batch size to 128, and the learning rate to 0.05. Experiments were conducted on the subway track vibration dataset under different sequence lengths, batch sizes and numbers of hidden layers, and the experimental results are shown in Figure 8.
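
For reference, the two evaluation metrics can be computed as in the sketch below. The AUC is taken here to be the area under the ROC curve, computed as the fraction of anomalous/normal pairs that the anomaly score ranks correctly, with ties counted as one half. The function names and the toy values are hypothetical.

```python
import numpy as np

def f1_score(labels, predictions):
    """F1 = 2TP / (2TP + FP + FN) from boolean labels and predictions."""
    labels = np.asarray(labels, dtype=bool)
    predictions = np.asarray(predictions, dtype=bool)
    tp = int(np.sum(predictions & labels))
    fp = int(np.sum(predictions & ~labels))
    fn = int(np.sum(~predictions & labels))
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0

def auc_score(labels, scores):
    """Probability that a random anomalous sample scores above a normal one."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    diff = scores[labels][:, None] - scores[~labels][None, :]
    return float((np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size)

labels = np.array([False, False, True, True])
scores = np.array([0.1, 0.4, 0.35, 0.8])
auc = auc_score(labels, scores)        # 3 of 4 pairs ranked correctly: 0.75
f1 = f1_score(labels, scores > 0.3)    # TP = 2, FP = 1, FN = 0: 0.8
```

The F1 score depends on the chosen threshold, whereas the AUC summarizes the ranking quality over all thresholds, which is one reason to report both.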

Figure 8.

Experimental results of different hyperparameters.

Figure 8(a) shows the experimental effects of the different sequence lengths on the subway track vibration dataset. With increasing sequence length of the model input, the F1 score and AUC value of the subway track vibration dataset initially both increased significantly. However, when the sequence length reached 32, the F1 score declined. After the sequence length reached 40, the F1 score again increased. It is hypothesized that these changes could be attributed to model overfitting. The AUC value started to decline after the sequence length increased to 40. Considering these two indicators, the sequence length of the model was finally set to 32.

Figure 8(b) shows the experimental effects of the different batch sizes. The two curves highly corresponded to each other, indicating that the model performance significantly decreased with increasing batch size of the model. When the batch size was increased to a certain extent, the gradient descent direction of the model remained unchanged, which made it difficult to obtain better parameters through training and thus achieve model convergence. Therefore, the batch size was set to 128 in the subsequent experiments.

Figure 8(c) shows the experimental effects of different numbers of hidden layers. The F1 score and AUC value of the model peaked when the number of hidden layers reached 2. When the number of hidden layers became excessive, the model overfitted the data, which impaired its generalization ability. Therefore, the number of hidden layers of the model was set to 2.

The detection outcome of the algorithm was considerably improved after hyperparameter sensitivity tuning. Figure 9 shows the ultimate outcome of anomaly detection for subway track vibration sequences. The part with an anomaly score exceeding the threshold is highly consistent with the anomalous part of the input sequence, which suggests that the model can accurately distinguish a normal sequence from an anomaly sequence with the set threshold. In summary, the LSTM-based sequence anomaly detection algorithm proposed in this study performed well in abnormal sequence detection for subway track vibration datasets.

Figure 9.

Abnormal detection results of subway track vibration sequence.

Comparison of the experimental results

To demonstrate the satisfactory performance of the proposed algorithm in anomaly sequence detection, three traditional anomaly detection algorithms were chosen for comparison, including the local outlier factor (LOF) (Breunig et al., 2000), one-class support vector machine (OCSVM) (Yong et al., 2016) and BP neural network (BPNN) (Rumelhart et al., 1986). The hyperparameter settings of these three methods in the comparative experiment are listed in Table 2. With the use of these four algorithms, a comparison was performed on subway track vibration datasets with 8 anomaly proportions.

Table 2.

Hyperparameter settings of the comparison algorithms.

Comparison algorithm	Hyperparameter setting
LOF	MinPtsLB = 10, MinPtsUB = 50, step size = 1
OCSVM	Radial basis function (RBF) kernel, g = 0.2, v = 0.1
BPNN	Learning rate = 0.1, input layer nodes = 6, number of network layers = 3

The F1 scores are shown in Figure 10(a), which reveals that the detection performance of the LSTM became comparable to that of the BPNN model as the anomaly proportion increased. The figure also shows that the LSTM generally outperformed the OCSVM. When the anomaly proportion exceeded 30%, however, the LSTM performed worse than the BPNN because the samples tended to become balanced. The LOF algorithm consistently performed worst because it is an unsupervised method, and the absence of data labels hinders a better detection outcome. According to the AUC values shown in Figure 10(b), the LSTM algorithm performed well in anomaly detection when the anomaly proportion was relatively small. When the anomaly proportion was large, the difference between the LSTM and the other algorithms was small, although the LOF algorithm still produced the least satisfactory result.

Figure 10.

Experimental results of the different algorithms on the subway track vibration dataset: (a) F1 score; (b) AUC value.

Based on the two indicators, the F1 score and the AUC value, the LSTM algorithm performed better than the other three methods in anomaly detection on the subway track vibration dataset when the anomaly proportion was small. When the proportion of anomalies in the signal was large, the detection performance of the proposed algorithm was close to that of the BPNN and superior to that of the other two algorithms. Therefore, when abnormal data appear in a subway vibration sequence, the LSTM-based sequence anomaly detection algorithm can detect the anomaly promptly and provide an early warning before an accident occurs.
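For reference, the two indicators can be computed as in the following minimal sketch. It uses the standard definitions: the F1 score is the harmonic mean of precision and recall, and the AUC is the probability that a randomly chosen anomalous sample receives a higher score than a randomly chosen normal one. The labels, scores and the 0.5 decision threshold are made-up illustrative values, not data from the experiment, and the function names are hypothetical.

```python
def f1_score(y_true, y_pred):
    """F1 for binary labels where 1 marks an anomalous sequence."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def auc(y_true, scores):
    """AUC as the fraction of (anomaly, normal) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    # A tie between an anomaly and a normal sample counts as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up example: four normal sequences followed by two anomalous ones.
labels = [0, 0, 0, 0, 1, 1]
anomaly_scores = [0.1, 0.4, 0.2, 0.8, 0.7, 0.9]
predictions = [1 if s > 0.5 else 0 for s in anomaly_scores]

f1 = f1_score(labels, predictions)
area = auc(labels, anomaly_scores)
print(round(f1, 3), round(area, 3))  # 0.8 0.875
```

The F1 score depends on the chosen threshold, whereas the AUC depends only on how the scores rank the samples, which is why the two indicators can rank the algorithms differently.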

Conclusion

To improve the detection outcome of the considered anomaly detection algorithm, a series of preprocessing procedures was applied to the subway track vibration signal. First, according to the characteristics of the original data collected by the system, a subway track vibration sequence extraction method was proposed to construct vibration samples. Then, the S-G filtering method was applied to smooth the signal, while the wavelet threshold denoising method was adopted to reduce the noise level. After preprocessing the subway track vibration signal, an anomaly detection algorithm based on the LSTM was proposed. As the algorithm relied on normal data, the LSTM performed Gaussian modelling of the reconstruction error between the input and output sequences and thereafter estimated the anomaly score of the input sequence according to MLE. The model then compared this score with the preset anomaly threshold to determine whether the original sequence was anomalous. In addition, the model hyperparameters were optimized on the subway track vibration dataset, and finally, comparative experiments were conducted involving different anomaly detection algorithms. The experimental results suggest that the proposed model outperformed the other algorithms when the proportion of abnormal signals was less than 30%. When the proportion of abnormal signals exceeded 30%, the detection performance of the proposed algorithm was similar to that of the BPNN and better than that of the LOF and OCSVM. In conclusion, the sequence anomaly detection algorithm based on the LSTM produced a satisfactory anomaly detection outcome for the subway track vibration sequence. This anomaly detection algorithm can be integrated into fibre grating subway structure monitoring systems to detect anomalies in subway track vibration signals in real time. When anomalous vibration signals are generated due to abnormal impact or invasion of foreign objects in the subway tunnel, the system can raise an alert about the anomaly, thus reducing the risk of traffic accidents and helping to ensure the safe operation of subway structures.
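The scoring step summarized above can be illustrated with a minimal sketch under simplifying assumptions. Each sequence is reduced to a single scalar reconstruction error, a univariate Gaussian is fitted to the errors of normal sequences by maximum likelihood, and the anomaly score is taken to be the negative log-likelihood of a new error. The LSTM itself is omitted, and the error values, the threshold and the function names are hypothetical; the exact score definition used by the proposed model may differ.

```python
import math

def fit_gaussian(errors):
    """Maximum likelihood estimates of the mean and variance of the errors."""
    n = len(errors)
    mu = sum(errors) / n
    var = sum((e - mu) ** 2 for e in errors) / n  # MLE divides by n, not n - 1
    return mu, var

def anomaly_score(error, mu, var):
    """Negative log-likelihood of one reconstruction error under N(mu, var)."""
    return 0.5 * math.log(2 * math.pi * var) + (error - mu) ** 2 / (2 * var)

def is_anomalous(error, mu, var, threshold):
    """Flag a sequence whose anomaly score exceeds the preset threshold."""
    return anomaly_score(error, mu, var) > threshold

# Made-up reconstruction errors of sequences known to be normal.
normal_errors = [0.10, 0.12, 0.08, 0.11, 0.09]
mu, var = fit_gaussian(normal_errors)

threshold = 3.0  # hypothetical; in practice tuned on validation data
print(is_anomalous(0.10, mu, var, threshold))  # False: typical error
print(is_anomalous(0.30, mu, var, threshold))  # True: far from the fitted mean
```

Because the Gaussian is fitted to normal data only, the threshold is the single quantity that has to be chosen with the help of labelled anomalies.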

In future work, three aspects will be explored to further improve the anomaly detection method for subway track vibration sequences. First, for the determination of the anomaly threshold, ensemble learning methods can be adopted to obtain a more appropriate threshold and thereby improve the accuracy of the algorithm. Second, the detection performance of the algorithm needs to be improved for samples with a large proportion of abnormal data. The network structure can be optimized or combined with other neural networks, such as convolutional neural networks and autoencoders, to improve the performance of the model. Third, regarding the scope and boundary of the algorithm, experiments can be carried out on different datasets and application scenarios to verify its feasibility, so that the algorithm can be applied to more practical scenarios.

Footnotes

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

This project was supported by the Key Program of the National Natural Science Foundation of China (61735013) and the Youth Fund of the National Natural Science Foundation of China (61402345).

ORCID iD

Liu Jie

Appendix

References

Bock, Aitkin (1981) Marginal maximum likelihood estimation of item parameters: application of an EM algorithm. Psychometrika 46(4): 443–459.

Breunig, Kriegel, Sander (2000) LOF: identifying density-based local outliers. ACM SIGMOD Record 29(2): 93–104.

Donoho (1995) De-noising by soft-thresholding. IEEE Transactions on Information Theory 41(3): 613–627.

Donoho, Johnstone (1994) Ideal spatial adaptation by wavelet shrinkage. Biometrika 81(3): 425–455.

Dou, Zhang, Xiong (2019) Anomaly detection of process unit based on LSTM time series reconstruction. Chemical Engineering Journal 70(02): 61–66.

Dutta, Nath (2022) Learning via long short-term memory (LSTM) network for predicting strains in railway bridge members under train induced vibration. In: Kumar, Senatore, Gunjan (eds), ICDSMLA 2020. Singapore: Springer, Vol 783.

Gers, Schmidhuber, Cummins (2000) Learning to forget: continual prediction with LSTM. Neural Computation 12(10): 2451–2471.

Goutte, Gaussier (2005) A probabilistic interpretation of precision, recall and F-score, with implication for evaluation. In: Advances in Information Retrieval, 27th European Conference on IR Research, ECIR 2005, Santiago de Compostela, Spain, March 21–23, 2005, Proceedings. Berlin, Heidelberg: Springer.

Grossmann, Kronland-Martinet, Morlet (2009) The Wavelet Transform. Amsterdam, Paris: Atlantis Press, World Scientific.

Hochreiter, Schmidhuber (1997) Long short-term memory. Neural Computation 9(8): 1735–1780.

Jiang, Kim, Goi, et al. (2022) Data normalization and anomaly detection in a steel plate-girder bridge using LSTM. ASCE-ASME Journal of Risk and Uncertainty in Engineering Systems, Part A: Civil Engineering 8(1): 04021082.

Kingma (2015) Adam: a method for stochastic optimization. The 3rd International Conference for Learning Representations (ICLR). San Diego, USA: IEEE, 1–15.

Laima, et al. (2021) Data‐driven modeling of bridge buffeting in the time domain using long short‐term memory network based on structural health monitoring. Structural Control and Health Monitoring 28(8): e2772.

Malhotra, Vig, Shroff, et al. (2015) Long Short Term Memory Networks for Anomaly Detection in Time Series. 23rd European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, 2015, 89–94. https://www.researchgate.net/publication/304782562_Long_Short_Term_Memory_Networks_for_Anomaly_Detection_in_Time_Series?ev=auth_pub

Medesker, Jain (1999) Recurrent Neural Networks: Design and Applications. Boca Raton, FL: CRC Press.

Rumelhart, Hinton, Williams (1986) Learning representations by back propagating errors. Nature 323(6088): 533–536.

Savitzky, Golay MJE (1964) Smoothing and differentiation of data by simplified least squares procedures. Analytical Chemistry 36(8): 1627–1639.

Sharma, Sen (2022) Real-time structural damage assessment using LSTM networks: regression and classification approaches. Neural Computing and Applications 35: 557–572.

Sony, Gamage, Sadhu, et al. (2022) Vibration-based multiclass damage detection and localization using long short-term memory networks. Structures 35: 436–451.

Van Houdt, Mosquera, Nápoles (2020) A review on the long short-term memory model. Artificial Intelligence Review 53(8): 5929–5955.

Wang, Ansari, et al. (2022) LSTM approach for condition assessment of suspension bridges based on time-series deflection and temperature data. Advances in Structural Engineering 25(16): 3450–3463.

Werbos (1990) Backpropagation through time: what it does and how to do it. Proceedings of the IEEE 78(10): 1550–1560.

Liang, Pei, et al. (2020) Deep Learning: Foundations and Applications. Beijing: Beijing Institute of Technology Press.

Yong, Tao, Yuan, et al. (2016) Anomaly Detection of User Behavior for Database Security Audit Based on OCSVM. International Conference on Information Science & Control Engineering IEEE, 214–219. https://doi.org/10.1109/ICISCE.2016.5