A novel anomaly detection model for secure multipath QUIC communications by jointly using empirical mode decomposition and long short-term memory networks

Abstract

In the era of rapid development of modern internet technology, network transmission techniques are continuously iterating and updating. The Quick UDP Internet Connections (QUIC) protocol has emerged as a timely response to these advancements. Owing to the strong compatibility and high transmission speed of QUIC, its extended version, Multipath QUIC (MPQUIC), has gained popularity. MPQUIC can integrate various transmission scenarios, achieving parallel transmission with higher bandwidth. However, due to some security flaws in the protocol, MPQUIC is susceptible to attacks from anomalous network traffic. To address this issue, we propose an MPQUIC traffic anomaly detection model based on Empirical Mode Decomposition (EMD) and Long Short-Term Memory (LSTM) networks, which can decompose and denoise data and learn the long-term dependencies of the data. Simulation experiments are conducted by obtaining MPQUIC traffic data under normal and anomalous conditions for prediction, analysis, and evaluation. The results demonstrate that the proposed model exhibits satisfactory prediction performance when trained on both normal and anomalous traffic data, enabling anomaly detection. Moreover, the evaluation metrics indicate that the EMD-LSTM-based model achieves higher accuracy compared to various traditional single models.

Keywords

Multipath QUIC, network traffic, anomaly detection model, empirical mode decomposition, long short-term memory

1. Introduction

Amidst the rapid development of the internet era, a multitude of emerging network transmission technologies have been proposed. Given the increasing demand from network users, webpage loading time has become a crucial performance metric in the internet realm, garnering significant attention. To enhance webpage performance, Hypertext Transfer Protocol version 2 (HTTP/2) was introduced, which employs header compression and multiplexing to effectively improve webpage performance [1]. Subsequently, as HTTP/2 exhibits positive impacts, a Quick UDP Internet Connections (QUIC) protocol was proposed based on its foundation to further augment communication performance [2]. QUIC protocol combines features of TCP-like connections and HTTP/2 multiplexing, breaks certain limitations of traditional TCP $+$ TLS versions, incorporates TLS parameters in data requests, and optimizes the three-way handshake process to achieve 0-RTT data transmission [3].

In comparison to the TCP protocol, the QUIC protocol presents a more versatile overall design. Specifically, because TCP is implemented in the kernel, its evolution is slow. In contrast, the user-space QUIC protocol enables pluggable congestion control (implementing various congestion control algorithms at the application level) and easy compatibility with application-layer protocols (e.g., HTTP), resulting in faster iteration and higher transmission speeds, among other advantages [4]. These benefits give the QUIC protocol higher adaptability across different transmission scenarios [5]. Currently, numerous browsers, such as Google Chrome, have already deployed the QUIC protocol [6]. Although the QUIC protocol, which builds on HTTP/2-style multiplexing, effectively enhances network transmission speed, the limits of a single network interface still constrain bandwidth. Hence, single-device multiple-interface technologies, which combine interfaces such as Wi-Fi and LTE, are highly suitable for mitigating this constraint and provide a feasible means of integrating the QUIC protocol with multipath technologies [7]. Accompanying the introduction of HTTP/3 [8], the QUIC protocol has undergone standardization for multipath extensions, namely the Multipath QUIC (MPQUIC) protocol [9]. The deployment of Multipath TCP (MPTCP) in the industrial internet has already demonstrated the advantages of multipath transmission technologies [10], and MPQUIC likewise exhibits benefits in various application scenarios.

Figure 1.

Multi-scenario multi-path transmission of MPQUIC.

As depicted in Fig. 1, in comparison to the single-path QUIC protocol, MPQUIC can achieve higher-bandwidth parallel transmission by utilizing multiple distinct paths, such as Wi-Fi and 5G, and accommodating various transport scenarios. Furthermore, MPQUIC can effectively reduce end-to-end latency and implement suitable scheduling algorithms according to application layer requirements. As shown in Fig. 2, MPQUIC inherits the connection ID feature from QUIC, enabling connection migration. When the IP address changes due to path switching (e.g., transitioning from Wi-Fi to 5G) or the port number changes, a connection identified by IP address and port generally becomes unavailable; however, the connection ID allows a connection to be identified using only a 64-bit identifier, facilitating rapid reconnection [11]. In addition, the MPQUIC packet information includes a path ID, which allows the protocol to perform multi-stream control when the remote IP address changes, ensuring the information on the path remains unchanged. QUIC is a new-generation standard transport protocol on which HTTP/3 is built, while MPQUIC combines QUIC with multi-path technology, offering numerous advantages. Nonetheless, MPQUIC still has some shortcomings in terms of protocol security and is susceptible to attacks from abnormal network traffic.

In this study, considering the volatile nature of non-stationary traffic data in network environments, we selected the Long Short-Term Memory (LSTM) deep learning method as our primary analysis method and combined it with Empirical Mode Decomposition (EMD), a digital signal decomposition technique. We propose an EMD-LSTM-based MPQUIC traffic anomaly detection model. Based on network traffic data both with and without attacks, this model employs the EMD method to adaptively decompose noisy, non-stationary data into multiple intrinsic mode functions (IMFs) at various time scales. Subsequently, the LSTM extracts dynamic features from each component, analyzing their correlations and trends. Finally, the model sums the predictions for all IMF components to generate the final network traffic prediction data. Additionally, we employed the Fast Fourier Transform (FFT) to distinguish between high- and low-frequency IMF components, selecting all low-frequency IMF components for the reconstruction strategy. This step aims to eliminate noise interference, allowing the reconstructed values to better reflect data fluctuations, to differentiate more visibly and intuitively between normal and abnormal conditions, and to demonstrate the model’s feasibility for anomaly detection. Comparative results indicate that, on both normal and abnormal network traffic, our proposed model achieves more accurate predictive analysis than the Backpropagation (BP) model, Random Forest (RF) model, or standalone LSTM model, and is therefore more effective at detecting abnormal traffic.

Figure 2.

High-level architecture of MPQUIC.

The research purpose of this paper is to study the use of EMD and LSTM techniques to process complex MPQUIC network traffic data and attempt to verify the performance of the proposed EMD-LSTM model through experimental evaluation. Based on deep learning and signal decomposition technology, the research developed a model for MPQUIC network traffic anomaly detection and evaluated the feasibility of the model in predicting MPQUIC network traffic through experimental analysis. The ultimate goal is to enhance the safety of the MPQUIC protocol in the network. The main variables involved in this study include network data (i.e., MPQUIC traffic data, used to train and test the model), EMD-LSTM model (a deep learning model used for data processing and anomaly detection), and performance indicators (including MAPE and RMSE). At the same time, the parameters under study are divided into simulation experimental parameters and model parameters. The simulation experiment parameters include bandwidth, delay, and other parameters on the MPQUIC network path. Model parameters include the architecture, learning rate, number of training times, and other parameters of the LSTM model.

The assumptions of this study are based on the following prerequisites: First, we assume that MPQUIC traffic data are non-stationary on the time scale and need to be decomposed using EMD. Second, we assume that the LSTM model can effectively capture the features and long-term correlations in the data. Finally, we hypothesize that abnormal network traffic patterns are different from normal patterns and can be detected by the model.

In this study, the EMD process decomposes non-stationary data into IMFs, where each IMF represents a component at a different time scale. The LSTM model receives these IMFs as input, learns from them, and predicts the future value of each IMF. Finally, we evaluate the performance of the model using metrics such as Mean Absolute Percentage Error (MAPE) and Root Mean Square Error (RMSE) to judge its effectiveness. The entire research builds a complete system model to solve the problem of MPQUIC traffic data anomaly detection and prediction, aiming to improve the security of the MPQUIC protocol in a multi-path transmission environment.

2. Related work

In 2012, following Google’s proposal of the QUIC protocol, the Internet Engineering Task Force (IETF) initiated the standardization process for the QUIC protocol [12]. Inspired by the MultiPath TCP (MPTCP) protocol, Coninck et al. designed the MPQUIC protocol in 2017 [9]. To enhance and optimize the performance of MPQUIC, Xing et al. [13] developed a flow-aware scheduler for the packet scheduling process, which can distinguish priorities of individual flows, enabling efficient allocation of aggregated bandwidth and outperforming existing MPQUIC schedulers. Nguyen et al. [14] integrated reinforcement learning into the MPQUIC multipath scheduler, leveraging the self-learning capabilities of reinforcement learning for optimal path deployment and data transmission, resulting in a 10% performance improvement. Zhuang et al. [15] designed a novel congestion control algorithm suitable for addressing issues of sub-flow switching smoothness and computational complexity in multipath scenarios, achieving fair bandwidth allocation and high-performance data transmission capabilities through online learning techniques.

It can be observed that numerous studies have focused on the packet scheduling mechanism, optimal path selection, and congestion control aspects of MPQUIC, primarily aiming to optimize its data transmission performance. However, anomaly detection in network protocols has also been a critical research area, playing a significant role in maintaining network stability and facilitating network management and control. Research on MPQUIC in this aspect appears to be somewhat lacking.

Data generated in network systems exhibit temporal characteristics, effectively reflecting correlations and trends between data points on a temporal scale. Traditional time-series analysis mainly targets the time and frequency domains, employing methods such as averaging, calculating standard deviation, and using autocorrelation functions [16] in the time domain, while spectral analysis [17] and Fourier transforms are more commonly used in the frequency domain. However, these methods are only suitable for linear, stationary sequence data, and struggle to accurately analyze more complex, non-linear, and non-stationary data, yielding limited effectiveness.

In recent years, rapid advancements in artificial intelligence have led to an increasing number of deep learning-based data monitoring models being proposed [18], with widespread applications across numerous fields. Jafari et al. [19] sampled real-world ground data and, leveraging time-series characteristics, proposed a truncation-free physics-informed neural network (PINN) model utilizing backpropagation (BP) and Taylor approximation. Zhang et al. [20] focused on the analysis of the LSTM model, combined with the Sparrow Search Algorithm (SSA), and proposed the SSA-LSTM model, which optimizes LSTM’s node count and learning influence factors via the SSA algorithm, significantly enhancing the model’s air quality data prediction accuracy. Wang et al. [21] followed the multimodal approach, integrating convolutional neural networks (CNN) and LSTM to develop the CNN-LSTM model, which effectively predicts electricity demand by utilizing CNN to uncover hidden data features and LSTM to capture long-term data characteristics. Bi et al. [22] conducted in-depth research on attention mechanisms, incorporating the Savitzky-Golay (SG) filter into the model, and eventually utilizing bidirectional long short-term memory (BiLSTM) for water quality prediction, effectively applied to the prevention and control of sudden water quality deterioration events. These studies demonstrate that technologies such as LSTM, CNN, BP neural networks, and attention mechanisms in deep learning can automatically learn the correlation and trend features of time-series data.

In their comprehensive review of Long Short-Term Memory (LSTM), Yu et al. [23] discuss how LSTM can learn long-term features of continuous data and abstractly represent non-linear data. Nihale et al. [24] employed LSTM models for network traffic data prediction, with experimental results demonstrating the model’s robust predictive performance for traffic trend changes. Wang et al. [25] built upon the LSTM model by incorporating the autocorrelation coefficient, which enhances the model’s accuracy in network traffic data prediction and modeling. These studies indicate the feasibility of using LSTM for traffic prediction and the potential to integrate various methods into the model, ultimately improving its predictive performance.

Considering the adaptive characteristics of network traffic and the presence of substantial noise in network data, Huang et al.’s [26] Empirical Mode Decomposition (EMD) method can be applied to effectively denoise and reconstruct data, enabling clearer identification of data trends and directions. This is particularly useful for achieving greater accuracy in recognizing anomalies within the network. Consequently, this study proposes an EMD-LSTM model based on the EMD method and LSTM model, which can detect network anomalies, thereby providing timely alerts and control measures to enhance network security. Furthermore, this study applies the proposed model to traffic anomaly detection in the MPQUIC network environment, enriching research on the security and robustness of the MPQUIC protocol.

This paper contributes a novel approach to network traffic analysis, specifically in the context of MPQUIC. Although MPQUIC has demonstrated the potential to improve network performance, it is accompanied by security-related issues. We partially fill this research gap by introducing a methodology that integrates the EMD and LSTM approaches. EMD is used to preprocess non-stationary MPQUIC traffic data by decomposing it into IMFs, which facilitates a more detailed analysis. The LSTM model, well established for modeling temporal data, uses these IMFs to forecast upcoming traffic patterns. This cross-disciplinary integration of signal processing and deep learning offers a distinctive way to address the existing research gap in MPQUIC traffic analysis and prediction.

3. Basic theory and model analysis

3.1 The basic execution process of EMD

Originally, the Empirical Mode Decomposition (EMD) method was developed for processing signals, capable of decomposing complex non-stationary signals into a series of Intrinsic Mode Functions (IMFs), effectively handling the local time-domain features of the signal [27]. EMD is data-driven and adaptive, allowing for signal processing without the need for a priori basis functions, and has been widely applied across various domains [28].

Figure 3.

The implementation process of EMD.

The execution process of EMD consists of the following four steps, with the first three steps comprising the decomposition process, as illustrated in Fig. 3:

Preparing the original data: Obtain a set of non-stationary and non-linear signal data, $S(t)$ .

Determining the data envelope: Identify the local extrema points of the signal to be decomposed. The curve fitted through all the local maxima points (typically by cubic spline interpolation) forms the upper envelope, referred to as the upper limit $U(t)$ , and the curve fitted through all the local minima points forms the lower envelope, referred to as the lower limit $L(t)$ . In this step, the mean value of the two envelopes, $M(t)$ , is calculated, which is used for constructing a new IMF component.

$\displaystyle M\left(t\right)=\frac{U\left(t\right)+L\left(t\right)}{2}$ (1)

Decomposition: Subtract the mean value of the upper and lower limits from the original data to obtain the first IMF, represented as $I_{1}(t)=S(t)-M(t)$ . To obtain the second IMF, perform the second decomposition on $I_{1}(t)$ as the original data, obtaining $I_{2}(t)=I_{1}(t)-M_{1}(t)$ , where $M_{1}(t)$ is the envelope mean of $I_{1}(t)$ . During the iterative process, a filtering condition is set for the signal: the filtering continues while the absolute difference between the number of local extrema points $E$ and the number of zero-crossing points $Z$ is greater than 1, or while the sum of the upper and lower limits is not equal to 0, and it terminates once neither of these holds. Notably, each decomposition generates a residual term, and the execution process only terminates when the final residual term is a monotonic function. Assuming $k$ iterations are conducted, the final result is obtained as follows:

$\displaystyle I_{k}(t)=I_{k-1}(t)-M_{k-1}(t)$ (2)

Reconstruction: Add up all the obtained IMF components and the residual to yield the reconstructed data. In this step, given $C_{1}(t)=I_{1}(t)$ , $C_{2}(t)=I_{2}(t)$ , $\cdots$ , $C_{k}(t)=I_{k}(t)$ , since $S(t)$ is ultimately decomposed into $n$ IMF components $C_{i}(t)$ and the residual $r_{n}(t)$ , the reconstruction is performed as follows:

$\displaystyle S(t)=\sum_{i=1}^{n}C_{i}(t)+r_{n}(t)$ (3)

From the execution process of EMD, it can be observed that the iterative decomposition process is based on the continuous processing of changing data, rather than pre-determined data. This adaptive analysis for varying data enables EMD to perform adaptive time-frequency analysis on time-series data, such as network traffic.
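The four steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the experiments: it assumes piecewise-linear envelopes anchored at the signal endpoints (instead of spline envelopes), a simple tolerance on the envelope mean as the stopping test, and hypothetical helper names (`local_extrema`, `envelope`, `sift`, `emd`) and limits (`max_iter`, `tol`, `max_imfs`).

```python
import math

def local_extrema(x):
    """Return index lists of interior local maxima and minima of x."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] >= x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] <= x[i + 1]:
            minima.append(i)
    return maxima, minima

def envelope(x, idx):
    """Piecewise-linear envelope through x at idx (assumption: endpoints are knots)."""
    knots = sorted(set([0] + idx + [len(x) - 1]))
    env = [0.0] * len(x)
    for a, b in zip(knots, knots[1:]):
        for i in range(a, b + 1):
            w = (i - a) / (b - a)
            env[i] = (1 - w) * x[a] + w * x[b]
    return env

def sift(x, max_iter=50, tol=1e-3):
    """Repeat I_k = I_{k-1} - M_{k-1} until the envelope mean is close to zero."""
    h = list(x)
    for _ in range(max_iter):
        maxima, minima = local_extrema(h)
        if len(maxima) + len(minima) < 2:
            break
        upper = envelope(h, maxima)                          # U(t)
        lower = envelope(h, minima)                          # L(t)
        mean = [(u + l) / 2 for u, l in zip(upper, lower)]   # Eq. (1)
        h = [a - m for a, m in zip(h, mean)]                 # Eq. (2)
        if max(abs(m) for m in mean) < tol:
            break
    return h

def emd(signal, max_imfs=10):
    """Decompose signal into a list of IMFs plus a residual."""
    residual = list(signal)
    imfs = []
    for _ in range(max_imfs):
        maxima, minima = local_extrema(residual)
        if len(maxima) + len(minima) < 2:   # residual has (almost) no oscillation left
            break
        imf = sift(residual)
        imfs.append(imf)
        residual = [r - c for r, c in zip(residual, imf)]
    return imfs, residual

# Synthetic test signal: fast oscillation + slow oscillation + linear trend.
signal = [math.sin(2 * math.pi * t / 10) + 0.5 * math.sin(2 * math.pi * t / 50) + 0.01 * t
          for t in range(200)]
imfs, residual = emd(signal)
# Eq. (3): summing all IMFs and the residual recovers the original signal.
rebuilt = [sum(parts) + r for parts, r in zip(zip(*imfs), residual)]
```

Because each IMF is subtracted from the running residual, the reconstruction in Eq. (3) holds by construction, whatever envelope method is chosen.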

3.2 The basic theory of LSTM

Long Short-Term Memory (LSTM) is a time-based recurrent neural network that has been developed as an improvement upon the foundational Recurrent Neural Network (RNN). RNNs exhibit certain limitations when dealing with long-term memory data [29], and LSTMs address these shortcomings by introducing the concept of “gates”, which include forget gates, input gates, and output gates. These gates can analyze which parts of temporal data need to be retained and memorized and which parts should be forgotten and discarded [30]. The notion of gates enables LSTMs to handle dependency relationships in long sequence data and make reliable predictions based on the identified data correlations.

Figure 4.

LSTM basic internal structure.

The basic internal structure of LSTM is illustrated in Fig. 4, which demonstrates the components such as cell states, forget gates, input gates, and output gates. The cell state serves as the transmission mechanism of LSTM, enabling the conveyance and preservation of information within data sequences. Due to the presence of gated units, the transmission of information through cell states is not limited by data length. Moreover, cell states intricately manipulate the flow of information through addition and multiplication, playing a pivotal role in controlling the input and output of information. Gated units execute operations such as forgetting, input, and output by employing weight values generated by the activation function $\sigma$ (indicating the importance of information about corresponding operations). In the figure, $X_{t}$ represents the input data sequence, $h_{t}$ is the output obtained through the mapping of data information under hidden states using the tanh function, and $C_{t}$ is the updated cell state information at time $t$ . The LSTM computation process is as follows:

Compute the forget gate $F_{t}$ , input gate $I_{t}$ , output gate $O_{t}$ , and candidate cell state $\widetilde{C}_{t}$ in the gated units by applying the corresponding weight values $W$ and biases $b$ to the input information $X_{t}$ at time $t$ and the hidden state output $h_{t-1}$ at time $t-1$ , using the sigmoid activation function $\sigma$ for the three gates and the tanh function for the candidate cell state. The calculation formulas are as follows:

$\displaystyle F_{t}=\sigma(W_{\textit{XF}}X_{t}+W_{\textit{hF}}h_{t-1}+b_{F})$ (4)

$\displaystyle I_{t}=\sigma(W_{\textit{XI}}X_{t}+W_{\textit{hI}}h_{t-1}+b_{I})$ (5)

$\displaystyle O_{t}=\sigma(W_{\textit{XO}}X_{t}+W_{\textit{hO}}h_{t-1}+b_{O})$ (6)

$\displaystyle\widetilde{C}_{t}=\tanh(W_{\textit{XC}}X_{t}+W_{\textit{hC}}h_{t-1}+b_{C})$ (7)

Compute the updated cell state $C_{t}$ . This step involves performing element-wise multiplication $\odot$ operations between the forget gate information value $F_{t}$ at time $t$ and the cell state $C_{t-1}$ representing historical information at time $t-1$ , as well as between the input gate information value $I_{t}$ at time $t$ and the candidate cell state $\widetilde{C}_{t}$ . The resulting cell state, obtained by adding both products, has already forgotten some information compared to the previous time step and has stored new information.

$\displaystyle C_{t}=F_{t}\odot C_{t-1}+I_{t}\odot\widetilde{C}_{t}$ (8)

Calculate the output $h_{t}$ under the hidden state. This step employs the output gate to control how much of the current memory information $C_{t}$ is outputted to $h_{t}$ .

$\displaystyle h_{t}=O_{t}\odot\tanh C_{t}$ (9)
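As a concrete reading of Eqs. (4)-(9), the following sketch performs one LSTM time step in plain Python. It is illustrative only: the function names (`sigmoid`, `affine`, `lstm_step`) and the dictionary layout of the parameters are assumptions made here, with keys chosen to mirror the subscripts in the equations.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def affine(W_x, x, W_h, h, b):
    """Compute W_x x + W_h h + b for list-of-lists matrices and list vectors."""
    return [
        sum(wx * xi for wx, xi in zip(W_x[r], x))
        + sum(wh * hi for wh, hi in zip(W_h[r], h))
        + b[r]
        for r in range(len(b))
    ]

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step; p maps names such as 'W_XF' to weights or biases."""
    f_t = [sigmoid(v) for v in affine(p["W_XF"], x_t, p["W_hF"], h_prev, p["b_F"])]  # Eq. (4)
    i_t = [sigmoid(v) for v in affine(p["W_XI"], x_t, p["W_hI"], h_prev, p["b_I"])]  # Eq. (5)
    o_t = [sigmoid(v) for v in affine(p["W_XO"], x_t, p["W_hO"], h_prev, p["b_O"])]  # Eq. (6)
    c_tilde = [math.tanh(v)
               for v in affine(p["W_XC"], x_t, p["W_hC"], h_prev, p["b_C"])]         # Eq. (7)
    # Eq. (8): keep part of the old cell state and add part of the candidate.
    c_t = [f * c + i * ct for f, c, i, ct in zip(f_t, c_prev, i_t, c_tilde)]
    # Eq. (9): the output gate scales the squashed cell state.
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t

# Toy example with one input feature and one hidden unit, all parameters zero.
zero = {name: [[0.0]] for name in
        ("W_XF", "W_hF", "W_XI", "W_hI", "W_XO", "W_hO", "W_XC", "W_hC")}
zero.update({name: [0.0] for name in ("b_F", "b_I", "b_O", "b_C")})
h_t, c_t = lstm_step([1.0], [0.0], [2.0], zero)
```

With all parameters at zero, every gate evaluates to 0.5 and the candidate state to 0, so the step simply halves the previous cell state. This makes the roles of the forget and input gates in Eq. (8) easy to see.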

The primary rationale for selecting the LSTM model in this study is its exceptional efficacy in modeling and processing time series data. QUIC network traffic data typically exhibits sequentiality and temporal dependence. LSTM, being a recurrent neural network, possesses the ability to retain long-term memory and is thus well-suited for capturing extended dependencies. This attribute is highly advantageous in the prediction and analysis of network traffic behavior. Furthermore, the LSTM model represents a state-of-the-art advancement in the realm of deep learning, exhibiting remarkable versatility in its ability to accommodate diverse data sets and tasks. Hence, LSTM, a robust tool, is well-suited for the prediction of MPQUIC network traffic timing and the identification of anomalies.

3.3 Model design

Figure 5.

EMD-LSTM model structure diagram.

The EMD-LSTM model for detecting anomalous MPQUIC traffic is proposed in this study, leveraging the advantages of data decomposed by the EMD method at different time scales, which adapts well to data frequency changes, and the capability of the LSTM to effectively model and accurately predict each IMF. The model is designed to explore anomaly detection in the MPQUIC network environment. The structure of the model modules is illustrated in Fig. 5.

The model is primarily divided into three modules, which are designed based on data acquisition, data preprocessing, and data prediction. The data acquisition module in the figure generates MPQUIC network traffic data through the simulation design component of the NS-3 system. Upon generating the corresponding traffic data, the statistical component processes the data to obtain a series of simulation datasets. Subsequently, the traffic preprocessing module in the figure decomposes the simulated datasets into multiple IMFs using the EMD method, and cleans and normalizes the IMF data to meet the LSTM model’s data input requirements. The model’s data input format is as follows:

$\displaystyle Y_{in}=\begin{bmatrix}Y_{1}\\ \vdots\\ Y_{t-n}\\ \end{bmatrix}=\begin{bmatrix}y_{1}&\cdots&y_{n}\\ \vdots&&\vdots\\ y_{t-n}&\cdots&y_{t-1}\\ \end{bmatrix}$ (10)

Here, $n$ is the time delay step, indicating the use of $n$ historical data points to predict future data.

Finally, the LSTM module in the figure divides each IMF sample set into test and training sets and trains the LSTM model using the training set data. After training the model, the LSTM model file is saved, and the test set data is used for the prediction process to obtain the predicted values for each IMF. The model’s data output format is as follows:

$\displaystyle Y_{out}=\begin{bmatrix}\widetilde{Y}_{1}\\ \vdots\\ \widetilde{Y}_{t-n}\\ \end{bmatrix}=\begin{bmatrix}y_{n+1}&\cdots&y_{n+z}\\ \vdots&&\vdots\\ y_{t-n}&\cdots&y_{t-n+z}\\ \end{bmatrix}$ (11)

Here, $z$ is the prediction window size, indicating the data for the next $z$ time points after the prediction.
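The input and output formats can be illustrated with a small sliding-window helper. This is a sketch under one stated assumption: each row of $n$ historical values is paired with the $z$ values that immediately follow it. The function name `make_windows` is hypothetical and the exact index ranges used in the experiments may differ.

```python
def make_windows(series, n, z):
    """Pair each run of n past values with the z values that follow it."""
    inputs, targets = [], []
    for start in range(len(series) - n - z + 1):
        inputs.append(series[start:start + n])           # one row of Y_in
        targets.append(series[start + n:start + n + z])  # matching row of Y_out
    return inputs, targets

series = [1, 2, 3, 4, 5, 6, 7]
X, Y = make_windows(series, n=3, z=2)
# X == [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
# Y == [[4, 5], [5, 6], [6, 7]]
```

In the EMD-LSTM setting, such windows would be built separately for each normalized IMF before training.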

The model reconstructs the predicted IMF data to obtain complete prediction data. To evaluate the model’s performance, a metric evaluation component is added, calculating the Mean Absolute Percentage Error (MAPE) and Root Mean Square Error (RMSE) parameter values of the model.

$\displaystyle\textit{RMSE}=\sqrt{\frac{1}{t}\sum_{j=1}^{t}(y_{j}-\widetilde{y}_{j})^{2}}$ (12)

$\displaystyle\textit{MAPE}=\left(\frac{1}{t}\sum_{j=1}^{t}\left|\frac{y_{j}-\widetilde{y}_{j}}{y_{j}}\right|\right)\times 100$ (13)

In the above expressions, $y$ represents the actual value, while $\widetilde{y}$ denotes the predicted value. A smaller RMSE indicates a higher precision of the model, whereas a MAPE closer to 0% signifies a more accurate model. Conversely, a MAPE closer to 100% implies a lower-quality model. The design of the model employs Empirical Mode Decomposition (EMD) to decompose complex, noisy time series data into multiple Intrinsic Mode Functions (IMFs). Following this granular approach, Long Short-Term Memory (LSTM) modeling and prediction are performed on each IMF separately, which reduces the prediction error and enhances accuracy. This design allows the LSTM to analyze the original traffic data across multiple time scales, thereby capturing long-term dependencies within the data more effectively.
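Both metrics follow directly from Eqs. (12) and (13). The sketch below uses hypothetical function names (`rmse`, `mape`) and made-up sample values purely for illustration.

```python
import math

def rmse(actual, predicted):
    """Root Mean Square Error, Eq. (12)."""
    t = len(actual)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(actual, predicted)) / t)

def mape(actual, predicted):
    """Mean Absolute Percentage Error in percent, Eq. (13).

    Undefined when any actual value is zero, so such points must be
    excluded or handled before calling this function.
    """
    t = len(actual)
    return 100.0 * sum(abs((y - p) / y) for y, p in zip(actual, predicted)) / t

actual = [100.0, 200.0]       # illustrative values, not experimental data
predicted = [110.0, 180.0]
error_rmse = rmse(actual, predicted)   # sqrt((10**2 + 20**2) / 2) = sqrt(250)
error_mape = mape(actual, predicted)   # (10% + 10%) / 2 = 10%
```

Note that RMSE carries the unit of the data, while MAPE is scale-free, which is why the two are usually reported together.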

4. Experiment analysis

4.1 Simulation settings

Table 1
Experimental environment configuration

Software and hardware	Configuration
Operating system	Ubuntu 20.04 & Windows 10
CPU	Intel(R) Core(TM) i7-9750H
Matlab	R2022b
Network Simulator-3 (NS-3)	3.26
RAM	16 GB

In this study, the simulation experiments were conducted using the Network Simulator-3 (NS-3) system, with the experimental environment configuration as shown in Table 1. NS-3 is a dedicated tool for network simulation experiments, incorporating numerous algorithms such as fair scheduling, routing selection, and congestion control. It enables precise configuration of parameters such as bandwidth, delay, and packet loss rate in the network environment, offering high controllability for conducting MPQUIC simulation experiments. Moreover, the NS-3 system features a powerful engine, encompassing a wide range of network models to provide accurate simulation results. Additionally, NS-3, in conjunction with its visualization capabilities, plots the obtained simulation data, facilitating a more intuitive analysis and understanding of MPQUIC’s performance for researchers.

The topology of the MPQUIC network transmission system in the experimental setup is depicted in Fig. 6, with the parameters of the experimental setup presented in Table 2. In the MPQUIC network transmission system, the sender can transmit QUIC packets to the receiver via two transmission paths. This experiment aims to investigate the differences in traffic data on the paths under normal and abnormal conditions of MPQUIC, so the two paths are given identical bandwidth and delay for variable control and comparison. In this network environment, the sender’s data is transmitted through two routers, R0 and R2, propagating into the core network, then through routers R1 and R3 at the next hop address, and ultimately reaching the receiver. We set the path bandwidth in the edge network to 12 Mbps with a 1 ms delay. Since data congestion typically occurs in the core network, where large amounts of data accumulate, we intentionally set the path bandwidth in the core network to 5 Mbps with a 25 ms delay to align with realistic scenarios. The queue management algorithm employed on the path is the classical PfifoFast algorithm [31]. The experiment primarily investigates the impact of abnormal traffic bursts on normal traffic on the path. To avoid excessive packet loss obscuring the experimental results, a global packet loss rate of 1% is set, which does not affect the generation of overall result data.

Table 2

Simulation experiment parameter

Parameter name	Value
Core network delay	25 ms
Core network bandwidth	5 Mbps
Edge network delay	1 ms
Edge network bandwidth	12 Mbps
Queue algorithm	PfifoFast
CBR send rate	50 Mbps
Number of attack cycles	10
Attack duration	2 s
Attack interval	1 s
Simulation time	40 s
Number of CBR nodes	5
Scheduler type	Round robin

Figure 6.

MPQUIC network transmission system topology.

Furthermore, to simulate the attack scenario of abnormal traffic bursts, five attack flow generation nodes and five termination nodes are connected to routers R0 and R1, respectively. These nodes are referred to as constant bit rate (CBR) background traffic generators and send TCP-type traffic as abnormal traffic in the MPQUIC transmission system for attack simulation. To ensure that the arrival time of the attack flow at the router is consistent with that of the sender’s traffic, the connection path bandwidth is also set to 12 Mbps with a 1 ms delay. The attack characteristics of the CBR nodes are as shown in the following formula:

$\displaystyle\textit{Attack}\left(C,I,R,S\right)=\left(2\,\text{s},\,1\,\text{s},\,50\,\text{Mbps},\,1400\,\text{B}\right)$ (14)

Here, $C$ represents the duration, set to 2 s, meaning the attack will continue for 2 s after initiation; $I$ represents the interval, set to 1 s, indicating that the attack will stop immediately after the duration and resume after 1 s; $R$ represents the attack rate, set to 50 Mbps; and $S$ represents the size of the attack traffic packets, set to 1400 bytes. The entire simulation experiment is conducted from 0 s to 40 s, with the attack starting at 9 s and ending at 29 s.
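The on/off pattern described by $C$ and $I$ can be expressed as a simple function of simulation time. The sketch below is an assumption-laden illustration: it supposes that the 2 s burst and 1 s pause simply repeat between the stated start and end times, and the names `attack_active` and `attack_rate_mbps` are hypothetical rather than part of NS-3.

```python
def attack_active(t, start=9.0, end=29.0, duration=2.0, interval=1.0):
    """Return True if the CBR burst is on at simulation time t (in seconds)."""
    if not (start <= t < end):
        return False
    phase = (t - start) % (duration + interval)   # position inside the current cycle
    return phase < duration

def attack_rate_mbps(t, rate_mbps=50.0):
    """Attack rate R at time t: R during a burst, 0 otherwise."""
    return rate_mbps if attack_active(t) else 0.0

# Sample the schedule once per half second over the 40 s simulation.
timeline = [(step / 2, attack_rate_mbps(step / 2)) for step in range(80)]
```

Such a timeline is convenient for labelling jitter samples as attack or non-attack periods when the predictions are inspected.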

Figure 7.

Comparison of network traffic throughput.

Drawing on the robust capabilities of the NS-3 system, our simulation experiments have yielded numerous parameters that can be employed for analysis. As depicted in Fig. 7, throughput data of the paths is presented, where throughput refers to the amount of data transmitted per unit time in a network. A higher throughput indicates better network transmission performance. In Fig. 7, Path A is shown to be subjected to an anomalous traffic attack, while Path B represents the normal scenario, and Total reflects the overall throughput situation within the MPQUIC network environment.

Figure 8.

Comparison of network traffic delay.

Delay data for the paths is shown in Fig. 8, where delay characterizes the time taken for network traffic packets to be transmitted from sender to receiver. A smaller delay signifies better network communication conditions. Figure 8 reveals that, under normal circumstances, Path B exhibits stable delay variations, whereas Path A, impacted by the attack traffic, experiences severe fluctuations during the attack period. Together, the throughput and delay data show the significant influence of attack traffic on the network environment and underscore the importance of research on detecting anomalous network traffic.

4.2 Processing of MPQUIC traffic data

This study primarily focuses on analyzing the temporal variation characteristics of MPQUIC traffic data, thus selecting jitter rate data as the input for the model. The value of the jitter rate reflects the fluctuation and instability of the network and is highly sensitive to traffic changes. We can determine the stability changes of the network through indicators such as the peak values, frequency magnitudes, and vibration amplitudes of the jitter rate.

Figure 9.

Unattacked data reconstruction process.

Figure 10.

Attacked data reconstruction process.

Figures 9 and 10 show how the preprocessing module of the EMD-LSTM model operates on MPQUIC traffic for the non-attacked and attacked paths, respectively. To prepare the input for the LSTM model, the EMD method is used to preprocess the jitter rate of the MPQUIC traffic: the jitter rate is decomposed into multiple IMF components, and each IMF component undergoes FFT frequency filtering. Additionally, because the evaluation indicators differ in magnitude and unit, the data must be normalized to eliminate these discrepancies and enable comparison and analysis. We therefore apply mapminmax processing (i.e., normalization that maps the original data to the 0–1 range through a deviation transformation function) in the data preprocessing module. Together, these steps remove noise, unify indicator scales, and improve the prediction accuracy and analyzability of the input jitter rate data. The deviation transformation formula is as follows:

$\displaystyle y_{\textit{map}}^{\ast}=\frac{y-y_{\min}}{y_{\max}-y_{\min}}$ (15)
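As a minimal sketch of Eq. (15), the following NumPy snippet maps a series to the 0–1 range. The function names and the sample jitter values are illustrative only, and returning zeros for a constant series is an assumption made here to avoid division by zero.

```python
import numpy as np


def mapminmax01(y):
    """Scale y to [0, 1] using (y - y_min) / (y_max - y_min), as in Eq. (15)."""
    y = np.asarray(y, dtype=float)
    y_min, y_max = y.min(), y.max()
    if y_max == y_min:
        # Assumption: a constant series carries no range, so map it to zeros.
        return np.zeros_like(y)
    return (y - y_min) / (y_max - y_min)


def inverse_mapminmax01(y_map, y_min, y_max):
    """Undo the scaling so predictions can be reported in original units."""
    return np.asarray(y_map, dtype=float) * (y_max - y_min) + y_min


# Made-up jitter values, used only to exercise the functions.
jitter = np.array([0.002, 0.004, 0.010, 0.006])
scaled = mapminmax01(jitter)
restored = inverse_mapminmax01(scaled, jitter.min(), jitter.max())
print(scaled)
```

Keeping the minimum and maximum of the training data allows predictions to be mapped back to the original scale before error metrics are computed.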

As the raw data in the figures indicate, the jitter rate data generated under the MPQUIC environment is highly dense and contains a significant amount of noise. The trends in the raw data are not visually discernible under both normal and abnormal conditions, making accurate analysis of the MPQUIC network environment challenging. To address this issue and drawing from previous work [32], we employ the EMD reconstruction strategy to conduct the first partial reconstruction in the preprocessing module, obtaining a visually discernible data trend. Under the attacked path, 18 IMF components are generated, and IMFs with frequencies lower than 50 Hz are selected to obtain the reconstruction plot for abnormal situations. Under the non-attacked path, 17 IMF components are generated, and the same filtering operation is performed to obtain the reconstruction plot for normal situations. The results of the preprocessing module indicate that adopting the reconstruction strategy is beneficial for denoising and reflecting the current network status of MPQUIC.
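The frequency-based selection of IMFs can be sketched as follows. This is not the authors' implementation: it assumes the IMFs have already been produced by an EMD routine, uses synthetic sinusoids as stand-ins, and interprets "frequency" as the dominant FFT peak of each component. The 1 kHz sampling rate follows from the 0.001 s sampling step used in the experiments.

```python
import numpy as np


def dominant_frequency(component, fs):
    """Frequency (Hz) of the largest non-DC peak in the component's spectrum."""
    spectrum = np.abs(np.fft.rfft(component))
    freqs = np.fft.rfftfreq(len(component), d=1.0 / fs)
    spectrum[0] = 0.0  # ignore the DC term
    return freqs[np.argmax(spectrum)]


def partial_reconstruction(imfs, fs, cutoff_hz=50.0):
    """Sum only the IMFs whose dominant frequency lies below cutoff_hz."""
    imfs = np.asarray(imfs, dtype=float)
    keep = np.array([dominant_frequency(c, fs) < cutoff_hz for c in imfs])
    return imfs[keep].sum(axis=0), keep


fs = 1000.0                      # one sample every 0.001 s
t = np.arange(1000) / fs         # one second of data
# Synthetic stand-ins for IMFs, ordered from high to low frequency.
stand_in_imfs = np.stack([
    0.2 * np.sin(2 * np.pi * 200 * t),
    0.5 * np.sin(2 * np.pi * 20 * t),
    1.0 * np.sin(2 * np.pi * 5 * t),
])
trend, keep = partial_reconstruction(stand_in_imfs, fs)
print(keep)
```

In this toy case the 200 Hz component is discarded and the 20 Hz and 5 Hz components are summed, which mirrors how the partial reconstruction exposes the underlying trend.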

4.3 EMD-LSTM model parameter

In this study, the primary prediction process involves decomposing multiple IMF components and utilizing them as input data for the LSTM model. These components are predicted individually, and the final result is obtained by summing all IMF components to achieve complete data reconstruction. It can be observed that the LSTM model remains the core of the prediction process, and the parameter settings play a crucial role.
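The decompose, predict, and sum structure described above can be illustrated with a small sketch. The persistence predictor below is a hypothetical stand-in for a trained LSTM, and the random array stands in for real IMFs; only the per-component prediction and the final summation reflect the text.

```python
import numpy as np


def persistence_predictor(window):
    """Hypothetical stand-in for a trained LSTM: repeat the last observed value."""
    return window[-1]


def predict_by_components(imfs, lag, predictors):
    """Predict each IMF with its own predictor, then sum the predictions."""
    imfs = np.asarray(imfs, dtype=float)
    n_steps = imfs.shape[1] - lag
    per_imf = np.empty((imfs.shape[0], n_steps))
    for k, (component, predictor) in enumerate(zip(imfs, predictors)):
        for i in range(n_steps):
            # One-step-ahead prediction from the previous `lag` samples.
            per_imf[k, i] = predictor(component[i:i + lag])
    return per_imf.sum(axis=0), per_imf


rng = np.random.default_rng(0)
toy_imfs = rng.normal(size=(3, 50))  # stand-in for 3 IMFs of 50 samples each
combined, per_imf = predict_by_components(
    toy_imfs, lag=10, predictors=[persistence_predictor] * 3
)
print(combined.shape)
```

Because the reconstruction is a plain sum, the error of the final prediction is driven by whichever components are hardest to predict.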

For the simulation experiment, a sampling interval of 0.001 s was set (i.e., one data point was collected every 0.001 s, corresponding to a sampling frequency of 1 kHz). With a total simulation time of 40 s, this yields 40 $\div$ 0.001 = 40,000 data points per path. Subsequently, we employed the LSTM module in the EMD-LSTM model to partition the acquired data, obtaining a training set comprising 70% of the data and a test set containing the remaining 30% of the data.

Table 3
Model parameter

Parameter name	Value
MaxEpochs	200
InitialLearnRate	0.005
LearnRateDropPeriod	125
LearnRateDropFactor	0.2
Predict window size	1
Delayed step	100
Gradient drop algorithm	Adam
Sequence input node	100
Hidden layer unit	200

The time delay step is set to 100 (i.e., 100 historical data points are used to predict the subsequent $n$ points, where $n$ is the prediction window size). Because the first 100 points serve only as history, 40,000 $-$ 100 = 39,900 data points remain for dataset partitioning, with 27,930 data points for the training set and 11,970 data points for the test set. Additional parameter settings are shown in Table 3. As displayed in Table 3, the prediction window size is set to 1, Adam is employed as the gradient descent algorithm, the maximum number of epochs is 200, the initial learning rate is set to 0.005, the learning rate decay factor is 0.2, and the learning rate decay period is 125 (i.e., the learning rate is multiplied by the decay factor once training reaches 125 epochs). The LSTM model comprises 100 input layer nodes and 200 hidden layer units.
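The sample counts above can be reproduced with a short windowing sketch. It assumes a chronological (unshuffled) 70/30 split and uses a zero-filled array in place of the real jitter series; both are assumptions made for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def make_windows(series, lag, horizon=1):
    """Split a series into (lag inputs, horizon targets) pairs."""
    series = np.asarray(series, dtype=float)
    windows = sliding_window_view(series, lag + horizon)
    return windows[:, :lag], windows[:, lag:]


def chronological_split(X, y, train_tenths=7):
    """Keep the first train_tenths/10 of the samples for training."""
    n_train = len(X) * train_tenths // 10  # integer arithmetic avoids float rounding
    return (X[:n_train], y[:n_train]), (X[n_train:], y[n_train:])


n_samples = round(40 / 0.001)    # 40 s sampled every 0.001 s
series = np.zeros(n_samples)     # placeholder for the jitter-rate series
X, y = make_windows(series, lag=100, horizon=1)
(train_X, train_y), (test_X, test_y) = chronological_split(X, y)

print(len(X), len(train_X), len(test_X))
```

With a delay step of 100 and a prediction window of 1, the sketch yields 39,900 samples, of which 27,930 are used for training and 11,970 for testing, matching the figures stated above.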

The selection of the parameters for the LSTM model in this paper was determined by specific considerations.

The InitialLearnRate is set to 0.005, a deliberately small value. A small learning rate prevents overly rapid parameter updates during the initial stage of training and helps the model converge steadily towards a local minimum. The learning rate decay approach, governed by the parameters LearnRateDropPeriod and LearnRateDropFactor, then systematically decreases the learning rate so that the optimization can continue to make stable progress in later epochs.
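The effect of these three parameters can be sketched as a piecewise-constant schedule. The sketch assumes that the drop takes effect after every 125 completed epochs and that epochs are counted from 1; the function name is illustrative.

```python
def piecewise_learning_rate(epoch, initial=0.005, drop_period=125, drop_factor=0.2):
    """Learning rate for a 1-based epoch index under a piecewise-constant schedule."""
    # Assumption: the rate is multiplied by drop_factor after each full drop_period.
    n_drops = (epoch - 1) // drop_period
    return initial * drop_factor ** n_drops


# Values from Table 3: 200 epochs, initial rate 0.005, factor 0.2, period 125.
schedule = [piecewise_learning_rate(e) for e in range(1, 201)]
print(schedule[0], schedule[124], schedule[125], schedule[-1])
```

Under these assumptions the first 125 epochs run at 0.005 and the remaining 75 epochs at 0.001, so the rate drops only once within the 200-epoch budget.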

Additionally, the parameters for time series prediction, namely the Predict window size and Delayed step, were configured to match the requirements of the prediction task. These parameters govern how the model analyzes time series data: the Delayed step fixes how many past values are used as input, which accommodates potential temporal lags, and the Predict window size fixes how many future values are predicted.

Furthermore, the Hidden layer unit and Sequence input node are crucial elements utilized in the development of the LSTM network architecture. The model complexity and capacity are influenced by these parameters, and hence, we have chosen them in accordance with the task requirements and experimental results.

These parameters were selected through a comprehensive evaluation of domain expertise, experimental results, and model efficacy. The objective is to identify a suitable model configuration for detecting network traffic anomalies in the MPQUIC system. While parameter selection involves a certain degree of subjectivity, we ensure its reasonableness through experimental verification and performance evaluation.

4.4 Analysis and comparison of experimental results

In this study, we conducted a comparative analysis of experimental results with several models, including BP, RF, and LSTM, to demonstrate the usability and advantages of the proposed model, as well as to verify the feasibility of implementing traffic anomaly detection in MPQUIC networks.

Figure 11.

Comparison of EMD-LSTM predicted results.

Figure 11 shows the results of the EMD-LSTM model’s prediction of the MPQUIC network traffic jitter rate. Specifically, Figs 11(a) and 11(b) represent the prediction scenarios under attacked and non-attacked paths, respectively. To show the trend of the predicted values, the figures also include the data reconstructed after decomposing the experimental data. The results indicate that the data predicted by the EMD-LSTM model are in close agreement with the actual values, whether the model is trained on abnormal or normal data, and exhibit favorable trend consistency.

Moreover, as the attack in the experiment persisted until 29 seconds, the predicted values in Fig. 11(a) between the sample points $2.793\times 10^{4}$ and $2.9\times 10^{4}$ still reflect the attack situation of the abnormal traffic, which is consistent with the expected results of the experimental setup. In summary, the proposed EMD-LSTM-based MPQUIC traffic anomaly detection model demonstrates high sensitivity to abnormal traffic data and is suitable for traffic anomaly detection, exhibiting considerable potential and broad application prospects in practical implementation.

Figure 12.

Each IMF component prediction result of attacked data.

Figure 13.

Each IMF component prediction result of unattacked data.

The EMD-LSTM model’s excellent performance in detecting anomalies in MPQUIC traffic can be attributed to its combination of the EMD decomposition operation with the LSTM model. Figures 12 and 13 show the prediction results of the IMF components after decomposing the abnormal and normal data, respectively. The prediction performance varies across IMF components, and some components show visible discrepancies, for instance IMF2 and IMF3 in Fig. 12 and IMF1 in Fig. 13. This situation arises from the significant noise present in the network data; although EMD decomposition can eliminate noise, it still has certain limitations. Notably, the presence of noise also greatly affects the predictions of other models. Nevertheless, combining EMD with LSTM still outperforms single models, which is reflected in the subsequent model indicator comparisons.

In conclusion, the prediction results in Figs 12 and 13 indicate that the EMD-LSTM model, when processing network traffic data, exhibits suboptimal prediction performance for a small portion of IMFs, primarily due to the substantial interference from noise. However, the EMD-LSTM model demonstrates high prediction accuracy for the majority of IMF components, suggesting that its overall data fitting performance remains satisfactory. Therefore, we believe that the EMD-LSTM model can effectively exploit the local features of IMF components, providing reliable predictions for the majority of IMF components and offering new strategies and approaches for future MPQUIC traffic anomaly detection efforts.

Figure 14.

MAPE parameter comparison of IMFs.

Figure 15.

Comparison of predicted results for each model.

For all IMF component predictions, the MAPE (Mean Absolute Percentage Error) metric is utilized for analysis and discussion, resulting in Fig. 14. As observed in Fig. 14, the noise impact generated by the attacked paths is generally greater than that of the unattacked paths. This phenomenon indicates that traffic attacks significantly reduce network stability, produce substantial noise data, and interfere with model predictions of network traffic data. The overall prediction accuracy of the model is affected by network attacks, thus necessitating further optimization.
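For reference, the two error metrics used in this evaluation (MAPE here and RMSE in the later model comparison) can be computed as in the following sketch. It uses the standard definitions and assumes the true values are nonzero so that the percentage error is defined; the sample values are made up.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean square error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (requires nonzero y_true)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)


# Made-up values, used only to exercise the functions.
y_true = np.array([0.010, 0.020, 0.040])
y_pred = np.array([0.011, 0.018, 0.040])
print(rmse(y_true, y_pred), mape(y_true, y_pred))
```

Because MAPE divides by the true value, IMF components that oscillate around zero can produce large percentage errors even when the absolute error is small, which is worth keeping in mind when reading per-IMF MAPE values.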

Figure 15(a) displays a comparison of prediction results for the proposed EMD-LSTM model with other models such as BP, RF, and LSTM under attacked paths. Figure 15(b) presents a comparison and analysis of prediction values generated by multiple models under unattacked paths. In Figs 15(a) and 15(b), the EMD-LSTM model shows the most overlap between predicted values and true test set values compared to other models, indicating lower prediction errors and superior predictive performance. Furthermore, the overlap is more pronounced in normal situations, with larger overlapping areas and closer agreement between predicted and true peak values, highlighting the model’s high accuracy and analytical capabilities. This not only signifies the EMD-LSTM model’s robustness and accuracy in handling anomalies in network traffic data but also suggests its higher prediction efficiency under normal conditions compared to other models.

Table 4

Model prediction evaluation index

Comparison model	RMSE (unattacked)	MAPE (unattacked)	RMSE (attacked)	MAPE (attacked)
BP	0.0010469	6.05%	0.0011993	27.74%
RF	0.0010103	6.85%	0.0011311	23.09%
LSTM	0.0010321	6.49%	0.0011467	23.82%
EMD-LSTM	0.00092477	1.13%	0.0011191	22.44%

Figure 16.

Model prediction analysis indicator.

Based on the predicted and true values displayed in Fig. 15, the evaluation indicators for each model are calculated, as shown in Table 4. Figure 16(a) shows the root mean square error (RMSE) of each model’s predictions, while Fig. 16(b) shows the mean absolute percentage error (MAPE). Smaller RMSE values and MAPE values closer to 0 indicate better model performance. These figures and tables show that the EMD-LSTM model improves every evaluation indicator: the decomposition reduces the influence of noisy data on the prediction results, and the LSTM’s long-term memory capability learns complex data dependencies. The BP, RF, and LSTM models used for comparison were each deployed individually under identical experimental settings, so the comparisons serve as an early verification of the proposed method. The EMD-LSTM model’s error indicators for abnormal data are higher than those for normal data because of the impact of noise; the same pattern appears in the other comparison models and is expected. Comparing the same type of data, the RMSE values of EMD-LSTM under normal conditions are 11.66%, 8.46%, and 10.39% lower than those of BP, RF, and LSTM, respectively, while the MAPE values are 81.32%, 83.5%, and 82.5% lower, respectively. The EMD-LSTM model thus achieves the lowest errors, with a MAPE approaching zero, for normal network traffic.

Under abnormal conditions, the RMSE values are 6.68%, 1.06%, and 2.4% lower, respectively, and the MAPE values are 19.1%, 2.81%, and 5.79% lower, respectively. Although the margins are smaller, the EMD-LSTM model still fits anomalous data better than the alternatives and reduces prediction errors when evaluating network traffic anomalies. Overall, the evaluation indicators show that the EMD-LSTM model outperforms the single BP, RF, and LSTM models in both normal and abnormal situations.
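The relative reductions quoted above can be recomputed directly from Table 4. The sketch below uses the table values verbatim; the dictionary layout and function name are illustrative.

```python
# Values copied from Table 4: (RMSE, MAPE %) for unattacked and attacked paths.
table4 = {
    "BP":       {"unattacked": (0.0010469, 6.05),  "attacked": (0.0011993, 27.74)},
    "RF":       {"unattacked": (0.0010103, 6.85),  "attacked": (0.0011311, 23.09)},
    "LSTM":     {"unattacked": (0.0010321, 6.49),  "attacked": (0.0011467, 23.82)},
    "EMD-LSTM": {"unattacked": (0.00092477, 1.13), "attacked": (0.0011191, 22.44)},
}


def relative_reduction(baseline, proposed):
    """Percentage by which `proposed` is lower than `baseline`."""
    return (baseline - proposed) / baseline * 100.0


reductions = {}
for scenario in ("unattacked", "attacked"):
    emd_rmse, emd_mape = table4["EMD-LSTM"][scenario]
    for model in ("BP", "RF", "LSTM"):
        base_rmse, base_mape = table4[model][scenario]
        reductions[(scenario, model)] = (
            relative_reduction(base_rmse, emd_rmse),
            relative_reduction(base_mape, emd_mape),
        )
        print(scenario, model, [round(v, 2) for v in reductions[(scenario, model)]])
```

The recomputed values agree with the percentages in the text up to truncation of the last digit (for example, about 82.59% versus the reported 82.5% for the LSTM MAPE on unattacked data).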

After considering all factors, the EMD-LSTM model effectively improves and optimizes many evaluation metrics. This is achieved through the utilization of EMD decomposition processes, which mitigate the influence of noisy data on prediction outcomes. Additionally, the model leverages the long-term memory capacity of LSTM to capture intricate data dependencies. Despite potential interference from noise under exceptional circumstances, the performance of the EMD-LSTM model is generally deemed good. The findings presented in this study not only showcase the superior performance of the EMD-LSTM model in detecting abnormal network traffic, but also highlight its notable predictive efficiency compared to other models in normal scenarios. These results offer valuable insights and potential strategies for the future detection of MPQUIC network traffic anomalies.

5. Conclusions

In this study, we propose an EMD-LSTM-based MPQUIC traffic anomaly detection model, addressing the volatility and trend characteristics of MPQUIC network traffic data. We obtain MPQUIC traffic data through simulation experiments implemented on the NS-3 system. Subsequently, the Empirical Mode Decomposition (EMD) method is employed to decompose traffic data, and the Long Short-Term Memory (LSTM) model is used to learn long-term dependencies within the traffic data, enabling the overall model to predict and analyze traffic data. Finally, we utilize the Fast Fourier Transform (FFT) to filter out and exclude IMF components with high noise, achieving partial reconstruction of the data. The reconstructed data is then incorporated into the comparative analysis to investigate whether the predicted data conforms to the original data trend. Comparative results demonstrate that, in both cases, the EMD-LSTM model exhibits the highest degree of coincidence with the true values of the original data. Moreover, the RMSE values of the EMD-LSTM model are reduced by 11.66% and 6.68% compared to the BP model, 8.46% and 1.06% compared to the RF model, and 10.39% and 2.4% compared to the LSTM model, while the MAPE values are reduced by 81.32% and 19.1% compared to the BP model, 83.5% and 2.81% compared to the RF model, and 82.5% and 5.79% compared to the LSTM model. The results not only visually demonstrate the model’s accurate and reliable predictions for both normal and anomalous situations but also validate its superior performance through predictive evaluation indicators.

However, it is worth noting that some IMF components, after EMD decomposition, exhibit unsatisfactory prediction performance and low accuracy. This indicates that the EMD decomposition method is not entirely suitable for all IMF components, and adjustments and optimizations are required for IMF components containing high noise data.

The research conducted in this paper has several limitations, including the limited availability of data, the selection of model parameters, the handling of noisy data, and the complexity of the experimental configuration. To address these challenges, future research will explore additional data sources, adjust parameters, improve noise removal techniques, and streamline the experimental procedures. Notwithstanding these difficulties, the limitations point to promising directions for future work that would improve the quality and usability of the research.

Footnotes

Funding

This work was supported in part by the National Natural Science Foundation of China under Grant No. 61962026, and in part by the Natural Science Foundation of Jiangxi Province under Grant No. 20224ACB202007.

References

1. Zhan, Zhu, Zhang, Wang. Website-aware protocol confusion network for emergent HTTP/3 website fingerprinting. IEEE Transactions on Information Forensics and Security. 2023; 18: 2427-2439.

2. Langley, Riddoch, Wilk, Vicente, Krasic, Zhang, Yang, Kouranov, Swett, Iyengar. The QUIC transport protocol. Proceedings of the Conference of the ACM Special Interest Group on Data Communication (SIGCOMM ’17). 2017; 183-196.

3. Carlucci, De Cicco, Mascolo. HTTP over UDP. Proceedings of the 30th Annual ACM Symposium on Applied Computing. 2015; 609-614.

4. Almuhammadi, Alnajim, Ayub. QUIC Network traffic classification using ensemble machine learning techniques. Applied Sciences. 2023; 13(8): 4725.

5. Kakhki, Jero, Choffnes, Nita-Rotaru, Mislove. Taking a long look at QUIC. Proceedings of the 2017 Internet Measurement Conference. 2017; 290-303. Available from: https://conferences.sigcomm.org/imc/2017/papers/imc17-final39.pdf.

6. Thomson. RFC 0000 QUIC: A UDP-based multiplexed and secure transport status of this memo copyright notice. 2021. Available from: https://www.rfc-editor.org/v3test/draft-ietf-quic-transport-34-bad-pdf-line-break.pdf.

7. Benson. Dissecting performance of production QUIC. Proceedings of the Web Conference 2021. 2021; 1157-1168.

8. Trevisan, Giordano, Drago, Khatouni. Measuring HTTP/3: Adoption and performance. IEEE Xplore. 2021; 1-8. Available from: https://ieeexplore.ieee.org/document/9501274.

9. De Coninck, Bonaventure. Multipath QUIC. Proceedings of the 13th International Conference on emerging Networking EXperiments and Technologies. 2017; 160-166.

10. Cao, Lei, Wang, Shao. l2-MPTCP: A learning-driven latency-aware multipath transport scheme for industrial internet applications. IEEE Transactions on Industrial Informatics. 2022; 18(12): 8456-8466.

11. Wejin, Badejo, Jonathan, Folasade. A brief survey on the experimental application of MPQUIC protocol in data communication. 2022 5th Information Technology for Education and Development (ITED). 2022; 1-8.

12. Roskind. QUIC Quick UDP Internet Connections Multiplexed Stream Transport over UDP. 2012. Available from: https://docs.google.com/document/d/1RNHkx_VvKWyWg6Lr8SZ-saqsQx7rFV-ev2jRFUoVD34/edit.

13. Xing, Xue, Zhang, Han, Wei DSL, Sun. A Stream-Aware MPQUIC Scheduler for HTTP Traffic in Mobile Networks. IEEE Transactions on Wireless Communications. 2023; 22(4): 2775-2788.

14. Thanh Trung, Minh Hai, Phi Le, Phan Thuan, Nguyen. A Q-learning-based Multipath Scheduler for Data Transmission Optimization in Heterogeneous Wireless Networks. 2023 IEEE 20th Consumer Communications & Networking Conference (CCNC). 2023.

15. Zhuang, Han, Xue, Wei DSL, Sun. Achieving Flexible and Lightweight Multipath Congestion Control Through Online Learning. IEEE Transactions on Network and Service Management. 2023; 20(1): 46-59.

16. Weiß, Aleksandrov, Maxime, Jentsch. Partial Autocorrelation Diagnostics for Count Time Series. Entropy. 2023; 25(1): 105.

17. Tan, Pietrafesa. Spectral analysis of a time series: From an additive perspective to a multiplicative perspective. Applied and Computational Harmonic Analysis. 2023; 63: 94-112.

18. Jiang. Cellular traffic prediction with machine learning: A survey. Expert Systems with Applications. 2022; 201: 117163.

19. Jafari, Kaan, Sel, Mohammadi, Pettigrew. Physics-informed neural networks for modeling physiological time series: A case study with continuous blood pressure. 2023.

20. Zhang. Forecasting of PM25 concentration time series based on SSA-LSTM model. International Conference on Statistics, Data Science, and Computational Intelligence (CSDSCI 2022). 2023; 12510: 373-380.

21. Wang, Gan, Mao, Chen. Forecasting power demand in China with a CNN-LSTM model including multimodal information. Energy. 2023; 263: 126012.

22. Zhang, Yuan, Zhang. Multi-indicator water quality prediction with attention-assisted bidirectional LSTM and encoder-decoder. Information Sciences. 2023; 625: 65-80.

23. Zhang. A review of recurrent neural networks: LSTM cells and network architectures. Neural Computation. 2019; 31(7): 1235-1270.

24. Shyam, Sharma, Parashar, Singh. Network traffic prediction using long short-term memory. 2020 International Conference on Electronics and Sustainable Communication Systems (ICESC). 2020.

25. Wang, Zhuo, Yan. A network traffic prediction method based on LSTM. ZTE Communications. 2019; 17(2): 19-25.

26. Huang, Shen, Long, Shih, Zheng, Yen, Tung, Liu. The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proceedings of the Royal Society of London Series A: Mathematical, Physical and Engineering Sciences. 1998; 454(1971): 903-95.

27. Sun, Gao, Zhang, Liu. Transfer learning: A new aerodynamic force identification network based on adaptive EMD and soft thresholding in hypersonic wind tunnel. Chinese Journal of Aeronautics. 2023.

28. Boudraa, Cexus. EMD-based signal filtering. IEEE Transactions on Instrumentation and Measurement. 2007; 56(6): 2196-202.

29. Zhao, Jiang, Zhang, Zhao, Zhang, Guo. ERNN: Error-resilient RNN for encrypted traffic detection towards network-induced phenomena. IEEE Transactions on Dependable and Secure Computing. 2023; 1-18.

30. Sherstinsky. Fundamentals of Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) network. Physica D: Nonlinear Phenomena. 2020; 404: 132306.

31. Hubert, Graf, Maxwell, van Mook, van Oosterhout, Schroeder, Larroy. Linux advanced routing & traffic control. In Ottawa Linux Symposium. 2002; 213.

32. Lei, Cao, Shao. An QUIC Traffic Anomaly Detection Model Based on Empirical Mode Decomposition. IEEE Xplore. 2022; 76-80. Available from: https://ieeexplore.ieee.org/abstract/document/9831335/citations?tabFilter=:papers#citations.
