Data driven hybrid fuzzy model for short-term traffic flow prediction

Abstract

Traffic flow prediction can not only improve the reasonability of managers’ decision-making and road planning, but also help travelers avoid traffic congestion. To further improve prediction accuracy, this study presents a data-driven hybrid model for short-term traffic flow prediction. The hybrid model first extracts the periodicity pattern from the traffic flow data, then constructs a functionally weighted single-input-rule-modules connected fuzzy inference system (FWSIRM-FIS) for the residual data that remain after removing the periodicity pattern from the original data, and finally generates the prediction by integrating the periodicity pattern with the output of the FWSIRM-FIS model. The partial autocorrelation function (PACF) method is adopted to determine the optimal inputs for the data-driven FWSIRM-FIS model, and the iterative least square method is used to train its parameters. Three detailed experiments on traffic flow prediction are carried out, and comprehensive comparisons with three popular artificial intelligence methods verify the effectiveness and advantages of the proposed hybrid model. According to five comparison indices, the proposed hybrid model achieves the best prediction performance while using far fewer fuzzy rules. The error histograms also show that the proposed hybrid model has the smallest prediction errors among the four methods. The hybrid approach can also be extended to other applications that exhibit periodicity patterns, e.g. travel time estimation and electricity load forecasting.

Keywords

Traffic flow prediction, fuzzy method, single input rule module, least square learning, traffic-flow pattern

1. Introduction

With the rapid development of cities, more and more urban problems, such as air pollution and traffic congestion, have come into public view [1–3]. To solve such problems, urban computing was presented as a new theory [4–6]. It is a comprehensive discipline that utilizes data mining methods to process the big data generated during urban development [4–6]. Urban computing plays an important role in improving the living standards and quality of life of urban residents. Many studies have been carried out in this research area. For example, in [7, 8], traffic congestion and congestion propagation were identified and analyzed. The road usage patterns that are important for understanding traffic characteristics were presented in [9]. And, in [10], an on-line model selection method based on cooperative parallel particle filters was proposed and applied to urban mobility. As another important aspect of urban computing, traffic flow prediction can provide decision support to officers and urban managers, and increase people’s traveling efficiency.

In the past several decades, a variety of methods have been used to predict traffic flow, such as statistical methods, the neural network (NN) method, and fuzzy inference systems (FISs). The statistical methods apply statistical regression to historical traffic data to predict future values of the traffic flow. In [11], a multivariate linear model was built by Dang et al. for traffic flow forecasting and compared with the autoregressive (AR) model and the autoregressive moving average (ARMA) model to verify its effectiveness. In [12], a seasonal ARIMA (SARIMA) model with the exponential smoothing strategy was applied to predict single-interval traffic flow on urban freeways. In [13], it was shown that the AR model and the Back Propagation NN (BPNN) perform differently on traffic flow prediction at different time periods. In [14–17], several extended ARMA models were proposed and used in urban traffic flow prediction.

Compared with the statistical methods, artificial intelligence techniques can give more reasonable responses in uncertain and dynamic environments. In [18], an artificial NN was used to learn the characteristics of traffic flow at one major intersection in Istanbul for accurate traffic flow prediction. In [19], the BPNN, one of the most popular kinds of NNs, was applied to short-term traffic flow prediction. In [20], the NN approach combined with the Bayesian method was utilized for forecasting short-term traffic flow on freeways. In [21], the Kalman filter model and the NN model were combined by the fuzzy method into one hybrid intelligent model for short-term traffic flow prediction. In [22], a supervised weighting-online learning SVR, a variant of the traditional support-vector regression (SVR), was given for short-term traffic flow forecasting. Recently, deep learning methods have also been applied to this problem: for example, a deep belief network (DBN) was examined for traffic flow prediction in [23], and a stacked autoencoder (SAE) model was used for short-term traffic flow prediction in [24]. In addition to the NN methods, FISs also have a strong ability to process nonlinear traffic flow data for accurate prediction. In [25], Mamdani and Sugeno FISs were designed based on historical traffic flow data to realize the prediction. In [26], the fuzzy neural network (FNN), which combines the merits of both the NN and the FIS, was applied to traffic flow prediction based on data collected from on-road sensors. In [27], an urban street traffic flow prediction model was given by combining the FNN, the gate network (GN) and the expert network (EN). In [28], a pruned fast learning FIS (PFLFIS) was proposed for the data-driven prediction of traffic flow. And, in [29], a type-2 FNN, in which type-2 fuzzy sets replace the conventional type-1 fuzzy sets, was proposed for traffic flow prediction.

The aforementioned studies are all data driven. However, many prediction models become complex and difficult to construct when the number of input variables becomes large [30–34], especially fuzzy methods, whose rule bases face the rule explosion problem [35]. In order to alleviate the design difficulty and reduce the number of fuzzy rules, Yi et al. [36, 37] proposed the single-input-rule-module (SIRM) connected FIS (SIRM-FIS). Since its appearance, the SIRM-FIS has found many applications, such as the stabilization control of inverted pendulum systems [37] and the position control of overhead crane systems [38]. The SIRM-FIS first assigns each input variable one SIRM, which is a single-input-single-output FIS, and then aggregates the outputs of all the SIRMs by weighting them with their corresponding importance degrees [39, 40]. In the traditional SIRM-FIS, the importance degrees of all SIRMs are crisp values. Generally, for modeling and prediction applications, the performance of traditional SIRM-FISs is limited by their simple input-output mappings. Recently, in order to enhance the approximation ability of the SIRM method, the crisp importance degrees (crisp weights) of all SIRMs were replaced by functional weights [41]. This new kind of SIRM-FIS is named the functionally weighted SIRM-FIS (FWSIRM-FIS). It has been proved that the FWSIRM-FIS is more powerful for modeling and prediction problems than the traditional SIRM-FISs. Simulation results in [41] also verified the approximation ability and design simplicity of the FWSIRM-FIS.

On the other hand, in addition to the observed traffic flow data, some kinds of prior knowledge exist in the traffic flow. For example, the traffic flows on different weekdays have similar trends, i.e. the weekday traffic flow has a daily periodic pattern. If such prior knowledge can be used, the constructed prediction models can be expected to perform better. Despite its importance, only a limited number of studies consider periodic features in traffic prediction. In [42], spectral analysis was used to extract the periodic pattern of the short-term traffic flow, while in [43], a Fourier series was used to model the periodic component. In [44–47], other kinds of prior knowledge, e.g. monotonicity, continuity and convexity, have been encoded into data-driven models. Their results also verified the usefulness of prior knowledge.

From the discussion above, it can be concluded that encoding prior knowledge into the data-driven model can achieve better performance. Thus, in order to further enhance the prediction accuracy, this study presents a hybrid model and applies it to short-term traffic flow prediction. The main contributions and novelties of this study are listed as follows:

A hybrid model is proposed for the traffic flow prediction. This hybrid model combines the periodicity knowledge model and the residual data driven model to generate more accurate predicted results.

The residual data driven model is accomplished by the FWSIRM-FIS. As shown in [41], the FWSIRM-FIS is easy to design and train, and can reduce the number of fuzzy rules greatly. In this study, the residual data driven FWSIRM-FIS model is also trained by the iterative learning method proposed in [41].

The partial autocorrelation function (PACF) method is adopted to determine the optimal input variables for the data driven FWSIRM-FIS. Existing studies usually choose the input variables manually, while the PACF method provides us one effective way to automatically determine the input variables.

Three detailed experiments are carried out, and comprehensive comparisons are made with three popular methods, namely the adaptive-network-based fuzzy inference system (ANFIS) [48, 49], the PFLFIS [28], and the BPNN [50, 51]. The experimental results demonstrate that the proposed hybrid model gives predictions with small errors and performs better than the three comparative methods.

The rest of this paper is organized as follows. Section 2 introduces the FWSIRM-FIS. Section 3 presents the hybrid model. Section 4 reports the experiments. Finally, the conclusions are given in Section 5.

2. The FWSIRM-FIS

The FWSIRM-FIS was proposed in [41] to enhance the approximation ability of the traditional SIRM-FIS [36, 37] which can efficiently solve the fuzzy rule explosion problem. The number of fuzzy rules in the FWSIRM-FIS still increases linearly with respect to the number of the inputs. Thus, the FWSIRM-FIS can not only deal with the rule explosion phenomenon efficiently, but also have much better approximation performance than the traditional SIRM-FIS.

The FWSIRM-FIS with n input variables x₁, x₂,…, x_n is composed of n SIRMs. Each SIRM can be seen as one single-input-single-output FIS [36, 37, 41]. In the FWSIRM-FIS, each SIRM is assigned with one functional weight. In the inference process of the FWSIRM-FIS, the outputs of all the SIRMs are firstly computed, and then, such output values are multiplied by their corresponding functional weights and combined to generate the final predicted result [36, 37, 41].

The SIRM for the input x_i (i = 1, 2, ⋯, n) can be expressed as [41]

$\text{SIRM-}i : \{R_{i}^{j_{i}} : x_{i} = {\tilde{A}}_{i}^{j_{i}} \to y_{i} = c_{i}^{j_{i}}\}_{j_{i} = 1}^{m_{i}}$ (1) where the ${\tilde{A}}_{i}^{j_{i}}$ are fuzzy sets of the input variable x_i, $c_{i}^{j_{i}}$ is the consequent parameter of rule $R_{i}^{j_{i}}$, and m_i is the number of fuzzy rules in SIRM-i.

Choosing the singleton fuzzifier and the center-of-sets defuzzifier, the crisp fuzzy inference result of SIRM-i can be computed as

$y_{i} (x_{i}) = \frac{\sum_{j_{i} = 1}^{m_{i}} μ_{{\tilde{A}}_{i}^{j_{i}}} (x_{i}) c_{i}^{j_{i}}}{\sum_{j_{i} = 1}^{m_{i}} μ_{{\tilde{A}}_{i}^{j_{i}}} (x_{i})} .$ (2)

Suppose that the functional weight of SIRM-i is

$w_{i} (x) = w_{i}^{(0)} + w_{i}^{(1)} x_{1} + \dots + w_{i}^{(n)} x_{n},$ (3) where $x$ = (x₁, x₂, ⋯, x_n).

Then, the final input-output mapping of the FWSIRM-FIS can be computed as [41]

$\hat{y} (x) = \sum_{i = 1}^{n} w_{i} (x) y_{i} (x_{i}) .$ (4)
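Equations (2) to (4) can be traced with a small numerical sketch. The code below is only an illustration under stated assumptions: the triangular membership functions, the helper names (`memberships`, `sirm_output`, `functional_weight`, `fwsirm_predict`) and the toy parameter values are not taken from [41]; the paper only requires that each input domain be covered by a fuzzy partition.

```python
from typing import List, Sequence


def memberships(x: float, centers: Sequence[float]) -> List[float]:
    """Degrees of x in triangular fuzzy sets centred at the sorted `centers`.

    Assumption: a triangular partition; x is clamped to the partition range
    so that at least one fuzzy set always fires.
    """
    x = min(max(x, centers[0]), centers[-1])
    mu = [0.0] * len(centers)
    for j in range(len(centers) - 1):
        lo, hi = centers[j], centers[j + 1]
        if lo <= x <= hi:
            t = (x - lo) / (hi - lo)
            mu[j], mu[j + 1] = 1.0 - t, t
            break
    return mu


def sirm_output(x_i: float, centers: Sequence[float], c_i: Sequence[float]) -> float:
    """Eq. (2): membership-weighted average of the consequent parameters."""
    mu = memberships(x_i, centers)
    return sum(m * c for m, c in zip(mu, c_i)) / sum(mu)


def functional_weight(x: Sequence[float], w_i: Sequence[float]) -> float:
    """Eq. (3): w_i(x) = w_i^(0) + w_i^(1) x_1 + ... + w_i^(n) x_n."""
    return w_i[0] + sum(a * b for a, b in zip(w_i[1:], x))


def fwsirm_predict(x, centers, c, w) -> float:
    """Eq. (4): sum over all SIRMs of functional weight times SIRM output."""
    return sum(
        functional_weight(x, w[i]) * sirm_output(x[i], centers[i], c[i])
        for i in range(len(x))
    )


if __name__ == "__main__":
    # Toy model with n = 2 inputs and 3 rules per SIRM (6 rules in total).
    centers = [[-1.0, 0.0, 1.0], [-1.0, 0.0, 1.0]]
    c = [[-1.0, 0.0, 1.0], [-2.0, 0.0, 2.0]]
    w = [[0.5, 0.1, 0.0], [0.25, 0.0, 0.2]]
    print(fwsirm_predict([0.4, -0.2], centers, c, w))
```

In this toy setting the rule count is m_1 + m_2 = 6, whereas a conventional grid-partition rule base over the same two inputs would need 3 × 3 = 9 rules; the gap widens quickly as n grows.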

For simplicity, we first introduce some notation. We denote the consequent parameters of the fuzzy rules in all SIRMs and the parameters of the functional weights by the following vectors, respectively:

$c = [c_{1}^{1}, \dots, c_{1}^{m_{1}}, \dots, c_{n}^{1}, \dots, c_{n}^{m_{n}}]^{T},$ (5)

$w = [w_{1}^{(0)}, \dots, w_{1}^{(n)}, \dots, w_{n}^{(0)}, \dots, w_{n}^{(n)}]^{T} .$ (6)

Then, the input-output mapping of the FWSIRM-FIS can be rewritten in the vector form as [41]

$\begin{matrix} y (x, w, c) & = \sum_{i = 1}^{n} (w_{i}^{(0)} + \sum_{j = 1}^{n} w_{i}^{(j)} x_{j}) y_{i} (x_{i}) \\ & = g (x, c)^{T} w \end{matrix}$ (7) in which

$\begin{matrix} g (x, c) = [ & y_{1} (x_{1}), x_{1} y_{1} (x_{1}), \dots, x_{n} y_{1} (x_{1}), \dots, \\ & y_{n} (x_{n}), x_{1} y_{n} (x_{n}), \dots, x_{n} y_{n} (x_{n})]^{T} . \end{matrix}$ (8)

In another way, the input-output mapping of the FWSIRM-FIS can also be calculated as [41]

$\begin{matrix} y (x, w, c) & = \sum_{i = 1}^{n} w_{i} (x) \frac{\sum_{j_{i} = 1}^{m_{i}} μ_{{\tilde{A}}_{i}^{j_{i}}} (x_{i}) c_{i}^{j_{i}}}{\sum_{j_{i} = 1}^{m_{i}} μ_{{\tilde{A}}_{i}^{j_{i}}} (x_{i})} \\ & = f (x, w)^{T} c \end{matrix}$ (9) where

$f (x, w) = [\begin{matrix} w_{1} (x) \frac{μ_{{\tilde{A}}_{1}^{1}} (x_{1})}{\sum_{j_{1} = 1}^{m_{1}} μ_{{\tilde{A}}_{1}^{j_{1}}} (x_{1})} \\ ⋮ \\ w_{n} (x) \frac{μ_{{\tilde{A}}_{n}^{m_{n}}} (x_{n})}{\sum_{j_{n} = 1}^{m_{n}} μ_{{\tilde{A}}_{n}^{j_{n}}} (x_{n})} \end{matrix}] .$ (10)
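The equivalence of the two vector forms (7) and (9) can be checked numerically. The sketch below assumes the normalised firing strengths of each SIRM are already available (here called `phi`, a name introduced only for this illustration) and uses made-up parameter values; it builds g(x, c) from (8) and f(x, w) from (10) and confirms that both inner products give the same output.

```python
from typing import List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Plain inner product of two equally long sequences."""
    return sum(a * b for a, b in zip(u, v))


def g_vector(x: Sequence[float], y_sirm: Sequence[float]) -> List[float]:
    """Eq. (8): for each SIRM i, the entries y_i, x_1*y_i, ..., x_n*y_i."""
    return [y_i * z for y_i in y_sirm for z in [1.0, *x]]


def f_vector(x, w, phi) -> List[float]:
    """Eq. (10): functional weight w_i(x) times each normalised firing strength."""
    out: List[float] = []
    for w_i, phi_i in zip(w, phi):
        weight = w_i[0] + dot(w_i[1:], x)  # eq. (3)
        out.extend(weight * p for p in phi_i)
    return out


# Illustrative values (assumptions, not taken from the paper).
x = [0.4, -0.2]
phi = [[0.0, 0.6, 0.4], [0.2, 0.8, 0.0]]   # normalised memberships per SIRM
c = [[-1.0, 0.0, 1.0], [-2.0, 0.0, 2.0]]   # consequent parameters
w = [[0.5, 0.1, 0.0], [0.25, 0.0, 0.2]]    # functional-weight parameters

y_sirm = [dot(p, ci) for p, ci in zip(phi, c)]          # eq. (2)
w_flat = [v for w_i in w for v in w_i]                   # eq. (6)
c_flat = [v for c_i in c for v in c_i]                   # eq. (5)

y_via_w = dot(g_vector(x, y_sirm), w_flat)               # eq. (7)
y_via_c = dot(f_vector(x, w, phi), c_flat)               # eq. (9)
```

Both `y_via_w` and `y_via_c` evaluate the same model output; the first is linear in w for fixed c, the second linear in c for fixed w.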

3. The Proposed Hybrid Model

In this section, the proposed hybrid model will be presented firstly. Then, how to generate the periodicity knowledge model will be discussed. At last, the training of the residual data driven FWSIRM-FIS will be given.

3.1. The Structure of the Hybrid Model

As previously discussed, the short-term traffic flow has the periodic characteristic. This periodicity knowledge can provide superior complementarity to the uncertainties of the observed data. Thus, in this paper, the hybrid model as shown in Fig. 1 is proposed to obtain better accuracy for the prediction of the short-term traffic flow.

Fig. 1.

The structure of the proposed hybrid model for traffic flow prediction.

This hybrid model combines the knowledge model (the model of the periodicity knowledge) and the residual data driven model (the model constructed by observed residual data using the FWSIRM-FIS). In more detail, we utilize the following steps to construct this hybrid model.

Step 1: Construct the knowledge model through extracting the traffic flow pattern from the training data.

Step 2: Obtain the residual data through removing the traffic flow pattern from the training data.

Step 3: Generate the data-driven model through utilizing the residual data to optimize the parameters of the FWSIRM-FIS.

Step 4: Integrate the outputs from the knowledge model and the residual data driven model to give the final predicted result.
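The integration in Step 4 can be sketched as follows. This is a minimal illustration under assumptions: `hybrid_predict` and the stand-in residual predictor `persistence` are hypothetical names, and the persistence rule merely takes the place of the trained FWSIRM-FIS; it is not the model used in the paper.

```python
from typing import Callable, List, Sequence


def hybrid_predict(
    pattern: Sequence[float],
    recent_flows: Sequence[float],
    next_slot: int,
    residual_model: Callable[[List[float]], float],
) -> float:
    """Add the periodic pattern value to the residual model's forecast.

    pattern        : daily-periodic pattern with T entries (knowledge model)
    recent_flows   : the n latest observed flows, oldest first
    next_slot      : index of the sampling time to be predicted
    residual_model : any callable mapping n residuals (oldest first) to a
                     residual forecast; a stand-in for the FWSIRM-FIS
    """
    n, period = len(recent_flows), len(pattern)
    # Residual of each recent observation: flow minus pattern at its own slot.
    residuals = [
        recent_flows[n - k] - pattern[(next_slot - k) % period]
        for k in range(n, 0, -1)
    ]
    return pattern[next_slot % period] + residual_model(residuals)


def persistence(residuals: List[float]) -> float:
    """Hypothetical placeholder: repeat the latest residual."""
    return residuals[-1]


if __name__ == "__main__":
    pattern = [10.0, 20.0, 30.0, 40.0]  # toy pattern with T = 4
    print(hybrid_predict(pattern, [12.0, 23.0], 2, persistence))
```

Keeping the residual model behind a plain callable makes it easy to swap the placeholder for any trained predictor without touching the integration step.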

In this hybrid model, how to extract the traffic flow pattern and to design the residual data driven model are crucial issues to be solved. In the following subsections, we will give detailed discussions on these problems.

3.2. Extraction of Periodicity Knowledge and Residual Data

To begin, assume that the sampling data of the traffic flow have been collected for M weekdays, and in each day, T data points have been collected. As a result, the sampled time series of the traffic flow data can be written as a series of one dimensional vectors as

$S = \{S_{1}, S_{2}, \dots, S_{M}\}$ (11) in which $S_{k}$ is a vector of the sampling traffic flow data in the kth weekday, and can be expressed as

$S_{k} = [s_{k} (1), s_{k} (2), \dots, s_{k} (T)] ,$ (12) where s_k (j) is the traffic flow at the jth sampling time of the kth weekday, and j = 1, 2, ⋯, T.

As is well known, the traffic flow on weekdays has a daily-periodic characteristic. This daily-periodic pattern can be extracted from the original time series by averaging the M weekdays at each sampling time:

${\bar{S}}_{Ave} = [\frac{1}{M} \sum_{k = 1}^{M} s_{k} (1), \dots, \frac{1}{M} \sum_{k = 1}^{M} s_{k} (T)]$ (13)

Consequently, through removing this daily-periodic pattern, the residual time series $S_{Res}$ of the data set can be obtained as

$S_{Res} = \{S_{1} - {\bar{S}}_{Ave}, \dots, S_{M} - {\bar{S}}_{Ave}\} .$ (14)

For simplicity, we denote this residual time series of the traffic flow as

$S_{Res} = \{s_{R} (1), s_{R} (2), \dots, s_{R} (MT)\} ,$ (15) where s_R (j) represents the jth data point in the residual traffic flow time series of the M weekdays, and j = 1, 2, ⋯, MT.
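Equations (13) to (15) amount to a per-slot average followed by a subtraction. The following sketch, with hypothetical function names and toy numbers, shows both steps on M = 2 days with T = 3 samples each.

```python
from typing import List, Sequence


def daily_pattern(days: Sequence[Sequence[float]]) -> List[float]:
    """Eq. (13): average the M daily series at each of the T sampling times."""
    m = len(days)
    t = len(days[0])
    return [sum(day[j] for day in days) / m for j in range(t)]


def residual_series(days: Sequence[Sequence[float]], pattern: Sequence[float]) -> List[float]:
    """Eqs. (14)-(15): subtract the pattern from every day and concatenate."""
    return [s - p for day in days for s, p in zip(day, pattern)]


if __name__ == "__main__":
    days = [[10.0, 20.0, 30.0], [14.0, 22.0, 26.0]]  # toy data, M = 2, T = 3
    pattern = daily_pattern(days)
    print(pattern)
    print(residual_series(days, pattern))
```

By construction, the residuals at any fixed sampling time sum to zero over the days used to compute the pattern.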

3.3. Learning of the FWSIRM-FIS Model

In order to design a satisfactory data driven model by the FWSIRM-FIS, we should determine the fuzzy rules in all SIRMs and the parameters of all the functional weights. For the fuzzy rules, their antecedent fuzzy sets are usually generated by partitioning the input domains while their consequent parameters need to be tuned by learning algorithms. For the functional weights, all their parameters need to be optimized by learning algorithms. In conclusion, once the fuzzy sets are generated by fuzzy partitions, we still need to optimize the consequent parameters in fuzzy rules and the parameters of the functional weights. Below, we will give an iterative learning algorithm to tune such parameters.

Suppose that the traffic flow of the next sampling time can be affected by the traffic flows of the n sampling times before it. In other words, the data-driven model constructed by the residual data has n inputs and one output, which are respectively denoted as $x$ = [x₁, x₂, ⋯, x_n] and y. Then, the residual training data can be generated from the residual time series as

$(x^{(k)}, y^{(k)}) = ([s_{R} (k), \dots, s_{R} (k + n - 1)], s_{R} (k + n)),$ (16) where k = 1, 2, ⋯, N, and N is the number of training data pairs.
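The sliding-window construction of the training pairs can be sketched as below. The function name `make_pairs` is hypothetical, and the inputs are stored oldest first, which is only a reordering of the lagged values s_R(t - 1), ..., s_R(t - n) used later in the experiments.

```python
from typing import List, Sequence, Tuple


def make_pairs(series: Sequence[float], n: int) -> List[Tuple[List[float], float]]:
    """Build (input window, target) pairs: n consecutive values, then the next one."""
    return [(list(series[k:k + n]), series[k + n]) for k in range(len(series) - n)]


if __name__ == "__main__":
    for x, y in make_pairs([1.0, 2.0, 3.0, 4.0, 5.0], 2):
        print(x, y)
```

A series of length L yields L - n pairs, which matches the counts reported in Section 4: for instance, 6048 training points with n = 8 give 6040 pairs.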

Considering these N pairs of training data $\{x^{(k)}, y^{(k)}\}_{k = 1}^{N}$, the parameters $w$ and $c$ should be determined to minimize the following training criterion

$E (w, c) = \sum_{k = 1}^{N} (y (x^{(k)}, w, c) - y^{(k)})^{2}$ (17) where $y (x^{(k)}, w, c)$ is the predicted value from the FWSIRM-FIS.

It is not easy to determine $w$ and $c$ simultaneously. However, from (7) and (9), we can observe that the output of the FWSIRM-FIS is linear with respect to $w$ when $c$ is fixed, and linear with respect to $c$ when $w$ is fixed. Thus, we can determine $w$ and $c$ iteratively by the least square method. Algorithm 1 in [41] provides the iterative parameter learning algorithm for the FWSIRM-FIS.
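The two least-squares sub-problems can be written out explicitly. What follows is a sketch of one generic alternating least-squares reading of this idea, not a transcription of Algorithm 1 in [41], which may differ in initialization, regularization and stopping rule. The stacked symbols $Y$, $G (c)$ and $F (w)$ are introduced here only for illustration, and both matrices are assumed to have full column rank (otherwise a pseudo-inverse would be needed).

$Y = [y^{(1)}, \dots, y^{(N)}]^{T}, \quad G (c) = [g (x^{(1)}, c), \dots, g (x^{(N)}, c)]^{T}, \quad F (w) = [f (x^{(1)}, w), \dots, f (x^{(N)}, w)]^{T}$

With $c$ fixed, (7) turns the criterion (17) into $E = \| G (c) w - Y \|^{2}$, which is minimized by

$w = (G (c)^{T} G (c))^{-1} G (c)^{T} Y .$

With $w$ fixed, (9) turns (17) into $E = \| F (w) c - Y \|^{2}$, which is minimized by

$c = (F (w)^{T} F (w))^{-1} F (w)^{T} Y .$

Since each update solves its own sub-problem exactly, alternating the two steps never increases $E (w, c)$.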

4. Experiment Setting and Results

In this section, the data sets used for experiments will be given firstly. Then, comparison indices will be provided. Furthermore, in this section, detailed experiments on traffic flow prediction will be presented. At last, comprehensive analysis and discussions on the experimental results and comparisons with the ANFIS, BPNN and the PFLFIS will be made.

4.1. Applied Data Sets

In this study, the traffic flow data for experiments were downloaded from the PeMS (California Performance Measurement System) traffic flow dataset [24, 28]. In the PeMS project, the traffic flow data are collected every 30 s by the loop detectors and then sent to the computer workstation in the University of California, Berkeley. The traffic flow dataset used in this study was collected by the Detector 1006210 (NB 99 Milgeo Ave) which is located at north bound freeway SR99, Ripon city, California. This freeway has three lanes under surveillance [24, 28].

In this paper, we select the traffic flow data collected on the weekdays from October 1, 2009 to November 30, 2009 for training and testing. We consider three experiments, namely the 5-minute, 10-minute and 15-minute prediction experiments, in which the collected data are aggregated into 5-, 10- and 15-minute intervals respectively.
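The aggregation step can be illustrated with a short sketch. It assumes that aggregation means summing vehicle counts over consecutive intervals, which the text does not state explicitly, and the function name `aggregate` is hypothetical.

```python
from typing import List, Sequence


def aggregate(counts: Sequence[float], factor: int) -> List[float]:
    """Sum every `factor` consecutive counts into one coarser interval.

    Assumption: flows are counts, so coarser intervals are obtained by summing.
    """
    if len(counts) % factor != 0:
        raise ValueError("series length must be a multiple of the factor")
    return [sum(counts[i:i + factor]) for i in range(0, len(counts), factor)]


if __name__ == "__main__":
    five_minute_day = [1.0] * 288          # one day of 5-minute samples
    print(len(aggregate(five_minute_day, 2)))   # samples per day at 10 minutes
    print(len(aggregate(five_minute_day, 3)))   # samples per day at 15 minutes
```

With 288 five-minute samples per day, this gives 144 ten-minute and 96 fifteen-minute samples per day, consistent with the data counts used in the experiments.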

4.2. Comparison Indices

To make a quantitative comparison of the proposed hybrid model and the comparative methods, comparison indices are needed. This subsection will introduce five commonly used indices for our comparison.

The first kind of the comparison indices considers the root mean square error (RMSE), the mean of the absolute errors (MAE), and the average percentage error (APE), which can be computed respectively as

$RMSE = \sqrt{\frac{1}{N} \sum_{k = 1}^{N} | {\hat{y}}^{(k)} - y^{(k)} |^{2}},$ (18)

$MAE = \frac{1}{N} \sum_{k = 1}^{N} | {\hat{y}}^{(k)} - y^{(k)} |,$ (19)

$APE = \frac{1}{N} \sum_{k = 1}^{N} \frac{| {\hat{y}}^{(k)} - y^{(k)} |}{| y^{(k)} |} \times 100 \%,$ (20)

where y^(k) and ${\hat{y}}^{(k)} = \hat{y} (x^{(k)})$ are respectively the actual and predicted values with respect to the input $x^{(k)}$, and N is the number of training or testing data.

We also consider a second kind of prediction performance measure, consisting of two statistical indices: the Pearson correlation coefficient and the coefficient of determination. These two indices are respectively denoted as r and R², and can be calculated as

$r = \frac{N \sum_{k = 1}^{N} {\hat{y}}^{(k)} y^{(k)} - \sum_{k = 1}^{N} {\hat{y}}^{(k)} \sum_{k = 1}^{N} y^{(k)}}{\sqrt{\hat{Err} \times Err}},$ (21)

$R^{2} = \frac{{[\sum_{k = 1}^{N} ({\hat{y}}^{(k)} - {\hat{y}}_{Ave}) (y^{(k)} - y_{Ave})]}^{2}}{\sum_{k = 1}^{N} ({\hat{y}}^{(k)} - {\hat{y}}_{Ave})^{2} \sum_{k = 1}^{N} (y^{(k)} - y_{Ave})^{2}}$ (22)

where N is again the number of training or testing data, ${\hat{y}}_{Ave}$ and y_Ave are respectively the averages of the predicted and actual values, and $\hat{Err}$ and Err are auxiliary quantities. They can be computed as

$\hat{Err} = N \sum_{k = 1}^{N} ({\hat{y}}^{(k)})^{2} - (\sum_{k = 1}^{N} {\hat{y}}^{(k)})^{2},$ (23)

$Err = N \sum_{k = 1}^{N} (y^{(k)})^{2} - (\sum_{k = 1}^{N} y^{(k)})^{2},$ (24)

${\hat{y}}_{Ave} = \frac{\sum_{k = 1}^{N} {\hat{y}}^{(k)}}{N},$ (25)

$y_{Ave} = \frac{\sum_{k = 1}^{N} y^{(k)}}{N} .$ (26)

The index r ranges from -1 to 1, where 1 represents the total positive linear correlation, while -1 means total negative linear correlation. The index R² ranges from 0 to 1. For both indices, larger values of r and R² imply better forecasting performance of the prediction model.
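The five indices can be computed with a few lines of code. The sketch below uses hypothetical function names and follows (18) to (26), with the deviations in the denominator of R² squared so that the index is well defined; APE additionally assumes that no actual value is zero.

```python
import math
from typing import Sequence


def rmse(pred: Sequence[float], actual: Sequence[float]) -> float:
    """Eq. (18): root mean square error."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual))


def mae(pred: Sequence[float], actual: Sequence[float]) -> float:
    """Eq. (19): mean absolute error."""
    return sum(abs(p - a) for p, a in zip(pred, actual)) / len(actual)


def ape(pred: Sequence[float], actual: Sequence[float]) -> float:
    """Eq. (20): average percentage error in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs(p - a) / abs(a) for p, a in zip(pred, actual)) / len(actual)


def pearson_r(pred: Sequence[float], actual: Sequence[float]) -> float:
    """Eqs. (21), (23), (24): Pearson correlation coefficient."""
    n = len(actual)
    sp, sa = sum(pred), sum(actual)
    err_hat = n * sum(p * p for p in pred) - sp ** 2
    err = n * sum(a * a for a in actual) - sa ** 2
    return (n * sum(p * a for p, a in zip(pred, actual)) - sp * sa) / math.sqrt(err_hat * err)


def r_squared(pred: Sequence[float], actual: Sequence[float]) -> float:
    """Eqs. (22), (25), (26): squared covariance over the product of squared deviations."""
    p_ave = sum(pred) / len(pred)
    a_ave = sum(actual) / len(actual)
    cov = sum((p - p_ave) * (a - a_ave) for p, a in zip(pred, actual))
    var_p = sum((p - p_ave) ** 2 for p in pred)
    var_a = sum((a - a_ave) ** 2 for a in actual)
    return cov ** 2 / (var_p * var_a)


if __name__ == "__main__":
    predicted, observed = [1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]
    print(rmse(predicted, observed), mae(predicted, observed), ape(predicted, observed))
    print(pearson_r(predicted, observed), r_squared(predicted, observed))
```

The toy data illustrate why the two groups of indices are reported together: the predictions are perfectly correlated with the observations (r = R² = 1) even though every prediction is off by 50%.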

4.3. Experimental Results

4.3.1. Five-Minute Prediction

In this case, we consider the traffic flow prediction of 5 minutes interval. As mentioned in subsection 4.1, we totally have the traffic flow data of 39 weekdays. And, each day has 288 five-minute data. The data in the first 21 weekdays are chosen for training while the data in the last 18 weekdays are for testing. In other words, 21 * 288 = 6048 data points in the weekday traffic flow time series are for training and 18 * 288 = 5184 data points in the time series are for testing.

The initial training and testing data are shown in Fig. 2 (a). Then, the daily-periodic pattern is extracted by Equation (13). The extracted daily-periodic pattern for this 5-minute experiment is shown in Fig. 2 (b). After removing the daily-periodic pattern, the residual time series is given in Fig. 2 (c).

The residual time series will be used to train the FWSIRM-FIS based data driven model. Before this, we need to determine the input variables for the FWSIRM-FIS, i.e. we should determine which values before sampling time t affect the value at sampling time t. To this end, we adopt the partial autocorrelation function (PACF) method [52] to obtain the partial autocorrelation between s_R (t - k) and s_R (t), where k = 1, 2, ⋯. Lags s_R (t - k) with larger partial autocorrelation coefficients have greater influence on s_R (t). Existing studies usually choose the input variables manually, while the PACF method provides an effective way to determine them automatically, which is more reasonable and convenient than the manual way.

The PACF of the traffic flow time series for the 5-minute experiment with 100 time lags is shown in Fig. 3. With the threshold set to 0.05, we can observe from Fig. 3 that 8 time lags have an obvious influence on the value of s_R (t). As a result, the determined optimal input variables with respect to s_R (t) are x₁ = s_R (t - 1), x₂ = s_R (t - 2), x₃ = s_R (t - 3), x₄ = s_R (t - 4), x₅ = s_R (t - 5), x₆ = s_R (t - 6), x₇ = s_R (t - 7), x₈ = s_R (t - 8). Thus, there are 6040 input-output data pairs for training and 5176 input-output data pairs for testing.
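The lag selection can be sketched as follows. This is an illustration under assumptions: the PACF is computed here with the standard Durbin-Levinson recursion on the sample autocorrelations, which may differ in detail from the estimator used for the figures, and `select_lags` simply keeps every lag whose absolute PACF value exceeds the threshold. All function names are hypothetical.

```python
from typing import List, Sequence


def autocorr(series: Sequence[float], max_lag: int) -> List[float]:
    """Sample autocorrelations rho_0 .. rho_max_lag of the series."""
    n = len(series)
    mean = sum(series) / n
    dev = [s - mean for s in series]
    var = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / var for k in range(max_lag + 1)]


def pacf_from_acf(rho: Sequence[float]) -> List[float]:
    """Durbin-Levinson recursion: PACF at lags 1..K from rho_0..rho_K."""
    out: List[float] = []
    prev: List[float] = []  # AR coefficients of order k - 1
    for k in range(1, len(rho)):
        if k == 1:
            phi_kk = rho[1]
            cur = [phi_kk]
        else:
            num = rho[k] - sum(prev[j] * rho[k - 1 - j] for j in range(k - 1))
            den = 1.0 - sum(prev[j] * rho[j + 1] for j in range(k - 1))
            phi_kk = num / den
            cur = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        out.append(phi_kk)
        prev = cur
    return out


def pacf(series: Sequence[float], max_lag: int) -> List[float]:
    """PACF of a series at lags 1..max_lag."""
    return pacf_from_acf(autocorr(series, max_lag))


def select_lags(pacf_values: Sequence[float], threshold: float = 0.05) -> List[int]:
    """Lags (1-based) whose absolute PACF value exceeds the threshold."""
    return [k for k, v in enumerate(pacf_values, start=1) if abs(v) > threshold]


if __name__ == "__main__":
    # Theoretical ACF of an AR(1) process with coefficient 0.8: rho_k = 0.8 ** k.
    ar1_acf = [0.8 ** k for k in range(6)]
    values = pacf_from_acf(ar1_acf)
    print(values, select_lags(values))
```

For the AR(1) autocorrelations in the demo, only lag 1 survives the threshold, as theory predicts; on the residual traffic series the same procedure would return the lags that become the FWSIRM-FIS inputs.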

After training, the predicted values of the proposed hybrid model for the testing data are shown in Fig. 4. From this figure, we can observe that the proposed hybrid model captures the characteristics of the traffic flow and provides satisfactory performance.

For comparison, the five indices of the proposed hybrid model, the ANFIS, the PFLFIS and the BPNN are listed in Table 1. We also plot the prediction error histograms of the four predictors in Fig. 5. From both the table and the figure, we can see that the hybrid model performs best compared to the ANFIS, the PFLFIS and the BPNN.

Table 1
Comparisons of different methods in 5-minute experiment

Methods	RMSE	MAE	APE(%)	r	R ²
Hybrid	22.26	17.28	10.05	0.98	0.96
ANFIS	29.45	18.30	11.34	0.96	0.92
PFLFIS	24.11	18.01	11.19	0.97	0.95
BPNN	24.25	18.53	12.08	0.98	0.95

4.3.2. Ten-Minute Prediction

In this experiment, the 5-minute traffic flow data are aggregated into 10-minute intervals. For the 39 weekdays, there are 39 * 144 = 5616 data points in total in the traffic flow time series. Again, we use the data of the first 21 days for training and those of the last 18 days for testing, which gives 144 * 21 = 3024 training data points and 144 * 18 = 2592 testing data points. The training and testing data points of the initial time series are shown in Fig. 6 (a).

Fig. 2.

The training and testing data in the 5-minute (5 min) case: (a) the original data, (b) the extracted periodicity, (c) the residual data.

In this case, the periodic pattern extracted by (13) is shown in Fig. 6 (b), and the residual time series data used for training and testing are plotted in Fig. 6 (c). Again, we use the PACF [52] to determine the optimal input variables. The PACF of the traffic flow time series for this experiment with 100 time lags is shown in Fig. 7. From this figure, we can observe that the best input variables for predicting y = s_R (t) are x₁ = s_R (t - 1), x₂ = s_R (t - 2), x₃ = s_R (t - 3), x₄ = s_R (t - 4), x₅ = s_R (t - 5), x₆ = s_R (t - 6), x₇ = s_R (t - 7), x₈ = s_R (t - 8), x₉ = s_R (t - 9), i.e. there are 9 input variables in this case. This leaves 3015 input-output data pairs for training and 2583 input-output data pairs for testing.

For the testing data, the predicted results of the proposed hybrid model are shown in Fig. 8. The comparison results of the four predictors are listed in Table 2, and the prediction error histograms of the four predictors in this experiment are plotted in Fig. 9. From the figures and the table, we can observe that the proposed hybrid model again has the best performance. Detailed comparisons and analysis will be given in Subsection 4.4.

Table 2

Comparisons of different methods in 10-min experiment

Methods	RMSE	MAE	APE(%)	r	R ²
Hybrid	33.18	22.89	6.49	0.99	0.97
ANFIS	58.63	41.58	12.16	0.97	0.94
PFLFIS	54.96	41.58	12.12	0.98	0.95
BPNN	57.30	44.92	13.97	0.98	0.95

4.3.3. Fifteen-Minute Prediction

The models in this experiment are trained and tested on the traffic flow time series of 15-minute data aggregated from the 5-minute data of the 39 weekdays. Thus, in this case, there are 2016 data points in the traffic flow time series for training and 1728 data points for testing. The original training and testing data of the time series are shown in Fig. 10 (a), and the periodic pattern extracted from this time series is depicted in Fig. 10 (b). After removing the periodic pattern, the residual time series, which will be used to train the four prediction models, is shown in Fig. 10 (c).

Fig. 3.

The PACF of the 5-min experiment with 100 time lags.

Again, the PACF of this 15-minute experiment is computed within 100 time lags and shown in Fig. 11. From this figure, we can conclude that, in order to predict y = s_R (t), the optimal input variables are x₁ = s_R (t - 1), x₂ = s_R (t - 2), x₃ = s_R (t - 3), x₄ = s_R (t - 4), x₅ = s_R (t - 5), x₆ = s_R (t - 6). Consequently, in this experiment, there are 2010 input-output data pairs for training and 1722 input-output data pairs for testing.

For the testing data, the comparison results of the four prediction models are listed in Table 3, and the prediction results of the hybrid model are shown in Fig. 12. Again, to better display the prediction errors, the prediction error histograms of the four predictors are shown in Fig. 13. The proposed hybrid model is still the best one in this case.

Table 3

Comparisons of different methods in 15-min experiment

Methods	RMSE	MAE	APE(%)	r	R ²
Hybrid	45.88	31.73	5.90	0.99	0.98
ANFIS	72.64	53.24	10.17	0.98	0.96
PFLFIS	72.28	54.15	10.33	0.98	0.96
BPNN	75.79	57.95	11.68	0.98	0.96

4.4. Comparisons and Discussions

Through considering the comparison results in the three experiments, we can make the following conclusions.

For the RMSE, MAE and APE indices, smaller values correspond to better prediction performance. From Tables 1–3, we can observe that the proposed hybrid model performs best according to all three indices. In the first experiment, the hybrid model improves the RMSE, MAE and APE by at least 7%, 4% and 10% respectively compared with the ANFIS, the PFLFIS and the BPNN. In the second and third experiments, the improvements reach about 40%, 40% and 45% respectively.
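The improvement proportions can be recomputed directly from Tables 1–3. The sketch below is an illustration with hypothetical names; it measures the relative improvement of the hybrid model over the best of the three competing methods for each index, which is the most conservative reading of "at least".

```python
from typing import Dict, List, Tuple

# (hybrid value, [ANFIS, PFLFIS, BPNN]) copied from Tables 1-3.
TABLES: Dict[str, Dict[str, Tuple[float, List[float]]]] = {
    "5-min": {
        "RMSE": (22.26, [29.45, 24.11, 24.25]),
        "MAE": (17.28, [18.30, 18.01, 18.53]),
        "APE": (10.05, [11.34, 11.19, 12.08]),
    },
    "10-min": {
        "RMSE": (33.18, [58.63, 54.96, 57.30]),
        "MAE": (22.89, [41.58, 41.58, 44.92]),
        "APE": (6.49, [12.16, 12.12, 13.97]),
    },
    "15-min": {
        "RMSE": (45.88, [72.64, 72.28, 75.79]),
        "MAE": (31.73, [53.24, 54.15, 57.95]),
        "APE": (5.90, [10.17, 10.33, 11.68]),
    },
}


def min_improvement(hybrid: float, others: List[float]) -> float:
    """Relative improvement (in percent) over the best competing method."""
    best_other = min(others)
    return 100.0 * (best_other - hybrid) / best_other


if __name__ == "__main__":
    for experiment, indices in TABLES.items():
        row = {name: round(min_improvement(h, o), 1) for name, (h, o) in indices.items()}
        print(experiment, row)
```

Comparing against the best competitor gives a lower bound for each index; the improvement over the other two methods is larger.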

For the indices r and R², the larger their values are, the better the prediction performance will be. The results in Tables 1–3 also verified the performance and the advantages of the proposed hybrid model.

The error histograms in Figs. 5, 9 and 13 reflect the error distributions of the four predictors. The less the centers of the error histograms deviate from zero, the better the prediction performance is. From Figs. 5, 9 and 13, we can also see that the proposed hybrid model performs best and gives the smallest prediction errors in the three experiments.

As we have mentioned previously, the FWSIRM-FIS can reduce the number of fuzzy rules. In the three experiments, the numbers of fuzzy rules or neural nodes of the four predictors are listed in Table 4. The results in this table again verify the ability of FWSIRM-FIS on reducing the number of fuzzy rules and alleviating the design difficulty.

Table 4
The number of fuzzy rules or neural nodes

	Hybrid	ANFIS	PFLFIS	BPNN
Experiment 1	24	81	41	80
Experiment 2	27	81	35	80
Experiment 3	18	81	39	80

Fig. 4.

Prediction results of the proposed hybrid model in the 5-min experiment.

Fig. 5.

Prediction error histograms of the four predictors in the 5-min experiment: (a) the proposed hybrid model, (b) ANFIS, (c) PFLFIS, and (d) BPNN.

Fig. 6.

The data in the 10-min case: (a) original data, (b) extracted periodicity, (c) the residual data.

Fig. 7.

The PACF of the 10-min experiment with 100 time lags.

Fig. 8.

Prediction results of the proposed hybrid model in the 10-min experiment.

Fig. 9.

Prediction error histograms of the four predictors in the 10-min experiment: (a) the proposed hybrid model, (b) ANFIS, (c) PFLFIS, and (d) BPNN.

Fig. 10.

The data in the 15-min case: (a) original data, (b) extracted periodicity, (c) the residual data.

Fig. 11.

The PACF of the 15-min experiment with 100 time lags.

Fig. 12.

Prediction results of the proposed hybrid model in the 15-min experiment.

Fig. 13.

Prediction error histograms of the four predictors in the 15-min experiment: (a) the proposed hybrid model, (b) ANFIS, (c) PFLFIS, and (d) BPNN.

5. Conclusions

This paper combined a knowledge model with a residual-data-driven FWSIRM-FIS model for accurate short-term traffic flow prediction. In the proposed hybrid model, the knowledge model is constructed from the periodicity pattern extracted from the traffic flow time series, while the data driven FWSIRM-FIS model is built on the residual data obtained by removing the periodicity pattern from the time series. In addition, the partial autocorrelation method was adopted to determine the optimal inputs for the data driven FWSIRM-FIS model. Detailed experiments, in which the hybrid model was compared with three popular methods, demonstrated its effectiveness and advantages.
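The overall structure summarized above can be sketched as follows. This is only an illustration under stated assumptions: the periodicity pattern is taken to be the average profile over whole periods (e.g. days), and a placeholder `predict_residual` callable stands in for the trained FWSIRM-FIS. Neither is claimed to match the extraction or learning procedure actually used in this study.

```python
def extract_periodicity(series, period):
    """Average the series slot by slot over complete periods.

    This averaging rule is an assumed stand-in for the pattern extraction.
    """
    n_periods = len(series) // period
    if n_periods == 0:
        raise ValueError("series must cover at least one full period")
    return [
        sum(series[k * period + s] for k in range(n_periods)) / n_periods
        for s in range(period)
    ]


def residuals(series, pattern):
    """Remove the periodic pattern from the series."""
    period = len(pattern)
    return [x - pattern[t % period] for t, x in enumerate(series)]


def hybrid_forecast(t, pattern, predict_residual):
    """Final prediction = periodic component + residual-model output at index t.

    predict_residual is a hypothetical callable standing in for the FWSIRM-FIS.
    """
    return pattern[t % len(pattern)] + predict_residual(t)


if __name__ == "__main__":
    # Toy series with two periods of three slots each, for illustration only.
    flow = [10, 20, 30, 12, 22, 32]
    pattern = extract_periodicity(flow, period=3)
    print(pattern)
    print(residuals(flow, pattern))
    print(hybrid_forecast(6, pattern, lambda t: 0.5))
```

Keeping the periodic component and the residual model separate means that either part can be replaced without changing the other.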

On the other hand, many mathematical models have been built for forecasting traffic flow over the past few decades. Such models describe the mechanism of the traffic flow and can supplement the data driven model. In our future study, we will try to combine the mathematical models with the data driven models to further improve the prediction accuracy.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (61473176, 61573225), the Taishan Scholar Project of Shandong Province, and the Colleges and Universities Independent Innovation Program of Jinan City (201303008).

