Abstract
Real-time applications such as traffic monitoring must process enormous amounts of data. Traffic prediction is essential for intelligent transportation systems (ITSs), traffic management authorities, and travelers. It has become a challenging task because of non-linear temporal dynamics that differ between locations, complicated underlying spatial dependencies, and the need for longer forecasting horizons. To address these challenges, efficient visualization and data mining techniques are required to predict and analyze massive volumes of traffic data. This paper presents a deep learning-based parallel convolutional neural network (Parallel-CNN) methodology to predict the traffic conditions of a specific region. Deep learning models comprise multiple processing layers that learn representations of data at multiple levels of abstraction. The data were captured from the department of transportation; the dataset is therefore large, and it can be analyzed to characterize the behavior of traffic conditions. The purpose of this paper is to monitor traffic behavior so that users can make decisions toward building traffic-free cities. Experimental results show that the proposed methodology outperforms existing methods such as KNN, CNN, and FIMT-DD.
Introduction
In urban areas, traffic congestion is a major issue caused by the large number of vehicles, and it makes it especially difficult for heavy vehicles to reach their destinations. Various conditions can disrupt traffic flow, e.g., weather, road works, accidents, vehicles under repair, and overloaded vehicles. Traffic congestion is thus a worldwide problem and a real challenge for the public. Individual travellers, businesses, and government agencies now attach importance to analyzing traffic flow promptly [1]. Traffic information gathered from various sensors is mined for useful and valuable data. Many researchers have applied different approaches to sensing this information in order to control traffic. Intelligent vehicle highway systems (IVHS) have been developed as intelligent systems that include traffic management, vehicle control, and driver information [4]. The IVHS provides the external forces to form an alternative state based on the route and destination. Similarly, roadside telematics (RT), Advanced Driver Assistance Systems (ADAS), and other technologies face heavy-traffic situations in providing comfort to individuals. New approaches and techniques have been used to gather information and predict the flow of traffic within the range of new sensors [12, 13]. MAPE (Mean Absolute Percentage Error) and RMSE (Root Mean Square Error) are useful for quantifying the error of traffic flow estimates [2]. Building on the early problems posed by Poincaré and Hilbert, deep learning allows nonlinear functions to be modeled efficiently [14, 15].
The Kolmogorov-Arnold representation theorem provides the theoretical motivation for deep learning [10]. According to this theorem, any continuous function F(y) of n variables can be written as

$$F(y) = \sum_{j=1}^{2n+1} p_j\left(\sum_{i=1}^{n} q_{ij}(y_i)\right),$$

where p_j and q_ij are continuous functions, and the q_ij form a universal basis that does not depend on F. This result implies that any continuous function can be represented using only the operations of summation and function composition. In a neural network, a function of n variables can be represented with 2n+1 activation units and one hidden layer [11, 16].
Related work
This section first reviews work on traffic prediction and analysis. The main objective of the review is to identify the different techniques and approaches used for traffic flow prediction. The traffic data required for such analysis can be collected from distributed sensors operated by government agencies [3]. In the context of smart cities, the availability of detailed road traffic datasets offers great potential for intelligent transport systems. State-of-the-art machine learning algorithms are used to investigate big traffic datasets, including exploratory analysis. Road networks are spatially correlated, and travel time is influenced by road volumes [23]. Vaa et al. [24] proposed driver support systems to save lives in fatal accidents. However, accurate and reliable estimation of traffic conditions remains challenging, since transport systems have complex, time-varying structures in which present and future states depend on the interactions between traffic streams. The stacked auto-encoder is used to learn generic traffic flow features and is trained layer by layer in a greedy fashion. The authors of [7, 9] proposed a stacked auto-encoder method that is applied to predict traffic flow features. In this method, the spatial and temporal correlations are inherently taken into account.
Billhardt and Lujak described the implementation of a dynamic coordination system for ambulances to support emergency medical assistance (EMA) services [8]. The EMA service is considered an essential strategy for reducing the average time of the ambulance. Traffic information can be processed with regression techniques. Over the past few years, deep learning has attracted many researchers to traffic-related applications. For road information, traffic flow patterns are processed to extract meaningful information using a multi-layered architecture and a deep learning algorithm [18]. To perform the related operations and capture the different patterns within the traffic flow, a model using a stacked auto-encoder has been developed [19]. Weather is an important factor, and weather information should be included when predicting traffic. Among the existing models, deep neural networks have not been used to model these applications [17]. In this paper, urban city-center data are used to predict traffic from road traffic information at time intervals of 15, 30, and 60 minutes [20]. Ari et al. [26] presented a model named Fast Incremental Model Trees with Drift Detection (FIMT-DD) to predict traffic data. In this model, the authors used distributed sensors to collect traffic data from different areas. Increasing the number of sensors generates an enormous amount of traffic data for visualization, since each new sensor point generates its own data, and the number of vehicles at a particular time must then be predicted. Amini et al. [42] outlined a strategy for real-time traffic control based on a big data analytics architecture that employs Kafka for building pipelines and stream processing.
Cheng et al. [43] proposed a machine learning approach for classifying the traffic state of urban roadways. An improved FCM clustering algorithm is used, and its classification accuracy is compared with that of other machine learning algorithms. Aqib et al. [44] proposed an approach for traffic prediction using big data and novel deep learning models. To evaluate model performance, they used 11 years of traffic data collected from the California Department of Transportation. The model combines four cutting-edge technologies, namely big data, deep learning, in-memory computing, and graphics processing units, to make smarter traffic predictions. Fan et al. [45] carried out extensive research on black-spot identification for urban traffic safety using SVM and a black-spot identification algorithm based on deep neural networks.
Working strategy
To summarize our contribution to the traffic prediction problem, we develop a multi-task deep convolutional network that detects the presence of the target and its geometric attributes (location and orientation) with respect to the region of interest (ROI). To overcome the existing problems, we also develop a recurrent neural network (RNN), which uses its internal state memory to infer the presence or absence of a lane over a sequence of image areas.
Figure 1 shows the workflow of the proposed neural network systems.

Fig. 1. The workflow of a deep convolutional neural network.
Data are generated quickly and in massive volumes. Therefore, we do not use traditional regression algorithms, which compute over all the data collected in a time interval [21]. Tsangaratos et al. [22] compared the Naive Bayes regression and k-nearest neighbors models against the prediction models. In the proposed model, the Traffic Performance Index (TPI) is employed for logistic regression analysis, and the hyperbolic tangent function is used as the activation function for every layer. This model is useful in distinguishing real traffic flow conditions, with an accuracy of up to 99%. A variety of models have been tested to improve predictive performance and visualize traffic volume. As shown in Fig. 2, existing methodologies use a series of lags tested over different time intervals.

Fig. 2. Comparison of predicted travel flow (speed) with different methods.
Various factors, such as weather, accidents, and road works, have been considered in analyzing traffic flow. Traffic flow prediction approaches fall into two groups, parametric and non-parametric. In the literature, non-parametric methods have been considered for analyzing traffic flow [27]. Genetic Network Programming (GNP) is used to classify the attributes related to the non-parametric approach [28].
Association rules are applied to the traffic information to categorize it into levels such as low, medium, and high. The rules are applied to different lanes of the road and used for prediction. The process flow model depicted in Fig. 3 shows how samples are drawn for training. Fuzzy rule-based approaches are used for modeling and predicting urban traffic flow [38].
The data are assigned different weights and trained for a short time horizon. Lun Zhang et al. [41] developed a model to overcome the limitation of a short time horizon. For traffic flow prediction over the month of January, a k-nearest neighbors regression approach was used at 5-minute intervals. This framework comprises three parts: historical information, state vectors, and predictions. K-means was used to set the weights of the model, which was then compared with real-world traffic. The accuracy of the backpropagation neural network was found to be lower for the non-parametric approach used in this populated area. This limitation was addressed by Min and Wynter, who performed prediction by exploiting spatial-temporal correlation. The temporal dimension was analyzed through the interference between two traffic links, whereas the spatial dimension was analyzed through the behavior of neighboring links [29]. This model provides higher accuracy in urban areas. It is used to obtain real-time traffic information, which is volatile, with forecasts up to 15 minutes ahead.
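The three-part k-nearest neighbors scheme described above (historical information, state vector, prediction) can be illustrated with a minimal sketch. The function name `knn_forecast`, the unweighted averaging of neighbors, the Euclidean distance, and the sample flow values are all assumptions made for illustration; the cited work does not specify these details here.

```python
import math

def knn_forecast(history, state, k=3):
    """Predict the next value by averaging what followed the k most similar past states."""
    m = len(state)
    candidates = []
    # Each window of length m in the history is a past state vector, and the
    # value right after it is the outcome that was observed for that state.
    for start in range(len(history) - m):
        window = history[start:start + m]
        distance = math.dist(window, state)
        candidates.append((distance, history[start + m]))
    if not candidates:
        raise ValueError("history is too short for the given state length")
    candidates.sort(key=lambda pair: pair[0])
    nearest = candidates[:k]
    return sum(outcome for _, outcome in nearest) / len(nearest)

# Hypothetical 5-minute flow counts (vehicles per interval), not taken from the paper.
flows = [10, 12, 15, 10, 12, 16, 10, 12, 14, 10, 12]
prediction = knn_forecast(flows, state=[10, 12], k=3)
print(prediction)  # 15.0: the mean of 15, 16 and 14, which followed the three exact matches
```

A weighted average of the neighbors, as the text suggests for the weighted data, could be obtained by replacing the final mean with a distance-weighted one.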
Proposed methodology
Deep learning: a preliminary description
Deep learning is increasingly recognized as an essential tool for artificial intelligence research, with applications in many areas. Deep learning models are mostly used in image recognition and speech recognition [5]. However, the research community has also applied deep learning algorithms to problems in a variety of other fields [6]. Deep learning algorithms can be roughly classified into four types: Q-learning, Recurrent Neural Network (RNN), Convolutional Neural Network (CNN), and Deep Neural Network (DNN). These methods have evolved quickly alongside software packages such as Theano, TensorFlow, CNN, Caffe, and Keras. The objective of traffic prediction is to provide traffic flow information in a visualized manner. The DNN is designed to estimate traffic conditions using time and transportation information from a large amount of data. The DNN model aims to distinguish non-congested conditions through multivariate analysis.
Deep learning was initially inspired by the way information is communicated in the biological nervous system. Deep learning models are assembled from multiple layers, which include both visible layers and hidden layers [34]. Visible layers comprise the input and output, whereas hidden layers are designed to extract features through feed-forward operations. Rumelhart et al. [40] introduced gradient-based backpropagation to train neural networks. Deep learning exhibits excellent learning capability as the size of the dataset increases, and its applications have therefore grown rapidly. Deep learning models such as the multilayer perceptron (MLP), convolutional neural network (CNN), recurrent neural network (RNN), and generative adversarial network (GAN) have been widely applied in computer vision, natural language processing, and audio generation [35].
Parallel convolutional neural networks
A parallel convolutional neural network, also known as P-CNN, has been widely applied in many areas and has shown outstanding capability in processing grid-like data. Examples include 1-D time-series samples with regular time intervals and images, which can be thought of as 2-D grids of pixels. In the mid-20th century, neurophysiologists tried to understand how the brain responds to images and discovered that some neurons in the brains of cats and monkeys are highly sensitive to edges in small regions of the visual field. Thus, as an artificial intelligence model, P-CNN is fundamentally supported by neuroscience [37]. A recent study by Ma et al. [30] treats traffic data as 2-D images, with one dimension for time and one for location, and demonstrates better speed-prediction accuracy with P-CNN than with CNN.
Convolution operation and pooling
The convolution operation scans a sequence x with a weight function w. The continuous convolution is defined as

$$s(t) = (x * w)(t) = \int x(a)\, w(t - a)\, da.$$

In machine learning applications, the input data are discrete and multidimensional. Using 2-D images as an example, we can define a 2-D discrete convolution as

$$t(j, k) = (I * K)(j, k) = \sum_{p} \sum_{q} I(p, q)\, K(j - p, k - q),$$

where I(p, q) refers to the input image and K(p, q) is called the kernel or feature map. As shown in Fig. 4, convolution operations use feature maps to scan images, measure their similarity, and output a heat map t(j, k), which highlights the regions of interest. For example, if the feature map is extracted from face images without supervision, then the heat map t(j, k) indicates the locations of human faces in the input, if any are present.
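To make the role of I(p, q), K(p, q), and t(j, k) concrete, the following is a minimal pure-Python sketch of a 2-D discrete convolution. It assumes the standard "full" convolution, t(j, k) = sum over p, q of I(p, q) K(j - p, k - q); the function name `conv2d_full` and the sample values are illustrative and not part of the proposed model.

```python
def conv2d_full(image, kernel):
    """Full 2-D discrete convolution: t(j, k) = sum_p sum_q I(p, q) * K(j - p, k - q)."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = [[0] * (iw + kw - 1) for _ in range(ih + kh - 1)]
    for p in range(ih):
        for q in range(iw):
            for a in range(kh):
                for b in range(kw):
                    # Kernel index (a, b) plays the role of (j - p, k - q),
                    # so this term contributes to output cell j = p + a, k = q + b.
                    out[p + a][q + b] += image[p][q] * kernel[a][b]
    return out

image = [[1, 2],
         [3, 4]]
kernel = [[1, 0],
          [0, -1]]
heat_map = conv2d_full(image, kernel)
for row in heat_map:
    print(row)
# [1, 2, 0]
# [3, 3, -2]
# [0, -3, -4]
```

Deep learning libraries usually compute a cross-correlation (no kernel flip) and crop the output to the valid region; the sketch follows the textbook definition instead.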

Fig. 3. Process flow model for traffic prediction.

Fig. 4. CNN-based framework for the transportation network.
Pooling can be seen as a summarization of responses over a neighborhood. Pooling discards unneeded information by reducing the output size, which helps make the network invariant to small changes in the input. Max pooling, one of the most successful pooling operations, reports the maximum value within a rectangular neighborhood.
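As a small illustration of max pooling, the sketch below applies non-overlapping 2 x 2 pooling to a grid of responses. The window size, the stride equal to the window size, and the sample values are assumptions chosen for the example.

```python
def max_pool2d(grid, size=2):
    """Non-overlapping max pooling: keep the largest value in each size x size block."""
    rows, cols = len(grid), len(grid[0])
    pooled = []
    for r in range(0, rows - size + 1, size):
        pooled_row = []
        for c in range(0, cols - size + 1, size):
            block = [grid[r + dr][c + dc] for dr in range(size) for dc in range(size)]
            pooled_row.append(max(block))
        pooled.append(pooled_row)
    return pooled

responses = [[1, 3, 2, 0],
             [4, 2, 1, 1],
             [0, 1, 5, 6],
             [2, 2, 7, 3]]
pooled = max_pool2d(responses)
print(pooled)  # [[4, 2], [2, 7]]
```

Changing `responses[0][0]` from 1 to 2 leaves the pooled output unchanged, which is the invariance to small input changes mentioned above.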
To explore the traffic, three different highway stations were selected for analysis. Each junction point recorded information at intervals of 15 to 20 minutes. These datasets are used to compare the data of each station under different types of aggregation. Further datasets aggregate the information by period, such as hour, day, month, and year. We assume that primary road traffic is affected by road conditions such as repairs, accidents, traffic jams, and others [31]. Since our study explores the dynamic nature of roads across a wide geographical area, different traffic conditions must be considered based on time and location. Not all roads experience the same traffic conditions at the same time, although some may face similar situations. The traffic variables related to vacations and seasons are discussed in the context of seasonality, subject to the limited dimensions of the data. The proposed methodological structure and its components are depicted in Fig. 5.

Fig. 5. Proposed system architecture.
The goal is to forecast the traffic flow speeds at time t + h given the measurements up to time t. For this, the traffic function is defined as:
To model the traffic flow data, we apply predictors x given by:
Here, n represents the number of locations in the network (loop detectors), and xi,t represents the cross-section traffic flow speed at location i and time t. The term vec denotes the vectorization transformation, in which a matrix is converted into a column vector. The chosen length is consistent with several existing transport corridor management deployments. The layers are constructed as follows, with a time series “filter” given by:
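Setting the filter aside, the predictor construction described above (recent speeds at n locations, flattened by vec into one column vector) can be sketched as follows. The function name `build_predictor`, the number of lags, and the sample speeds are hypothetical; the lag length used by the authors is not recoverable from the text.

```python
def build_predictor(speeds, t, lags):
    """Stack the last `lags + 1` readings of every location into one flat vector."""
    if t - lags < 0:
        raise ValueError("not enough history before time t")
    # speeds[i][s] is the speed at location i and time step s, so each row of
    # `matrix` holds the readings of one location from time t - lags to time t.
    matrix = [row[t - lags:t + 1] for row in speeds]
    n_rows, n_cols = len(matrix), len(matrix[0])
    # vec() stacks the columns of the matrix on top of each other.
    return [matrix[i][j] for j in range(n_cols) for i in range(n_rows)]

# Two hypothetical loop detectors with four time steps each.
speeds = [[60, 58, 55, 50],
          [40, 42, 45, 47]]
x = build_predictor(speeds, t=3, lags=2)
print(x)  # [58, 42, 55, 45, 50, 47]
```

Each consecutive pair in the output holds the readings of both locations at one time step, so the vector keeps the spatial and temporal neighbors close together.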

Fig. 6. Time vs. volume on the direction.

Fig. 7. Speed of traffic stations (minute) and correlation of two different locations.
With the help of the CNN, and by assigning values to the factors, the information is converted into numeric form. This model predicts one-day traffic. Daily traffic is monitored, and the actual versus expected values are shown below. Fig. 8(a) and Fig. 8(b) show a correlation of 93% between the actual and expected traffic information.
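A correlation between actual and expected traffic can be computed as in the sketch below. It assumes the Pearson correlation coefficient, which the text does not name explicitly, and uses made-up volumes rather than the paper's data.

```python
import math

def pearson(actual, predicted):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)

# Hypothetical hourly volumes, for illustration only.
actual = [120, 150, 180, 160, 130]
predicted = [118, 155, 170, 165, 128]
r = pearson(actual, predicted)
print(f"correlation: {r:.1%}")
```

A value close to 100% indicates that the predicted series rises and falls with the actual one, although it does not by itself measure the size of the errors.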

Fig. 8. One-year traffic prediction: (a) actual, (b) predicted.
Results and discussion
In this section, we demonstrate the performance of the proposed methodology, which is used to predict traffic volume on the Highways India dataset, and compare it with other models. The forecasting results reveal several interesting findings.
Data description
In this approach, we consider traffic data for prediction and visualization. We collected the traffic data from the highway and transport departments in India. The collected data are aggregated for each detector station at 15-minute intervals. For validation, we initially use half a year of data from 2012. We consider the traffic data from 01/01/2012 to 30/04/2012 as the training set, from 01/04/2012 to 30/06/2012 as the validation set, and from 01/05/2012 to 30/06/2012 as the testing set. During training, before the model is applied to the test set, the validation set is first used as an indicator to prevent overfitting, and training is then backed up [36]. As shown in Fig. 9, to validate the methodology, we follow a strategy that covers data collection, storage, data manipulation, data quality, and performance factors.
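The 15-minute aggregation per detector station can be sketched as below. The record layout (station, timestamp, vehicle count), the use of a sum as the aggregate, and the station names are assumptions for illustration; the actual schema of the dataset is not given in the text.

```python
from collections import defaultdict
from datetime import datetime

def aggregate_15min(records):
    """Sum vehicle counts per (station, 15-minute bucket)."""
    totals = defaultdict(int)
    for station, stamp, count in records:
        # Floor the timestamp to the start of its 15-minute interval.
        bucket = stamp.replace(minute=stamp.minute - stamp.minute % 15,
                               second=0, microsecond=0)
        totals[(station, bucket)] += count
    return dict(totals)

# Hypothetical raw observations: (station id, timestamp, vehicle count).
records = [
    ("S1", datetime(2012, 1, 1, 8, 2), 5),
    ("S1", datetime(2012, 1, 1, 8, 14), 7),
    ("S1", datetime(2012, 1, 1, 8, 15), 4),
    ("S2", datetime(2012, 1, 1, 8, 29, 30), 9),
]
totals = aggregate_15min(records)
for key, value in sorted(totals.items()):
    print(key[0], key[1].strftime("%H:%M"), value)
# S1 08:00 12
# S1 08:15 4
# S2 08:15 9
```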

Fig. 9. The extraction process of traffic observations.
The features in the dataset are starting point, endpoint, timestamp, visibility, pressure level, speed, and regions. The generated datasets are used randomly for training, testing, and validation. Table 1 illustrates the sample traffic data which contains features, classes, and instances.
Table 1. Description of dataset
To implement traffic prediction and visualization in a distributed environment, we use Python with machine learning libraries on an Intel(R) Core(TM) i3-2350M CPU @ 2.30 GHz, 2300 MHz, with 4 cores and 4 logical processors, and 24 GB of memory. To predict and visualize the instances, we use a P-CNN classifier with the traffic dataset. We model the prediction of traffic events as the problem of estimating the probable traffic flow from climate and other features. Since the application involves a complex dataset, the classification must be carried out accurately. Statistical measures such as MSE and MAPE are used to evaluate the proposed model. The proposed model outperforms the other classifiers and shows better results.
Performance index and evaluation metrics
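Three performance metrics, namely mean absolute error (MAE), mean relative error (MRE), and root-mean-square error (RMSE), are used to evaluate the effectiveness of the proposed model. With q_k the ground truth, p_k the prediction, and n the number of observations, they are calculated as

$$\mathrm{MAE} = \frac{1}{n}\sum_{k=1}^{n} |q_k - p_k|, \qquad \mathrm{MRE} = \frac{1}{n}\sum_{k=1}^{n} \frac{|q_k - p_k|}{q_k}, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{k=1}^{n} (q_k - p_k)^2}.$$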
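The three metrics can be computed as in the following sketch, which assumes their standard definitions (mean of absolute errors, mean of absolute errors relative to the ground truth, and square root of the mean squared error). The sample values are illustrative and are not results from the paper.

```python
import math

def mae(truth, pred):
    """Mean absolute error."""
    return sum(abs(q - p) for q, p in zip(truth, pred)) / len(truth)

def mre(truth, pred):
    """Mean relative error; assumes every ground-truth value is positive."""
    return sum(abs(q - p) / q for q, p in zip(truth, pred)) / len(truth)

def rmse(truth, pred):
    """Root-mean-square error."""
    return math.sqrt(sum((q - p) ** 2 for q, p in zip(truth, pred)) / len(truth))

# Hypothetical flows (vehicles per interval): ground truth and predictions.
truth = [100, 200, 400]
pred = [110, 180, 400]
print(mae(truth, pred))             # 10.0
print(round(mre(truth, pred), 4))   # 0.0667
print(round(rmse(truth, pred), 4))  # 12.9099
```

RMSE exceeds MAE here because squaring gives the larger error of 20 more weight than the error of 10.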
In the early stage of training, we use MSE (mean square error) to speed up parameter optimization, as MSE is more sensitive to errors with large absolute values, which helps to fit high-valued data, such as peak-hour traffic, more quickly.
MAPE (mean absolute percentage error) is applied as the final evaluation metric and is also used as a loss function to update the parameters of the deep neural network. Mathematically, MAPE is defined as

$$\mathrm{MAPE} = \frac{100\%}{n}\sum_{k=1}^{n} \left|\frac{q_k - p_k}{q_k}\right|,$$

where q_k is the ground truth, p_k is the prediction, and n is the number of observations.
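The different behavior of MSE and MAPE can be seen in a short sketch. It assumes the standard definitions of both metrics and uses made-up off-peak and peak values: the same 10% miss yields a far larger MSE at peak volume, while MAPE treats both cases alike.

```python
def mse(truth, pred):
    """Mean square error: dominated by errors with large absolute value."""
    return sum((q - p) ** 2 for q, p in zip(truth, pred)) / len(truth)

def mape(truth, pred):
    """Mean absolute percentage error, in percent; assumes non-zero ground truth."""
    return 100.0 * sum(abs(q - p) / abs(q) for q, p in zip(truth, pred)) / len(truth)

# Hypothetical single observations with the same 10% relative error.
off_peak_truth, off_peak_pred = [100], [90]
peak_truth, peak_pred = [1000], [900]

print(mse(off_peak_truth, off_peak_pred), mse(peak_truth, peak_pred))
print(mape(off_peak_truth, off_peak_pred), mape(peak_truth, peak_pred))
```

This is consistent with using MSE early in training to fit peak-hour volumes quickly and MAPE afterwards as a scale-independent measure.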
On the Highways India dataset, we compare the performance of the baselines with the CNNs. Figure 10 compares the proposed model with K-Nearest Neighbor (KNN), Fast Incremental Model Trees with Drift Detection (FIMT-DD), and Convolutional Neural Network (CNN), and shows that P-CNN achieves the best performance, with an overall error rate of 7.81% on the testing set, which is lower than that of all other models. Both the global attention mechanism and the P-CNN embedding can effectively encode temporal dependencies in traffic prediction and achieve a lower MAPE than the raw CNN. Moreover, P-CNN is more efficient than the attention mechanism, especially for longer-term prediction, which shows that P-CNN is more accurate than the other classifiers over long-term observations.

Fig. 10. MAPE (%) on the Highways India testing set.
Figure 11 compares the performance of P-CNN using one-day observations with that of KNN, FIMT-DD, and CNN. One-day observations are used as the inputs of P-CNN, and a second P-CNN model is implemented for one-week observations. The experiments show that our model has a lower MAPE than the others. Intuitively, with one-day observations the model is more sensitive to fluctuations in the same day's traffic data, whereas P-CNN is more robust with a one-week view. The one-day input contains 113 data points, and the one-week input contains 792 data points.

Fig. 11. Comparison between different lengths of input on the Highways India testing set.
The percentage error analysis is depicted in Fig. 12. The KNN, FIMT-DD, and CNN models give comparatively less accurate values as the traffic flow information increases. The proposed model provides better accuracy than the other methods. The maximum average prediction error is about 4.8% for P-CNN, over 16% for FIMT-DD, over 15% for CNN, and over 20% for KNN.

Fig. 12. Measurement of traffic data error.
Table 2 compares the MAE, MRE, and RMSE of P-CNN with those of CNN, FIMT-DD, and KNN. It also displays the cumulative distribution function (CDF) of the MRE for every methodology, which gives statistical results on freeways for 15-min, 30-min, 45-min, and 60-min traffic flow of over 1550 vehicles [33]. The comparison shows that the P-CNN model has a lower error rate than the other existing models. The proposed methodology therefore shows the best performance and is more promising than the other methodologies. As shown in Figs. 13–15, the average prediction error of the traffic flow data is calculated using MAE, MRE, and RMSE.
Table 2. Performance comparison of the MAE, the MRE, and the RMSE for different models

Fig. 13. Performance comparison using MAE on the highways dataset.

Fig. 14. Performance comparison using MRE on the highways dataset.

Fig. 15. Performance comparison using RMSE on the highways dataset.
After calculating the average error rate using the different performance metrics, we calculate the accuracy of traffic data prediction on daily, weekly, monthly, and yearly scales for the different models (shown in Fig. 16).

Fig. 16. Accuracy of the proposed algorithm.
Figure 13 shows the performance comparison of the proposed method using the MAE metric on the highways dataset. The methods are sorted according to their MAE. In this comparison, the proposed method outperforms the other methods, achieving the lowest mean absolute error.
Figure 14 shows the performance of the different methods evaluated using MRE on the highways dataset. In this comparison, the mean relative error of every other classifier is higher than that of the proposed method; P-CNN thus achieves better performance, with a lower MRE.
Figure 15 shows the performance of the proposed method using RMSE on the highways dataset, compared with CNN, FIMT-DD, and KNN. On this dataset, the proposed method outperforms the other methods, achieving the lowest RMSE. The comparison shows that the P-CNN model has a lower error rate than the other existing models.
After calculating the average error rate using the different performance metrics, we calculate the accuracy of the proposed method on the highways data at daily, weekly, monthly, and yearly scales for the different models, as shown in Fig. 16. According to this comparison, the P-CNN model achieves better accuracy than the other existing models. The proposed methodology therefore shows the best performance and is more promising than the other methodologies.
Conclusion
In recent years, the volume of traffic data has grown rapidly, and analyzing this enormous amount of data requires new tools and techniques. The Convolutional Neural Network (CNN) is one of the most promising models in machine learning and is used in image recognition, voice recognition, and computer vision. In this paper, a deep learning model named parallel convolutional neural network (P-CNN) has been implemented to model the temporal and spatial features of traffic volume for prediction and visualization. We have demonstrated the traffic condition in various situations. We first analyze the temporal pattern and spatial correlation on the Highways India dataset.
For the traffic volume on road ‘A’, peak hours on weekdays start, on average, at 7.00–8.00 am and finish at 7.00–8.00 pm. Unusual traffic volume can be detected, but our datasets do not contain correlated features.
The data were tested repeatedly to obtain the error. In each iteration, we measure the performance linearly. The sequential prediction results show that the error decreases consistently as the amount of training data increases. We retain the traffic information and develop visualizations based on this knowledge. By animating the evolving traffic conditions for a particular road and time, we can analyze the behavior of traffic that uses toll roads.
Acknowledgments
This work is supported by Indian Institute of Technology (ISM), Dhanbad, Govt. of India. The authors wish to express their gratitude and heartiest thanks to the Department of Computer Science & Engineering, Indian Institute of Technology (ISM), Dhanbad, India, for providing their continuous research support.
