Abstract
Automated traffic signal performance measures (ATSPM) have become widely adopted and utilized by state and local agencies in the U.S. for collecting real-time traffic data 24 h a day, 7 days a week. These agencies have developed new performance measures and applications to address their local transportation planning needs. However, recent research has identified data quality issues in the collected data from ATSPM systems. Specifically, the traffic volumes collected through ATSPM exhibit data anomalies that do not accurately reflect the actual traffic patterns at intersections. As such, there is a need to address the data quality issues found in ATSPM datasets. The purpose of this paper is to evaluate the use of machine learning algorithms and statistical methods to predict traffic volume at intersections. Existing traffic volume data, along with additional metrics such as timestamps, weather conditions, crash data, and holidays, are evaluated to predict traffic volume and address the data anomalies present in ATSPM datasets. Two statistical methods and four machine learning algorithms are evaluated to determine their ability to predict traffic volumes. By comparing the root mean square error (RMSE) and the mean absolute percentage error (MAPE) between each model, the results demonstrate that the long short-term memory (LSTM) model exhibits the lowest error in predicting traffic volume compared with the other models. The LSTM model achieves an RMSE as low as 9.4 vehicles and an MAPE as low as 35%. By leveraging the LSTM model, traffic agencies can enhance the quality of their ATSPM data, enabling better decision-making for traffic operations by their engineers and planners.
Traffic volume (flow) at intersections is one of the most fundamental elements in traffic studies. Knowledge of traffic volumes helps traffic engineers and planners design intersections and roads, determine signal timings, understand driving behavior, assess traffic efficiency, evaluate system performance, and improve traffic safety. Over the past century, traffic engineers have manually collected traffic volumes using five-button handheld counters or counters mounted on an intersection panel. However, with advancements in technology, electronic counters, smartphones, traffic detectors, and other intelligent devices are now used to collect traffic volume data.
In 2012, the Utah Department of Transportation (UDOT) in the U.S. implemented automated traffic signal performance measures (ATSPM) technology to replace some of its manually collected traffic data. ATSPM is a series of visual aids that display the high-resolution data from traffic controllers. ATSPM is also a valuable asset management tool that aids technicians and managers in the control of both traffic signal hardware and traffic signal timing and coordination, and it enables passive collection and analysis of traffic counts 24 h a day, 7 days a week ( 1 ). These real-time data assist traffic engineers and planners in making better assessments and faster decisions to improve safety and efficiency. A 2020 Federal Highway Administration release estimated that the UDOT ATSPM system has saved taxpayers over USD 107 million over 10 years ( 2 ). However, previous research has identified several limitations of ATSPM. First, the detection systems installed at intersections malfunction from time to time, leading to partially missing data. Second, traffic anomalies may be recorded in the ATSPM datasets. With these limitations, it is difficult for ATSPM users to maintain high-quality analysis results from the current datasets ( 3 – 6 ).
To address the data quality issues and provide accurate traffic forecast results for traffic engineers and planners, previous traffic researchers have employed machine learning and deep learning algorithms to predict missing traffic data. For example, Cui et al. used the Graph Markov process to develop a new neural network architecture for spatiotemporal prediction of missing data ( 7 ). However, most of the research conducted thus far has focused on two-way (highway/freeway) traffic volume or on short-term traffic forecasting with small datasets. These approaches cannot be directly applied to the four-way intersection traffic volume in ATSPM datasets and therefore do not overcome these limitations ( 8 – 11 ).
This paper compares two statistical methods and four machine learning algorithms to predict missing and real-time directional traffic volume at fully signalized intersections: a long short-term memory (LSTM) model, a multilayer perceptron (MLP) model, a decision tree (DT) model, a support vector regression (SVR) model, the mean of the previous 1 week (traffic volume) model, and the mean of the previous 52 weeks (traffic volume) model. The purpose of this paper is to evaluate the use of each model to predict traffic volume at intersections. These models are trained on not only traffic volume but also on additional metrics such as weather conditions, traffic incidents, and timestamps. The predictive power of these models provides a means for UDOT to significantly improve the quality of ATSPM traffic volume data by predicting missing portions of the ATSPM data. By filling in the missing traffic volume data, this concept can also be applied to other performance measures, which will benefit all ATSPM users that are utilizing ATSPM data for their analysis needs. This approach will enhance the robustness of ATSPM datasets, enabling engineers and planners to optimize mobility, manage traffic signal timing and maintenance, reduce congestion, save fuel costs, and improve safety at signalized intersections.
The paper is structured as follows. First, the Literature Review section examines previous research on predicting traffic volumes. Next, the Methodology section describes the data and models used in this study. The Results section then compares the test accuracy of each model followed by the Discussion section that summarizes the implications and limitations of this study as well as future research opportunities. Finally, the Conclusion section summarizes the findings and contributions of the research.
Literature Review
While considerable research has been conducted to predict short-term traffic data using original data from highways, urban arterial roads, and signalized intersections, there are unique characteristics of ATSPM data that have not been extensively explored ( 7 – 11 ). Aggregated ATSPM data comprises a large historical and real-time dataset capturing traffic volume at 15 min intervals over several years. Moreover, aggregated ATSPM data records traffic volumes at signalized intersections with separate directional movements. The robustness of ATSPM datasets enables more detailed and accurate predictions of traffic volumes compared with traditional methods, but its usefulness is affected by substantial sequences of missing data. Some of the factors that may cause missing data are the position of traffic sensors, their installation location, and system malfunctions. Chang et al. discovered that sensor position, traffic volume level, and number of approach lanes had a statistically significant effect on the accuracy of traffic volume counts by sensors ( 12 ). Owais summarized that problems with traffic sensor location could lead to missing data collection ( 13 ). The aim of this research is to study the effectiveness of machine learning models to predict missing and future directional traffic volumes in ATSPM data at signalized intersections, thus elevating its value for UDOT and other traffic agencies and researchers. To the knowledge of the research team, this is the first application of short-term traffic flow prediction to the ATSPM dataset.
Over the past 4 decades, simple statistical and traditional machine learning forecasting methods have been extensively developed for traffic volume prediction ( 14 ). Common techniques, some of which are applied in this paper, include auto-regressive integrated moving average (ARIMA), MLP, SVR, DT, random forest (RF), and K-nearest neighbors (KNN) ( 15 – 29 ). More recently, there has been a growing interest in predicting traffic volume using deep learning algorithms, particularly in the field of time-series analysis. Time-series analysis involves developing models to describe the observed time series and explain the underlying patterns in the dataset. These models make assumptions and interpretations about the data to predict future observations based on current and historical data ( 30 ). Table 1 summarizes the machine learning methods for predicting traffic data found in the literature.
Machine Learning Methods Used to Predict Traffic Data
Note: ARIMA = auto-regressive integrated moving average; BPNN = back propagation in neural network; DLN = deep learning networks; DT = decision tree; GSA = gravitational search algorithm; KNN = K-nearest neighbors; LSTM = long short-term memory; MAE = mean absolute error; MAPE = mean absolute percentage error; MLP = multilayer perceptron; MSE = mean squared error; RF = random forest; SARIMA = seasonal ARIMA; SVR = support vector regression.
The prediction model that performed best during this research was the LSTM model, a deep learning algorithm that introduces new approaches to modeling the relationships between variables in a deep and multi-level hierarchy ( 31 ). Applications of the LSTM model to traffic flow prediction appear infrequently in the literature, usually in contexts quite different from the ATSPM dataset. Yuan et al. and Zhang et al. utilized LSTM and RNN to predict real-time crash risks and collisions between pedestrians and vehicles at signalized intersections, and their results showed that the LSTM and RNN models outperformed the conditional logistic model ( 32 – 34 ). Likewise, the findings of Fu et al. indicate that the LSTM model performs better than prior methods ( 35 ). These results motivated the research team’s selection of the LSTM as a potential model for the research. Other models selected from the literature and used in the research include MLP, DT, and SVR ( 36 – 38 ). These models will be discussed in more detail in the Models section.
Methodology
The overall methodology of this research involves sampling and analyzing 2 years of traffic volume data from a subset of fully signalized intersections. This subset consists of six intersections located in the cities of Provo and Orem, Utah, U.S. The selection of prediction variables, including date, time, crashes, weather conditions, and holidays, is based on previous research studies ( 3 – 6 ). The study problem is formulated as single-step forecasting. Two simple baseline models and three comparison machine learning algorithms (MLP, SVR, and DT) are selected for comparison with LSTM. The LSTM model is introduced, and its parameters are adjusted to obtain the most accurate results. The general workflow proposed for this research development is illustrated in Figure 1. The following subsections provide detailed information on the study intersections and their data, the chosen predictor variables, the comparison models, and the steps involved in developing the LSTM model.

Methodology workflow.
Study Data
In this research, a total of six fully instrumented intersections located in the cities of Provo and Orem, Utah, were selected as the study area. The phase plans are the same for all six intersections—the major road at the intersection always uses phases 2 and 6 for through movement and they all have an individual left-turn lane or dual left-turn lanes that apply to all eight phases at the respective intersections. The choice of these intersections was based on the completeness and accuracy of ATSPM volume data, as determined by previous research ( 3 , 4 ). Figure 2 illustrates the location of these intersections.

Study area (Provo and Orem, Utah).
In the ATSPM dataset, the aggregated traffic volume is recorded at 15 min intervals for each direction and movement type, for example, EBL (eastbound left), NBT (northbound through), and WBR (westbound right). The movement types are categorized into eight phases to improve the accuracy and efficiency of directional traffic volume prediction. The left-turn movements in each direction are assigned to phases 1, 3, 5, and 7, while through and right-turn movements are combined and assigned to phases 2, 4, 6, and 8, as shown in Figure 3.

An example study intersection at University Avenue and University Parkway.
The data collected for this study consist of time series observations, that is, observations recorded at regular time intervals (such as minutes, hours, days, months, quarters, or years). In this case, the research team collected ATSPM data from fully instrumented intersections with 24 h volume data in 15 min intervals for every day between January 1, 2019, and December 31, 2020.
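As a quick check on the size of the dataset described above, the following sketch counts the 15 min bins in the study period for a single phase at a single intersection. It assumes a complete record with no missing bins, and the variable names are illustrative only.

```python
from datetime import date

BINS_PER_DAY = 24 * 60 // 15  # 96 bins of 15 min per day

# Inclusive day count for January 1, 2019 through December 31, 2020
# (2020 is a leap year).
days = (date(2020, 12, 31) - date(2019, 1, 1)).days + 1

# Bins per phase per intersection, assuming no gaps in the record.
bins_per_phase = days * BINS_PER_DAY

print(days, bins_per_phase)  # 731 70176
```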
Predictor Variables
The selection of appropriate predictor variables is crucial for building an effective prediction model. While existing literature emphasizes the importance of using both spatial and temporal data, incorporating additional types of data beyond traditional traffic data can provide new insights and improve prediction accuracy. Previous studies have explored various data sources, including event data, crash data, weather data, adjacent road link data, and even social media data such as tweets, to enhance traffic prediction ( 10 , 39 – 44 ). Therefore, it is essential to incorporate as many relevant data types as possible to achieve more accurate predictions and gain deeper insights into the traffic prediction problem. In this study, three additional types of data were used: weather/road condition data, crash data, and holiday data.
Weather/road condition data were collected from UDOT Road Weather Information System (RWIS) ( 45 ). The data were used to identify conditions such as rain, snow, storm, and other extreme weather events that could potentially affect traffic volume. The RWIS dataset includes six road and weather conditions: dry, damp, wet, slushy, snow, and ice. The road condition data used in the analysis were collected from the nearest weather stations in the cities of Orem and Provo, Utah.
Crash data were sourced from AASHTOWare Safety (powered by Numetric) datasets ( 46 ). The number of crashes and the severity of each crash were considered as inputs for the prediction model. All of the crashes that occurred in Provo and Orem between January 1, 2019, and December 31, 2020, were included in the analysis. The impact of each crash on an intersection was weighted inversely by the distance between the crash location and the intersection location.
Furthermore, the research team also included 10 national holidays and one Utah holiday in the prediction analysis, as travel behaviors tend to differ during holidays. These holiday data points were considered as additional predictors in the models.
Some predictor variables were encoded directly, while others were fed into the neural network using a one-hot encoding scheme. A one-hot encoding is created by constructing an input vector with a value of 0 for each non-represented category and a value of 1 in the position corresponding to the represented category. The traffic volume data were kept as integers; timestamps were one-hot encoded by minute, hour, day of the week, and month; weather data were one-hot encoded based on the six categories of road conditions; crash data were encoded as crash scores, which were calculated as the severity of the crash divided by the distance between the crash site and the signal; and holiday data were one-hot encoded based on whether the day of interest is a holiday or not. The predictor variables and encoding methods are listed in Table 2.
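The encoding scheme described above can be sketched in a few lines of Python. This is a minimal illustration rather than the research team's implementation: the function names, the ordering of the road-condition categories, the layout of the feature vector, the two-element holiday encoding, and the units of severity and distance are all assumptions.

```python
from datetime import datetime

# Six RWIS road-condition categories named in the text; the order is assumed.
ROAD_CONDITIONS = ["dry", "damp", "wet", "slushy", "snow", "ice"]


def one_hot(index, size):
    """Return a list of zeros with a single 1 at position `index`."""
    vec = [0] * size
    vec[index] = 1
    return vec


def crash_score(severity, distance):
    """Crash severity divided by the crash-to-signal distance (units assumed)."""
    return severity / distance


def encode_record(volume, ts, road_condition, is_holiday, score=0.0):
    """Concatenate the encoded predictors for one 15 min bin (hypothetical layout)."""
    features = [volume]                          # volume kept as an integer
    features += one_hot(ts.minute // 15, 4)      # minute: 0, 15, 30, or 45
    features += one_hot(ts.hour, 24)             # hour of the day
    features += one_hot(ts.weekday(), 7)         # day of the week
    features += one_hot(ts.month - 1, 12)        # month
    features += one_hot(ROAD_CONDITIONS.index(road_condition), 6)
    features += [score]                          # crash score
    features += one_hot(1 if is_holiday else 0, 2)  # not holiday / holiday
    return features


record = encode_record(
    volume=61,
    ts=datetime(2019, 7, 4, 17, 30),
    road_condition="dry",
    is_holiday=True,
    score=crash_score(severity=3, distance=0.5),
)
print(len(record))  # 57
```

Under these assumptions, each 15 min bin becomes a vector of 57 values; the actual feature width used in the study is given by Table 2 and may differ.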
Predictor Variables/Features and Encoding Methods Used in Models
Problem Formulation
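The single-step forecasting problem is used in this research. Single-step forecasting is a prediction method in which the next sequential value is predicted based on the k prior values. Optionally, the method’s input can also be augmented with supplementary information, in which case the supplementary information from the past k time steps is included in the model’s input. Algorithmically, the process takes the form outlined in Equation 1.
$$\hat{y}_{t+1} = f\left(x_{t-k+1}, \ldots, x_{t}\right) \qquad (1)$$
where $\hat{y}_{t+1}$ is the predicted traffic volume at the next time step, $f$ is the prediction model, $k$ is the number of prior time steps, and $x_{i}$ is the model input at time step $i$.
The research team further introduced three distinct compositions for the model input $x_{i}$: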
1) a scenario in which the model is given traffic volume only, with no auxiliary prediction variables (Scenario 1),
2) a scenario in which traffic volume is augmented with timestamp information (Scenario 2), and
3) a scenario in which the traffic volume information is augmented with a full set of auxiliary prediction variables including timestamp, weather/road condition, traffic incident data, and holiday information (Scenario 3).
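To make the single-step formulation concrete, the sketch below builds input/target pairs from a sequence of per-time-step feature vectors. The function name `make_windows` and the window length `k=3` are hypothetical; the text does not state the value of k used in the study.

```python
def make_windows(series, k):
    """Pair each run of k consecutive feature vectors with the next volume.

    `series` is a list of feature vectors whose first element is the volume.
    """
    inputs, targets = [], []
    for t in range(k, len(series)):
        inputs.append(series[t - k:t])  # the k prior time steps
        targets.append(series[t][0])    # volume at the next time step
    return inputs, targets


# Scenario 1 style input: volume only, so one feature per time step.
volumes = [[12], [15], [20], [18], [25], [30]]
X, y = make_windows(volumes, k=3)
print(X[0], y[0])  # [[12], [15], [20]] 18
```

For Scenarios 2 and 3, each inner vector would simply be longer (volume plus timestamp, or volume plus the full set of auxiliary variables), while the windowing logic stays the same.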
Models
The research team compared the performance of four machine learning models: MLP, SVR, DT, and LSTM. Additionally, two simple baseline models were included for comparison, which calculated the mean of the previous 1 week and the previous 52 weeks for prediction.
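The two baseline models can be sketched as follows. The text does not spell out exactly how the means are taken, so this example assumes that each prediction averages the volumes observed at the same time-of-week slot in the preceding weeks; the function name and the synthetic data are illustrative.

```python
BINS_PER_WEEK = 7 * 96  # 96 bins of 15 min per day


def mean_of_previous_weeks(volumes, t, n_weeks):
    """Average the volumes at the same time-of-week slot in up to n_weeks prior weeks."""
    history = [
        volumes[t - w * BINS_PER_WEEK]
        for w in range(1, n_weeks + 1)
        if t - w * BINS_PER_WEEK >= 0
    ]
    if not history:
        return None  # no earlier week available
    return sum(history) / len(history)


# Synthetic series: every bin in week 0 is 5, week 1 is 15, week 2 is 25, ...
volumes = [(i // BINS_PER_WEEK) * 10 + 5 for i in range(3 * BINS_PER_WEEK + 1)]
t = 3 * BINS_PER_WEEK  # first bin of week 3

print(mean_of_previous_weeks(volumes, t, n_weeks=1))   # 25.0
print(mean_of_previous_weeks(volumes, t, n_weeks=52))  # 15.0 (only 3 weeks exist)
```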
All machine learning models were implemented in Python using the PyTorch and scikit-learn libraries ( 47 , 48 ). The training of the LSTM and MLP models was distributed across two NVIDIA GeForce GTX 1080 Ti graphics processing units to accelerate the training process, and SVRs and DTs were trained on the central processing unit.
The specific models chosen for this study were drawn from the literature after comprehensive review of their performance at traffic prediction tasks summarized previously in Table 1. Of the many traffic prediction methods present in the literature, MLP, SVR, DT and LSTM showed particular promise for further study. The models were evaluated and compared in the context of traffic volume prediction of each 15 min timestamp, as well as across the three different scenarios. All models were trained on approximately 365 days of data (training set) and tested on 31 days of data (testing set), except SVR. The research team conducted extensive experiments to assess the performance of each model and determine their suitability for predicting traffic volume in the study area. The following subsections discuss the details of each machine learning model that was used.
Multilayer Perceptron (MLP)
MLP is a widely used function approximation algorithm that aims to identify patterns in labeled training data ( 36 ). It consists of an input layer, one or more hidden layers, and an output layer. The input layer represents the data or features to be classified, while the hidden layers capture learned features of the data. The output layer encodes the final predictions or classifications made by the MLP. These layers are interconnected within a feed-forward neural network, and the MLP is trained using the backpropagation algorithm ( 49 ).
The research team implemented the MLP learning algorithm using the PyTorch library. The MLP model consists of an initial input layer with a variable input size depending on the specific learning task. It is followed by five hidden layers, each containing 80 neurons. The output layer consists of a single neuron that predicts the expected traffic volume for the next time step. To enable the model to handle nonlinear data, the research team applied the hyperbolic tangent activation function between each pair of layers. During both training and testing, the input time series was flattened before being fed into the MLP. This means that the width of the MLP input layer is determined by the length of the flattened time series.
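A network with the shape described above might be assembled in PyTorch as in the following sketch. The helper `build_mlp`, the window length `k=8`, and the feature width `n_features=57` are assumptions for illustration; only the five hidden layers of 80 neurons, the hyperbolic tangent activations, the flattened input, and the single output neuron come from the text.

```python
import torch
from torch import nn


def build_mlp(k, n_features, hidden=80, n_hidden_layers=5):
    """Feed-forward network: flattened window in, one predicted volume out."""
    layers = [nn.Flatten()]  # (batch, k, n_features) -> (batch, k * n_features)
    width = k * n_features
    for _ in range(n_hidden_layers):
        layers += [nn.Linear(width, hidden), nn.Tanh()]
        width = hidden
    layers.append(nn.Linear(width, 1))  # single output neuron
    return nn.Sequential(*layers)


model = build_mlp(k=8, n_features=57)
batch = torch.zeros(4, 8, 57)  # 4 windows of 8 time steps each
print(model(batch).shape)      # torch.Size([4, 1])
```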
Support Vector Regression (SVR)
SVR is a computational non-linear generalization algorithm using statistical learning theory ( 38 ). It aims to enhance the generalization ability to handle unseen data effectively. SVR achieves this by maximizing the margin between the locations of training samples and the hyperplane that defines the decision surface of the model. This optimization is accomplished through the solution of a convex optimization problem.
The research team implemented the SVR model using the scikit-learn library. Specifically, the research team used the radial basis function as the kernel and set the maximum number of iterations to 20 for SVR. To reduce training time, the model was trained and tested on only 20% of all data. Additionally, the input data were standardized by removing the mean and scaling to unit variance in an attempt to increase the SVR test accuracy.
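A configuration along these lines can be expressed with scikit-learn as shown below. The data are synthetic stand-ins rather than ATSPM volumes, and the use of a pipeline to combine standardization with the regressor is an assumption about how the two steps were connected.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.random((200, 8))  # synthetic stand-in for flattened input windows
y = X.sum(axis=1)         # synthetic target, not ATSPM data

# Standardize inputs (zero mean, unit variance), then fit an RBF-kernel SVR
# whose solver is capped at 20 iterations.
svr = make_pipeline(StandardScaler(), SVR(kernel="rbf", max_iter=20))
svr.fit(X, y)
pred = svr.predict(X[:5])
print(pred.shape)  # (5,)
```

With such a low iteration cap, scikit-learn will typically emit a warning that the solver terminated before converging.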
Decision Tree (DT)
DT is a classic supervised learning method that can be applied to both classification and regression tasks ( 25 ). It operates by constructing a tree-like structure where each internal node represents a decision based on a feature or attribute, and each leaf node represents a prediction. The decision rules for splitting the data are learned based on the attribute that provides the best separation or information gain.
The research team developed the DT regression model using the scikit-learn library. The implementation utilizes an improved version of MSE for the error term known as Friedman MSE. To achieve higher accuracy, the constraints on the maximum depth of the tree and the minimum number of samples required in a leaf node were removed. By allowing the model to explore more complex decision rules and adapt to the characteristics of the data, the research team aimed to improve its predictive performance in forecasting traffic volume.
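The settings described above correspond roughly to the following scikit-learn configuration. The data are synthetic, and the exact arguments used by the research team are not given in the text, so `max_depth=None` and `min_samples_leaf=1` are an interpretation of "constraints removed".

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.random((200, 8))  # synthetic stand-in for flattened input windows
y = X.sum(axis=1)         # synthetic target, not ATSPM data

# Friedman MSE split criterion, with no depth limit and no leaf-size limit.
tree = DecisionTreeRegressor(
    criterion="friedman_mse",
    max_depth=None,
    min_samples_leaf=1,
    random_state=0,
)
tree.fit(X, y)
train_pred = tree.predict(X)
print(np.allclose(train_pred, y))  # True
```

An unconstrained tree can reproduce its training targets exactly, as this example shows, so performance on a held-out testing set is the meaningful measure of accuracy.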
Long Short-Term Memory (LSTM)
LSTM is a variant of a recurrent neural network (RNN) ( 31 ). To understand LSTM, it is necessary to first introduce RNN. RNN can handle variable-length sequence inputs by having a recurring hidden state, the activation of which in each time interval depends on the state of the model during the previous time interval. LSTM was originally proposed by Hochreiter and Schmidhuber as an evolution of RNN to overcome the vanishing gradient problem ( 50 , 51 ). Unlike traditional RNN, which overwrites its contents at each time step, LSTM determines whether the existing memory should be preserved across time steps.
The LSTM network utilized in this research is unidirectional and comprises a single hidden layer with 50 hidden units. The output of the LSTM is fed into a linear layer with a dropout rate of 0.1 to produce the final output. To predict the volumes for each left-turn lane and combined through movement and right-turn lane volumes, eight separate LSTMs were trained. Each LSTM was trained with 20 epochs (iterations).
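The architecture described above might look like the following in PyTorch. This is a sketch under stated assumptions: the class name `VolumeLSTM`, the feature width of 57, the window length of 8, and the choice to apply dropout to the last hidden state before the linear layer are not specified in the text.

```python
import torch
from torch import nn


class VolumeLSTM(nn.Module):
    """Single-layer unidirectional LSTM followed by dropout and a linear output."""

    def __init__(self, n_features, hidden_size=50, dropout=0.1):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden_size, num_layers=1, batch_first=True)
        self.dropout = nn.Dropout(dropout)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):
        out, _ = self.lstm(x)  # out: (batch, k, hidden_size)
        last = out[:, -1, :]   # hidden state at the final time step
        return self.head(self.dropout(last))


# One model per phase (phases 1 to 8), mirroring the eight separate LSTMs.
models = {phase: VolumeLSTM(n_features=57) for phase in range(1, 9)}
window = torch.zeros(4, 8, 57)  # 4 windows of 8 time steps each
print(models[2](window).shape)  # torch.Size([4, 1])
```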
Results
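Predicted traffic volumes were evaluated from all intersections in the testing set. Root mean square error (RMSE) and MAPE were used to determine the difference in performance between LSTM and other comparison models. Equations 2 and 3 define the terms of RMSE and MAPE, respectively.
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_{i} - \hat{y}_{i}\right)^{2}} \qquad (2)$$
where $n$ is the number of observations, $y_{i}$ is the actual traffic volume for observation $i$, and $\hat{y}_{i}$ is the predicted traffic volume for observation $i$.
$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{y_{i} - \hat{y}_{i}}{y_{i}}\right| \qquad (3)$$
where the symbols are as defined for Equation 2, and MAPE is expressed as a fraction (for example, 0.35 corresponds to 35%).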
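The two error measures can be computed with a few lines of Python, using the standard definitions of RMSE and MAPE. The text does not say how bins with zero actual volume were handled in the MAPE calculation; skipping them, as done here, is an assumption.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def mape(actual, predicted):
    """Mean absolute percentage error as a fraction.

    Bins with zero actual volume are skipped (an assumption) to avoid
    division by zero.
    """
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    return sum(abs(a - p) / a for a, p in pairs) / len(pairs)


actual = [10, 20, 40]
predicted = [12, 18, 40]
print(round(rmse(actual, predicted), 3))  # 1.633
print(round(mape(actual, predicted), 3))  # 0.1
```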
Table 3 provides the average RMSE and MAPE results for all six study intersections for each phase and overall average for all eight phases. The table displays the RMSE and MAPE values obtained from the comparison of all six models for all three scenarios that include the volume only scenario (Scenario 1), the volume plus time scenario (Scenario 2), and the all variables scenario (Scenario 3).
Root Mean Square Error (RMSE) and Mean Absolute Percentage Error (MAPE) Traffic Volume Prediction Results
Note: Avg. = average; DT = decision tree; LSTM = long short-term memory; MLP = multilayer perceptron; SVR = support vector regression.
From the results in Table 3, it can be observed that the RMSE values are generally lower for odd phases, which represent the left-turn lanes, and higher for even phases, which represent the through and right-turn lanes. This observation aligns with the understanding that the traffic volumes for through and right-turn lanes are typically higher than those for left-turn lanes at signalized intersections.
Comparing the overall average RMSE and MAPE results for eight phases among each model, the LSTM model demonstrates the best performance with the lowest RMSE and MAPE in all three scenarios at 10.13 and 0.37 for the volume only scenario (Scenario 1), 9.42 and 0.35 for the volume plus time scenario (Scenario 2), and 9.39 and 0.35 for the all variables scenario (Scenario 3), respectively. Following closely are the DT and MLP models, ranking second and third in performance among the machine learning algorithms. Models based on the mean of the previous 52 weeks and the previous 1 week are ranked fourth and fifth in the RMSE and MAPE results, suggesting that incorporating more historical data leads to improved data prediction. In contrast, the SVR model yields the least-favorable results, clearly indicating that SVR is not an ideal prediction model for the type of data used in ATSPM. The results also show that the timestamp variables included (volume plus time, Scenario 2) in the prediction model help improve the RMSE from 10.13 down to 9.42 (a 7% reduction). However, other prediction variables such as weather, crashes, and holidays did not show significant improvement when added to the model (all variables, Scenario 3).
Table 4 compares the average RMSE and MAPE results across all eight phases in all three scenarios. The LSTM model consistently demonstrates the lowest RMSE and MAPE values across all three scenarios. For instance, in the volume plus time scenario (Scenario 2), the RMSE is as low as 9.42, indicating that the typical (root mean square) difference between the predicted volume and the actual volume is about 9 vehicles during a 15 min time period, while the average volume of the through movement is 61. This suggests that the LSTM model performs well in accurately predicting traffic volumes for different phases. It is also noteworthy that the LSTM model performs better when additional variables or features are included in the training process. When the LSTM model is trained with volume, time, weather/road condition, crashes, and holidays, it shows the best RMSE and MAPE results overall among other models and scenarios.
Average Root Mean Square Error (RMSE) and Mean Absolute Percentage Error (MAPE) across All Phases for All Scenarios
Note: DT = decision tree; LSTM = long short-term memory; MLP = multilayer perceptron; na = not applicable; SVR = support vector regression.
Figure 4 illustrates an example of the selected signal and phase of predicted volumes from each model compared with the existing volume from the ATSPM datasets. The phase 2 volumes at the Orem, Utah, intersection of 800 East and 800 North are displayed. The figure presents the ground truth volume as a red line and the predicted volume from each prediction model as a blue line. The x-axis represents timestamps, with each unit representing a 15 min bin; because each day has 96 such bins, the 288 timestamps shown amount to approximately 3 days of data. The y-axis shows the total volume traveled through the intersection in each 15 min bin. The RMSE and MAPE values for each model are also displayed in the legend, with the LSTM model having the lowest RMSE and MAPE of 20.69 and 0.38, respectively, in the example of the selected signal and phase.

Prediction volume from each model versus ground truth volume for: (a) multilayer perceptron (MLP), (b) decision tree (DTR), (c) support vector regression (SVR), (d) long short-term memory (LSTM), (e) mean of previous 1 week, and (f) mean of previous 52 weeks.
Several observations can be made about the performance of each model shown in Figure 4. Note that the ground truth data are shown in red, indicating trends that follow the general path of the traffic volume, with volumes peaking in the AM and PM peak hours and traffic volumes close to zero during nighttime hours for all 3 days. The MLP model demonstrates relatively close predictions to the real volume during the first 2 days, but it exhibits some discrepancies during the afternoon of the 3rd day. A potential reason for this behavior lies in the MLP model’s inability to capture temporal dependencies of the traffic data. The mean of the previous 52 weeks model has shown that the average traffic volume around timestamp 250 experiences a local maximum, a pattern which MLP learns. However, when the test week of interest has a dissimilar trend around timestamp 250—which may occur because of weather, holiday, or crash conditions—MLP falls short of predicting the correct short-term behaviors. The DT model captures a similar trend as the real data but struggles to accurately predict peak hour volumes, resulting in overestimation or underestimation. The SVR model encounters significant issues in predicting the traffic volume, as it fails to capture the traffic patterns effectively. The research team hypothesizes that the SVR model’s objective of maximizing the margin between observed data and the model’s decision surface has resulted in drastic overcompensation, with infrequent but extreme peak traffic volumes in the training data leading to chronic overestimation of all other values. The two baseline models, which rely on statistical calculations based on mean volumes from previous traffic data, do not perform well in estimating the traffic pattern accurately. It is crucial to inform current practitioners that using the previous average volume to predict future traffic is insufficient and may lead predictions in the wrong direction. 
In contrast, the LSTM model stands out with its remarkable performance in minimizing RMSE and MAPE. The RMSE and MAPE of the LSTM model are both the lowest compared with the other five models. The predicted volumes align closely with the real data, indicating a high level of accuracy. The LSTM model’s ability to capture sequential dependencies and utilize long-term memory proves to be effective in predicting traffic volumes accurately. Figure 5 illustrates all six models combined compared with the ground truth volume. The results are based on the following predictor variables: traffic volume, timestamp, weather/road conditions, holiday data, and a weekend flag, with each timestamp representing a 15 min interval. The unique characteristics of the ATSPM dataset render traditional methods unreliable, but the LSTM deep learning model succeeds. All models illustrated have the ability to predict traffic volumes at some level except for the SVR model.

Summary of prediction models versus ground truth.
The findings from Tables 3 and 4 and Figures 4 and 5 highlight the superiority of the LSTM model in predicting traffic volumes compared with other models on a randomly sampled (but representative) time window of 3 days. The ability of the LSTM model to generate highly matched predictions compared with real data enhances its potential for practical applications in traffic management and planning.
Discussion
In this paper, the research team applied the LSTM model for predicting directional traffic volume at signalized intersections using the Utah ATSPM dataset. The results of this research have several important implications for practitioners in the field and offer valuable insights for future research.
First and foremost, the research team has shown that the ATSPM dataset can be effectively augmented using the LSTM model in conjunction with auxiliary input data, as shown in Figure 5. By leveraging a large ATSPM dataset and incorporating additional variables such as crash data, weather conditions, and holidays, the LSTM model outperformed other machine learning algorithms in accuracy. Moreover, compared with the two baseline models that traffic engineers typically use to estimate missing traffic data (i.e., the mean of the previous 1 week or previous 52 weeks), the LSTM model provided significant improvement. This highlights the potential of deep learning techniques to improve traffic prediction accuracy and offer valuable information for traffic engineers and planners, compared with their traditional traffic prediction methods.
The practical implications of this research are significant. By utilizing the LSTM model paired with auxiliary input data, traffic agencies can improve the quality of their ATSPM data by filling in missing values and generating more accurate predictions of traffic volume. This, in turn, can support better decision-making in traffic operations, such as optimizing signal timing, managing congestion, and enhancing overall traffic safety and efficiency.
However, it is important to acknowledge the limitations of this study. First, the small sample size and potential inaccuracies in the current UDOT ATSPM datasets may affect the ability to apply the findings on a wider scale. Future research should focus on expanding the dataset and validating the results on a larger-scale source of data from outside of Utah to enhance the robustness of the LSTM model. Second, the selected intersections vary in size, number of lanes, signal timing, pedestrian crosswalk involvement, and other aspects, which may influence the results. Third, it is challenging to calculate MAPE and RMSE when ground truth data are missing. Enough data samples and testing are needed to assure practitioners that the prediction model is accurate enough to provide reasonable values for filling in the missing data in the ATSPM dataset. Last, the ability of the LSTM model to generalize to other unseen signals and phases is worth further research, as in this initial attempt the LSTM model was trained and deployed on the same selected signals and phases. Given the LSTM algorithm’s remarkable ability to adapt to short-term variations within the ATSPM data used for this study, which included data from multiple signal phases across a variety of seasonal changes and traffic disruptions, it seems reasonable that similar performance might be attained using data from different locations or time periods. Such performance may not be presumed, however, and should be verified independently for each dataset used.
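The difficulty with error metrics and missing ground truth can be made concrete with a short sketch. The code below computes RMSE and MAPE only over time steps where an observed count exists, and it additionally skips zero counts when computing MAPE because the percentage error is undefined there. The function name `rmse_mape`, the `None` convention for missing values, and the skipping rules are assumptions made for illustration, not the evaluation procedure used in the study.

```python
import math


def rmse_mape(actual, predicted):
    """Return (RMSE, MAPE in percent) over steps with ground truth.

    Assumptions (not from the paper): missing ground truth is None and
    is excluded from both metrics; zero counts are excluded from MAPE
    because division by zero makes the percentage error undefined.
    """
    pairs = [(a, p) for a, p in zip(actual, predicted) if a is not None]
    if not pairs:
        return None, None  # nothing to evaluate against

    rmse = math.sqrt(sum((a - p) ** 2 for a, p in pairs) / len(pairs))

    nonzero = [(a, p) for a, p in pairs if a != 0]
    if nonzero:
        mape = 100.0 * sum(abs(a - p) / abs(a) for a, p in nonzero) / len(nonzero)
    else:
        mape = None
    return rmse, mape


observed = [10, None, 20, 0]   # None marks a missing observation
forecast = [12, 5, 16, 3]
rmse, mape = rmse_mape(observed, forecast)
```

Under these assumptions, the metrics say nothing about the time steps that were skipped, which are exactly the ones a gap-filling model is meant to supply. If missing values are stored as zeros rather than `None`, they would first need to be distinguished from genuine zero-volume intervals.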
The current study focuses on next-timestep prediction only. However, a complete augmentation of the ATSPM dataset will require multi-timestep prediction to fill large gaps of missing information. Preliminary experiments by the research team suggest that the accuracy of the LSTM model falls below its next-step prediction accuracy within fewer than 30 time steps, but that it nevertheless mimics the general pattern of traffic volume across several days of data. For many applications, such predictions, although somewhat inaccurate, would be preferable to the current state of the dataset, which renders missing data as zero traffic volume.
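One common way to extend a next-timestep model to multi-timestep gap filling is to feed each prediction back in as input for the next step. The sketch below shows that recursive scheme with a hypothetical `predict_next` callable standing in for a trained one-step model; it is not the research team's implementation, and it omits the auxiliary inputs (crash data, weather conditions, holidays) that a full version would also need to supply at each step.

```python
def fill_gap_recursively(history, gap_length, window, predict_next):
    """Fill a gap by repeatedly applying a one-step predictor.

    history      : observed counts immediately before the gap
    gap_length   : number of missing time steps to fill
    window       : number of past values the predictor consumes
    predict_next : hypothetical callable mapping a list of recent
                   values to the next value (stand-in for a model)
    """
    context = list(history[-window:])
    filled = []
    for _ in range(gap_length):
        y = predict_next(context)
        filled.append(y)
        # Slide the window: later steps depend on earlier predictions,
        # so any error made here is carried forward.
        context = context[1:] + [y]
    return filled


def moving_average(values):
    """Toy stand-in for a trained one-step model."""
    return sum(values) / len(values)


filled = fill_gap_recursively([10, 20, 30], gap_length=3, window=2,
                              predict_next=moving_average)
```

Because every step after the first consumes earlier predictions rather than observations, errors can accumulate over the length of the gap, which is consistent with accuracy degrading over longer horizons.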
Conclusions
The purpose of this research is to evaluate the use of machine learning algorithms and statistical methods to predict missing and/or anomalous directional traffic volume data in ATSPM datasets. This research paper highlights the effectiveness of the LSTM model in predicting directional traffic volume at signalized intersections. By leveraging advanced machine learning techniques and incorporating additional predictor variables, such as crash data, weather conditions, and holidays, the LSTM model demonstrates its capability to accurately forecast traffic volume.
The findings of this study have important implications for transportation agencies and practitioners in utilizing ATSPM datasets. The LSTM model provides valuable insights into traffic patterns and trends, enabling traffic engineers and planners to make informed decisions about signal timing, congestion management, and overall traffic efficiency. By improving the accuracy of traffic volume predictions in ATSPM data, the LSTM model contributes to more effective transportation planning and management strategies.
Limitations of this paper include: 1) the small sample size and potential inaccuracies in the ATSPM datasets, 2) the inability of the selected sample to represent the conditions at all signalized intersections across different locations, and 3) the still-limited ability of the LSTM model to generalize to other unseen signals. Therefore, future research should focus on expanding the dataset and validating the results on a larger scale to ensure the robustness of the LSTM model. Future research could also incorporate additional predictor variables, such as traffic volumes of adjacent intersections, to further enhance the accuracy and granularity of traffic volume predictions. Furthermore, the application of the LSTM model can be extended to optimize signal timing and congestion management in larger corridors and network-wide traffic systems.
In summary, this research demonstrates the effectiveness of the LSTM model in predicting traffic volume at signalized intersections equipped with ATSPM data. By improving the accuracy of predictions and providing valuable insights into traffic patterns, the LSTM model contributes to more efficient traffic management and planning. It will benefit state and local transportation agencies, traffic engineers and planners, and all other ATSPM users in utilizing this robust system and analyzing their ATSPM data to make better transportation planning decisions.
Footnotes
Author Contributions
The authors confirm contribution to the paper as follows: study conception and design: B. Wang, N. Fulda, Z. Huang, G.G. Schultz, and G.S. Macfarlane; data collection: B. Wang and Z. Huang; analysis and interpretation of results: B. Wang, N. Fulda, Z. Huang, G.G. Schultz, G.S. Macfarlane, J. Arnesen, and A. Khayyat; draft manuscript preparation: B. Wang, Z. Huang, and J. Arnesen. All authors reviewed the results and approved the final version of the manuscript.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was managed and funded by the Computer Science and Civil and Construction Engineering Departments at Brigham Young University with data provided by the Utah Department of Transportation.
Data Accessibility Statement
The data that support the findings of this study are available from the Utah Department of Transportation, but restrictions apply to the availability of these data, which were used under license for the current study, and so are not publicly available. Data are, however, available from the authors on reasonable request and with permission of the Utah Department of Transportation.
The authors alone are responsible for the preparation and accuracy of the information, data, analysis, discussions, recommendations, and conclusions presented in this article. The contents do not necessarily reflect the views, opinions, endorsements, or policies of the Utah Department of Transportation or the U.S. Department of Transportation. The Utah Department of Transportation makes no representation or warranty of any kind, and assumes no liability therefor.
