Abstract
Every year, the U.S. government provides several billions of dollars in the form of federal funding for transportation services in the U.S.A. Decision making with regard to the use of these funds largely depends on performance indicators like average annual daily traffic (AADT). In this paper, Bayesian nonparametric models are developed through machine learning for the estimation of AADT on bridges. The effect of hyperparameter choice on the accuracy of estimations produced by Bayesian nonparametric models is also assessed. The predictions produced using the Bayesian nonparametric approach are then compared with predictions from a popular Frequentist approach for the selected bridges. Evaluation metrics like the mean absolute percentage error are subsequently employed in model evaluation. Based on the results, the best methods for AADT forecasting for the selected bridges are recommended.
Every year, the U.S. government provides several billions of dollars in the form of federal funding for transportation services in the U.S.A. ( 1 ). For the fiscal year 2019, the Federal Highway Administration (FHWA) requested $46.0 billion for activities that included increasing roadway safety and mobility, while improving the condition of bridges and highways ( 2 ). Decision making with regard to the use of these funds largely depends on several indicators, and for professionals in transportation, traffic flow plays an important role in this decision-making process. Traffic counts are performed to assess traffic flow. The data obtained from performing traffic counts can be used in long-term planning, network renewal and development, customer service, and operational activities ( 3 ).
Under the long-term planning category, traffic counts can be used in traffic modeling and simulation, safety studies and crash analysis, setting and measuring levels of service, and policy development ( 3 ). Activities like deriving traffic loads for bridge and pavement design, project planning and network comparison can be undertaken when using traffic counts for the purpose of network renewal and development ( 3 ). Operational uses include traffic management, network monitoring, and the derivation of traffic loads for planning and design of maintenance interventions ( 3 ). Road users can also receive responses to traffic related enquiries with the aid of these counts ( 3 ). Traffic counts are rarely used in their raw form. Performance indicators like annual average daily traffic (AADT), average daily traffic, monthly average daily traffic (MADT) and peak-hour factor are computed from the raw traffic information.
AADT is key in network planning, renewal, and development. This performance measure indicates the activity level of a road or a bridge. It estimates the average traffic volume for every day of a year at a specified roadway location. Traditional methods of computing AADT include the simple average method, the American Association of State Highway and Transportation Officials (AASHTO) method (average of averages method), and the FHWA AADT method. Several models have been created using various predictors for the estimation of future AADTs for different roads. This research, however, focuses on AADT estimation for bridges.
Because funding is generally limited for the maintenance and retrofitting of bridges in some areas ( 4 ), it is imperative to develop AADT models which avoid dependency on predictors which may not be readily available to aid in quick and precise decision making with regard to funding allocation. Consequently, this research aims to develop Bayesian nonparametric models through machine learning for the estimation of bridge AADT. This research also seeks to explore the effect of hyperparameter value choice on the accuracy of estimations produced by Bayesian nonparametric models. Another objective of this research is to evaluate the predictions produced by both Bayesian and Frequentist approaches.
Background
Traffic data is collected by offices in state departments of transportation which may be in charge of research, or by subdivisions which may not be directly involved in research or traffic engineering ( 5 ). Traffic data is collected for maintenance, operations, forecasting, and many other purposes. Traffic counts represent a section of traffic data collected. Some types of traffic counts are pedestrian counts and total volume counts ( 6 ). In an area designated for a study, the pedestrian traffic throughout the study period is termed a pedestrian count. Total volume counts for a selected roadway are mainly used in the calculation of average daily traffic. Various methods of data collection are employed in obtaining the different traffic counts mentioned. Selected “in-situ” technologies can be used to collect traffic data with the help of detecting equipment placed alongside the roadway. The methods of data collection can be categorized as intrusive and non-intrusive ( 7 ).
A method is classified as intrusive if the main tools used are a recorder and sensor placed on or in the road. The major intrusive methods that have been employed over the years are pneumatic road tubes, piezoelectric sensors, and magnetic loops. Non-intrusive methods are done by observing traffic remotely. These methods range from very traditional methods to new technologies. Examples of non-intrusive methods include, but are not limited to: manual counts, passive and active infra-red, passive magnetic, microwave radar, ultrasonic and passive acoustic and video image detection ( 8 ). Traffic data collected using the various intrusive and non-intrusive methods are used to compute essential performance measures. AADT is one such performance measure used by transportation professionals. It is the yearly average of the number of vehicles passing a point per day. It is also simply referred to as the vehicle flow over a road section on an average day of the year ( 7 ).
Three methods that can be used to compute AADT are: simple average, AASHTO, and AASHTO with day-of-week and month-of-year adjustment factors. The simple average method can be used to compute AADT if traffic data are available for all days of the year for a specified roadway. Using the simple average method ( 9 ), the formula for calculating AADT values for a vehicle class, c, is given by Equation 1:
$$\mathrm{AADT}_c = \frac{1}{n}\sum_{i=1}^{n} \mathrm{VOL}_i \qquad (1)$$
where
VOLi = total traffic on ith day of year,
n = number of days in a particular year, and
c = FHWA vehicle class.
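The simple average can be sketched in a few lines of Python. This is only an illustration: the function names and the volumes are hypothetical, and the second helper assumes that only a yearly total is available, as with annual bridge counts.

```python
import calendar
from statistics import fmean


def simple_average_aadt(daily_volumes):
    """Equation 1: mean of the daily volumes recorded over one full year."""
    if len(daily_volumes) not in (365, 366):
        raise ValueError("expected one volume for every day of the year")
    return fmean(daily_volumes)


def aadt_from_annual_total(annual_total, year):
    """Same average when only the yearly total is known (n = days in the year)."""
    n = 366 if calendar.isleap(year) else 365
    return annual_total / n


# Hypothetical volumes, for illustration only.
daily = [5000] * 365
aadt_daily = simple_average_aadt(daily)
aadt_total = aadt_from_annual_total(sum(daily), 2018)
```

Both calls return the same value because the yearly total divided by the number of days is the mean of the daily volumes.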
Traffic data for each day of the year may not always be available to use in the simple average method. For this reason, AASHTO developed a second method to estimate AADT while reducing potential errors in its calculation induced by missing days. This AASHTO method makes use of any known periodicity of traffic volume by month of the year and day of the week. The AASHTO method can be used to calculate the AADT provided that, in each month, every day of the week has data for at least one occurrence ( 10 ). This second method is given by Equation 2:
$$\mathrm{AADT} = \frac{1}{12}\sum_{m=1}^{12}\left[\frac{1}{7}\sum_{j=1}^{7}\left(\frac{1}{n_{jm}}\sum_{i=1}^{n_{jm}}\mathrm{VOL}_{ijm}\right)\right] \qquad (2)$$
where
VOLijm = total traffic volume for ith occurrence of the jth day of week within the mth month
i = occurrence of a particular day of the week in a particular month (i = 1, …njm) for which traffic volume is available
j = day of the week (j = 1,2, …7)
m = month (m = 1,…12)
njm = the count of the jth day of week during the mth month for which traffic volume is available (njm ranges from 1 to 5 depending on day of week, month, and data availability)
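As a rough illustration of the average-of-averages idea, the sketch below groups daily counts by month and day of week, averages within each group, then averages across days of the week and across months. The function name and the synthetic counts are assumptions made for the example.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import fmean


def aashto_aadt(volumes_by_date):
    """Average of averages for a {date: volume} mapping covering one year."""
    groups = defaultdict(list)  # (month, ISO weekday) -> observed volumes
    for day, volume in volumes_by_date.items():
        groups[(day.month, day.isoweekday())].append(volume)

    monthly_averages = []
    for m in range(1, 13):
        weekday_means = []
        for j in range(1, 8):
            observed = groups.get((m, j))
            if not observed:
                raise ValueError(f"no counts for weekday {j} in month {m}")
            weekday_means.append(fmean(observed))
        monthly_averages.append(fmean(weekday_means))  # equal weight per weekday
    return fmean(monthly_averages)  # equal weight per month


# Hypothetical 2018 counts: 6,000 vehicles on weekdays, 4,000 on weekends,
# with every fifth day of each month missing.
start = date(2018, 1, 1)
counts = {}
for k in range(365):
    day = start + timedelta(days=k)
    if day.day % 5 == 0:
        continue  # simulate a missing count
    counts[day] = 6000 if day.isoweekday() <= 5 else 4000

aadt = aashto_aadt(counts)
```

Because the missing days do not change any day-of-week mean in this synthetic example, the estimate is unaffected by the gaps.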
From the method formulation, the AASHTO method is seen to have a mathematical bias since it gives equal weights to each day of the week in each month and then gives equal weights to all months. As it gives equal weighting to all MADT values, the AASHTO method overweights the months with fewer than 31 days and underweights the 31-day months. For this reason, even if traffic volume were available for every day of the year, the AASHTO method calculation would be unlikely to match the simple average exactly ( 10 ). This bias is fixed by using the AASHTO method with day-of-week and month-of-year adjustment factors, which is given by Equation 3:
$$\mathrm{AADT} = \frac{\sum_{m=1}^{12} d_m \left[\dfrac{\sum_{j=1}^{7} w_{jm}\left(\dfrac{1}{n_{jm}}\sum_{i=1}^{n_{jm}}\mathrm{VOL}_{ijm}\right)}{\sum_{j=1}^{7} w_{jm}}\right]}{\sum_{m=1}^{12} d_m} \qquad (3)$$
where
VOLijm = total traffic volume for ith occurrence of the jth day of week within the mth month
i = occurrence of a particular day of the week in a particular month (i = 1, …njm) for which traffic volume is available
j = day of the week (j = 1, 2, …7)
m = month (m = 1, …12)
njm = the count of the jth day of week during the mth month for which traffic volume is available (njm ranges from 1 to 5 depending on day of week, month, and data availability)
wjm = the weighting for the number of times the jth day of week occurs during the mth month (either 4 or 5); the sum of the weights in the denominator is the number of calendar days in the month (i.e., 28, 29, 30, or 31)
dm = the weighting for the number of days (i.e., 28, 29, 30, or 31) for the mth month in the particular year.
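The effect of the weights can be checked numerically. The sketch below, with a hypothetical function name and synthetic volumes, weights each day-of-week mean by the number of times that day occurs in the month and each month by its number of days. With complete data it should reproduce the simple average.

```python
import calendar
from collections import defaultdict
from datetime import date, timedelta
from statistics import fmean


def weighted_aadt(volumes_by_date, year):
    """Day-of-week and month-of-year weighted AADT for a {date: volume} mapping."""
    groups = defaultdict(list)  # (month, weekday with Monday = 0) -> volumes
    for day, volume in volumes_by_date.items():
        groups[(day.month, day.weekday())].append(volume)

    weighted_sum = 0.0
    total_days = 0
    for m in range(1, 13):
        d_m = calendar.monthrange(year, m)[1]  # calendar days in month m
        w = [0] * 7  # w[j]: occurrences of weekday j in month m (4 or 5)
        for day_of_month in range(1, d_m + 1):
            w[date(year, m, day_of_month).weekday()] += 1
        # fmean raises StatisticsError if a weekday has no counts in a month.
        madt = sum(w[j] * fmean(groups[(m, j)]) for j in range(7)) / sum(w)
        weighted_sum += d_m * madt
        total_days += d_m
    return weighted_sum / total_days


# Hypothetical complete year of counts that grow steadily through 2018.
start = date(2018, 1, 1)
daily = {start + timedelta(days=k): 4000 + 10 * k for k in range(365)}

weighted = weighted_aadt(daily, 2018)
simple = fmean(daily.values())
```

Here `weighted` and `simple` agree to within floating-point rounding, which is the property the adjustment factors are meant to restore.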
The areas of research in relation to AADT estimation include expanding short-duration counts to yearly values, predicting AADTs, and spatially extrapolating counts from one location to another ( 11 ). This research focuses on the second area mentioned, which is AADT prediction/forecasting. Regression analysis is the most common method explored by researchers in estimating AADT (12–15). This method has been used to predict AADT for minor roads at intersections ( 15 ), rural and urban roads ( 5 ), and some classes of highways ( 16 ). Recently, more supervised learning techniques have been explored and employed in the prediction of AADT. The use of random forest (RF) ( 15 ), and neural networks (15, 17, 18) has become quite popular in this area. Researchers have also explored AADT prediction using support vector regression with data-dependent parameters ( 5 ). Geographically weighted regression (GWR) has been used to predict traffic counts in unmeasured locations ( 19 ) and the results of published research indicate that, as compared with ordinary linear regression models, the GWR models were more accurate ( 14 ).
The smoothly clipped absolute deviation (SCAD) penalty, an efficient way of performing variable selection for regression, has also been employed in AADT prediction. It is able to select significant predictors and estimate unknown regression coefficients simultaneously. Yang et al. ( 20 ) employed SCAD in their research and concluded from the results that it further improved the local AADT estimation when satellite information was incorporated. In addition to these methods, spatial techniques have been found to be quite useful in obtaining predictions ( 21 ). AADT predictions were made for selected roads in Texas using universal kriging ( 19 ). Universal kriging was found to reduce errors in statistically significant ways, although its performance was low at sites with low traffic counts. The use of nonparametric regression models has proved to be beneficial in AADT forecasting. These regression methods do not require prior training and make predictions with reference to a group of similar cases located around the current input state at the time of prediction ( 18 ). Researchers who used nonparametric regression methods reported that they performed well with AADTs calculated from yearly traffic counts with no missing data ( 18 ).
Gaussian Process Regression
The Bayesian approach typically starts with a parametric model that describes a phenomenon. A prior distribution, which represents past knowledge or belief about the phenomenon before observing any data, is then specified for the unknown parameters of the model. After observing new data generated by the model, these beliefs can be updated ( 22 ). Bayesian methods use Bayes’ theorem to obtain a posterior probability density for the unknown parameters (Equation 4):
$$p(\theta \mid D) = \frac{p(D \mid \theta)\, p(\theta)}{p(D)} \qquad (4)$$
In this equation, θ denotes the unknown parameters and D the observed data; p(θ) is the prior, p(D | θ) is the likelihood, p(D) is the marginal likelihood, and p(θ | D) is the posterior.
Parametric Bayesian models assume that the set of parameters is finite. Given the parameters, model predictions are independent of the observed data. The parameters are able to provide all the information about the observed data ( 22 ). The complexity of the model is therefore bounded even when the amount of data is unbounded. For this reason, parametric models are known for their lack of flexibility. Examples of parametric models are: polynomial regression, mixture models, k-means, hidden Markov models, factor analysis, and logistic regression. A nonparametric Bayesian model is a Bayesian model whose parameter space has infinite dimension. In other words, it is a model that assumes a potentially infinite number of parameters. This provides higher flexibility and better modeling as compared with parametric methods. Nonparametric Bayesian methods are advantageous because they provide a simple framework for modeling complex data which has features that vary as the dataset grows ( 23 ).
To define nonparametric Bayesian models, the probability distribution of the prior in an infinite-dimensional space needs to be specified ( 24 ). A distribution on an infinite-dimensional space is a stochastic process with paths in that infinite-dimensional space. Such distributions are typically harder to define than distributions on real coordinate spaces with known dimensions. Fortunately, there are many tools obtained from stochastic process theory and applied probability that can be used to obtain the needed distributions ( 24 ). Nonparametric models include infinite latent factor models, infinite hidden Markov models, Dirichlet process mixtures, Gaussian process classifiers, and Gaussian processes. These models are applied in feature discovery, time series, clustering, classification and function approximations, respectively ( 23 ).
The nonparametric Bayesian approach selected for this research is Gaussian process regression. Traditional regression methods are used when a particular underlying function is expected. For instance, least squares methods could be used in linear regression when the underlying function is suspected to be linear. Model selection can also be used to select the best model among different polynomial regression methods based on assumptions of polynomial functions ( 25 ). Gaussian process regression provides an alternative regression approach. Instead of assuming that the underlying function f(x) relates to a specific model (e.g., f(x) = mx + c), a Gaussian process represents f(x) obliquely, albeit rigorously, and allows the data to “speak” clearly for itself. This means that it combines the basic structure of Bayesian inference and interpretable parameters with an ability to approximate an infinite number of functions ( 26 ). Gaussian processes have been employed to solve a wide array of engineering problems ( 27 ).
Gaussian process regression (GPR) can be classified as a supervised learning method ( 25 ). Gaussian processes extend multivariate Gaussian distributions to infinite dimensions. Formally, a Gaussian process generates data located throughout a domain such that any finite subset of the range follows a multivariate Gaussian distribution ( 25 ). The mean of a Gaussian process is usually assumed to be zero everywhere. If this assumption is made, a covariance function, k(x, x′), relates the observations in the dataset to each other ( 25 ). A well-known covariance function is the “squared exponential” in Equation 5,
$$k(x, x') = \sigma_f^2 \exp\left[-\frac{(x - x')^2}{2l^2}\right] \qquad (5)$$
where
σf² = maximum allowable covariance,
l = length parameter, and
k(x, x′) = covariance function.
In Equation 5, if x ≈ x′, then k(x, x′) approaches its maximum, implying that f(x) is almost perfectly correlated with f(x′). This is required for smooth functions where neighbors are alike. On the other hand, if x is far from x′, then k(x, x′) ≈ 0, and the points will likely not “see” each other ( 25 ). This can be seen during interpolation, where distant observations have little effect on new x values. The degree of separation is determined by the length parameter, introducing flexibility into the covariance function ( 25 ). During data collection and processing, errors introduced by the method of measurement and by processing techniques can make the data noisy. The relationship between an observation y in the data and the underlying function f(x) can be represented in a Gaussian noise model ( 25 ) (Equation 6):
$$y = f(x) + \mathcal{N}(0, \sigma_n^2) \qquad (6)$$
This noise can be integrated into k(x, x′), in the form (Equation 7):
$$k(x, x') = \sigma_f^2 \exp\left[-\frac{(x - x')^2}{2l^2}\right] + \sigma_n^2\,\delta(x, x') \qquad (7)$$
where δ(x, x′) = the Kronecker delta function and σn² = the noise variance.
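The behavior of the squared exponential covariance for near and distant points can be checked numerically. The sketch below is illustrative only: the parameter names `sigma_f`, `length`, and `sigma_n` and their values are assumptions, and the Kronecker delta is approximated by an equality test on the inputs.

```python
import math


def sq_exp_kernel(x, x_prime, sigma_f=1.0, length=1.0, sigma_n=0.0):
    """Squared exponential covariance with an optional noise term."""
    k = sigma_f**2 * math.exp(-((x - x_prime) ** 2) / (2 * length**2))
    if x == x_prime:  # stands in for the Kronecker delta
        k += sigma_n**2
    return k


near = sq_exp_kernel(1.0, 1.1)  # close inputs: covariance near its maximum
far = sq_exp_kernel(1.0, 6.0)  # distant inputs: covariance near zero
same = sq_exp_kernel(1.0, 1.0, sigma_n=0.3)  # identical inputs add the noise variance
```

Increasing `length` makes `far` grow toward `near`, which is how the length parameter controls how quickly observations stop influencing one another.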
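Before initiating GPR, the covariance function is computed among all possible combinations of data points, x1, …, xn, and a new input, x∗. The findings are then written in three matrices (Equations 8–10) ( 25 ).
$$K = \begin{bmatrix} k(x_1, x_1) & k(x_1, x_2) & \cdots & k(x_1, x_n) \\ k(x_2, x_1) & k(x_2, x_2) & \cdots & k(x_2, x_n) \\ \vdots & \vdots & \ddots & \vdots \\ k(x_n, x_1) & k(x_n, x_2) & \cdots & k(x_n, x_n) \end{bmatrix} \qquad (8)$$
$$K_* = \begin{bmatrix} k(x_*, x_1) & k(x_*, x_2) & \cdots & k(x_*, x_n) \end{bmatrix} \qquad (9)$$
$$K_{**} = k(x_*, x_*) \qquad (10)$$
An important assumption in Gaussian process modeling is that the data can be represented as a sample from a multivariate Gaussian distribution ( 25 ). This is shown by Equation 11:
$$\begin{bmatrix} \mathbf{y} \\ y_* \end{bmatrix} \sim \mathcal{N}\left(\mathbf{0}, \begin{bmatrix} K & K_*^{T} \\ K_* & K_{**} \end{bmatrix}\right) \qquad (11)$$
where T = matrix transposition.
Given that there are n observations in an arbitrary data set, y = {y1, …, yn}, the conditional distribution of y∗ given the data is also Gaussian (Equation 12):
$$y_* \mid \mathbf{y} \sim \mathcal{N}\left(K_* K^{-1}\mathbf{y},\; K_{**} - K_* K^{-1} K_*^{T}\right) \qquad (12)$$
The best estimate for y∗ is given by the mean of this distribution (Equation 13) ( 25 ):
$$\bar{y}_* = K_* K^{-1}\mathbf{y} \qquad (13)$$
and the uncertainty of the computed estimate is captured by the variance of the distribution (Equation 14) ( 25 ):
$$\operatorname{var}(y_*) = K_{**} - K_* K^{-1} K_*^{T} \qquad (14)$$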
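A minimal pure-Python sketch of the predictive mean and variance is given below, assuming a squared exponential kernel with unit signal variance and unit length parameter. The helper names, the three training points, and the noise level are hypothetical. A small Gaussian elimination routine replaces the explicit matrix inverse.

```python
import math


def kernel(a, b, sigma_f=1.0, length=1.0):
    return sigma_f**2 * math.exp(-((a - b) ** 2) / (2 * length**2))


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (M[i][n] - sum(M[i][c] * z[c] for c in range(i + 1, n))) / M[i][i]
    return z


def gp_predict(xs, ys, x_star, sigma_n=0.1):
    """Return the predictive mean and variance at x_star."""
    n = len(xs)
    # K includes the noise variance on its diagonal.
    K = [[kernel(xs[i], xs[j]) + (sigma_n**2 if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    k_star = [kernel(x_star, x) for x in xs]
    k_star_star = kernel(x_star, x_star) + sigma_n**2
    mean = sum(k * a for k, a in zip(k_star, solve(K, ys)))
    var = k_star_star - sum(k * v for k, v in zip(k_star, solve(K, k_star)))
    return mean, var


# Hypothetical training data.
xs, ys = [0.0, 1.0, 2.0], [0.0, 0.8, 0.9]
mean_at, var_at = gp_predict(xs, ys, 1.0, sigma_n=0.01)   # at a training input
mean_far, var_far = gp_predict(xs, ys, 10.0, sigma_n=0.01)  # far from the data
```

At a training input the mean stays close to the observed value and the variance is small. Far from the data, the mean reverts to the zero prior mean and the variance returns to roughly the prior variance.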
Methodology
In this section, the four-step research method used in this paper is outlined, as shown in Figure 1. These steps are further expanded in the following subsections.

Figure 1. Research method steps.
Data Collection/Description
As part of a new initiative by New York State Governor Andrew M. Cuomo to provide access to government data and information, the Open NY initiative was launched in March 2013 by signed Executive Order 95 ( 28 ). State agencies were directed to identify, catalog, and publish their data on the state’s open data website under this initiative, which has been administered by the Office of Information Technology Services since March 2013. The data used in this research was traffic count data for selected bridges which was made available under the Open NY initiative ( 29 ). The selected bridges in this dataset were the Rip Van Winkle, Kingston-Rhinecliff, Mid-Hudson, Newburgh-Beacon, and Bear Mountain bridges. The roads hosted by the bridges and their functional classes are listed below:
Rip Van Winkle Bridge—NY23—Urban Principal Arterial (Non-Interstate)
Kingston-Rhinecliff Bridge—NY199—Rural Principal Arterial (Non-Interstate)
Mid-Hudson Bridge—US44—Urban Principal Arterial Expressway
Newburgh-Beacon—I84—Urban Principal Arterial Interstate
Bear Mountain Bridge—US202—Urban Principal Arterial (Non-Interstate)
Annual traffic counts were provided by bridge from 1933 to 2018 for each of the selected bridges under the New York State Bridge Authority system.
Data Processing and Exploration
The traffic count data was transformed into AADT data by using the simple average method given by Equation 1. Descriptive statistics for the traffic counts and AADT of each bridge were then computed and the results are presented in Table 1. As seen in the table, the Newburgh-Beacon Bridge recorded the highest and lowest annual volumes of traffic over the study period. Thus, it also registered the highest standard deviation among all the selected bridges. The data used in the actual analysis was from 1963 to 2018 because traffic volume data was not available for all the bridges before this period.
Table 1. Descriptive Statistics for Data
Note: AADT = average annual daily traffic; SD = standard deviation.
To visualize the changes in traffic volume over the years, a time series heat map of all traffic volumes was plotted, as shown in Figure 2. In this plot, and with reference to the color bar at the side of the plot, it can be seen that the Rip Van Winkle, Kingston-Rhinecliff, and Bear Mountain bridges have much lower annual volumes than the Mid-Hudson and Newburgh-Beacon bridges. The actual trends over the years are not very clear for any of the bridges except the Newburgh-Beacon Bridge. To visualize trends more clearly, the data for each bridge were standardized. This process made sure the data from each bridge were comparable with the data from the other bridges, making the data internally consistent. The process of standardization creates compatibility and similarity and keeps measurement errors to a minimum ( 30 ). Standardization was also done to improve the numerical stability of the model to be created and to decrease training time. Gaussian processes assume that the underlying data is normally distributed ( 31 ). The mean of the prior distribution is often set to zero in GPR ( 32 ). Since standardization transforms data to have a mean of zero, it was found to be more appropriate than normalization for this research. The time series heat map of the standardized data is presented in Figure 3. In this figure, we can clearly see increasing traffic volumes over the years for all the bridges. The data was then split for model training (80%) and testing (20%).
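The standardization and the 80/20 split can be sketched as follows. The AADT values are hypothetical, the population standard deviation is an assumption (the text does not say which estimator was used), and the helper names are illustrative.

```python
import random
from statistics import fmean, pstdev


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu, sigma = fmean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def random_split(n, test_fraction=0.2, seed=0):
    """Return sorted (train, test) index lists from a seeded random shuffle."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_test = round(n * test_fraction)
    return sorted(indices[n_test:]), sorted(indices[:n_test])


years = list(range(1963, 2019))
aadt = [9000 + 180 * (year - 1963) for year in years]  # hypothetical values
z = standardize(aadt)
train_idx, test_idx = random_split(len(years))
```

With 56 yearly observations this yields 45 training and 11 testing points.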

Figure 2. Time series heat map of traffic volumes from 1963 to 2018.

Figure 3. Plot of standardized traffic volumes from 1963 to 2018.
Model Fitting
Gaussian Process Models
GPR modeling was implemented using the Scikit-learn library for machine learning in Python ( 33 ). Gaussian processes are implemented in the “Gaussian Process Regressor” and “Gaussian Process Classifier” classes in Scikit-learn. “Gaussian Process Regressor” was employed for regression in this research ( 34 ). The prior was specified in the initial stages of implementation as in Equation 15:
$$f(x) \sim \mathcal{GP}\left(m(x), k(x, x')\right) \qquad (15)$$
where
m(x) = mean function
k(x, x') = covariance function.
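The prior mean was assumed to be the mean of the training data. The covariance function or kernel, k(x, x′), was specified using each of the following kernels, written here in the form used by Scikit-learn:
Radial-basis function (RBF) kernel in Equation 16:
$$k(x_i, x_j) = \exp\left(-\frac{d(x_i, x_j)^2}{2l^2}\right) \qquad (16)$$
where l = length scale and d(xi, xj) = Euclidean distance between xi and xj.
Matérn kernel in Equation 17:
$$k(x_i, x_j) = \frac{1}{\Gamma(\nu)\,2^{\nu-1}}\left(\frac{\sqrt{2\nu}}{l}\,d(x_i, x_j)\right)^{\nu} K_{\nu}\left(\frac{\sqrt{2\nu}}{l}\,d(x_i, x_j)\right) \qquad (17)$$
where ν = smoothness parameter, Γ = the gamma function, and Kν = a modified Bessel function.
Rational quadratic kernel in Equation 18:
$$k(x_i, x_j) = \left(1 + \frac{d(x_i, x_j)^2}{2\alpha l^2}\right)^{-\alpha} \qquad (18)$$
where α = scale mixture parameter.
Exponential-sine-squared (Exp-Sine-Squared) kernel in Equation 19:
$$k(x_i, x_j) = \exp\left(-\frac{2\sin^2\left(\pi\, d(x_i, x_j)/p\right)}{l^2}\right) \qquad (19)$$
where p = periodicity.
Dot-product kernel in Equation 20:
$$k(x_i, x_j) = \sigma_0^2 + x_i \cdot x_j \qquad (20)$$
where σ0 = inhomogeneity parameter.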
The hyperparameters of each kernel were adjusted and then optimized during fitting of the Gaussian Process Regressor.
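A hedged sketch of how such a comparison might be set up in Scikit-learn is shown below. It is not the authors' code. The data are synthetic, the initial hyperparameter values and the every-fifth-point hold-out are assumptions, and `normalize_y=True` is used as one way to center the prior on the training mean. The `alpha` value of 0.0015 is the lower noise level reported in the results.

```python
import warnings

import numpy as np
from sklearn.exceptions import ConvergenceWarning
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import (
    RBF, DotProduct, ExpSineSquared, Matern, RationalQuadratic,
)

# Synthetic stand-in for one bridge: a noisy upward trend, 1963 to 2018.
rng = np.random.default_rng(0)
years = np.arange(1963, 2019, dtype=float)
aadt_std = 0.06 * (years - years.mean()) + 0.1 * rng.standard_normal(years.size)

X = ((years - years.mean()) / years.std()).reshape(-1, 1)
test_mask = np.arange(years.size) % 5 == 0  # roughly 20% held out
X_train, y_train = X[~test_mask], aadt_std[~test_mask]
X_test, y_test = X[test_mask], aadt_std[test_mask]

kernels = {
    "RBF": RBF(length_scale=1.0),
    "Matern": Matern(length_scale=1.0, nu=1.5),
    "RationalQuadratic": RationalQuadratic(length_scale=1.0, alpha=1.0),
    "ExpSineSquared": ExpSineSquared(length_scale=1.0, periodicity=1.0),
    "DotProduct": DotProduct(sigma_0=1.0),
}

results = {}
with warnings.catch_warnings():
    # The optimizer may warn when a hyperparameter ends on a bound.
    warnings.simplefilter("ignore", ConvergenceWarning)
    for name, kernel in kernels.items():
        gpr = GaussianProcessRegressor(
            kernel=kernel, alpha=0.0015, normalize_y=True, random_state=0
        )
        gpr.fit(X_train, y_train)  # hyperparameters are optimized here
        mean, std = gpr.predict(X_test, return_std=True)
        results[name] = {
            "fitted_kernel": gpr.kernel_,
            "mse": float(np.mean((y_test - mean) ** 2)),
            "max_std": float(std.max()),
        }
```

After fitting, `gpr.kernel_` holds the optimized hyperparameters, which can be compared with the initial values passed in `kernels`.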
Autoregressive Integrated Moving Average Models
The Frequentist models to be compared with the Gaussian process models were Box–Jenkins autoregressive integrated moving average (ARIMA) models. ARIMA modeling is an iterative process which involves four stages: (a) identification, (b) estimation, (c) diagnostic checking, and (d) forecasting of time series ( 36 ). The general mathematical equation for ARIMA(p, d, q) models is shown in Equation 21:
$$\hat{y}_t = \mu + \phi_1 y_{t-1} + \dots + \phi_p y_{t-p} - \theta_1 e_{t-1} - \dots - \theta_q e_{t-q} \qquad (21)$$
where
p = number of autoregressive terms,
d = number of nonseasonal differences needed for stationarity,
q = number of lagged forecast errors in the prediction equation,
μ = constant term,
ϕ = autoregressive parameter,
θ = moving average parameter,
e = forecast error, and
y = the dth difference of Y; therefore if d = 1, yt = Yt – Yt–1.
Models were developed for each bridge with different combinations of p, d, and q values from 1 to 4. This resulted in the creation of over 40 models. The best models for each bridge were selected based on an evaluation metric described in the next section and compared with the GPR models.
Akaike’s information criterion (AIC) and the Bayesian information criterion (BIC) were then computed for each model using Equations 22 and 23 ( 37 ). The selected model for each bridge was the one with the lowest AIC and BIC values (37, 38).
where
m = number of the estimated parameters
n = number of the observations.
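For illustration, the sketch below computes both criteria in their common likelihood-based form, AIC = 2m − 2 ln L and BIC = m ln n − 2 ln L, where L is the maximized likelihood. That this is the form used in Equations 22 and 23 is an assumption, and the candidate orders and log-likelihood values are hypothetical.

```python
import math


def aic(log_likelihood, m):
    """Akaike's information criterion for m estimated parameters."""
    return 2 * m - 2 * log_likelihood


def bic(log_likelihood, m, n):
    """Bayesian information criterion for m parameters and n observations."""
    return m * math.log(n) - 2 * log_likelihood


# Hypothetical candidates: (p, d, q) -> (maximized log-likelihood, parameter count).
candidates = {
    (1, 0, 1): (-54.0, 3),
    (2, 0, 2): (-52.0, 5),
    (0, 0, 4): (-52.5, 5),
}
n_obs = 45
scores = {
    order: (aic(ll, m), bic(ll, m, n_obs)) for order, (ll, m) in candidates.items()
}
best_by_aic = min(scores, key=lambda order: scores[order][0])
```

Because BIC penalizes each additional parameter by ln n rather than 2, it tends to favor smaller models than AIC once n exceeds about 7.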
Model Evaluation
The entire dataset for each bridge was initially split, by random sampling, into two sub-datasets: training (80%) and testing (20%). The testing dataset was set aside to evaluate how close the predictions were to the actual values of AADT. The same training and testing datasets were used to develop models using ARIMA and GPR. The best ARIMA model for each bridge was the one with the lowest AIC and BIC values.
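The evaluation metrics used in this research to evaluate and compare the GPR and ARIMA models were the mean square error (MSE), root mean square error (RMSE), and mean absolute percentage error (MAPE). MSE measures the average of the squared differences between the predicted values and the actual values. Because it depends on the scale of the data, it cannot be used to compare accuracy across time series with different scales. RMSE is the square root of the MSE and can be interpreted as the standard deviation of the prediction errors. MAPE measures the mean of the absolute percentage errors of predictions. It is commonly used because it is relatively easy to interpret and explain. Formulas for computing MSE, RMSE, and MAPE are shown in Equations 24–26, respectively:
$$\mathrm{MSE} = \frac{1}{n}\sum_{t=1}^{n}\left(y_t - \hat{y}_t\right)^2 \qquad (24)$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(y_t - \hat{y}_t\right)^2} \qquad (25)$$
$$\mathrm{MAPE} = \frac{100\%}{n}\sum_{t=1}^{n}\left|\frac{y_t - \hat{y}_t}{y_t}\right| \qquad (26)$$
where
y = actual value,
ŷ = predicted value, and
n = number of observations.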
The best method for modeling this data was then selected based on Lewis’ scale of interpretation of estimation accuracy ( 34 ). On this scale, MAPE of less than 10% is a highly accurate forecast, 11%–20% is a good forecast, 21%–50% is a reasonable forecast, and 51% and above is an inaccurate forecast. This scale was used to inform model selection.
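The three metrics and the accuracy scale can be sketched as follows. The function names and sample values are hypothetical. The scale is treated as continuous, so values between the stated bands (for example, 10.5%) fall into the next band up, which is an assumption.

```python
import math


def mse(actual, predicted):
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    return math.sqrt(mse(actual, predicted))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent; actual values must be nonzero."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def lewis_category(mape_percent):
    """Map a MAPE value (in percent) onto the accuracy scale described above."""
    if mape_percent < 10:
        return "highly accurate"
    if mape_percent <= 20:
        return "good"
    if mape_percent <= 50:
        return "reasonable"
    return "inaccurate"


# Hypothetical actual and predicted AADT values.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
mse_value = mse(actual, predicted)
rmse_value = rmse(actual, predicted)
mape_value = mape(actual, predicted)
category = lewis_category(mape_value)
```

MAPE is undefined when an actual value is zero, which is not a concern for bridge AADT but matters for very low-volume sites.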
Results and Discussion
The Rip Van Winkle, Kingston-Rhinecliff, Mid-Hudson, Newburgh-Beacon and Bear Mountain bridges are referred to as Winkle, Kingston, Hudson, Beacon and Bear, respectively, in parts of the discussion and the tables. The priors for some kernels that were specified at the initial stages of GPR have been visualized in Figures 4–6 showing the mean which was assumed to be the mean of the training data. The plots shown are for the Kingston-Rhinecliff Bridge.

Figure 4. Gaussian process regression model for the Kingston-Rhinecliff Bridge using the dot product kernel.

Figure 5. Gaussian process regression model for the Kingston-Rhinecliff Bridge using the rational quadratic kernel.

Figure 6. Gaussian process regression model for the Kingston-Rhinecliff Bridge using the radial-basis function (RBF) kernel.
Five kernels were used to fit models using the testing dataset for each bridge, resulting in 25 models (Table 2). For each of these models, the testing data was reduced by 20% to observe the effect on the MAPE. The overall MAPE decreased by 36% when this was done, highlighting the importance of collecting as much data as possible for model development. Increasing the noise level (alpha) from 0.0015 to 0.15 resulted in an overall MAPE increase of 26%. Model evaluation indicated that the MSE and MAPE values were closest for the Bear Mountain Bridge models. The dot-product kernel produced the lowest MSE value (0.025) among the Bear Mountain Bridge models. On the other hand, the MAPE value was lowest for the model created with the rational quadratic kernel (9%). The worst model created with data from the Mid-Hudson Bridge was fit using the dot-product kernel (MSE = 0.032, MAPE = 16%).
Table 2. Evaluation of Models Created with Different Kernels
Note: MSE = mean square error; MAPE = mean absolute percentage error.
Out of the 25 models, the one fitted with the exp-sine-squared kernel for the Mid-Hudson Bridge gave the lowest MAPE (approximately zero), indicating high accuracy. The RBF, rational quadratic, and Matérn kernels all gave a MAPE of 1% for the Mid-Hudson Bridge models. The Rip Van Winkle Bridge models had higher MSE values than the Mid-Hudson Bridge models and MAPE values between 1% and 3%. The best Rip Van Winkle Bridge model was fitted with the Matérn kernel (MSE = 0.004, MAPE = 1%). Although the MSE values for the Kingston-Rhinecliff Bridge models were comparable with the MSE values of the other models, the MAPE metric revealed that they were the worst performing models, with MAPE values between 5% and 38%. The Newburgh-Beacon Bridge models created with the RBF, rational quadratic, and Matérn kernels had the lowest MSE values and the same MAPE value of 2%. Overall, the Matérn kernel produced the lowest average MSE (0.008) among all kernels, and the rational quadratic kernel produced the lowest average MAPE (5%) for the bridge models. The exp-sine-squared kernel produced the highest average MAPE and MSE values across the bridge models.
AIC values of between 108 and 236 were obtained for the 14 ARIMA models generated. Their BIC values were between 115 and 296. The best ARIMA models selected for each bridge were those with the least AIC and BIC values. The best models for the Kingston-Rhinecliff, Rip Van Winkle, Mid-Hudson, Newburgh-Beacon and Bear Mountain bridges had the following configurations: MA(4), ARIMA(2, 1, 2), ARMA(2, 2), ARMA(1, 1) and ARMA(3, 1), respectively. Their respective AIC values were 115, 115, 108, 114, and 114. Their corresponding BIC values were 125, 121, 120, 115, and 135. The Mid-Hudson Bridge model was the best performing model (MAPE = 23%), and the Bear Mountain Bridge model was the worst performing model (MAPE = 76%) among the selected models. Further details about the models can be found in Table 3.
Table 3. Best ARIMA Models for Each Bridge
Note: ARIMA = autoregressive integrated moving average; CI = confidence interval; ar = autoregressive; ma = moving average; L = lag.
The RMSE, MAPE, and MSE values for the best performing GPR and ARIMA models can be found in Table 4. The results show that the RMSE and MSE values for the GPR models were far lower than the values for the ARIMA models. This difference could also be seen in the MAPE values for both models. A k-fold cross-validation procedure was also undertaken, in which the dataset was divided into five non-overlapping folds. Each of the five folds was used in turn as a held-out test set, while the remaining folds were collectively used as the training dataset. A total of five models were fit and evaluated on the held-out test sets, and the mean performance was reported. The results did not differ by much from the original results obtained before cross-validation (Table 4).
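The fold construction described above can be sketched in plain Python. The function name, the shuffling seed, and the use of 56 observations (one per year from 1963 to 2018) are assumptions for illustration.

```python
import random


def k_fold_indices(n, k=5, seed=0):
    """Yield (train, test) index lists for k non-overlapping folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # fold sizes differ by at most one
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_indices(56, k=5))
```

Each observation appears in exactly one test fold, so averaging the metric over the five folds uses every observation once for evaluation.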
Table 4. Comparison of GPR and ARIMA Models
Note: ARIMA = autoregressive integrated moving average; GPR = Gaussian process regression; MSE = mean square error; RMSE = root mean square error; MAPE = mean absolute percentage error.
Overall, the Mid-Hudson Bridge models from both methods performed best (GPR: 1%, ARIMA: 23%). The worst performing models were the Bear Mountain Bridge models (GPR: 9%, ARIMA: 76%). The MAPE of the best performing ARIMA model was 22 percentage points higher than that of the best GPR model. All GPR models had MAPE values below 10%. On Lewis’ scale of interpretation of estimation accuracy, this indicates highly accurate forecasting power. The MAPEs of the Kingston-Rhinecliff, Rip Van Winkle, and Mid-Hudson ARIMA models fell within Lewis’ range of 21%–50%, indicating reasonable forecasting power. The MAPEs of the Newburgh-Beacon and Bear Mountain bridge ARIMA models were greater than 50%, classifying them as models with inaccurate forecasting power. The highest MAPE recorded across both model types (76%) was far lower than that recorded in a similar study conducted by Wu and Xu ( 15 ), whose highest MAPE was 291.1%. The MAPE range for the GPR models (1%–9%) was also much narrower than the range of 1.31%–57% recorded by Xia et al. ( 12 ). When all models are considered, the MAPE values obtained in this research (1%–76%) had a wider range than those of the aforementioned study. The overall MAPE for the GPR models was 3.6% and that of the ARIMA models was 44%. The data used for the models was collected from 1963 to 2018. Additional predictions were, however, made for the years 2019 and 2020 using the best models for each bridge. The predictions were made with respect to the last AADT recorded (2018). These predictions were made ceteris paribus. The changes in AADT for 2019 and 2020, respectively, were as follows: Kingston-Rhinecliff Bridge (1.9% and 3.4%), Rip Van Winkle Bridge (3.7% and 7%), Mid-Hudson Bridge (2% and 2.7%), Bear Mountain Bridge (–26% and –61%), Newburgh-Beacon Bridge (3% and 5.5%).
Concluding Remarks
This study developed models to estimate AADT of selected bridges in New York using GPR and ARIMA methods. The GPR and ARIMA models were assessed and it was determined that the GPR models provided better forecasts than the ARIMA models. The researchers therefore recommend the use of GPR when estimating AADT for these bridges, especially if a high forecasting accuracy needs to be achieved. The advantage of the proposed method is that extra predictors such as functional class and lane configuration, which are needed for other methods whose predictions rely on multiple variables, are not needed to estimate AADT. The major contribution of this paper is the development of GPR AADT forecasting models exclusively for these selected bridges which, to the best of the authors’ knowledge, has not been undertaken by any other research study. Based on the performance of the GPR model, the authors cautiously conclude that this method is adequate for AADT estimation for bridges that host principal arterial roads. The caution is warranted because of the small number of bridges studied (n = 5). Although good results were obtained using GPR, the authors suggest that future studies should be conducted on data from additional bridge sites with larger sample sizes to ascertain the potency of this method in AADT estimation. Bridges that host roads of other functional classes should be included in these studies. Also, future studies should be conducted using AADTs computed with methods other than the simple average method.
Footnotes
Author Contributions
The authors confirm contribution to the paper as follows: study conception and design: Grace Ashley, Nii Attoh-Okine; data collection: Grace Ashley; analysis and interpretation of results: Grace Ashley, Nii Attoh-Okine; draft manuscript preparation: Grace Ashley, Nii Attoh-Okine. Both authors reviewed the results and approved the final version of the manuscript.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
