Abstract
This paper describes the findings of an independent peer review of the modeling tools used by the Volpe National Transportation Systems Center to forecast national vehicle miles traveled (VMT) over the next 30 years. Overall, the VMT forecasting models, which use autoregressive distributed lag models for light-duty vehicle, single-unit truck, and combination truck VMT, work well to estimate travel demand. All model estimations were reviewed, and all models perform well against several validation and testing techniques. The study team was supported by an expert panel selected from academia, government, and industry with experience in econometric methods, transportation and economic data, and modeling methods. The panel reviewed model documentation as well as the report assessing the VMT forecasting models and provided insight into alternative model research. The paper is an effort to synthesize the approaches and the validation methods used. A complementary literature search was also conducted to test the validity and comparability of several estimated variable coefficients. The paper concludes by summarizing the key findings and making recommendations on future model improvements.
The Federal Highway Administration (FHWA) and the Federal Transit Administration (FTA) produce a biennial report on the conditions and performance (C&P) of the nation’s highway, bridge, and transit systems known as the Status of the Nation’s Highways, Bridges, and Transit. The 23rd Edition was published in December 2019. The transmittal letter for this document to Congress noted the importance of projections of travel as central to the evaluation of investment requirements and transportation infrastructure needs. The letter pledges continuous research and improvement in the tools used to produce the C&P report ( 1 ).
Public policy development in transportation, energy, environment, emergency planning, and other areas relies, in part, on an understanding of future demand for travel, both today and in the near and long term. Future travel levels will affect the performance of our transportation infrastructure systems. Failing to maintain or expand our transportation systems to meet future travel demand efficiently will limit the ability of our economy to expand and meet our needs and will reduce our quality of life. Sound transportation policy and planning requires that analysts can simulate future conditions and plan accordingly. At the federal level, the focus of this planning is on macro-level national annual travel forecasts. The Volpe National Transportation Systems Center (Volpe) vehicle miles traveled (VMT) macro-models fill that need and are developed with long-term, as opposed to near-term, forecasting as the central premise.
Objectives
This paper reports the peer review of the Volpe VMT forecasting models. The first section presents the objectives of the paper, followed by validation and testing methods, and the results of validation exercises. We then present the conclusions of the review and close the paper by presenting how FHWA plans for VMT models to provide VMT projections as an input to other models, such as the Highway Economic Requirements System (HERS), that in turn provide estimates on the conditions and performance of the nation’s transportation system.
Since the Volpe models have a wide application for medium- and long-term policy, FHWA commissioned independent research and a peer review assessment of them. This paper summarizes an independent peer review of the modeling tools developed by Volpe for FHWA. The research was undertaken from October 2019 to July 2020. An independent expert panel made up of academics, government officials, and industry representatives with recognized expertise in economics, transportation modeling, econometrics, and public policy supported the research team. The research team also had unfettered access to the Volpe model developers. The team presented findings and recommendations to the developers, the agency, and the expert panel. This paper’s specific objectives are to provide a review of the models for all three vehicle classes and to demonstrate how each of the models was validated and tested. We feel this research is timely and highly relevant given the long-term and medium-term implications for infrastructure investment decision making.
Current Policy Applications
Travel is a derived demand, a means to achieve other goals such as access to work, leisure, family obligations, or freight movement.
Travel requires road space and users’ (passengers, shippers, and receivers) time and energy. It leads to emissions of criteria pollutants and greenhouse gases and to accidents causing property damage, injury, and loss of life. Various federal agencies are responsible for supporting this demand, as well as addressing the consequences of travel including congestion, energy consumption, safety, and emissions. Some of the more prominent medium- and long-term uses of long-range travel forecasts at the federal level include:
U.S. Department of Transportation: FHWA requires national VMT forecasts for the biennial Status of the Nation’s Highways, Bridges, and Transit Conditions & Performance report (C&P Report). National VMT forecasts are used to support the Travel Analysis Framework (TAF). The National Highway Traffic Safety Administration (NHTSA) Corporate Average Fuel Economy (CAFE) model uses VMT forecasts to estimate future fuel consumption. FHWA uses national VMT forecasts to assess future intelligent transportation systems (ITS) applications, including connected and autonomous vehicle (CAV) potential. VMT forecasts are needed to assess a mileage-based user fee (MBUF) that could be based on the facility, vehicle class, and time of day. Weatherford, for example, discusses the use of VMT forecasts for motivating MBUF ( 2 ).
The Environmental Protection Agency (EPA) is responsible under the Clean Air Act for the regulation of motor vehicle emissions. Motor vehicle emissions are a function of the number and type of vehicles, the fuel type and technology, and the amount and characteristics of travel. VMT is a critical input to estimate expected emissions.
The U.S. Department of Energy (DOE)/Energy Information Administration (EIA) maintains a series of transportation energy models that include forecasts on vehicle travel.
The Volpe National Transportation Systems Center Vehicle Miles Traveled (VMT) Forecast Models
The Volpe Center has been working with the FHWA Policy Office in developing and updating national VMT projection models for the last six years. The VMT forecast period is 30 years forward with interim 10-year period forecasts. The VMT estimation is driven by socio-demographic variables, including fuel consumption and explanatory variables differing by vehicle class, that are forecastable over the 30-year period ( 3 ).
Policy analysts often use scenarios to describe future conditions after a given time. These future conditions could be economic, demographic, or technological, and include prices, preferences, or other factors that could influence policy variables of interest. The policy variables are often linked to key economic and other variables through statistical or econometric models in which the latter enter as independent variables; scenarios then chart the path of some or all of those variables. The use of scenarios allows potential policies to be tested against favorable and unfavorable events. The Volpe VMT models are macro-models that fill that need. They form an aggregate, annual, long-term forecasting suite covering three vehicle classes defined by FHWA: light-duty passenger vehicles (LDV), including automobiles and light-duty trucks (FHWA Vehicle Classes 2 and 3); single-unit trucks (SUT) (FHWA Vehicle Classes 5, 6, and 7); and combination trucks (CT) (FHWA Vehicle Classes 8 through 13). These models jointly facilitate the evaluation of future highway travel under alternative feasible conditions.
Testing and Validation Approach
Model Description and Review
The models provide forecasts of VMT for the United States, disaggregated into three vehicle type categories. These models were developed in 2018 and rely on a data series of VMT collected by FHWA over the period 1969–2016 ( 3 ). The current model is the most recent update and forms the basis for long-term forecasting, not year-to-year forecasts, a notion that builds on the idea of cointegrated series. The preliminary models, by contrast, treated short-term and long-term models separately, whereas the current approach blends both in the same model framework.
Variable Specification
The Volpe models of highway VMT are specified as demand-side models. The central premise of the LDV model is that household demographics, economic characteristics, and the cost of driving are among the central drivers influencing passenger demand. Because of the influence of household demographics, specifically household size, on LDV VMT, this model incorporates population and estimates VMT on a per capita basis.
The SUT and CT models recognize that truck freight is driven by economic activity. The CT model also recognizes the role of deregulation of for-hire trucking services in influencing truck demand. The core framework used is a long-term equilibrium forecast approach. The final model specifications in all three cases use an autoregressive distributed lag (ARDL) model structure, which was selected over several other specifications, including a vector error correction model (ECM) and a vector autoregressive (VAR) model. Given the long-term forecast thrust, models that have equilibrium properties over the long term, and can also address fluctuations in key inputs, are preferred. The ARDL is such a long-run equilibrium model. It is the preferred model structure when variables are integrated of order zero (I[0]), integrated of order one (I[1]), or a mix of both, but not integrated of order two (I[2]) ( 4 ). (An I[0] series is a non-integrated, or stationary, series, while an I[1] series is one whose first difference is stationary.) The ARDL model allows different lag lengths for different variables and allows the effect of each variable to be decomposed into short-run and long-run components.
The Volpe specification and elimination process started with a list of 300 possible variables, including their proxy representations, spanning at least seven broad categories of explanatory variables for each of the three vehicle categories. For the full list, the interested reader is directed to the Volpe report ( 3 ). Three sets of candidate explanatory variable categories were finally considered:
Demographic variables: population is used for the LDV class (VMT is normalized on a per capita basis)
Economic activity variables reflecting the derived demand nature of transport (vary for each vehicle class)
Cost of driving (reflecting transportation costs)
The process of filtering down to the final specification was based on:
Plausibility of the arithmetic signs and magnitudes of the estimated coefficients
Precision and statistical significance of estimated coefficients
Evidence of serial correlation in model residuals
Goodness of fit of the overall model
Mean absolute percent error (MAPE) and other indicators of accuracy for in- and out-of-sample performance ( 3 )
The filtered variables found to be cointegrated with VMT are included in both level and differenced form to distinguish between their short- and long-run effects on VMT, while those not found to be cointegrated with VMT are included only in differenced form. Cointegration was assessed through pretests on the data using standard augmented Dickey–Fuller tests. ARDL methods do not strictly require such pretesting, however, which makes them the preferred approach with I(1) and/or I(0) variables ( 2 ).
Model Specification, Estimation, and Sample Sizes
In the aggregate models developed by Volpe, the sample sizes are relatively small, with 47 annual observations for LDV equations and 43 observations for SUT and CT equations. In such cases, the ARDL specification may also be more robust ( 4 , 5 ). Given the ARDL model structure, the filtered list of variables provides the required parsimony in specification, since such models perform best when only a few key variables are considered. Parsimony is also important to reduce input error associated with the included variables and to limit the number of variables for which reliable forecasts are required ( 4 , 5 ). The distributed lag aspect of ARDL models imposes a partial adjustment mechanism, via lags, on the long-run adjustment process in the presence of shocks. The model takes a sufficient number of lags to capture the data-generating process in a general-to-specific modeling framework. A dynamic ECM can be derived from the ARDL through a simple linear transformation. The ECM integrates the short-run dynamics with the long-run equilibrium without losing long-run information and avoids problems such as spurious relationships resulting from non-stationary time series data. Starting from a simple model (Equation 1), where x and z represent predictors of yt at time t, the ARDL ECM can be shown to be a reparameterization (Equation 2). The error term ε is assumed to be serially uncorrelated. The first part of the equation, with β and γ, represents the short-run dynamics of the model. The second part, with ρi, represents long-run relationships. These relationships include lagged values of VMT and the other explanatory variables described below in further detail. This dual representation is the hallmark of ARDL models, since the short- and long-run contributions of a variable to VMT can be separated. The null hypothesis in such equations is generally that ρi=0, meaning that a long-run relationship between the variables does not exist. Therefore, rejecting the null points to evidence of long-run cointegrating relationships between the variables.
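Equations 1 and 2 are not reproduced in this text, so the following is only a generic sketch of the reparameterization for a one-lag case. The levels coefficients a_0, a_1, b_0, b_1, c_0, and c_1 are illustrative names assumed here; β, γ, and ρ_i follow the notation of the paragraph above.

```latex
\begin{align}
y_t &= a_0 + a_1 y_{t-1} + b_0 x_t + b_1 x_{t-1} + c_0 z_t + c_1 z_{t-1} + \varepsilon_t \\
\Delta y_t &= a_0 + \beta\,\Delta x_t + \gamma\,\Delta z_t
             + \rho_1 y_{t-1} + \rho_2 x_{t-1} + \rho_3 z_{t-1} + \varepsilon_t \\
\beta &= b_0, \quad \gamma = c_0, \quad \rho_1 = a_1 - 1, \quad
\rho_2 = b_0 + b_1, \quad \rho_3 = c_0 + c_1
\end{align}
```

The second line follows from subtracting y_{t-1} from both sides of the first and writing b_0 x_t + b_1 x_{t-1} as b_0 Δx_t + (b_0 + b_1) x_{t-1}, and likewise for z. Under this sketch, the long-run effect of x on y is −ρ_2/ρ_1 = (b_0 + b_1)/(1 − a_1), which is defined only when ρ_1 ≠ 0.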
The final LDV variables specification used by Volpe includes lagged specifications of personal disposable income in first-differenced form. It also includes two variables with nonlinearities: personal disposable income per capita squared and the squared first difference of personal disposable income per capita. (Volpe found that a quadratic relationship for the income variable was their preferred specification. To explore this further, the JFA team re-ran the LDV ARDL model with variations of the income variable. JFA tried cubic and linear specifications, which led to multicollinearity in the former and statistical insignificance of some variables in the latter. Thus, JFA confirmed that the quadratic relationship was preferred for econometric reasons. As for the economic intuition behind the quadratic term, the income-squared term potentially captures the opportunity costs of travel associated with income, which could reflect the value of time and also car ownership factors. Income has two effects on household travel demand and vehicle use: first, rising income increases the demand for household members to participate in activities away from home, which increases travel; and second, rising income increases the opportunity cost of spending time traveling, which reduces travel. Beyond some level of income, the second effect begins to counteract the first. In addition, transit variables were considered as part of the specification but failed to be significant in the aggregate specifications. Further details are in the Volpe report [ 3 ]. The fuel cost variable also represents two aspects, since it is the cost of fuel divided by the fuel economy. The first reflects the transport cost and the second reflects the rebound effect from an improvement in fuel economy.)
The first difference of consumer sentiment, published by the University of Michigan, is also included ( 6 ). Fuel cost per mile is included as a long-run variable (Table 1).
Variables Included in the VMT Model Suite: LDV, SUT, CT
Note: LDV = light-duty vehicle; VMT = vehicle miles traveled; SUT = single-unit truck; CT = combination truck; LD = lagged first difference; FHWA = Federal Highway Administration; FRED = Federal Reserve Economic Data; CPM = cost per mile; PC = per capita.
IHS Markit is an international market intelligence firm that provides economic and industry models. More information can be found here: https://ihsmarkit.com/about/index.html. EIA is part of the US Department of Energy and the foremost energy modeling organization. More information on the EIA energy forecasts can be found here: https://www.eia.gov/outlooks/aeo/nems/documentation/
For the CT category, the final variables specification includes long-run variables for goods imports and exports and diesel fuel cost per mile. Short-run variables include a lagged dependent variable, an indicator variable to account for the deregulation of for-hire trucking services by the Federal Motor Carrier Act of 1980, a 2007 to 2008 structural break indicator (reflecting data reporting revisions in that period), and interactions of the regulation indicator with the fuel cost variable and with the international trade flows variable.
The SUT long-run variables were other non-durable goods consumption and the diesel fuel cost per mile. Short-run variables (first-differenced) include lagged VMT, other non-durable goods consumption, and the structural break indicator (Table 1). The final model equations used are as follows:
$$\text{LDV VMT per capita}_t = f\left(\text{LDV VMT per capita}_{t-1},\ \text{fuel cost per mile},\ \text{real disposable personal income},\ (\text{real disposable personal income})^2,\ \text{consumer sentiment}\right)$$
$$\text{SUT VMT}_t = f\left(\text{consumption of other non-durable goods},\ \text{SUT VMT}_{t-1},\ \text{diesel cost per mile},\ \text{structural break dummy}\right)$$
$$\text{CT VMT}_t = f\left(\text{goods exports and imports},\ \text{diesel cost per mile},\ \text{deregulation dummy},\ \text{CT VMT}_{t-1},\ \text{structural break dummy}\right)$$
Model Scenario Forecasts and Treatment of Uncertainty
The Volpe analysis relies on several forecasted variables, some of which are classified with three scenarios for each forecast (baseline, pessimistic, and optimistic). These are 30-year average annual forecasted variables, including growth of gross domestic product (GDP), fuel price, population, and consumer spending (on goods). Table 1 details their data sources. Volpe developed an in-house forecast of cost of driving using IHS and EIA data. Volpe also developed several non-scenario-based in-house forecasts for other variables. For the scenario-based forecasts, the growth rates seem reasonable under ordinary circumstances. Before the current COVID-19 pandemic, the United States had not experienced such drastic negative employment growth since the interstate highway system was put in place. While monthly employment growth is negative at the time of writing this report, it is likely that within a couple of years the United States will return to a more “normal” trajectory for employment, GDP, and consumer spending growth. IHS forecasts fuel prices to increase by 1% annually for the baseline, 1.5% for the optimistic scenario, and 0.5% for the pessimistic scenario. The optimistic scenario may be most likely to describe the past pattern of gas prices, with a roughly 56% increase over 30 years, implying an average price per gallon in 2020 of $1.91. As the United States climbs out of the COVID-19 pandemic, the implied average gas price change over the past 30 years from the EIA optimistic scenario seems reasonable.
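The "roughly 56% increase over 30 years" figure can be checked by compounding the stated annual growth rates. The sketch below assumes simple annual compounding at a constant rate; the function and variable names are illustrative and not part of the Volpe tooling.

```python
def cumulative_growth(annual_rate, years=30):
    """Cumulative proportional increase from compounding a constant annual rate."""
    return (1.0 + annual_rate) ** years - 1.0

# Annual fuel price growth rates quoted in the text for the three scenarios.
scenarios = {"pessimistic": 0.005, "baseline": 0.010, "optimistic": 0.015}

for name, rate in scenarios.items():
    # Print the 30-year cumulative increase implied by each annual rate.
    print(f"{name}: {cumulative_growth(rate):.1%} over 30 years")
```

Under this assumption, the 1.5% optimistic rate compounds to an increase of about 56%, consistent with the figure quoted above.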
Validation and Testing
Model Used and Implied Restrictions
As a first step in the validation and testing process, the error-corrected version of the three ARDL models and the equivalent levels equations were estimated. This step helps corroborate the parameter restrictions across models (Equations 1, 2). For the VMT models the error-corrected equation and the equivalent levels equation are shown again in Equations 3 and 4, respectively. The error-corrected equation (Equation 3) includes both long-run and short-run components reflected by the coefficients. The equivalencies imply that coefficients of the error-corrected version must satisfy specific restrictions shown in Equations 5 and 6. This required that both versions of the model be estimated and tested. The estimates for the LDV VMT per capita, SUT VMT, and CT VMT are shown in Tables 2 to 4, respectively. The L. operator indicates a one-period lag, D. indicates a first difference, LD. indicates a lagged first difference, and L2. indicates the second lag. For the Volpe models, yt = logarithm of VMT by class in period t and xt = logarithm of independent variables by model class in period t.
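The operator prefixes and the levels/error-corrected equivalence can be illustrated with a small sketch. It assumes a one-lag model with a single regressor, y_t = a0 + a1·y_{t-1} + b0·x_t + b1·x_{t-1}, and made-up coefficients and data; none of the numbers are Volpe estimates, and the helper names are hypothetical.

```python
def L(series, k=1):
    """k-period lag (L. for k=1, L2. for k=2); None where no lagged value exists."""
    return [None] * k + series[:-k]

def D(series):
    """First difference (D.): series[t] - series[t-1]."""
    return [None] + [b - a for a, b in zip(series, series[1:])]

def LD(series):
    """Lagged first difference (LD.): the lag operator applied to D."""
    return L(D(series))

# Hypothetical levels-form coefficients and a hypothetical log regressor.
a0, a1, b0, b1 = 0.5, 0.8, 0.3, -0.1
x = [10.0 + 0.02 * t for t in range(12)]
y = [12.0]
for t in range(1, len(x)):
    y.append(a0 + a1 * y[t - 1] + b0 * x[t] + b1 * x[t - 1])

# Implied error-corrected coefficients under the assumed one-lag form.
rho1, rho2, beta = a1 - 1.0, b0 + b1, b0
dy, ly, lx, dx = D(y), L(y), L(x), D(x)

# Largest gap between D.y and the error-corrected right-hand side.
max_gap = max(
    abs(dy[t] - (a0 + rho1 * ly[t] + rho2 * lx[t] + beta * dx[t]))
    for t in range(1, len(x))
)
long_run_effect = -rho2 / rho1  # long-run response of y to x
print(max_gap, long_run_effect)
```

Because the two forms are algebraically identical in this setup, the gap is zero up to floating-point error, which is the sense in which estimating both versions corroborates the parameter restrictions.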
Data Used and Corroboration of Long-Run Models
The analysis and verification relied on the LDV, SUT, and CT VMT data series obtained from Volpe, together with all the independent variables, some of which were collected and verified by the paper authors. The final dependent and independent variables are shown in Table 1. The estimated parameters for each of the three models are shown in Tables 2–4 in error-corrected and levels form. Explanations for the coefficients follow Tables 2–4. The optimal number of lags in each case is determined by the Bayesian information criterion (BIC). All three models have adjustment coefficients that are negative and significant. In addition, all the adjustment coefficients are less than one in absolute value, indicative of a dynamically stable process in the long run. The coefficient of lagged VMT implies the longest adjustment period for LDV VMT, approximately five years for a system shock to return to long-run equilibrium; SUT and CT have shorter adjustment periods (4.3 years and 2.6 years, respectively). The longer period for LDV reflects underlying variables, like income, that are variable and dynamic because of wide variations across the nation. These results point to inertia or habit persistence effects in travel demand.
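The text does not state how the adjustment periods were computed. One common rule of thumb, assumed here purely for illustration, reads the period as the reciprocal of the absolute adjustment coefficient; a related measure is the half-life of a shock. The coefficients produced below are back-calculated from the reported periods under that assumption and are not the estimates in Tables 2–4.

```python
import math

def mean_adjustment_years(alpha):
    """Rule-of-thumb adjustment period 1/|alpha| for a stable coefficient -1 < alpha < 0."""
    if not -1.0 < alpha < 0.0:
        raise ValueError("expected an adjustment coefficient between -1 and 0")
    return 1.0 / abs(alpha)

def half_life_years(alpha):
    """Years for half of a deviation from equilibrium to dissipate."""
    if not -1.0 < alpha < 0.0:
        raise ValueError("expected an adjustment coefficient between -1 and 0")
    return math.log(0.5) / math.log(1.0 + alpha)

def implied_alpha(years):
    """Adjustment coefficient implied by a reported period under the 1/|alpha| rule."""
    return -1.0 / years

# Adjustment periods quoted in the text (years).
reported_periods = {"LDV": 5.0, "SUT": 4.3, "CT": 2.6}
for model, years in reported_periods.items():
    alpha = implied_alpha(years)
    print(f"{model}: implied alpha = {alpha:.3f}, half-life = {half_life_years(alpha):.1f} years")
```

A coefficient closer to zero means slower correction, which is why the LDV model, with the longest reported period, would have the smallest implied coefficient in absolute value.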
The ARDL LDV VMT Model in Logs: Levels and Error-Corrected Models
Note: LDV = light-duty vehicle; VMT = vehicle miles traveled; CT = combination truck; MAPE = mean absolute percent error; CUSUM = cumulative sum; FRED = Federal Reserve Economic Data; CPM = cost per mile; PC = per capita; JCSMICH = University of Michigan Consumer Sentiment Index; RDPI = real disposable personal income; SQ = squared term; LN = natural log.
All coefficients are significant at 5% level of significance except the constant.
Cumby Huizinga Test (Autocorrelation) = 0.515; p-value = 0.4779 (H0 of no serial correlation not rejected; chi-square, 1 degree of freedom)
CUSUM: Test statistic = 0.6877 (not rejected at 1%)
In-Sample Performance 1974 to 2016 MAPE error: 0.69%
Out-of-sample performance: 2006 to 2016: MAPE error 3.15%; 2011 to 2016: MAPE error 0.13%
Pesaran Shin Smith Bounds Test = F = 10.311 t = −4.708; F and t values greater than critical values at 10%, 5%, and 1%.
    |      10%        |       5%        |       1%        |     p-value
    |  I(0)    I(1)   |  I(0)    I(1)   |  I(0)    I(1)   |  I(0)    I(1)
----+-----------------+-----------------+-----------------+----------------
  F |  2.831   4.017  |  3.437   4.770  |  4.851   6.507  |  0.000   0.000
  t | −2.532  −3.420  | −2.873  −3.804  | −3.561  −4.567  |  0.000   0.007
The ARDL SUT VMT Model in Logs: Levels and Error-Corrected Models
Note: LDV = light-duty vehicle; VMT = vehicle miles traveled; ARDL = autoregressive distributed lag; SUT = single-unit truck; CT = combination truck; MAPE = mean absolute percent error; CNOR = consumption of other non-durable goods; CUSUM = cumulative sum.
All coefficients are significant at 5% level of significance except the constant.
Cumby Huizinga Test (Autocorrelation) = 1.75; p-value = 0.186 (H0 of no serial correlation not rejected; chi-square, 1 degree of freedom)
CUSUM: Test statistic = 0.1482 (not rejected at 1% significance level); Critical value = 1.143
In-Sample Performance MAPE error: 2.63%
Out-of-sample performance: 2011 to 2016: MAPE error 4.52%
Pesaran Shin Smith Bounds Test: F = 23.3, t = −3.51. The F statistic exceeds the critical values at 10%, 5%, and 1%; the t statistic lies beyond both critical bounds only at 10% (p-values of 0.012 and 0.058 for the I(0) and I(1) bounds).
    |      10%        |       5%        |       1%        |     p-value
    |  I(0)    I(1)   |  I(0)    I(1)   |  I(0)    I(1)   |  I(0)    I(1)
----+-----------------+-----------------+-----------------+----------------
  F |  3.278   4.366  |  4.028   5.258  |  5.792   7.333  |  0.000   0.000
  t | −2.548  −3.212  | −2.888  −3.586  | −3.576  −4.331  |  0.012   0.058
Post Estimation Tests and Tests for Parameter Stability
Tables 2 to 4 provide summary statistics for two post-estimation tests that should be satisfied to ensure a robust specification. In the context of ARDL models, a valid specification requires that there is no residual autocorrelation. This is assessed with the Cumby–Huizinga autocorrelation statistic ( 7 ), which is distributed as chi-square with one degree of freedom. Since the p-values are large, the null hypothesis of no serial correlation cannot be rejected, so there is no evidence of residual autocorrelation. Because the forecasts extend 30 years, we tested parameter stability as an added validation test. Tables 2–4 also show the cumulative sum of error (CUSUM) test statistics, which suggest that all models satisfy stability tests at the 95% confidence level. This suggests that the specification overall is well behaved. However, the CT and LDV models do not perform as well with respect to cumulative sum of squared error tests. This latter test suggests room for improvement in the specifications of both these models.
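The reported Cumby–Huizinga p-values can be cross-checked against the chi-square distribution with one degree of freedom. The sketch below uses the identity that a chi-square(1) variable is a squared standard normal, so its upper-tail probability is erfc(sqrt(s/2)); the function name is illustrative, and the statistics are the rounded values reported in the table notes.

```python
import math

def chi2_sf_1df(stat):
    """Upper-tail probability P(X > stat) for a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))

# (test statistic, reported p-value) as given in the notes to the model tables.
reported = {"LDV": (0.515, 0.4779), "SUT": (1.75, 0.186), "CT": (0.294, 0.5875)}

for model, (stat, p_reported) in reported.items():
    # Compare the recomputed tail probability with the p-value in the notes.
    p_check = chi2_sf_1df(stat)
    print(f"{model}: recomputed p = {p_check:.4f}, reported p = {p_reported}")
```

Small differences between recomputed and reported values are expected because the statistics are rounded; in every case the p-value is far above conventional significance levels, so the null of no serial correlation is not rejected.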
Validation Checks: Fuel/Diesel Cost Per Mile-All Models
The fuel price elasticities partially reflect the rebound effect associated with VMT in the long run, suggesting that fuel cost changes occurring via improvements in fuel economy may stimulate additional driving and VMT. Fuel price elasticity is a measure of the sensitivity of the dependent variable to a change in the price of fuel. The rebound effect refers to the impact on the dependent variable when higher fuel efficiency reduces the cost of travel. The negative and significant coefficient in the LDV forecast equation (long-run growth elasticity = −0.146) suggests that a decrease in fuel costs can lead to an increase in VMT growth in the long run because of the combined effect of fuel prices and changes in fuel economy. The SUT and CT rebound effects can be observed in the coefficients of −0.252 and −0.127, respectively. The LDV estimates are consistent with the positive rebound effects in the short run (0.16) and long run (0.22) observed in the literature ( 8 ). The CT and, to some extent, the SUT coefficients also find support in the literature, as reported by Leard et al. ( 9 ) in their working paper exploring rebound effects associated with trucks.
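To give a sense of magnitude, the sketch below treats each reported coefficient as a constant long-run elasticity in a log-log relationship, which is an assumption made here for illustration, and computes the implied VMT response to a hypothetical 10% fall in fuel cost per mile.

```python
def pct_change_in_vmt(elasticity, pct_change_in_cost):
    """Proportional VMT change implied by a constant-elasticity (log-log) relationship."""
    return (1.0 + pct_change_in_cost) ** elasticity - 1.0

# Long-run fuel cost coefficients quoted in the text, read here as elasticities.
elasticities = {"LDV": -0.146, "SUT": -0.252, "CT": -0.127}

cost_change = -0.10  # hypothetical 10% decrease in fuel cost per mile
for model, e in elasticities.items():
    # Print the implied long-run VMT response for each vehicle class.
    print(f"{model}: {pct_change_in_vmt(e, cost_change):+.2%} VMT")
```

Under these assumptions, a 10% reduction in fuel cost per mile corresponds to a long-run VMT increase of roughly 1.5% for LDV, 2.7% for SUT, and 1.3% for CT.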
Income Response: LDV Model
Volpe indicated they were trying to capture changes in the opportunity costs of driving by including both the linear and the squared income terms in the LDV model. This can be associated with the empirical observation that household vehicle use rises with household income until about $100,000, then levels off and even declines slightly ( 10 – 12 ). This aspect of validation included trying linear and cubic specifications for the income term, but both were inferior in model performance relative to the quadratic specification. Income variables are found to contribute approximately 16% of the overall LDV VMT dynamics over the period from 1970 to 2016 (the model period we were tasked with validating).
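As a hedged illustration of how a quadratic income term can produce a response that rises and then levels off, suppose, purely for exposition, that the income part of the equation is θ_1 I + θ_2 I² with θ_1 > 0 and θ_2 < 0, where I is the income measure used in the model. The symbols θ_1 and θ_2 and this reduced form are assumptions, not the estimated Volpe specification.

```latex
\begin{align}
g(I) &= \theta_1 I + \theta_2 I^2 \\
\frac{\partial g}{\partial I} &= \theta_1 + 2\,\theta_2 I \\
I^{*} &= -\frac{\theta_1}{2\,\theta_2}
\end{align}
```

The marginal effect of income is positive below I* and negative above it, so the contribution of income to VMT peaks at I*. A linear specification cannot represent such a turning point, which is one way to read the preference for the quadratic form.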
Deregulation and Structural Break Effects: Truck Models
The two truck models, the SUT and CT models, include additional variables to address sources of structural breaks in the data, a critical element of time series data. Volpe’s development of the models considered two sources of possible structural breaks: policy-induced changes (e.g., deregulation) and changes in the data-generating process. The ability to identify such breaks was based on having strong priors about when each of those occurred, as there needs to be a hypothesis about the existence and source of breaks to inform the search for them. A remaining issue is the post-2008 period, which offers too few observations (only eight to nine years) to consider ARDL models.
Changes in the data-generating process comprise the first structural break affecting the CT and SUT models. This break is captured by including a dummy variable, which is highly statistically significant in both models (CT Model: 0.201; SUT Model: 0.329), for the years 2007 to 2008.
The potential structural break affecting LDV is the 2008 change in the upward trend in LDV VMT. The truck reclassification refers to the period before 2009, when FHWA’s classification was primarily based on the Vehicle Inventory and Use Survey (VIUS).
With respect to the combination truck model, the regulation indicator is large (relative to the standard error), negative, and strongly significant, supporting the effects of motor carrier deregulation on competition in the CT markets. An interaction term with fuel costs per mile (LN_DIESEL_CT_CPM-REG) is included in the model to obtain an unbiased estimate of the coefficient on the regulation indicator variable and to capture any of its influence on VMT equation slope post regulation (instead of just the intercept). This interaction coefficient (Table 4) is not significant. Fuel prices were very high during the regulatory era, especially in the 1970s, and it is expected that regulation would be associated with lower truck VMT and higher fuel costs per mile than in a more competitive market that accompanied deregulation. Finally, the interaction of the regulation with net exports is positive and significant, suggesting effects of increased truck trade volume in response to the growth of the trading sector in the United States.
The ARDL CT VMT Model in Logs: Levels and Error-Corrected Models
Note: VMT = vehicle miles traveled; ARDL = autoregressive distributed lag; CT = combination truck; MAPE = mean absolute percent error; CUSUM = cumulative sum; IMP = import; EXP = export; REG = regulation indicator; CPM = cost per mile; PC = per capita; LN = natural log.
Note: All coefficients are significant at 5% level of significance except the constant.
Cumby Huizinga Test (Autocorrelation) = 0.294; p-value = 0.5875 (H0 of no serial correlation not rejected; chi-square, 1 degree of freedom)
CUSUM: Test statistic = 0.208 (not rejected at 1% significance level); Critical value = 1.143
In-Sample Performance MAPE error: 1.62%
Out-of-sample performance: 2011 to 2016: MAPE error 4.195%
Pesaran Shin Smith Bounds Test = F = 39.11 t = −6.59; F and t values greater than critical values at 10%, 5%, and 1%.
    |      10%        |       5%        |       1%        |     p-value
    |  I(0)    I(1)   |  I(0)    I(1)   |  I(0)    I(1)   |  I(0)    I(1)
----+-----------------+-----------------+-----------------+----------------
  F |  3.265   4.375  |  4.017   5.276  |  5.792   7.381  |  0.000   0.000
  t | −2.539  −3.204  | −2.882  −3.581  | −3.576  −4.335  |  0.000   0.000
Backcast and Forecast Performance
The backcasting validation technique shows how well the estimated equations for each model reflect the past. For the LDV model, 1969 is the only year for which a backcast could be developed. Comparing the backcast value to the model estimate of predicted ln(VMT per capita) shows a small difference of −0.01, indicating that the estimated LDV equation reflects past changes in VMT quite well. The CT and SUT models are backcast for the period from 1970 to 1973. The differences between the backcast results for the change in log CT VMT and the model estimate of predicted log CT VMT range from 0.05 to 0.09. Though these differences are larger than those of the LDV model, they still show that the CT equation is effective in reflecting the past. The differences between the change in VMT and the model estimate of predicted SUT ln(VMT), ranging from 0.03 to 0.08, are also larger than that of the LDV model, but still indicate that this model is able to reflect past VMT well.
Figures 1 to 3 provide an assessment as to how the model performs for the backcast period, model period, and forecast period (2017–2018) in all three cases. Along the y-axes are the logarithms of the predicted and actual VMT variables in each case. The x-axis shows the time period. The figures also show the overall mean absolute prediction error (MAPE) as evidence of model in-and out-of-sample performance. Several sub-periods are tested separately as well. The evidence points to better levels of in- and out-of-sample performance for the LDV models (overall MAPE error=.104); and lower for the SUT and CT models (MAPE SUT=.238) (MAPE CT=0.147). In all cases, the in- and out-of-sample MAPE errors are roughly similar, providing confidence in the validation efforts. To further support this assessment, in- and out-of-sample performance via MAPE errors is reported for rolling periods for all three models (Tables 2–4).
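For readers who wish to reproduce this kind of check, the MAPE calculation and its rolling-period variant can be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the window length, and the sample values are hypothetical and are not drawn from the Volpe data or from the study's code.

```python
def mape(actual, predicted):
    """Mean absolute percent error, expressed in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


def rolling_mape(actual, predicted, window):
    """MAPE for each consecutive block of `window` observations."""
    return [
        mape(actual[i:i + window], predicted[i:i + window])
        for i in range(len(actual) - window + 1)
    ]


# Hypothetical values for illustration only; not taken from the Volpe data.
actual = [100.0, 102.0, 104.0, 103.0, 106.0]
predicted = [99.0, 103.0, 104.0, 105.0, 105.0]

overall = mape(actual, predicted)
by_window = rolling_mape(actual, predicted, window=3)
print(overall, by_window)
```

Comparing the entries of `by_window` with one another, and with `overall`, mirrors the comparison of rolling-period errors described above: similar values across windows suggest that performance does not depend heavily on the period chosen.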

Ln (LDV per capita) VMT Predicted Versus Actual.

Ln (CT) VMT Predicted Versus Actual.

Ln (SUT) VMT Predicted Versus Actual.
Conclusions
In general, the peer review found that the Volpe models work reasonably well for the period considered: 1966 to 2018 (three periods are lost in modeling because of the introduction of lags). Because any forecasting model can be improved, the review offers several findings and recommendations for FHWA, Volpe, or both to consider as they revise their VMT forecasting models:
The National VMT forecasting models applied economic and transportation theory to develop rigorous long-run forecasting models using ARDL models for LDV, SUT, and CT VMT.
All models performed well in forecasting and backcasting tests for the periods through 2018.
The LDV model accounts for 85% to 90% of travel and should be a candidate for alternative model specifications. Nonlinearity with respect to income in the LDV market argues for techniques that would expand the sample size of the LDV equation in the future. The model as estimated performs very well. Because it explains a large portion of VMT, the goodness of fit, the signs and sizes of the coefficients, and the predictive performance of this equation are very important. The team evaluated the equation as estimated, along with its backcasting and forecasting ability, and concluded that the MAPEs drawn from different rolling periods are low and roughly similar in size. A significant literature demonstrates the link between VMT and GDP ( 13 , 14 ). LDV VMT will, therefore, play a significant role in GDP, and monitoring forecasts will be important to ascertain the extent to which nonlinearities in income response continue to hold.
The SUT and CT models accounted for a smaller share of travel over the period examined (1966–2018), yet changes made in one model will call for similar adjustments in the others if aggregate VMT growth is to be inferred. The limited observations available for the CT and SUT models will therefore also call for techniques that expand sample sizes post-2009.
Behavioral changes driven by factors such as e-commerce may influence the SUT and LDV shares of aggregate VMT, and these trends will need to be monitored.
Users of the forecast estimates should recognize that the forecasting tools as designed here produce long-term (20–30 year) forecasts of VMT, because the planning horizon that FHWA faces involves long-term investment planning and related analysis of Highway Trust Fund revenues. Caution is therefore warranted when using them in short-term planning contexts and applications.
In all cases, the Pesaran Shin Smith Bounds Test ( 15 ) confirms the presence of long-run cointegrating relationships, combined with statistically significant long-run coefficients as seen in the ARDL error correction equations (Tables 2–4). This result, along with additional stability tests, supports our conclusion that the models perform well. The final report for this study provides recommendations and future research along two lines of inquiry: a) changes in specification and estimation; and b) changes in the scenarios considered, particularly to account for the influence of COVID-19 on several variables. The final report will be available from FHWA when released.
FHWA Plans for Future VMT Modeling
The FHWA VMT forecasting models provide vital forecasts of VMT for the entire United States. The process that the Volpe Center used to develop vehicle travel forecasting models for the FHWA was described in previous paragraphs. The purpose of these models is to allow FHWA to forecast future changes in the use of passenger and freight vehicles (as measured by the number of VMT) that are likely to occur in response to predicted changes in future economic conditions and demographic trends. Forecasts of VMT developed using these models inform and support the development of future transportation plans and policies by the Federal government and other transportation policy makers.
FHWA plans to use the vehicle travel forecasting models to help support further study in coordinating agencies. FHWA’s VMT forecast is used to support the Highway Economic Requirements System (HERS) modeling for the C&P report. The VMT forecast is the official estimate of future traffic volumes and consequently influences the HERS model’s estimates of investment costs and benefits over time.
The national VMT forecast is used to calibrate the state-level forecasts that are embedded in the “Future AADT” (average annual daily traffic) fields of Highway Performance Monitoring System section data. An adjustment factor is then applied to ensure that the sum of section-level forecasts matches the national VMT forecast. Currently, this adjustment is done at the national level, but it is feasible to begin using a more nuanced state-level approach.
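The adjustment described above amounts to a proportional scaling, which can be sketched as follows. This is a minimal illustration under stated assumptions, not FHWA's procedure: it assumes each section-level forecast has already been expressed as VMT, the function names and section identifiers are hypothetical, and the values are invented for the example.

```python
def adjustment_factor(section_forecasts, national_forecast):
    """Single national-level factor that scales section forecasts to the national total."""
    total = sum(section_forecasts.values())
    if total <= 0:
        raise ValueError("section forecasts must sum to a positive value")
    return national_forecast / total


def reconcile(section_forecasts, national_forecast):
    """Apply the same factor to every section so the sum matches the national forecast."""
    factor = adjustment_factor(section_forecasts, national_forecast)
    return {sid: vmt * factor for sid, vmt in section_forecasts.items()}


# Hypothetical section-level VMT forecasts and national control total.
sections = {"A": 40.0, "B": 35.0, "C": 25.0}
adjusted = reconcile(sections, national_forecast=110.0)
print(adjusted)
```

Because one factor is applied to all sections, relative differences between sections are preserved. A state-level approach of the kind mentioned above would instead compute a separate factor for each state's sections against a state control total.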
The three VMT models should be updated to improve their specification and estimation by considering panel estimation methods for the aggregate specifications, followed by a systematic method of reconciling the resulting estimates of VMT at the national level. This approach would improve the specification and estimation of LDV VMT in particular, allowing for continued exploration of nonlinearities in income-related purchases and driving behavior while increasing the degrees of freedom. A similar approach is also recommended for SUTs and CTs to allow greater representation of data following the FHWA reclassification of data and, for CTs specifically, following deregulation. Other approaches for VMT models, such as a system of equations method, should become feasible as more VMT data are collected. This would permit a more granular understanding of e-commerce and of how behavioral trends affect VMT, particularly modal shifts in LDV VMT. Taken together, more VMT data and associated models that capture greater levels of complexity would improve the long-term forecasting capabilities of the individual and aggregate VMT models. Because VMT also has the potential to serve as a leading indicator of GDP, it is equally important to focus on improving short-term VMT forecasting.
FHWA will seek collaboration with DOE/EIA and EPA, two other agencies that use VMT forecasts in their work.
Footnotes
Acknowledgements
The research was completed under the direction of Tianjia Tang, Chief of the Travel Monitoring and Surveys Division; Clayton Clark was the project manager in the FHWA Office of Policy and International Affairs. The model developers, Dr. Donald Pickrell, Dr. David Pace, and Jacob Wishart from the Volpe National Transportation Systems Center, provided documents, presentations, and continuous feedback on the research. The efforts of the panel members are acknowledged. The panel members included: Dr. Lisa Ecola (Rand Corporation), Dr. Kajal Lahiri (University of Buffalo), Dr. Jon Fricker (Purdue University), Dr. Cletus Coughlin (Federal Reserve Bank), Mr. John Maples (U.S. Department of Energy, EIA), Dr. Mark Burris (Texas A&M University), and Dr. Pierre Vilain (Steer Group). The authors acknowledge the critique provided by Drs A. Gelman and P. Nice in “The Commissar for Traffic Presents the Latest Five-Year Plan, 2014” and note that the models have evolved over time.
Author Contributions
The authors confirm contribution to the paper as follows: study conception and design: Jeffrey Cohen, Sharada Vadali, Michael Lawrence, Shikha Dave, Clayton Clark; data collection: Jeffrey Cohen, Sharada Vadali, Michael Lawrence, Shikha Dave; analysis and interpretation of results: Jeffrey Cohen, Sharada Vadali, Michael Lawrence, Shikha Dave; draft manuscript preparation: Jeffrey Cohen, Sharada Vadali, Michael Lawrence, Shikha Dave, Clayton Clark. All authors reviewed the results and approved the final version of the manuscript.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Jeffrey Cohen is not speaking, acting, or making representations on behalf of the University of Connecticut, nor as an employee of the State of Connecticut. Sharada Vadali is not speaking as an employee of Texas A&M University.
