Abstract
The aim of this study is to assess whether combining econometric models with different explanatory variables can contribute to better tourism demand forecasts. Inbound tourism demand to the UK from seven leading markets is forecast, respectively, based on quarterly data using both individual and combination models. Causal econometric models that serve as constituents in combination take two specifications which are different in identified influencing factors. The empirical results show that generally including different explanatory variables in combination can produce better predictions according to both predictive accuracy measures and statistical tests. It suggests that the combination forecasting approach is superior to the individual one, and diversified information embedded in different explanatory variables should be integrated to improve tourism demand forecasting performance.
Introduction
The combination forecasting approach has been extensively applied in many areas such as meteorology, economics, and insurance as a robust and powerful method to improve forecasting ability (Qian et al., 2022; Stock and Watson, 1998, 2003, 2004). However, in the context of tourism demand forecasting, combining individual forecasts is not the mainstream method. Some studies apply combination, and the constituent models considered include time series, econometric, and artificial intelligence (AI) models, with causal models being confined to those that identify the same set of economic variables as influencing factors. When forecasting tourism demand, no combination has been applied to causal models with different explanatory variables. This means that one type of information has been neglected, namely the information that originates from the distinct independent variables included in different single models. Bates and Granger (1969) pointed out that, to make as good a forecast as possible, combining single forecasts based on different variables was a wise procedure. This paper evaluates whether combining econometric models with different explanatory variables can improve tourism demand forecasting performance.
In this paper, econometric models serving as constituents in combination take two model specifications, which differ in the identified influencing factors: they either include only mainstream economic determinants or introduce the climate factor as an additional demand determinant. Climate is an important influencing factor in the choice of destination, time of departure, and length of stay and can hence affect tourism demand. In the context of climate change, to which tourism is a significant contributor and by which it is substantially affected (Scott et al., 2019), it is important to understand the value of climate variables when projecting the geographical and seasonal shifts in tourism demand. Besides, the ongoing COVID-19 crisis holds important messages regarding the interwoven nature of tourism and the environment, which highlights the need to consider the role of climate variables in forecasting tourism demand. In the current literature, however, mainstream causal models treat climate variables as fixed over time, as seasonal dummies, or as fixed effects in panel data studies, which ignores variation as well as long-term change in climate conditions. Tourism demand forecasts generated from such models neglect the relationship between climate conditions and tourism demand. This study fills this gap by incorporating climate factors when combining tourism demand forecasts.
The rest of this paper is structured as follows. Section 2 reviews the literature on combination forecasting and its application in the tourism demand context. Section 3 focuses on the research method by introducing the individual and combination forecasting models, the variables and data as well as the forecasting procedure. Section 4 presents the empirical results with discussion and section 5 concludes the study.
Literature review
Why to combine?
There are two distinct forecasting approaches: the individual forecasting approach, which produces direct forecasts from single models; and the combination forecasting approach, which generates composite forecasts by combining constituent forecasts yielded by single models. When the individual approach is followed, forecasts are generated relying on only one model, and the forecasting performance of rival models is compared to identify the best-performing one, with the inferior forecasts being discarded. The disadvantages of such a procedure are, firstly, that the discarded predictions may contain useful independent information, and secondly, that the identification of the “best single model” is like a moving target. Many existing studies show that the forecasting performance of different individual models depends on the accuracy measure used, the forecasting horizon under consideration and the origin-destination pair under study (Gunter and Önder, 2015; Hassani et al., 2017). There is no clear-cut evidence showing which single model is superior to others in all situations; hence there exist no principles regarding the selection of the best single forecasting method. Rather than trying to choose the best single model, the combination forecasting approach pools a range of constituent forecasts together. The rationale is that single forecasts from diverse models based on competing theories, functional forms and specifications contain independent information, the combination of which can achieve a diversification gain. As a result, the performance of combination forecasting is expected to be more stable than that of individual forecasting.
The idea of combining multiple forecasts of the same event dates to the 1960s. Bates and Granger (1969) published the seminal work, which showed that better predictions can be obtained by combining two forecasts yielded by different models. Since then, the general forecasting literature has seen a considerable number of studies on combination forecasts, with contributions from many disciplines such as forecasting, statistics, and management (Clemen, 1989). The constituent forecasts have been extended from two to multiple ones, with various combining methods being presented and tested and different forecasting horizons and accuracy measures being considered. The empirical results support the conclusion that combining alternative forecasts can reduce uncertainty and increase accuracy (Diebold and Pauly, 1990; Makridakis et al., 2018, 2020; Zhang and Yu, 2018).
Weighting schemes
One important step of combination is to identify the optimal weights that are assigned to each constituent projection. Main combination approaches differ in the way they use historical information to compute the weights. The simplest weighting scheme is the simple average (SA) method, which assigns equal weights to all individuals. The SA method has been found to be a robust, stable and easy-to-use way, often outperforming more sophisticated weighting schemes and hence is always used as a benchmark in combination forecasting studies (Makridakis and Winkler, 1983; Makridakis et al., 2020; Henry and Clements, 2004; Wu et al., 2020).
The variance-covariance (VACO) method was presented by Bates and Granger (1969) and extended by Fritz et al. (1984) to multiple constituents. To minimize the combined forecasts variance, the VACO scheme assigns larger weights to individual forecasts with smaller forecasting errors, which links the weighting scheme to the historical performance of constituent forecasts. The VACO method is a common choice in forecasting studies (Baumeister and Kilian, 2015; Fritz et al., 1984; Stock and Watson, 2004; Wong et al., 2007; Wu et al., 2020).
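The weighting idea described above can be sketched as follows. This is a minimal illustration, assuming the simplified form that ignores covariances between the constituents' forecast errors, so that each weight is proportional to the inverse of that constituent's sum of squared past errors. The function name `vaco_weights` and the sample errors are hypothetical and are not taken from the study.

```python
def vaco_weights(errors):
    """Inverse sum-of-squared-errors weights (simplified VACO sketch).

    errors: list with one inner list of past forecast errors per model.
    Assumption: covariances between models' errors are ignored.
    A model with a zero sum of squared errors would cause a division by zero.
    """
    inv_sse = [1.0 / sum(e * e for e in model_errors) for model_errors in errors]
    total = sum(inv_sse)
    # Normalise so that the weights add up to unity.
    return [v / total for v in inv_sse]


# Hypothetical past errors (actual minus forecast) for two constituent models.
past_errors = [[1.0, -1.0], [2.0, -2.0]]
weights = vaco_weights(past_errors)
```

With these illustrative inputs, the first model has a quarter of the squared error of the second and therefore receives four times its weight.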
A similar weighting method is the discounted mean square forecast error (DMSFE) method, which was proposed by Bates and Granger (1969) and generalized by Newbold and Granger (1974). Weights in DMSFE are inversely related to the individual forecasting accuracy, which is measured by the forecasting error, and recent forecasts are weighted more heavily by applying a discounting factor. The discounting factor lies between 0 and 1, and in practice, 0.95, 0.9, 0.85, and 0.8 are all common choices (Diebold and Pauly, 1987; Shen et al., 2008, 2011; Stock and Watson, 2004).
All the above-mentioned methods share one common feature, which is that the weights add up to unity. Granger and Ramanathan (1984) presented the regression method, which does not require the weights to add up to unity. The proposed method regresses the actual values on the constituent forecasts and a constant term, with the estimated coefficients serving as the corresponding weights. Some applications of the regression method have demonstrated its satisfactory performance (Guerard, 1987; Holmen, 1987; MacDonald and Marsh, 1994), while others showed evidence of its unstable predicting ability (Lobo, 1991; Shen et al., 2011). The limitation of the regression method is obvious: when the number of constituent forecasts is large compared to the sample size, the regression used to work out the weights cannot be estimated reliably and the method is inappropriate. In the regression-based framework, Diebold and Pauly (1990) applied Bayesian shrinkage techniques to incorporate prior information into the weighting scheme and concluded that shrinkage improved the accuracy of the regression-based combination forecasts.
Another extension of the regression method is the time-varying parameter (TVP) method, which uses the Kalman filter algorithm to estimate weights that are allowed to vary with time if the data suggest so. Applications of the time-varying combination method include LeSage and Magura (1992), Sessions and Chatterjee (1989), Shen et al. (2011) and Stock and Watson (2004). Sun et al. (2021) further developed the TVP weighting method by introducing nonparametric estimation and proposed the TV jackknife model averaging (TVJMA), which can handle structural changes and nonstationary trends in tourism data.
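A minimal sketch of DMSFE-style weights is given below. It assumes that the most recent squared error receives a discount of one and that each earlier error is multiplied by a further factor of `delta`; the exact exponent convention differs across papers, so this indexing is an assumption. The function name `dmsfe_weights` and the sample errors are hypothetical.

```python
def dmsfe_weights(errors, delta=0.9):
    """Discounted inverse squared-error weights (DMSFE sketch).

    errors: list with one inner list of past forecast errors per model,
            ordered from oldest to most recent.
    delta:  discounting factor in (0, 1]; delta = 1 applies no discounting.
    """
    raw = []
    for model_errors in errors:
        n = len(model_errors)
        # The most recent error (t = n - 1) gets delta ** 0 = 1.
        discounted = sum(delta ** (n - 1 - t) * e * e
                         for t, e in enumerate(model_errors))
        raw.append(1.0 / discounted)
    total = sum(raw)
    return [r / total for r in raw]


# Hypothetical errors: model 1 improved recently, model 2 deteriorated.
past_errors = [[2.0, 1.0], [1.0, 2.0]]
discounted_weights = dmsfe_weights(past_errors, delta=0.5)
undiscounted_weights = dmsfe_weights(past_errors, delta=1.0)
```

In this example the two models have the same undiscounted squared error, so they are weighted equally when `delta` is 1, while discounting shifts weight towards the model whose recent error is smaller.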
Applications of combination forecasting in the tourism demand literature
An increasing number of studies on combination forecasts have appeared in the tourism demand literature, among which differences can be found in weighting schemes, individual model inputs and accuracy measures with one common finding being that combination forecasts are generally superior to individual ones (Kourentzes et al., 2021; Li et al., 2019; Liu et al., 2021; Song and Li, 2021; Wu et al., 2020). The SA, VACO, and DMSFE methods are popular weighting schemes (Shen et al., 2008, 2011; Wong et al., 2007). Other methods including the stochastic frontier analysis (SFA), the cumulative sum control chart (CUSUM) method, the management-oriented approach, AI techniques, and nonparametric techniques have also been explored to determine the optimal weights (Andrawis et al., 2011; Chan et al., 2010; Claveria et al., 2016; Coshall and Charlesworth, 2011; Qiu et al., 2021; Sun et al., 2021).
When it comes to individual model inputs, the most popular ones are time-series and econometric models. Regarding time series techniques, autoregressive integrated moving average (ARIMA) and exponential smoothing (ETS) models are widely chosen; and for causal models, popular candidates are autoregressive distributed lag (ADL), error correction (EC), and TVP models (Cang et al., 2011a, 2011b, 2014; Chan et al., 2010; Li et al., 2019; Shen et al., 2008, 2011). However, independent information embedded in different explanatory variables in econometric models has never been combined. The causal models that generate component forecasts in existing combination studies all identify the same influencing factors: the origins’ real income, the relative price between destination and origin, the substitute price in competing markets as well as seasonal and one-off event dummies (Chan et al., 2010; Li et al., 2019; Shen et al., 2008, 2011; Wong et al., 2007). Artificial intelligence techniques and grey models are also considered in more recent studies (Hu et al., 2021; Liu et al., 2021; Qiu et al., 2021).
When evaluating forecasting performance, widely employed accuracy measures include mean absolute percentage error (MAPE), mean absolute error (MAE), root mean squared error (RMSE) and root mean squared percentage error (RMSPE). Besides descriptive accuracy measures, statistical tests such as the Diebold and Mariano (D-M) test; the Harvey, Leybourne, Newbold (HLN) test; and the Mann–Whitney test are also applied to assess whether the combination forecasting approach is significantly better than the single method (Bangwayo-Skeete and Skeete, 2015; Coshall, 2009).
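The four descriptive measures named above can be written out as follows. This is a minimal sketch using their standard textbook definitions, with the percentage measures expressed in percent; the function names and the sample data are illustrative only.

```python
import math


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean squared error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def rmspe(actual, forecast):
    """Root mean squared percentage error, in percent."""
    return 100.0 * math.sqrt(
        sum(((a - f) / a) ** 2 for a, f in zip(actual, forecast)) / len(actual)
    )


# Hypothetical arrivals and forecasts for two quarters.
actual = [100.0, 200.0]
forecast = [110.0, 180.0]
scores = (mae(actual, forecast), mape(actual, forecast),
          rmse(actual, forecast), rmspe(actual, forecast))
```

MAE and RMSE are scale-dependent, whereas MAPE and RMSPE are relative measures, which is one reason rankings of models can differ across measures.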
Research method
Variables and data
The UK’s inbound tourism demand from its seven leading markets (France, Germany, the Irish Republic, Italy, the Netherlands, Spain, and the US) is studied, with tourism demand being proxied by tourist arrivals. For traditional econometric models, income, own price and substitute price are chosen as demand determinants, and seasonal and event dummies are included to capture the seasonal and event effects. To compute substitute prices, Germany and France are chosen as competing destinations to the UK, as France, the UK, and Germany were the three most visited destinations in northwestern Europe in 2017 (UNWTO, 2018). For climate econometric models, besides the economic influencing factors, the climate condition in the UK, measured by the Tourism Climatic Index (TCI) of the UK, is introduced to represent the climate attribute of the destination, which is considered another determinant of tourism demand.
Tourism Climatic Index is a human-oriented, synthetic evaluation of climate attractiveness to tourists, which takes the most relevant climate elements to tourism experience into account. It comprises five sub-indices. The composition of
Variables and data sources.
The general forms of the traditional and climate econometric models are specified in equation (2) and equation (3), respectively
Individual forecasting models and three combination groups
Individual models selected as constituents in combination serve not only as forecasting tools but also as sources of diversification. If we see forecasts as information, combining forecasts is an aggregation of information. According to Bates and Granger (1969), the combination of models that contain independent information is most likely to improve forecasting performance. To ensure that constituent models contain as much independent information as possible, a variety of individual models including causal econometric and non-causal time series techniques, which are different in modelling techniques, assumptions, and explanatory variables are selected. The seasonal naive no-change forecasts serve as benchmarks.
Summary of individual forecasting models.
Notes: Only the estimation equations for each model are provided here.
According to the PP (Phillips and Perron, 1988) unit root test results, all model variables are integrated of order zero or order one, based on which the bounds test cointegration approach is selected in this study as it is robust no matter whether the model variables are integrated of the same order (not integrated of order
The component models of three combination groups.
Combination forecasting methods
This paper evaluates the most popular statistical weighting schemes in the current tourism demand literature including SA, VACO, and DMSFE. The regression-based methods are excluded from this study as they are inappropriate because of the large number of constituent forecasts in the combination panel relative to the small training sample size. In addition, a new weighting scheme, which is referred to as
Three popular accuracy measures including MAE, MAPE and RMSE are applied to evaluate the forecasting performance, and the D-M test (Diebold and Mariano, 1995) is conducted to check whether the difference in the accuracy of competing models is statistically significant. Applications of the D-M test in the tourism demand literature include Gil-Alana (2010) and Álvarez-Díaz and Rosselló-Nadal (2010).
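A minimal sketch of a D-M type statistic is given below. It assumes a squared-error loss and a long-run variance built from the first h - 1 autocovariances of the loss differential, with a normal approximation for the p-value; the text does not state which loss function or variance estimator the study uses, so these choices are assumptions. The function name `dm_test` and the sample errors are hypothetical.

```python
import math


def dm_test(errors_1, errors_2, h=1):
    """Diebold-Mariano type statistic under squared-error loss (sketch).

    A negative statistic means the first set of forecasts has the lower
    average loss. The long-run variance is not guaranteed to be positive
    for h > 1; this sketch does not guard against that case.
    """
    d = [a * a - b * b for a, b in zip(errors_1, errors_2)]  # loss differential
    n = len(d)
    d_bar = sum(d) / n

    def autocov(k):
        return sum((d[t] - d_bar) * (d[t - k] - d_bar) for t in range(k, n)) / n

    long_run_var = autocov(0) + 2.0 * sum(autocov(k) for k in range(1, h))
    stat = d_bar / math.sqrt(long_run_var / n)
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))
    return stat, p_value


# Hypothetical one-step-ahead errors: combination model vs. best single model.
combo_errors = [1.0, -1.0, 1.0, -1.0]
single_errors = [2.0, -1.0, 2.0, -3.0]
stat, p_value = dm_test(combo_errors, single_errors, h=1)
```

With so few observations the normal approximation is unreliable; the example only illustrates the sign convention, in which a significantly negative statistic favours the first model.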
Forecasting procedure
The sample covers the 1994Q1–2017Q4 period, and different time spans are considered for different origins given data availability. The whole sample is divided into three periods, as illustrated in Figure 1. Observations from 1994Q1 to 2012Q4 are used for model estimation, and the individual out-of-sample forecasts are generated from 2013Q1 to 2017Q4, with forecasts from 2013Q1 to 2015Q4 used to determine the combining weights, and the ones from 2016Q1 to 2017Q4 retained for comparison. The out-of-sample combination forecasts are generated from 2016Q1 to 2017Q4. The sample from 2013Q1 to 2015Q4 is the training sample, and that from 2016Q1 to 2017Q4 is the comparison sample.
Illustration of data sample.
This study follows the recursive forecasting procedure, which is popular in the tourism demand forecasting literature (Song et al., 2019). One- to four-step-ahead out-of-sample forecasts are generated from each of the 15 individual forecasting models for combination and comparison, with the seasonal naive no-change forecasts serving as benchmarks. Three groups of individual forecasts are combined, respectively, with different weighting schemes. The weights determined by the different combination methods (except SA) are time-varying as a result of applying the recursive weighting procedure, which is illustrated in Figure 2. For example, for composite forecasts in 2016Q1, the historical performance of single forecasts from 2013Q1 to 2015Q4 is considered to construct the weights. For combination forecasts in 2016Q2, the historical performance of individual forecasts from 2013Q1 to 2016Q1 is taken into account to decide the weights. Such a procedure is repeated with one more single forecast added to the training sample each time, updating the weights each period according to the historical individual forecasting performance. The code for computing combination forecasts as well as conducting forecasting comparison and statistical tests is written in MATLAB.
The Recursive Weighting Procedure for One-Step-Ahead Combination Forecasts.
Notes: The weights for two- to four-step-ahead combination forecasts are generated iteratively in the same way.
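The recursive weighting procedure for one-step-ahead forecasts can be sketched as below. This is an illustration in Python rather than the authors' MATLAB code, and it assumes inverse sum-of-squared-errors weights as a stand-in for whichever weighting scheme is applied. The names `inverse_sse_weights`, `recursive_combination` and `n_train`, and the sample data, are hypothetical.

```python
def inverse_sse_weights(errors):
    """Weights proportional to the inverse sum of squared past errors."""
    inv = [1.0 / sum(e * e for e in model_errors) for model_errors in errors]
    total = sum(inv)
    return [v / total for v in inv]


def recursive_combination(actual, forecasts, n_train):
    """Combine forecasts period by period with an expanding training window.

    actual:    observed values over the training and comparison samples.
    forecasts: one list of individual forecasts per model, aligned with actual.
    n_train:   number of initial periods used only to form the first weights.
    """
    combined = []
    for t in range(n_train, len(actual)):
        # Weights for period t use only errors observed before period t.
        errors = [[actual[s] - f[s] for s in range(t)] for f in forecasts]
        weights = inverse_sse_weights(errors)
        combined.append(sum(w * f[t] for w, f in zip(weights, forecasts)))
    return combined


# Hypothetical data: two training periods, then two comparison periods.
actual = [10.0, 10.0, 10.0, 10.0]
forecasts = [[11.0, 9.0, 11.0, 12.0],   # model A
             [12.0, 8.0, 10.0, 10.0]]   # model B
combined = recursive_combination(actual, forecasts, n_train=2)
```

Each pass through the loop adds one more observed error to the training window before the weights are recomputed, which mirrors the move from the 2013Q1 to 2015Q4 window to the 2013Q1 to 2016Q1 window described above.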
Results and discussion
Performance comparison based on descriptive measures
For seven origins and four forecasting horizons, the performance of every single and combination model is evaluated and compared based on MAE, MAPE, and RMSE. For the comparison between the individual and the combination forecasting approach, the percentages of the superior combination forecasts compared to the best single ones (referred to as
Superior percentages of three combination groups for each origin (MAE).
Note: Two decimal places are retained for all percentages for neatness of presentation and the superior percentages of the best-performing combination group for each combination method and each origin are highlighted.
Superior percentages of three combination groups for each origin (RMSE).
Note: Two decimal places are retained for all percentages for neatness of presentation and the superior percentages of the best-performing combination group for each combination method and each origin are highlighted.
Superior percentages of three combination groups for each origin (MAPE).
Note: Two decimal places are retained for all percentages for neatness of presentation and the superior percentages of the best-performing combination group for each combination method and each origin are highlighted.
It also shows that for one market, the difference in the forecasting ability of different weighting schemes given the same combination group is small (refer to each column of Table 4 to Table 6). For instance, according to Table 4 (judged by MAE), the greatest difference in the superior percentages achieved by the best and the worst weighting methods to combine Group A is seen from the American case between SA and DMSFE (0.85), which is 3.7%. It means that there are 1212 more superior combination forecasts if the weights are obtained based on SA instead of DMSFE (0.85).
Superior percentages of three combination groups for each forecasting horizon (MAE).
Note: Two decimal places are retained for all percentages for neatness of presentation and the superior percentages of the best-performing combination group for each combination method and each forecasting horizon are highlighted.
Superior percentages of three combination groups for each forecasting horizon (MAPE).
Note: Two decimal places are retained for all percentages for neatness of presentation and the superior percentages of the best-performing combination group for each combination method and each forecasting horizon are highlighted.
Superior percentages of three combination groups for each forecasting horizon (RMSE).
Note: Two decimal places are retained for all percentages for neatness of presentation and the superior percentages of the best-performing combination group for each combination method and each forecasting horizon are highlighted.
Superior percentages of three combination groups for each weighting scheme.
Note: Two decimal places are retained for all percentages for neatness of presentation and the superior percentages of the best-performing combination group for each combination method are highlighted.
Performance comparison based on the D-M test
For a given weighting scheme, 32,752 combination models and 15 single models are available to generate forecasts for seven origins at four forecasting horizons when combining Group A. (For Group B and Group C, there are 502 combination forecasting models and nine single forecasting models each with one weighting scheme.) To conduct the D-M test for each forecasting horizon, every combination model is compared with the best single model for each combination group with each weighting scheme. Considering seven origins and the application of the recursive forecasting procedure, each forecasting model generates 140 (
The D-M test results.
Note: Each cell of the table comprises two entries. The first entry is the percentage of times the Null of equal forecast accuracy between the combination model and the best single model is rejected at a 5% level of significance with a negative statistic, that is, forecasts from the combination model are significantly more accurate than the best single model. The second entry is the percentage of times the Null of equal forecast accuracy between the combination model and the best single model is rejected at a 5% level of significance with a positive statistic, that is, forecasts from the combination model are significantly less accurate than the best single model.
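The model counts quoted in this section can be checked with a short calculation. The sketch below assumes that a combination model is any subset of two or more single models from a group; this reading is an inference from the reported numbers rather than a procedure stated explicitly in the text. The function name `n_combination_models` is hypothetical.

```python
from math import comb


def n_combination_models(n_single):
    """Number of subsets containing at least two of n_single models."""
    return sum(comb(n_single, k) for k in range(2, n_single + 1))


group_a = n_combination_models(15)       # Group A: 15 single models
group_b_or_c = n_combination_models(9)   # Groups B and C: nine single models each

# A 3.7 percentage point gap in superior percentages for Group A,
# expressed as a number of combination models.
extra_superior = round(0.037 * group_a)
```

Under this assumption the counts equal 2^n - n - 1, which gives 32,752 for 15 models and 502 for nine, and a 3.7% difference for Group A corresponds to about 1212 combination models.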
According to Table 11, the combination forecasting approach is statistically better than the single one. For each case, the percentage of times that the null hypothesis of the D-M test is rejected with a negative statistic, which means that the combination model forecasts significantly more accurately than the best single one, is much higher than the percentage of times that it is rejected with a positive statistic, which signifies that the best single model performs significantly better than the combination model. The results imply that when generating one- to four-step-ahead forecasts, forecasting performance can be improved significantly if combination is applied. For example, for the case of combining Group A using SA for one-step-ahead forecasts, the percentage of times of a negative significant statistic for the tests is 22.66%, and that of times of a positive significant statistic is 6.02%. It means that 7422 (
Table 11 also shows that combining econometric models with different explanatory variables, that is, combining models in Group A, is better than integrating econometric models with the same set of independent variables. For four forecasting horizons, combining all models in Group A always yields the highest percentage of negative significant statistic and the lowest percentage of positive significant statistic no matter which combination method is applied. For instance, for the D-M tests between the VACO combination models and the best single model for four-step-ahead forecasts, the percentage of times of a negative significant statistic is 22.47% when combining Group A, 21.12% if combining Group B and 21.51% when Group C is combined. It means that if VACO weighting is applied with Group A, 7359 (
Table 11 further indicates that for four forecasting horizons, the difference in the forecasting ability of different weighting schemes given the same combination group is small. For instance, the greatest difference in the percentage of times of a negative significant statistic achieved by the best and the worst weighting methods to combine Group A is seen from producing one-step-ahead forecasts between DMSFE (0.95) and VACO, which is 1.6%. It means that there are 524 (
Conclusion
This paper evaluates forecasting combination of econometric models with different influencing factors through an empirical study on UK’s inbound tourism demand with results showing that such a combination is statistically better than just integrating econometric models with the same explanatory variables. It also demonstrates that for a given combination panel, the forecasting ability of different combination methods is similar with the presented inverse-MAE scheme showing good performance.
The findings of this paper have implications for future research and for stakeholders in the tourism industry. Most importantly, this paper paves the way for further empirical investigations on combination forecasting including component models with distinct explanatory variables. In addition, it suggests more research attention to the combination forecasting approach. This study shows the general forecasting superiority of combination models compared to individual ones, which is in agreement with the conclusions of many existing studies (Li et al., 2019; Liu et al., 2021; Qiu et al., 2021; Wu et al., 2020). In the current literature, too much attention is paid to improving the forecasting ability of individual models and to identifying the best single model. If improving forecasting accuracy is the aim, combination forecasting deserves more study and should be included in forecasting comparisons. Besides, government and destination managers can improve the efficiency of their planning exercises by taking into account additional information on climate trends.
Several future research directions have been identified. Firstly, combining econometric models with different explanatory variables deserves more exploration in the future. Climate variables such as the origin’s climate condition, the difference in the climate condition between the destination and the origin, and the climate condition of the main destination relative to its competitors can be considered as tourism demand influencing factors. In addition, the value of other variables such as the Air Quality Index (AQI) and search engine data in improving combination forecasting accuracy can be explored.
Secondly, user-friendly software that can produce combination forecasts easily should be made available, considering the powerful forecasting ability of the combination forecasting approach. The biggest obstacle to popularizing combination forecasting is the cost of applying it. Under current conditions, it is extremely time-consuming to generate combination forecasts as it requires different programs for different tasks. With the help of such software, combination forecasts could be included in forecasting comparisons and used as benchmarks for forecasting evaluation.
Moreover, the findings of this paper suggest the application of combination forecasting to assess the impact of the COVID-19 crisis on tourism demand. In the current literature, one important way to evaluate the impact of crises on tourism demand is to compare actual values of demand with reference demand forecasts which are generated on the assumption that the crisis had not happened (Page et al., 2012). Combination can be applied to improve the accuracy of the reference forecasts and hence to improve the performance of such evaluations.
This paper, like other studies, is not without limitations. Only one- to four-step-ahead forecasts are generated and other forecasting horizons are omitted. Besides, nonparametric techniques such as AI-based individual and combination forecasting methods are not included. In addition, important determinants may be missing from the causal models. For instance, no variables for travel costs are included due to unavailability of suitable data.
Supplemental Material
Supplemental Material - Does the combination of models With different explanatory variables improve tourism demand forecasting performance?
Supplemental Material for Does the combination of models With different explanatory variables improve tourism demand forecasting performance? by Xi Wu and Adam Blake in Tourism Economics
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed the receipt of the following financial support for the research, authorship, and/or publication of this article: Wu thanks the support of the research fund from Zhongyuan University of Technology (No. K2020YY019), the support of the research fund from Social Science Foundation of Ministry of Education of China (No. 19YJA630067), the support of the research fund from Natural Science Foundation of Henan Province, China (No. 212300410423), the support of research funds from The Research Foundation of Henan Higher Education (No. 21A630042) and the support of the research fund from National Social Science Foundation of China (No. 20CGJ041).
Supplemental material
The data used in this research is available on request.
