Abstract
Although tourism literature is replete with tourism demand studies, numerous empirical and theoretical issues remain unresolved. The majority of the extant studies center around replication of well-established theory of tourism demand with applications of contemporary empirical techniques for different sets of tourist-originating and receiving countries. However, the application of empirical techniques, especially panel data methods that are utilized to model tourism demand, seems to be arbitrarily chosen without due consideration of theoretical and empirical ramifications. The purpose of this study is to present a critical review of the tourism demand literature. We assessed the theoretical and methodological inaccuracies when modeling tourism demand in general and when applying panel data models in particular. We provided a guide to application of panel data techniques when modeling tourism demand. The article ends with discussions of an agenda for future research to fill the gaps in the extant literature and advance tourism demand research.
Introduction
Tourism demand modeling has been at the center of tourism economics research since the advent of tourism studies (see early tourism demand studies, e.g. Keintz, 1968; Laber, 1969; Uysal and Crompton, 1984, 1985; Witt and Martin, 1987). The determinants of tourism demand have been examined for most major international destinations, while a plethora of review papers have summarized recent developments and progress made in modeling and forecasting tourism demand (see, e.g. Crouch, 1994a, 1994b; Lim, 1997, 1999; Peng et al., 2014a, 2014b; Song and Li, 2008; Song et al., 2012, 2019; Witt and Witt, 1995; Wu et al., 2017).
Early tourism demand studies focus on developing a theory of tourism demand by identifying, conceptualizing, and measuring key factors affecting tourism demand (see, e.g. Martin and Witt, 1987, 1988; Morley, 1992; Uysal and Crompton, 1985; Witt and Martin, 1987). Literature is replete with studies that center around either estimating tourism demand models for different sets of tourist-originating and destination countries, or calibrations of tourism demand models using different periods, or mere application of more contemporary analytical tools (see, e.g. Assaf et al., 2019; De Vita and Kyaw, 2013; Fuleky et al., 2014; Garín-Muñoz and Montero-Martín, 2007; Pham et al., 2017; Rafiei Darani and Asghari, 2018; Smeral, 2012; Song and Witt, 2006).
The ever-changing dynamics of the global economy and tourism demand patterns, together with the development of novel empirical approaches, continuously draw the attention of tourism economists who investigate the determinants of tourism demand for newly emerging destinations around the world. However, a major leap in demand-forecasting research has yet to happen. The majority of the extant studies do not go beyond mere application of the well-established theory of tourism demand and related calibration of models. Although replication studies are a sine qua non of continued scientific progress, continuous replication of tourism demand models since the early 1960s seems to have saturated the literature, with each study contributing less and less to the current knowledge base. Yet, there is still much room for improvement, for numerous fundamental issues still need to be tackled and resolved. First, studies utilizing alternative empirical techniques do not objectively justify the application of a specific method. Second, the samples examined appear to be chosen based on mere convenience and/or simply because prior studies did not estimate tourism demand models for the sample of countries under investigation. Third, the selection of variables to estimate the tourism demand models does not always align with the theory of tourism demand. Also, although several studies analyzed the effects of climate change (Amelung and Nicholls, 2014; Dogru et al., 2019b), terrorism (Araña and León, 2008), immigration (Seetaram and Dwyer, 2009), and trade policies and tariffs (Cheng et al., 2013; Dogru et al., 2019a), many studies examined the effects of these factors on tourism demand in isolation.
Current empirical studies are generally concerned with the application of contemporary empirical techniques that test the efficacy of alternative empirical methods in modeling tourism demand (see, e.g. Chu, 2011; Fuleky et al., 2014; Pham et al., 2017). While applying alternative empirical techniques has become an integral part of the tourism demand forecasting literature (see, e.g. Akal, 2004; Chu, 2009; Kim and Moosa, 2005; Song and Witt, 2006; Song et al., 2003), studies modeling tourism demand primarily employ a single empirical method without testing the efficacy of the estimated model with alternative empirical methods. In general, panel data analysis has been the predominant empirical technique utilized in modeling tourism demand (see, e.g. Dogru and Sirakaya-Turk, 2016; Li et al., 2016; Rafiei Darani and Asghari, 2018; Seetanah et al., 2010). However, studies utilizing alternative panel data techniques do not objectively justify the application of a specific method. The choice of empirical methods in modeling tourism demand is mostly arbitrary and subjective rather than theoretically driven; researchers’ choice is solely driven by the desire to apply a specific empirical technique irrespective of the research gap within the literature. It is a stylized fact that arbitrarily chosen empirical methods yield inefficient and biased coefficient estimates; thus, the resulting estimators are usually not BLUE (best linear unbiased estimators).
Another deficiency of the current state of relevant research is that the samples under investigation are chosen based on convenience or simply because prior studies have not estimated tourism demand models for these particular samples (e.g. countries, states, cities, etc.). This approach alone makes little contribution to the literature or to practice, especially when estimations are made utilizing a panel data approach that estimates a single tourism demand model for the sample of countries included in the panel. Calibration of a single tourism demand model yields inefficient and biased estimates, especially when slope heterogeneity may be present in the panel data (Pedroni, 1999). Thus, the practical implications of such studies will be limited because countries in the panel will likely have unique characteristics, and hence, a “one size fits all” approach in providing practical implications for destinations is less than perfect (Dogru et al., 2017).
Furthermore, extant studies in the tourism demand literature do not consider testing for cross-sectional dependence within the panel data under investigation. Rather, it is assumed that the members of the panel are cross-sectionally independent. However, the assumption of cross-sectional independence might lead to biased and inconsistent findings, and hence, a test of cross-sectional dependence is necessary prior to choosing an appropriate panel data model (Bai and Kao, 2006). Moreover, stationarity tests are seldom applied in tourism demand studies utilizing panel data models. The probability distribution of economic data tends to change over time. When the data are nonstationary, estimations based on the assumption of a stationary process violate the basic Gauss–Markov assumptions. Therefore, applicable panel data models, such as cointegration-based empirical techniques, may be necessary for modeling tourism demand.
In view of these gaps in the extant literature, the purpose of this study is therefore to present a critical review of the tourism demand literature. This article contributes to the tourism economics literature by (1) assessing the theoretical and methodological inaccuracies when modeling tourism demand; (2) providing a guide to application of panel data techniques when modeling tourism demand; and (3) presenting an agenda for future research to fill the gaps in the extant literature and advance tourism demand research.
Panel data models in tourism demand research
Panel data are defined as “the pooling of observations on a cross-section of units (e.g., countries, cities, destinations, firms, and so on.) over several time periods such as days, weeks, or months” (Baltagi, 2005: 1). By combining pooled-cross-sectional data with time-series data of the same entities under investigation, panel data can capture information that could not be otherwise captured while having added benefits of the reduction of multicollinearity and increase in degrees of freedom (Wooldridge, 2010). In recent years, researchers used a variety of empirical models to estimate tourism demand for many destinations (Song et al., 2019 provide a comprehensive review of the tourism demand literature).
Although studies from the 1980s primarily applied the ordinary least squares regression technique to assess the determinants of tourism demand (see, e.g. Uysal and Crompton, 1984), substantial progress has since been made in the econometrics literature that advances empirical techniques using panel data regression. These panel data models generally include panel fixed and random effects, generalized method of moments (GMM) (Arellano and Bond, 1991; Arellano and Bover, 1995), fully modified ordinary least squares (FMOLS) (Pedroni, 2001), and autoregressive distributed lag (ARDL) techniques (Pesaran et al., 1999).
Despite major progress made in the econometrics literature since the 1960s, it was not until the late 1990s that tourism researchers began using dynamic panel data models (Song and Li, 2008). Early applications of panel data in tourism demand studies were also based on static panel data models and mostly pooled cross-sectional data (see, e.g. Garin-Munoz and Amaral, 2000; Görmüş and Göçer, 2010; Seetaram and Dwyer, 2009). Employing static panel data models is not optimal when modeling tourism demand because static models fail to capture the dynamic mechanisms inherent in panel data. Macroeconomic data usually entail relationships that depend on time and on past values of other macroeconomic indicators. Static panel data models cannot accommodate this dynamic nature of the data; thus, they tend to produce estimates that violate major Gauss–Markov assumptions and hence are not BLUE (best linear unbiased estimators). Dynamic panel data models, on the other hand, overcome these major shortcomings by incorporating the lagged dependent variable into the model. Because they also enable the estimation of long-run tourism demand elasticities, dynamic panel data models help researchers gain better insights for developing tourism policies and strategies.
Furthermore, tourism demand is fundamentally dynamic, as a tourist’s former visit experience can affect future tourism demand through repeat visitation and/or word-of-mouth. From a theoretical perspective, past tourism demand, also known as the autoregressive term or lagged dependent variable in time-series models, is a major determinant of tourism demand. It is a stylized fact in the tourism demand literature that the lagged dependent variable captures much information when tourism demand models are calibrated (Song and Li, 2008). More specifically, the lagged dependent variable captures the word-of-mouth effect, habit persistence or repeat visitation, and also destinations’ supply-side factors (e.g. hotel development) (Dogru et al., 2017; Song et al., 2019). From an econometric perspective, a dynamic panel data model is practical because it accounts for the effect of past values of the dependent variable. In the context of tourism demand, accounting for this effect is a fundamental part of the theoretical model, as it explains tourists’ intention to return to the destination and/or to recommend the destination to their relatives and friends. Panel data models that exclude the lagged dependent variable would not only produce econometrically inefficient and biased estimates but would also be inconsistent with tourism demand theory.
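To make the link between the lagged dependent variable and long-run elasticities concrete, consider a minimal worked example. The log-linear specification and the symbols $\rho$, $\beta$, $\alpha_i$, and $x_{it}$ are illustrative assumptions rather than a model estimated in this article:

$$\ln y_{it} = \alpha_i + \rho \ln y_{i,t-1} + \beta \ln x_{it} + \varepsilon_{it}, \qquad |\rho| < 1$$

Here $\beta$ is the short-run elasticity of tourism demand $y_{it}$ with respect to a determinant $x_{it}$ (e.g. income). Setting $\ln y_{it} = \ln y_{i,t-1}$ in the steady state gives the long-run elasticity

$$\theta = \frac{\beta}{1 - \rho}$$

For instance, hypothetical values of $\rho = 0.6$ and $\beta = 0.4$ imply $\theta = 0.4 / (1 - 0.6) = 1.0$, so a short-run inelastic response becomes unit elastic in the long run. A static model, which imposes $\rho = 0$, cannot distinguish the two.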
Considering the limitations of static panel data models, dynamic panel data models have received increasing interest in the tourism economics literature, and their application has become the predominant empirical approach in modeling tourism demand in recent years (see, e.g. Brida and Risso, 2009; Dogru and Sirakaya-Turk, 2016; Falk, 2010; Li et al., 2016; Pham et al., 2017). However, the extant studies primarily employed the Arellano–Bond GMM estimator to estimate tourism demand models. While this panel data model can be useful in modeling tourism demand due to its dynamic nature, it has some limitations, and GMM may not be the ideal panel data model in some study contexts. In particular, the application of the Arellano–Bond and Arellano–Bover GMM estimators in panels with a larger time-series dimension (T) than number of units (N; i.e. T > N) can suffer from instrument proliferation and the resulting overidentification of the parameter estimates (Roodman, 2009). Also, when estimating long-run relationships, the GMM estimators assume that the study variables move together in the long run, which is not always the case. A cointegration test is therefore required to examine whether the study variables move together in the long run prior to estimating long-run coefficients.
Aside from the aforementioned limitations, most tourism demand studies do not conduct the preliminary tests that are necessary to determine the nature of the panel data and that guide researchers toward applicable panel data models. Panel data models might produce spurious coefficient estimates when the presence of cross-sectional dependence, slope heterogeneity, and unit root is not taken into consideration when modeling tourism demand (Bai and Kao, 2006; Baltagi, 2005; Pesaran, 2006; Pesaran and Yamagata, 2008). First, unobservable common factors might be embedded in the error term, and the error term might be correlated with its past values, the explanatory variables, and the past values of the explanatory variables. Estimating tourism demand models when the panel data exhibit cross-sectional dependence may yield biased results if appropriate panel data models are not employed. Second, slope coefficients might need to be estimated separately for each member of the panel. Thus, producing a single slope coefficient for the entire panel in the presence of slope heterogeneity will yield asymptotically biased estimates. Third, the study variables may not follow a stationary process; estimating tourism demand models without testing for the stationarity of the study variables will then yield biased estimates, especially when estimating long-run relationships. Therefore, conducting these preliminary tests is of paramount importance to generating asymptotically unbiased, efficient, and consistent estimates.
Preliminary tests in panel data models
Panel data, when methodologically appropriate models are applied, have various advantages over cross-sectional and/or time-series data. Panel data allow researchers to control for individual heterogeneity in the panel and provide higher degrees of freedom and more information. Panel data models are also more efficient in modeling the econometric relationship between the study variables than time-series and/or cross-sectional models because panel data comprise cross-sectional units (e.g. households, firms, countries, etc.) observed over time (e.g. month, quarter, year, etc.) (Baltagi, 2005). The basic framework for a panel data model can be demonstrated as follows (Baltagi, 2005; Greene, 2003):
$$y_{it} = x_{it}'\beta + z_i'\alpha + \varepsilon_{it}$$
where i denotes the cross-sectional dimension and t stands for the time-series dimension. The heterogeneity (or individual effect) is described by $z_i'\alpha$, where $z_i$ contains a constant term and a set of individual- or group-specific variables.
In panel data models, the first step is to examine whether cross-sectional dependence exists in the error terms. Cross-sectional dependence implies that an observed or unobserved shock occurring in one unit affects other units in the panel. Put differently, an interdependence between the unobserved components (i.e. error term) and the regressors suggests that panel members are cross-sectionally dependent. Bai and Kao (2006) argue that cross-sectional independence is a very difficult assumption to justify and that ignoring possible cross-sectional dependence leads to biased and inconsistent findings. Also, the Monte Carlo experiments conducted by Pesaran (2006) present evidence showing that disregarding the possibility of cross-sectional dependence creates a substantial bias in the estimated models. While alternative methods have been developed to test for the presence of cross-sectional dependence, the CD test and the CD Lagrange multiplier (CDlm) test of Pesaran (2004) and the adjusted Lagrange multiplier (LMadj) test of Pesaran et al. (2008) are the most commonly applied and efficient methods. The CD and CDlm tests of Pesaran (2004) perform well when the cross-sectional dimension of the panel (N) is large; the LMadj test of Pesaran et al. (2008) performs well for panels with both a large cross-sectional dimension and long time series.
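As a rough illustration of how such a test works, the sketch below computes the CD statistic of Pesaran (2004) from unit-level residuals of a balanced panel, using CD = sqrt(2T / (N(N - 1))) times the sum of all pairwise residual correlations, which is compared with a standard normal distribution under the null of cross-sectional independence. The function names and the dictionary layout are assumptions made for this example; the residuals are assumed to come from unit-by-unit regressions estimated beforehand.

```python
import math
from itertools import combinations


def pearson(a, b):
    """Sample correlation between two equal-length residual series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def pesaran_cd(residuals):
    """CD statistic for a balanced panel.

    residuals: dict mapping a unit label (hypothetical, e.g. a country
    code) to a list of T regression residuals for that unit.
    Returns (cd, two_sided_p_value) under the standard normal null.
    """
    units = list(residuals)
    n = len(units)
    t = len(residuals[units[0]])
    # Sum the pairwise correlations over all N(N-1)/2 unit pairs.
    rho_sum = sum(pearson(residuals[i], residuals[j])
                  for i, j in combinations(units, 2))
    cd = math.sqrt(2.0 * t / (n * (n - 1))) * rho_sum
    # Two-sided p-value: 2 * (1 - Phi(|cd|)) = erfc(|cd| / sqrt(2)).
    p_value = math.erfc(abs(cd) / math.sqrt(2.0))
    return cd, p_value
```

A large absolute CD value (small p-value) would point toward estimators that allow for cross-sectional dependence. Unbalanced panels require pair-specific sample sizes, which this sketch does not handle.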
In addition to testing for the presence of cross-sectional dependence, researchers must test for the presence of slope heterogeneity when modeling tourism demand via panel data models. Slope heterogeneity indicates that slope coefficients may not be homogenous across cross-sectional units in a given panel. The assumption of homogenous slope coefficients may mask unit-specific features of the panel members (Menyah et al., 2014). The most widely accepted methods to test for slope homogeneity have been developed by Pesaran and Yamagata (2008). The slope homogeneity tests of Pesaran and Yamagata (2008) produce two test statistics, which are labeled as $\tilde{\Delta}$ and $\tilde{\Delta}_{adj}$, the latter being a bias-adjusted version of the former.
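For reference, the two statistics of Pesaran and Yamagata (2008) can be written as follows. The notation is a sketch following the usual presentation of the test: $k$ is the number of slope regressors, $\hat{\beta}_i$ the unit-specific OLS estimate, $\tilde{\beta}_{WFE}$ the weighted fixed effects pooled estimate, $M_\tau$ the matrix that demeans over time, and $\tilde{\sigma}_i^2$ an estimate of the error variance of unit $i$.

$$\tilde{S} = \sum_{i=1}^{N} \left(\hat{\beta}_i - \tilde{\beta}_{WFE}\right)' \frac{X_i' M_\tau X_i}{\tilde{\sigma}_i^2} \left(\hat{\beta}_i - \tilde{\beta}_{WFE}\right)$$

$$\tilde{\Delta} = \sqrt{N}\left(\frac{N^{-1}\tilde{S} - k}{\sqrt{2k}}\right), \qquad \tilde{\Delta}_{adj} = \sqrt{N}\left(\frac{N^{-1}\tilde{S} - k}{\sqrt{2k(T-k-1)/(T+1)}}\right)$$

Under the null hypothesis of homogenous slopes, both statistics are asymptotically standard normal, so large positive values indicate slope heterogeneity.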
After examining whether cross-sectional dependence and slope heterogeneity exist in the panel data, the next step is to employ unit root tests to examine whether the study variables are stationary, in order to avoid a potential spurious regression problem. Although testing for the presence of a unit root has become a customary procedure in panel data, there is no universal agreement on the length of the time series that makes such testing necessary. It is largely suggested that a possible nonstationary process must be examined in panel data containing several time-series periods (Baltagi, 2005; Roodman, 2009). While unit root tests are essential, the presence of a unit root does not indicate that tourism demand models cannot be estimated. Rather, it suggests that cointegration analysis needs to be conducted prior to estimating a tourism demand model using appropriate empirical methods. If a cointegration relationship exists between the study variables, or simply if the variables of interest move together in the long run, cointegration-based panel data models can be applied to estimate tourism demand models. The econometrics literature offers a variety of unit root tests, including but not limited to the ADF–Fisher test developed by Maddala and Wu (1999), the LLC test developed by Levin et al. (2002), and the IPS test developed by Im et al. (2003) and extended by Pesaran (2007).
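To illustrate the logic of a Fisher-type panel unit root test in the spirit of Maddala and Wu (1999), the sketch below combines unit-level p-values into the statistic P = -2 * sum(ln p_i), which is chi-square distributed with 2N degrees of freedom under the null that every unit has a unit root (assuming independent units). The p-values are assumed to come from ADF tests run separately for each unit beforehand; the function names are hypothetical.

```python
import math


def chi2_sf_even_df(x, df):
    """Survival function of a chi-square variable with even df.

    Uses the closed form exp(-x/2) * sum_{k < df/2} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


def fisher_panel_unit_root(p_values):
    """Combine unit-level unit root p-values (each in (0, 1]).

    Returns (statistic, panel_p_value); a small panel p-value rejects
    the null that all units contain a unit root.
    """
    stat = -2.0 * sum(math.log(p) for p in p_values)
    df = 2 * len(p_values)
    return stat, chi2_sf_even_df(stat, df)
```

Because the combination assumes independence across units, this form belongs with the tests used when cross-sectional dependence is absent.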
Overall, it is clear that researchers may obtain inefficient and biased estimates if the cross-sectional dependence, slope heterogeneity, and nonstationarity were to be ignored. In the following section, we discuss four possible scenarios considering the presence or absence of cross-sectional dependence, slope heterogeneity, and/or nonstationarity and present alternative panel data models that can be performed to estimate tourism demand models under each scenario.
Panel data models under alternative scenarios
The absence of cross-sectional dependence and slope heterogeneity
If cross-sectional dependence and slope heterogeneity are not present in the panel data under investigation, the stationarity of the panel data must be examined prior to choosing from several alternative static and dynamic panel data models to estimate tourism demand models. A variety of unit root tests has been developed to test for the stationarity of the study variables. The first-group unit root tests were developed by Harris and Tzavalis (1999), Breitung (2001), Hadri and Kurozumi (2012), and Levin et al. (2002), and the second-group unit root tests were proposed by Maddala and Wu (1999), Choi (2001), and Im et al. (2003). The estimation procedures of the first-group tests assume that the parameter of the one-period lagged dependent variable does not change across cross-sectional units, whereas the second-group tests posit that this parameter can change across cross-sectional units.
The pooled regression model, the fixed effects model, and the random effects model can be employed if all of the study variables are stationary. Greene (2003) suggests that the pooled regression model can be employed if zi includes only a constant term. While the fixed effects model should be used if zi is unobserved but correlated with xit, the random effects model should be employed if the unit effects in zi differ across cross-sectional units but are uncorrelated with xit. The Hausman (1978) test can be used to statistically determine whether the fixed or random effects panel data model is appropriate in a given context. However, as we noted earlier, static panel data models should not be utilized when estimating tourism demand models because they are likely to produce econometrically inefficient and biased estimates and because such a specification would be inconsistent with tourism demand theory (Dogru et al., 2017; Song et al., 2019).
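The difference between pooling and controlling for unit effects can be seen in a small sketch with one regressor. The within (fixed effects) estimator removes each unit's mean before estimating the slope, whereas pooled OLS ignores unit effects. The function names and the two-country toy data are hypothetical and chosen only to show how correlation between the unit effect and the regressor distorts the pooled slope.

```python
def pooled_slope(panel):
    """OLS slope from stacking all (x, y) observations, ignoring units."""
    obs = [pair for rows in panel.values() for pair in rows]
    x_bar = sum(x for x, _ in obs) / len(obs)
    y_bar = sum(y for _, y in obs) / len(obs)
    num = sum((x - x_bar) * (y - y_bar) for x, y in obs)
    den = sum((x - x_bar) ** 2 for x, _ in obs)
    return num / den


def within_slope(panel):
    """Fixed effects (within) slope: demean x and y unit by unit."""
    num = den = 0.0
    for rows in panel.values():
        x_bar = sum(x for x, _ in rows) / len(rows)
        y_bar = sum(y for _, y in rows) / len(rows)
        for x, y in rows:
            num += (x - x_bar) * (y - y_bar)
            den += (x - x_bar) ** 2
    return num / den


# Toy data: both units follow y = intercept + 2 * x, but the unit with
# larger x values has a much lower intercept.
toy_panel = {
    "A": [(1, 12), (2, 14), (3, 16)],   # intercept 10
    "B": [(4, 3), (5, 5), (6, 7)],      # intercept -5
}
print("pooled slope:", pooled_slope(toy_panel))
print("within slope:", within_slope(toy_panel))
```

In this toy panel the within estimator recovers the common slope of 2, while the pooled slope is negative because the unit effects are correlated with the regressor.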
While testing for the presence of a unit root is essential prior to modeling tourism demand, conducting unit root tests will not always be necessary in the context of panel data. In particular, the econometrics literature offers the difference GMM estimator (Arellano and Bond, 1991) and the system GMM estimator (Arellano and Bover, 1995), which can be utilized to model tourism demand without examining the unit root properties of the study variables. However, the difference and/or system GMM methods require that the panel data comprise a large number of cross-sectional units (N) and a small number of time periods (T; i.e. N > T) and that the lagged value of the dependent variable be included in the model as an additional regressor. If these conditions are met, the difference and/or system GMM methods can be employed without testing for the presence of a unit root in the panel data.
On the one hand, the difference GMM method utilizes a first-difference transformation, which subtracts the previous values of the study variables from their current values. The system GMM method, on the other hand, can utilize a forward orthogonal deviations transformation, which subtracts the average of all available future observations from the current values of the study variables. Although both the difference and system GMM methods are commonly used in the tourism demand literature (see, e.g. Brida and Risso, 2009; Garín-Muñoz and Montero-Martín, 2007), Alonso-Borrego and Arellano (1999) and Blundell and Bond (1998) showed via Monte Carlo simulations that the system GMM approach produces more efficient estimates than the difference GMM method. However, we must note that both the difference and system GMM methods use instrumental variables to remedy a possible endogeneity problem, which may arise in panel data due to omitted variables and/or correlation between the regressors and the error term. In the context of tourism demand models, a possible endogeneity problem may arise, for example, due to the unobservable effects of countries’ political, social, and cultural relationships and/or differences. Therefore, the validity of the instrumental variables must be tested prior to interpreting the estimated empirical model. The Sargan (1958) test is customarily used to test for the validity of the instrumental variables used in the difference and system GMM methods.
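The two transformations mentioned in this paragraph can be sketched for a single unit's time series as follows. This is a minimal illustration with hypothetical function names, not a GMM estimator: it only shows how each transformation removes a time-invariant unit effect. The scaling factor sqrt(n / (n + 1)), where n is the number of future observations, is the conventional one for forward orthogonal deviations.

```python
import math


def first_difference(series):
    """Subtract the previous value from the current value."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]


def forward_orthogonal_deviations(series):
    """Subtract the mean of all future values from the current value.

    Each deviation is scaled by sqrt(n / (n + 1)), with n the number
    of remaining future observations; the last observation is dropped.
    """
    out = []
    for t in range(len(series) - 1):
        future = series[t + 1:]
        n = len(future)
        scale = math.sqrt(n / (n + 1))
        out.append(scale * (series[t] - sum(future) / n))
    return out


# A constant series (a pure unit effect) is wiped out by both.
print(first_difference([5, 5, 5]))
print(forward_orthogonal_deviations([5, 5, 5]))
```

One practical difference is that first differencing magnifies gaps in unbalanced panels, whereas forward orthogonal deviations can be computed as long as any future observation is available.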
Although the difference and system GMM methods can be used without examining the unit root process of the study variables, the presence of unit root is likely to lead to weak instrumental variable issues in the model (Baltagi, 2005; Roodman, 2009). Gujarati (2003: 797) states a variable is stationary if its mean and variance are constant over time and the value of the covariance between the two time periods depends only on the distance or gap or lag between the two time periods and not the actual time at which the covariance is computed.
If a cointegration relationship exists between the study variables, the panel FMOLS estimator and the panel dynamic ordinary least squares (DOLS) estimator developed by Pedroni (2001) can be employed to estimate tourism demand models. For these estimators, Pedroni (2001) extends the time-series FMOLS estimator, which was originally developed by Phillips and Hansen (1990), and the DOLS estimator, which is based on the study of Stock and Watson (1993). Both panel FMOLS and DOLS models are dynamic panel data models, and they are capable of producing efficient estimates in panels with small samples. The panel FMOLS model makes a semi-parametric correction to eliminate potential endogeneity and serial correlation problems and to produce asymptotically unbiased and efficient coefficient estimates. The panel DOLS model generates a long-run dynamic equation that contains the leads and lags of the differences of the explanatory variables to correct for possible endogeneity and serial correlation problems. Despite their efficiency in estimating dynamic panel data models, the DOLS and FMOLS methods have rarely been utilized to estimate tourism demand models. Among the few exceptions, Seetanah et al. (2010), Dogru and Sirakaya-Turk (2016), and Dogru et al. (2017) employed the panel FMOLS model to estimate tourism demand models.
The panel DOLS and FMOLS models can only be employed when the study variables are integrated of the same order. In practice, however, some study variables can be stationary at level, I(0), while other study variables can be stationary at their first differences, I(1). In these circumstances, the panel ARDL method, which was developed by Pesaran et al. (1999), should be employed to estimate tourism demand models. The panel ARDL model is also a dynamic panel data model, and it allows estimation based on the mean group (MG) estimator and the pooled mean group (PMG) estimator. While the MG estimator can be employed irrespective of whether slope heterogeneity exists, the PMG estimator cannot be used when slope heterogeneity is present in the panel data. In addition to the cointegration tests of Kao (1999) and Pedroni (2004), one can alternatively determine whether a cointegration relationship exists between the study variables via the coefficient of the error correction term. A negative and statistically significant coefficient of the error correction term suggests that a cointegration relationship exists between the study variables, and hence, the results from the estimated model can be interpreted, as the estimated model would be asymptotically unbiased, efficient, and consistent.
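The role of the error correction term can be seen in the error correction form of the panel ARDL(p, q) model. The notation below is a sketch following the usual presentation of Pesaran et al. (1999); the symbols are introduced here for illustration.

$$\Delta y_{it} = \phi_i \left(y_{i,t-1} - \theta_i' x_{it}\right) + \sum_{j=1}^{p-1} \lambda_{ij}^{*} \Delta y_{i,t-j} + \sum_{j=0}^{q-1} \delta_{ij}^{*\prime} \Delta x_{i,t-j} + \mu_i + \varepsilon_{it}$$

Here $\theta_i$ collects the long-run coefficients, $\phi_i$ is the coefficient of the error correction term (the speed of adjustment), and $\mu_i$ is a unit-specific effect. A value of $\phi_i < 0$ means that deviations from the long-run relationship are corrected over time. The PMG estimator imposes $\theta_i = \theta$ for all units while leaving the short-run coefficients unit specific, whereas the MG estimator estimates every coefficient unit by unit and averages the results.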
The absence of cross-sectional dependence and the presence of slope heterogeneity
If the cross-sectional dependence is not present but there is a presence of slope heterogeneity in the panel data under investigation, researchers should employ panel data models that can produce efficient and asymptotically unbiased estimates in panels with the presence of slope heterogeneity. However, prior to determining the panel data models, the presence of unit root must be examined using the second-group unit root tests to account for the slope heterogeneity (see, e.g. Choi, 2001; Im et al., 2003; Maddala and Wu, 1999). While the panel random effects model can be employed when slope heterogeneity is present and the study variables are found to be stationary at their levels, we do not recommend the use of static panel data models for both empirical and theoretical reasons discussed throughout this article. Also, due to the presence of slope heterogeneity, the difference and/or system GMM methods cannot be employed even though the study variables are stationary at their levels I(0).
If the study variables are stationary at their first differences, I(1), the cointegration test of Pedroni (2004) must be employed prior to estimating the tourism demand model. If the study variables are found to be cointegrated in the long run, the panel FMOLS, panel DOLS, and/or panel ARDL methods can be utilized to estimate the tourism demand model. In the event that the panel FMOLS or panel DOLS methods are utilized, the coefficient estimates must be produced for each cross-sectional unit (e.g. country, city, destination, etc.) because of the presence of slope heterogeneity in the panel. Similarly, the MG estimator must be employed if the panel ARDL technique is used, because the PMG estimator produces a pooled slope coefficient for the entire panel and therefore cannot be used when slope heterogeneity is present in the panel data.
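The idea behind the MG estimator can be sketched in a few lines: estimate the coefficient separately for each unit, then average the unit-specific estimates and use their dispersion for the standard error. The example below uses a simple one-regressor OLS slope per unit as a stand-in for the unit-level ARDL regressions, which is a simplifying assumption; the function names and data are hypothetical.

```python
import math


def ols_slope(x, y):
    """OLS slope of y on x (with intercept) for one unit."""
    x_bar, y_bar = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    den = sum((a - x_bar) ** 2 for a in x)
    return num / den


def mean_group(panel):
    """Mean group estimate from unit-by-unit slopes.

    panel: dict mapping a unit label to a pair (x_values, y_values).
    Returns (mg_slope, standard_error, unit_slopes). The standard
    error is the nonparametric one based on the spread of the unit
    slopes: sqrt(sum((b_i - b_mg)^2) / (N * (N - 1))).
    """
    slopes = {unit: ols_slope(x, y) for unit, (x, y) in panel.items()}
    n = len(slopes)
    mg = sum(slopes.values()) / n
    spread = sum((b - mg) ** 2 for b in slopes.values())
    return mg, math.sqrt(spread / (n * (n - 1))), slopes
```

Reporting the unit-specific slopes alongside the average keeps the destination-level differences visible, which is the practical point of allowing for slope heterogeneity.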
The presence of cross-sectional dependence and the absence of slope heterogeneity
In some situations, the panel data can be cross-sectionally dependent, while the slope coefficient is homogenous. In this case, empirical techniques that take the cross-sectional dependence into account must be employed to produce consistent, efficient, and unbiased estimates. While there are alternative empirical methods that take the cross-sectional dependence into account, the stationarity of the study variables must first be examined prior to estimating a tourism demand model. The most commonly applied unit root tests in the presence of cross-sectional dependence are the tests of Breuer et al. (2001), Pesaran (2007), and Hadri and Kurozumi (2012). Then, if the study variables are stationary at their first differences, I(1), the cointegration relationship between the study variables must be determined. In this context, the Westerlund (2007) panel cointegration test based on the error correction model, the Westerlund (2008) panel cointegration test based on the Durbin–Hausman principle of Choi (1994), and the Westerlund and Edgerton (2007) panel bootstrap cointegration test can be employed to test for the cointegration relationship between the study variables. Finally, if the study variables are found to have a statistically significant cointegration relationship, the long-run parameters of the tourism demand model can be estimated using three alternative panel data models. First, the common correlated effects (CCE) estimator developed by Pesaran (2006), which utilizes cross-sectional averages of the variables to produce the long-run estimates, can be employed to estimate tourism demand models. Second, the two-stage augmented mean group (AMG) estimator developed by Eberhardt and Bond (2009) can also be employed when the cross-sectional units are dependent while the slope coefficient is homogenous. In the first stage, a model with time dummies is estimated through the pooled first-difference estimator. In the second stage, these estimates are included in the models established for each cross-sectional unit. Third, Chudik and Pesaran (2015) developed the dynamic CCE, which includes the lagged value of the dependent variable as one of the regressors; the dynamic CCE can be employed to produce efficient and asymptotically unbiased coefficient estimates in tourism demand models. While these models are capable of producing coefficient estimates for each cross-sectional unit in the panel, a single model for the whole panel is sufficient because slope heterogeneity does not exist in the panel data.
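The core step of the CCE approach, augmenting each unit's regression with cross-sectional averages, can be sketched as follows. The code only builds the augmented regressor rows for a balanced panel with one explanatory variable; the subsequent regressions, the data layout, and the function names are assumptions made for this illustration.

```python
def cross_section_averages(panel):
    """Average a variable across units for each time period.

    panel: dict mapping a unit label to a list of T values (balanced).
    """
    series = list(panel.values())
    periods = len(series[0])
    return [sum(s[t] for s in series) / len(series) for t in range(periods)]


def cce_augmented_rows(y, x):
    """Build (y_it, [x_it, y_bar_t, x_bar_t]) rows for each unit.

    The cross-sectional averages y_bar_t and x_bar_t act as proxies
    for unobserved common factors in each unit's regression.
    """
    y_bar = cross_section_averages(y)
    x_bar = cross_section_averages(x)
    return {
        unit: [(y[unit][t], [x[unit][t], y_bar[t], x_bar[t]])
               for t in range(len(y_bar))]
        for unit in y
    }
```

Each unit's rows can then be passed to any least squares routine; averaging or pooling the resulting coefficients on x gives the mean group or pooled variant of the estimator.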
The presence of both cross-sectional dependence and slope heterogeneity
When both cross-sectional dependence and slope heterogeneity are observed in the panel data, the procedures for determining the applicable panel data models are the same as in the previous section. To reiterate, the CCE estimator of Pesaran (2006), the two-stage AMG estimator of Eberhardt and Bond (2009), and the dynamic CCE estimator of Chudik and Pesaran (2015) can be utilized to estimate tourism demand models. However, given the presence of slope heterogeneity in the panel data, the estimates must be produced for each cross-sectional unit in the panel. One must also note that the long-run parameters of the tourism demand models cannot be estimated if the study variables are not cointegrated.
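The unit-by-unit estimation described above can be sketched as follows. This is a simplified mean group variant of the CCE idea for a balanced panel with one regressor, offered as an illustration only: the function name `cce_mean_group`, the data layout, and the simulated slopes are assumptions, and the sketch omits standard errors and cointegration pre-testing.

```python
import numpy as np


def cce_mean_group(y, x):
    """Unit-specific CCE slopes and their simple average.

    y, x: arrays of shape (N, T), one row per cross-sectional unit
    (a hypothetical layout chosen for this sketch).
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    n_units, n_periods = y.shape
    y_bar = y.mean(axis=0)
    x_bar = x.mean(axis=0)
    slopes = np.empty(n_units)
    for i in range(n_units):
        # Each unit gets its own regression, augmented with the same
        # cross-sectional averages as proxies for common factors.
        design = np.column_stack([np.ones(n_periods), x[i], y_bar, x_bar])
        coef, *_ = np.linalg.lstsq(design, y[i], rcond=None)
        slopes[i] = coef[1]
    return slopes, slopes.mean()


# Simulated data (assumed values): each unit has its own slope in [1, 2].
rng = np.random.default_rng(1)
n_units, n_periods = 100, 60
shock = rng.normal(size=n_periods)
true_slopes = rng.uniform(1.0, 2.0, size=(n_units, 1))
x_sim = rng.normal(size=(n_units, 1)) * shock + rng.normal(size=(n_units, n_periods))
loadings = 1.0 + rng.uniform(size=(n_units, 1))
y_sim = rng.normal(size=(n_units, 1)) + true_slopes * x_sim + loadings * shock

unit_slopes, mg_slope = cce_mean_group(y_sim, x_sim)
print(unit_slopes[:3], mg_slope)
```

The per-unit vector `unit_slopes` is the quantity of interest under slope heterogeneity; the average is reported only as a summary.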
Conclusion
In this study, we provided a step-by-step guide for determining which panel data models can be utilized in the context of tourism demand by following proper methodological procedures. We further discussed the research implications of utilizing panel data in the context of tourism demand modeling. Although the application of static panel data models was limited to a few studies, one must note that static panel data models should be avoided entirely.
Tourism demand is dynamic in nature: tourists’ past behavior is expected to affect current tourism demand, because it shapes tourists’ intention to revisit and/or creates a word-of-mouth effect. Therefore, tourism demand should be modeled utilizing dynamic panel data models, where the lagged dependent variable (autoregressive term) is included in the model as an additional explanatory variable.
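As a minimal sketch of what including the lagged dependent variable means in practice, the code below builds the lag within each cross-sectional unit and fits the resulting model. The function name `lagged_design`, the array layout, and the simulated parameter values are assumptions. Ordinary least squares is used here only to keep the example short; with unit-specific effects and a short time dimension it is known to be biased, and an estimator designed for dynamic panels, such as the difference or system GMM mentioned in this article, would replace that step.

```python
import numpy as np


def lagged_design(y, x):
    """Build regression inputs for y_it = c + rho * y_i,t-1 + beta * x_it.

    y, x: arrays of shape (N, T), one row per cross-sectional unit
    (a hypothetical layout chosen for this sketch).
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    # Lags are taken within each row, so no unit borrows another unit's history.
    y_now = y[:, 1:].ravel()
    y_lag = y[:, :-1].ravel()
    x_now = x[:, 1:].ravel()
    design = np.column_stack([np.ones_like(y_now), y_lag, x_now])
    return y_now, design


# Simulated, noise-free data (assumed values): persistence 0.6 and a
# short-run effect of 0.8 for the explanatory variable.
rng = np.random.default_rng(2)
n_units, n_periods = 5, 12
x_sim = rng.normal(size=(n_units, n_periods))
y_sim = np.zeros((n_units, n_periods))
y_sim[:, 0] = rng.normal(size=n_units)
for t in range(1, n_periods):
    y_sim[:, t] = 0.5 + 0.6 * y_sim[:, t - 1] + 0.8 * x_sim[:, t]

target, design = lagged_design(y_sim, x_sim)
coef, *_ = np.linalg.lstsq(design, target, rcond=None)
persistence, short_run = coef[1], coef[2]
# Long-run effect implied by the dynamic specification.
long_run = short_run / (1.0 - persistence)
print(persistence, short_run, long_run)
```

The coefficient on the lag captures habit persistence and word-of-mouth effects, and dividing the short-run coefficient by one minus the persistence gives the implied long-run effect.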
The application of contemporary dynamic panel data models is essential to advance the tourism demand literature. However, dynamic panel data models should not be chosen arbitrarily, because arbitrarily chosen empirical methods may yield inefficient and biased coefficient estimates. The presence of slope heterogeneity, cross-sectional dependence, and unit roots must be examined prior to modeling tourism demand via panel data models. If these conditions are present in the panel data under investigation, panel data models that account for them should be utilized. Otherwise, the estimated parameters are likely to be inconsistent and biased. Moreover, the practical implications drawn from an econometrically inappropriate panel data model will be compromised. This is particularly concerning when slope heterogeneity is present in the panel data. Cross-sectional dependence might also lead to biased and inconsistent estimates in tourism demand models. Furthermore, the probability distribution of tourism demand data is likely to change over time, and hence the panel data may contain unit roots. Tourism demand models that ignore the presence of unit roots in panel data may be unreliable. When the variables in a panel data model contain a unit root, cointegration-based empirical techniques are necessary for modeling tourism demand. Therefore, we strongly recommend testing for the presence of slope heterogeneity, cross-sectional dependence, and unit roots prior to choosing an empirical model to estimate tourism demand. However, unit root testing may not be necessary when the difference and/or system GMM panel data models are utilized to model tourism demand.
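As one illustration of the recommended pre-testing, the sketch below computes a cross-sectional dependence statistic of the CD type proposed by Pesaran, which scales the sum of pairwise residual correlations and is compared with a standard normal distribution under the null of no cross-sectional dependence. The function name `pesaran_cd`, the (N, T) residual layout, and the simulated residuals are assumptions made for this example; it covers balanced panels only and is not a substitute for the slope heterogeneity and unit root tests discussed above.

```python
import math

import numpy as np


def pesaran_cd(resid):
    """CD statistic and two-sided p-value for a balanced panel.

    resid: array of shape (N, T) holding residuals from unit-by-unit
    regressions (a hypothetical layout chosen for this sketch).
    """
    resid = np.asarray(resid, dtype=float)
    n_units, n_periods = resid.shape
    # Pairwise correlations between the residual series of all units.
    corr = np.corrcoef(resid)
    upper = corr[np.triu_indices(n_units, k=1)]
    scale = math.sqrt(2.0 * n_periods / (n_units * (n_units - 1)))
    cd = scale * float(upper.sum())
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(cd) / math.sqrt(2.0))
    return cd, p_value


# Simulated residuals (assumed values) that share a strong common shock.
rng = np.random.default_rng(3)
n_units, n_periods = 10, 40
shock = rng.normal(size=n_periods)
dependent = shock + 0.1 * rng.normal(size=(n_units, n_periods))

cd_stat, p_val = pesaran_cd(dependent)
print(cd_stat, p_val)
```

A large absolute statistic with a small p-value, as expected for these simulated residuals, would point toward estimators that account for cross-sectional dependence.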
In our review of the extant literature, we found that the majority of studies modeling tourism demand do not compare alternative panel data models but instead employ a single panel data model. However, the efficacy of a panel data model cannot be assessed if alternative panel data models are not utilized in a given context. Therefore, future studies should compare the efficacy of alternative panel data models. This course of action is necessary to make progress in the tourism demand literature. Also, we observed that many studies that analyzed the effects of climate change, terrorism, immigration, trade policies, tariffs, and visa requirements do not incorporate the main explanatory variables that determine tourism demand. That is, the effects of these external factors (e.g. climate change, tariffs, etc.) on tourism demand were examined in isolation without integrating the main explanatory variables (e.g. income, prices, exchange rate, etc.) of tourism demand. Such studies are not aligned with the theory of tourism demand and can yield unreliable estimates. To rectify these methodological inaccuracies, future studies should investigate the role of climate change, tariffs, visa requirements, and other global and macroeconomic factors in tourism demand while incorporating the main factors, such as income, price, and exchange rate.
Furthermore, many subjects within the context of tourism demand have received little or no attention from the academic community (e.g. supply–demand interaction, outbound tourism demand, domestic tourism demand, etc.). Although international tourism demand has received extensive attention, the primary focus has been on estimating the determinants of inbound tourism demand. Only a few studies have estimated tourism demand models for outbound tourism and the determinants of tourism demand for main tourist-generating countries like the United States, Germany, and France. While it is important to explain the factors affecting tourism demand, it is also essential to explain why countries receive significantly fewer tourists from certain countries. Modeling domestic tourism demand has also received scant attention despite its major economic impact on communities and countries around the globe. Similarly, state, province, city, and niche-based tourism demand models are not widely examined. Investigating these subject matters remains a niche for future research.
Also, tourism demand models are usually estimated without incorporating the supply-side dynamics of the tourism industry in a given context. Quantifying and obtaining reliable data on tourism supply is not always straightforward. However, tourism supply components can contribute to the explanation of tourism demand, and vice versa. Such analysis will require empirical techniques based on simultaneous equation systems and warrants further investigation. Although many panel data techniques have been utilized in the context of modeling tourism demand, novel empirical techniques that attempt to increase the efficiency of the models and provide robust estimates are continuously emerging. For example, the relationships between economic variables are commonly assumed to be best estimated using linear modeling. However, linear empirical methods may not always be ideal, and hence employing nonlinear empirical approaches may provide better estimates. A powerful approach that mimics a quasi-experimental design, the difference-in-differences technique (sometimes called a controlled before-and-after study in some social sciences), has also rarely been applied in the context of tourism demand. This method can be useful for drawing causal inferences in tourism demand modeling. Also, the estimation of tourism demand models within the Bayesian framework has not received the attention it deserves. Future studies can apply Bayesian dynamic panel models when modeling tourism demand. Furthermore, the development of disruptive technologies and the rise of the sharing economy, especially the rise of alternative accommodation platforms (e.g. Airbnb), deserve the interest of tourism economists and hence should be incorporated in tourism demand models.
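To illustrate the logic of the difference-in-differences technique mentioned above, the following minimal sketch computes the basic two-group, two-period estimate: the change in the treated group minus the change in the control group. The function name `did_estimate` and all arrival figures are hypothetical, and the interpretation as a causal effect rests on the assumption that both groups would have followed parallel trends in the absence of the intervention (for example, a visa policy change).

```python
import numpy as np


def did_estimate(outcome, treated, post):
    """Two-group, two-period difference-in-differences estimate."""
    outcome = np.asarray(outcome, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    post = np.asarray(post, dtype=bool)

    def cell_mean(is_treated, is_post):
        mask = (treated == is_treated) & (post == is_post)
        return outcome[mask].mean()

    change_treated = cell_mean(True, True) - cell_mean(True, False)
    change_control = cell_mean(False, True) - cell_mean(False, False)
    return change_treated - change_control


# Hypothetical arrivals (in thousands) from four origin markets, two of which
# are affected by the intervention in the second period.
arrivals = [100, 110, 130, 140, 90, 100, 100, 110]
is_treated = [1, 1, 1, 1, 0, 0, 0, 0]
is_post = [0, 0, 1, 1, 0, 0, 1, 1]

effect = did_estimate(arrivals, is_treated, is_post)
print(effect)  # (135 - 105) - (105 - 95) = 20.0
```

In applied work the same comparison is usually embedded in a regression with control variables such as income, prices, and exchange rates, in line with the recommendation above to retain the main determinants of tourism demand.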
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
