Tourism Demand Interval Forecasting Amid COVID-19: A Hybrid Model With a Modified Multi-Objective Optimization Algorithm

Abstract

A hybrid tourism demand interval forecasting system is proposed consisting of two parts: the construction of forecasting interval based on lower and upper bound estimates, and the forecasting interval adjustment based on an optimized reduction coefficient. Coronavirus factors are added as input variables to improve forecasting performance. A new multi-objective optimization algorithm is proposed to construct a feature selection method, optimize the forecasting model, and estimate the optimal reduction coefficient. The results of the experiments show that the proposed system has a powerful interval forecasting ability, which provides crucial guidance for balancing the recovery of the tourism industry and the control of the epidemic spread during the COVID-19 pandemic, and contributes to contingency planning for tourism practitioners and managers.

Keywords

COVID-19; daily tourism demand forecasting; interval forecasting; modified multi-objective artificial gorilla troops optimization algorithm; tourism recovery

Highlights

Tourism interval forecasting provides guidance for tourism recovery during the COVID-19 pandemic.

Feature selection was employed to adaptively select input variables.

A modified multi-objective algorithm is proposed to optimize parameters.

An optimized reduction coefficient is proposed to adjust interval forecasting width.

Introduction

The tourism industry accounts for a significant share of the global economy (Hu, Qiu et al., 2021). The World Tourism Organization expects the continuous growth of tourism over time. From 2009 to 2019, real growth in international tourism receipts exceeded the growth of the world’s gross domestic product. In 2019, the total international tourism receipts reached US$1481 billion, accounting for 7% of global exports and 28% of the world’s service exports, underscoring the sector’s strength and resilience. To maintain the sustainability and stable development of tourism, tourism demand forecasting has attracted increased attention from experts and practitioners (Wu, Ji et al., 2021).

Compared with forecasting at time scales of months and years, daily tourism demand forecasting can provide more specific and targeted guidance for related organizations (Li, Ge et al., 2020). For tourism industry practitioners and managers, accurate daily tourism demand forecasting helps to set reasonable prices, enhance the utilization rate of tourism-related services, and promote the efficient allocation of resources (Bi, Li, Xu & Li, 2021). For local governments and relevant departments of destinations, accurate tourism demand forecasting helps in risk management, strategic planning, and reasonable dispatching (Bi, Li et al. 2021; Athanasopoulos et al. 2018). Thus, accurate and effective tourism demand forecasting has great practical application value (Hu et al., 2021; Álvarez-Díaz & Rosselló-Nadal, 2010). However, tourism demand time series have distinct nonstationary, stochastic, and nonlinear characteristics that are attributed to the impact of various external factors, such as irregular events and seasonal variations (Duan et al., 2022; Zhang et al., 2022). For example, COVID-19 caused structural breaks in global tourism markets, creating significant challenges for tourism demand forecasting performance.

Experts and scholars have taken steps to deal with the disadvantages of using only single tourism demand time series data in tourism demand forecasting. Specifically, with the development of the Internet and other information sources, tourists tend to collect information about a destination in advance, such as hotels, weather, and transportation. This behavior has triggered the rise and development of search engine data (Liu, Liu et al., 2021). Owing to their real-time and high-frequency characteristics, search engine data are used in tourism demand forecasting to supplement traditional data sources and obtain a better grasp of the future trends in tourism (Li & Law, 2020). However, owing to the huge volume of search engine data, it is important to identify useful key information. For example, COVID-19 has had a tremendous impact on the global tourism industry (Gunter et al., 2022; Shin et al., 2021) but was not considered in previous studies.

Most search query data are determined by intuition and prior domain knowledge. Too few variables may lead to the loss of crucial information, while excess amounts of data may contribute to multicollinearity or overfitting issues (Li et al., 2021). Feature selection can automatically determine the appropriate input variables relative to tourism demand from search engines. It is a key step in filtering invalid information and enhancing algorithm accuracy (Chandrashekar & Sahin, 2014). Given the large amount of search engine data, a data subset can be obtained based on feature selection for improved forecasting results (Domingos, 2012). The main advantages of feature selection are its ability to deal with overfitting and improve forecasting performance (Guyon & De, 2003). Thus, it is a useful tool for selecting proper feature variables from massive amounts of search engine data to improve tourism demand forecasting ability.

Another attempt to improve tourism demand forecasting accuracy is the development of more effective methods, such as artificial intelligence (AI) methods. AI models have been widely used in time series forecasting (Nie et al., 2020; Wang, Wang, Lu et al. 2021; Jiang et al., 2021; Niu & Wang, 2019; Wang et al., 2021). AI models, including artificial neural networks (ANNs; Zhang Wang & Niu 2021) and other technologies (Wei et al., 2021), possess excellent nonlinear fitting ability to better capture the nonlinearity and randomness of the tourism demand time series. The echo state network (ESN) is a traditional ANN that is widely used in forecasting fields (Qin et al., 2019). ESN is an advanced version of recurrent neural network (RNN). Compared with RNN, ESN has higher computational efficiency and accuracy (Wang et al., 2020). The unique characteristic of ESN is introducing reservoirs (a series of recurrent neurons) to replace the hidden layers in traditional ANNs. In the process of modeling, the recurrent neurons are randomly initialized and remain unchanged. Unlike other ANNs, which train all layers in each iteration, the ESN does not train the input and reservoir layers, which accelerates its learning and extrapolation process (Gao et al., 2021). Furthermore, the ESN has performed satisfactorily in addressing various forecasting problems, such as stock price return volatility prediction (Trierweiler Ribeiro et al., 2021), traffic forecasting (Zhang et al., 2020), and wind power forecasting (Wang et al., 2019). Because of its powerful fitting ability, ESN has been taken as the predictor for tourism demand forecasting in this paper. The ESN has three main parameters: reservoir scale, reservoir connection rate, and reservoir spectral radius. The design of these parameters significantly affects the forecasting performance of the ESN. To promote forecasting accuracy and stability, this paper proposes a modified multi-objective gorilla troops optimizer (MMOGTO) algorithm to optimize the hyperparameters of the ESN. The modified multi-objective optimization strategy considers both accuracy and stability, which is instrumental to achieving high-quality tourism demand forecasting results.

Although many advanced technologies and algorithms have been developed to enhance tourism demand forecasting performance, volatile market conditions and other external factors (e.g., COVID-19) pose serious challenges to accurate and robust tourism demand forecasting. The combination of these factors leads to the uncertain, nonlinear, and unstable characteristics of tourism demand time series. In this context, point forecasting is unable to provide valid and sufficient information for decision-makers. Thus, interval forecasting is considered by academics and practitioners to forecast future outcomes and to depict the uncertainty that traditional point forecasting leaves out (Wu et al., 2021). Compared with point forecasting, interval forecasting simultaneously exhibits the central tendency of the prediction and the variation range for a given confidence level. Moreover, given different confidence levels, different forecasting intervals are built, which is conducive to the formulation of various policies and strategies (Li, Wu et al., 2019). However, there are problems in interval forecasting that require attention, such as the tradeoff between the interval coverage rate and width, and low interval forecasting accuracy. To solve these problems, an interval reduction coefficient tuned by the modified multi-objective optimization algorithm is introduced to balance the coverage rate and width and to further improve interval forecasting accuracy and stability.

In summary, the main issues addressed by this paper are as follows:

(1) Extraction of key information from search engine data. Existing studies select key variables from search engine data by virtue of prior domain knowledge, which may lead to data loss or noisy data. To reduce the negative impact of subjective selection, a feature selection method is used to automatically choose important influencing factors. Feature selection has received little attention in past tourism demand forecasting research despite its ability to improve forecasting performance.

(2) Failure to consider COVID-19. The global tourism industry bore the brunt of COVID-19, which created huge uncertainties in tourism demand forecasting. How to incorporate the COVID-19 factor into the tourism demand forecasting framework is a topic worthy of further discussion.

(3) Optimization of the forecasting model. The optimization of hyperparameters in forecasting models is often neglected in research, despite its importance in improving forecasting performance (Wang et al., 2022).

(4) Incoordination between coverage rate and width in interval forecasting. The interval coverage rate and width are two contradictory indicators that should be balanced in interval forecasting. However, the forecasting performance of existing interval forecasting technologies is limited. Thus, it is essential to develop more effective methods to balance the interval coverage rate and width.

This paper proposes a novel hybrid interval forecasting system, namely MMOGTO-ESN with feature selection, to perform the lower and upper bound estimation (LUBE) of daily tourism demand interval forecasting. Considering that single input information limits the performance of daily tourism demand interval forecasting, many potential exogenous explanatory variables, such as search engine data and COVID-19 information, are considered as candidate feature variables in our study. The addition of COVID-19 information provides a better interpretation of the volatility and uncertainty of tourism demand time series and improves the effectiveness of interval forecasting. Subsequently, feature selection based on MMOGTO-ESN is performed to adaptively select the appropriate input variables from all candidate feature variables (e.g., historical tourism volumes, search engine data, and COVID-19 information). These variables are then input into the trained ESN whose hyperparameters are optimized by MMOGTO to obtain the initial lower and upper bounds of the forecasting interval. It should be noted that feature selection and ESN forecasting are conducted simultaneously to reduce running costs and enhance computational efficiency. To further improve the interval forecasting performance, a reduction coefficient tuned by MMOGTO is proposed to coordinate the coverage rate and width. By combining the initial lower and upper bounds of the forecasting interval with the optimized reduction coefficient, the final interval forecasting results can be obtained. Two datasets on two well-known tourist attractions in China and the United States were employed to verify the interval forecasting performance of the proposed forecasting system. The simulation results demonstrate the effectiveness of the proposed system, which can provide valuable references for managers’ decision-making.
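The tradeoff between coverage rate and width can be illustrated with a small sketch. The passage does not give the adjustment rule, so the function `shrink_interval` below assumes one possible form (scaling each interval around its midpoint by a coefficient `r`); the function names and the toy numbers are hypothetical and only show why the two indicators pull in opposite directions.

```python
from typing import List, Tuple


def coverage_rate(actual: List[float], lower: List[float], upper: List[float]) -> float:
    """Share of observations that fall inside their interval [lower, upper]."""
    hits = sum(1 for y, lo, up in zip(actual, lower, upper) if lo <= y <= up)
    return hits / len(actual)


def mean_width(lower: List[float], upper: List[float]) -> float:
    """Average interval width."""
    return sum(up - lo for lo, up in zip(lower, upper)) / len(lower)


def shrink_interval(lower: List[float], upper: List[float],
                    r: float) -> Tuple[List[float], List[float]]:
    """Assumed rule: scale each interval around its midpoint by r in (0, 1]."""
    new_lower, new_upper = [], []
    for lo, up in zip(lower, upper):
        mid = (lo + up) / 2.0
        half = r * (up - lo) / 2.0
        new_lower.append(mid - half)
        new_upper.append(mid + half)
    return new_lower, new_upper


# Toy data (hypothetical): initial bounds as a LUBE-style model might return them.
actual = [100.0, 135.0, 75.0, 150.0]
lower = [80.0, 100.0, 70.0, 100.0]
upper = [120.0, 140.0, 110.0, 140.0]

narrow_lower, narrow_upper = shrink_interval(lower, upper, r=0.5)
print(coverage_rate(actual, lower, upper), mean_width(lower, upper))
print(coverage_rate(actual, narrow_lower, narrow_upper),
      mean_width(narrow_lower, narrow_upper))
```

Halving the width here drops the coverage rate, which is why the coefficient has to be tuned against both indicators rather than fixed by hand.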

The remainder of this paper is organized as follows. Section 2 reviews the related work on tourism demand forecasting. Section 3 presents the methods and framework of the proposed forecasting system. The data selection and experimental analysis are described in Section 4, while Section 5 concludes the paper.

Related Work

Traditional Tourism Demand Forecasting Models

Researchers have explored many tourism demand forecasting methods, which can be classified into three types: time series methods, econometric methods, and AI methods. Time series methods are used to forecast tourism demand by analyzing the patterns in historical tourism demand data. Common time series methods include exponential smoothing, autoregressive (AR), moving average (MA), and autoregressive integrated moving average (ARIMA; Akin 2015; Lim & McAleer 2002). Econometric methods attempt to build rational bridges between exogenous variables and the dependent variable to explain and forecast the dependent variable. Econometric models can be classified as static models (e.g., gravity models [Morley et al., 2014], autoregressive distributed lag models), and dynamic models (e.g., vector autoregressive [VAR] models [Dergiades et al., 2018], time-varying parameter models [Song et al., 2011], and error correction models).

Although these methods have been proven to improve tourism demand forecasting performance, there are some disadvantages. For example, time series methods cannot deal with nonlinear time series patterns, while econometric methods always require some preliminary hypotheses that are often difficult to satisfy in practice (Jiang, Yang, et al., 2020). Recently, AI methods, such as ANNs and support vector regression (SVR) have been employed to predict tourism demand. Compared with time series and econometric models, AI models possess excellent nonlinear fitting ability and require no preliminary assumptions; hence, they are promising tools for improving tourism demand forecasting performance.

Tourism Demand Forecasting Based on Search Engine Data

Google is the world’s most popular search engine with a market share of over 90%. Thus, it is a reliable platform that collects and reflects tourists’ interests, and has been used by many researchers (Huang & Hao, 2021). Baidu is another useful search engine widely used in China. Data from Google Trends and Baidu Index can be used to effectively predict tourism demand (Bi et al., 2020). With the development of various forecasting methods, an increasing number of studies have employed search engine data along with time series methods, econometric methods, and AI methods to conduct tourism demand forecasting.

Pan and colleagues (2012) first applied Google search data and autoregressive moving average (ARMA) family models to forecast the demand for hotel rooms. Artola and colleagues (2015) employed Google Trends as an additional explanatory variable to predict the monthly tourist volumes of three regions in Germany. Econometric methods with search engine data have also attracted significant attention. For example, Yang and colleagues (2015) built two time series models based on Google Trends and Baidu Index, and found that both provide valid support for tourist arrivals forecasting. Moreover, the Baidu Index is superior to Google Trends in improving the performance of Chinese tourist arrival forecasting. Li and colleagues (2017) proposed a novel tourism demand forecasting framework by combining search engine query data with an econometric method (generalized dynamic factor model). Recently, AI models and search query data have been combined for tourism demand forecasting. Sun and colleagues (2019) considered Google, Baidu indexes, and a kernel-based extreme learning machine to conduct tourist volume forecasting. Hu and Song (2020) used an ANN model to integrate causal variables and search engine data for tourism demand forecasting, and achieved high forecasting accuracy. Some classical tourism demand forecasting methods based on search engine data are listed in Table 1.

Table 1.

Summary of Research on Tourism Forecasting With Search Engine Data.

Reference	Internet Data	Forecasting Model	Predicted Variables
Pan et al. (2012)	Google Trends	ARMA family models, autoregressive distributed lag model, the time-varying parameter model, and VAR	Demand forecasting for hotel rooms
Artola et al. (2015)	Google Trends	ARIMA	Tourism volumes to Spain
Wen et al. (2021)	Baidu Index	Mixed data sampling, seasonal ARIMA	Monthly tourist inflows to Hong Kong from mainland China
Li et al. (2020)	Baidu search engine and online review data	Seasonal Naïve, Exponential Smoothing State Space, ARIMA, ARIMA with explanatory variables, support vector machine, and random forest	Tourist arrivals to Mount Siguniang, China
Dergiades et al. (2018)	Google Trends	VAR	Tourist arrivals in Cyprus
Camacho & Pacce (2018)	Google’s search volume	Dynamic factor model	Forecasting travelers in Spain
Liu et al. (2018)	Baidu Index	VAR	Tourism arrivals to Guizhou, China
Havranek & Zeynalov (2021)	Google Trends	Mixed data sampling	Monthly tourist arrivals and overnight stops in Prague
Xie et al. (2021)	Baidu and economic indexes	Least squares support vector regression model with gravitational search algorithm	Cruise tourism demand in China
Höpken et al. (2021)	Google Trends	ARIMA, ANN	Tourism volumes to Sweden from major sending countries
Li et al. (2019)	Baidu Index	Back propagation neural network (BPNN), fruit fly optimization algorithm	Tourist arrivals to Mount Huangshan, China
Huang & Hao (2021)	Baidu Index and Google Trends	Ensemble SVR based deep belief network	Tourist volumes in Hong Kong

Tourism Demand Interval Forecasting

Previous researchers have focused on point forecasting, which only outputs a single value at each predicted time point. Point forecasting is popular because of its straightforward use for decision-makers. However, it poorly depicts structural instability and assumes that the samples are sufficiently large for model estimation. When structural instability occurs, or only a small number of samples are available, the forecasting results may be inaccurate (Song & Lin, 2010). Thus, interval forecasting is necessary to provide more comprehensive information for tourism planning and formulation.

Kim and colleagues (2011) built forecasting intervals based on several forecasting methods for tourism volume in Hong Kong and Australia. Athanasopoulos and colleagues (2011) tested three automated forecasting algorithms based on monthly, quarterly, and yearly data in terms of interval coverage. Song and Lin (2010) estimated interval demand elasticities to forecast the intervals of inbound and outbound tourism in Asia, which decreased the risk of forecasting failure caused by external shocks. Huang and Lin (2011) used the gray envelope prediction method to forecast international tourist arrivals in Taiwan. By constructing upper, lower, and central envelope lines, the interval forecasting results can serve as a valid reference for decision-makers. Song and colleagues (2019) used multiple time series methods to conduct density forecasts based on scoring rules. Li and colleagues (2019) applied a combination strategy to tourism interval forecasting and found that combination can significantly enhance interval forecasting performance. Xie and colleagues (2020) proposed a decomposition-ensemble approach to conduct tourism point and interval prediction. Overall, existing studies on tourism demand interval forecasting are relatively few, and the topic deserves more attention.

Tourism Demand Forecasting Amid COVID-19

The tourism industry is highly sensitive and vulnerable to a viral outbreak such as COVID-19 (Page et al., 2012). Since the outbreak of COVID-19, a number of studies have been devoted to exploring the impact of the pandemic on tourism demand. However, few studies have examined tourism demand forecasting amid COVID-19. Polyzos and colleagues (2021) employed long short-term memory (LSTM) to predict the impact of COVID-19 on tourist arrivals from China to the United States and Australia. According to the simulation results, it will take 6 months to a year for tourist arrivals to return to normal levels for Australia and the United States, respectively. Yang and colleagues (2022) used the lasso model to predict daily tourism demand and evaluated the role of online search queries in improving forecasting performance. Jaipuria and colleagues (2021) sought to forecast foreign tourists’ arrivals in India and the foreign exchange earnings based on ANN. The simulation results generated valuable theoretical and managerial implications for policymakers. Fotiadis and colleagues (2021) adopted LSTM and the generalized additive method to forecast 12-month international tourism arrivals under different scenarios. They asserted that the pandemic could cause huge economic losses and regress tourism growth by 15 years.

A tourism forecasting competition was organized to explore advanced forecasting technologies and to provide more information for managers and marketing organizers about the impact of the pandemic on the tourism industry. There were two stages: ex post forecasting of tourism demand before COVID-19 and ex ante forecasting of tourism demand during and after COVID-19. The main research results are as follows. Qiu and colleagues (2021) assembled ex post tourism demand forecasts using stacking models and set three scenarios to conduct ex ante forecasting. Results suggested a recovery of tourism arrivals of 10% to 70% compared to 2019 under different scenarios. Liu and colleagues (2021) developed a scenario-based judgmental forecast technology based on a novel index and found that the extent of recovery depended on the destination’s dependence on long-haul markets. Kourentzes and colleagues (2021) combined multiple forecasting methods to accomplish the tourism forecasting task in the first stage and conducted judgmental adjustment of model-based forecasting in the second stage. Experiments indicated that the average recovery relative to 2019 tourist arrivals is 58%, 34%, and 80% under the medium, severe, and optimistic scenarios, respectively. To date, however, no studies have considered tourism demand interval forecasting amid COVID-19, which inspires our study.

Methodology

ESN

The ESN is a popular tool for time series forecasting (Hu et al., 2021), which introduces reservoirs (a series of recurrent neurons) to replace the hidden layers in traditional RNN. The basic structure of an ESN has three parts: K input nodes in the input layer, M internal nodes in the reservoir layer, and L output nodes in the output layer.

The detailed steps include the following stages: First, the input and output sequences ( $IV_{t}$ and $OV_{t}$ ), the number of internal nodes in the reservoir layer M, and the weight matrices ( $w_{in}$ , $w$ , and $w_{back}$ ) are initialized. Then, the reservoir state $RV_{t}$ and the output vector are updated based on Equations 2 and 3. Finally, the optimal weight matrix between the reservoir and output layers, $w_{out}$ , is calculated by the least squares method.

The expressions of the input, reservoir state, and output vectors are as follows:

\begin{cases} IV_{t} = [IV_{1}(t), IV_{2}(t), \cdots, IV_{K}(t)]^{T} \\ RV_{t} = [RV_{1}(t), RV_{2}(t), \cdots, RV_{M}(t)]^{T} \\ OV_{t} = [OV_{1}(t), OV_{2}(t), \cdots, OV_{L}(t)]^{T} \end{cases}

(1)

where t is the time step. For the (t + 1)-th time step, the reservoir state $R V_{t + 1}$ and output $O V_{t + 1}$ vector can be updated as follows:

RV_{t+1} = f(w_{in} \cdot IV_{t+1} + w \cdot RV_{t} + w_{back} \cdot OV_{t})

(2)

OV_{t+1} = w_{out} \cdot [RV_{t+1}; IV_{t+1}]

(3)

where $w_{in} \in R^{M \times K}$ , $w \in R^{M \times M}$ , $w_{back} \in R^{M \times L}$ , and $w_{out} \in R^{L \times (K+M)}$ are the weight matrices connecting the input and reservoir layers, the reservoir layer with itself, the output and reservoir layers, and the reservoir and output layers, respectively. $w_{in}$ , $w$ , and $w_{back}$ are randomly initialized and remain unchanged during training, and $w_{out}$ is determined by linear regression. This simplifies the training process and enhances the computational efficiency. $f(\cdot)$ represents the activation function of the reservoir, which is designed as a hyperbolic tangent function in this study. The core training objective of the ESN is to calculate $w_{out}$ . For a training dataset with T time steps (t = 1, 2, . . ., T), the matrix form of $OV_{t+1}$ can be formulated as

OV = Z \cdot w_{out}^{T}

(4)

where $OV = [OV_{1}, OV_{2}, \cdots, OV_{T}]^{T} \in R^{T \times L}$ , $Z = [z_{1}, z_{2}, \cdots, z_{T}]^{T} \in R^{T \times (K+M)}$ , and $z_{t} = [RV_{t+1}; IV_{t+1}], t = 1, 2, \dots, T$ .

Subsequently, $w_{out}$ can be calculated by the least squares method, that is, $w_{out} = (Z^{-1} \cdot OV)^{T}$ , where $Z^{-1}$ represents the Moore–Penrose pseudoinverse of Z. Unlike the gradient descent training of an RNN, this closed-form least squares solution avoids vanishing and exploding gradients, and it is not trapped in local optima.
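The training procedure in Equations 1 to 4 can be sketched with NumPy as follows. This is a minimal illustration rather than the authors' implementation: the weight ranges, the sparsification step, the absence of a washout period, the use of the previous target as $OV_{t}$ during training (teacher forcing), and the toy sine series are all assumptions made for the example.

```python
import numpy as np


def train_esn(inputs, targets, n_reservoir=50, spectral_radius=0.9,
              connection_rate=0.1, seed=0):
    """Fit the readout w_out of a minimal ESN by least squares."""
    rng = np.random.default_rng(seed)
    T, K = inputs.shape
    L = targets.shape[1]

    # Random weights that stay fixed during training (assumed uniform ranges).
    w_in = rng.uniform(-0.5, 0.5, size=(n_reservoir, K))
    w = rng.uniform(-0.5, 0.5, size=(n_reservoir, n_reservoir))
    w *= rng.random((n_reservoir, n_reservoir)) < connection_rate  # sparsify
    radius = np.max(np.abs(np.linalg.eigvals(w)))
    if radius > 0:
        w *= spectral_radius / radius  # rescale to the requested spectral radius
    w_back = rng.uniform(-0.5, 0.5, size=(n_reservoir, L))

    state = np.zeros(n_reservoir)
    prev_out = np.zeros(L)
    rows = []
    for t in range(T):
        # Equation 2: reservoir update with tanh activation.
        state = np.tanh(w_in @ inputs[t] + w @ state + w_back @ prev_out)
        # z_t = [RV; IV], one row of Z in Equation 4.
        rows.append(np.concatenate([state, inputs[t]]))
        prev_out = targets[t]  # teacher forcing (assumption)
    Z = np.vstack(rows)                         # shape T x (M + K)
    w_out = (np.linalg.pinv(Z) @ targets).T     # shape L x (K + M)
    return w_in, w, w_back, w_out, Z


# Toy one-step-ahead task on a sine wave (hypothetical data).
series = np.sin(0.1 * np.arange(200))
inputs, targets = series[:-1, None], series[1:, None]
w_in, w, w_back, w_out, Z = train_esn(inputs, targets)
fitted = Z @ w_out.T  # Equation 4 applied to the training data
print(w_out.shape, float(np.mean((fitted - targets) ** 2)))
```

Only `w_out` is estimated; the three hyperparameters exposed here (`n_reservoir`, `connection_rate`, `spectral_radius`) correspond to the reservoir scale, connection rate, and spectral radius that the optimizer is meant to tune.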

MMOGTO

Gorilla Troops Optimization (GTO)

The GTO technique, recently proposed by Abdollahzadeh and colleagues (2021), was inspired by the collective behavior of gorillas. Gorillas are highly social animals that often live and migrate in groups. Within the group, the silverback gorilla holds the leadership position, while the other males, females, and offspring are subordinates. The positions of the gorillas are considered as the candidate solutions of the search space. The three types of candidate solutions are the position of the existing individual $X$ , the candidate position that is superior to the existing solution $S X$ , and the optimal solution of the silverback $X^{'}$ .

We first initialize N random individuals $X = \{X_{1}, X_{2}, \cdots, X_{N}\}$ , and preset the number of iterations and the parameters $Ψ$ and $γ$ . Then, the fitness of each individual is calculated. The GTO has two phases: exploration and exploitation. In the exploration phase, the position of the individual is updated based on Equation 5, and the parameters involved, including C and $L_{p}$ , are updated as shown below. By calculating and comparing the fitness values of the new solution and the current solution, we can determine the optimal solution in the exploration phase. In the exploitation phase, two mechanisms are used, namely following the silverback and competing for adult females. If the value of C is greater than or equal to a preset parameter $θ$ , the position of the individual is updated based on Equation 7; otherwise, it is updated based on Equation 8. Finally, the fitness values of the new solution and the current solution are compared to obtain the best solution in this phase. The exploration and exploitation processes are both performed in each iteration, and when the maximum number of iterations is reached, the optimal solution is obtained as the location of the silverback.

Exploration phase

In this phase, the migration behavior of gorillas is simulated. First, a parameter $Ψ \in [0, 1]$ is designed. When $R < Ψ$ (R represents a random number), the individual tends to move toward an unknown position; when $R \geq 0.5$ , the individual tends to move toward other individuals; and when $R < 0.5$ , the individual migrates to a known position. The specific process is mathematically expressed in Equation 5.

SX(t+1) = \begin{cases} (ub - lb) \times ξ_{1} + lb, & R < Ψ \\ (ξ_{2} - C) \times X_{r}(t) + L_{p} \times H, & R \geq 0.5 \\ X(t) - L_{p} \times (L_{p} \times (X(t) - SX_{r}(t)) + ξ_{3} \times (X(t) - SX_{r}(t))), & R < 0.5 \end{cases}

(5)

where $X(t)$ is the position of the individual in the t-th iteration; $SX(t+1)$ represents the position of the candidate solution in the (t+1)-th iteration; and $ξ_{1}, ξ_{2}, ξ_{3}$ , and $R$ are random numbers in $[0, 1]$ . ub and lb are the upper and lower bounds of the variables, respectively, and $X_{r}(t)$ and $SX_{r}(t)$ are the positions of randomly selected individuals. $C = (\cos(2 \times ξ_{4}) + 1) \times (1 - I_{t} / MaxI_{t})$ with $ξ_{4} \in [0, 1]$ , where $\cos(\cdot)$ is the cosine function, and $I_{t}$ and $MaxI_{t}$ are the current and maximum iterations, respectively. This equation shows that C fluctuates over a wide range in the early iterations and that the range narrows as the number of iterations increases. $L_{p}$ can be calculated by Equation 6.

L_{p} = C \times l

(6)

where $l \in [-1, 1]$ is random. This equation represents the leadership behavior of the silverback, who may fail to lead and organize groups in the early stage but gains experience and leadership stability with the increase in the number of iterations. In Equation 5, $H$ is defined as $H = z \times X(t)$ , where $z$ is a random value in $[-C, C]$ .

At the end of the exploration phase, the costs of the $S X_{r} (t)$ solutions are calculated and compared with those of the $X_{r} (t)$ solutions. When the costs of the two solutions satisfy $S X_{r} (t) < X_{r} (t)$ , $S X_{r} (t)$ replaces $X_{r} (t)$ as the optimal solution obtained in this phase.
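The exploration step in Equations 5 and 6, together with the greedy replacement described above, might be sketched as follows. Several details are assumptions made for the illustration: random numbers are drawn per dimension where the text does not say otherwise, $SX_{r}(t)$ is drawn from the current population, candidates are clipped to the bounds, and the sphere cost function and the value `psi=0.03` are placeholders.

```python
import math
import random


def explore(population, index, iteration, max_iter, lb, ub, psi, rng):
    """Return a candidate position SX for one individual (Equation 5)."""
    x = population[index]
    dim = len(x)
    c = (math.cos(2 * rng.random()) + 1) * (1 - iteration / max_iter)
    lp = c * rng.uniform(-1, 1)  # Equation 6
    r = rng.random()
    if r < psi:
        # Migrate to an unknown (random) position inside the bounds.
        cand = [(ub - lb) * rng.random() + lb for _ in range(dim)]
    elif r >= 0.5:
        # Move toward another individual; H = z * X(t) with z in [-C, C].
        xr = rng.choice(population)
        xi2 = rng.random()
        cand = [(xi2 - c) * xr[d] + lp * rng.uniform(-c, c) * x[d]
                for d in range(dim)]
    else:
        # Migrate to a known position.
        sxr = rng.choice(population)
        xi3 = rng.random()
        cand = [x[d] - lp * (lp * (x[d] - sxr[d]) + xi3 * (x[d] - sxr[d]))
                for d in range(dim)]
    # Clipping to [lb, ub] is an assumption, not stated in the text.
    return [min(max(v, lb), ub) for v in cand]


def sphere(x):
    """Placeholder cost function (lower is better)."""
    return sum(v * v for v in x)


rng = random.Random(0)
pop = [[rng.uniform(-5, 5) for _ in range(3)] for _ in range(10)]
best_before = min(sphere(p) for p in pop)
for it in range(50):
    for i in range(len(pop)):
        cand = explore(pop, i, it, 50, -5.0, 5.0, 0.03, rng)
        if sphere(cand) < sphere(pop[i]):  # greedy replacement
            pop[i] = cand
best_after = min(sphere(p) for p in pop)
print(best_before, best_after)
```

Because a candidate only replaces an individual when its cost is lower, the best cost in the population never increases from one iteration to the next.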

Exploitation phase

Two mechanisms are used in the exploitation phase: following the silverback and competing for adult females. The mechanism is selected by comparing the C value with a preset parameter $θ$ . When $C \geq θ$ , the follow-the-silverback strategy is performed by Equation 7.

S X (t + 1) = L_{p} \times Q \times (X (t) - X^{'}) + X (t)

(7)

where $X^{'}$ is the optimal solution of the silverback. $Q = {({| (1 / N) \sum_{i = 1}^{N} S X_{i} (t) |}^{η})}^{1 / η}$ , where N is the number of all individuals, $η = 2^{L_{p}}$ .

When $C < θ$ , individual gorillas compete for adult females, which is expressed by Equation 8.

S X (t + 1) = X^{'} - (X^{'} \times A - X (t) \times A) \times B

(8)

The values of parameters A and B in Equation 8 can be calculated by Equation 9.

\begin{cases} A = 2 \times ξ_{5} - 1 \\ B = γ \times E \end{cases}

(9)

where A denotes the impact force, $ξ_{5} \in [0, 1]$ is random, B represents the degree of violence in conflicts, and $γ$ is a preset parameter. $E = \begin{cases} N_{1}, & R \geq 0.5 \\ N_{2}, & R < 0.5 \end{cases}$ measures the influence of violence on the dimensions of the solution, where $N_{1}$ and $N_{2}$ are random values drawn from the normal distribution. When $R \geq 0.5$ , E is a vector of normally distributed random values with the same dimension as the problem. When $R < 0.5$ , E is a single normally distributed random value.

At the end of the exploitation phase, the costs of the $S X_{r} (t)$ solutions are calculated and compared with those of the $X_{r} (t)$ solutions. When the costs of $S X_{r} (t)$ satisfy $S X_{r} (t) < X_{r} (t)$ , $S X_{r} (t)$ replaces $X_{r} (t)$ and the optimal solution is finally obtained for the entire population.
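A compact sketch of the exploitation step (Equations 7 to 9) is given below. It is illustrative only: Q is computed from the current population rather than from the candidate positions, the standard normal distribution is assumed for E, and the values `theta=0.8` and `gamma=3.0`, like the sphere cost function, are placeholders rather than the settings used in the paper.

```python
import math
import random


def exploit(population, index, silverback, c, lp, theta, gamma, rng):
    """Return a candidate position SX for one individual (Equations 7 to 9)."""
    x = population[index]
    dim = len(x)
    n = len(population)
    if c >= theta:
        # Equation 7: follow the silverback.
        eta = 2.0 ** lp
        cand = []
        for d in range(dim):
            mean_d = abs(sum(p[d] for p in population) / n)
            q = (mean_d ** eta) ** (1.0 / eta)
            cand.append(lp * q * (x[d] - silverback[d]) + x[d])
        return cand
    # Equations 8 and 9: competition for adult females.
    a = 2 * rng.random() - 1
    if rng.random() >= 0.5:
        e = [rng.gauss(0, 1) for _ in range(dim)]   # one draw per dimension
    else:
        e = [rng.gauss(0, 1)] * dim                 # a single shared draw
    return [silverback[d] - (silverback[d] * a - x[d] * a) * (gamma * e[d])
            for d in range(dim)]


def cost(x):
    """Placeholder cost function (lower is better)."""
    return sum(v * v for v in x)


rng = random.Random(1)
pop = [[rng.uniform(-5, 5) for _ in range(3)] for _ in range(10)]
silverback = min(pop, key=cost)
initial_best = cost(silverback)
for it in range(50):
    for i in range(len(pop)):
        c = (math.cos(2 * rng.random()) + 1) * (1 - it / 50)
        lp = c * rng.uniform(-1, 1)
        cand = exploit(pop, i, silverback, c, lp, 0.8, 3.0, rng)
        if cost(cand) < cost(pop[i]):       # greedy replacement
            pop[i] = cand
            if cost(cand) < cost(silverback):
                silverback = cand
print(initial_best, cost(silverback))
```

Early iterations mostly take the first branch because C is large, while later iterations shift toward the competition branch as C decays toward zero.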

Modified Strategy: Population Initialization Based on a Sobol Sequence

It must be noted that the initial positions of gorillas largely affect the convergence rate and accuracy of the GTO. A uniformly distributed initial population can improve the optimization performance. To this end, a Sobol sequence is used to generate the initial positions of the gorillas (Sobol 1967). The Sobol sequence is a quasi-random, low-discrepancy sequence that uses a radical inversion with a base of two in each dimension of the population. The radical inversion in each dimension uses a different generating matrix to produce nonredundant and uniform points. The Sobol sequence makes the distribution of initial solutions more uniform and covers a wider search space to enhance the performance of the GTO.
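The base-two radical inversion mentioned above can be shown in a few lines. The sketch below covers only the one-dimensional case (the van der Corput sequence, which corresponds to the first Sobol dimension); it is not a full Sobol generator, since the per-dimension generating matrices are omitted. In practice a library implementation such as SciPy's `scipy.stats.qmc.Sobol` would be the more likely choice. The function names are hypothetical.

```python
def radical_inverse_base2(n: int) -> float:
    """Mirror the binary digits of n about the radix point."""
    value, weight = 0.0, 0.5
    while n > 0:
        if n & 1:
            value += weight
        n >>= 1
        weight /= 2.0
    return value


def init_positions(n_individuals: int, lb: float, ub: float):
    """One-dimensional low-discrepancy initial positions scaled to [lb, ub]."""
    return [lb + (ub - lb) * radical_inverse_base2(i)
            for i in range(1, n_individuals + 1)]


print(init_positions(4, 0.0, 8.0))  # [4.0, 2.0, 6.0, 1.0]
```

Each new point falls into the largest remaining gap, which is the property that makes the initial population spread more evenly than independent uniform draws.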

Multi-Objective Optimization Scheme

The principle of the multi-objective optimization strategy is as follows. The Pareto optimal solution is introduced into the multi-objective optimization scenario to replace the exact solution involved in the single-objective optimization issue. The Pareto optimal solution $ρ^{*}$ satisfies the condition $\nexists χ \in Ω \; s.t. \; F (χ) ≻ F (ρ^{*})$, where Ω represents the entire search space; that is, no solution in the search space dominates $ρ^{*}$. All non-dominated Pareto optimal solutions are provisionally placed in the archive. Each newly obtained solution is compared with the current solutions in the archive. If the new candidate solution is better than or at least equal to the current solutions, it enters the archive. The archive has an upper limit, that is, a maximum number of non-dominated Pareto optimal solutions. When the upper limit is reached, solutions in the most crowded segment are eliminated from the archive according to the deletion probability $P_{i} = N_{i} / c$, where $c > 1$ is a constant and $N_{i}$ is the number of solutions around the i-th solution. The Pareto optimal solution is ultimately determined by a non-dominated solution sorting scheme. This scheme selects from the least crowded segment of the archive, and the selection probability is determined by the roulette wheel approach, that is, $P_{i} = c / N_{i}, c > 1$.
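The archive mechanism described above can be sketched in a few functions. This is a hedged illustration, not the authors' code: the neighbourhood radius used to count $N_{i}$, the default `c=2.0`, and all function names are assumptions, and objective vectors are assumed to be minimized.

```python
import random

def dominates(f_a, f_b):
    """True if objective vector f_a Pareto-dominates f_b (both minimized)."""
    return (all(a <= b for a, b in zip(f_a, f_b))
            and any(a < b for a, b in zip(f_a, f_b)))

def crowding(f, archive, radius):
    """N_i: archive members within `radius` of f (f itself included, so N_i >= 1)."""
    return sum(1 for g in archive
               if max(abs(a - b) for a, b in zip(f, g)) <= radius)

def update_archive(archive, candidate, max_size, c=2.0, radius=0.1, rng=random):
    """Add a non-dominated candidate, then trim the archive to max_size."""
    if any(dominates(member, candidate) for member in archive):
        return list(archive)
    kept = [m for m in archive if not dominates(candidate, m)] + [candidate]
    while len(kept) > max_size:
        # Deletion probability P_i = N_i / c favours crowded regions.
        weights = [crowding(m, kept, radius) / c for m in kept]
        kept.pop(rng.choices(range(len(kept)), weights=weights)[0])
    return kept

def select_leader(archive, c=2.0, radius=0.1, rng=random):
    """Roulette-wheel selection with P_i = c / N_i, favouring sparse regions."""
    weights = [c / crowding(m, archive, radius) for m in archive]
    return rng.choices(archive, weights=weights)[0]
```

Counting each solution as its own neighbour keeps every roulette weight strictly positive, which avoids a degenerate wheel when the archive is sparse.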

The Framework of the Interval Prediction System

Two datasets on the daily tourist volumes at two well-known tourist attractions in China and the United States were collected. Each dataset had 470 data points, $Tour_{1, 2}$ (1st to 470th), which were divided into training, validation, and test sets. The framework of the proposed forecasting system is illustrated in Figure 1. As Figure 1 shows, the proposed hybrid interval prediction model has two distinctive features. First, MMOGTO-ESN is used to conduct feature selection over multiple search engine data and COVID-19 information, which effectively screens the input information and improves the quality of interval forecasting. Second, a reduction coefficient optimized by MMOGTO is proposed to coordinate the coverage rate and width so as to further enhance the effectiveness of the forecasting interval. The main steps of the proposed hybrid interval prediction system are as follows:

Figure 1.

The Framework of the Proposed Forecasting System.

Step 1: The main objective of this stage is to use the proposed MMOGTO-ESN model to select appropriate input features and construct initial prediction intervals by using the training and validation sets. The initial input matrix of MMOGTO is given in Equation 10, which is then converted into the input matrix in Equation 11. The output of MMOGTO is the selected input variables and the hyperparameters for the ESN. The ESN is then trained on the training set with the input and output sets of Equations 14 and 15. The trained ESN is then employed to calculate the interval forecasting results in the validation set, $[L B_{t}, U B_{t}] (t = 371, 372, \dots, 420)$. Finally, based on $[L B_{t}, U B_{t}] (t = 371, 372, \dots, 420)$ and the real tourism volume data $Z_{1}^{t} (t = 371, 372, \dots, 420)$ in the validation set, the fitness function of the MMOGTO can be calculated. As the iterations proceed, the optimal input matrix of the feature variables and the optimal hyperparameters of the ESN are obtained and used to construct the initial prediction interval $[L B_{t}, U B_{t}] (t = 371, 372, \dots, 470)$.

Step 2: The main objective of this stage is to adjust the forecasting interval based on the reduction coefficient optimized by the MMOGTO. The fitness function of the MMOGTO is designed as shown in Equation 21, which is calculated based on the interval forecasting results and the real tourism volumes in the validation set. After obtaining the optimal reduction coefficient, we can further calculate the final forecasting interval and the performance evaluation indicators.

Stage I: LUBE interval forecasting based on MMOGTO-ESN with feature selection

In previous studies, feature selection and forecasting were often conducted separately. In this study, a hybrid prediction model that performs both jointly is proposed to reduce the computation time. The training set $Tour_{1, 2}$ (1st to 370th) was used to train the MMOGTO-ESN system. The validation set $Tour_{1, 2}$ (371st to 420th) was employed by the MMOGTO algorithm to conduct feature selection over the input features of the ESN and to optimize the hyperparameters of the ESN. The test set $Tour_{1, 2}$ (421st to 470th) was used to predict the daily tourism demand.

The training set and validation set were first input into the MMOGTO. In the iteration process, the current optimal solution of MMOGTO is taken as the basis of feature selection and the optimized hyperparameters of the ESN. The variable dimension of the optimal solution in the iteration process is designed as the sum of the number of input features (p) and the number of hyperparameters of the ESN. The first p input feature values are regarded as the continuous variables, $C_{i}, (i = 1, \dots, p)$ , corresponding to each feature, with a search range of [0,1]. The hyperparameter values correspond to the three hyperparameters of the ESN (reservoir scale, reservoir connection rate, and reservoir spectral radius) with search ranges of [50,150], [0.01,1], and [0.01,1], respectively. It must be noted that the reservoir scale values are integers; hence, the integer values of the current optimal solution corresponding to the reservoir scale are selected in each iteration.
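The encoding described above, with p continuous feature variables followed by three ESN hyperparameters, can be decoded as in the following sketch. The function and dictionary key names are hypothetical, and rounding the reservoir scale to the nearest integer is an assumption, since the text only states that integer values are used. The 0.5 threshold follows Equation 12.

```python
def _clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return min(max(value, low), high)

def decode_solution(solution, p):
    """Split a search-agent position into a feature mask and ESN hyperparameters.

    solution: list of p + 3 floats (p feature variables C_i, then three hyperparameters).
    """
    if len(solution) != p + 3:
        raise ValueError("expected p feature variables plus three hyperparameters")
    mask = [1 if c > 0.5 else 0 for c in solution[:p]]     # d_i from Equation 12
    size, rate, radius = solution[p:]
    hyperparameters = {
        "reservoir_size": int(round(_clamp(size, 50, 150))),   # integer-valued
        "connection_rate": _clamp(rate, 0.01, 1.0),
        "spectral_radius": _clamp(radius, 0.01, 1.0),
    }
    return mask, hyperparameters
```

For example, `decode_solution([0.2, 0.7, 0.5, 99.6, 0.3, 0.9], 3)` keeps only the second feature and sets the reservoir size to 100.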

The initial input matrix of all feature variables in the t-th period is given as

\begin{bmatrix} Z_{1}^{t - 14} & Z_{2}^{t - 14} & \cdots & Z_{n}^{t - 14} \\ Z_{1}^{t - 13} & Z_{2}^{t - 13} & \cdots & Z_{n}^{t - 13} \\ \vdots & \vdots & \ddots & \vdots \\ Z_{1}^{t - 1} & Z_{2}^{t - 1} & \cdots & Z_{n}^{t - 1} \end{bmatrix}

(10)

where $Z_{1}^{t}$ represents tourism volumes and n is the dimension of all involved features, including tourism volumes and the total number of Internet search keywords. The candidate search keywords are listed in Table S1 of the supplementary file (see the online supplementary material). The initial input matrix $(14 \times n)$ is converted into an input matrix $(1 \times p)$ , as shown in Equation 11.

[Z_{1}^{t - 14}, \cdots, Z_{1}^{t - 1}, Z_{2}^{t - 14}, \cdots, Z_{2}^{t - 1}, \cdots, Z_{n}^{t - 14}, \cdots, Z_{n}^{t - 1}]

(11)

A rolling prediction scheme was adopted; the detailed data structure and input characteristics are presented in Figure S1 of the supplementary file (see the online supplementary material). In each loop, the values of all input features over the preceding 14 days are taken as input to predict the tourism volume on the next day; the window length of 14 was determined by trial and error. For example, to predict the tourism volume in the 15th period, the input is set to $[Z_{1}^{1}, \cdots, Z_{1}^{14}, Z_{2}^{1}, \cdots, Z_{2}^{14}, \cdots, Z_{n}^{1}, \cdots, Z_{n}^{14}]$. Based on $C_{i} (i = 1, \dots, p)$, the binary discrete variable $d_{i} (i = 1, \dots, p)$ of each feature can be obtained to determine whether the feature is selected. The variable $d_{i}$ is computed as follows:

d_{i} = \begin{cases} 0, & 0 \leq C_{i} \leq 0.5 \\ 1, & 0.5 < C_{i} \leq 1 \end{cases} \quad (i = 1, 2, \dots, p)

(12)

If $d_{i} = 1$ , then the i-th feature is selected; if $d_{i} = 0$ , then the i-th feature is not selected. Based on Equation 12, the new input matrix of the feature variables in the t-th period can be obtained, as shown in Equation 13.

[{Z^{'}}_{1}^{t}, {Z^{'}}_{2}^{t}, \cdot \cdot \cdot, {Z^{'}}_{h}^{t}]

(13)

where h represents the number of the selected feature variables. The last three values of the current optimal solution are assigned to the three hyperparameters of the ESN. The ESN is trained based on the training set $T o_{1, 2}^{u r} (1 t h - 370 t h)$ . In this process, the input set of the ESN is given as

[{Z^{'}}_{1}^{t}, {Z^{'}}_{2}^{t}, \cdot \cdot \cdot, {Z^{'}}_{h}^{t}] (t = 15, 16, \dots, 370)

(14)

The corresponding output set of the ESN is:

[Z_{1}^{t} \times (1 - α), Z_{1}^{t} \times (1 + α)] (t = 15, 16, \dots, 370)

(15)

where $α$ is the significance level of the forecasting interval.
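The construction of the training pairs in Equations 11 to 15 can be sketched as follows. This is an illustration under assumptions: features are stored as plain lists with the tourism volume series first, time is 0-based, and the function name is hypothetical.

```python
def build_samples(features, selected, alpha, lag=14):
    """Build ESN inputs and lower/upper targets with a rolling window.

    features: list of n series; features[0] is the tourism volume Z_1.
    selected: 0/1 mask d_i of length lag * n over the flattened window (Equation 12).
    alpha:    significance level used to widen the target (Equation 15).
    """
    n_periods = len(features[0])
    inputs, targets = [], []
    for t in range(lag, n_periods):          # 0-based t; with lag=14 the first target is period 15
        # Equation 11 ordering: all lags of feature 1, then all lags of feature 2, ...
        flat = [series[k] for series in features for k in range(t - lag, t)]
        inputs.append([v for v, d in zip(flat, selected) if d == 1])   # Equation 13
        z = features[0][t]
        targets.append((z * (1 - alpha), z * (1 + alpha)))             # Equation 15
    return inputs, targets
```

Each target is a pair of bounds rather than a point, which is what lets a single network output the lower and upper limits of the interval directly.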

The trained ESN is then used to predict the validation data, whose input set is:

[{Z^{'}}_{1}^{t}, {Z^{'}}_{2}^{t}, \cdot \cdot \cdot, {Z^{'}}_{h}^{t}] (t = 371, 372, \dots, 420)

(16)

The interval forecasting results of the validation set are $[L B_{t}, U B_{t}] (t = 371, 372, \dots, 420)$.

Based on $[L B_{t}, U B_{t}] (t = 371, 372, \dots, 420)$ and the real tourism volumes data $Z_{1}^{t} (t = 371, 372, \dots, 420)$ , we can calculate two evaluation indicators as the fitness function of the MMOGTO. The fitness function is defined as

\min \begin{cases} Obj_{1} = -(FICP - PINC) \\ Obj_{2} = FINAW \times [1 + γ(FICP) \times \exp(η \times Obj_{1})] \end{cases}

(17)

where FICP, PINC, and FINAW are the prediction interval coverage probability, the prediction interval nominal confidence, and the prediction interval normalized average width, respectively. $Obj_{1}$ is the negative of the average coverage error (ACE) and $Obj_{2}$ is the coverage width-based criterion (CWC). The detailed definitions of these indicators are presented in Table S2 (see the online supplementary material). Equation 17 can be rewritten as shown in Equation 18.

Table 2.

Exhibition for the Proposed Forecasting System and Comparative Models.

Model	Explanation
M1	Model based on BP and reduction coefficient optimized by MMOGTO (with COVID-19 information)
M2	Model based on BiLSTM and reduction coefficient optimized by MMOGTO (with COVID-19 information)
M3	Model based on ELM and reduction coefficient optimized by MMOGTO (with COVID-19 information)
M4	Model based on ENN and reduction coefficient optimized by MMOGTO (with COVID-19 information)
M5	Model based on ESN and reduction coefficient optimized by MMOGTO (with COVID-19 information)
M6	Model based on GRU and reduction coefficient optimized by MMOGTO (with COVID-19 information)
M7	Model based on LSTM and reduction coefficient optimized by MMOGTO (with COVID-19 information)
M8	Model based on ARIMA and reduction coefficient optimized by MMOGTO (without COVID-19 information)
M9	Model based on ETS and reduction coefficient optimized by MMOGTO (without COVID-19 information)
M10	Model based on BP and reduction coefficient optimized by MMOGTO (without COVID-19 information)
M11	Model based on BiLSTM and reduction coefficient optimized by MMOGTO (without COVID-19 information)
M12	Model based on ELM and reduction coefficient optimized by MMOGTO (without COVID-19 information)
M13	Model based on ENN and reduction coefficient optimized by MMOGTO (without COVID-19 information)
M14	Model based on ESN and reduction coefficient optimized by MMOGTO (without COVID-19 information)
M15	Model based on GRU and reduction coefficient optimized by MMOGTO (without COVID-19 information)
M16	Model based on LSTM and reduction coefficient optimized by MMOGTO (without COVID-19 information)
M17	Model based on MMOGTO-ESN and reduction coefficient optimized by MMOGTO (with COVID-19 information)
M18	Model based on MMOGTO-ESN with feature selection and reduction coefficient optimized by MMOGTO (without COVID-19 information)
M19	Model based on MOAOA-ESN with feature selection and reduction coefficient optimized by MOAOA (with COVID-19 information)
M20	Model based on MODA-ESN with feature selection and reduction coefficient optimized by MODA (with COVID-19 information)
M21	Model based on MOGOA-ESN with feature selection and reduction coefficient optimized by MOGOA (with COVID-19 information)
M22	Model based on MOSSA-ESN with feature selection and reduction coefficient optimized by MOSSA (with COVID-19 information)
Proposed Model	Model based on MMOGTO-ESN with feature selection and reduction coefficient optimized by MMOGTO (with COVID-19 information)

\begin{array}{l} Opt\_Sol = \arg \min_{Opt\_Sol} \{Obj_{1}, Obj_{2}\} \\ = \arg \min_{Opt\_Sol} \begin{cases} Obj_{1} = -(FICP - PINC) \\ Obj_{2} = FINAW \times [1 + γ(FICP) \times \exp(η \times Obj_{1})] \end{cases} \end{array}

(18)
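The two objectives can be evaluated as in the sketch below. The precise indicator definitions are given in Table S2, which is not reproduced here, so the code assumes the commonly used forms: FICP as the fraction of observations inside the interval and FINAW as the mean width divided by the range of the observations. The default `eta=50.0` and the function name are also assumptions.

```python
import math

def interval_objectives(lower, upper, actual, pinc, eta=50.0):
    """Return (Obj1, Obj2) for one set of forecast intervals.

    Assumes the observations are not all identical (non-zero range).
    """
    n = len(actual)
    covered = sum(1 for lo, hi, z in zip(lower, upper, actual) if lo <= z <= hi)
    ficp = covered / n                                     # coverage probability
    value_range = max(actual) - min(actual)
    finaw = sum(hi - lo for lo, hi in zip(lower, upper)) / (n * value_range)
    obj1 = -(ficp - pinc)                                  # negative ACE
    gamma = 0.0 if ficp >= pinc else 1.0                   # penalty switch
    obj2 = finaw * (1.0 + gamma * math.exp(eta * obj1))    # CWC
    return obj1, obj2
```

When coverage meets the nominal level the penalty term vanishes and the second objective reduces to FINAW; when it falls short, the exponential term grows quickly with the shortfall.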

With the continuous iterative optimization of the MMOGTO, the optimal solutions are obtained, that is, the optimal input matrix of the feature variables and the optimal hyperparameters of the ESN. Then, the ESN is retrained on the training set $Tour_{1, 2}$ (1st to 370th), and predictions are made for the validation set $Tour_{1, 2}$ (371st to 420th) and the test set $Tour_{1, 2}$ (421st to 470th). Finally, the Stage I forecasting interval $[L B_{t}, U B_{t}] (t = 371, 372, \dots, 470)$ is obtained.

Stage II: Forecasting interval adjustment based on MMOGTO

The interval width and coverage rate are two crucial factors in evaluating the interval forecasting performance. A narrow interval width implies a higher interval forecasting accuracy. A high coverage rate indicates improved interval forecasting stability. However, the interval width and coverage rate are conflicting objectives that should be balanced in practical applications: widening the interval raises the coverage rate but lowers the accuracy. To this end, a reduction coefficient optimized by the MMOGTO is proposed to balance the forecasting accuracy and stability.

The lower and upper bounds of the forecasting interval in the validation and test sets can be calculated by

LB^{'}_{t} = θ^{1} \times LB_{t} \quad (t = 371, 372, \dots, 470)

(19)

UB^{'}_{t} = θ^{2} \times UB_{t} \quad (t = 371, 372, \dots, 470)

(20)

where $θ^{1}$ and $θ^{2}$ are optimized by the MMOGTO. The fitness function of the MMOGTO is as follows:

\min \begin{cases} Obj_{1} = -(FICP - PINC) \\ Obj_{2} = FINAW \times [1 + γ(FICP) \times \exp(η \times Obj_{1})] \end{cases}

(21)

The $O b j_{1}$ and $O b j_{2}$ are the same as the objective functions in Equation 17.

It must be noted that the calculation of the fitness function is based on the interval forecasting results $[L {B^{'}}_{t}, U {B^{'}}_{t}] (t = 371, 372, \dots, 420)$ and the real tourism volumes $Z_{1}^{t} (t = 371, 372, \dots, 420)$ in the validation set. Equation 21 can be rewritten as

\begin{array}{l} \{θ^{1}, θ^{2}\} = \arg \min_{\{θ^{1}, θ^{2}\}} \{Obj_{1}, Obj_{2}\} \\ = \arg \min_{\{θ^{1}, θ^{2}\}} \begin{cases} Obj_{1} = -(FICP - PINC) \\ Obj_{2} = FINAW \times [1 + γ(FICP) \times \exp(η \times Obj_{1})] \end{cases} \\ s.t. \; LB^{'}_{t} = θ^{1} \times LB_{t} \; (t = 371, 372, \dots, 420), \\ UB^{'}_{t} = θ^{2} \times UB_{t} \; (t = 371, 372, \dots, 420), \\ 0 \leq θ_{d}^{1}, θ_{d}^{2} \leq 5. \end{array}

(22)

Based on the optimized reduction coefficient, we can obtain the final interval forecasting results $[L {B^{'}}_{t}, U {B^{'}}_{t}] (t = 421, 422, \dots, 470)$ . A comparison with real tourism volumes $Z_{1}^{t} (t = 421, 422, \dots, 470)$ verified the forecasting performance of the proposed forecasting system.
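The adjustment in Equations 19 and 20 amounts to scaling each bound by its own coefficient, with the coefficients fitted on the validation set and then reused on the test set. The sketch below illustrates this with an exhaustive grid search as a deliberately simple stand-in for the MMOGTO: it keeps the narrowest pair whose coverage still reaches the PINC. The function names and the grid are assumptions for the example.

```python
def adjust_interval(lower, upper, theta1, theta2):
    """Equations 19 and 20: scale each bound by its reduction coefficient."""
    return [theta1 * lo for lo in lower], [theta2 * hi for hi in upper]

def coverage(lower, upper, actual):
    """Fraction of observations that fall inside their interval."""
    return sum(lo <= z <= hi for lo, hi, z in zip(lower, upper, actual)) / len(actual)

def grid_search_theta(lower, upper, actual, pinc, grid):
    """Stand-in optimizer (not MMOGTO): narrowest valid interval on the validation set.

    Returns (mean_width, theta1, theta2), or None if no pair reaches the PINC.
    """
    best = None
    for theta1 in grid:
        for theta2 in grid:
            lo, hi = adjust_interval(lower, upper, theta1, theta2)
            if any(l > h for l, h in zip(lo, hi)):
                continue                     # bounds crossed, not a valid interval
            if coverage(lo, hi, actual) < pinc:
                continue                     # coverage requirement not met
            width = sum(h - l for l, h in zip(lo, hi)) / len(actual)
            if best is None or width < best[0]:
                best = (width, theta1, theta2)
    return best
```

A multi-objective optimizer returns a set of trade-offs instead of one pair, but the feasibility logic is the same: shrink the interval as far as the coverage requirement allows.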

Empirical Study

Data Selection and Experiment Setup

Daily tourism volume data and search engine data on Jiuzhaigou, China, and Hawaii, the United States, were used to test the effectiveness of the proposed system. The daily tourism volume data covered the period from April 1, 2020 to July 14, 2021, with 470 observations. All observations were divided into a training set, validation set, and test set containing 370, 50, and 50 observations, respectively. The long holidays in China were from May 1 to 5 and October 1 to 7 in 2020 and 2021. During the long holidays, tourist arrivals in Jiuzhaigou, China, reached their upper limit. Considering that the outliers caused by a surge in tourists during the long holidays would disrupt the trend and continuity of the time series, a five-period moving average was used to replace the original tourism volume data of Jiuzhaigou, China (Li & Cao, 2018). In China, tourist arrivals on long holidays always reach the limit; thus, practitioners and managers in tourist destinations should be fully prepared to deal with the overflow of tourists during the long holidays. In contrast, holiday-related changes in tourism volume were not considered for Hawaii because of cultural and customary differences. Thus, the original tourism volume data of Hawaii were retained. For the missing data of Jiuzhaigou and Hawaii, we adopted a five-period moving average to fill in the gaps.
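The gap filling described above can be sketched as follows. The text does not specify how the five-period window is positioned, so this example assumes a window centred on the missing day that averages whichever neighbouring observations are available. The function name and the use of `None` to mark missing values are also assumptions.

```python
def fill_missing(series, window=5):
    """Replace None entries with the mean of the observed values in a centred window."""
    half = window // 2
    filled = list(series)
    for i, value in enumerate(series):
        if value is None:
            neighbours = [v for v in series[max(0, i - half): i + half + 1]
                          if v is not None]
            if neighbours:                   # leave the gap if nothing nearby is observed
                filled[i] = sum(neighbours) / len(neighbours)
    return filled
```

For example, `fill_missing([1.0, 2.0, None, 4.0, 5.0])` fills the gap with 3.0, the mean of the four observed neighbours.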

Data from two search engines, that is, Google Trends and the Baidu Index, were also used as input features to improve tourism demand forecasting ability. Considering the applicability of search engines in different markets, Google Trends data were used for tourism demand forecasting in Hawaii, while Baidu Index data were used for Jiuzhaigou. Following the keyword classification strategy of Li and colleagues (2020) and keyword inclusion in Google Trends and the Baidu Index, we divided the candidate search keywords into seven major categories: day type, weather, transportation, lodging, dining, tours, and coronavirus information. The search keywords are listed in Table S1 (see the online supplementary file). The daily Google Trends data were downloaded from the website https://trends.google.com/, while the Baidu Index data were collected from the website http://index.baidu.com/ by using Python crawler tools. It must be noted that day type is a qualitative factor. Weekends were defined as 1 and weekdays as 0. All experiments were performed in MATLAB 2021a on Windows 10 with Intel(R) Core (TM) i5-8250U CPU @ 1.60GHz 1.80 GHz.

Performance Evaluation Criteria

Four widely used interval forecasting evaluation criteria, namely FICP, FINAW (Jiang, Liu, et al., 2020), ACE (Nie et al., 2021), and CWC, were adopted to evaluate the interval prediction ability. The mathematical expressions for these criteria are listed in Table S2 (see the online supplementary material).

Exhibition for Benchmark Models

To evaluate the forecasting performance of the proposed approach, 22 comparative methods were constructed and compared with the proposed system in subsequent experiments. The models are described in Table 2. M1 to M9 are classical individual benchmark models such as ANNs (BPNN, extreme learning machine [ELM], Elman neural network [ENN], and ESN), deep learning models (i.e., bi-directional long short-term memory [BiLSTM], gated recurrent unit [GRU], and LSTM), ARIMA, and exponential smoothing (ETS). M10 to M16 represent individual ANNs and deep learning models (BPNN, ELM, ENN, ESN, BiLSTM, GRU, and LSTM) without COVID-19 information. It must be noted that ARIMA and ETS are univariate prediction models; thus, no COVID-19 information is involved in these two models. By comparing the results with the single benchmark models in Experiment I, we can illustrate the effectiveness of the ESN. M17 is a hybrid MMOGTO-ESN that does not consider a feature selection strategy. M18 is a hybrid model that does not consider COVID-19 information. By comparing M17 and M18 with the proposed forecasting system, we can emphasize the importance of adding feature selection and coronavirus information. M19 to M22 are hybrid forecasting models based on diverse multi-objective optimization algorithms, including the multi-objective Archimedes optimization algorithm (MOAOA; Zhang, Wang, Niu, et al., 2021), multi-objective dragonfly algorithm (MODA; Mirjalili, 2016), multi-objective grasshopper optimization algorithm (MOGOA; Mirjalili et al., 2018), and multi-objective salp swarm algorithm (MOSSA; Mirjalili et al., 2017). The comparison between M19 to M22 and the proposed forecasting system verified the optimization performance of different optimization technologies.

Experiment I: Comparison Between Individual Forecasting Models

To select the most effective predictor for daily tourism demand forecasting, nine commonly used forecasting technologies (ELM, ESN, BP, ENN, BiLSTM, GRU, LSTM, ARIMA, and ETS) were employed and compared. To verify the effectiveness of adding COVID-19 information, both ANNs and deep learning models with and without COVID-19 information were used to estimate the forecasting interval. The parameter settings of the predictors used in this experiment are listed in Table S3 (see the online supplementary material); they were set to default values or based on previous studies (Wang & Gao, 2022; Zhang, Wang, Niu, et al., 2021). To minimize the influence of parameter variations on the prediction performance, the common parameters of different predictors were set to be the same. Four classical evaluation indexes (FICP, FINAW, ACE, and CWC) were adopted to verify the interval prediction capacity of the forecasting models. A lower FINAW indicates a higher interval forecasting accuracy, while a larger FICP indicates improved interval forecasting stability. ACE indicates whether the constructed forecasting interval reaches the preset PINC. When $ACE > 0$, $FICP > PINC$ is satisfied, which means that the constructed forecasting interval is effective. CWC is a comprehensive indicator; lower CWC values indicate higher-quality interval forecasting results. Multiple PINC levels were designed, that is, the significance levels were set as $α = 0.05$, $α = 0.10$, and $α = 0.15$.

Table 3.

Interval Forecasting Results of Benchmark Models in Two Datasets.

Dataset	Model	Alpha = 0.05				Alpha = 0.10				Alpha = 0.15
Dataset	Model	ACE	CWC	FICP	FINAW	ACE	CWC	FICP	FINAW	ACE	CWC	FICP	FINAW
Dataset A	M1	3.00%	3.55	98.00%	3.55	2.00%	3.07	92.00%	3.07	3.00%	1.91	88.00%	1.91
	M2	5.00%	1.28	100.00%	1.28	10.00%	0.87	100.00%	0.87	13.00%	0.62	98.00%	0.62
	M3	3.00%	0.99	98.00%	0.99	0.00%	0.93	90.00%	0.93	7.00%	0.89	92.00%	0.89
	M4	3.00%	3.64	98.00%	3.64	4.00%	2.11	94.00%	2.11	11.00%	1.53	96.00%	1.53
	M5	3.00%	0.85	98.00%	0.85	6.00%	0.66	96.00%	0.66	13.00%	0.59	98.00%	0.59
	M6	5.00%	0.97	100.00%	0.97	8.00%	0.75	98.00%	0.75	11.00%	0.70	96.00%	0.70
	M7	5.00%	0.95	100.00%	0.95	10.00%	0.90	100.00%	0.90	15.00%	0.89	100.00%	0.89
	M8	3.00%	1.01	98.00%	1.01	8.00%	1.01	98.00%	1.01	13.00%	1.01	98.00%	1.01
	M9	5.00%	0.69	100.00%	0.69	10.00%	0.69	100.00%	0.69	15.00%	0.69	100.00%	0.69
	M10	3.00%	3.59	98.00%	3.59	4.00%	5.45	94.00%	5.45	7.00%	2.67	92.00%	2.67
	M11	3.00%	1.32	98.00%	1.32	10.00%	0.96	100.00%	0.96	15.00%	0.82	100.00%	0.82
	M12	3.00%	1.28	98.00%	1.28	8.00%	1.22	98.00%	1.22	5.00%	1.55	90.00%	1.55
	M13	3.00%	3.69	98.00%	3.69	8.00%	2.55	98.00%	2.55	15.00%	3.60	100.00%	3.60
	M14	3.00%	0.88	98.00%	0.88	8.00%	0.80	98.00%	0.80	13.00%	0.88	98.00%	0.88
	M15	5.00%	1.10	100.00%	1.10	10.00%	0.96	100.00%	0.96	15.00%	0.92	100.00%	0.92
	M16	5.00%	0.98	100.00%	0.98	10.00%	1.15	100.00%	1.15	13.00%	0.99	98.00%	0.99
Dataset B	M1	3.00%	6.30	98.00%	6.30	4.00%	6.18	94.00%	6.18	9.00%	5.32	94.00%	5.32
	M2	5.00%	1.90	100.00%	1.90	10.00%	1.65	100.00%	1.65	15.00%	1.47	100.00%	1.47
	M3	3.00%	1.91	98.00%	1.91	6.00%	1.72	96.00%	1.72	3.00%	1.64	88.00%	1.64
	M4	1.00%	5.74	96.00%	5.74	8.00%	5.51	98.00%	5.51	11.00%	4.37	96.00%	4.37
	M5	1.00%	1.31	96.00%	1.31	2.00%	1.26	92.00%	1.26	7.00%	1.24	92.00%	1.24
	M6	3.00%	2.25	98.00%	2.25	8.00%	1.52	98.00%	1.52	11.00%	1.48	96.00%	1.48
	M7	3.00%	2.13	98.00%	2.13	10.00%	1.74	100.00%	1.74	13.00%	1.53	98.00%	1.53
	M8	5.00%	3.28	100.00%	3.28	10.00%	3.28	100.00%	3.28	15.00%	3.28	100.00%	3.28
	M9	5.00%	1.60	100.00%	1.60	10.00%	1.66	100.00%	1.66	15.00%	1.64	100.00%	1.64
	M10	3.00%	9.28	98.00%	9.28	6.00%	8.82	96.00%	8.82	15.00%	5.39	100.00%	5.39
	M11	5.00%	2.28	100.00%	2.28	8.00%	2.66	98.00%	2.66	11.00%	1.51	96.00%	1.51
	M12	5.00%	2.13	100.00%	2.13	4.00%	1.99	94.00%	1.99	9.00%	1.75	94.00%	1.75
	M13	5.00%	7.76	100.00%	7.76	10.00%	6.15	100.00%	6.15	15.00%	6.83	100.00%	6.83
	M14	3.00%	1.34	98.00%	1.34	6.00%	1.34	96.00%	1.34	13.00%	1.52	98.00%	1.52
	M15	5.00%	2.27	100.00%	2.27	8.00%	1.94	98.00%	1.94	13.00%	2.51	98.00%	2.51
	M16	5.00%	2.39	100.00%	2.39	10.00%	1.85	100.00%	1.85	13.00%	1.58	98.00%	1.58

The interval forecasting results for the two datasets are given in Table 3, and the explanations for each forecasting model are presented in Table 2. Table 3 shows that the ACE values of all the predictors are non-negative. For example, in Dataset A at $α = 0.05$, the ACE is 3.00% for M1, M3, M4, M5, M8, and M10 to M14 and 5.00% for M2, M6, M7, M9, M15, and M16, all of which are greater than 0. In Dataset B, the situation is the same. This indicates that all the predictors can grasp the future changes in the tourism demand series and generate effective forecasting intervals, proving the effectiveness of adding the optimized reduction coefficients. Because $FICP \geq PINC$, the penalty added to FINAW by CWC is 0 ( $γ (FICP) = \begin{cases} 0, & FICP \geq PINC \\ 1, & FICP < PINC \end{cases}$ ); thus, CWC = FINAW in the two datasets.
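As a quick check of this relationship against Table 3, consider M1 on Dataset A at $α = 0.05$. The calculation assumes $PINC = 1 - α = 95\%$, which is an inference consistent with the tabulated ACE values rather than a definition quoted from Table S2:

\begin{array}{l} ACE = FICP - PINC = 98.00\% - 95.00\% = 3.00\% \geq 0 \; \Rightarrow \; γ (FICP) = 0 \\ CWC = FINAW \times [1 + 0 \times \exp(η \times Obj_{1})] = FINAW = 3.55 \end{array}

The same arithmetic reproduces the other rows, for example $92.00\% - 90.00\% = 2.00\%$ for M1 at $α = 0.10$.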

First, adding COVID-19 information is conducive to improving forecasting performance. Comparing the individual ANNs and deep learning models with and without COVID-19 information shows that M1 to M7 consistently yield lower FINAW and CWC values than their counterparts M10 to M16, indicating the effectiveness of including COVID-19 information. For example, when $α = 0.05$ in Dataset A, the FINAW and CWC value of M5 (ESN with COVID-19 information) is $FINAW_{α = 0.05}^{Dataset A} = CWC_{α = 0.05}^{Dataset A} = 0.85$, while the corresponding value of M14 (ESN without COVID-19 information) is $FINAW_{α = 0.05}^{Dataset A} = CWC_{α = 0.05}^{Dataset A} = 0.88$. Furthermore, according to the comparison results in the two datasets, the interval forecasting ability of the benchmark deep learning methods (i.e., BiLSTM, GRU, and LSTM) surpasses that of the overwhelming majority of benchmark ANNs (i.e., BP, ELM, and ENN). For example, the FINAW and CWC values of M2 (BiLSTM), M6 (GRU), and M7 (LSTM) at $α = 0.10$ in Dataset A are 0.87, 0.75, and 0.90, respectively, which are lower than the values of M1 (BP), M3 (ELM), and M4 (ENN), namely 3.07, 0.93, and 2.11, respectively. Moreover, the FICP values of the deep learning methods are also larger than those of the ANNs. At $α = 0.05$ in Dataset A, the FICP values of the deep learning methods are $FICP_{α = 0.05}^{Dataset A} = 100.00\%$, which are superior to the corresponding values of the ANNs, $FICP_{α = 0.05}^{Dataset A} = 98.00\%$.

Compared with the other predictors, the FINAW and CWC values of M5 (ESN) were always the best except for $α = 0.05$ in Dataset A. In Dataset A, the FINAW and CWC values of M5 at $α = 0.05$ are $FINAW_{α = 0.05}^{Dataset A} = CWC_{α = 0.05}^{Dataset A} = 0.85$, which are slightly inferior to the corresponding values of M9 (0.69). Yet, when $α = 0.10$ and $α = 0.15$, M5 yields the best FINAW and CWC values, with $FINAW_{α = 0.10}^{Dataset A} = CWC_{α = 0.10}^{Dataset A} = 0.66$ and $FINAW_{α = 0.15}^{Dataset A} = CWC_{α = 0.15}^{Dataset A} = 0.59$, respectively. Similarly, for Dataset B, M5 also yields the best FINAW and CWC values, with $FINAW_{α = 0.05}^{Dataset B} = CWC_{α = 0.05}^{Dataset B} = 1.31$ at $α = 0.05$, $FINAW_{α = 0.10}^{Dataset B} = CWC_{α = 0.10}^{Dataset B} = 1.26$ at $α = 0.10$, and $FINAW_{α = 0.15}^{Dataset B} = CWC_{α = 0.15}^{Dataset B} = 1.24$ at $α = 0.15$. These results show that M5 can provide accurate interval forecasts. It should be noted that the FICP values of M5 are not the highest among all comparative models; however, this is acceptable because $FICP > PINC$, that is, the coverage rate of the constructed interval (FICP) reaches the required nominal coverage (PINC).

Conclusion

Based on the FICP, FINAW, ACE, and CWC indicators, we compared the interval forecasting performance of nine commonly used models. The simulation results verify the high-quality forecasting ability of the ESN and the effectiveness of the reduction coefficients optimized by the MMOGTO.

Experiment II: Comparison Between the Proposed System and Other Hybrid Forecasting Models

To further investigate the forecasting performance of the proposed forecasting system, six hybrid forecasting models (M17 to M22) were built; their descriptions are provided in Table 2. M17 is a hybrid MMOGTO-ESN model without a feature selection scheme. M18 is a hybrid model that does not consider COVID-19 information. M19 to M22 are four hybrid methods based on various multi-objective optimization technologies (MOAOA, MODA, MOGOA, and MOSSA). The number of iterations ($n_{i}$), archive size ($S_{a}$), and population size ($S_{p}$) of the four optimization algorithms were set to be the same: $n_{i} = 500$, $S_{a} = 500$, and $S_{p} = 50$. These parameter values are set by referring to previous studies (Jiang et al., 2021). By comparing the six models with the proposed forecasting system, we address three questions:

(1) Why is feature selection necessary?

(2) Will adding COVID-19 information impact forecasting results?

(3) Is the MMOGTO necessary to optimize hyperparameters and reduction coefficients?

The comparison results are shown in Table 4. The ACE values of all forecasting models (M17–M22 and the proposed model) satisfy $\mathrm{ACE} \geq 0$ for $\alpha = 0.05$, $\alpha = 0.10$, and $\alpha = 0.15$. This indicates that every model provides a forecasting interval coverage rate (FICP) no lower than the nominal confidence level (PINC). In this case, the CWC values are equal to the FINAW values. When $\alpha = 0.05$, the FINAW and CWC values of the proposed model are $\mathrm{FINAW}_{\alpha=0.05}^{\text{Dataset A}} = \mathrm{CWC}_{\alpha=0.05}^{\text{Dataset A}} = 0.49$ on Dataset A and $\mathrm{FINAW}_{\alpha=0.05}^{\text{Dataset B}} = \mathrm{CWC}_{\alpha=0.05}^{\text{Dataset B}} = 1.07$ on Dataset B, improvements of 0.28 and 0.35 over M17, for which $\mathrm{FINAW}_{\alpha=0.05}^{\text{Dataset A}} = \mathrm{CWC}_{\alpha=0.05}^{\text{Dataset A}} = 0.77$ and $\mathrm{FINAW}_{\alpha=0.05}^{\text{Dataset B}} = \mathrm{CWC}_{\alpha=0.05}^{\text{Dataset B}} = 1.42$. Yet the FICP values of M17 are always greater than or equal to those of the proposed model. For example, in Dataset A, the FICP values of M17 are $\mathrm{FICP}_{\alpha=0.05}^{\text{Dataset A}} = 98.00\%$, $\mathrm{FICP}_{\alpha=0.10}^{\text{Dataset A}} = 96.00\%$, and $\mathrm{FICP}_{\alpha=0.15}^{\text{Dataset A}} = 96.00\%$, while the corresponding values of the proposed model are $\mathrm{FICP}_{\alpha=0.05}^{\text{Dataset A}} = 96.00\%$, $\mathrm{FICP}_{\alpha=0.10}^{\text{Dataset A}} = 92.00\%$, and $\mathrm{FICP}_{\alpha=0.15}^{\text{Dataset A}} = 94.00\%$, which are lower than those of M17. However, a high FICP value comes at the expense of a high FINAW, and $\mathrm{FICP} \geq \mathrm{PINC}$ is satisfied for all prediction models; thus, it is not necessary to widen the forecasting interval only in pursuit of higher coverage. The crucial difference between the proposed forecasting system and M17 is the use of feature selection. Comparing the results, we can conclude that feature selection enhances interval forecasting accuracy because it adaptively selects the optimal input variables and thus includes more useful information.

For M18, the values $\mathrm{FINAW}_{\alpha=0.10}^{\text{Dataset A}} = \mathrm{CWC}_{\alpha=0.10}^{\text{Dataset A}} = 0.49$ on Dataset A and $\mathrm{FINAW}_{\alpha=0.10}^{\text{Dataset B}} = \mathrm{CWC}_{\alpha=0.10}^{\text{Dataset B}} = 1.30$ on Dataset B are clearly greater than those of the proposed model, $\mathrm{FINAW}_{\alpha=0.10}^{\text{Dataset A}} = \mathrm{CWC}_{\alpha=0.10}^{\text{Dataset A}} = 0.35$ and $\mathrm{FINAW}_{\alpha=0.10}^{\text{Dataset B}} = \mathrm{CWC}_{\alpha=0.10}^{\text{Dataset B}} = 0.89$. At the other significance levels (i.e., $\alpha = 0.05$ and $\alpha = 0.15$), the FINAW and CWC values of the proposed model are also lower than those of M18. Thus, COVID-19 information on tourist destinations is an essential factor that tourists consider, and adding it can effectively improve forecasting accuracy. To further illustrate how the proposed system outperforms M18, Figure S2 (see online supplementary material) shows the interval forecasting results of the proposed system and M18 on the Jiuzhaigou dataset at $\alpha = 0.15$. Specifically, the forecasting result of M18 is given in Figure S2 (A), and the forecasting result of the proposed model is given in Figure S2 (B). In Figure S2, the triangles represent the real daily tourism volume data, and the pentagrams represent the upper and lower bounds of the forecast interval. The figure shows that the proposed model delivers better interval forecasting performance than the comparative model, since it better depicts the trend of the original tourism volume data. Although the predictions lag at some points, the proposed model reflects the changes at more data points than the comparative model does.
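The relationship among the four metrics discussed above can be made concrete with a short sketch. The Python function below assumes the definitions commonly used in the interval forecasting literature: FICP is the share of observations that fall inside the interval, FINAW is the mean interval width normalized by the range of the observations, ACE is FICP minus PINC, and CWC equals FINAW plus an exponential penalty that is applied only when FICP falls below PINC. The function name `interval_metrics`, the penalty parameter `eta`, and the toy data are illustrative assumptions and are not taken from this study; the exact CWC form used here may differ.

```python
import math


def interval_metrics(actual, lower, upper, pinc, eta=50.0):
    """Return FICP, FINAW, ACE and CWC for one set of forecast intervals.

    Assumed (commonly used) definitions, not confirmed by the article:
      FICP  = fraction of observations inside [lower, upper]
      FINAW = mean interval width / (max(actual) - min(actual))
      ACE   = FICP - PINC
      CWC   = FINAW * (1 + exp(-eta * ACE)) if ACE < 0, else FINAW
    """
    n = len(actual)
    value_range = max(actual) - min(actual)
    if n == 0 or value_range == 0:
        raise ValueError("need at least two distinct observations")

    covered = sum(1 for y, lo, hi in zip(actual, lower, upper) if lo <= y <= hi)
    ficp = covered / n
    finaw = sum(hi - lo for lo, hi in zip(lower, upper)) / (n * value_range)
    ace = ficp - pinc

    # The penalty is active only when coverage falls short of the nominal level.
    penalty = math.exp(-eta * ace) if ace < 0 else 0.0
    cwc = finaw * (1.0 + penalty)
    return {"FICP": ficp, "FINAW": finaw, "ACE": ace, "CWC": cwc}


# Toy data (hypothetical): five daily observations with interval bounds.
actual = [10, 12, 15, 11, 20]
lower = [8, 11, 13, 12, 17]
upper = [12, 14, 17, 14, 22]

met = interval_metrics(actual, lower, upper, pinc=0.80)    # coverage meets PINC
short = interval_metrics(actual, lower, upper, pinc=0.90)  # coverage falls short
```

Under these assumed definitions, the penalty term vanishes whenever ACE is non-negative, which is consistent with CWC and FINAW coinciding for every model in Table 4.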

Table 4.

Interval Forecasting Results of the Proposed Forecasting System and Other Comparative Hybrid Models in Two Datasets.

Dataset	Model	Alpha = 0.05				Alpha = 0.10				Alpha = 0.15
Dataset	Model	ACE	CWC	FICP	FINAW	ACE	CWC	FICP	FINAW	ACE	CWC	FICP	FINAW
Dataset A	M17	3.00%	0.77	98.00%	0.77	6.00%	0.63	96.00%	0.63	11.00%	0.46	96.00%	0.46
	M18	3.00%	0.76	98.00%	0.76	0.00%	0.49	90.00%	0.49	9.00%	0.47	94.00%	0.47
	M19	3.00%	0.52	98.00%	0.52	6.00%	0.51	96.00%	0.51	9.00%	0.46	94.00%	0.46
	M20	3.00%	0.57	98.00%	0.57	2.00%	0.53	92.00%	0.53	9.00%	0.50	94.00%	0.50
	M21	3.00%	0.61	98.00%	0.61	0.00%	0.57	90.00%	0.57	3.00%	0.46	88.00%	0.46
	M22	1.00%	0.54	96.00%	0.54	4.00%	0.45	94.00%	0.45	3.00%	0.39	88.00%	0.39
	Proposed Model	1.00%	0.49	96.00%	0.49	2.00%	0.35	92.00%	0.35	9.00%	0.33	94.00%	0.33
Dataset B	M17	3.00%	1.42	98.00%	1.42	8.00%	1.26	98.00%	1.26	13.00%	1.23	98.00%	1.23
	M18	3.00%	1.39	98.00%	1.39	8.00%	1.30	98.00%	1.30	13.00%	1.24	98.00%	1.24
	M19	3.00%	1.21	98.00%	1.21	8.00%	1.11	98.00%	1.11	13.00%	0.99	98.00%	0.99
	M20	1.00%	1.21	96.00%	1.21	6.00%	0.92	96.00%	0.92	7.00%	0.90	92.00%	0.90
	M21	3.00%	1.13	98.00%	1.13	4.00%	1.03	94.00%	1.03	13.00%	0.82	98.00%	0.82
	M22	3.00%	1.26	98.00%	1.26	8.00%	0.98	98.00%	0.98	13.00%	0.86	98.00%	0.86
	Proposed Model	3.00%	1.07	98.00%	1.07	8.00%	0.89	98.00%	0.89	11.00%	0.75	96.00%	0.75

We also compared M19–M22 with the proposed forecasting system. The proposed system improved on the hybrid models M19–M22, which use different multi-objective optimization algorithms, and yielded the lowest FINAW and CWC values among all the comparative models. For instance, in Dataset B, the proposed model yields the best FINAW and CWC value at $\alpha = 0.15$, $\mathrm{FINAW}_{\alpha=0.15}^{\text{Dataset B}} = \mathrm{CWC}_{\alpha=0.15}^{\text{Dataset B}} = 0.75$, which is lower than the corresponding values of M19–M22 (0.99, 0.90, 0.82, and 0.86, respectively). Thus, we can conclude that the MMOGTO has good optimization ability and is a helpful tool for improving tourism demand forecasting performance. In Dataset B, when $\alpha = 0.05$ and $\alpha = 0.10$, the FICP values of the proposed forecasting system are the best ($\mathrm{FICP}_{\alpha=0.05}^{\text{Dataset B}} = 98.00\%$ and $\mathrm{FICP}_{\alpha=0.10}^{\text{Dataset B}} = 98.00\%$); however, its FICP is not the largest in every setting (for example, 96.00% versus 98.00% for several comparative models in Dataset B at $\alpha = 0.15$). This is acceptable because the constructed interval coverage rate (FICP) still meets the required nominal coverage (PINC).
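The trade-off between coverage and width described above can be illustrated with a small Pareto-dominance check on the Table 4 values for Dataset A at α = 0.05. This is only a sketch of the general multi-objective idea (higher FICP is better, lower FINAW is better); it is not the MMOGTO procedure, and the helper names `dominates` and `non_dominated` are hypothetical.

```python
# (FICP in %, FINAW) for Dataset A at alpha = 0.05, copied from Table 4.
results = {
    "M17": (98.0, 0.77),
    "M18": (98.0, 0.76),
    "M19": (98.0, 0.52),
    "M20": (98.0, 0.57),
    "M21": (98.0, 0.61),
    "M22": (96.0, 0.54),
    "Proposed": (96.0, 0.49),
}


def dominates(a, b):
    """True if a is no worse than b on both criteria and strictly better on one.

    Each argument is a (FICP, FINAW) pair: higher FICP and lower FINAW are better.
    """
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def non_dominated(table):
    """Return the sorted names of models that no other model dominates."""
    front = []
    for name, score in table.items():
        beaten = any(
            dominates(other_score, score)
            for other_name, other_score in table.items()
            if other_name != name
        )
        if not beaten:
            front.append(name)
    return sorted(front)


front = non_dominated(results)
print(front)
```

In this illustration, only M19 and the proposed model remain non-dominated: M19 keeps the higher coverage, while the proposed model offers the narrowest interval. Because every model already satisfies FICP ≥ PINC, the comparison in the text favors the narrower interval.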

Conclusion

By comparing six hybrid models (M17–M22) with the proposed forecasting system, we showed that feature selection, the addition of COVID-19 information, and the proposed MMOGTO algorithm each improve interval forecasting performance.

Conclusion and Discussion

Conclusion

Accurate and stable tourism demand prediction is crucial for multiple parties in the tourism industry. However, because of the impact of exogenous factors such as COVID-19, tourism demand data are nonlinear and uncertain. To improve the accuracy and stability of tourism demand forecasting, a hybrid interval forecasting system based on MMOGTO, ESN, and feature selection was proposed for daily tourism demand forecasting. Two empirical studies were conducted based on datasets obtained from China and the United States. The main conclusions can be summarized as follows.

(1) Adding COVID-19 information and feature selection improved interval forecasting performance, since forecasting models with COVID-19 information and feature selection achieved lower CWC and FINAW values.

(2) A reduction coefficient successfully managed the conflict between interval width and coverage. After the adjustment, the ACE values of all the forecasting models were no lower than 0, indicating that these models can effectively capture future changes in the tourism demand series and build reasonable forecasting intervals.

(3) Compared with other optimization algorithms, including MOAOA (Zhang, Wang, Niu, et al., 2021), MODA (Mirjalili, 2016), MOGOA (Mirjalili et al., 2018), and MOSSA (Mirjalili et al., 2017), the proposed forecasting system with MMOGTO provided higher interval forecasting accuracy, verifying the effectiveness of the proposed MMOGTO.

(4) The results showed that the proposed forecasting system provided accurate and stable forecasting intervals. At all significance levels, the ACE values of the proposed model were greater than 0, and its CWC and FINAW values were lower than those of all comparative models. Thus, the proposed hybrid interval forecasting system can support the management efforts of practitioners in the tourism industry.

Theoretical Implications

This study provided a new perspective on tourism demand interval forecasting and extended the related literature. The proposed model offered a new method for tourism interval prediction against the background of the COVID-19 pandemic. COVID-19 information and related search engine data were taken as candidate input variables and adaptively selected by feature selection to help depict tourism volume trends, which gives decision-makers more useful reference information for tourism demand interval forecasting and improves forecasting reliability. An optimized interval reduction coefficient, which had not been used in previous studies, was proposed to adjust the upper and lower bounds of the forecasting intervals. Empirical results demonstrated that the forecasting intervals constructed by all predictors achieved a coverage rate no lower than the preset PINC; thus, the constructed forecasting intervals were effective, verifying the effectiveness of the optimized interval reduction coefficient. This finding encourages practitioners and managers to pay more attention to the balance between coverage rate and width in tourism demand interval forecasting, so as to construct more effective forecasting intervals and improve the accuracy and stability of tourism demand interval forecasting.

Practical Implications

During the COVID-19 pandemic, the volatile market environment has made tourism demand point forecasting more difficult, since a point forecast provides only a single expected outcome. Interval forecasting provides an expected range of future outcomes and therefore contributes to contingency planning (Wu et al., 2021). Reliable interval forecasting results during the COVID-19 pandemic provide crucial guidance for balancing the recovery of the tourism industry and the control of the epidemic spread (Li et al., 2022). Specifically, when the predicted tourism volume is low, stakeholders can adjust pricing strategies dynamically or offer attractive travel packages to draw more tourists. When the predicted tourism volume is high, practitioners can plan in advance and take relevant measures, such as arranging for temporary employees and ensuring a supply of tourism products and anti-epidemic supplies. Managers can formulate crisis plans to prevent the epidemic from spreading among stranded or gathering tourists. Moreover, traffic authorities can issue announcements that help tourists make reasonable travel arrangements and avoid congestion.

The current challenges facing the tourism industry urge managers to provide more comfortable travel experiences, more personalized services, more reasonable prices, and better sanitation initiatives (Abbas et al., 2021). For example, practitioners can provide airport shuttle services for tourists, improve the service quality of hotels and restaurants, and provide discount coupons for tourists (Liu et al., 2022). To control the risk of COVID-19 spreading, restaurant operators should provide staggered dining during expected peaks in travel demand, and beaches can be divided into separate areas to reduce crowding. In addition, providing free protective masks, gloves, and disinfectant can further increase tourist satisfaction during the pandemic.

Limitations and Future Research

Despite the superior performance of the proposed forecasting system, some limitations remain. Considering data integrity and stability, tourism demand time series data from outside the pandemic period were not used; this can be investigated in future work. Multiple-step-ahead forecasting and longer forecasting horizons should be considered. Apart from search engine data, other types of data sources, including web-based text and social media data, should also be considered. More policy factors regarding COVID-19 in tourist destinations need to be considered. Furthermore, the applicability of the proposed model to highly uncommon circumstances and a fluctuating COVID-19 situation needs to be explored in future work.

Supplemental Material

Supplemental material, sj-docx-1-jht-10.1177_10963480221142873 for Tourism Demand Interval Forecasting Amid COVID-19: A Hybrid Model With a Modified Multi-Objective Optimization Algorithm by Jianzhou Wang, Lifang Zhang, Zhenkun Liu and Xiaojia Huang in Journal of Hospitality & Tourism Research

Footnotes

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Major Program of National Social Science Foundation of China under Grant [number 17ZDA093].

ORCID iD

Lifang Zhang

Supplemental Material

Supplemental material for this article is available online.

Author Biographies

Jianzhou Wang, Ph.D. (E-mail address: wangjz@dufe.edu.cn), is a professor at Dongbei University of Finance and Economics in Dalian, China.

Lifang Zhang, B.A. (E-mail address: lifangzhang1106@126.com), is pursuing her doctoral degree at Dongbei University of Finance and Economics in Dalian, China.

Zhenkun Liu, M.A. (E-mail address: zhenkunliudufe@163.com), is pursuing his doctoral degree at Dongbei University of Finance and Economics in Dalian, China.

Xiaojia Huang, Ph.D. (E-mail address: 13321286058@163.com), is a postdoctoral researcher at Lanzhou University in Lanzhou, China.

References

Abbas, Mubeen, Iorember, P. T., Raza, Mamirkulova (2021). Exploring the impact of COVID-19 on tourism: Transformational potential and implications for a sustainable recovery of the travel and leisure industry. Current Research in Behavioral Sciences, 2, 100033. https://doi.org/10.1016/j.crbeha.2021.100033

Abdollahzadeh, Soleimanian Gharehchopogh, Mirjalili (2021). Artificial gorilla troops optimizer: A new nature-inspired metaheuristic algorithm for global optimization problems. International Journal of Intelligent Systems, 36(10), 5887–5958. https://doi.org/10.1002/int.22535

Akin (2015). A novel approach to model selection in tourism demand modeling. Tourism Management, 48, 64–72. https://doi.org/10.1016/j.tourman.2014.11.004

Álvarez-Díaz, Rosselló-Nadal (2010). Forecasting British tourist arrivals in the Balearic Islands using meteorological variables. Tourism Economics, 16(1), 153–168. https://doi.org/10.5367/000000010790872079

Artola, Pinto, Pedraza, P. De (2015). Can internet searches forecast tourism inflows? International Journal of Manpower, 36(1), 103–116. https://doi.org/10.1108/IJM-12-2014-0259

Athanasopoulos, Hyndman, R. J., Song, D. C. (2011). The tourism forecasting competition. International Journal of Forecasting, 27(3), 822–844. https://doi.org/10.1016/j.ijforecast.2010.04.009

Athanasopoulos, Song, Sun, J. A. (2018). Bagging in tourism demand modeling and forecasting. Journal of Travel Research, 57(1), 52–68. https://doi.org/10.1177/0047287516682871

J. W., Fan, Z. P. (2021). Tourism demand forecasting with time series imaging: A deep learning model. Annals of Tourism Research, 90, 103255. https://doi.org/10.1016/j.annals.2021.103255

J. W. (2021). Forecasting daily tourism demand for tourist attractions with Big Data: An ensemble deep learning method. Journal of Travel Research, 90, 103255. https://doi.org/10.1177/00472875211040569

J. W., Liu (2020). Daily tourism volume forecasting for tourist attractions. Annals of Tourism Research, 83, 102923. https://doi.org/10.1016/j.annals.2020.102923

Camacho, Pacce, M. J. (2018). Forecasting travellers in Spain with Google’s search volume indices. Tourism Economics, 24(4), 434–448. https://doi.org/10.1177/1354816617737227

Chandrashekar, Sahin (2014). A survey on feature selection methods. Computers and Electrical Engineering, 40(1), 16–28. https://doi.org/10.1016/j.compeleceng.2013.11.024

Dergiades, Mavragani, Pan (2018). Google Trends and tourists’ arrivals: Emerging biases and proposed corrections. Tourism Management, 66, 108–120. https://doi.org/10.1016/j.tourman.2017.10.014

Domingos (2012). A few useful things to know about machine learning. Communications of the ACM, 55(10), 78–87. https://doi.org/10.1145/2347736.2347755

Duan, Xie, Morrison, A. M. (2022). Tourism crises and impacts on destinations: A systematic review of the tourism and hospitality literature. Journal of Hospitality and Tourism Research, 46(4). https://doi.org/10.1177/1096348021994194

Fotiadis, Polyzos, Huan, T. C. T. C. (2021). The good, the bad and the ugly on COVID-19 tourism recovery. Annals of Tourism Research, 87, 103117. https://doi.org/10.1016/j.annals.2020.103117

Gao, Duru, Yuen, K. F. (2021). Time series forecasting based on echo state network and empirical wavelet transformation. Applied Soft Computing, 102, 107111. https://doi.org/10.1016/j.asoc.2021.107111

Gunter, Smeral, Zekan (2022). Forecasting tourism in the EU after the COVID-19 Crisis. Journal of Hospitality and Tourism Research, 109634802211251. https://doi.org/10.1177/10963480221125130

Guyon, A. M., Elisseeff, André (2003). An introduction to variable and feature selection. Journal of Machine Learning Research, 3, 1157–1182.

Havranek, Zeynalov (2021). Forecasting tourist arrivals: Google Trends meets mixed-frequency data. Tourism Economics, 27(1), 129–148. https://doi.org/10.1177/1354816619879584

Höpken, Eberle, Fuchs, Lexhagen (2021). Improving tourist arrival prediction: A big data and artificial neural network approach. Journal of Travel Research, 60(5), 998–1017. https://doi.org/10.1177/0047287520921244

Qiu, R. T. R., D. C., Song (2021). Hierarchical pattern recognition for tourism demand forecasting. Tourism Management, 164, 729–751. https://doi.org/10.1016/j.tourman.2020.104263

Song (2020). Data source combination for tourism demand forecasting. Tourism Economics, 26(7), 1248–1265. https://doi.org/10.1177/1354816619872592

Wang, Tao (2021). Wind speed forecasting based on variational mode decomposition and improved echo state network. Renewable Energy, 164, 729–751. https://doi.org/10.1016/j.renene.2020.09.109

Y. C., Jiang (2021). Tourism demand forecasting using nonadditive forecast combinations. Journal of Hospitality and Tourism Research, 109634802110478. https://doi.org/10.1177/10963480211047857

Huang, Hao (2021). A novel two-step procedure for tourism demand forecasting. Current Issues in Tourism, 24(9), 1199–1210. https://doi.org/10.1080/13683500.2020.1770705

Huang, Y. L., Lin, C. T. (2011). Developing an interval forecasting method to predict undulated demand. Quality and Quantity, 45(3), 513–524. https://doi.org/10.1007/s11135-010-9317-9

Jaipuria, Parida, Ray (2021). The impact of COVID-19 on tourism sector in India. Tourism Recreation Research, 46(2), 245–260. https://doi.org/10.1080/02508281.2020.1846971

Jiang, Liu, Niu, Zhang (2020). A combined forecasting system based on statistical method, artificial neural networks, and deep learning methods for short-term wind speed forecasting. Energy, 217, 119361. https://doi.org/10.1016/j.energy.2020.119361

Jiang, Liu, Wang, Zhang (2021). Decomposition-selection-ensemble forecasting system for energy futures price forecasting based on multi-objective version of chaos game optimization algorithm. Resources Policy, 73, 102234. https://doi.org/10.1016/j.resourpol.2021.102234

Jiang, Yang (2020). Inbound tourism demand forecasting framework based on fuzzy time series and advanced optimization algorithm. Applied Soft Computing Journal, 92, 106320. https://doi.org/10.1016/j.asoc.2020.106320

Kim, J. H., Wong, Athanasopoulos, Liu (2011). Beyond point forecasting: Evaluation of alternative prediction intervals for tourist arrivals. International Journal of Forecasting, 27(3), 887–901. https://doi.org/10.1016/j.ijforecast.2010.02.014

Kourentzes, Saayman, Jean-Pierre, Provenzano, Sahli, Seetaram, Volo (2021). Visitor arrivals forecasts amid COVID-19: A perspective from the Africa team. Annals of Tourism Research, 88, 103197. https://doi.org/10.1016/j.annals.2021.103197

Cao (2018). Prediction for tourism flow based on LSTM neural network. Procedia Computer Science, 129, 277–283. https://doi.org/10.1016/j.procs.2018.03.076

Liu, Zheng (2020). Forecasting tourist arrivals using denoising and potential factors. Annals of Tourism Research, 83, 102943. https://doi.org/10.1016/j.annals.2020.102943

(2020). Forecasting tourism demand with multisource big data. Annals of Tourism Research, 83, 102912. https://doi.org/10.1016/j.annals.2020.102912

Law (2020). Forecasting tourism demand with decomposed search cycles. Journal of Travel Research, 59(1), 52–68. https://doi.org/10.1177/0047287518824158

Pan, Law (2021). Machine learning in Internet search query selection for tourism forecasting. Journal of Travel Research, 60(6), 1213–1231. https://doi.org/10.1177/0047287520934871

Liang, Wang (2019). Intelligence in tourism management: A hybrid FOA-BP method on daily tourism demand forecasting with web search data. Mathematics, 7(6), 531. https://doi.org/10.3390/MATH7060531

Pan, Law, Huang (2017). Forecasting tourism demand with composite search index. Tourism Management, 60(6), 1213–1231. https://doi.org/10.1016/j.tourman.2016.07.005

D. C., Zhou, Liu (2019). The combination of interval forecasts in tourism. Annals of Tourism Research, 75, 363–378. https://doi.org/10.1016/j.annals.2019.01.010

Zheng (2022). Tourism demand forecasting with spatiotemporal features. Annals of Tourism Research, 94, 103384. https://doi.org/10.1016/j.annals.2022.103384

Lim, McAleer (2002). Time series forecasts of international travel demand for Australia. Tourism Management, 23(4), 389–396. https://doi.org/10.1016/S0261-5177(01)00098-X

Liu, Jiang, Wang, Niu, Zhang (2022). Hospitality order cancellation prediction from a profit-driven perspective prediction. International Journal of Contemporary Hospitality Management. https://doi.org/10.1108/IJCHM-06-2022-0737

Liu, Wang (2021). A study on the influencing factors of tourism demand from mainland China to Hong Kong. Journal of Hospitality and Tourism Research. https://doi.org/10.1177/1096348020944435

Liu, Y. Y., Tseng, F. M., Tseng, Y. H. (2018). Big Data analytics for forecasting tourism destination arrivals with the applied vector autoregression model. Technological Forecasting and Social Change, 130, 123–134. https://doi.org/10.1016/j.techfore.2018.01.018

Liu, Vici, Ramos, Giannoni, Blake (2021). Visitor arrivals forecasts amid COVID-19: A perspective from the Europe team. Annals of Tourism Research, 88, 103182. https://doi.org/10.1016/j.annals.2021.103182

Mirjalili (2016). Dragonfly algorithm: A new meta-heuristic optimization technique for solving single-objective, discrete, and multi-objective problems. Neural Computing and Applications, 27(4), 1053–1073. https://doi.org/10.1007/s00521-015-1920-1

Mirjalili, Gandomi, A. H., Mirjalili, S. Z., Saremi, Faris, Mirjalili, S. M. (2017). Salp Swarm algorithm: A bio-inspired optimizer for engineering design problems. Advances in Engineering Software, 114, 163–191. https://doi.org/10.1016/j.advengsoft.2017.07.002

Mirjalili, S. Z., Mirjalili, Saremi, Faris, Aljarah (2018). Grasshopper optimization algorithm for multi-objective optimization problems. Applied Intelligence, 48(4), 805–820. https://doi.org/10.1007/s10489-017-1019-8

Morley, Rosselló, Santana-Gallego (2014). Gravity models for tourism demand: Theory and use. Annals of Tourism Research, 48, 1–10. https://doi.org/10.1016/j.annals.2014.05.008

Nie, Jiang, Zhang (2020). A novel hybrid model based on combined preprocessing method and advanced optimization algorithm for power load forecasting. Applied Soft Computing, 97, Article 106809. https://doi.org/10.1016/j.asoc.2020.106809

Nie, Liang, Wang (2021). Ultra-short-term wind-speed bi-forecasting system via artificial intelligence and a double-forecasting scheme. Applied Energy, 301, 117452. https://doi.org/10.1016/j.apenergy.2021.117452

Niu, Wang (2019). A combined model based on data preprocessing strategy and multi-objective optimization algorithm for short-term wind speed forecasting. Applied Energy, 241, 519–539. https://doi.org/10.1016/j.apenergy.2019.03.097

Page, Song, D. C. (2012). Assessing the impacts of the global economic crisis and swine flu on inbound tourism demand in the United Kingdom. Journal of Travel Research, 51(2), 142–153. https://doi.org/10.1177/0047287511400754

Pan, D. C., Song (2012). Forecasting hotel room demand using search engine data. Journal of Hospitality and Tourism Technology, 3(3), 196–210. https://doi.org/10.1108/17579881211264486

Polyzos, Samitas, Spyridou, A. E. (2021). Tourism demand and the COVID-19 pandemic: An LSTM approach. Tourism Recreation Research, 46(2), 175–187. https://doi.org/10.1080/02508281.2020.1777053

Qin (2019). Effective passenger flow forecasting using STL and ESN based on two improvement strategies. Neurocomputing, 356, 244–256. https://doi.org/10.1016/j.neucom.2019.04.061

Qiu, R. T. R., D. C., Dropsy, Petit, Pratt, Ohe (2021). Visitor arrivals forecasts amid COVID-19: A perspective from the Asia and Pacific team. Annals of Tourism Research, 88, 103155. https://doi.org/10.1016/j.annals.2021.103155

Shin, Kang, Sharma, Nicolau, J. L. (2021). The impact of COVID-19 vaccine passport on air travelers’ booking decision and companies’ financial value. Journal of Hospitality and Tourism Research, 109634802110584. https://doi.org/10.1177/10963480211058475

Sobol’, I. M. (1967). On the distribution of points in a cube and the approximate evaluation of integrals. USSR Computational Mathematics and Mathematical Physics, 7(4), 86–112. https://doi.org/10.1016/0041-5553(67)90144-9

Song, Witt, S. F., Athanasopoulos (2011). Forecasting tourist arrivals using time-varying parameter structural time series models. International Journal of Forecasting, 27(3), 855–869. https://doi.org/10.1016/j.ijforecast.2010.06.001

Song, Lin (2010). Impacts of the financial and economic crisis on tourism in Asia. Journal of Travel Research, 49(1), 16–30. https://doi.org/10.1177/0047287509353190

Song, Wen, Liu (2019). Density tourism demand forecasting revisited. Annals of Tourism Research, 75, 379–392. https://doi.org/10.1016/j.annals.2018.12.019

Sun, Wei, Tsui, K. L., Wang (2019). Forecasting tourist arrivals with machine learning and internet search index. Tourism Management, 70, 1–10. https://doi.org/10.1016/j.tourman.2018.07.010

Trierweiler Ribeiro, Alves Portela Santos, Cocco Mariani, dos Santos Coelho (2021). Novel hybrid model based on echo state neural network applied to the prediction of stock price return volatility. Expert Systems with Applications, 184, 115490. https://doi.org/10.1016/j.eswa.2021.115490

Wang, Gao (2022). An integrated forecasting system based on knee-based multi-objective optimization for solar radiation interval forecasting. Expert Systems with Applications, 198, 116934. https://doi.org/10.1016/j.eswa.2022.116934

Wang, Lei, Liu, Peng, Liu (2019). Echo state network based ensemble approach for wind power forecasting. Energy Conversion and Management, 201, Article 112188. https://doi.org/10.1016/j.enconman.2019.112188

Wang (2022). Prediction of air pollution interval based on data preprocessing and multi-objective dragonfly optimization algorithm. Frontiers in Ecology and Evolution, 10(April). https://doi.org/10.3389/fevo.2022.855606

Wang, Niu, Yang (2020). A novel framework of reservoir computing for deterministic and probabilistic wind power forecasting. IEEE Transactions on Sustainable Energy, 11(1), 337–349. https://doi.org/10.1109/TSTE.2019.2890875

Wang, Yang (2021). Design of a combined system based on two-stage data preprocessing and multi-objective optimization for wind speed prediction. Energy, 231, Article 121125. https://doi.org/10.1016/j.energy.2021.121125

Wang, Zhao (2021). A novel combined model for wind speed prediction – Combination of linear model, shallow neural networks, and deep learning approaches. Energy, 234, 121275. https://doi.org/10.1016/j.energy.2021.121275

Wei, Wang, Niu (2021). Wind speed forecasting system based on gated recurrent units and convolutional spiking neural networks. Applied Energy, 292, 116842. https://doi.org/10.1016/j.apenergy.2021.116842

Wen, Liu, Song, Liu (2021). Forecasting tourism demand with an improved mixed data sampling model. Journal of Travel Research, 60(2), 336–353. https://doi.org/10.1177/0047287520906220

D. C., Cao, Wen, Song (2021). Scenario forecasting for global tourism. Journal of Hospitality and Tourism Research, 45(1), 28–51. https://doi.org/10.1177/1096348020919990

D. C. W., Tso, K. F. G. (2021). Forecasting tourist daily arrivals with a hybrid Sarima–Lstm approach. Journal of Hospitality and Tourism Research, 45(1), 52–67. https://doi.org/10.1177/1096348020934046

Xie, Qian, Wang (2020). A decomposition-ensemble approach for tourism forecasting. Annals of Tourism Research, 81, 102891. https://doi.org/10.1016/j.annals.2020.102891

Xie, Qian, Wang (2021). Forecasting Chinese cruise tourism demand with big data: An optimized machine learning approach. Tourism Management, 82, 104208. https://doi.org/10.1016/j.tourman.2020.104208

Yang, Fan, Jiang, Liu (2022). Search query and tourism forecasting during the pandemic: When and where can digital footprints be helpful as predictors? Annals of Tourism Research, 93, 103365. https://doi.org/10.1016/j.annals.2022.103365

Yang, Pan, Evans, J. A. (2015). Forecasting Chinese tourist volume with search engine data. Tourism Management, 46, 386–397. https://doi.org/10.1016/j.tourman.2014.07.019

Zhang, Sun, Tang, Wang (2022). Decomposition methods for tourism demand forecasting: A comparative study. Journal of Travel Research, 61(7), 1682–1699. https://doi.org/10.1177/00472875211036194

Zhang, Qian, Chen, Lei (2020). A short-term traffic forecasting model based on echo state network optimized by improved fruit fly optimization algorithm. Neurocomputing, 416, 117–124. https://doi.org/10.1016/j.neucom.2019.02.062

Zhang, Wang, Niu (2021). Wind speed prediction system based on data pre-processing strategy and multi-objective dragonfly optimization algorithm. Sustainable Energy Technologies and Assessments, 47, 101346. https://doi.org/10.1016/j.seta.2021.101346

Zhang, Wang, Niu, Liu (2021). Ensemble wind speed forecasting with multi-objective Archimedes optimization algorithm and sub-model selection. Applied Energy, 301, 117449. https://doi.org/10.1016/j.apenergy.2021.117449
