Data Mining in Tourism Data Analysis: Inbound Visitors to Japan

Abstract

The increasing power of technology puts new, advanced statistical tools at the disposal of researchers. This is one of the first research articles to use a data mining tool—namely, decision trees—to analyze the behavior of inbound tourists for the purpose of effective future destination marketing in Japan. The research results of approximately 4,000 observations show that the main motivation for visitors’ future return is not driven by experiences had during their most current visit but rather by experiences anticipated in the future, such as visiting hot springs or immersing themselves in beautiful natural settings. The data mining method largely excludes the possibility of the intrusion of researcher subjectivity and is conducive to useful discoveries of certain visitor patterns in large data sets, providing governments and destination marketing organizations with additional tools to better formulate effective destination marketing strategies.

Keywords

big data analysis, decision trees, quantitative destination marketing, data mining, Japan

Growing international tourism and travel has increased competition among tourism destinations throughout the world. Destinations are a combination of tourist products and services that create an integral experience for tourists and are consumed under the brand name of the destination (Leiper 1995; Buhalis 2000). Historically, destinations have been considered to be specific geographical locations, such as cities or countries (Hall and Hall 2000). On the other hand, there is a new trend in defining a destination as a concept that can be subjectively interpreted by tourists based on their purpose, culture, past experiences, etc. (Buhalis 2000). It is increasingly recognized that the core of any destination’s successful performance is determined by satisfied tourists who intend to return in the future and who will recommend the destination to their friends and families (Chi and Qu 2008; Assaker, Vinzi, and O’Connor 2011; Valle et al. 2006). Ryan (1991) argued that if the tourism industry is to continue satisfying tourists, it has to adopt societal marketing strategies, carefully monitor tourist satisfaction, and use the information collected to create success. Thus, understanding which attributes of a given destination create tourist satisfaction as well as the types of tourists who are willing to come back in the future enables destination marketing organizations (DMOs) to plan future destination marketing strategies that could be essential for an emerging tourist destination such as Japan.

For the past several decades, Japan has established its image as an industrial country; however, with increasing competition from other countries using its manufacturing-led growth model, it has also displayed an increasing interest in inbound tourism and has pursued several marketing campaigns to increase global awareness of Japan as a tourism destination. According to the Japan National Tourism Organization (JNTO), Japan witnessed a historical record of 19,737,409 inbound visitors in 2015, and in 2016, the Japanese government announced an ambitious plan to increase the number of annual inbound visitors to Japan to 40 million by 2020 and even to 60 million by 2030 (JNTO 2016).

As tourism becomes increasingly important to Japan as a destination, academic research is gradually starting to address questions of motivation, intention to return, positive word of mouth, and satisfaction related to Japan as a tourism destination. However, only a limited amount of research has been carried out regarding international tourism in Japan, and it is mainly qualitative in nature and includes very little empirical research. By contrast, extensive contributions that employ various qualitative and quantitative methodologies applied to research on customer satisfaction and behavioral intentions for specific destinations have enriched the current body of literature (Buhalis 1999; Pearce 2014; Som and Badarneh 2011; Okamura and Fukushige 2010; Uzama 2009; Park and Gretzel 2007; Liu, Siguaw, and Enz 2008; Pizam, Neumann, and Reichel 1978). The existing literature is nevertheless not comprehensive, and technological advances have led to methodological and statistical advancements that provide opportunities to take a broader view on tourism behavior for destination marketing by using data mining techniques (Goh, Law, and Mok 2008).

Data mining techniques are extremely underutilized in tourism research, and empirical research on Japan as a tourism destination using data mining techniques is highly limited. Hence, the purpose of this research is twofold: first, to provide an overview of data mining and its potential in tourism research using decision trees, and second, to fill the empirical research gap in the tourism literature related to Japan as a tourist destination.

This article, which presents findings related to tourist satisfaction and intention to return to Japan drawn from research using advanced statistical methods of data mining, has three specific objectives: (1) to identify the most important experiences of inbound tourists to Japan as a destination; (2) to identify the preferences, likes, and dislikes of the tourists; and (3) to identify how those experiences and preferences affect satisfaction and future intention to return to Japan as a destination choice.

Theoretical Background

Data Mining

Advances in technology and computer power have made it possible to collect immense amounts of data across many different fields. There is an increasing need for tools that will assist in extracting useful information from the growing amounts of data and turning it into knowledge (Fayyad, Piatetsky-Shapiro, and Smyth 1996). Many industries are enhancing their competitiveness by adopting data mining technology for various purposes, such as gauging customer preferences in e-commerce and retail, ascertaining medical history in health care, assessing risk factors in insurance, and gathering financial data in banking, to name just a few. The nature of the tourism and hospitality sector has made it one of the largest users of information technology (Sheldon 1994; Buhalis 1998). Information about tourists is being accumulated at an increasing pace, and it is becoming progressively more difficult for destinations to stay competitive and to increase their market share. Destination management organizations will find a growing need to use data mining if they wish to stay competitive (Pyo, Uysal, and Chang 2002).

With an acceptably accurate learning model, one can not only understand but also predict expected values in the tourism industry. For example, a tourism agency may choose to use its visitor database to predict future arrivals and patterns of consumption. Given an updated visitor profile, agencies will be more prepared, based on actual visitor behavior, to 1) meet the needs of visitors with better marketing material, 2) establish necessary collaborations with agencies such as transportation and lodging, and 3) share information and work with other destinations within a country to improve the country’s desirability for future touristic visits (Apté and Weiss 1997). While some academic articles (Buhalis 1997; Pyo, Uysal, and Chang 2002; Buhalis and Law 2008) have emphasized the need for data mining in tourism rather than empirical research with classical statistics, empirical research undertaken with data mining tools remains highly limited (Kim, Timothy, and Hwang 2011). The question could be raised, What is the advantage of data mining–based research compared to already established techniques using classical statistics? While both techniques have their advantages and disadvantages, the authors thought it would be prudent to give a brief comparison of the two techniques and provide a rationale for why data mining would be advantageous in this case.

Data mining versus classical statistics

Managing very large data sets requires skills different from those used in classical statistical analysis. Data mining manages such problems by efficiently summarizing large amounts of data, identifying patterns and relationships in previous data, and constructing predictors for the future. Classical statisticians have well-established tools for such things. Many statistical models have been utilized for explaining relationships and patterns within given data, and it is therefore tempting to think of data mining as an extended branch of statistics. However, data mining has its own merits, being capable of working with larger-scale data sets than the data sets used in classical statistics. Comparatively, there are differences in the approaches to modeling, where data mining pays less attention to the large-sample asymptotic properties of its inferences and more to the “learning,” including the complexity of the modeling and computation required by large data sets. Of course, classical statistics and data mining are similar in that they draw inferences from data. However, unlike classical statistics, data mining is more tolerant toward discrete-valued variables and seeks to minimize a loss function expressed in terms of prediction error, where minimization is achieved by cross-validation (Hosking, Pednault, and Sudan 1997).

One of the oldest definitions of data mining is “the non-trivial extraction of implicit, previously unknown, and potentially useful information from data” (Frawley, Piatetsky-Shapiro, and Matheus 1992, p. 58). Data mining uses machine learning algorithms to find patterns of relationships between data elements in large, noisy, and messy data sets, thereby facilitating actions that enhance in some form (diagnosis, profit, detection, etc.) knowledge discovery in that data (Nisbet, Elder, and Miner 2009, p. 17). As data mining evolved, a new definition was proposed, as follows: “Knowledge discovery in databases is the non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data” (Fayyad et al. 1996, p. 30).

One big difference between classical statistics and data mining is that classical statistics has large subjective components, known as predictive models, the main goal of which is to estimate parameters and/or confirm or reject hypotheses. On the other hand, from a data mining perspective, the correct model is unknown. In fact, the goal of the analysis is to discover the correct model. In classical statistics, models must be specified, whereas in data mining, a series of competing models will be specified and selected based on data examination. This preferential ordering addresses the issue of overfitting. There are many other points of difference between classical statistics and data mining techniques; however, this is not the purpose of this study. In summary, one can say that statistical learning (data mining) is much more manageable when there are no restrictions placed on the model for a given data set—that is, where analyses are data driven and the complexities of the given machine learning algorithms are dependent on the underlying distribution we desire to learn (Hosking, Pednault, and Sudan 1997).
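The idea of selecting among competing models by cross-validated prediction error can be illustrated with a minimal sketch. The data, the two candidate models, and all function names below are hypothetical and chosen only for illustration; they are not the models or data used in this study.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k roughly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_val_error(fit, X, y, k=5, seed=0):
    """Mean misclassification rate of `fit` over k held-out folds."""
    errors = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        wrong = sum(predict(X[i]) != y[i] for i in held_out)
        errors.append(wrong / len(held_out))
    return sum(errors) / k

def fit_majority(X, y):
    """Competing model 1: always predict the most common training label."""
    label = max(set(y), key=y.count)
    return lambda x: label

def fit_threshold(X, y):
    """Competing model 2: predict 1 when the feature exceeds the best training cut."""
    cuts = sorted(set(X))
    best_cut = min(cuts, key=lambda c: sum((x > c) != t for x, t in zip(X, y)))
    return lambda x: int(x > best_cut)

# Hypothetical data: a 1-5 satisfaction score and a 0/1 "would return" flag.
X = [1, 2, 2, 3, 3, 4, 4, 5, 5, 5, 1, 2, 4, 5, 3, 4, 1, 5, 2, 4]
y = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1]

candidates = {"majority": fit_majority, "threshold": fit_threshold}
cv_errors = {name: cross_val_error(fit, X, y) for name, fit in candidates.items()}
best_model = min(cv_errors, key=cv_errors.get)  # model with the lowest held-out error
```

In this toy setting the model is not fixed in advance: both candidates are fitted, and the one with the lower held-out error is retained, which is the sense in which the analysis is data driven.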

An extensive number of data mining techniques have evolved over the years, including but not limited to decision trees, neural networks, regression analysis, text mining, association rules, and clustering.

Data preparation and reduction are essential steps in data mining. Unlike the data sets used in classical statistics, data mining data sets, in which variables may number in the hundreds and observations in the millions, cannot be “eyeballed.” Yet, just as in classical statistics, the quality of the prediction and the accuracy of a model depend on the quality of the data. Variables should therefore be reduced and manipulated into analytical data sets. Finally, once a data set is cleaned and finalized, similar to classical statistics, an appropriate statistical tool is chosen for data analysis, such as neural networks, time series, decision trees, etc.

Destination Marketing

Tourism is one of the world’s major industries that contributes significantly to the global economy and has become one of the major sources of wealth for a number of developing and developed countries. Tourism takes place at destinations; consequently, a destination is taken as the fundamental unit of analysis (WTO 2002). Destinations are also a focal point of destination marketing, an essential tool of tourism destinations in an increasingly globalized and international tourism market (UNWTO 2011). Destinations are a conglomeration of tourist services and experiences (Buhalis 2000). Understanding tourists’ perceptions is essential to a successful tourism destination, because they influence a tourist’s choice of destination (Ahmed 1991), their satisfaction, and their decision whether to return (Weiermair 2000). The increasing competition among tourist destinations over the last several decades has prompted concern among destination marketing managers and industry practitioners about the perceptions of a destination by tourists (Wang and Pizam 2011). The marketability of destinations as well as the offered services, entertainment, lodging, transportation, and shopping leave an impression on visitors in terms of their sense of satisfaction and their decision whether to come back in the future. Thus, the following questions are raised: How can a DMO best communicate with stakeholders and the market? How can a DMO engage with visitors to stimulate repeat visits? Finally, how can a DMO filter the vast amount of information to obtain a set of manageable rules to predict visitor behavior and ensure visitor satisfaction and loyalty? (Pike 2012).

Destination marketers conduct extensive research to identify prospective visitors who have not yet visited (suppressed demand) and potential tourists (active demand) (Athiyaman 1997). DMOs need to know how their destination is perceived by potential visitors to better target their market, develop more appropriate tourism products, and increase destination attractiveness (Phillips and Back 2011). For example, cultural differences, the extent of planning time before a vacation, and the number of people in the group influence tourist expenditure (Laesser and Dolnicar 2012). A review of past literature shows an increasing number of articles that deal with aspects of destination marketing, customer satisfaction, and behavioral intentions in tourism overall and for a specific destination. For instance, Kozak and Rimmington (2000) looked at tourist satisfaction in Mallorca, Spain. Baloglu and McCleary (1999) looked at U.S. international pleasure travelers to four Mediterranean destinations. Yoon and Uysal (2005) studied the motivation and satisfaction of tourists in Northern Cyprus. Campelo, Aitken, and Gnoth (2011) looked at visual rhetoric in the destination marketing of New Zealand. Finally, Dwyer et al. (2014) studied destination marketing and return on investment in Australia.

Inquiring into tourist perception of a destination is generally aimed at looking into customer satisfaction and intention to return. The literature related to measuring destination marketing can be successfully arranged into two groups (Hallowell 1996). The service management literature postulates that customer satisfaction leads to customer loyalty and subsequently to profitability (Hallowell 1996; Reinartz and Kumar 2002). The marketing literature claims that if customers are happy with a product, they will purchase it again and tell their friends and relatives about it (Maxham 2001; Ranaweera and Prabhu 2003; Brown et al. 2005). Similarly, this concept could be applied to the body of tourism literature, which finds a significant correlation between satisfaction and future intention to return (Gallarza and Saura 2006; Hernández Lobato, Solis-Radilla, and Moliner-Tena 2006). A number of articles have examined differences between first-time and repeat visitors (Woodside and Lysonski 1987; Lupton 1997; Okamura and Fukushige 2010; Fuchs and Reichel 2011) and have established that repeat visitors are more likely to choose the same destination. First-timers will reduce their stereotypes and obtain a better and deeper understanding of a destination (Pool 1965). Repeaters will move beyond simple stereotyping and build a more subtle and complex understanding (Fakeye and Crompton 1991; Mishler 1965). This of course happens when sufficient time has been spent at a destination, and the tourist has had sufficient saturation through establishing different contacts and relationships (Mishler 1965).

It is generally accepted that tourist satisfaction is essential for destinations to have repeat visitors and that the intention to return to a destination depends on the level of satisfaction visitors had with its products and services.

Since this study is looking into destination experiences and attributes, we will use the definition of tourist satisfaction proposed by Pizam, Neumann, and Reichel (1978): tourist satisfaction is the result of interaction between a tourist’s experience at the destination and the expectations he or she had about that destination (p. 315).

While it is true that satisfaction and intention to return are highly important for tourism, destinations are an amalgam of tourism products and services that create experiences for the consumer. There is a plethora of important dimensions that could potentially contribute to consumer satisfaction and subsequent return. There is also an increasing trend in the recent stream of research that reveals the different dimensions that influence tourists’ destination perceptions, satisfaction, and loyalty (Table 1).

Table 1.

Analysis of Relevant Past Research.

Author(s)	Purpose(s)	Objects of Observation	Findings
Yoon and Uysal 2005	To understand tourist motivations based on the push and pull motivators, satisfaction, and destination loyalty	Northern Cyprus	Tourist destination loyalty is affected by satisfaction and experiences
Chen and Tsai 2007	To construct an integrated model of the tourist consumption process and to examine relationships between destination image, evaluative factors, and behavioral intentions	Kenting region in southern Taiwan	Destination image has the most important effect on behavioral intentions directly and indirectly
Chi and Qu 2008	To offer an integrated approach to understanding destination choice and examine empirical evidence on the causal relationships among destination image, tourist satisfaction, and loyalty	Eureka Springs, Arkansas, USA	Statistically significant relationship among destination image, attribute satisfaction, overall satisfaction, and destination loyalty; full mediation role of overall satisfaction between destination image and loyalty
Alegre and Garau 2011	To identify key drivers of sun-and-sand products on tourist satisfaction using penalty–reward analysis	Palma Airport, Spain	Penalty–reward method supported an asymmetrical relationship between satisfaction with attributes and overall satisfaction
Lee, Lee, and Lee 2014	To investigate the dynamic nature of destination image and role of satisfaction in modifying it	Seoul, Korea	There is a significant difference in the destination image between pre- and posttrip
Ramseook-Munhurrun, Seebaluck, and Naidoo 2015	To develop a conceptual model for destination image	Mauritius, East Africa	Destination image and perceived value are direct determinants of satisfaction
Özdemir and Şimşek 2015	To examine complex destination images by evaluating interconnecting relationships between destination image, perceived price, quality, value, and overall satisfaction	Izmir, Turkey	Tourist-induced destination image changes only based on the experiences he or she has
Lee, Kyle, and Scott 2012	To explore factors that drive festival visitors’ loyalty to host destinations	Pasadena, Texas, USA	Place attachment plays a mediating role in the relationship between festival satisfaction and destination loyalty
Bajs 2015	To define model of tourist-perceived value that was affected primarily by destination appearance and emotional experience	Dubrovnik, Croatia	Perceived value has a significant effect on future intentions and satisfaction

For example, Yoon and Uysal (2005) examined “push and pull” motivation factors for satisfaction and destination loyalty. “Push” motivation factors include relaxation, family togetherness, safety, and fun. “Pull” motivators include weather, shopping, cleanliness, night life, and local cuisine. “Push” motivators have a significant impact on destination loyalty, and satisfaction with a destination leads to destination loyalty. Lee, Kyle, and Scott (2012) approached destination loyalty from an events perspective. They found that satisfaction with a special event (e.g., a festival) led to destination preference and place attachment (place identity and dependence). Lee, Lee, and Lee (2014) studied change in tourist perceptions of destination image before and after a trip, and the destination image was found to have been significantly impacted by satisfaction. Destination image dimensions such as amenities and hygiene, attractions, and accessibility were viewed differently by tourists after visiting a destination. Bajs (2015) evaluated the effects of quality of touristic services, destination appearance, emotional experience, and monetary and nonmonetary costs on perceived value and subsequently on satisfaction and behavioral intentions. Joppe, Martin, and Wallen (2001) studied Toronto visitors’ perceptions of product and service attributes, such as hospitality, accommodations, safety, cuisine, and family orientation, using an importance–satisfaction model. They found that food, accommodations, and sightseeing ranked as very important and excellent on their importance–satisfaction grid, while family orientation was ranked as unimportant and unsatisfactory. Alegre and Garau (2011) found that cuisine, budget, cleanliness, climate, scenery, and access were of explicit importance to tourist satisfaction and destination competitiveness.
Chi and Qu (2008) applied an empirical integrative approach to understanding destination loyalty and used destination image, overall satisfaction, and tourist attributes. Ramseook-Munhurrun, Seebaluck, and Naidoo (2015) used tourist perception of destination image, perceived value, tourist satisfaction, and loyalty for destination marketing for the small island destination of Mauritius. In their study, the authors looked at dimensions such as travel environment, attractions, events, infrastructure, sports activities, and perceived value as antecedents of satisfaction and loyalty. While only satisfaction had an impact on loyalty, perceived value and destination image had an impact on tourist satisfaction. Finally, Özdemir and Şimşek (2015) analyzed perceptions of quality, price, and value on satisfaction and destination image. They found that perceived price and quality have a significant impact on destination image.

Another recent trend has been the increase in destination marketing research that takes into consideration specific destination attributes and their effects on tourist satisfaction and behavioral intentions. Research using advanced statistical tools, however, appears to be limited. The authors of the present study took an extensive look at the underlying dimensions of “destination image” and “attribute satisfaction.” Destination image includes dimensions such as travel environment, natural attractions, entertainment and events, infrastructure, relaxation, outdoor activities, price, and value, while attribute satisfaction includes shopping, lodging accessibility, attraction, dining, and environment. Destination image and attribute satisfaction had a significant impact on overall satisfaction and destination loyalty.

Advances in technology have increased researchers’ ability to collect, store, and run calculations on very large data sets (Pyo, Uysal, and Chang 2002). Big data analysis is slowly making progress as a valid research tool in broader social science fields. For example, in the medical field, Qu et al. (2002) used decision trees in their evaluation of a proteomic approach to the simultaneous detection and analysis of multiple proteins for the differentiation of prostate cancer patients from noncancer patients. De Reyck, Degraeve, and Vandenborre (2008) used decision trees for evaluation as an alternative approach to valuing real options based on a certainty-equivalent version of the net present value formula. Goh, Law, and Mok (2008) incorporated rough sets theory into tourism demand analysis and created a tourism climatic index using data mining techniques. They found that climate and leisure time have a stronger impact on tourist arrivals than economic factors. Wicker and Breuer (2013) used decision trees to evaluate organizational problems for the recruitment/retention of members at non-profit sports clubs. Duncan (1980) used decision trees in his evaluation of organizational structure and design. Min, Min, and Emam (2002) used a data mining approach in developing hotel customers’ profiles. Chang et al. (2016) applied data mining techniques (decision trees) for tourist loyalty intentions in the hotel sector. Specifically, the authors were looking at hotels/physical environment and social interaction to gauge customer loyalty. Kim, Timothy, and Hwang (2011) used decision trees analysis to evaluate Japanese tourists’ shopping preferences and intention to revisit Korea.

Finally, this article makes one of the first attempts to use decision trees (DTs) to analyze tourist satisfaction and intention to return by utilizing large data sets on inbound visitors and destination-specific attributes of Japan as a destination.

Japan as a Destination

Japan is still largely undiscovered by mass tourism. Mainly known for its industrial power, Japan as a tourism destination is still overshadowed by its industrial and business image. Even though a limited amount of academic research on Japan as a destination does indeed point to its tremendous potential as a tourism destination, this potential remains generally untapped.

Nevertheless, research on inbound tourism to Japan remains highly limited. According to Uzama (2009), the Japanese marketing campaign “Yokoso! Japan” was mainly unsuccessful in advertising Japan as a desirable tourism destination, and in spite of government interest in promoting Japan, it did not go beyond simple promotion. Okamura and Fukushige’s (2010) research on international tourists to Japan looked into the differences between first-time visitors and repeat visitors to the Kansai area of Japan and found that first-time visitors were interested in sightseeing, while repeat tourists were more involved and interested in participating in events. Such relatively limited existing research emphasizes the need for additional empirical research about Japan as a destination for purposes of its destination marketing.

Pyo, Uysal, and Chang (2002) emphasized a need for data mining analysis and its application to the distribution of knowledge about tourists and destinations as well as market information. The authors stressed that promotional activities could be more effective after the characteristics of the destination have been understood and defined. Destinations count their visitors in the thousands and millions, and DMOs and other government institutions have an extensive amount of data that reflects actual tourists’ behavior, but data mining is generally limited to private organizations and consulting firms in the hospitality and tourism industry. Buhalis (2000) emphasized that tourism research is extensively dynamic and that continuous research is necessary to follow developments. However, despite the possible benefits that data mining research can provide to destination marketing, empirical research using data mining techniques in the tourism industry has been sorely lacking.

Methodology

Data Mining—Decision Trees

Decision trees (DTs) are a form of multiple variable analysis. A decision tree “is a structure that can be used to divide up a large collection of records into successively smaller sets of records by applying a sequence of simple decision rules” (Berry and Linoff 2000, p. 6). Another definition of a DT provided by Nisbet, Elder, and Miner (2009) states that a “DT is a hierarchical group of relationships organized into tree-like structures, starting with one variable (like a trunk of an oak tree) called a root node” (p. 241). The root node is split into multiple branches using a split criterion. Each split is defined in terms of an impurity measure reflecting how uniform resulting cases are. Each split node is referred to as a parent node, and the following splits are called child nodes. Splits continue until the final or terminal node with the minimum number of cases is reached. For example, Figure 1 is a small illustration of decision trees used to indicate patterns of travel behavior based on age, gender, and marital status.

Figure 1.

Decision tree sample.

DTs are a very appealing method of analysis for the present study because of their relative power, ease of use, robustness, ability to handle ordinal data (Likert scale), and ease of interpretability. A DT is a collection of one-cause, one-effect relationships presented in the form of a tree. DTs try to find strong relationships between input and target variables; when a set of values is identified that has a strong relationship to a target, all those values are grouped into the bin that forms a branch of the DT.
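To make the notion of a sequence of simple decision rules concrete, the following minimal sketch walks a record from the root node to a terminal node. The tree itself (its splits on age, marital status, and gender, and its segment labels) is entirely hypothetical and is not the tree shown in Figure 1 or estimated in this study.

```python
# Hypothetical tree: the splits and segment labels are invented for illustration.
TREE = {
    "split": ("age", lambda v: v < 35),               # root node
    "yes": {
        "split": ("marital_status", lambda v: v == "single"),
        "yes": {"leaf": "segment A"},                 # terminal nodes
        "no": {"leaf": "segment B"},
    },
    "no": {
        "split": ("gender", lambda v: v == "female"),
        "yes": {"leaf": "segment C"},
        "no": {"leaf": "segment D"},
    },
}

def classify(record, node=TREE):
    """Walk from the root to a terminal node, applying one decision rule per level."""
    path = []
    while "leaf" not in node:
        variable, rule = node["split"]
        branch = "yes" if rule(record[variable]) else "no"
        path.append((variable, branch))
        node = node[branch]
    return node["leaf"], path

leaf, path = classify({"age": 28, "gender": "male", "marital_status": "married"})
# leaf is "segment B"; path records which rule was applied at each parent node
```

Each answer determines the next question asked, so the path itself is a readable rule of the form "age under 35 and not single, therefore segment B."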

Impurity-based criterion

In many cases, a DT split is done according to the value of a single variable. The most common criterion for a split would be an impurity-based split, as used in this study. The impurity-based criterion is briefly represented as follows.

Given a random variable x with k discrete values, distributed according to P = (p₁, p₂, . . . , p_k), an impurity measure is a function φ: [0,1]^k → R that satisfies the following conditions:

φ(P) ≥ 0.

φ(P) is minimum if ∃i such that component p_i = 1.

φ(P) is maximum if ∀i, 1 ≤ i ≤ k, p_i = 1/k.

φ(P) is symmetric with respect to the components of P.

φ(P) is smooth (differentiable everywhere) in its range.

Given the training set S, the probability vector of the target attribute y is defined as

P_y(S) = \left( \frac{|\sigma_{y = c_1} S|}{|S|}, \dots, \frac{|\sigma_{y = c_{|dom(y)|}} S|}{|S|} \right)

The goodness of split due to the attribute a_i is defined as a reduction in the impurity of the target attribute after partitioning S according to the values v_{i,j} ∈ dom(a_i), as follows:

\Delta\Phi(a_i, S) = \phi(P_y(S)) - \sum_{j = 1}^{|dom(a_i)|} \frac{|\sigma_{a_i = v_{i,j}} S|}{|S|} \cdot \phi(P_y(\sigma_{a_i = v_{i,j}} S))

(Maimon and Rokach 2010, p. 153).
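As a minimal sketch of the goodness-of-split formula, the function below computes the impurity of the parent node minus the size-weighted impurity of the child nodes. The Gini index is used here only as one example of an admissible impurity measure φ; the records, attribute names, and function names are hypothetical.

```python
from collections import Counter

def probability_vector(labels):
    """P_y(S): share of records in S in each observed class of the target y.

    Classes with zero records are omitted; they contribute nothing to Gini.
    """
    n = len(labels)
    return [count / n for count in Counter(labels).values()]

def gini(p):
    """Example impurity measure: 0 for a pure node, largest when p is uniform."""
    return 1.0 - sum(p_i * p_i for p_i in p)

def goodness_of_split(records, attribute, target, impurity=gini):
    """Parent impurity minus the size-weighted impurity of the child nodes."""
    parent = impurity(probability_vector([r[target] for r in records]))
    children = {}
    for r in records:
        children.setdefault(r[attribute], []).append(r[target])
    weighted = sum(
        len(labels) / len(records) * impurity(probability_vector(labels))
        for labels in children.values()
    )
    return parent - weighted

# Hypothetical records: did the visitor use a hot spring, and would they return?
records = (
    [{"onsen": "yes", "returns": "yes"}] * 3 + [{"onsen": "yes", "returns": "no"}]
    + [{"onsen": "no", "returns": "no"}] * 3 + [{"onsen": "no", "returns": "yes"}]
)
gain = goodness_of_split(records, "onsen", "returns")  # 0.5 - 0.375 = 0.125
```

The parent node is evenly split (impurity 0.5), each child is three-quarters pure (impurity 0.375), and the reduction of 0.125 is the quantity by which candidate splits are compared.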

Information gain

Out of three tests (Gini, chi-square, and entropy), the entropy information gain criterion was chosen for the purposes of this study. Information gain is an impurity-based criterion that uses the entropy measure (originating from information theory) as the impurity measure (Quinlan 1987). Entropy information gain is represented as

InformationGain(a_i, S) = Entropy(y, S) - \sum_{v_{i,j} \in dom(a_i)} \frac{|\sigma_{a_i = v_{i,j}} S|}{|S|} \cdot Entropy(y, \sigma_{a_i = v_{i,j}} S)

where

Entropy(y, S) = \sum_{c_j \in dom(y)} - \frac{|\sigma_{y = c_j} S|}{|S|} \cdot \log_2 \frac{|\sigma_{y = c_j} S|}{|S|}

(Maimon and Rokach 2010, p. 153).
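The entropy criterion can be sketched in the same way, here to show how a split variable might be chosen among several candidates. The records and attribute names are hypothetical, and the sketch illustrates the formulas only; it is not the software or the variables used in this study.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Entropy(y, S) in bits, computed from the class shares within S."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(records, attribute, target):
    """Entropy of S minus the size-weighted entropy of each partition of S."""
    partitions = {}
    for r in records:
        partitions.setdefault(r[attribute], []).append(r[target])
    remainder = sum(
        len(part) / len(records) * entropy(part) for part in partitions.values()
    )
    return entropy([r[target] for r in records]) - remainder

# Hypothetical survey rows: (used a hot spring, joined a group tour, would return)
rows = [
    ("yes", "yes", "yes"), ("yes", "no", "yes"), ("yes", "yes", "yes"),
    ("yes", "no", "no"), ("no", "yes", "no"), ("no", "no", "no"),
    ("no", "yes", "no"), ("no", "no", "yes"),
]
records = [dict(zip(("onsen", "group_tour", "returns"), row)) for row in rows]

gains = {a: information_gain(records, a, "returns") for a in ("onsen", "group_tour")}
best = max(gains, key=gains.get)  # attribute with the largest information gain
```

Here the hot spring attribute reduces entropy from 1 bit to about 0.81 bits (a gain of about 0.19), while the group tour attribute leaves each partition evenly split (a gain of 0), so the former would be selected for the split.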

Sample and data collection

Data were acquired by the Japan Travel Bureau (JTB) Foundation on behalf of the Japan Tourism Agency in 2010. The JTB is the largest travel agency in Japan and one of the largest travel agencies in the world that specializes in tourism. The JTB Foundation is a nonprofit research organization affiliated with the JTB. (The JTB was established in 1912 and became a for-profit company in 1963.) Data collection was conducted at international airports and seaports in Japan as a part of a tourist expenditure survey series undertaken for the Japan Tourism Agency. Inbound tourists to Japan were approached at random by representatives of the JTB Foundation carrying iPads. Participation in the survey was voluntary, and no monetary incentives were provided for participation. Questions were read aloud by the interviewer to the interviewees and answers recorded on the spot, after which the iPad sent the data immediately to the database.

While data mining can offer substantial benefits to research and development, using a large data set may raise legal questions and liabilities. In 2006, AOL (650,000 users) and Netflix (100 million ratings) released “anonymized” user data. Anonymization failed for both organizations, however, creating legal concerns (Walton 2014). The authors emphasize that AOL released information on about 30% of its users (total users are approximately 2.1 million [Pagliery 2015]), and Netflix released a very large number of ratings. The ratio of sample to user population for those organizations was considerably high, thus creating privacy concerns. The data collected by the JTB Foundation were anonymous, and the sample represented less than 1% of total foreign inbound tourists to Japan (8.65 million total in 2010 [Japan Tourism Research and Consulting Co. 2016]). Hence, any potential privacy threats have been eliminated.

Data were collected using Likert and binary scales, and out of a total sample size of 6,000, roughly 4,000 usable observations were obtained. Because of the large sample size, the use of classical statistical tools was not appropriate; therefore, the decision tree data mining technique was used for data analysis. Specifically, because of the binary and ordinal scales used in the survey, decision trees with two-step modeling (with two dependent variables) were used to summarize and interpret the behavioral and purchasing patterns of inbound tourists in Japan.

In this article, we use data mining as an exploratory tool and extract hidden knowledge through a set of rules that connects a collection of inputs. In a sense, DTs represent a series of questions, where an answer to a question determines the follow-up question, thereby creating a pattern. The decision tree is probably one of the most popular and powerful techniques used in data mining (Berry and Linoff 2000). DTs do not have strict assumptions concerning the functional form of the model, but they do have computational efficiency, are robust against outliers, are resistant to the curse of dimensionality, and require less data preparation than other data mining tools.

Measurements

This study employed a causal research design. The survey questionnaire consisted of the following major sections: tourist attributes of satisfaction, overall satisfaction, intention to return, and demographic questions for tourists requesting information such as country of residence, party size, gender, age, and number of children.

Attributes of satisfaction

Destination response encompassed information about the current trip to Japan, the purpose of the visit, expenditures, transportation, accommodation arrangements, shopping, sources of information, activities at destination, satisfaction with Japan as a destination, and intention to return to Japan in the future. The survey consisted of more than 150 questions measured on a five-point Likert-type scale and 0/1 binary responses.

Overall satisfaction

A single overall measure of satisfaction was used in this study for its ease of use and empirical support. Satisfaction was measured on a seven-point Likert scale, with 1 being highly dissatisfied and 7 highly satisfied.

Behavioral intentions

A single measure for intention to return was used in this study for its ease of use and empirical support. Intention to return was measured on a seven-point Likert scale, where 1 indicated definitely not returning and 7 definitely returning.

Results

The most important variables are listed in Tables 2 and 3. Out of 150 variables, 18 are represented in the tables. Variable importance was determined by the decision trees and is measured on a continuous scale from 1 to 0, with importance decreasing as the value approaches 0. Each table lists the top 15 variables, in order of importance, with satisfaction (Table 2) and future intention to return to Japan (Table 3) as the dependent variables:

Table 2.

Variables in Order of Importance for Satisfaction.

Variable Name	Importance
Experienced Japanese Food	1.000
Shopping	0.927
Availability of information about transportation	0.295
Lonely Planet as source of information about Japan prior to visit	0.285
Country of residence	0.209
Nationality	0.202
Airport of entry	0.124
Main destination visited in Japan	0.117
Main purpose of the visit to Japan	0.110
Secondary destination visited in Japan	0.091
Expenditure at the accommodation (hotel, etc.)	0.090
Prior visit to Japan	0.090
Level of expectation for business trip	0.083
Cosmetics and pharmacy expenditure	0.074
Credit card as a method of payment in Japan	0.073

Table 3.

Variables in Order of Importance for Intention to Return.

Variable Name	Importance
Experienced Japanese Food	1.000
Shopping	0.930
Transportation	0.301
Lonely Planet as a major source of information about Japan prior to visit	0.293
Which airport did you land at in Japan?	0.134
How many times have you visited Japan, including this visit?	0.124
Main area (destination) in Japan visited	0.094
Internet as a main helpful source in obtaining information while in Japan	0.090
Desire to experience nature/scenery sightseeing next visit	0.081
Flight cost	0.077
Country of residency	0.075
Want to walk around downtown in the future	0.069
Catering cost	0.064
Cosmetics and pharmacy expenditure	0.062
Nationality	0.060

Decision Tree Rules

Important variables provide a snapshot of what is important to tourists when they travel to Japan. Conversely, decision trees provide a deeper understanding by grouping and creating patterns of tourist preferences that provide higher levels of satisfaction and intention to return. Because of the large number of independent variables, it was not possible to insert a complete decision tree into this article. However, excerpts from a decision tree are shown as examples here (Figures 2 and 3).

Figure 2.

Excerpt of decision tree for Satisfaction.

Figure 3.

Excerpt of decision tree for Intention to return.

Demographics of Data

The majority of tourists came from Asian countries (62%), such as Korea (19.51%), Taiwan (18.10%), and mainland China (14.16%). The second largest group of visitors was from the United States (10.65%). From mainland China, the two largest groups were from Beijing and Shanghai. Gender was rather evenly distributed between men (56%) and women (43%). The average age was 23 years, with a standard deviation of 13 years. The airports that the majority of the tourists arrived at were Narita-Tokyo (53.88%), Kansai-Osaka (17.63%), and New Chitose-Sapporo in Hokkaido (6.212%). The data revealed that 42% of respondents were visiting Japan for the first time, 15% were visiting for the second time, and 10% for the third time. The general distribution of the groups of travelers was alone (17%), with family (21%), with one or more work colleagues (19%), and with one or more friends (19%). Additionally, 57.9% of the respondents traveled for tourism and leisure, while 25% traveled to participate in business training, conferences, or trade fairs.

Decision Trees

Odds ratios

Odds ratios are used to compare the relative odds of the occurrence of the outcome of interest (e.g., a disease or a disorder) given exposure to the variable of interest (e.g., a health characteristic, an aspect of medical history). An odds ratio is represented by the formula

\mathrm{OddsRatio}(\theta) = \frac{\pi_{11} \pi_{22}}{\pi_{21} \pi_{12}}

where

OR = 1 Exposure does not affect odds of outcome

OR > 1 Exposure associated with higher odds of outcome

OR < 1 Exposure associated with lower odds of outcome (Bland and Altman 2000).
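A minimal sketch of the odds ratio formula above follows, written in Python. The counts in the example are hypothetical and do not come from the survey. Because the ratio is unchanged when all four cells are divided by the same total, cell counts can be used in place of the cell proportions π.

```python
def odds_ratio(n11, n12, n21, n22):
    """Odds ratio of a 2x2 table: (n11 * n22) / (n21 * n12)."""
    if n12 == 0 or n21 == 0:
        raise ValueError("odds ratio is undefined when n12 or n21 is zero")
    return (n11 * n22) / (n21 * n12)


# Hypothetical counts: rows = in segment / not in segment,
# columns = outcome present / outcome absent.
example = odds_ratio(60, 40, 30, 70)  # (60 * 70) / (30 * 40) = 3.5
```

Here the odds of the outcome are 60/40 = 1.5 inside the segment and 30/70 ≈ 0.43 outside it, which gives an odds ratio of 3.5.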

Satisfaction

For the purposes of better classification with decision trees, the satisfaction variable was recoded into a binary variable, where 1 includes highly satisfied and satisfied and 0 includes everything else. This produced binary values that were rather equally distributed between 1 (50.1%) and 0 (48.9%). For the purposes of this study, the top four most important decision tree combinations (rules) were selected. The overall model’s misclassification rate is 0.14. The misclassification rate is the proportion of observations allocated to the incorrect group, calculated as the number of incorrect classifications divided by the total number of classifications. This indicates an accuracy for the model of 86%.
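The misclassification rate and the corresponding accuracy can be sketched in a few lines of Python. The label vectors below are hypothetical and serve only to show the arithmetic; they are not the model output from this study.

```python
def misclassification_rate(actual, predicted):
    """Number of incorrect classifications divided by the total number of classifications."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    wrong = sum(a != p for a, p in zip(actual, predicted))
    return wrong / len(actual)


# Hypothetical binary labels: two of ten observations are misclassified.
actual = [1, 1, 0, 0, 1, 0, 1, 0, 1, 0]
predicted = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
rate = misclassification_rate(actual, predicted)  # 2 / 10
accuracy = 1 - rate
```

By the same arithmetic, a misclassification rate of 0.14 corresponds to an accuracy of 1 − 0.14 = 0.86.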

Results—satisfaction

The odds ratio of tourists being satisfied is higher by

2.32 if the tourists are mainly from non-Asian countries, had an experience with Japanese food, paid no higher than $1,500 for airfare, purchased Japanese fruits, and shopped at a supermarket.

2.21 if the tourists are mainly from non-Asian countries, paid no higher than $1,500 for airfare, experienced Japanese food, stayed less than eight days, and stayed at a Western-style hotel.

1.64 if they are from a neighboring Asian country (Korea, China, Taiwan, Hong Kong, or Thailand); stayed at a Japanese-style inn; experienced Japanese food; came for tourism/leisure, incentive travel, study, or international conference; and came through one of the two main airports (Narita/Haneda).

1.51 if the tourists are from a neighboring Asian country (Korea, China, Taiwan, Hong Kong, or Thailand), experienced Japanese food, came for tourism or exhibition/conference/company meeting, and had visited Japan more than once before.
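Each rule listed above is a conjunction of conditions along one path of the tree. As a sketch, the first rule (odds ratio 2.32) can be written as a Python predicate. The field names are hypothetical labels chosen for this illustration and are not the variable names of the survey data set.

```python
def matches_first_satisfaction_rule(visitor):
    """True when every condition of the rule holds for one visitor record."""
    return bool(
        not visitor["from_asia"]
        and visitor["experienced_japanese_food"]
        and visitor["airfare_usd"] <= 1500
        and visitor["bought_japanese_fruit"]
        and visitor["shopped_at_supermarket"]
    )


# Hypothetical visitor record used only to exercise the predicate.
visitor = {
    "from_asia": False,
    "experienced_japanese_food": True,
    "airfare_usd": 1200,
    "bought_japanese_fruit": True,
    "shopped_at_supermarket": True,
}
in_segment = matches_first_satisfaction_rule(visitor)
```

A visitor who fails any single condition, for example by paying more than $1,500 for airfare, falls outside the segment.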

Intention to return

For the purposes of better classification with decision trees, the intention to return variable was recoded into a binary variable, where 1 includes highly likely and likely to return and 0 includes everything else. This produced binary values that were rather equally distributed between 1 (49.1%) and 0 (50.9%). The overall model’s misclassification rate is 0.13, indicating an accuracy for the model of 87%.
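The binary recoding described above can be sketched as follows in Python. The article does not state which points of the seven-point scale were mapped to 1, so the threshold of 6 used here is an assumption made for illustration, as are the sample scores.

```python
def recode_to_binary(scores, threshold=6):
    """Map Likert scores at or above the threshold to 1 and all others to 0."""
    return [1 if s >= threshold else 0 for s in scores]


def share_of_ones(binary):
    """Proportion of observations coded 1, used to check class balance."""
    return sum(binary) / len(binary)


# Hypothetical seven-point responses.
scores = [7, 6, 5, 4, 7, 3, 6, 2]
binary = recode_to_binary(scores)
balance = share_of_ones(binary)
```

A share close to 0.5 indicates that the two classes are roughly balanced, as reported for both dependent variables.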

Results—intention to return

The odds ratio of tourists having an intention to return is higher by

3.9 if the tourists experienced Japanese food, want to experience Japanese sightseeing (e.g., nature, scenery) in the future, paid no higher than $1,670 for airfare, visited Japan for the first time, and came through airports such as Narita, New Chitose (Sapporo), or Fukuoka.

3.9 if the tourists experienced a festival/event, sightseeing (nature/scenery), and Japanese food; paid no higher than $1,670 for airfare; and had visited Japan several times.

1.94 if tourists experienced Japanese food; want to experience sightseeing and/or Japanese hot springs; and came with family, spouse, or friends.

1.49 if tourists want to experience sightseeing in the future, experienced Japanese food, and paid no higher than $1,670 for airfare.

Discussion

Various studies (Pizam, Neumann, and Reichel 1978; Buhalis 2000; Weiermair 2000; Kozak and Rimmington 2000; Yoon and Uysal 2005; Chen and Tsai 2007; Liu, Siguaw, and Enz 2008; Lee, Kyle, and Scott 2012; Özdemir and Şimşek 2015) have acknowledged the significance of destination image attributes and their impact on tourist satisfaction and behavioral intentions. Even studies using large data sets (Yoon and Uysal 2005; Chen and Tsai 2007; Chi and Qu 2008; Alegre and Garau 2011; Lee, Kyle, and Scott 2012; Lee, Lee, and Lee 2014; Ramseook-Munhurrun, Seebaluck, and Naidoo 2015; Özdemir and Şimşek 2015; Bajs 2015) have confirmed and emphasized the importance of Japanese food, shopping, and transportation or information about transportation to tourists’ satisfaction and intention to return to Japan as a destination. Contrary to the popular belief that the Internet is the main source of information (Buhalis and Law 2008; Litvin, Goldsmith, and Pan 2008), Lonely Planet travel books have been a major source of information prior to visits to Japan. However, online information obtained while in Japan is very important for the intention to return. Place of arrival (airports), prior visits to Japan, country of residency, and flight cost were among the top variables selected by the DT, perhaps because of convenience and the centralization of attractions and/or businesses around airport areas. For example, Tokyo’s main international airport is Narita (Tokyo is the capital and a major business and attraction center), Osaka’s airport is Kansai (Osaka is a major trade center), and New Chitose is Sapporo’s airport (Sapporo is the capital of the northern island of Hokkaido). Another interesting point is that the use of credit cards rather than cash as a method of payment was found to be important to tourists. It was also notable that satisfaction reflects similar variables with several important differences. Preference for a type of hotel, quality of accommodations, and the two main destinations visited in Japan became important in the model (Joppe, Martin, and Waalen 2001).

Thus, the variables show an interesting variance between satisfaction and intention to return. Most of the important variables in both models (i.e., satisfaction and intention to return) are related to convenience and food, such as Japanese food, shopping, transportation, and flight cost. The models differ in two ways: for intention to return, the source of information and the desire to experience new things are important, whereas for satisfaction, accommodations and destinations within Japan are important.

The results of the decision trees indicate that there are two distinct groups—namely, Asian and non-Asian tourists—who have different preferences related to a high level of satisfaction. The main theme for non-Asian tourists is experiencing Japanese food, shopping at supermarkets, staying at Western-style hotels, staying less than eight days, and reasonable airfare costs. These findings support the results of previous research (Joppe, Martin, and Waalen 2001; Alegre and Garau 2011; Bajs 2015). For Asian tourists, higher satisfaction can be achieved by those who experience Japanese food; stay at a Japanese-style inn; come mainly for an event such as a conference, for incentive travel, or to study; and by those who have visited Japan more than once.

On the other hand, for future intent to return to Japan, nationality plays no role in the decision; what matters instead is whether the visitors are family-oriented/non-business travelers and whether they are first-time visitors. The main motivation for a visitor’s future return is not driven by experiences they had during their visit but rather by experiences they want to have when they return, such as Japanese hot springs or immersing themselves in the beauty of nature. Furthermore, experiencing Japanese food appears to remain a main attraction across all segments as a common denominator to attract all different groupings.

The decision tree analyses revealed the existence of intriguing segments irrespective of nationality, gender, total expenditure during the current visit, or even the purpose of the visits, such as a core repeater grouping of those who “experienced Japanese food, want to experience Japanese nature/scenery sightseeing next time, paid no higher than $1,670 for airfare, visited Japan for the first time, and came through airports such as Narita, New Chitose (Sapporo), or Fukuoka”—a grouping whose likelihood of returning to Japan is almost four times (3.9) higher than that of average inbound visitors.

Nationality plays an important role in the satisfaction of tourists and again separates them into two distinct groups of Asian and non-Asian. An interesting point about satisfaction is that tourists from non-Asian countries have higher odds of being satisfied than tourists from Asian countries. This could potentially be explained by the desire of non-Asians to visit a destination with a culture very different from that of their homeland. On the other hand, Tran and Ralston (2006) proposed a model pertaining to tourists’ unconscious needs and their preferences in tourism. For example, they found a connection between achievement motivation and preference for adventure in American tourists. Considering that the largest non-Asian group of tourists in this study is from the United States, the authors speculate that Japan falls into both categories as a very different culture and an adventurous destination (a unique country with a very different culture, different food, and a different language that is a long distance away from home). Also, for Asian countries, airfare is no longer an important variable, which could probably be explained by the closer proximity. For example, the likelihood of satisfaction is a little more than double (2.32) for “tourists from non-Asian countries who had experience with Japanese food, paid no higher than $1,500 for airfare, purchased Japanese fruits, and shopped at a supermarket.” Conversely, the odds increase by a little more than half (1.64) “if they are from a neighboring Asian country (Korea, China, Taiwan, Hong Kong, or Thailand); stayed at a Japanese-style inn; experienced Japanese food; came for tourism/leisure, incentive travel, study, or an international conference; and came through one of the two main airports (Narita/Haneda).”

Managerial Implications

Data mining is a data-driven technique that can analyze data without introducing any major subjectivity of data analysts who may have preferred approaches or agendas associated with their past research streams. Therefore, by its structure, it does not lead to the verification of a specific existing theory unless the researchers are dealing with multiple data sets and see commonalities across them.

Data mining, however, presents unique managerial implications in that the resulting analysis can more effectively identify certain combinations of profiles and characteristics of visitors without relying on the study design or the subjective judgment of the researchers. In other words, data mining results can present more compelling responses to questions of marketing return on investment on expenditures by government marketing and DMOs by identifying the specific grouping of potential visitors who are more likely to come back as repeaters than any other combinations of groupings, based purely on the analysis of the objective big data in question. Odds ratios are rather easy to interpret for non-academic practitioners, and, with an understanding of groupings with higher odds ratios, DMOs may put a strategy in place to aim at groups whose odds ratios are higher than certain discretionary benchmarks. For example, a DMO’s strategy to market to groups with an odds ratio higher than 1.5 means the DMO is aiming at groups who have at least a 50% higher likelihood (of being satisfied with the destination or of coming back to the destination), with an expectation that the same marketing expenditures would work more efficiently than targeting the average at large. Data mining results might not corroborate prior knowledge, prior beliefs, or myths, as the structure does not pay attention to those but rather purely seeks winning combinations of groupings by mathematical computations in an objective manner. As governments and DMOs can access big data on tourists, data mining techniques open a new chapter for their quantitative data analysis to aid managerial decisions for better allocation of their limited resources into segments more likely to be enticed back to their destination than the market average.
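The benchmark strategy described above can be sketched in Python as a simple filter. The odds ratios are those reported in the Results section for intention to return; the short segment labels are shorthand introduced here for illustration, and the default benchmark of 1.5 is the example value from the text rather than a recommended threshold.

```python
# Odds ratios reported for intention to return; labels are illustrative shorthand.
INTENTION_TO_RETURN = {
    "food + future sightseeing + airfare <= $1,670 + first visit + listed airports": 3.9,
    "festival + sightseeing + food + airfare <= $1,670 + repeat visitor": 3.9,
    "food + future sightseeing/hot springs + family, spouse, or friends": 1.94,
    "future sightseeing + food + airfare <= $1,670": 1.49,
}


def select_segments(odds_ratios, benchmark=1.5):
    """Return (label, odds ratio) pairs above the benchmark, highest first."""
    chosen = [(label, ratio) for label, ratio in odds_ratios.items() if ratio > benchmark]
    return sorted(chosen, key=lambda pair: pair[1], reverse=True)


targets = select_segments(INTENTION_TO_RETURN)
```

With a benchmark of 1.5, the first three segments are retained and the fourth (1.49) falls just below the cutoff, which shows how sensitive the selection is to the discretionary benchmark.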

Study Limitations and Future Research

This study has several structural limitations, and we believe that acknowledging these limitations may lead to more viable future research in the field of quantitative destination marketing. First, our research is based on the data of visitors who came to Japan, representing a small fraction of all travelers—including those who decided not to visit Japan and to whom we cannot extrapolate our findings. Second, our research data were collected at one period in the study year of 2010, after which Japan saw a huge drop in visitors because of the Great East Japan Earthquake and simultaneous radiation leaks from nuclear power plants in Fukushima in 2011. The total number of inbound visitors to Japan reached the national goal of 10 million in 2013, overcoming the negative impacts on inbound visitors in the two years prior. We do not have any evidence to support whether our findings based on 2010 data have temporal stability with later data. This indicates that updated research on the data may generate an interesting answer to the issue of temporal stability of the behavior of inbound visitors.

Finally, we had no involvement in the data collection process or the design of the survey; thus, we have dealt with secondary data collected by professionals. The lack of direct data collection experience with the data set may prevent us from having certain insights that may be useful in evaluating the data set, or it may have just saved us from any sampling errors associated with data input. Future research may be performed with a survey exclusively on satisfaction and repeat intents, without piggybacking on the visitors’ expenditure survey, the length of which may partly explain the relatively high number of incomplete surveys.

Footnotes

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

Author Biographies

Valeriya Shapoval is an assistant professor at Rosen College, University of Central Florida. She received her PhD and Data Mining Certificate in 2016 from the University of Central Florida. Her research addresses Big Data analysis, destination marketing, human resources, and organizational behavior.

Morgan C. Wang received his PhD from Iowa State University in 1991. He is the founding director of the Data Mining Program and professor of Statistics at the University of Central Florida. He has published one book (Integrating Results through Meta-Analytic Review Using SAS Software, SAS Institute, 1999) and over 80 papers in refereed journals and conference proceedings on topics including interval analysis, meta-analysis, computer security, business analytics, health care analytics, and data mining.

Tadayuki Hara is an associate professor at UCF and served as a visiting professor at Yamaguchi University. His research areas are hotel management, economic impact of tourism/culture/aviation, tourism taxation and planning, funding side of tourism infrastructures, and destination marketing. He has published papers in Tourism Economics, Cornell Hospitality Quarterly, Pan Pacific Association of Input-Output Studies, Journal of Heritage Tourism, Journal of Regional Analysis & Policy, Journal of Convention & Event Tourism, Journal of Tourism Economics, Policy and Hospitality Management and published a technical textbook, “Quantitative Tourism Industry Analysis: Introduction to Input-Output, Social Accounting Matrix Modeling and Tourism Satellite Accounts.”

Hideo Shioya is a manager of the Tourism Economic Research Department at the Japan Travel Bureau Foundation. Having trained in econometrics at the Graduate School of Tsukuba University, he has worked as a researcher on the demand side of the tourism market, tourism statistics, economic impacts, and inbound visitors and has compiled several statistical surveys conducted by the Japanese government. His recent research interests address funding sources for tourism management and the impact of the emergence of self-driving cars on tourism.

References

Ahmed

Z. U.

1991. “The Influence of the Components of a State’s Tourist Image on Product Positioning Strategy.” Tourism management 12 (4): 331–40.

Alegre

Garau

2011. “The Factor Structure of Tourist Satisfaction at Sun and Sand Destinations.” Journal of Travel Research 50 (1): 78–86.

Apté

Weiss

1997. “Data Mining with Decision Trees and Decision Rules.” Future Generation Computer Systems 13 (2): 197–210.

Assaker

Vinzi

V. E.

O’Connor

2011. “Examining the Effect of Novelty Seeking, Satisfaction, and Destination Image on Tourists’ Return Pattern: A Two Factor, Non-linear Latent Growth Model.” Tourism Management 32 (4): 890–901.

Athiyaman

1997. “Knowledge Development in Tourism: Tourism Demand Research.” Tourism Management 18 (4): 221–8.

Bajs

I. P.

2015. “Tourist Perceived Value, Relationship to Satisfaction, and Behavioral Intentions the Example of the Croatian Tourist Destination Dubrovnik.” Journal of Travel Research 54 (1): 122–34.

Baloglu

McCleary

K. W.

1999. “US International Pleasure Travelers’ Images of Four Mediterranean Destinations: A Comparison of Visitors and Nonvisitors.” Journal of Travel Research 38 (2): 144–52.

Berry

M. J.

Linoff

2000. Mastering Data Mining, 1st ed. Canada: Wiley Computer.

Bland

J. M.

Altman

D. G.

2000. “The Odds Ratio.” British Medical Journal 320 (7247): 1468.

10.

Brown

T. J.

Barry

T. E.

Dacin

P. A.

Gunst

R. F.

2005. “Spreading the Word: Investigating Antecedents of Consumers’ Positive Word-of-Mouth Intentions and Behaviors in a Retailing Context.” Journal of the Academy of Marketing Science 33 (2): 123–38.

11.

Buhalis

1997. “Information Technology as a Strategic Tool for Economic, Social, Cultural and Environmental Benefits Enhancement of Tourism at Destination Regions.” Progress in Tourism and Hospitality Research 3 (1): 71–93.

12.

Buhalis

1998. “Strategic Use of Information Technologies in the Tourism Industry.” Tourism Management 19 (5): 409–21.

13.

Buhalis

1999. “Tourism on the Greek Islands: Issues of Peripherality, Competitiveness and Development.” International Journal of Tourism Research 1 (5): 341–58.

14.

Buhalis

2000. “Marketing the Competitive Destination of the Future.” Tourism Management 21 (1): 97–116.

15.

Buhalis

Law

2008. “Progress in Information Technology and Tourism Management: 20 Years on and 10 Years after the Internet—The State of eTourism Research.” Tourism Management 29 (4): 609–23.

16.

Campelo

Aitken

Gnoth

2011. “Visual Rhetoric and Ethics in Marketing of Destinations.” Journal of Travel Research 50 (1): 3–14.

17.

Chang

Chen

Kuo

Hsu

Cheng

2016. “Applying Data Mining Methods to Tourist Loyalty Intentions in the International Tourist Hotel Sector.” Anatolia: An International Journal of Tourism & Hospitality Research 27 (2): 271–74.

18.

Chen

C. F.

Tsai

2007. “How Destination Image and Evaluative Factors Affect Behavioral Intentions?” Tourism Management 28 (4): 1115–22.

19.

Chi

C. G. Q.

2008. “Examining the Structural Relationships of Destination Image, Tourist Satisfaction and Destination Loyalty: An Integrated Approach.” Tourism Management 29 (4): 624–36.

20.

De Reyck

Degraeve

Vandenborre

2008. “Project Options Valuation with Net Present Value and Decision Tree Analysis.” European Journal of Operational Research 184 (1): 341–55.

21.

Duncan

1980. “What Is the Right Organization Structure? Decision Tree Analysis Provides the Answer.” Organizational Dynamics 7 (3): 59–80.

22.

Dwyer

Pham

Forsyth

Spurr

2014. “Destination Marketing of Australia Return on Investment.” Journal of Travel Research 53 (3): 281–95.

23.

Fakeye

P. C.

Crompton

J. L.

1991. “Image Differences between Prospective, First-Time, and Repeat Visitors to the Lower Rio Grande Valley.” Journal of Travel Research 30 (2): 10–16.

24.

Fayyad

U. M.

Piatetsky-Shapiro

Smyth

Uthurusamy

1996. “Advances in Knowledge Discovery and Data Mining.” AI Magazine 17 (3): 37–54.

25.

Frawley

W. J.

Piatetsky-Shapiro

Matheus

C. J.

1992. “Knowledge Discovery in Databases: An Overview.” AI Magazine 13 (3): 57.

26.

Fuchs

Reichel

2011. “An Exploratory Inquiry into Destination Risk Perceptions and Risk Reduction Strategies of First Time vs. Repeat Visitors to a Highly Volatile Destination.” Tourism Management 32 (2): 266–76.

27.

Gallarza

M. G.

Saura

I. G.

2006. “Value Dimensions, Perceived Value, Satisfaction and Loyalty: An Investigation of University Students’ Travel Behavior.” Tourism Management 27 (3): 437–52.

28.

Goh

Law

Mok

H. M.

2008. “Analyzing and Forecasting Tourism Demand: A Rough Sets Approach.” Journal of Travel Research 46 (3): 327–38.

29.

Hall

C. M.

Hall

2000. Tourism Planning: Policies, Processes and Relationships. Upper Saddle River, NJ: Pearson Education.

30.

Hallowell

1996. “The Relationships of Customer Satisfaction, Customer Loyalty, and Profitability: An Empirical Study.” International Journal of Service Industry Management 7 (4): 27–42.

31.

Hernández Lobato

Solis-Radilla

M. M.

Moliner-Tena

M. A.

Sánchez-García

2006. “Tourism Destination Image, Satisfaction and Loyalty: A Study in Ixtapa-Zihuatanejo, Mexico.” Tourism Geographies 8 (4): 343–58.

32.

Hosking

J. R.

Pednault

E. P.

Sudan

1997. “A Statistical Perspective on Data Mining.” Future Generation Computer Systems 13 (2): 117–34.

33.

JNTO (Japan National Tourism Organization). 2012. “Foreign Visitors and Japanese Departures.” Report prepared by Japan National Tourism Organization. (Accessed on November 2014).

34.

Japan Tourism Research and Consulting Co. 2016. “Historical Statistics–Visitors to Japan from Overseas.” Report prepared by National Tourism Organization, January 15, 2016 http://www.tourism.jp/en/statistics/inbound/.

35.

Joppe

Martin

D. W.

Waalen

2001. “Toronto’s Image as a Destination: A Comparative Importance-Satisfaction Analysis by Origin of Visitor.” Journal of Travel Research 39 (3): 252–60.

36.

Kim

S. S.

Timothy

D. J.

Hwang

2011. “Understanding Japanese Tourists’ Shopping Preferences Using the Decision Tree Analysis Method.” Tourism Management 32 (3): 544–54.

37.

Kozak

Rimmington

2000. “Tourist Satisfaction with Mallorca, Spain, as an Off-Season Holiday Destination.” Journal of Travel Research 38 (3): 260–69.

38.

Laesser

Dolnicar

2012. “Impulse Purchasing in Tourism—Learnings from a Study in a Matured Market.” Anatolia 23 (2): 268–86.

39.

Lee

C. K.

Lee

2014. “Dynamic Nature of Destination Image and Influence of Tourist Overall Satisfaction on Image Modification.” Journal of Travel Research 53 (2): 239–51.

40.

Lee

J. J.

Kyle

Scott

2012. “The Mediating Effect of Place Attachment on the Relationship between Festival Satisfaction and Loyalty to the Festival Hosting Destination.” Journal of Travel Research 51:754–67.

41.

Leiper

1995. Tourism Management, Vol. 455. French Forest: Pearson Education Australia.

42.

Litvin

S. W.

Goldsmith

R. E.

Pan

2008. “Electronic Word-of-Mouth in Hospitality and Tourism Management.” Tourism Management 29 (3): 458–68.

43.

Liu

Siguaw

J. A.

Enz

C. A.

2008. “Using Tourist Travel Habits and Preferences to Assess Strategic Destination Positioning the Case of Costa Rica.” Cornell Hospitality Quarterly 49 (3): 258–81.

44.

Lupton

R. A.

1997. “Customer Portfolio Development: Modeling Destination Adopters, Inactives, and Rejecters.” Journal of Travel Research 36 (1): 35–43.

45.

Maimon

Rokach

2010. Decomposition Methodology for Knowledge Discovery and Data Mining. Boston, MA: Springer US.

46.

Maxham

J. G.

2001. “Service Recovery’s Influence on Consumer Satisfaction, Positive Word-of-Mouth, and Purchase Intentions.” Journal of Business Research 54 (1): 11–24.

47.

Min

Emam

2002. “A Data Mining Approach to Developing the Profiles of Hotel Customers.” International Journal of Contemporary Hospitality Management 14 (6): 274–85.

48.

Mishler

A. L.

1965. “Personal Contact in International Exchange.” In International Behavior, a Social-Psychological Analysis, edited by Kelman.

Herbert C.

New York: Holt, Rinehart & Winston.

49.

Nisbet

Elder

IV Miner

2009. Handbook of Statistical Analysis and Data Mining Applications. Burlington, MA: Elsevier.

50.

Okamura

Fukushige

2010. “Differences in Travel Objectives between First-Time and Repeat Tourists: An Empirical Analysis for the Kansai Area in Japan.” International Journal of Tourism Research 12 (6): 647–64.

51. Özdemir and Ö. F. Şimşek. 2015. “The Antecedents of Complex Destination Image.” Procedia-Social and Behavioral Sciences 175:503–10.

52. Pagliery. 2015. “OMG: 2.1 Million People Still Use AOL Dial-up.” CNN Money, May 8. http://money.cnn.com/2015/05/08/technology/aol-dial-up/ (accessed December 15, 2016).

53. Park, Y. A., and Gretzel. 2007. “Success Factors for Destination Marketing Web Sites: A Qualitative Meta-analysis.” Journal of Travel Research 46 (1): 46–63.

54. Pearce, D. G. 2014. “Toward an Integrative Conceptual Framework of Destinations.” Journal of Travel Research 53 (2): 141–53.

55. Phillips, W. J., and K. J. Back. 2011. “Conspicuous Consumption Applied to Tourism Destination.” Journal of Travel & Tourism Marketing 28 (6): 583–97.

56. Pike. 2012. Destination Marketing, 1st ed. Burlington, MA: Elsevier.

57. Pizam, Neumann, and Reichel. 1978. “Dimensions of Tourist Satisfaction with a Destination Area.” Annals of Tourism Research 5 (3): 314–22.

58. Pool, I. D. 1965. “Effects of Cross-National Contact on Nation and International Images.” In International Behavior: A Social Psychological Analysis, edited by Herbert C. Kelman. New York: Holt, Rinehart & Winston.

59. Pyo, Uysal, and Chang. 2002. “Knowledge Discovery in Database for Tourist Destinations.” Journal of Travel Research 40 (4): 374–84.

60. Adam, B. L., Yasui, M. D. Ward, L. H. Cazares, P. F. Schellhammer, and G. L. Wright. 2002. “Boosted Decision Tree Analysis of Surface-Enhanced Laser Desorption/Ionization Mass Spectral Serum Profiles Discriminates Prostate Cancer from Noncancer Patients.” Clinical Chemistry 48 (10): 1835–43.

61. Quinlan, J. R. 1987. “Simplifying Decision Trees.” International Journal of Man-Machine Studies 27 (3): 221–34.

62. Ramseook-Munhurrun, V. N. Seebaluck, and Naidoo. 2015. “Examining the Structural Relationships of Destination Image, Perceived Value, Tourist Satisfaction and Loyalty: Case of Mauritius.” Procedia-Social and Behavioral Sciences 175:252–59.

63. Ranaweera and Prabhu. 2003. “On the Relative Importance of Customer Satisfaction and Trust as Determinants of Customer Retention and Positive Word of Mouth.” Journal of Targeting, Measurement and Analysis for Marketing 12 (1): 82–90.

64. Reinartz and Kumar. 2002. “The Mismanagement of Customer Loyalty.” Harvard Business Review 80 (7): 86–94. https://login.ezproxy.net.ucf.edu/login?auth=shibb&url=http://search.ebscohost.com/login.aspx?direct=true&db=buh&AN=6899224&site=eds-live&scope=site (accessed June 5, 2015).

65. Ryan. 1991. “Tourism and Marketing—A Symbiotic Relationship?” Tourism Management 12 (2): 101–11.

66. Sheldon. 1994. “Information Technology and Computer Systems.” In Tourism Marketing and Management Handbook, 2nd ed., edited by S. F. Witt and Moutinho, 126–30. Hemel Hempstead, UK: Prentice Hall.

67. Som, A. P. M., and M. B. Badarneh. 2011. “Tourist Satisfaction and Repeat Visitation; Toward a New Comprehensive Model.” International Journal of Human and Social Sciences 6 (1): 38–45.

68. Tran and Ralston. 2006. “Tourist Preferences Influence of Unconscious Needs.” Annals of Tourism Research 33 (2): 424–41.

69. UNWTO. 2011. Policy and Practice for Global Tourism. Madrid: UNWTO.

70. Uzama. 2009. “Marketing Japan’s Travel and Tourism Industry to International Tourists.” International Journal of Contemporary Hospitality Management 21 (3): 356–65.

71. Valle, P. O. D., J. A. Silva, Mendes, and Guerreiro. 2006. “Tourist Satisfaction and Destination Loyalty Intention: A Structural and Categorical Analysis.” International Journal of Business Science and Applied Management 1 (1): 25–44.

72. Walton, David. 2014, March 28. “Big Data Raises Big Legal Issues.” Inside Counsel. http://www.insidecounsel.com/2014/03/28/big-data-raises-big-legal-issues (accessed June 10, 2015).

73. Wang and Pizam. 2011. Destination Marketing and Management: Theories and Applications, 1st ed. Oxfordshire, UK: CABI.

74. Weiermair. 2000. “Tourists’ Perceptions towards and Satisfaction with Service Quality in the Cross-Cultural Service Encounter: Implications for Hospitality and Tourism Management.” Managing Service Quality 10 (6): 397–409.

75. Wicker and Breuer. 2013. “Exploring the Critical Determinants of Organizational Problems Using Data Mining Techniques: Evidence from Non-profit Sports Clubs in Germany.” Managing Leisure 18 (2): 118–34.

76. Woodside, A. G., and Lysonski. 1989. “A General Model of Traveler Destination Choice.” Journal of Travel Research 27 (4): 8–14.

77. WTO. 2002. “Thinktank. World Tourism Organisation.” http://www.world-tourism.org/education/menu.html (accessed May 15, 2015).

78. Yoon and Uysal. 2005. “An Examination of the Effects of Motivation and Satisfaction on Destination Loyalty: A Structural Model.” Tourism Management 26 (1): 45–56.