Abstract
This study examines the relationship between distance measures and a Portuguese data set consisting of 34,622 online hotel reviews extracted from Booking.com and TripAdvisor written in Portuguese, Spanish, and English. Based on the country of origin of each review author, geographic and psychic distance measures are calculated relative to Portugal. Data and text mining analysis provides additional insights into online hotel ratings. The authors confirm that online travelers’ evaluations are multifaceted constructs displaying varying patterns of rating behavior among the traveler base. By investigating the contemporary relevance of geographic and psychic distance, a key finding of this study is that travelers with less distance, in both psychic and geographic terms, give a lower rating score than travelers with greater distance. The inclusion of psychic and geographic distance is advocated as a salient aspect for future researchers and for those practitioners who wish to enhance hotel product and service features.
Introduction
The increasing focus on customer experience by practitioners has led to the creation of a rich seam of research pertaining to online hotel ratings. Travelers’ purchasing decisions are increasingly being influenced by online reviews (Cantallops and Salvi 2014; Kwok, Xie, and Richards 2017; Ring, Tkaczynski, and Dolnicar 2016; Tan, Lv, and Gursoy 2018). Indeed, from such investigations various questions arise, and are answered. We know from several contributions that culture (Gao et al. 2018), language (Antonio et al. 2018a; Goethals 2016; Wu et al. 2017), and travel experience (Lu and Stepchenkova 2015; Morosan and Bowen 2017) are among factors that have an influence on online hotel ratings. The continued growth of data-generating platforms has inspired new approaches to understanding the traveler experience. Such subjective rating information is now expressed and published in more than 70 diverse platforms, including popular online booking websites such as Booking.com and TripAdvisor (Phillips et al. 2015).
Reviews from Booking.com and TripAdvisor possess two main types of ratings: quantitative (the overall review score) and qualitative (the textual component being the commentary). Although there are numerous studies on the subject of online reviews, most of them focus on the quantitative ratings of reviews to represent user opinion (Duan et al. 2016), but recent works are advocating the use of the textual component of reviews (Antonio et al. 2018a; Bjørkelund, Burnett, and Nørvag 2012; Duan et al. 2016; Han et al. 2016; Xiang et al. 2015; Xu and Li 2016), the rationale being the textual component has the potential to allow for better recognition of “guests’ true feelings” (Han et al. 2016, 17).
The initial interest in online hotel ratings has been maintained by researchers advocating the merits of understanding evaluations posted on web and social media sites (Floyd et al. 2014; Kostyra et al. 2016; Schuckert, Liu, and Law 2015). Li et al. (2018) provide a thorough review of big data and online hotel ratings research. To date, researchers have mainly focused on using sentiment analysis, which automatically detects the valence (positive, neutral, or negative) of a piece of reviewer text (Geetha, Singha, and Sinha 2017). Prior tourism-related research has assessed the valence of reviews (Duverger 2013; Sparks and Browning 2011), their volume (Xie, Zhang, and Zhang 2014), and variance (Melian-Gonzalez, Bulchand-Gidumal, and Gonzalez Lopez-Valcarcel 2013). Additional insights can be derived from studying semantic relationships and meaning in online hotel ratings (Alaei, Becken, and Stantic 2017; Phillips et al. 2016; Xiang et al. 2015; Xu and Li 2016). The degree of positivity or negativity toward the main textual subject of online hotel ratings (semantic analysis) is currently a hotbed of research and development for academics and practitioners (Ge, Vazquez, and Gretzel 2018). Continually falling prices for overseas travel, together with a more favorable US dollar exchange rate, have in part accelerated the international dimension of online hotel ratings. Online travelers’ preferences have been investigated from many facets, but distance offers a fresh perspective. Indeed, in an online environment the concept of distance needs to go beyond geographic distance (Deodhar, Subramani, and Zaheer 2017). Similar to Deodhar, Subramani, and Zaheer (2017), this study incorporates a set of psychic measures, which are one of the most popular forms of distance (Safari, Thilenius, and Hadjikhani 2013).
Psychic distance is likened to “the sum of factors” or the “differences” that go beyond the objective criteria of geographic and cultural distance per se (see Yang, Liu, and Li 2019) and incorporate other factors such as business development, industrial development, and education differences too (Dow and Karunaratna 2006).
Using online hotel ratings, this study explores the relationship between distance measures. The authors generate a data set consisting of reviews and associated ratings on Booking.com and TripAdvisor for Portuguese hotels in three different languages (Portuguese, Spanish, and English). The country of origin of each review author is collected in order to derive a geographic and psychic distance measure between the author’s country of origin and Portugal. A more technical aspect of this study is the sentiment analysis of reviews whereby each review is assigned a sentiment score based on a dictionary approach that involves calculating a ratio of the terms with positive and negative sentiments to derive an overall score. This technical approach of text-mining analysis provides additional insights into online hotel ratings.
Dissimilarities among travelers will influence their preferences with respect to hotel attributes (Banarjee and Chua 2016), which is particularly pertinent for heterogeneous groups of travelers. Such differences can be understood through social identity theory (Tajfel 1982), whereby network-based communities may act not as individuals but develop a common, social identity. Individuals perceive themselves and others as belonging to various social groups, which from the perspective of the hotel may result in different evaluative online hotel rating statements from guests based on their distance measures. For example, although culture is an important factor in decision making, other factors such as online platforms may affect the decision-making process of travelers. Given the relatively nascent state of research, there is limited empirical work directly related to how the country of origin affects rating behavior. Some recent studies (Kim 2018; Gao et al. 2018) have furthered this path by looking at the effects of country of origin, as well as those of culture, on online review ratings. Less well examined too is the role of language in online hotel ratings (Schuckert, Liu, and Law 2015; Liu et al. 2017). Moreover, research is lacking on how distance influences travelers’ ratings, together with the dangers of aggregating reviews written in multiple languages.
Why does this matter? An accurate understanding of salient online hotel ratings’ relationships is essential for both strategic management and marketing theory and practice. The economic and societal impact of tourism across global markets is a priority for governments, the private sector, and societal-oriented organizations. Hotels play a pivotal role in a country’s tourism product. Travelers of varying distances may possess different expectations in areas unknown to those responsible for marketing strategies at the individual and destination level. An understanding of such relationships may therefore help advance a more effective connectivity among the online hotel ratings database, which is a key strategic resource.
The remainder of the article is organized as follows. In the next section, we consider online hotel ratings and the relevant distance literature and present the research questions that describe the positioning of the study. The data and methods are outlined. Finally, we present the data analysis and results of our empirical analysis and round off with the conclusion and theoretical and managerial implications.
Online Hotel Ratings
Hoteliers believe that their performance will be hampered if they are unable to reliably monitor online hotel ratings, and results from recent academic studies support this (Duverger 2013; Phillips et al. 2016; Xie, Zhang, and Zhang 2014). A consequence of the popularity of online hotel ratings is that reviews now constitute a new element of the marketing communication mix and have implications for both theory and practice. The decomposition of online reviews into their main elements of valence, variance, and volume has been one way to obtain a better understanding into the relevance of each aspect of firm performance (see, e.g., Floyd et al. 2014; Kostyra et al. 2016 for an overview).
Online hotel ratings not only capture online reviews, recommendations, and opinions exchanged by consumers (Cantallops and Salvi 2014) but also form the bases on which consumers may revise their purchase decisions and ultimately change their buying behavior (Cantallops and Salvi 2014; Sparks and Browning 2011). Online hotel ratings create a resource whereby reviewers, review readers, and managers can use either quantitative or qualitative techniques to consider outcomes in terms of consumer decision-making and business performance (Kwok, Xie, and Richards 2017). In fact, because of the diversity of opinion, independence, decentralization, and aggregation, according to Surowiecki (2005), users who post online reviews can be considered a “crowd.” Being a diverse collection of independent individuals who are better at making certain decisions and predictions than its individual members or even, better than experts, explains why, these days, consumers attach more value to online hotel ratings than hotels’ official classifications or stars (Öğüt and Onur Taş 2012). A number of companies such as Olery.com (www.olery.com), ReviewPro (www.reviewpro.com), and Revinate (www.revinate.com) have sprung up to develop reputational management systems that show how improving guest satisfaction can translate and enhance revenues (Hensens 2015). Another firm with a strong presence is Brand Karma, which is an in-house marketing agency of the Next Story Group (www.nextstory.com). This firm has the ability to filter western social media channels from Chinese social media channels.
To illustrate the benefits that can accrue, consider Anderson (2012), who found that a 1% increase in a hotel’s index score results in higher profitability, in the form of a 1.42% increase in Revenue per Available Room. It is now accepted that there is causality between review management, reputation, and revenue development, although the relationship is not as linear as presented by Anderson (2012). Recent studies have demonstrated that the effect of review management on revenues depends on, for instance, the type of hotel, the destination, the customer structure, and the occupancy rate (Kim, Lim, and Brymer 2015; Phillips et al. 2015, 2016; Xie, Zhang, and Zhang 2014; Yang, Park, and Hu 2018).
Online hotel ratings are now a powerful resource: they represent an exchange of information in which the communicator (reviewer) transmits content (message) to several communicatees (receivers), and this exchange can modify perceptions and behavior (Hernández-Ortega 2018). In terms of further opportunities, marketing managers ought to learn how to actively manage reviews, including negative reviews (Baka 2016; Cantallops and Salvi 2014). Yet, previously raised questions on what needs managing and measuring (Godes and Mayzlin 2004) have become more difficult to answer because of the increasing availability of data both to consumers and organizations.
Moreover, an area that has received scant scholarly attention is that of the influence of travelers’ origin on online hotel ratings. In general, prior academic studies aggregate reviews from individual travelers of differing origins to compute an average rating. Such travelers may have consistently varying experiences. This practice raises particular issues for those hotels where guests are of different nationalities (Wilson, Murphy, and Fierro 2012). Prior research on aggregated online hotel ratings data does reveal general trends, but as Mckercher (2008) notes, aggregation camouflages significant changes that occur at submarket levels. To illustrate this point, consider Pizam and Sussmann (1995), who espoused that travelers’ perceptions in terms of satisfaction levels do vary according to the country of origin. We know that when travelers select a tourism destination, they are influenced in part by both measurable and cognitive distances (Ankomah, Crompton, and Baker 1996; Massara and Severino 2013; Uchiyama and Kohsaka 2016; Zhang, Seo, and Lee 2013).
Understanding how distance and language influence online hotel ratings is important for several reasons. Dissimilarities among travelers will influence their preferences with respect to hotel attributes (Banarjee and Chua 2016), which is particularly pertinent for heterogeneous groups of travelers. Having an accurate understanding of salient online hotel ratings’ relationships is essential for both strategic management and marketing theory and practice. The economic and societal impact of tourism across global markets is a priority for governments, the private sector, and societal-oriented organizations. Travelers of different origins will possess significantly different expectations. So understanding changes in online customer reviews beyond those written in English may help advance a more effective connectivity among the online hotel ratings database.
Distance
The construct of distance can be disaggregated into multiple measures across the social sciences, spanning economic, financial, political, administrative, cultural, and geographic realms (Berry, Guillén, and Zhou 2010). According to Johanson and Vahlne (1977), distance can be measured either as an objective variable (e.g., geographic) or as a matter of decision makers’ perceptions (e.g., psychic distance). Physical and perceived distances are related but imperfectly correlated, and physical distance influences judgment and decision making (Fujita et al. 2006).
Prior research within the business and management literature has considered cultural (Hofstede 1980; House et al. 2004), psychic (Beckerman 1956; Dow and Karunaratna 2006), and geographic (Choi and Contractor 2016; Mckercher 2008) distances as central to comprehending organizational performance. This study considers both objective and perceptive perspectives.
Geographic
Understanding the influence of geographical distance is important for several reasons. The stimulus of geographic distance has been incorporated in prior empirical studies (Choi and Contractor 2016). Previous research defines geographic distance as the distance between two cities in kilometers (Brewer 2007). Blum and Goldfarb (2006) note how geographic distance influences the trade of digital goods sold over the Internet. Studies have had varying levels of success by assessing the distance between capital cities (Brock, Johnson, and Zhou 2011), major cities (Hutzschenreuter, Kleindienst, and Lange 2014), and geographic centers of countries (Ojala and Tyrväinen 2008). The geographic dispersal of travelers presents opportunities for marketers to customize visitor packages. In the tourism literature, studies observe that travel demand decreases as distance from the origin market increases (Cai and Li 2009; Mckercher 2008; Mckercher and Lew 2003). Increasing the distance adds time and costs money, thus making the destination less attractive to the traveler (Prideaux 2000). The distance-decay model, which states that demand increases up to a certain distance and then decreases exponentially, provides some theoretical foundations (Mckercher and Lew 2003). Nicolau and Más (2006) proposed that the effects of distance and prices are moderated by travelers’ motivation. The digitized environment operating across different time zones can further reduce the efficacy of the communication effort. Geographically proximate destinations provide lower economic and social costs, together with a degree of environmental familiarity.
In short, both information networks and transportation costs may influence the impact of distance (Ghemawat 2001). Notwithstanding the improvements in transportation systems and digital technologies, travelers who are geographically distant may undergo differing experiences in their outbound trips. In fact, Ojala (2015) remarks that modern air transport and communication have reduced the perceived distance and eased commercial interactions. Child, Ng, and Wong (2002) allude to these as “distance-compressing factors.” This study investigates how traveler distance between home and destination influences online hotel ratings by considering:
Research question 1: How do varying levels of geographic distance influence travelers’ online hotel ratings?
Psychic
In this section, we set out the positioning of our study in the broader destination image literature. We begin by briefly acknowledging the destination image literature and highlight why we have framed our approach using psychic distance.
In examining how tourists view or have mental representations of a place, researchers generally consider destination image (Ryan and Cave 2005). Taking this perspective, destination image is commonly depicted as a concept formed by a traveler’s interpretation of cognitive and perceptive evaluations and affective appraisals toward a destination (Crompton 1979; Hallmann, Zehrer, and Müller 2015). The topic has been among the most popular in the tourism literature for more than four decades (Pike and Page 2014), and destination image is considered to be a multidimensional construct. In the extant literature, destination image, that is, tourists’ mental representation of a place, has been defined, operationalized, and measured in a plethora of ways. Space precludes a detailed overview, but Kock, Josiassen, and Assaf (2016) provide a succinct summary of the destination image literature. Two views are critical here. One depicts destination image as the sum of beliefs, ideas, and impressions people have of an object, place, or destination (Zhang et al. 2014). The other relates to cognitive (beliefs or assessments), affective (positive or negative emotion), and conative (behavioral intention) aspects (Choi, Hickerson, and Kerstetter 2018; Kim 2018).
Although the considerable body of prior destination image research provides useful insights, it leaves room for additional theorizing and empirical research. We outline our rationale next.
First, as previously stated, destination image has been one of the most popular topics of the tourism literature (Pike and Page 2014). However, the precise nature and scope of destination image remain vague (Hallmann, Zehrer, and Müller 2015; Lai and Li 2016). As Albert Einstein is famously quoted as saying, “We cannot solve problems by using the same kind of thinking we used when we created them.” Moreover, new approaches are required if organizations wish to survive and prosper in new environments (Baden-Fuller and Stopford 1994; Markides 1998). We wish to look outside this traditional destination image approach, and indeed delve into another area.
Second, in a turbulent, chaotic, and nonlinear tourism environment, strategies need to incorporate cultural and value differences (Phillips and Moutinho 2014). More specifically, Phillips and Moutinho (2014) lament the methodological introspection of prior approaches in tourism and stress that new research methodologies are critically important in enhancing theory and practice, the implication being that new approaches may generate fresh knowledge and insights too.
Third, Ferrer-Rosell, Martin-Fuentes, and Marine-Roig (2019) revealed that the marketing promotion activities of higher-class hotels highlight their facilities, whereas lower-class hotels refer more to the destination. In our study, four- and five-star hotels make up more than 80% of the sample. This latter point reinforces that our unit of analysis is not the destination per se but the hotel itself.
Finally, according to Mossberg and Kleppe (2005), destination image is an area for marketing practice that can incorporate the sale of export products in the international arena. Psychic distance, being the distance between the home country of the firm and export countries, can be used as a multidimensional concept and measured from the customer perspective at an individual level (Assarut and Srisuphaolarn 2018). So the adoption of psychic measures can incorporate Mossberg and Kleppe’s (2005) views of destination image, and its legitimacy can be supported by our first three points. We replace the traditional unit of analysis of the firm with the traveler, give attention to the international business management and marketing literatures, and employ both psychic and geographic distance measures. Kim (2018) considered postvisit image rather than revisit so that tourists are able to rate their experiences. The current study, therefore, adopts the postvisit approach, considers hotels, and uses online hotel reviews to gauge perceptions.
Early psychic distance research commenced with Beckerman (1956), who coined the phrase by remarking on the special problem posed by its existence. The term was sporadically referred to in international trade flow research (Geraci and Prewo 1977; Linnemann 1966). During the 1970s, management-oriented literature gained prominence thanks to the research at the University of Uppsala. In terms of measurement, the sum of the factors approach includes differences in language, culture, political systems, level of education, and level of industrial development (Johanson and Wiedersheim-Paul 1975). International business researchers have since refined and added to the aforementioned list. The existing literature offers a wide range of studies, but developing and confirming a set of psychic scales that captures the characteristics that matter has posed a dilemma (Dow and Karunaratna 2006).
Psychic distance is not solely about nationality and cultural factors but considers individuals and relationships of customers in an international online setting (Safari, Thilenius, and Hadjikhani 2013). The unit of analysis varies too, with some studies considering differences between countries and others between companies (Durand, Turkina, and Robson 2016). In the context of this study, psychic distance is the gap or differences that a traveler might perceive between his or her origin (country) and the destination. In spite of a decade of online hotel ratings research, the field of tourism has not yet yielded a comprehensive analysis of the fact that travelers with different origins may provide ratings that are different on a number of distance dimensions beyond solely cultural studies (Assaf, Josiassen, and Agbola 2015; Bi and Lehto 2018; Martin, Jin, and Trang 2017; Qian, Law, and Wei 2018). Psychic distance goes beyond the objective criteria of geographic and cultural distance per se, as it incorporates business, industrial development, and education differences too. Dow and Karunaratna (2006) proposed and tested a range of potential psychic distance stimuli encompassing culture, language, religion, education, and political systems. This school of thought considers more than a single stimulus such as culture and demonstrates that culture is only one indicator.
The concept of psychic distance is one of the most explored areas in the internationalization literature (Safari, Thilenius, and Hadjikhani 2013). Yet, conflicting findings on the issue of psychic distance indicate the need for further research (see Durand, Turkina, and Robson 2016). This issue deserves attention in tourism too, as it prevents researchers and practitioners from making effective recommendations in deploying marketing strategies. The scarcity of available resources now makes it imperative that the salient drivers be identified (Durand, Turkina, and Robson 2016). Considering the importance of forming and maintaining effective customer relationships as drivers of competitiveness, innovation, customer satisfaction, and performance (Ulaga and Eggert 2006) in international settings (Zhang, Cavusgil, and Roath 2003), it is necessary to identify contingent factors that influence the effect of psychic distance on international travel (Durand, Turkina, and Robson 2016).
Another significant observation from the prior literature is the absence of studies applying psychic distance in online settings (Safari, Thilenius, and Hadjikhani 2013). In this study, we reflect and investigate this multifaceted concept by considering:
Research question 2: How do varying levels of psychic distance influence travelers’ online hotel ratings?
Data and Methods
Data Set
The study uses a unique data set created by merging four different data sets: one with geographic distances between countries created by Mayer and Zignago (2011); a data set with psychic distance between countries developed by Dow and Karunaratna (2006); a third one with ISO 3166 two-letter country codes (International Organization for Standardization 2017) and their designations in English, Portuguese, and Spanish; and a data set of hotel online customer reviews. The latter was created using a custom-built web content extractor that retrieved a total of 39,425 hotel reviews published between July 1, 2015, and November 30, 2016. The custom-built web extractor made use of a Firefox Internet browser to automatically navigate through Booking.com and TripAdvisor reviews’ web pages and process the content of those web pages, in a process known as “web scraping” (Batrinca and Treleaven 2015; Braun, Kuljanin, and DeShon 2018). European law recognizes that users can make copies of publicly available databases and use that data in research (Bosch 2017; Monkman, Kaiser, and Hyder 2018), but companies are making scraping increasingly difficult (Jennings and Yates 2009). Because of this difficulty, we decided to extract data only from Booking.com and TripAdvisor, as these are two of the most popular platforms, and only reviews written in English, Spanish, and Portuguese. Also, these three languages represent the main official languages of 70% of Portugal’s hotel guests (Instituto Nacional de Estatística 2016). This diversity in languages makes Portugal an ideal location to examine the influence of language. Difference in language is a stimulus that has received endorsement from numerous studies, from Beckerman (1956), Conway and Swift (2000), and Dow and Karunaratna (2006) to more recent works such as Avloniti and Filippaios (2014), Cuypers, Ertug, and Hennart (2015), and Antonio et al. (2018a).
One of the authors (responsible for the data collection) is actively involved in the Portuguese hotel sector and has access to many hotel contacts and sources of data. This enables the collection of both qualitative and quantitative data, which supports this study. Portugal is the setting for the destination, with two-, three-, four-, and five-star city and resort hotels providing the context of the study. The inclusion of city and resort hotels enables greater insights by category of hotel. Andriotis (2011) clusters destinations into three categories (urban, coastline, and rural) and so, in terms of hotel profiles, our study uses City (urban) hotels in Lisbon and Resort (coastline) hotels in the Algarve. Four city hotels and four resort hotels were initially selected, and each hotel manager was asked to identify the top five hotels of their competitive set. This resulted in a total of 56 hotels being selected for online reviews retrieval, from two to five stars, as detailed in Table 1.
Table 1. Hotel Summary.
Data set elaboration
To elaborate and analyze this data set with respect to the two research questions, we employed the software package R because of its openness and its statistical and visualization capabilities. As previously mentioned and illustrated in Figure 1, this study’s data set is a merger of four different data sets: hotel reviews, geographic distances, psychic distances, and ISO country codes. The construction of the final data set was based on the hotel reviews. From the 39,425 obtained reviews, 16 were removed because they were duplicates or in a language different from the ones chosen for this study. Another 3,877 reviews were removed because the country of the traveler writing the review was not identifiable. Most TripAdvisor reviews provide the user’s identification and his or her location, but that is not the case in Booking.com reviews, where either location is not a mandatory field in the user profile or the user can ask to remain anonymous. Lastly, 448 reviews were removed because they were from countries that had fewer than 20 reviews, and 462 reviews from Serbia, Gibraltar, Georgia, and Angola were removed because these countries had no information in the geographic or psychic distance data sets. These removals leave the 34,622 reviews analyzed in this study.
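The exclusion steps can be checked with a short tally. The sketch below is illustrative only: the authors worked in R, Python is used here for convenience, and the step labels are hypothetical names. It assumes that the 448 reviews (countries with fewer than 20 reviews) and the 462 reviews (countries missing from the distance data sets) are two separate exclusions, a reading supported by the fact that the result then matches the 34,622 reviews reported in the abstract.

```python
# Review counts as reported in the text. Treating the last two
# exclusions as separate steps is an assumption (see the note above).
retrieved = 39_425

exclusions = {
    "duplicate or out-of-scope language": 16,
    "reviewer country not identifiable": 3_877,
    "country with fewer than 20 reviews": 448,
    "country missing from distance data sets": 462,
}


def tally(start, removed):
    """Return (reason, removed, remaining) for each exclusion step, in order."""
    remaining = start
    steps = []
    for reason, count in removed.items():
        remaining -= count
        steps.append((reason, count, remaining))
    return steps


steps = tally(retrieved, exclusions)
for reason, count, remaining in steps:
    print(f"-{count:>6,}  {reason:<42} -> {remaining:,} left")

final_count = steps[-1][2]
print(f"Final data set: {final_count:,} reviews")
```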

Figure 1. Data set elaboration diagram.
An array of data science tools was employed to build this data set, including data visualization, natural language processing, feature engineering, statistics, and machine learning. Such tools enable the creation of new features, which were necessary because:
Booking.com and TripAdvisor use different rating scales in their quantitative components. Booking.com uses a continuous scale from 1 to 10 and TripAdvisor a discrete scale from 1 to 5. Moreover, the Booking.com scale actually has a minimum rating of 2.5, as highlighted by Mellinas, María-Dolores, and García (2015). Thus, it is necessary to normalize the quantitative ratings from both sources in order to study them together.
There is a need for clearer analysis and interpretation of the impact of geographic and psychic distances in ratings. Thus, geographic and psychic distances, originally continuous variables, need to be converted into categorical features.
A summary of the features included in the final data set is presented in Table 2.
Table 2. Final Data Set Features Summary.
Some of the features in Table 2 were engineered, namely,
GeoDistanceFactor: Geographic distance was transformed into a three-valued categorical feature: PT (Portugal), Near, and Far. We considered values from 0 to 114.9 km as “PT,” values from 115 to 4,999.9 km as “Near” (this includes most European countries), and from 5,000 km upwards as “Far.” The process of transforming continuous features to categorical features is called discretization. This process is usually done to allow a feature to be employed by machine learning algorithms that do not work with continuous features, to speed processing, or to increase interpretability (Dougherty, Kohavi, and Sahami 1995; Kotsiantis and Kanellopoulos 2006). Discretization methods are usually divided into two groups: unsupervised and supervised. Unsupervised methods, such as equal-interval binning or equal-frequency binning, do not make use of class membership information in the discretization process. Conversely, supervised methods make use of class membership information to establish the discretization limits. Since supervised discretization methods only produce slightly better performance results than unsupervised methods (Dougherty, Kohavi, and Sahami 1995) and our objective was not to build a predictive model, we decided to employ an unsupervised approach that would guarantee what Kotsiantis and Kanellopoulos (2006) designate as the compromise between information quality (homogeneous intervals) and statistical quality (sufficient sample size to ensure generalization).
PsychicDistanceFactor: Psychic distance was also transformed into a categorical feature. As with geographic distance, psychic distance was divided into three named values using a similar distance criterion: PT (Portugal), Near, and Far. We considered a null (zero) distance as “PT.” Values from 0.1 to 1.49, which cover most Latin countries and other countries that the Portuguese perceive as “familiar” (in terms of religion, language, and even historical background), such as Brazil, were considered “Near.” All other countries, with a psychic distance of 1.5 or above, were considered “Far.”
RevRating: Because of the aforementioned differences in the quantitative rating scales used by Booking.com and TripAdvisor, ratings were normalized so that the quantitative overall ratings of reviews could be analyzed together. We opted to normalize ratings according to the TripAdvisor scale, that is, from 1 to 5. Since Booking.com only allows a minimum rating of 2.5, we employed binning, a common technique used to convert numeric variables into discrete ones (Abbott 2014; Dougherty, Kohavi, and Sahami 1995). We divided the amplitude of the Booking.com scale (7.5 = 10 − 2.5) by 5 to obtain the amplitude of each bin (1.5), which resulted in the following bin intervals: [2.5, 4.0[, [4.0, 5.5[, [5.5, 7.0[, [7.0, 8.5[, and [8.5, 10], represented by the values 1 to 5, respectively.
RevSentences: This feature is a by-product of the sentiment analysis of the review textual component. By recording the number of sentences, we can explore the existence of a possible relationship with the opinion or quantitative rating of the review.
RevSentimentStrength: A numeric feature that reflects the polarity of the opinion expressed in the textual component of the review, obtained through sentiment analysis. In the case of Booking.com, which has two textual components, one for positive aspects and one for negative aspects, we concatenated both texts. Sentiment analysis, or opinion mining, is the computational study of people’s opinions toward entities, individuals, events, topics, and their attributes. Sentiment analysis allows for the quantification of opinions according to their polarity (positive, negative, or neutral) (Liu and Zhang 2012). By assigning each review a polarity value based on the textual component, it is possible to compare how users rate hotels in the textual component of reviews against how they rate them in the quantitative component, thereby obtaining two ratings for the same review.
Prior to the execution of sentiment analysis, text preprocessing was performed. As recognized by Han et al. (2016), text preprocessing is an arduous and time-consuming task, because it requires going back and forth while creating a document-term matrix (a document-term matrix is a common form for representing a collection of documents, where documents are assigned to rows, words are assigned to columns, and each cell is populated with the frequency of the word in the document). This is even more difficult when it must be applied to three different languages. Text preprocessing consisted of the following steps:
○ Transform all text to lowercase.
○ Normalize related entities: transform words of similar meaning that appear in different formats in different languages to a consistent form. For example, “wi-fi” and “wi fi” were converted to “wifi.”
○ Per language: perform stemming of common hospitality words like “rooms,” “restaurants,” and others that could be meaningful for data interpretation.
○ Per language: normalize different spellings of the same words or expressions that could be written differently or could be misspelled. For example, in English, transform “didn’t” and “didnt” to “did not.”
○ Per language: standardize domain-specific terms. For example, in English, “staff” is a common word used to describe hotel staff, but in Portuguese, numerous words like “equipa” (team), “pessoal” (personnel), “funcionários” (employees), or “colaboradores” (collaborators) are used. Other examples related to guest origin also had to be taken into consideration. Brazilian Portuguese has some differences from European Portuguese, and because Brazil is an important market for Portugal, terms from Brazilian Portuguese like “café da manhã,” “ônibus,” or “metrô” had to be transformed to their national equivalents, respectively, “pequeno-almoço,” “autocarro,” and “metro” (in English, “breakfast,” “bus,” and “metro”).
○ Remove punctuation, numbers, and stop words (e.g., “a,” “as,” “at,” “by,” etc.).
After text preprocessing, we performed sentiment analysis to calculate the review sentiment strength. We adopted a dictionary-based approach, also known as a lexicon-based approach. Dictionaries are collections of opinion words with a polarity classification (Ravi and Ravi 2015). Selection of dictionaries is an important methodological consideration (Han et al. 2016), with one essential aspect being their adequacy to the domain of the text, in this case, hospitality. Since we did not find hospitality dictionaries in any of the languages of this study, the criteria for choosing dictionaries were relative ease of transformation, completeness (dictionaries had to have an extensive range of words), and openness (they should be of general domain and broad). Based on these criteria, the SentiLex-PT 02 sentiment lexicon (Silva, Carvalho, and Sarmento 2012) was chosen for Portuguese. For Spanish, the choice was the ElhPolar dictionary (Saralegi and San Vicente 2013). For English, the choice rested on the well-known Opinion Lexicon from Hu and Liu (2004). Sentiment strength was calculated by sentence, counting positive and negative words and then applying the same formula as used in Bjørkelund, Burnett, and Nørvag (2012), which results in a value between 0 and 1, where 0 is perfectly negative and 1 is perfectly positive. Each review’s overall sentiment strength was calculated as the average of the sentiment strength of its sentences.
RevTotalWords: As for RevSentences, this feature is a by-product of the sentiment analysis of the textual component of the review. We kept a record of the number of words in the textual component to explore any links with the other features.
RevUserCountyISOCode: From the location of the user of the review, we extracted the name of the country and assigned its ISO 3166 two-letter country code.
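The three binned features described above (GeoDistanceFactor, PsychicDistanceFactor, and RevRating for Booking.com scores) can be illustrated with a minimal Python sketch. The cut points come from the text; the function and constant names are hypothetical, and the handling of psychic distances strictly between 0 and 0.1 (treated as “Near”) and of exactly 1.5 (treated as “Far”) is an assumption, since the text does not state it explicitly. The study itself does not report the code used for this step.

```python
from bisect import bisect_right

LABELS = ["PT", "Near", "Far"]

# Cut points from the text: "PT" below 115 km, "Near" below 5,000 km, "Far" otherwise.
GEO_EDGES_KM = [115.0, 5000.0]

# Assumption: a psychic distance of exactly 1.5 is the first "Far" value.
PSYCHIC_FAR_FROM = 1.5

# Lower edges of Booking.com bins 2 to 5; the fifth bin, [8.5, 10], is closed on both sides.
BOOKING_EDGES = [4.0, 5.5, 7.0, 8.5]


def geo_distance_factor(km: float) -> str:
    """Map a geographic distance in km to PT / Near / Far."""
    if km < 0:
        raise ValueError("distance must be non-negative")
    return LABELS[bisect_right(GEO_EDGES_KM, km)]


def psychic_distance_factor(distance: float) -> str:
    """Map a psychic distance to PT / Near / Far (zero distance means Portugal)."""
    if distance < 0:
        raise ValueError("distance must be non-negative")
    if distance == 0:
        return "PT"
    # Assumption: any positive value below 1.5 counts as "Near".
    return "Far" if distance >= PSYCHIC_FAR_FROM else "Near"


def booking_to_five_point(score: float) -> int:
    """Bin a Booking.com score (2.5 to 10) into the 1 to 5 TripAdvisor scale."""
    if not 2.5 <= score <= 10.0:
        raise ValueError("Booking.com scores range from 2.5 to 10")
    return bisect_right(BOOKING_EDGES, score) + 1


if __name__ == "__main__":
    for km in (0, 114.9, 115, 5000):
        print(km, geo_distance_factor(km))
    for score in (2.5, 4.0, 8.4, 10):
        print(score, booking_to_five_point(score))
```

Using explicit bin edges with `bisect_right` keeps each interval closed on the left and open on the right, which matches the [a, b[ notation used for the rating bins.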
The frequency and distribution of the resulting 34,622 observations in the final data set, for each categorical feature and each review source, are shown in Table 3.
Review Frequency and Distribution by Source.
Data Analysis and Results
Using a table plot built with “tabplot,” an R package for the visualization of large multivariate data sets (Tennekes and de Jonge 2017), we started by analyzing the distribution of the data set and looking for patterns in it. This visualization, illustrated in Figure 2, shows each feature in a separate column; each row is a bin that aggregates a predefined number of observations of the data set, in this case 100. Numeric features are represented as bar charts and categorical features as stacked bar charts.

Visualization of the full final data set.
This visualization reveals, at a glance, patterns in the data that indicate potential areas of interest. More than 50% of reviews have a RevRating of 5 and an average RevSentimentStrength above 0.7, which means the data are not normally distributed and are highly skewed. Figure 2 also shows that the sentiment strength (RevSentimentStrength) of the textual component of reviews is in line with the review ratings (RevRating): as one decreases, so does the other. Both also show a similar pattern with geographic distance (CEPII_dist) and psychic distance (PD_PD_DK). This could indicate that less distant users, both in geographic and psychic distance, give lower ratings than more distant users. This visualization also illustrates that lower ratings (RevRating and RevSentimentStrength) occur more often in hotels of lower classification (2 and 3 stars in HotelStars) and when there is a lower number of reviews in English (Language).
Another interesting visualization that illustrates the skewness and the spread (degree of dispersion) of both review ratings (RevRating and RevSentimentStrength) by geographic and psychic distance factors, per hotel type and hotel star rating, is the set of boxplots presented in Figure 3. These boxplots show that although there are some similarities in the distribution of the quantitative ratings (RevRating) between geographic and psychic distances, this does not apply to the qualitative ratings (RevSentimentStrength). Qualitative rating, that is, the sentiment strength of the textual component of reviews, does not follow the same patterns in terms of geographic and psychic distances, as the quantitative review ratings. This figure also shows that the distribution of both ratings differs by hotel type and star ratings. These similarities and differences are detailed in Tables 4 and 5, where frequency of reviews, as well as the mean and standard deviation for each combination by hotel type and star ratings, respectively, per geographic and psychic distance are shown.

Distribution of ratings by psychic and geographic distances, per hotel type and hotel stars.
Rating Statistics by Geographic Distance, Hotel Type and Hotel Stars.
Rating Statistics by Psychic Distance, Hotel Type, and Hotel Stars.
Analysis conducted with CTree, a conditional inference tree (Hothorn, Hornik, and Zeileis 2006) implemented in the R package “partykit” (Hothorn and Zeileis 2015), with the top three levels of nodes predicting the value of RevRating depicted in Figure 4, shows that geographic distance is an important predictor of the quantitative review rating among four- and five-star hotels. This seems to confirm that some form of relationship exists between geographic distance and review ratings. Figure 4 shows only three levels because of space constraints. CTree is a nonparametric class of regression trees that embeds tree-structured regression models into a well-defined theory of conditional inference procedures. As CTree deals with overfitting and variable selection problems through a recursive fitting procedure that applies appropriate statistical tests to both variable selection and stopping, it is a good tool to explore the predictive importance of features for a given outcome.

Conditional inference tree by top predictors of RevRating.
We also applied a set of filter-based techniques to evaluate how relevant each feature of the data set was to the prediction of RevRating. The objective of this test is to understand whether geographic and psychic distances have predictive power over the quantitative rating of the review, which could indicate the importance of these features (see Table 6). The tests we applied, with the help of Microsoft Azure Machine Learning, were Pearson correlation, mutual information, Kendall correlation, chi-squared, and Spearman correlation.
Filter-Based Feature Selection Results.
Since our data set is not normally distributed, to compare means of review ratings by geographic and psychic distances, per hotel type and per hotel star rating, we chose to employ the Kruskal-Wallis test (Kruskal and Wallis 1952), which is considered to be the nonparametric equivalent of the one-way ANOVA. When the Kruskal-Wallis p values fall below the significance threshold (which we set at 0.05), the rating mean values differ across the categories of the analyzed features. In these instances, it is necessary to conduct a post hoc analysis by the categories of each feature in the study. For this post hoc analysis, we employed the R package “pgirmess” (Giraudoux 2016).
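For illustration only, the Kruskal-Wallis H statistic can be sketched in a few lines of Python (the study itself used R, and this is not the authors' code). The function names are hypothetical. The statistic uses average ranks with the usual tie correction, which matters here because ratings on a 1 to 5 scale produce many ties. The closed-form p value shown applies only to the three-group case, such as PT, Near, and Far, where the chi-squared reference distribution has 2 degrees of freedom.

```python
import math
from collections import Counter


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H for a list of groups of observations."""
    pooled = [value for group in groups for value in group]
    n = len(pooled)
    ranks = average_ranks(pooled)

    weighted_rank_sums = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted_rank_sums += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12.0 / (n * (n + 1)) * weighted_rank_sums - 3 * (n + 1)

    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n) over tie groups.
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - tie_term / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    return h / correction


def p_value_three_groups(h):
    """Chi-squared survival function with 2 degrees of freedom (three groups only)."""
    return math.exp(-h / 2)


if __name__ == "__main__":
    # Made-up ratings for three distance categories, for demonstration only.
    pt, near, far = [3, 4, 3, 2, 4], [4, 4, 5, 3, 5], [5, 5, 4, 5, 5]
    h = kruskal_wallis_h([pt, near, far])
    print(h, p_value_three_groups(h))
```

A p value below 0.05 in such a sketch would only indicate that at least one category differs; identifying which ones requires the post hoc comparison.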
The Kruskal-Wallis test is used to evaluate whether the means of RevRating and RevSentimentStrength differ by each of the features in the scope of the study (geographic and psychic distances, per hotel type and hotel star rating); its results are presented in Table 7. Since the p values for all categorical features were below 0.05, means do differ by categories for each feature. In other words, with respect to the two research questions, RevRating and RevSentimentStrength distributions differ by hotel type, hotel star ratings, geographic distance, and psychic distance, which means that users from different geographic and psychic distances rate hotels differently according to the hotel type and hotel stars.
Kruskal-Wallis Test Results.
As the Kruskal-Wallis test revealed that there were differences in the mean values of the categories in this study, a post hoc analysis was performed to determine which categories possess different means. This analysis is achieved by pairwise comparison of each combination of categories. Results of this test are presented in Tables 8 and 9. In this test, when the observed difference between two categories is higher than the critical value at the chosen significance level (we opted for 0.05), we identify a difference between the categories.
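The logic of comparing an observed difference with a critical value can be sketched as follows. This is a minimal Python illustration that assumes the multiple-comparison procedure described by Siegel and Castellan, in which the absolute difference in mean ranks between two categories is compared with a Bonferroni-adjusted normal critical difference. Whether this matches the exact configuration used with “pgirmess” in the study is an assumption, and all names in the code are hypothetical.

```python
import math
from itertools import combinations
from statistics import NormalDist


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pairwise_rank_comparisons(groups, alpha=0.05):
    """Compare mean ranks for every pair of categories.

    `groups` maps a category name to its observations. Returns a dict keyed by
    (name_a, name_b) with (observed difference, critical difference, differs?).
    """
    names = list(groups)
    pooled = [value for name in names for value in groups[name]]
    n = len(pooled)
    ranks = average_ranks(pooled)

    mean_rank = {}
    start = 0
    for name in names:
        size = len(groups[name])
        mean_rank[name] = sum(ranks[start:start + size]) / size
        start += size

    k = len(names)
    # Normal quantile adjusted for the k(k-1)/2 two-sided comparisons.
    z = NormalDist().inv_cdf(1 - alpha / (k * (k - 1)))

    results = {}
    for a, b in combinations(names, 2):
        observed = abs(mean_rank[a] - mean_rank[b])
        spread = n * (n + 1) / 12 * (1 / len(groups[a]) + 1 / len(groups[b]))
        critical = z * math.sqrt(spread)
        results[(a, b)] = (observed, critical, observed > critical)
    return results


if __name__ == "__main__":
    # Made-up ratings, for demonstration only.
    demo = {"PT": [3, 4, 3, 2, 4], "Near": [4, 4, 5, 3, 5], "Far": [5, 5, 4, 5, 5]}
    for pair, outcome in pairwise_rank_comparisons(demo).items():
        print(pair, outcome)
```

Each pair is flagged as different only when its observed mean-rank difference exceeds the critical difference, which is the decision rule described in the text.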
Geographic Distance Kruskal-Wallis Pairwise Comparison.
Psychic Distance Kruskal-Wallis Pairwise Comparison.
Unlike the quantitative review rating, the textual component allows users to fully express their opinions. In other words, although a user gives a hotel a top rating (5 on the TripAdvisor quantitative scale), in the text the user can express more nuanced views (e.g., “Excellent hotel. Staff very helpful. Very good breakfast. The only downside is that it is slightly away from the center”), a text that is not fully positive. Therefore, as expected, the results presented in Table 8 show that there is a difference between the results of the quantitative rating (RevRating) and the sentiment of the textual component (RevSentimentStrength) (Antonio et al. 2018b). Nevertheless, of the 48 combinations of categories, results differ in only 7. This illustrates the correlation between review ratings and sentiment polarity of the textual component of reviews (Antonio et al. 2018a). This correlation is also illustrated in Table 9, with only 9 of the 48 combinations presenting different results.
Table 8 also shows that of the 16 combinations of hotel type and hotel stars for each geographic category, 12 present different distributions for the “PT–Near” distance category, 9 for “PT–Far,” and 5 for “Near–Far.” The results suggest that users from Portugal tend to have a very different opinion from users from a “Near” distance, but not so different from users from “Far.” Opinions of users from “Near” do not differ much from those of users from “Far.” However, for psychic distance, as presented in Table 9, results differ even more between combinations. Of the 16 combinations of hotel type and hotel stars for each psychic category, 8 present different distributions for “PT–Near,” 14 for “PT–Far,” and 11 for “Near–Far.” The observations illustrate that the further away a user is in terms of psychic distance, the less similarity there is in ratings, independently of the hotel type and star rating.
Conclusion
This study reinforces the importance of both psychic and geographic distance as influences on online hotel reviews, and provides theoretical and managerial guidance. We seek to draw on a multidisciplinary social science approach by incorporating strands of prior literature from international management, international marketing, and tourism. Tourism is an integral aspect of contemporary society and is an area of interest across the social sciences (Holden 2004). This research addresses some important gaps in the literature by strengthening understanding of online hotel ratings during times of significant demand shifts (Wong, Fong, and Law 2016). In this article, we specifically ask, in terms of hotels, if distance is a factor, then to what extent do the language of the review, hotel location, and hotel star rating matter?
Based on the influence of preferences with respect to hotel attributes, the results reveal dissimilarities among travelers based on geographic and psychic distances. This is in agreement with prior research (Banerjee and Chua 2016). For hoteliers with a significant number of foreign guests, this is worthy of further investigation and is rather pertinent for heterogeneous groups of travelers. Social identity theory (Tajfel 1982) refers to social identity within communities. Hotels need to identify social groups within their customer database. With the prevalence of digital transformation, even the smallest hotel has to act. Marketing processes need to keep abreast of external shifts, customer expectations, and the employee perspective. To remain competitive, hotels cannot allow the likes and dislikes of guests and communities to remain unknown. This study provides a platform that illuminates why aggregated evaluative online hotel rating statements from guests need to be disaggregated for effective marketing decision making. With the advance and growing importance of personalization to the hotel sector (Buhalis and Amaranggana 2015), identifying patterns in terms of geographic and psychic distance will provide fresh knowledge.
The application of big data represents a new era for data exploration and utilization, which can be a driver for innovative processes in customer-facing practices. Tourism is not an exception, and the results from this study can provide opportunities to enrich marketing processes by delving into the influence of traveler distance and language. For this to be effective, hoteliers will need to demonstrate higher levels of analytical, interpretive, and strategic knowledge (Phillips and Moutinho 2014). As research into the opportunities available through the use of big data in hotels remains nascent, this study provides theoretical and empirical evidence of the fresh insights that can be obtained.
Theoretical and Managerial Implications
The present study extends extant research in three important ways. First, this article provides new insights into how geographic and psychic distance influence online hotel ratings. Even though the concept of distance occupies a central role in business and management literature, tourism research to date has not delved into the influence of the origin of travelers on their online rating behavior. In general, prior academic studies aggregate reviews from individual travelers of differing origins, written in different languages, to compute sentiment scores or simple average ratings. Aggregated hotel ratings may not provide the full story, as guest opinions may be buried and lost.
The choice of the concept of distance needs to go beyond geographic distance (Deodhar, Subramani, and Zaheer 2017), and this study deploys a set of psychic measures, which are one of the most popular forms of distance (Safari, Thilenius, and Hadjikhani 2013). Online travelers’ preferences have been investigated from many facets, but few prior studies focus on traditional distance metrics. Gao et al. (2018) analyzed the relationship between online ratings (quantitative review ratings) and power distance (a distance metric different from those used in this study). Our study differs from Gao et al. (2018) by using two distance measures (geographic and psychic) and incorporating the sentiment polarity of the qualitative component of reviews.
Second, the results of the Kruskal-Wallis test illustrate the difference in the means of RevRating and RevSentimentStrength for each of the features in the scope of the study (geographic and psychic distances, per hotel types and hotel star rating). By using the original data of 34,622 online customer reviews written in English, Portuguese, and Spanish, we also confirm that online customer reviews are multifaceted constructs. By investigating the contemporary relevance of distance, whether psychic or geographic, our results reveal that both types of distance matter differently to hotels in terms of language, location, and star rating. Travelers’ rating patterns by language vary across hotel profiles too. We observe that low-distance users both in terms of geographic and psychic distance give lower scores than high-distance users. Figure 2 illustrates that lower ratings (RevRating and RevSentimentStrength) occur more often in hotels of lower classification (two and three stars in HotelStars) and when there is a lower number of reviews in English (Language).
Third, in light of the gaps and concerns identified in the literature, this study provides theoretical and managerial guidance for future research. Travelers of different origins may possess significantly different expectations. So, understanding changes in online customer reviews beyond those written in English may help advance a more effective connectivity among travelers. With the already high penetration rate of English on the Internet, future growth will come from non-English languages. English is the most common language on the Internet (25.4%), but Chinese already accounts for 19.3% (Internet World Stats 2019). So, further research could develop and validate the influence of distances and alternative languages in differing tourism contexts. The study provides a platform for further exploration of geographic and psychic distance, and its findings provide a starting point to design a more focused investigation. We analyzed hotels in a Portuguese setting and suggest that future work analyze hotels in other countries, since each country may impact both geographic and psychic distance differently. This would strengthen the generalization of our results.
Moreover, as tourism has matured, the attention of an increasing number of academics and practitioners has been drawn to developing creative ways for firms to enhance the distinctiveness of their offer. To enhance the service offer, managers need to closely monitor customer voice (Phillips et al. 2016). This study contributes to contemporary research on online hotel ratings by incorporating distance and language. In fact, after nearly a decade of eWOM research, there has not been a comprehensive analysis of how differences of origin influence online hotel ratings. Distance influences not only travelers’ assessment and choice of destination but also the activities selected during their stay. This illustrates the potential for distance to be used as a segmentation variable (Nyaupane and Graefe 2008). The relationship between language and the Internet is not unimportant but rather neglected in hospitality and tourism research (Schuckert, Liu, and Law 2015; Liu et al. 2017). So what causes this? Indeed, distance compression may be making countries less distinct over time, and travel easier for travelers from greater distances. The promotional material received by such travelers may be making the hotel and its location more attractive. We contend that this holistic distance results in travelers being more discerning, as they possess improved availability of information together with increased knowledge of the hotel experience. The uncertainty in greater psychic distance appears to make travelers less critical of the hotel experience. In this instance, the hotel may appear more attractive and the associated network relationships determine the impact (Ojala 2015).
In terms of practical implications, hotels need to listen to travelers, as they form an invaluable resource and are part of the brand strategy. However, it is critical to have effective processes in place to make the necessary operational and service improvements. This will enhance the level of the traveler experience. Tracking what travelers are saying requires the firm to develop a sound management of online customer reviews. By tapping into rich bespoke data sets, firms can ascertain their strengths and weaknesses and make better-quality decisions. Hotels should understand the impact of geographic and psychic distance when evaluating the customer journey. The advent of technology makes it possible to design bespoke customer journey strategies for differing customer segments beyond traditional demographics such as purpose of visit and age.
The results of the study suggest that local travelers tend to be more critical than travelers from a distance, which suggests that incentives could be made available to local travelers. In these instances, understanding the motivation of the trip and behavioral approaches of key segments will create a platform for better managing the salient information flows between the traveler and the hotel. Our results highlight that when travelers perceive a gap and it is beyond an acceptable level, it will lead to dissatisfaction, which should be avoided.
Collectively, this demonstrates how “the sum of factors” or the “differences” interact in the formation of asymmetric distance perceptions. The individual experiences of travelers are influenced by psychic distance, which will impact hotel marketing strategies and promotional activity, product development, and pricing strategies. Future research may delve into the moderating effects on the relationship between distance and hotel online reviews, such as the size of the host country in terms of GDP, attractiveness of the host country, historical events, and hotel entry modes.
Limitations
As with any other study, this research possesses limitations. Our research employs a sampling frame of reviews and associated ratings on Booking.com and TripAdvisor for Portuguese hotels in three languages (Portuguese, Spanish, and English). So, our findings may not be generalizable to other hotel markets and languages. However, the development of any new seam of research needs to be repositioned in terms of an overfocus on the theoretical aspects, such as the rigor–relevance debate. Prior research in social sciences has argued for a slight shift in abstract philosophical debate around research epistemologies. The consequence is very significant, illustrating a lack of what Van de Ven (2007) has outlined as engaged scholarship, which has both rigor and relevance. With more than three quarters of traveler purchasers visiting TripAdvisor prior to making a booking, its influence and significance make the platform useful for academic research.
Another limitation, as identified by Antonio et al. (2018a), relates to the small number of users who do not write reviews in the official language of their country. For example, such users will tend to write reviews in English rather than in their mother tongue. Therefore, the analysis of ratings or sentiment polarity based on the language of the review may not reflect the cultural background of such reviewers.
Another limitation relates to the difficulty of performing text analysis across multiple languages, which led to the decision to use only reviews in English, Spanish, and Portuguese. Although these are the official languages of 70% of Portugal’s tourists, they are not representative of all tourists. Therefore, future research could explore the analysis of reviews in other languages.
We also recognize that as every language has a different degree of expressive power (Ravi and Ravi 2015), it is possible that some differences in the sentiment strength exist because of the differences in the dictionaries employed per language. Future research should explore the analysis of sentiment with dictionary-free approaches or with domain-specific dictionaries.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
