Abstract
Chinese inbound tourism constitutes Australia’s fastest growing and largest international tourist market. Currently, most of this travel is conducted via group package tours (GPTs). While there is anecdotal evidence of dissatisfaction with some aspects of Chinese tourists’ service experience such as commission-based shopping, little empirical evidence is available about the salient dimensions of service quality or their respective performance. The aim of this study is to identify the core service components offered by Chinese GPTs and to examine any service shortfalls. Findings from a survey of 520 tourists revealed three dimensions of Chinese package tour service: attractions, tour leader and food and accommodation. The study identified significant gaps between expectation and performance across all dimensions. Theoretical and service quality implications for researchers, tour operators and policymakers are presented and discussed.
Introduction
The Approved Destination Status scheme opened the substantial and growing Chinese tourism market to Australia in 1999 (Australian Trade Commission, 2015). The growth rate of Chinese visitor arrivals to Australia has remained above 10% since 2011 (15.6% for 2012, 14.5% for 2013 and 19% for 2014), and a similar pattern has been recorded in the growth rate of total spending by this market (13% for 2012, 16% for 2013 and 19% for 2014; Tourism Australia, 2013, 2014, 2015). The Chinese market is now Australia’s largest inbound tourism market, which is still growing rapidly in terms of volume and expenditure, and hence highly valued by Australian tourism stakeholders.
In many Asian countries, including China, group package tours (GPTs) are the main mode of outbound travel and are likely to retain popularity for international trips (Wang and Sheldon, 1996). GPTs are standardized and repeatable bundle offers sold at single prices, which usually include transportation, accommodation, food, attractions and services (Middleton and Clarke, 2012). The appeal of GPTs includes financial benefits (i.e. value for money and reasonable prices), convenience, and reduced uncertainties and perceived risks (Kim et al., 2009; Wong and McKercher, 2012). Asian tourists, including Chinese tourists, may also travel on GPTs to help overcome language barriers, to assist a relatively inexperienced international market, and because GPTs fit with the group orientation of collectivism (Li et al., 2011).
However, the experiences of Chinese package tourists may be influenced and even constrained by a number of factors. These include their limited time in Australia (South Australian Tourism Commission, 2013); operational and managerial difficulty in standardizing and monitoring service due to the just-in-time service promises (Moutinho et al., 2015); and the practice of commission tours, where tour operators offer very-low-priced tours and try to reap profits through commissions from shopping and/or entertainment activities (Zhang et al., 2009). These factors, especially commission tours, have resulted in tourist dissatisfaction and associated complaints, which include overpricing of tourist goods, changing and downgrading itineraries at the destination, arranging shopping venues favoured by the tour operator, misguiding and deceiving tourists into the purchase of expensive goods, the misrepresentation of information in the source market by tour operators and even forced shopping (Dwyer et al., 2004, 2007; Keating, 2009; Prideaux et al., 2006; Wang et al., 2015; Zhang and Murphy, 2009). The mass media have reported some of these incidents. For example, a video of a Chinese tour guide admonishing a busload of tourists for not shopping enough was widely circulated on Chinese social media, sparking outrage and prompting many users to share their negative experiences of package tours (BBC News, 2015).
Surprisingly little research has been undertaken on service quality of Chinese GPTs either in general or specifically in the Australian context. In the broader business and tourism contexts, demand for quality service has become clear for business operators in general (Schwartz, 2007; Ting, 2004; Velázquez et al., 2011) and specifically for tourism stakeholders (Ayeh and Chen, 2013; Mak et al., 2011). To achieve long-term profit and business sustainability, Chinese tour operators need to have a quality service–based orientation. An important strategy for achieving this is to diagnose service shortfalls. Underpinned by a robust body of marketing theory and research on service quality (Boulding et al., 1993; Parasuraman et al., 1991; Parasuraman et al., 1985), the present study aims to examine service shortfalls in Chinese GPTs in Australia.
Literature review
The role of expectation in service quality
The conceptualization of service quality has been one of the most debated and controversial topics in the marketing literature. In conceptualizing and measuring service quality, the present study considers expectation, which is defined as what customers think should be provided (Zeithaml et al., 1996), as an essential and critical benchmark. Expectation has been used as a conceptual component in the measurement of service quality for several decades (e.g. Grönroos, 1982; Parasuraman et al., 1985; Sasser et al., 1978). More recently, a number of marketing studies have confirmed and supported the integrity and merit of including expectation when studying service quality (Bolton and Drew, 1991; Parasuraman et al., 1990). This approach has also been adopted by a number of tourism researchers (e.g. Armstrong et al., 1997; Zhu et al., 2007).
Even so, some researchers suggest using a simple performance-based measure of service quality (Baker and Crompton, 2000; Cronin and Taylor, 1994). However, a number of studies have found that expectation and performance measures have superior diagnostic power compared to performance-only measures (Jim and Julie, 2000; Teas and Decarlo, 2004; Voss et al., 1998; Zeithaml, 2000; Zeithaml et al., 1993). In an applied context, business managers may lose key diagnostic information if they rely solely on performance-only measures. Expectation and performance measures provide more detailed – and probably more accurate – managerially valuable diagnostic information, which can be used as an internal benchmark in determining, monitoring and enhancing service performance (Nadiri and Hussain, 2005; Parasuraman et al., 1994). Moreover, expectations can provide insights into customers’ standards in evaluation, thus helping companies to leverage expenditure on monitoring services and avoid overspending on improving service quality (Rust et al., 1995; Zeithaml et al., 2013). This is particularly important for highly competitive industries with limited resources, such as the Chinese tour operator industry (Wei, 2003).
Previous research on the development and refinement of definitions of expectation in service quality has produced two categories of definitions: desired expectation and adequate expectation. The proposal of SERVQUAL, a 22-item instrument to measure service quality, introduced the concept of desired expectation, which refers to what customers think the service provider should provide (Parasuraman et al., 1985; Parasuraman et al., 1988; Zeithaml et al., 1993). As a further addition, and in response to the need to refine expectation standards, the concept of adequate expectation, which refers to what consumers deem as acceptable, has been introduced (Parasuraman et al., 1991; Veronica and Tore, 1993; Zeithaml et al., 1993).
The present study employs the concept of desired expectation because literature on service quality predominately supports desired expectation as part of the conceptualization and measurement of service quality (Boulding et al., 1993; Parasuraman et al., 1988; Prakash, 1984; Zeithaml et al., 1993). Adequate expectation is of less relevance to this study because it has been examined most often in zone-of-tolerance literature (Jim and Julie, 2000; Nadiri and Hussain, 2005; Yap and Sweeney, 2007). In addition, using desired expectation may produce more reliable results. Previous studies have found that learning takes place in the market on a continual basis and customers may adjust their adequate expectations on the basis of product/service usage, cumulative consumption experiences and marketing communication during the consumption process (Johnson et al., 1995; Yi and La, 2004). In contrast, desired expectations remain relatively stable over time because the service standards that consumers think should be offered are often not significantly influenced by external factors such as marketing communication (Boulding et al., 1993; Jim and Julie, 2000).
The current study uses desired expectation based on Zeithaml et al.’s (1993) notion of desired service. Specifically, GPT service expectation is defined as the normative service levels tourists think tour operators should provide regarding various dimensions of the GPT, for example, tour leader, local guides, accommodation and attractions.
Dimensionality of Chinese GPT service
Dimensions of service quality are service attributes and determinants customers use to evaluate services (e.g. Parasuraman et al., 1988). They may be used by companies to influence customers’ profit-related behaviours such as word of mouth or repurchase intention (Kandampully, 1998; Zeithaml et al., 1996). The diverse characteristics of services, such as intangibility, inseparability and heterogeneity, make it challenging to understand their structural dimensionality (Grönroos, 1982; Parasuraman et al., 1985). These challenges are particularly evident in research on little-studied markets such as Chinese package tours (Wang et al., 2007).
Service measures such as SERVQUAL, which attempts to uncover service dimensions of general applicability across industries (Cronin and Taylor, 1994; Parasuraman et al., 1988), are unable to fully capture the service dimensions of Chinese GPTs. The simple and direct transference of categorizations in general service encounters is not appropriate for complex services such as GPTs, which are usually characterized by multiple service encounters, long processes, intense interaction between tour leaders and tourists, the dominant role of the tour guide and managers’ lack of observation and direct control over tour guides’ performance (Wang et al., 2007; Wang et al., 2000). The widely used SERVQUAL instrument, for example, which is a 22-item, five-dimension scale (Parasuraman et al., 1985, 1988), applies mostly to short-term service encounters such as the office-based services provided by travel agencies, hotels and airlines (Fick and Ritchie, 1991; Lam and Zhang, 1999; LeBlanc, 1992) and may not capture the most salient dimensions of GPTs. As such, attempts to force SERVQUAL onto the multiple service encounters of Chinese GPT services have necessarily involved significant changes to the original instrument (Ayeh and Chen, 2013; Chang, 2009).
Measuring Chinese GPT service quality
To understand the dimensions of Chinese GPTs and to develop an appropriate service quality measure for the present study, a review of measures used in previous studies was undertaken. This was done by searching for relevant studies on GPT services in the broad Chinese context (Mainland China, Hong Kong and Taiwan) in major tourism databases and Google Scholar. Studies conducted in Hong Kong and Taiwan were included because these regions share similar cultural roots with mainland China, and GPTs are common in these regions (Wang et al., 2000). Key words including ‘Chin*’, ‘service quality’ and ‘(group) package tour’ were used in different combinations. Subsequently, an EndNote library was created, and the lead author read the abstracts of the papers collected and selected those that focused on and measured service quality of Chinese GPTs. Table 1 provides a summary of the operationalization of GPT service quality in previous studies, including the study context, methods, dimensions and subdimensions of GPT services.
Table 1. Operationalization of GPT service quality in previous studies.
GPT: group package tour.
Note: Items from Liu and Wu (2006) and Sheng (1999) were originally in Chinese and have been translated by the author.
These studies of Chinese GPT service quality were critically reviewed to provide a pool of items to serve as the basis for measurement of the core components of GPTs. To start with, this study limits the domain of GPT services to on-tour services under the direct control of tour operators, given that tourists’ evaluation of GPT services is mainly based on on-tour service (Bowie and Chang, 2005) and that tour operators bear a critical role and responsibility in delivering GPT services (Chang, 2009; Gong et al., 2015). Items that are outside the scope of the on-tour services, for example, accuracy of information provided by the front desk of tour operators (Sheng, 1999), were removed.
A critical review of the dimensions and subdimensions of GPT service quality measurement highlighted several weaknesses including lack of validity and reliability (Liu and Wu, 2006; Sheng, 1999), double-barrelled or even multibarrelled items (Ayeh and Chen, 2013; Lee et al., 2011; Liu and Wu, 2006), a mix of bidirectional and unidirectional items (Sheng, 1999) and omission of some critical components of GPT services such as attractions and food (Wang et al., 2000). Moreover, some of the dimensions presented in Table 1 are only conceptual dimensions (Chang, 2009). Some studies represent explorations of service features of GPTs (Lin et al., 2009; Wang et al., 2000), while only a few studies have developed measurements of GPT service using a systematic and rigorous approach (Wang et al., 2007; Wang et al., 2013). The criticisms of items used to measure Chinese GPT service quality in previous studies were considered when developing an instrument for the present study.
Given the significance of the Chinese market to Australia, the urgent need to understand and monitor GPT services, and the lack of research in this area, the present study aims to diagnose service shortfalls within direct control of tour operators of Chinese GPTs in Australia. This is achieved through (1) determining the dimensionality of Chinese GPT services and (2) identifying gaps between expectation and performance of Chinese GPT services.
Methodology
Instrument development
This study employed a quantitative approach, using an online questionnaire to collect data. To develop the questionnaire, pooled items from the literature review that did not fit in the context of the present study (e.g. services not within the direct control of tour operators) were excluded. Items that lacked reliability and validity, for example, double-barrelled items and purely conceptual items, were also eliminated. In addition, relevant findings from a recent Chinese satisfaction survey conducted by Tourism Research Australia (Tourism Research Australia, 2014) were incorporated.
With the pool of items, three face-to-face group discussions (comprising 8, 9 and 10 people, respectively) were conducted between December 2013 and January 2014. Participants were Chinese residents aged between 21 and 60 years with at least one GPT experience in the past 12 months at the time of discussion. Each group was shown a list of items to measure GPT service quality and was asked to respond ‘yes’ or ‘no’ regarding whether the item should be included. The responses of the three groups were considered together and those items with more than 50% yes answers were retained. Participants’ opinions did not vary significantly and the decision on retaining or discarding items was clear. As a result of this process, a list of 24 items to measure GPT service quality was generated.
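The retention rule described above (pool the yes/no answers of all three groups and keep items with more than 50% ‘yes’) can be sketched in a few lines. This is only an illustrative sketch: the function name `retained_items`, the item labels and the vote counts are hypothetical and are not the study’s data.

```python
def retained_items(votes, threshold=0.5):
    """Return items whose pooled share of 'yes' votes exceeds the threshold.

    votes maps an item label to a list of booleans (True = 'yes'),
    one per participant, pooled across all discussion groups.
    """
    kept = []
    for item, answers in votes.items():
        share_yes = sum(answers) / len(answers)
        if share_yes > threshold:  # strictly more than 50% 'yes'
            kept.append(item)
    return kept


# Hypothetical votes from 27 participants (8 + 9 + 10); not the study's data.
example_votes = {
    "helpful tour leader": [True] * 25 + [False] * 2,
    "enough time spent at attractions": [True] * 20 + [False] * 7,
    "hypothetical weak item": [True] * 13 + [False] * 14,
}

print(retained_items(example_votes))
```

In this made-up example the third item receives 13 of 27 ‘yes’ votes (about 48%) and is therefore dropped.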
Questionnaire design
With the items for GPT service determined, a self-completed questionnaire with three sections was developed. The first section related to trip attributes such as timing of the trip and travel party. The second section consisted of 24 statements about GPT service dimensions, of which five were answered only where applicable, that is, only by respondents who had shopped or joined an optional tour. These respondents were also asked if they were forced to shop (yes or no) and forced to join an optional tour (yes or no). Participants were asked to rate expectation as well as performance of the 24 service attributes provided. Each service quality attribute was rated on a five-point Likert-type scale, ranging from strongly disagree (1) to strongly agree (5), in both the expectation section and the performance section. The third section of the questionnaire covered sociodemographic information. The questionnaire was first written in English and then translated into simplified Chinese. Bilingual experts checked the wording to ensure accuracy of the translation. The draft questionnaire was pretested on Chinese nationals who had GPT experience (N = 84). Based on the feedback, the questionnaire was further improved. Online Appendix 1 presents the English version of the questionnaire.
Data collection
Data were collected via an online panel provider from a sample of 520 mainland Chinese residents who had completed a GPT to Australia. This study employed a retrospective approach, which belongs to a family of procedures that tap into respondents’ recalled beliefs, feelings and behaviours (East and Uncles, 2008). Psychology researchers provided evidence on people’s ability to recall events, feelings, time periods, expectations and preferences sufficient to justify the current approach (Koriat et al., 2000; Levine and Safer, 2002; Safer et al., 2002). In addition, while tourists may be more accessible during a tour, any measurement attempt during the tour misses what happens afterwards. Tourism researchers therefore also advocate such a retrospective approach, especially when desired expectation is included as a measure (Dickson and Hall, 2006; Wang and Davidson, 2009; Yüksel and Yüksel, 2001). To capture the whole GPT service experience, service quality needs to be measured when the tour is completed, allowing respondents to recall their entire GPT experience. Therefore, it can be reasonably argued that the use of a retrospective survey for the present study does not pose critical issues in terms of accuracy.
Considering the fact that GPTs to Australia only started in 1999 and the proportion of Chinese nationals who have been to Australia on GPTs is small, the target population of this study was defined as any Chinese national who had previously visited Australia on a GPT. While this helped to produce a sample large enough within the research time frame, a small number of tourists (29) who had travelled to Australia a relatively long time ago (more than 10 years ago) were also recruited. This was not viewed as a problem as the characteristics of tourism services such as high involvement, uniqueness and memorability enhance the likelihood of accurate recollection of these experiences (Andrews and Shimp, 1990; Park and Hastak, 1994; Snelgrove and Havitz, 2010).
Data analysis
Data analysis started with data checking and cleaning. The sample size was found to be adequate for principal component analysis (PCA) and confirmatory factor analysis (CFA), with 19 of the 24 numeric indicators on GPT service quality answered by everyone in the sample (N = 520), while 381 respondents answered the questions on shopping and 380 respondents answered the questions on optional tours. A total of 309 respondents participated in both shopping and optional tours.
Conducting analyses on the whole sample using all 24 items would require either pairwise or listwise deletion and would result in considerable data (more than 10%) being excluded from the analysis. To make full use of the data collected, the 19 indicators with responses from the whole sample were included in the final analyses. Even though not included in the final analysis, shopping and optional tours were retained and considered as possible components of GPT services, considering their significance as evidenced by high factor loadings in trial analyses. This strategy focuses on achieving the research aim and simplifies the analysis while at the same time making full use of the data and avoiding dealing with missing values, which could be troublesome, given the missing values are not random. The 19 variables represent the core indicators that are applicable to all respondents, while the 5 indicators on shopping and optional tours are possible elements of GPT services applicable only to those who had such experiences. To reveal dimensionality of Chinese GPT service, PCA and CFA were used, and to examine service gaps, descriptive analysis and t-tests were used.
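The scale of the exclusion can be checked with simple arithmetic from the counts reported above. The sketch below assumes that the complete cases on all 24 items are the 309 respondents who took part in both shopping and optional tours; the variable and function names are illustrative only.

```python
# Counts reported in the text.
total_sample = 520
answered_shopping = 381
answered_optional = 380
answered_both = 309  # assumed to be the complete cases on all 24 items


def share_excluded(total, complete):
    """Share of respondents dropped if only complete cases are kept."""
    return (total - complete) / total


listwise_share = share_excluded(total_sample, answered_both)

# Respondents who answered at least one of the two conditional item sets.
answered_either = answered_shopping + answered_optional - answered_both

print(f"Excluded under listwise deletion: {listwise_share:.1%}")
print(f"Answered shopping or optional-tour items: {answered_either}")
```

Under this assumption, listwise deletion across all 24 items would discard 211 of 520 respondents, well above the 10% mentioned in the text, which is consistent with the decision to run the main analyses on the 19 items answered by everyone.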
Results
Demographic background and trip attributes
Demographic characteristics show that the sample was dominated by males (62%), was relatively young with 53% aged between 18 years and 29 years and was highly educated with nearly all (93%) holding bachelor’s degrees or above. One-third of the sample (33%) earned an average annual income of Renminbi (RMB) 30,000–80,000. Respondents resided across China, with almost one-third living in Guangdong (13%), Shanghai (9%) or Beijing (8%). Most of the respondents (69%) had travelled to Australia in the last 5 years (2010 to 2015) at the time of the survey and 89.6% in the last 10 years. The average length of trip was 11 days, and a majority of the respondents visited Sydney (86%) and Melbourne (54%). Most respondents joined the tour with at least one other person, 29% with spouses/partners, 25% with family or friends and 19% as part of family groups. Within the whole sample of 520 respondents, 381 (73%) shopped and 380 (73%) had optional tours on their itinerary. Of those who shopped, 59 (15%) reported forced shopping, and of those with optional tours, 54 (14%) reported forced participation in optional tours.
Dimensionality of Chinese GPT service
Table 2 presents the results of the PCA on GPT service dimensions. A three-factor solution emerged, explaining 62% of the variance. All factor loadings were above 0.60, and all communalities were above the cut-off point of 0.50 (Hair et al., 2006). Factor 1 had five significant loadings (Cronbach’s α = 0.818), factor 2 had five significant loadings (Cronbach’s α = 0.807) and factor 3 had three significant loadings (Cronbach’s α = 0.785). Based on the content of the indicators, factor 1 was named attractions, factor 2 was named tour leader and factor 3 was named food and accommodation.
Table 2. VARIMAX rotated component analysis of GPT service quality.
GPT: group package tour.
Note: Factor loadings less than 0.50 have not been included and variables have been sorted by loadings on each factor. Six items were deleted because their communalities were below 0.50.
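For readers unfamiliar with the reliability coefficient reported for each factor, the following sketch shows the standard formula for Cronbach’s α, α = k/(k − 1) × (1 − Σ item variances / variance of the summed scale). The ratings used are hypothetical five-point responses, not the survey data, and the function name `cronbach_alpha` is illustrative.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of items, each a list of ratings.

    All items must be rated by the same respondents in the same order.
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("At least two items are needed.")
    # Summed scale score for each respondent.
    totals = [sum(ratings) for ratings in zip(*item_scores)]
    item_variance_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical ratings from five respondents on three items of one factor.
example_items = [
    [4, 5, 3, 4, 2],
    [4, 4, 3, 5, 2],
    [5, 5, 3, 4, 1],
]

alpha = cronbach_alpha(example_items)
print(round(alpha, 3))
```

Values closer to 1 indicate that the items of a factor vary together; the coefficients reported above (0.785 to 0.818) would be computed in the same way from the items loading on each factor.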
With the factor structure determined, a CFA was conducted to further examine the factor structure. The overall model χ² was 230.318, with 62 degrees of freedom and a p value of 0.000. With a goodness-of-fit index (GFI) of 0.935, most of the model fit indices satisfied the suggested cut-off points (normed fit index [NFI] = 0.918, comparative fit index [CFI] = 0.938, incremental fit index [IFI] = 0.939, Tucker-Lewis index [TLI] = 0.923, adjusted goodness-of-fit index [AGFI] = 0.905, root mean square error of approximation [RMSEA] = 0.072, root mean square residual [RMR] = 0.066). Cronbach’s α was 0.882, exceeding the recommended threshold of 0.70 (Fornell and Larcker, 1981) and suggesting acceptable reliability. All standardized loading estimates were above the suggested standard of 0.40 (Anderson and Gerbing, 1988). Taken together, the results suggest good model fit; thus, the model was accepted.
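The reported RMSEA can be related to the reported χ², degrees of freedom and sample size. The sketch below uses a commonly cited point-estimate formula, RMSEA = sqrt(max(χ² − df, 0) / (df × (N − 1))); the article does not state which formula or software was used, so this is an assumption, and some programs use N rather than N − 1 in the denominator.

```python
from math import sqrt


def rmsea(chi_square, df, n):
    """Point estimate of RMSEA from the model chi-square (assumed formula)."""
    return sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


def normed_chi_square(chi_square, df):
    """Chi-square divided by its degrees of freedom."""
    return chi_square / df


# Values reported in the text: chi-square = 230.318, df = 62, N = 520.
print(round(rmsea(230.318, 62, 520), 3))           # 0.072
print(round(normed_chi_square(230.318, 62), 2))    # 3.71
```

With the reported values, this formula reproduces the RMSEA of 0.072 given in the text.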
Examining service shortfalls
To examine service shortfalls, the gap scores of expectation and perceived performance of all indicators were calculated and a paired sample t-test was conducted to compare the differences. Mean scores for expectation, performance and the gap scores were also calculated. Table 3 presents these results.
Table 3. Comparison of customers’ expectations and performance of GPT services.
GPT: group package tour.
Note: The mean and SD (standard deviation) are calculated as the average scores of all variables from the factor. The gap mean score is defined as performance minus expectation.
*Significant difference at 0.05.
Referring to Table 3, the mean expectation scores were consistently high, all between 4.11 and 4.39, while performance scores showed greater variation, ranging between 3.56 and 3.96. The t values showed significant differences between expectation and performance on all indicators, and all gap scores (performance minus expectation) were negative, ranging between −0.227 and −0.679, suggesting that perceived performance of all Chinese GPT service dimensions was significantly below expectation.
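The gap analysis can be illustrated with a minimal sketch of a paired-sample t-test on one indicator. The gap for each respondent is performance minus expectation, and the t statistic is the mean gap divided by its standard error. The ratings below are hypothetical and the function name `paired_gap_test` is illustrative; the study’s own figures are those in Table 3.

```python
from math import sqrt
from statistics import mean, stdev


def paired_gap_test(expectation, performance):
    """Return (mean gap, t statistic, degrees of freedom) for paired ratings.

    Gap = performance - expectation, so a negative mean gap indicates
    that perceived performance fell short of expectation.
    """
    if len(expectation) != len(performance):
        raise ValueError("Ratings must be paired for the same respondents.")
    gaps = [p - e for e, p in zip(expectation, performance)]
    n = len(gaps)
    gap_mean = mean(gaps)
    standard_error = stdev(gaps) / sqrt(n)
    return gap_mean, gap_mean / standard_error, n - 1


# Hypothetical five-point ratings from eight respondents on one indicator.
expectation = [5, 4, 5, 4, 4, 5, 4, 5]
performance = [4, 4, 3, 4, 3, 4, 4, 3]

gap_mean, t_value, df = paired_gap_test(expectation, performance)
print(gap_mean, round(t_value, 3), df)
```

The t statistic is then compared with a t distribution with n − 1 degrees of freedom to judge whether the gap differs significantly from zero.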
Discussion
The two main objectives of this study were to determine the dimensionality of Chinese GPT service and to identify gaps between expectation and performance of Chinese GPT services. The results are discussed in turn.
Dimensionality of Chinese GPT service
The PCA and CFA revealed three factors: attractions, tour leader and food and accommodation. Attractions is the first dimension of GPT service. Visiting attractions is the primary tourist experience (Stevens, 1992), and scholars have confirmed this finding in the broad Chinese GPT context (Chang, 2009; Chen et al., 2013; Wang et al., 2013). The present study found that Chinese tourists to Australia want to visit enough attractions and spend enough time at each. In addition, the fact that Chinese tourists value history and culture may have made visitor commentary, which mainly deals with the manner and content of the interpretation of attractions, a much valued service (Chang, 2009; Li et al., 2011; Wang et al., 2000).
The second dimension is tour leader. The significant role of the tour leader, which has been emphasized in previous research on tourism generally and Chinese package tours specifically (Chang, 2009; Chen et al., 2013; Li et al., 2011; Wang et al., 2007; Wang et al., 2000; Wang et al., 2013; Weiler and Black, 2015), has been confirmed in the context of Chinese GPTs in Australia. One variation to this general finding is that Chinese tourists place more value on the tour leader’s problem-solving skills, as opposed to the tour leader’s role as motivator and entertainer reported in studies of Western tourists (Cohen, 1985; Heung, 2008). Chinese GPT tourists are usually strangers to Australia, and tour leaders act as cultural brokers and liaise with service suppliers (Li et al., 2011). Generally, in GPTs, tour leaders play a generic role in information giving and connecting when accompanying the tour, while local guides employed by local tour operators (who are not the focus of this study) take more responsibilities to entertain the group.
The third dimension is food and accommodation. Previous studies similarly reported the importance of food and accommodation to Chinese tourists (Chang et al., 2010; Chen et al., 2013; Li et al., 2011). By signing a travel contract, tourists agree to accept, and thus expect, the standards of accommodation and food as they appear in the contract. ‘High quality of Chinese food’ is also an important indicator, suggesting that Chinese tourists expect good quality Chinese food even though they are travelling in a Western country.
Shopping and optional tours
In addition to the three factors as found in PCA and CFA, the present study found shopping and optional tours to be critical components of Chinese GPT services. Shopping is a common and preferred tourist activity popular among Chinese tourists (Chen et al., 2013; Wang et al., 2007; Wang et al., 2000). While Western tourists usually rate shopping experiences based on staff service, product value and reliability, physical features of shops, payment methods and other shop attributes (Tosun et al., 2007), Chinese tourists’ shopping experiences are most likely to be influenced by frequency of shopping, duration in shops, pricing of goods and the manner of shopping (i.e. forced shopping, which occurs largely as a result of commission shopping in Chinese outbound GPTs).
To reduce GPT selling prices and to provide flexible choices for GPT tourists, tour operators have built optional tours into most Chinese GPTs. Under the commission scheme, the tour leader promotes and sometimes adds extra optional tours, often at high prices to earn income. Tourists without sufficient knowledge about the destination are less likely to recognize, question or challenge the exploitation (Harris, 2012). Extreme cases include forced participation on optional tours, which was reported by almost 15% of respondents. The number of tourists who participate in optional tours influences the tour leader’s commission. Thus, the tour leader may provide better service to tourists who participate in optional tours, while intentionally or unintentionally neglecting those who are not participating (Wang et al., 2000).
Expectation, performance and service gaps
Chinese tourists sampled in this study had consistently high expectations. However, performance on these service indicators varied, with perceived performance on all 24 service indicators significantly below expectation. The gaps between expectation and performance are discussed for each service dimension and component in the following paragraphs.
Shopping had the biggest gap between expectation and performance (gap score = −0.598). This finding is consistent with the findings from a number of previous studies in the broad Chinese context (Zhang and Chow, 2004; Zhang and Murphy, 2009) as well as previous studies on Chinese package tourists in Australia (Dwyer et al., 2004; King et al., 2006; Wang and Davidson, 2009, 2010). Among the Chinese tourists sampled in this study, 15% reported being forced to shop (N = 59). In commission tours, tour operators capitalize on Chinese tourists’ emphasis on gift-buying (Kwek and Lee, 2013) and tourists’ reliance on the tour leader and local guide for ‘expert advice’ and arrangements (Huang et al., 2010) to facilitate shopping. As a result, tour leaders/local guides sometimes ‘manipulate’ Chinese tourists’ purchasing behaviour. Such manipulation can involve coercion, cheating, lying, aggression and misrepresentation (Chen et al., 2011; Rezabakhsh et al., 2006).
Attractions had the second biggest gap score between expectation and performance. Under attractions, ‘enough time spent at attractions’ and ‘appropriately arranged itinerary’ had the largest gap scores. This suggests that the Chinese tourist respondents in this study did not visit enough high-quality attractions, nor did they stay long enough at these attractions to meet their expectations. One possible explanation for these low performance scores may be that tour operators tend to take tourists shopping or on optional tours for commission, with the former in particular reducing the number of attractions visited and the time spent at these attractions.
Significant gaps between expectations and performance were also identified in other service dimensions/components including tour leader, food and accommodation and optional tours. For example, significant gaps existed in ‘helpful tour leader’ and ‘tour leader had ability in solving problems’. Similar issues have been reported for tour leaders in Hong Kong (Zhang and Chow, 2004). This suggests that tour leaders were not proactive in helping Chinese tourists in Australia. One possible reason for the service gaps in tour guides may be that many tour operators hire tour leaders/guides on a temporary basis and, under the commission-based business model, may employ less-skilled tour guides and underpay them (Huang and Gross, 2010; King et al., 2006). Under the food and accommodation factor, the indicator ‘high quality of Chinese food’ had the biggest gap between expectation and performance. Similar issues on food have been reported in previous studies on Chinese GPTs in Australia (Li et al., 2011; Tourism Research Australia, 2014; Wang and Davidson, 2009). One possible explanation may be that tour operators try to minimize costs by selecting restaurants that are of a lower standard and quality.
Implications, limitations and conclusion
There are several theoretical and practical implications of the research. The findings of this study contribute to a better understanding, conceptualization and measurement of GPT service quality in the context of Chinese GPTs in Australia. The dimensionality of package tour services as revealed in the present study pinpoints the salient elements of Chinese GPTs in Australia. These are attractions, tour leader and food and accommodation. Australian tourism stakeholders may need to monitor the service performance of these elements to help develop a sustainable package tour industry. This study also demonstrates the merits of including expectation in examining service shortfalls, as argued in previous studies (Parasuraman et al., 1990; Voss et al., 1998), and highlights the need to specify the definition of expectation at conceptual and operational levels, a consideration largely ignored in previous studies. Researchers need to differentiate between desired expectation, adequate expectation and expectation as a prediction of the future, and to provide respondents with a clear definition of these to avoid confusion (Chen et al., 2016).
This research also represents a rigorous, comprehensive and focused evaluation of a complex service package that involves multiple service encounters and providers, as opposed to a single service provider as some studies have done (Han and Ryu, 2009; Min et al., 2002). A distinctive feature of the tourism and hospitality industry is that services are provided by a variety of suppliers (Mok et al., 2013). This study shows that in managing and monitoring complex Chinese GPT services, it is critical to focus on the key elements of service provision and this needs to be adjusted according to the different service provider contexts and organizational operation. Future studies are needed that focus on different tourism stakeholders and contexts. Improving and monitoring service quality can be a time-consuming and complex process, considering the importance of context, the number of dimensions and the range of different service providers involved. However, such efforts are worthwhile in the many instances where high service quality leads to customer loyalty.
Finally, there are some practical implications for Chinese tour operators and policymakers. One approach to addressing the service gaps is heavier regulation by Chinese tourism authorities. Chinese GPTs would benefit from a stronger government role aimed at eliminating unethical commission-based tours. The effects of the China Tourism Law (China National Tourism Administration, 2017) are beyond the scope of this study. However, of the 32 respondents who travelled to Australia after 2013 when the law was passed, 7 reported forced shopping and 6 reported forced participation in optional tours. Thus, it may be argued that illegal and unethical practices due to commission-based tour operations have not been eliminated. Chinese and Australian tourism authorities could take action to enforce the law and conduct evaluations of its effectiveness. Regulations and licensing in relation to inbound and outbound tour operators and tour leaders could also be introduced and enforced more strictly, such that operators and leaders are required to comply or otherwise risk losing their license for breaching the law. Future studies could explore better accreditation of tour guides and tour operators and the development of trade arrangements between China and Australia in terms of quality.
Assuming that the laws of competition will reward tour operators offering high-quality services by way of positive word of mouth and repurchases, and ultimately increased profit, another avenue to help improve service quality may be to rely on market forces. Even though to some extent the fierce competition and lack of effective regulation contribute to current problems of services in the Chinese tourism industry, as the market matures and consolidates, Chinese tour operators will need to provide quality services to maintain competitiveness. For tour operators who wish to pursue this path, the findings of the present study highlight the importance of a profitable business model. This may be achieved through appropriate pricing levels, adjusting human resource strategies, monitoring on-tour services and regulating tour leaders’ interactions with tourists. Several further practical implications flow from these observations.
Firstly, Chinese tour operators need to invest in marketing and in delivering quality services instead of relying on low-priced schemes to attract and retain customers. This will increase the cost of marketing and service delivery, which will in turn require higher GPT prices to provide the profit margins needed for operation. Such investment would not be wasteful, because service improvements in business generally, and in the tourism industry specifically, are known to retain customers and encourage positive referral (Zeithaml et al., 1996).
Secondly, tour operators could consider implementing human resource strategies to recruit and train tour leaders and other employees and to monitor their performance. Tour operators may need to recruit and train tour leaders with competent interpersonal and problem-solving skills. This can start with including personality tests in the selection process and providing periodic training. Specific performance standards that align with the dimensions of GPT service revealed in the present study can also be introduced, and professional training can be provided to prepare tour leaders to meet these standards.
Thirdly, a few on-tour services need attention from tour operators. Tour operators may need to consider adjusting current itineraries and extending the time spent at attractions. Attractions should be the priority in tour arrangements, and any commission-based activities, including shopping and optional tours, should not compromise tourists’ time at attractions. In advertised itineraries and promotions, all shopping activities and optional tours should be clearly marked to reduce unnecessary confusion and potential conflict.
Finally, it is suggested that tour operators regulate tour leaders’ behaviour relating to shopping and optional tours more rigidly. Tour leaders need to explicitly brief tourists on shopping and optional tours with regard to duration, frequency and cost (Wang et al., 2000). All shopping and optional tour activities should follow the travel contract strictly. Forced purchasing needs to be avoided and penalized when it occurs. However, the literature suggests that tour leaders are often not paid adequately and are thus under pressure to achieve economic remuneration through persuading and even forcing shopping and optional tours (Mak et al., 2011). In this sense, it is the economic structure of the GPT system that contributes to the identified service shortfalls. Thus, responsibility for service outcomes largely rests with tour operators who, through better design and pricing and fairer remuneration, can more effectively control the behaviour of tour leaders. Clearly, tour operators need to find an equilibrium between GPT prices and the revenue from commissions. If the tour operators still intend to underprice their GPTs and use shopping commissions to achieve profit margins, they are likely to lose customers.
The interpretation of these results may be limited by several considerations. Firstly, the findings are based on a case study of Chinese GPTs in Australia, thus necessitating obvious caution in interpreting and applying the findings to GPTs in other countries or Chinese GPTs to other destinations. Moreover, this study was based on a nonrandom sample of 520 respondents from a large Chinese GPT market in Australia, thereby prompting familiar calls for replication. Secondly, the measurement only discloses the basic and most important features of Chinese GPTs to Australia. The iterative procedure of developing the questionnaire retained only those items that were common and relevant to general Chinese GPTs under direct control of tour operators. A complete catalogue of possible service attributes of GPTs under all possible circumstances would probably list hundreds of attributes. Thirdly, the current study used a quantitative questionnaire approach to measuring and analysing the constructs of expectation and service quality performance. The primary focus was on the gap between expectation and perceived performance; why and how these expectations and perceptions are formed remains unknown.
Supplemental material
Supplementary_Appendix for Examining service shortfalls: A case study of Chinese group package tours to Australia by Hanyu Chen, Betty Weiler, and Martin Young in Journal of Vacation Marketing
Footnotes
Acknowledgement
The authors would like to thank Francis Markham for his suggestions on data analysis of this study.
Declaration of conflicting interests
The author(s) declared no potential conflict of interest with respect to the research, authorship and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship and/or publication of this article.
References
