Abstract
In tourism studies, new means of data collection are opening up opportunities for disclosing hidden mobility patterns. This paper aims to analyze and model tourist flow networks for different trip lengths at the urban scale, using user-generated content (UGC) data collated from an open tourism web service. The textual UGC data, with high spatial and temporal resolution, are used to construct three tourist flow networks, one for each trip length. Social network analysis and a revised spatial interaction model are deployed to explore the temporal heterogeneity in tourist movements. This empirical study of Nanjing City further confirms the power law of distance decay in intraurban tourist mobility. Furthermore, the research reveals temporal variations with length of trip. The paper highlights the role of time in tourism studies by incorporating a temporal dimension into the analyses and taking advantage of the availability of new data.
Introduction
China, as the second largest economy in the world, is undergoing an economic transition from traditional manufacturing to modern services. In particular, rapid urbanization has stimulated the development of tourism. Domestic tourist numbers reached 3,262 million in 2013, following an average annual growth rate (AAGR) of 11.27% between 1990 and 2013. In 2013, China’s domestic tourism revenue reached RMB 2,627.61 billion yuan, with an AAGR of 12.5% between 1995 and 2013 (CEIC China Database 2014). The challenges that this growth brings to China are subject to increasing research.
Human mobility is a key concept subject to multidisciplinary study (e.g., tourism, geography, transport, and sociology), where research has focused on social mobility, geographic mobility, occupational mobility, smart mobility, and even everyday mobility. Although there might be varied foci and interests between these areas, analyzing and modeling flows of people on different (spatial and temporal) scales is a major concern in the mobility literature. For example, Stillwell et al. (2016) explored the scaling effects of distance decay in the case of internal migration in an international context. Tourism, as a form of mobility, particularly as a subset of a vast and heterogeneous complex of global mobilities, is increasingly integrated into wider processes of economic and political development and even constitutive of everyday life (Hannam, Butler, and Paris 2014). Understanding how tourists move through time and space has important implications for infrastructure and transportation development, product development, destination planning and the planning of new attractions, as well as management of the social, environmental, and cultural impacts of tourism (Lew and McKercher 2006).
Understanding tourism mobility involves the scientific and empirical interpretations of spatial and temporal patterns of flows between destinations or attractions within a study area. The prerequisite for reasonably analyzing and modeling tourist mobility is the collection of representative flow data of a high quality. Apart from secondary data, traditional methods of data collection are dominated by questionnaire surveys and interviews (e.g., Shoval and Raveh 2004; Asero, Gozzo, and Tomaselli 2016; Wong, Fong, and Law 2016), which are time consuming, costly, and labor intensive. Hannam, Butler, and Paris (2014) suggest that new technologies (e.g., sensor, social media, and the Internet), together with new means of high-resolution data collection, provide opportunities to develop sophisticated tools with which to explore mobility (Shoval and Ahas 2016), encompassing digital footprints (Önder, Koerbitz, and Hubmann-Haidvogel 2016), word of mouth (Ring, Tkaczynski, and Dolnicar 2016), user-generated content (UGC) (Lu and Stepchenkova 2015) and big data (Fuchs, Höpken, and Lexhagen 2014).
Most of these studies focus on the spatial dimension of mobility; however, Leiper (2004, 129) suggests that “different tourists tend to perceive the same destination in different ways and a very influential variable is the time available for spending in each place,” because time has been increasingly recognized as a scarce commodity and limited resource. The valuation of time as commodity or resource, which determines the distribution and order of attraction visits, is very much affected by time availability during a trip. Fuchs, Höpken, and Lexhagen (2014) stated that “the evidence and understanding of the role played by time are incomplete.” Using survey data, Shoval and Raveh (2004) revealed that both length of stay and number of previous visits to the city have a strong effect on the tourism consumption pattern, which is an initial exploration of the temporal dimension of tourism mobility. This article, therefore, further explores the temporal heterogeneity in tourism mobility through the application of new means of data collection, to consider the following: how does the length of trip affect tourist mobility, which data sets can be utilized to analyze the influence, and what are the implications for tourism mobility theory?
The second section of the article presents a literature review of tracking and analyzing tourist mobility. The third section introduces the study area of Nanjing City (China), data collection and processing, and the analytical methods, and the fourth section presents the results of the analyses and models. The fifth section proposes a conceptual model summarizing and comparing tourist flow patterns in response to different trip lengths. The article concludes with a discussion of the potential contribution to the study of tourism mobility.
Literature Review: Tracking and Analyzing Tourist Mobility
Tourism mobility involves the details of a person (who), location (where), time (when), and context (what or environment) (e.g., Kim and Fesenmaier 2015). The 4-W’s approach enables analysis of where and when the tourist is and what they are doing. Information can be recorded on a variety of spatial, temporal, and human scales, where spatial scale considers resolution and extent; the temporal scale includes interval and duration or length; and the human scale consists of sample and population. This approach underpins both the tracking and analysis of tourist mobility.
Tracking
The use of sensors for tracking purposes has grown steadily, for example, GPS, cell-tower identification, Wi-Fi positioning, Bluetooth, RFID, and cameras, to list but a few. In terms of citizen science (Goodchild 2007), the individual tourist acts as a kind of “sensor,” allowing movement to be tracked through their physical movements together with the personal data they volunteer and share online. By reviewing the progress of tracking technologies, from GPS to smartphone, Shoval and Ahas (2016) predict the eventual development of smartphones that will connect to various external sensors and become the main facility for rapidly collecting spatial and temporal data on tourist mobility. For example, online social networks (e.g., Twitter, Flickr, and Foursquare) already provide geo-referenced messages and photos with high spatial and temporal resolution. These new data sets have motivated spatial and temporal analyses of tourist movement; Shoval and Ahas (2016) argue that new tracking technology is promoting data collection from networks for analysis at fine spatial, temporal, and human scales. The Internet, in particular, has changed from a “publishing-browsing platform” to a “participation-interaction platform” (Xiang et al. 2015). For instance, Zhou, Xu, and Kimmons (2015) developed a method for detecting tourism destinations using geotagged photos from social media. Liu et al. (2014) utilized a social media check-in data set submitted by approximately half a million users to analyze interurban trip patterns. As social networks and online communities create the social DNA of the Network Society (Castells 2001), a range of platforms that utilize UGC have emerged specifically within the tourist sector, for example, virtual communities, consumer reviews, personal stand-alone blogs, blog aggregators, microblogging platforms, social networks, media sharing tools, and wikis (Lu and Stepchenkova 2015).
UGC, which is textual or visual (e.g., Zhang, Wu, and Mattila 2016), can improve data availability, reduce data cost, and promote speed and simplicity of data collection when compared with other methods (e.g., GPS tracking, travel diary, or questionnaire survey). After completing a systematic review of empirical research using UGC as a data source to address issues in tourism and hospitality applications, Lu and Stepchenkova (2015) conclude that textual UGC and qualitative content analysis are the leading data type and research method employed, although there is an increasing trend in quantitative analysis (e.g., big data analytics) using both textual and visual data. However, most of these applications focus on destinations or attractions rather than on tourist flows.
Analyzing
McKercher and Lau (2008) summarized 78 tourist movement patterns within an urban destination using qualitative techniques. Quantitatively analyzing the mobility patterns is very much dependent on these scales (spatial, temporal, and human scales), complicating how to aggregate these data at the individual level, spatially and temporally. One approach is to conduct analyses at the individual level. For example, Vu et al. (2015) modeled the travel behaviors of inbound tourists to Hong Kong using geotagged photos and the Markov Chain model. Time–geography focuses on the movement and interaction of tourists in time and space, and has been extensively applied to visualizing, analyzing and modeling patterns of tourist mobility (e.g., Grinberger, Shoval, and McKercher 2014), and time-space constraints (Shoval 2012). Modsching et al. (2006) analyzed the tourist’s spatial behavior, in particular, the time spent for different activities at an urban destination using GPS tracking data. McKercher et al. (2012) further explored the significant impacts of visitor numbers on their spatial patterns using the same kind of data. These studies highlighted the heterogeneous uses of time between activities during a trip.
There is an increasing trend toward aggregating individual-level mobility data into networks, as the basis for analyzing the topological structure of the attraction system (Smallwood, Beckley, and Moore 2012). This is because understanding the network characteristics of tourism attractions has practical implications for the competitiveness, management, and planning of tourism (Stienmetz and Fesenmaier 2015). Travel patterns can be viewed as a network and are therefore subject to network analysis (e.g., Shih 2006), which includes several structural indicators, for example, centrality. Based on a core–periphery analysis of an attraction network, Zach and Gretzel (2011) argued that the network structure provides a strong and practical basis for dynamically bundling products that create value for tourists and the destination. Hwang, Gretzel, and Fesenmaier (2006) concluded that (social) network analysis methods can provide insights into structural properties of trips and allow for testing a variety of structural differences, after their successful analyses of multicity tourism behavior across the United States. More interestingly, using UGC data from overseas tourists (textual trip diaries) from 6 websites and data from social media, Leung et al. (2012) demonstrated the strength of social network analysis methods through the analysis of patterns of intradestination tourist movement networks to examine the impacts of the 2008 Olympic Games. These case studies have shown that both UGC data and social network analysis can be effectively integrated into analyzing the spatial patterns of tourist flows. However, temporal patterns of tourist flows have not yet been reported.
In the majority of tourism studies, distance plays a crucial role. Distance decay, sometimes referred to as “friction of distance,” indicates declining interaction with an increase in distance. The distance decay effect is a universal phenomenon in geography for modeling spatial interaction between origin and destination sites. McKercher and Lew (2003) noted that distance decay was popular in tourism research in the 1960s and 1970s, when it was used as a proxy for forecasting. McKercher, Chan, and Lam (2008) summarized three types of distance decay curves: classic decay curve, plateauing decay curve, and a decay curve with a secondary peak to represent the global flow of tourists. The distance decay effect within tourism studies has focused on destination choices, such as outbound international travel (Hooper 2015). Actual decay curves have been found to exhibit a plateau, or a secondary peak (Smallwood, Beckley, and Moore 2012). However, few studies have explored the distance decay in tourist flow distributions.
Greer and Wall (1979) observed that the type of trip undertaken influences the decay rate—short duration trips experienced the steepest curve, while longer duration trips tended to display more of a plateauing curve. This means that the distance decay effect may demonstrate temporal variations with length of trip. Halás, Klapka, and Kladivo (2014) have explored the heterogeneity of distance decay effect in commuting patterns. However, in network analysis case studies, temporal heterogeneity has been ignored. For example, tourists’ spatial movement patterns during the fall might be different from the spring, summer, and winter seasons. Nicolau, Zach, and Tussyadiah (2016) confirmed a positive impact of both distance and first-time visitation on length of stay using data collected from 908 US visitors to a tourism destination in the Atlantic Coast of the United States. This knowledge gap regarding the temporal heterogeneity in mobility analysis, therefore, forms a central aim of this article.
Methodology
Study Area
Nanjing City is the capital of Jiangsu Province, which is located in eastern China on the middle and lower reaches of the Yangtze River (Figure 1). Nanjing City comprises 11 districts, with an area of 6,597 km² and a total population of 8.22 million in 2014 (JSBS 2015). As a key node within the Yangtze River Delta, which is an emerging global city region, Nanjing has been a national center for education, research, transportation, and tourism. As a result of rapid urbanization, Nanjing has become a core mega city in the economically wealthy Yangtze River Delta (Wu et al. 2014). In terms of tourism, this city, with 2,500 years of urban history, has been the capital city of six different dynasties and is now recognized as one of the four great ancient capitals of China (Yuan, Gao, and Wu 2016). With such a long cultural history and a unique natural landscape, Nanjing has been attracting domestic and overseas tourists. Nanjing hosted 5.66 million overseas tourists and 94.19 million domestic tourists in 2014 (JSBS 2015). With a total tourism income of RMB 150.48 billion in 2014, Nanjing ranked among the leading cities and is well recognized as a top tourism city (JSBS 2015). Because the focus is on the internal mobility of tourists within a city, the study area is composed of 47 subdistricts located to the south of the Yangtze River and within the ring roads (Figure 1).

Figure 1. Location of study area (Jiangsu province, Nanjing City, and distribution of attractions).
Data Collection and Processing
The tourist flow data set was compiled from a commercial web service called “Where to go?” (www.qunar.com), which was launched in late 2010 (another application example can be seen in the case study by Wu and Pearce 2016). This web service, an equivalent of TripAdvisor in the Western world, provides a platform for the public, and tourists in particular, to upload and exchange trip-related information (e.g., stories, narratives, and photos for attractions and routes) from across the world. Such UGC data, although in text or photo format, have high spatial and temporal resolution. Spatially, all of the attractions visited are detailed with a specific popular name together with easily recognizable photos. Each attraction can then be geo-referenced by using longitude and latitude coordinates extracted from an open Geographical Information System (GIS), for example, the Tianditu web service. After being projected onto the local coordinate system, a point layer with all of these attractions can be overlaid with other GIS layers (such as road networks, collected from secondary sources) for further spatial analysis. Temporally, the visits to attractions are recorded in order on a daily basis (some even on an hourly basis), so it is possible to recognize which attraction was visited on which date. In total, 1,424 valid samples of complete trips (trips with only one attraction visited were removed from the sample), involving 45 attractions within Nanjing in 2012 (including traditional spots and also shopping streets, bars and pubs, and university campuses) (see Figure 1), have been recorded in a database.
In terms of the network, each attraction is represented as a node, the direct line between a pair of nodes is denoted as an edge (E), and the number of tourists on the line is represented as flow. All of these edges form a network of tourist flow (or mobility). Suppose that a complete trip is composed of a departure train station, hotel, arrival airport, and eight attractions from A to H, of which E is outside the study area. This trip can then be semantically represented by a flow between the seven attractions excluding E. Further, this flow is mathematically represented by a 7×7 matrix in which the value 1 indicates a move and 0 means no move between a pair of origin and destination spots.
Following this rule, each trip in the sample was transformed into a 45×45 relational matrix with only 1 or 0 values. For the whole set of samples, all of these 45×45 matrices are overlaid by a simple matrix addition operation. The final output—a flow network—is represented as a 45×45 matrix, in which the value for each matrix element ij is the total flow of tourists from attraction i to j. With the temporal information (the attractions visited each day) available from the web service, these trips can be classified into one-day trips, two-day trips, and three-or-more-day trips. Accordingly, the flow network can also be split into three subnetworks of flows, which are composed of flows from only one-day or two-day or three-or-more-day trip, respectively. All of these networks will be used to analyze and compare the mobility and behavior of tourists in central Nanjing. Statistically, this network, formed of 1,424 trips in total, is composed of 45 nodes (Figure 1), 590 edges, with a total flow of 5,337 on the network. The statistical patterns of the three categories are summarized in Table 1. This table demonstrates a clear trend that the numbers of trips and flows decreased with the length of the trip but the total number of attractions and average numbers of attractions and routes per trip increased with the trip length.
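The matrix construction described above can be sketched in a few lines. This is only an illustrative sketch, not the authors' processing code: the function names, the toy trips, and the decision to link the stops on either side of an out-of-area attraction (D to F when E is dropped) are assumptions made here.

```python
import numpy as np

CLASSES = ("one-day", "two-day", "three-or-more-day")

def trip_matrix(sequence, index):
    """Return an n x n 0/1 matrix marking each consecutive move in one trip."""
    n = len(index)
    m = np.zeros((n, n), dtype=int)
    # Keep only stops inside the study area (assumption: out-of-area stops are skipped).
    stops = [index[s] for s in sequence if s in index]
    for i, j in zip(stops, stops[1:]):
        if i != j:
            m[i, j] = 1  # 1 marks a move, even if the same move repeats within a trip
    return m

def flow_networks(trips, index):
    """Add up the trip matrices, overall and per trip-length class."""
    n = len(index)
    nets = {c: np.zeros((n, n), dtype=int) for c in CLASSES}
    for sequence, days in trips:
        nets[CLASSES[min(days, 3) - 1]] += trip_matrix(sequence, index)
    total = sum(nets.values())
    return total, nets

# Hypothetical example: attractions A to H, with E outside the study area.
index = {name: k for k, name in enumerate("ABCDFGH")}
trips = [
    (list("ABCDEFGH"), 1),   # one-day trip
    (["A", "B"], 2),         # two-day trip
    (["B", "A", "C"], 3),    # three-day trip
    (["A", "B", "D"], 4),    # four-day trip, same class as three-day
]
total, nets = flow_networks(trips, index)
print(total[index["A"], index["B"]])  # flow from A to B over all trips
```

Keeping the three class matrices separate and summing them for the overall network mirrors the way the text splits one flow network into three subnetworks.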
Table 1. Flow of Tourists between Attractions for Different Trip Lengths.
Methods
Analyzing mobility network between attractions
In social network analysis, the basic measures of centrality are calculated from an n × n matrix of attractions to evaluate the structural features of networks. The measures classify the links among attractions according to their nodal role along tourist routes and identify the most important attractions in the network. Centrality is composed of several metrics, including in-degree, out-degree, and degree centrality. In-degree is the total number of tourists arriving at a node or attraction and, conversely, out-degree is the total number of tourists departing from a node or attraction (Table 2). As the sum of in-degree and out-degree, degree centrality (Table 2) allows the central attractions in the tourism networks to be recognized. Because this sum does not show whether a node mainly receives or sends tourists, this paper proposes a new index, vergence (between −1 and 1), to measure the difference between inflow and outflow at each node.
Table 2. Network Metrics.
Network density D, describing the global level of linkages among the nodes, is defined as the total number of edges or links E divided by the maximum possible number of links on the network. The more connected the nodes are, the denser the network. This global measure varies from 0 to 1; a value of 1 means that every node is tied to every other node. In this study, seven network metrics are employed to analyze the mobility of tourists between the attractions, which are detailed in Table 2 and explained as follows.
$C_{\mathrm{in},i}$ and $C_{\mathrm{out},i}$ are the inflow and outflow of tourists arriving at and departing from attraction $i$, respectively. $l_{ji}$ is the number of flows or tourists from attraction $j$ to $i$, and $l_{ij}$ from $i$ to $j$. $C_i$ is the degree centrality and $S_i$ the vergence value at attraction $i$. $D$ is the density of the network, $E$ is the total number of edges, and $C_D$ is the centrality of the network.
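The node-level metrics and network density can be computed directly from a flow matrix, as in the sketch below. The formulas in Table 2 are not reproduced in the text, so two choices here are assumptions: vergence is taken as (inflow − outflow) / (inflow + outflow), which lies between −1 and 1 with positive values meaning convergence, and density counts directed edges against the n(n − 1) possible ones.

```python
import numpy as np

def node_metrics(flow):
    """In-degree, out-degree, degree centrality and vergence for every attraction."""
    flow = np.asarray(flow, dtype=float)
    c_in = flow.sum(axis=0)    # column sums: tourists arriving at each attraction
    c_out = flow.sum(axis=1)   # row sums: tourists departing from each attraction
    c = c_in + c_out           # degree centrality
    # Assumed vergence formula; isolated nodes (c == 0) are given 0.
    s = np.divide(c_in - c_out, c, out=np.zeros_like(c), where=c > 0)
    return c_in, c_out, c, s

def density(flow):
    """Share of the n(n-1) possible directed edges that carry any flow (assumed form)."""
    flow = np.asarray(flow)
    n = flow.shape[0]
    off_diagonal = ~np.eye(n, dtype=bool)
    return np.count_nonzero(flow[off_diagonal]) / (n * (n - 1))

# Hypothetical 3-attraction flow matrix: entry [i, j] is the flow from i to j.
toy_flow = [[0, 3, 1],
            [0, 0, 2],
            [0, 0, 0]]
c_in, c_out, c, s = node_metrics(toy_flow)
d = density(toy_flow)
print(c, s, d)
```

In this toy network the first attraction only sends tourists (vergence −1) and the last only receives them (vergence 1), which is the divergence and convergence contrast the index is meant to capture.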
Modeling flows of tourists between attractions
Santeramo and Morelli (2016) concluded that distance is a main friction to tourist flows, but its effect varies with the intensity of tourist flows: the latter are less affected. A gravity model is commonly expressed as $I_{ij} = k P_i P_j f(d_{ij})$, where $I_{ij}$ is the intensity of interaction between $i$ and $j$, $P_i$ and $P_j$ are the attraction forces at locations $i$ and $j$ respectively, $d_{ij}$ is the network distance between $i$ and $j$, and $f(d_{ij})$ is a function of distance $d_{ij}$. In practice, $P_i$ and $P_j$ are often measured by the economy and population of the two places, and $f(d_{ij})$ is either an inverse power function $d_{ij}^{-\beta}$ or a negative exponential function $e^{-\beta d_{ij}}$. $\beta$ is a spatial barrier coefficient or distance friction coefficient: the greater $\beta$ is, the more sensitive the interaction is to distance and the faster it decays. Wilson further developed the gravity model, based on entropy maximization, into a flow model (Wilson 1967), $I_{ij} = k O_i D_j f(d_{ij})$, where $d_{ij}$ and $f(d_{ij})$ remain the same and $I_{ij}$ is the intensity of flow from location $i$ to $j$. For tourist flows, $O_i$ is the outflow at location $i$ and $D_j$ the inflow at location $j$. In this study, $l_{ij}$ is used to measure $I_{ij}$, the flow between two attractions, $C_{\mathrm{out},i}$ for $O_i$, and $C_{\mathrm{in},j}$ for $D_j$, with $d_{ij}$ the real traveling distance between attractions.
In order to consider the variation in the effects of inflow and outflow at attractions, equation (9) needs to be revised into equation (10) as follows (see another example in Patuelli, Mussoni, and Candela 2013):
where two parameters a and b are added to the equation to reflect the intensity of the impacts of inflow and outflow on flows. Parameters a and b can be viewed as the elasticities of attraction interactions. This gravity model has also been applied by Morley, Rosselló, and Santana-Gallego (2014) for modeling tourism demand. The interpretation is straightforward: for parameter a, a 1% increase in inflow size is expected to produce an increase of approximately a% in the number of local flows, and likewise for parameter b.
To calibrate the parameters a, b, and β, a natural log transformation is applied to equation (10), which results in equation (11) for the power function of f(dij) and equation (12) for the exponential function of f(dij). An ordinary least squares (OLS) regression analysis is then used to seek a best-fit equation, with the real data set including the flow of tourists, inflows, outflows, and distance between attractions.
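The displayed equations (9) to (12) are not reproduced in this text. One form consistent with the surrounding definitions is given below as a hedged reconstruction, not as the authors' typeset equations. It assumes that equation (9) is the flow model with f written as a decreasing function of distance, and that a is attached to the inflow D_j and b to the outflow O_i, following the wording of the paragraph above; the opposite pairing cannot be ruled out from the text alone.

```latex
% Reconstruction under stated assumptions; symbols follow the definitions in the text.
\begin{align}
I_{ij} &= k\, O_i\, D_j\, f(d_{ij}) \tag{9}\\
I_{ij} &= k\, O_i^{\,b}\, D_j^{\,a}\, f(d_{ij}) \tag{10}\\
\ln I_{ij} &= \ln k + b \ln O_i + a \ln D_j - \beta \ln d_{ij}
  && \text{with } f(d_{ij}) = d_{ij}^{-\beta} \tag{11}\\
\ln I_{ij} &= \ln k + b \ln O_i + a \ln D_j - \beta\, d_{ij}
  && \text{with } f(d_{ij}) = e^{-\beta d_{ij}} \tag{12}
\end{align}
```

Both log forms are linear in the unknowns ln k, a, b, and β, which is what makes OLS calibration possible.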
Results
Using the three flow networks defined above, social network analysis was conducted and flow models were calibrated. Their network and flow characteristics were calculated separately for comparison, using the metrics defined in Table 2 and the models in equations (11) and (12).
Nodal Position on the Tourist Flow Networks
Nodal position on tourist flow networks
The centrality value at each attraction was calculated using equation (3) for each network separately (Figure 2). There is a general trend: the centrality value decreases with the length of trip. The average centrality value over a network decreases from 97.91 (one-day trip) to 80.13 (two-day trip) and 59.16 (three-or-more-day trip), and the corresponding standard deviation from 188.51 to 115.97 and 64.27. This indicates that a one-day trip concentrates on a few well-known attractions, whereas a three-or-more-day trip spreads across most attractions. These results show that the nodal position on the network of attractions varies with trip length. For simplicity and interpretation purposes, all attractions are classified into only two categories, primary and others, based on their centrality values. The top four attractions with the highest centrality values are the same for each network: the Confucian Temple, Sun Yat-sen’s Mausoleum, Xuanwu Lake, and the Presidential Palace. These four attractions are defined as primary spots and all the rest as others. However, the centrality value of each primary attraction varies between the networks. On the one-day flow network, the centrality values of the Confucian Temple, Sun Yat-sen’s Mausoleum, and Xuanwu Lake are all greater than 500. On the two-day flow network, the centrality values of the four primary attractions range between 300 and 500, with the highest value of 476 taken by the Confucian Temple. On the three-or-more-day flow network, the highest centrality value, for the Confucian Temple, is 235 and the lowest, for Xuanwu Lake, is 174. As the core attractions in Nanjing, the Confucian Temple and Sun Yat-sen’s Mausoleum rank first and second on all three networks. These two attractions, the only two recognized as 5A-level (highest) attractions, are the best known across Nanjing.
Both Xuanwu Lake and the Presidential Palace are classified as 4A-level tourist attractions. The general declining trend of centrality values indicates that the importance of the four primary attractions in tourists’ travel behavior decreases with the length of trip. In some sense, this result confirms the role of time in ranking primary attractions, as highlighted in the theory of tourist attraction by Botti, Peypoch, and Solonandrasana (2008): the one-day trip is focused on discovery attractions, but the longer trip on escape attractions.

Figure 2. Mapping the centrality values of primary attractions across the study area.
The visitation rates to primary attractions
Visitation rates to the top four primary attractions are calculated for each length of trip, shown in Table 3, in which the percentage of primary attraction k for trip t is calculated as the division of the number of samples involving attraction k for trip t over the total number of samples for trip t. It is clear to see that these visitation rates increase with the trip length. For short-length trips, only the selected primary attractions are visited quickly because of limited time availability. For longer-length trips, each of the most attractive primary sites, together with its surrounding minor sites, will be visited each day because of higher time availability.
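The visitation rate defined above is a simple ratio, and a minimal sketch of the calculation follows. The function name and the three toy trips are hypothetical and are not drawn from the Nanjing sample.

```python
CLASSES = ("one-day", "two-day", "three-or-more-day")

def visitation_rates(trips, primary):
    """Share of trips in each length class that include each primary attraction."""
    counts = {c: {k: 0 for k in primary} for c in CLASSES}
    totals = {c: 0 for c in CLASSES}
    for sequence, days in trips:
        c = CLASSES[min(days, 3) - 1]
        totals[c] += 1
        visited = set(sequence)          # an attraction counts once per trip
        for k in primary:
            if k in visited:
                counts[c][k] += 1
    return {
        c: {k: (counts[c][k] / totals[c] if totals[c] else 0.0) for k in primary}
        for c in CLASSES
    }

# Hypothetical trips: (ordered list of attractions, trip length in days).
toy_trips = [
    (["Confucian Temple", "Xuanwu Lake"], 1),
    (["Confucian Temple", "Presidential Palace"], 1),
    (["Xuanwu Lake", "Confucian Temple", "Presidential Palace"], 2),
]
rates = visitation_rates(toy_trips, ["Confucian Temple", "Xuanwu Lake"])
print(rates["one-day"])
```

Counting each attraction once per trip matches the definition in the text, which divides the number of samples involving an attraction by the number of samples for that trip length.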
Table 3. The Visitation Rates to Primary Attractions between the Three Trips.
The vergence of attractions on flow networks
On each network, a vergence value is calculated for each attraction using equation (4), and then all of the attractions are classified into three categories—divergence, balancing, and convergence—according to the vergence values. A one-day trip shares the same set of convergence-type spots (Confucian Temple), divergence-type spots (Sun Yat-sen’s Mausoleum), and balancing (the rest), with a two-day trip. A three-or-more-day trip adds two more attractions (Xinjiekou Centre and Hulan street) to the convergence-type spots, and two other spots (Presidential Palace and Yangtze River Bridge) to the divergence-type spots. These variations indicate that tourists tend to start with a divergence-type attraction and end with a convergence-type attraction. It also implies that the flow network of a longer-length trip tends to be mixed with divergence-type and convergence-type spots, which might be determined by purchasing power, physical consumption, and many other factors. These results demonstrate that temporality has great impact on intensity and direction of vergence.
Flow Characteristics on the Tourist Flow Networks
Using the values of tourist flows $l_{ij}$, the network density, mean flow, and standard deviation are calculated using equations (5)-(7) for each network, respectively. The network density is very low and nearly the same across the three networks (0.216, 0.217, and 0.222 for the one-, two-, and three-or-more-day trips respectively). However, the mean flow and its standard deviation vary widely between the three networks, with a decreasing trend (7.27, 4.82, and 3.32 in mean flow and 15.17, 8.83, and 5.28 in standard deviation for the one-, two-, and three-or-more-day trips respectively). Flows on the three-or-more-day network are therefore more evenly spread. The short-term trip is characterized by large flows between primary attractions that have higher values of centrality and, conversely, the longer-term trip by small flows between minor attractions. For example, in the one-day trip, the total number of tourists from the Presidential Palace to the Confucian Temple is 108 (of 2,203). The shortest network distances between attractions are calculated in ArcGIS, based on the road transport network on Tianditu Maps. According to Equation 7, the weighted average distance d is calculated for each network. The one-day trip network has the longest distance (5.51 km), the three-or-more-day trip network the shortest (3.45 km), and the two-day trip network lies in between (4.39 km). This index also shows that tourist flows on short-term trips tend to be long distance, whereas flows on long-term trips tend to be short distance.
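The flow statistics in this paragraph can be sketched as follows. The exact formulas are in the missing Table 2, so this sketch makes three assumptions: the mean and standard deviation are taken over edges that carry flow, the standard deviation is the population form, and the weighted average distance is the flow-weighted mean of edge distances.

```python
import numpy as np

def flow_summary(flow, dist):
    """Mean flow, standard deviation of flow, and flow-weighted average distance."""
    flow = np.asarray(flow, dtype=float)
    dist = np.asarray(dist, dtype=float)
    values = flow[flow > 0]                 # assumption: only edges with flow are counted
    mean_flow = values.mean()
    std_flow = values.std(ddof=0)           # assumption: population standard deviation
    weighted_dist = (flow * dist).sum() / flow.sum()
    return mean_flow, std_flow, weighted_dist

# Hypothetical flows (tourists) and network distances (km) between three attractions.
toy_flow = [[0, 3, 1],
            [0, 0, 2],
            [0, 0, 0]]
toy_dist = [[0, 2, 5],
            [2, 0, 4],
            [5, 4, 0]]
mean_flow, std_flow, weighted_dist = flow_summary(toy_flow, toy_dist)
print(mean_flow, std_flow, weighted_dist)
```

Weighting by flow means that a heavily used long link raises the average more than a rarely used one, which is why the index can separate long-distance one-day movement from short-distance multi-day movement.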
The top five flows over each network are defined as primary flows and the rest as others. The distributions of these primary flows are shown in Figure 3. Over the one-day trip network, the largest number of flows with a total value of 108 is from the Presidential Palace to the Confucian Temple, followed by the flows with a value of 89 from Sun Yat-sen’s Mausoleum to the Confucian Temple. These flows are characterized by a long distance between primary attractions. Over the two-day trip network, the largest number of flows with a total value of 113 is from Sun Yat-sen’s Mausoleum to Ming Xiaoling Mausoleum, followed by the flows with a value of 67 from the Presidential Palace to the Confucian Temple. These flows are characterized by shorter distances between primary attractions than those over the one-day trip network. Over the three-or-more-day trip network, the largest number of flows with a total value of 67 is from Sun Yat-sen’s Mausoleum to Ming Xiaoling Mausoleum. The primary flows are dominated by the short distance between two primary attractions. The comparisons between the three networks indicate that there is strong association between primary attractions and primary flows. Primary flows are connected with at least one primary attraction. The primary flows from short-term trips tend to include long-distance mobility between primary attractions. However, the primary flows from long-term trips tend to be short-distance mobility between a primary attraction and its surrounding other attractions.

Figure 3. Tourist flow networks for different lengths of trip.
Flow Models
For each trip network, the power and exponential functions and their revised versions, selected to represent distance decay effects, are calibrated by OLS regression analysis (Equations 11 and 12). Comparing the results of the regression analysis (Table 4), the adjusted R² is highest for the revised power functions: 0.724, 0.662, and 0.633 for the one-day, two-day, and three-or-more-day trips, respectively. The model based on the revised power function therefore fits the real tourist flows between attractions best and explains them better. This further confirms that most flow networks of tourists have a degree distribution that follows a power law for the most part (Baggio, Scott, and Cooper 2010).
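The calibration step can be illustrated with a short least-squares sketch. It is not the authors' procedure or data: the function name, the synthetic flows, and the parameter values used to generate them are all assumptions for illustration, and only origin-destination pairs with positive flow can enter the log-transformed regression.

```python
import numpy as np

def fit_flow_model(flow, outflow, inflow, dist, decay="power"):
    """OLS on the log-transformed revised flow model; returns estimates and adjusted R^2."""
    flow, outflow, inflow, dist = (
        np.asarray(v, dtype=float) for v in (flow, outflow, inflow, dist)
    )
    g = np.log(dist) if decay == "power" else dist   # ln d for power, d for exponential
    X = np.column_stack([np.ones_like(g), np.log(outflow), np.log(inflow), g])
    y = np.log(flow)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    n, p = X.shape                                   # p includes the intercept
    adj_r2 = 1 - (ss_res / ss_tot) * (n - 1) / (n - p)
    return {
        "ln_k": coef[0],
        "out_exp": coef[1],    # exponent on the origin outflow
        "in_exp": coef[2],     # exponent on the destination inflow
        "beta": -coef[3],      # distance friction coefficient
        "adj_r2": adj_r2,
    }

# Synthetic data generated from a power-law decay with known, made-up parameters.
rng = np.random.default_rng(0)
m = 200
outflow = rng.uniform(10, 500, m)
inflow = rng.uniform(10, 500, m)
dist = rng.uniform(0.5, 15, m)
flow = 0.8 * outflow**0.42 * inflow**0.43 * dist**-0.25 * np.exp(rng.normal(0, 0.05, m))

power_fit = fit_flow_model(flow, outflow, inflow, dist, decay="power")
expo_fit = fit_flow_model(flow, outflow, inflow, dist, decay="exponential")
print(power_fit["beta"], power_fit["adj_r2"], expo_fit["adj_r2"])
```

Fitting both decay forms to the same data and comparing adjusted R² is the same selection logic the text applies when it prefers the revised power function.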
Calibrated Regression Equations.
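The OLS calibration of a power-type distance decay model can be sketched as follows. Equations 11 and 12 are not reproduced in this section, so the functional form used here, T = K · O^a · D^b · d^(−β) with origin outflow O, destination inflow D, and distance d, is an assumption made for illustration, as are the function name `calibrate_power_model` and the synthetic data. Taking logarithms gives ln T = ln K + a ln O + b ln D − β ln d, which is linear in the parameters and can be fitted by ordinary least squares.

```python
import numpy as np


def calibrate_power_model(flow, outflow, inflow, distance):
    """Fit ln T = ln K + a ln O + b ln D - beta ln d by OLS (assumed form)."""
    flow, outflow, inflow, distance = (
        np.asarray(x, dtype=float) for x in (flow, outflow, inflow, distance)
    )
    y = np.log(flow)
    # Design matrix: intercept, ln O, ln D, and -ln d so that beta is positive.
    X = np.column_stack(
        [np.ones_like(y), np.log(outflow), np.log(inflow), -np.log(distance)]
    )
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ coef
    n, n_params = X.shape  # n_params counts the intercept
    r2 = 1.0 - (residuals @ residuals) / np.sum((y - y.mean()) ** 2)
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_params)
    return {
        "K": float(np.exp(coef[0])),
        "a": float(coef[1]),
        "b": float(coef[2]),
        "beta": float(coef[3]),
        "adj_r2": float(adj_r2),
    }


# Synthetic data with known parameters (not the study's data).
rng = np.random.default_rng(0)
n_obs = 200
O = rng.uniform(50, 500, n_obs)   # hypothetical outflows of origin attractions
D = rng.uniform(50, 500, n_obs)   # hypothetical inflows of destination attractions
d = rng.uniform(0.5, 15, n_obs)   # hypothetical distances
T = 2.0 * O**0.4 * D**0.4 * d**-0.3 * np.exp(rng.normal(0, 0.05, n_obs))

fit = calibrate_power_model(T, O, D, d)
print(fit)
```

Flows of zero cannot be log-transformed, so a real calibration would need to decide how to handle attraction pairs with no observed flow; the sketch sidesteps this by generating strictly positive flows.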
In the three calibrated models using the revised power function, parameters a and b are all statistically significant at the 1% level. Both parameters decline with trip length, from 0.424 and 0.433 (one-day trip network) to 0.390 and 0.387 (two-day trip network) and down to 0.308 and 0.330 (three-or-more-day trip network), respectively. The three-or-more-day trip network shows the largest difference between a and b (0.022, compared with 0.009 and 0.003 in the other two networks). This indicates a slightly smaller effect from outflows than from inflows on tourist flows between attractions in that network.
The spatial barrier coefficients (β) of flows between the attractions are all statistically significant at the 1% level, being 0.248, 0.370, and 0.529 for the one-day, two-day, and three-or-more-day trips, respectively. In other studies of human mobility using mobile phone and taxi data sets, β is calibrated between 1 and 2 (Kang, Ma, Tong, and Liu 2012; Liu et al. 2012), so the values of β in this study fall below that range. In a study of long-distance flows between cities in China using passenger data sets, β is also higher than the values found here (Xiao et al. 2013). These differences indicate that distance has a smaller effect on tourist flows within a city than on long-distance travel or on the daily travel of local residents. Travel from an origin city to a destination city covers a long distance and carries higher time and economic costs, whereas travel within a destination city covers only short distances, so its friction coefficient is smaller. Having already invested in a high transport cost to reach the city, tourists are often more concerned with which attractions they visit than with the distance between them. Distance friction for tourists is likewise smaller than for the daily activities of local residents.
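A short calculation makes these coefficients easier to compare. Assuming, for illustration only, that flow scales with distance as d^(−β) and that all other terms are held fixed, doubling the distance multiplies the expected flow by 2^(−β). The β values below are those reported above; the reference values 1 and 2 are the bounds of the range reported in the cited mobile phone and taxi studies, and the function name is hypothetical.

```python
def flow_ratio_when_distance_doubles(beta: float) -> float:
    """Factor by which flow changes when distance doubles, assuming flow ~ d**(-beta)."""
    return 2.0 ** (-beta)


betas = {
    "one-day trip": 0.248,
    "two-day trip": 0.370,
    "three-or-more-day trip": 0.529,
    "reference, beta = 1": 1.0,
    "reference, beta = 2": 2.0,
}

ratios = {label: flow_ratio_when_distance_doubles(b) for label, b in betas.items()}
for label, ratio in ratios.items():
    print(f"{label}: flow falls by {100 * (1 - ratio):.1f}% when distance doubles")
```

Under this assumption, doubling the distance reduces the expected flow by roughly 16%, 23%, and 31% for the three trip lengths, compared with 50% when β = 1 and 75% when β = 2.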
The regression results for the three types of trip show that, as trip length increases, the influence of the attraction system on tourist flows decreases while the distance decay effect increases. In other words, on a short trip the primary attractions have a stronger impact on flows and the effect of distance between attractions is weaker, so tourists tend to move to the primary attractions. On a long trip, the influence of the attraction hierarchy on tourist flows decreases and the distance decay effect increases, so tourists tend to move to other attractions within a shorter distance. This may imply that the tourist flow network does not exhibit the scale-free property demonstrated by other tourism networks (see Baggio, Scott, and Cooper 2010).
Discussion
All of the attractions in Nanjing City can be classified into three attraction subsystems: the Sun Yat-sen’s Mausoleum subsystem, the Confucian Temple subsystem, and the city center subsystem (Jin, Xu, Huang, and Cao 2014). Tourist flows can accordingly be classified into two categories: intrasystem flows and intersystem flows. The mix of intrasystem and intersystem primary flows varies widely across the three networks. During a one-day trip, the top four flows are all intersystem, which demonstrates that the one-day flow network is dominated by long-distance intersystem mobility. During a two-day trip, the primary flows comprise two intersystem flows and three intrasystem flows, the former occurring mainly between the core primary attractions. During a three-or-more-day trip, all the primary flows are intrasystem, which reveals that this network is dominated by short-distance mobility, with only a small number of long-distance intersystem flows.
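The intrasystem/intersystem classification described above amounts to comparing the subsystem of a flow's origin with that of its destination. The sketch below illustrates the idea. The membership table is an assumption made for the example (the full assignment of attractions to subsystems is given in Jin, Xu, Huang, and Cao 2014, not here), and the names `SUBSYSTEM` and `classify_flow` are hypothetical.

```python
from collections import Counter

# Assumed membership for illustration only; not taken from the study's tables.
SUBSYSTEM = {
    "Sun Yat-sen's Mausoleum": "Sun Yat-sen's Mausoleum subsystem",
    "Ming Xiaoling Mausoleum": "Sun Yat-sen's Mausoleum subsystem",
    "Confucian Temple": "Confucian Temple subsystem",
    "Presidential Palace": "city center subsystem",
}


def classify_flow(origin: str, destination: str, membership=SUBSYSTEM) -> str:
    """Label a flow by whether both ends lie in the same attraction subsystem."""
    same = membership[origin] == membership[destination]
    return "intrasystem" if same else "intersystem"


# Flows named in the text as the largest in the three trip networks.
named_flows = [
    ("Presidential Palace", "Confucian Temple"),
    ("Sun Yat-sen's Mausoleum", "Confucian Temple"),
    ("Sun Yat-sen's Mausoleum", "Ming Xiaoling Mausoleum"),
]

labels = [classify_flow(o, d) for o, d in named_flows]
counts = Counter(labels)
print(labels)
print(dict(counts))
```

Counting the labels for the top five flows of each network would reproduce the kind of comparison made in this section, provided the membership table reflects the actual subsystem assignment.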
Based on this analysis, the patterns of tourist flows between attractions in Nanjing City are abstracted into a conceptual model (Figure 4). In the pattern of a one-day trip, primary flows are mostly distributed over the primary attractions between systems and few within the systems. These primary flows dominate the flows of the entire system, connecting different attraction subsystems. Other flows are distributed within the system.

Patterns of tourist flows between attractions for three networks.
In the pattern of a two-day trip, primary flows are mostly distributed between the primary and other attractions within a system, with few between the primary attractions of different systems. The other flows are distributed uniformly between systems and within a system. In the pattern of a three-or-more-day trip, the primary flows are distributed uniformly between the primary and other attractions within a system, while the other flows run between the primary attractions of different systems and also between the primary and other attractions within a system.
The patterns of tourist flow vary widely between the three networks. On short trips, flows between the primary attractions of different systems dominate the whole process of mobility, whereas long trips are characterized by intrasystem flows. To understand the reasons, we interviewed tourists from the three networks, 10 from each. The interviews found the following.
During the one-day trip, eight tourists decided to visit two to three well-known primary attractions within different systems (e.g., Sun Yat-sen’s Mausoleum, Confucian Temple, Presidential Palace, and Xuanwu Lake), while two tourists tended to concentrate on one system by visiting its primary attraction and the other attractions around it. Because of the long distance between the primary attractions of different systems, the one-day trip is dominated by intersystem flows.
During the two-day trip, three tourists chose one system each day and made detailed visits to its primary attraction and the surrounding other attractions. Four tourists tended to visit all attractions within one system on the first day and only the primary attractions of the other two systems on the second day. A further three tourists did not have any specific preferences and focused more on primary attractions. As a result, the two-day trip network is characterized by both long-distance intersystem and short-distance intrasystem moves.
During the three-or-more-day trip, because sufficient time was available, all ten tourists had detailed plans and tended to concentrate on one attraction subsystem per day so that they could explore all of its attractions in detail. The short distances between attractions within each system allowed them to maximize their traveling efficiency. Consequently, this trip network is characterized by short-distance moves.
Conclusions
In this paper, taking Nanjing City as a case study, user-generated content (UGC) with high spatial and temporal resolution has been used to construct three flow networks corresponding to different lengths of trip and to analyze the temporal heterogeneity of tourist mobility. The main conclusions are as follows.
The position of each attraction in the local attraction system, which is highly correlated with its centrality value, declines with the length of trip. The intensity and direction of flow vergence also demonstrate significant temporality. The distance decay effect in intraurban tourist movements follows a power law and strengthens with the length of trip, as the spatial barrier coefficient rises from 0.248 for one-day trips to 0.529 for trips of three or more days. These coefficients demonstrate that the flows are less sensitive to distance than intercity movements (Xiao et al. 2013) and daily human movements (Liu et al. 2012), which also indicates the scale-dependent property of tourist flows.
As trip length increases, the influence of attraction systems on flows weakens, while the influence of distance grows steadily. Tourists tend to shift from long-distance intersystem movements to short-distance intrasystem movements. These behaviors might be explained by the theories of “process”-oriented tourists (Lew and McKercher 2006) and “escape/discovery attractions” (Bottia, Peypoch, and Solonandrasana 2008). To test these explanations, more contextual data from the web service, such as opinions and comments on the attractions, need to be extracted and analyzed with text analytics. The temporal (decaying) effects described above indicate the role of time in perceiving, ranking, and visiting primary attractions, and they provide empirical evidence for the theories regarding time in tourism studies (Leiper 2004; Fuchs, Höpken, and Lexhagen 2014).
This case study has confirmed that UGC data, a form of big data and digital footprint, can be used to explore temporal effects in tourist mobility through static network analysis and a spatial interaction model at the aggregate level. However, social network analysis and spatial interaction models focus on pattern analysis and are not able to model spatial and temporal processes. The UGC data used in this article are a small sample in terms of temporal scale (only one year). With a larger sample in the future, such data can be used to model the dynamics of tourists’ mobility (e.g., Yang, Fik, and Zhang 2013) by developing Markov chain models (e.g., Vu et al. 2015). Many other elements of the digital footprint, such as photos and narratives, have not yet been explored; they contain information on tourists’ experience, attitude, and demographics and can be used to analyze tourists’ behavior and emotions (e.g., through sentiment analysis). The study is also subject to sampling bias, as only a small proportion of tourists record their trips online and, in particular, those deprived of digital access are not sufficiently represented in the sample. This reflects the drawbacks of UGC, although it has also provided evidence for the linkage between citizen science (Goodchild 2007) and tourism studies (e.g., Brosnan, Sebastian, and Rock 2015). It is rewarding to see how citizen science can potentially be incorporated into travel studies.
The limitations of the web service (where to go?) include the absence of attribute data for tourists (e.g., age, gender, origin), number of visits (first-time or repeat visit), types of tourism (e.g., pleasure), where the tourists stayed, and how they traveled to the destination. These data, which we recommend the web service add in the future, would not only help to explore more spatial and temporal patterns and behavior (e.g., different consumption patterns between first-time and repeat visitors, and the influence of hotel location on behavior patterns) but also help to interpret the mechanisms behind the observed spatial and temporal heterogeneity from the perspective of tourists’ behavior. Incorporating the temporal dimension into the study of tourist mobility will enhance understanding of the valuation of time in tourists’ behavior.
Theoretically, this case study reflects the tourist mobility in a rapidly developing city whose tourism studies remain conceptually ill equipped, given most of their theories have been generated from Western contexts (Winter 2009). This article, therefore, begins to address this significant gap in the field through exploring the temporal heterogeneity of tourism mobility in a non-Western context.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: We are thankful to the National Natural Science Foundation of China (serial number: 41571134, 41430635, and 41601131), the fund of Tourism Young Expert Training Program (serial number: TYETP1426), the Natural Science Foundation of the Jiangsu Higher Education Institutions (serial number: 16KJA170002), and the fund of the Priority Academic Program Development of Jiangsu Higher Education Institutions (serial number: 164320H116).
