A framework of comparative urban trajectory analysis

Abstract

The increasing availability of urban trajectory data from GPS-enabled devices has given scholars opportunities to study urban dynamics at a finer spatiotemporal scale. Yet given the multi-dimensionality of urban trajectory dynamics, current research faces the challenge of systematically uncovering the spatiotemporal and societal implications of human movement patterns. In particular, a data-driven policy-making process may need to use data from various sources with varying resolutions, analyze data at different levels, and compare results across different scenarios. A synthesis of spatiotemporal and network methods is therefore needed to give researchers and planning specialists a foundation for studying complex social and spatial processes. In this paper, we propose a framework that combines various spatiotemporal and network analysis units. By customizing the combination of analysis units, researchers can employ trajectory data to evaluate the urban built environment dynamically and comparatively. Two case studies of Chinese cities are carried out to evaluate the usefulness of the proposed conceptual framework. Our results suggest that the proposed framework can comprehensively quantify the variation of urban trajectories across various scales and dimensions.

Keywords

Comparative analysis, urban trajectory, spatial social network, China

Introduction

Advancements in sensing and computing technologies have created urban trajectory datasets of human and vehicle movement at an unprecedented scale and speed. For example, the prevalence of GPS, Wi-Fi, Cellular, and RFID devices has enabled human dynamics to be recorded through the movement of taxis, fleets, public transits, and mobile phones at the individual level. Understanding and analyzing human dynamics can help assess transportation infrastructure and urban planning policies, and further inform better strategies to optimize urban and transportation planning, improve human life quality, and amend city operations.

Gong et al. (2015) studied intra-urban daily human mobility using taxi trajectory data in Shanghai. Online check-in data, recorded mostly by GPS-enabled mobile devices, can also be good indicators of human activities. To enable data-driven urban studies using these increasingly rich trajectory data and to facilitate better decision-making for domain experts and policy-makers, the following two issues need to be addressed. First, trajectory data sets are often collected in diverse formats and at various scales. Second, it is often challenging to incorporate social connection data with geographic analysis (Andris, 2016).

Existing methods and frameworks mostly focus on analyzing human dynamics at a specified scale (e.g. individual level, intra-urban level, or inter-urban level), instead of comparing human movement patterns across various scales. Such comparative studies are particularly needed in the urban planning context, given that assessment and comparisons of different planning scenarios are important for decision-making. Taxi trajectory data can be used to investigate urban functional zones at different scales (Zhao et al., 2017). They may be studied at an individual level to identify human movement patterns, or aggregated by officially-defined boundaries or trajectory similarities to compare administrative spatial layout with human-demarcated community structures and explore gaps between public’s transportation needs and current state of transportation planning scenarios (Li et al., 2017). Experiments with different analysis units are often needed as well in that researchers or planning specialists may critically examine the effects of aggregation, both spatially and temporally, to identify the proper analysis unit (Ye and Rey, 2013).

Trajectory data contain rich information about both spatial and social processes within urban environments, and the analytical methods vary drastically from spatiotemporal analysis to network analysis at various scales. In this regard, we propose a framework that can accommodate various analysis needs by considering different combinations of space, time, and networks for analyzing trajectory data. Although there is no one-size-fits-all solution for analyzing trajectory data, the proposed framework aims to provide researchers and planners with a foundation to customize their analysis. The synthesis of various analysis scales and dimensions allows users to uncover spatiotemporal and societal implications of human activity patterns dynamically and comparatively. In the rest of the paper, we first provide a literature review on the theoretical foundations for developing the proposed framework, followed by an elaboration of the framework. Two case studies are then presented to showcase the feasibility of the suggested framework. The paper concludes with a discussion on potential applications of the framework and future research directions.

Literature review

Socio-spatial studies of the built environment have a long history in human ecology and urban geography (Lee, 1968). Burgess’s well-known concentric zonal hypothesis is one of the earliest works to recognize “natural areas” by delineating urban areas based on their physical attributes and associated human activities. This method has since been widely adopted to study the social and spatial dimensions of the urban built environment and to examine issues such as social polarization, segregation, and reconfiguration of social areas. Along this line, a number of studies have used census data to explore the socio-spatial structures of cities. Spielman and Thill (2008) studied social areas of New York City by linking physical space with social attributes. They suggested that this method could shed light on the geographic proximity as well as the social similarities among census tracts and thus provide valuable insight into the complex social landscape of the city. Yao and Zhang (2014), similarly, investigated the structural change of Beijing’s socio-spatial configuration using sub-district level census data. They found that the mosaic of physical and social landscape could effectively capture the formation of a more disaggregated urban structure as well as the emergence of new types of social areas.

While these census data-based studies demonstrate the value of incorporating population characteristics with geographies of physical landscape, they are limited to static datasets that can only capture the spatial patterns of residential- or work-based activities. In addition, census surveys are usually implemented every several years, which makes it difficult for researchers to investigate the dynamics of socio-spatial areas at a finer temporal resolution. Recent developments in location-aware technologies have provided researchers with opportunities to probe into socio-spatial patterns of the built environment at finer spatial and temporal resolutions and to extend the analysis to recreational activities through various types of digital footprints and user-generated content. Cranshaw et al. (2012) presented an exemplary work delineating human-demarcated areas using online check-in data. Venues are clustered based on their spatial proximity as well as their social proximity to one another. In particular, the spatial proximity is calculated from the geographic coordinates, while the social proximity is measured by the characteristics of the people checking in at the location. As a result, dynamic units that are subject to human activities can be identified. The richness of contextual information embedded in user-generated content has also allowed researchers to study the semantic meanings of places in terms of how a place is used and perceived by people. Hollenstein and Purves (2010) argued that tagging behavior on social media has implications for collective perceptions of urban places and thus sheds light on the social landscape of the built environment. Although a number of studies have pointed out the biased representation of such user-generated data in terms of biased demographics and uneven geography within and across the city (Luo et al., 2015; Robertson and Feick, 2015), the increasing amounts of data about human activities provide opportunities to advance our spatiotemporal understanding of the built environment.

According to Newman (2003), there are four “loose categories” of network analysis, namely social networks, information networks, technological networks, and biological networks. While these networks encompass multiple domains, they are modeled using a similar relational structure composed of nodes and ties, and are analyzed with a set of mathematical methods grounded in graph theory (Newman, 2003). For example, a city can be considered a network system whose edges are defined by streets and whose nodes are defined by street intersections or landmarks (Agryzkov et al., 2016; She et al., 2016), while a social relationship graph can be regarded as a network with people as its nodes and the physical contacts among people as its edges. Network analysis is considered a useful approach in many areas of scientific research because it provides both a theoretical framework that conceptualizes the relationships among individual actors and a set of analytic approaches that model and analyze these relationships. Social network analysis (SNA), in particular, has its theoretical roots in interaction among individuals and provides insights into various social networks, from interaction among people in the physical world to virtual connections established in the digital world, through analytic approaches (Dempwolf and Lyles, 2012).

While SNA primarily focuses on social aspects of people’s interaction, Andris (2016) called for a need to combine such analysis with GIS to advance our understanding of geographies underlying various social processes. Luo and MacEachren (2014) proposed a conceptual framework to study geo-social relationships, suggesting that the First Law of Geography can potentially be extended to account for both geographical and social network distance, relationship, and interaction. Such an integration is particularly useful to facilitate studies of built environment as it considers both social and spatial aspects of human behaviors. For example, while built environment has been considered as closely related to public health, SNA often considers the impacts of surrounding population while spatial analysis may primarily focus on the accessibility of physical infrastructure (Andris, 2016).

Integrating these two distinct yet related perspectives allows us to examine cities as not only the physical layout but also the composition of multiple social network layers. Human mobility is one example of these social networks, where the mobility trajectories form the edges of the network and the origins and destinations form the nodes. A range of studies have suggested the usefulness of applying network analysis to human mobility data to understand urban layout, discover functional regions within the city, and explore spatial interactions of places within and across the cities (Huang and Wong, 2016; Liu et al., 2014; Yuan et al., 2012). While each of these studies represents a significant effort towards the understanding of the city at a particular spatial scale (individual, city, inter-regional), a more comprehensive framework can advance our understanding of the built environment from a systematic perspective.

Studies of built environment have long been considered as an important component for urban planning as it is closely related to people’s physical activities and is thus essential to urban design and transportation planning (Handy et al., 2002). For example, a number of studies have found that certain characteristics of built environment, such as mixed land use, better street connectivity, and high population density, have positive impacts on people’s participation in physical activities and health outcomes (Harvey and Aultman-Hall, 2016; Troped et al., 2010). Studying human mobility through an integrated spatiotemporal and network analysis provides planners with a lens to understand the geographic landscape of various social processes. For example, Luo and MacEachren (2014) developed a geosocial visual analytics tool which can identify human interaction patterns, as well as design and evaluate the efficacy of different infectious disease control measures.

In our paper, we propose a comprehensive framework that can systematically measure and visualize spatial social network of built environment. In addition, this framework will be applied to different data sources (taxi and cell phone) and various urban settings (Beijing and Chongqing).

A framework for spatial-social network analytics

We establish an analytical framework that considers three dimensions and four scales. Table 1 conceptualizes the corresponding 12 basic analysis units. The social network here is represented by the relationships among urban objects connected through trajectories. Human movement patterns reflected by trajectory data are associated with places (Zhang et al., 2016). Therefore, trajectories among places provide evidence of relationships among individuals, groups, and places (Andris, 2016). That is, the trips people make to work, visit friends, and shop all reflect their social connections with people and places. Analyzing the networks formed by trajectory data thus provides an additional lens to examine the social proximity among places (Huang L et al., 2016).

Table 1.

The framework for spatial-social network analytics.

		Level
		Individual	Local	Meso	Global
Distributions	Space	A1	A2	A3	A4
	Time	A5	A6	A7	A8
	Network	A9	A10	A11	A12

The analysis unit at the individual scale signifies the geographical location of an attribute (A1, Table 1), the temporal label of an attribute (A5, Table 1), or a single trajectory that connects two places (A9, Table 1).

The analysis unit at the local scale explores a group of units formed by the focal observation and its neighboring observations. A focal area and its neighboring areas, for example, can be considered as the unit of analysis from the perspective of the spatial dimension (distribution) at the local scale (A2, Table 1) (Al-Dohuki et al., 2017). A specific time span, such as a focal hour together with the hour before and the hour after, can be considered as the unit of analysis from the perspective of the temporal dimension at the local scale (A6, Table 1) (Huang X et al., 2016). A local network describes how a focal community is connected to its related areas (A10, Table 1), for example, how a shopping center is connected to its related communities through various means of transportation (Wang et al., 2016a).

A meso-scale analysis studies a group of entities that share similar features in space, time, or network. The spatial distribution of areas with certain features can be treated as the meso-scale (A3, Table 1). The time period after an event or policy can also be considered as the meso-scale (A7, Table 1) (Wang et al., 2016b). Meso-level networks depict connections among a group of similar communities (A11, Table 1) (Al-Dohuki et al., 2017).

The analysis at the global scale examines distributions of all spatial entities, times, or trajectories. Spatial distribution of all spatial entities, for example, can be considered as the global scale from the perspective of the spatial dimension (distribution) (A4, Table 1) (Li et al., 2017). The entire study period can be considered as the global scale from a temporal point of view (A8, Table 1) (Wang et al., 2016b). Global-scale network takes into account all the trajectories and their associated places (A12, Table 1) (Huang L et al., 2016). Limiting attention to only one of these dimensions or scales may result in a misguided or partial understanding of urban dynamics.

The framework developed based on Table 1 leads to a general task topology for analyzing the built environment by integrating spatial, temporal, and network distributions at individual, local, meso, and global scales. It allows the behavior of a dynamic system to be reconstructed from a group of analysis units. The key aspect of this framework is to integrate the three dimensions of urban trajectory dataset in a four-scale environment. In total, 64 possible combinations of space and time, space and network, time and network can be derived from the task design.
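One reading that is consistent with the count of 64 and with the three-unit examples in Table 2 is that each combination selects one scale per dimension, giving 4 × 4 × 4 possibilities. The sketch below enumerates the combinations under that assumption; the names `SCALES`, `DIMENSIONS`, `UNITS`, and `COMBINATIONS` are illustrative and do not come from the framework itself.

```python
from itertools import product

# Scales and dimensions as laid out in Table 1.
SCALES = ["individual", "local", "meso", "global"]
DIMENSIONS = ["space", "time", "network"]

# Label the 12 analysis units as in Table 1:
# A1-A4 for space, A5-A8 for time, A9-A12 for network.
UNITS = {
    dim: [f"A{d * len(SCALES) + s + 1}" for s in range(len(SCALES))]
    for d, dim in enumerate(DIMENSIONS)
}

# Assumption: a combination picks exactly one unit from each dimension.
COMBINATIONS = list(product(UNITS["space"], UNITS["time"], UNITS["network"]))

print(len(COMBINATIONS))                    # number of space-time-network triples
print(("A1", "A5", "A10") in COMBINATIONS)  # example 1 in Table 2
```

Under this reading, every example in Table 2 is one element of `COMBINATIONS`, and an analyst customizes the framework by choosing which triple fits the question at hand.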

Spatiotemporal movement patterns at the individual level can adopt some methods from time geography, such as space–time prisms and choice set of individual’s activities (Chen and Kwan, 2012; Kuijpers et al., 2010; Miller, 2005). At the local scale, we can examine how a focal agent interacts with others (Winter and Raubal, 2006), and whether two moving objects have physically met (Kuijpers et al., 2011). The meso-scale analysis includes clustering and generalization of trajectories toward studying the community structures and social proximity among places (Andrienko et al., 2011; Guo et al., 2012a; Murray et al., 2012). The global scale considers the overall pattern, such as the social interaction potential in a city and predictions of future movements (Farber et al., 2012; Horner et al., 2012; Song et al., 2010).

Case studies

Study area

Trip flows reflect human mobility patterns and social flows (Andris, 2016). These flows can be captured by various sources including GPS traces (e.g. Liu et al., 2010), pedestrian activities (e.g. Girardin et al., 2008), mobile phone records (e.g. Woodard et al., 2017), and subway cards (e.g. Lathia and Capra, 2011). In the following section, we present case studies in Beijing and Chongqing, two large metropolitan areas in China, to showcase the proposed conceptual framework. Cell phone and taxi trajectory data are used for Chongqing and Beijing respectively to derive trip flows.

Methodology

Data source

Cell phone signaling data. The mobile phone signaling data come from the GSM (Global System for Mobile Communications) network operated by China Unicom, which holds approximately 20% of the market share in China. The data used in this paper were collected from the A and E interfaces of the GSM network. A record is generated when a mobile device connects to the cellular network in the following instances:

– when a call is placed or received (both at the beginning and end of a call);

– when a short message is sent or received;

– when a location update occurs;

– when a handover occurs.

Every record includes seven fields: the IMSI (International Mobile Subscriber Identity), timestamp, location area code, base station ID, MSC (Mobile Switching Center) ID, BSC (Base Station Controller) ID, and event type (random location update, periodic location update procedure, connect management service request, paging response, BSC handover, and so on).

The data used in the case study of Chongqing were generated by 4.7 million mobile phones over one month in 2013. The dataset consists of about 140 million records per day, which is an average of about 30 records per device per day, and covers the 38 counties of Chongqing, with a total area of 82,400 km².

Taxi GPS data. Mobile devices equipped with a GPS receiver chip can collect information about the movement of people, vehicles, and other mobile objects. The taxi GPS data used for the case study of Beijing were collected from GPS devices installed in taxis, a currently widely used floating car technology for tracking traffic conditions. According to a report from the Beijing Traffic Bureau, taxis accounted for 12% of Beijing’s total ground traffic (Yuan et al., 2012); the taxi datasets can therefore reflect people’s activities well and reveal the urban functional structure.

Trip flow matrix estimation based on cell phone signaling data

The methodology for trip flow matrix estimation comprises five steps: data pre-processing, trajectory clustering and spurious point filtering, trip identification, commuting identification, and trip flow aggregation.

Data pre-processing. Data pre-processing includes scanning the records one by one to remove those that lack an IMSI number, grouping records by user, and sorting each user’s records in chronological order to obtain the user’s daily activity trajectory.

Base station positioning was adopted here. Phone users are geo-located by combining their current base station ID with the station’s coordinates. In urban areas where base stations are densely distributed, the accuracy of geo-locating is about 200–800 meters. In rural areas where base stations are sparse, the accuracy of geo-locating ranges from a few hundred to a few thousand meters.

A user’s daily trajectory is represented as an ordered sequence $M = \{m_{1}, m_{2}, \dots, m_{n}\}$. Each location measurement $m_{i} \in M$ is characterized by a position $p_{i}$ expressed in latitude and longitude and a timestamp $t_{i}$ (Figure 1).

Figure 1.

Space–time trajectory for user’s activities.

Figure 2.

Extracting space–time stable points from user’s multi-day trajectory.

The entry time $t_{i}^{in}$ for base station $m_{i}$ satisfies $t_{i}^{in} \in (t_{i - 1}, t_{i})$.

The departure time $t_{i}^{out}$ for base station $m_{i}$ satisfies $t_{i}^{out} \in (t_{i}, t_{i + 1})$.

It is assumed that:

$$t_{i}^{in} = t_{i} \qquad (1)$$

$$t_{i}^{out} = t_{i + 1} \qquad (2)$$

Mobile phone signaling records are sparse and irregular, in that users’ displacements (consecutive non-identical locations) are often observed at long intervals. For example, the first location may be observed at 8:00 and the next location an hour or more later. Therefore, we can only approximate when a user enters or leaves a base station, and the calculated entry and departure times carry an error. The error of the entry or departure time lies in $(-T, 0)$, where $T$ is the maximum interval between two consecutive records and the minus sign means that the true entry or departure time is earlier than the calculated value.

The stay time $T_{i}^{stay}$ for each point $m_{i}$ equals $t_{i}^{out} - t_{i}^{in}$. The error of the stay time lies in $(-T, T)$.
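As a minimal sketch of equations (1) and (2), the following Python code turns one user's raw records into points with entry, departure, and stay times. The `Point` class, the `build_points` function, and the `(timestamp, lat, lon)` record layout are illustrative assumptions. The text does not say how the last record of a sequence is handled; here its departure time is set to its own timestamp, which is also an assumption.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Point:
    lat: float
    lon: float
    t_in: float   # entry time in seconds, equation (1)
    t_out: float  # departure time in seconds, equation (2)

    @property
    def stay(self) -> float:
        """Stay time T_i^stay = t_i^out - t_i^in."""
        return self.t_out - self.t_in


def build_points(records: List[Tuple[float, float, float]]) -> List[Point]:
    """Build a trajectory from (timestamp, lat, lon) records of one user."""
    recs = sorted(records)  # chronological order
    points = []
    for k, (t, lat, lon) in enumerate(recs):
        # Equation (2): departure time is the timestamp of the next record.
        # Assumption: the last record has no successor, so its stay is zero.
        t_next = recs[k + 1][0] if k + 1 < len(recs) else t
        points.append(Point(lat, lon, t_in=t, t_out=t_next))
    return points


if __name__ == "__main__":
    demo = [(3600, 29.5, 106.5), (0, 29.6, 106.6), (5400, 29.5, 106.5)]
    for p in build_points(demo):
        print(p.t_in, p.t_out, p.stay)
```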

Trajectory cluster and spurious points filtering. In practice, the base station that provides service for a user is not necessarily the one nearest to the user and may be constantly changing in spite of no actual displacement. This is because the operator often balances call traffic among adjacent towers by allocating a new call (or shifting an ongoing call) to the tower that is handling lower call volumes at that moment. To reduce the number of false displacements, we therefore take the following measures.

We first analyze two consecutive trajectory points. If the linear distance between the two points is less than the threshold, trajectory points $m_{i}$ and $m_{i + 1}$ are fused together:

$$dis(p_{i}, p_{i + 1}) < \Delta S \qquad (3)$$

where $dis(p_{i}, p_{i + 1})$ is the linear distance between positions $p_{i}$ and $p_{i + 1}$, and $\Delta S$ is the spatial threshold, for which the recommended value is 1 km to account for the localization error of base stations.

After this measure, duplicate trajectory points and trajectory points nearby will be merged into one point. The short distance trip (trip distance less than the threshold) will also be eliminated. Therefore, the total number of trips will reduce.

Because the base stations a user connects to may vary over a wide range, the records can give the illusion that the user moved a long distance in a short period of time. To diminish this effect, we then analyze three consecutive trajectory points at a time. If the following three conditions are met simultaneously, trajectory points $m_{i}$, $m_{i + 1}$, and $m_{i + 2}$ are fused together:

$$dis(p_{i}, p_{i + 1}) \geq \Delta S \qquad (4)$$

$$t_{i + 2} - t_{i} < \Delta T \qquad (5)$$

$$dis(p_{i + 2}, p_{i}) < \Delta S \qquad (6)$$

where $\Delta T$ is the temporal threshold, for which a value of fifteen minutes is recommended.

This measure can effectively eliminate false displacements caused by a single base station drift. However, we may lose some quick return trips in this process. That is, when a user makes a short stop at the destination and returns to the origin quickly, the total duration of the trip may be less than the temporal threshold.

When trajectory points are fused, the former point $m_{i}$ is retained and the following point $m_{i + 1}$ is removed. The departure time and stay time of $m_{i}$ are updated accordingly:

$$t_{i}^{out} = t_{i + 1}^{out} \qquad (7)$$

$$T_{i}^{stay} = T_{i}^{stay} + T_{i + 1}^{stay} \qquad (8)$$

After this process, all points left in the sequence are spatially dispersed.
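The two filtering measures can be sketched as follows. This is an illustration under stated assumptions, not the implementation used in the case study: the haversine formula stands in for the "linear distance", the defaults `ds_km=1.0` and `dt_s=900.0` follow the recommended thresholds, and each new point is compared with the most recently retained point. The names `Point`, `dist_km`, `fuse_pairs`, and `fuse_drift` are hypothetical.

```python
import math
from dataclasses import dataclass, replace
from typing import List


@dataclass
class Point:
    lat: float
    lon: float
    t_in: float   # entry time in seconds, equation (1)
    t_out: float  # departure time in seconds, equation (2)


def dist_km(a: Point, b: Point) -> float:
    """Great-circle (haversine) distance in kilometers (assumed distance measure)."""
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dphi = phi2 - phi1
    dlmb = math.radians(b.lon - a.lon)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def fuse_pairs(points: List[Point], ds_km: float = 1.0) -> List[Point]:
    """Equation (3): fuse a point into the retained predecessor if closer than ds_km."""
    fused: List[Point] = []
    for p in points:
        if fused and dist_km(fused[-1], p) < ds_km:
            # Equation (7); with contiguous times the stay time of
            # equation (8) is again t_out - t_in of the retained point.
            fused[-1].t_out = p.t_out
        else:
            fused.append(replace(p))  # copy, so the input is not modified
    return fused


def fuse_drift(points: List[Point], ds_km: float = 1.0, dt_s: float = 900.0) -> List[Point]:
    """Equations (4)-(6): remove a quick jump away from and back to the same place."""
    out = [replace(p) for p in points]
    i = 0
    while i + 2 < len(out):
        a, b, c = out[i], out[i + 1], out[i + 2]
        if (dist_km(a, b) >= ds_km              # (4) apparent long jump
                and c.t_in - a.t_in < dt_s      # (5) within the temporal threshold
                and dist_km(c, a) < ds_km):     # (6) back near the start
            a.t_out = c.t_out                   # keep m_i, extend its departure time
            del out[i + 1:i + 3]                # drop m_{i+1} and m_{i+2}
        else:
            i += 1
    return out
```

A typical use would apply `fuse_pairs` first and `fuse_drift` second to each user's daily trajectory, in the order described above.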

Trip identification. The stop points can be identified based on stay time that is greater than a time threshold, which is determined by the area size of analysis and the time interval between consecutive records.

A user’s trajectory can be cut into several consecutive trips by its stop points $s_{1}, s_{2}, \dots, s_{i}, \dots, s_{n}$, where $s_{i}$ represents a stop point. Each stop point is the destination of the preceding trip as well as the origin of the next trip.

Each trip for a user $u$ can be characterized by an origin $o$, a destination $d$, a departure time $t_{o}$, and an arrival time $t_{d}$ as follows:

$$trip(u, o, d, t_{o}, t_{d}): \quad o = p(s_{i}), \quad d = p(s_{i + 1}), \quad t_{o} = t_{out}(s_{i}), \quad t_{d} = t_{in}(s_{i + 1})$$

In this case study, the basic analysis units are counties, whose areas range from tens to hundreds of square kilometers. We are concerned only with long-distance trips that cross county boundaries, so we can simplify the methodology by calculating the total stay time in each county instead of at each base station. The location measurements within one county can be fused together and their stay times summed to obtain the user’s total stay time in that county. The stay time for county $R$ is calculated as follows:

$$T_{R}^{stay} = \sum_{p_{i} \in R} T_{i}^{stay} \qquad (9)$$

Based on the length of the stay, we can determine whether the user stopped in the county for an activity. If the stay time exceeds the threshold, which was 2 hours in our case, the county is regarded as a stop point and becomes a trip origin or destination. Otherwise, the user is considered to be through traffic or to have made a temporary stop (e.g. for fuel or driver rest).
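The county-level simplification can be sketched as below, assuming each measurement has already been assigned to a county and carries its entry and departure times. The names `Visit`, `Trip`, `county_visits`, and `identify_trips` are illustrative. Two choices here are assumptions rather than statements from the text: consecutive measurements in the same county are grouped into one visit, and a pair of consecutive stops in the same county is not counted as a trip.

```python
from itertools import groupby
from typing import List, NamedTuple, Tuple


class Visit(NamedTuple):
    county: str
    t_in: float
    t_out: float
    stay: float


class Trip(NamedTuple):
    user: str
    origin: str
    destination: str
    t_o: float  # departure time from the origin stop
    t_d: float  # arrival time at the destination stop


def county_visits(points: List[Tuple[str, float, float]]) -> List[Visit]:
    """Group chronological (county, t_in, t_out) measurements into visits.

    The stay time of a visit is the sum of its members' stay times, as in equation (9).
    """
    visits = []
    for county, grp in groupby(points, key=lambda p: p[0]):
        members = list(grp)
        stay = sum(t_out - t_in for _, t_in, t_out in members)
        visits.append(Visit(county, members[0][1], members[-1][2], stay))
    return visits


def identify_trips(user: str, points: List[Tuple[str, float, float]],
                   min_stay_s: float = 2 * 3600) -> List[Trip]:
    """Link consecutive stop points (stay time above the threshold) into trips."""
    stops = [v for v in county_visits(points) if v.stay > min_stay_s]
    return [
        Trip(user, a.county, b.county, a.t_out, b.t_in)
        for a, b in zip(stops, stops[1:])
        if a.county != b.county  # assumption: only cross-county trips are kept
    ]
```

With measurements for counties labeled "A" (3 hours), "B" (30 minutes), and "C" (4.5 hours), only "A" and "C" qualify as stops, and "B" is treated as through traffic on a single trip from "A" to "C".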

Commuting identification

Most people have regular daily activity patterns that can be extracted through long-term observation (González et al., 2008). To extract users’ movement patterns, we first identify candidate home and workplace locations using a Location Stability Index (LSI) similar to that presented by Hao et al. (2010). Commuting trips between a user’s home and workplace can then be identified accordingly (Figure 2).

We first divide each day into 48 time windows of 30 minutes each. For each stop, the first time window that overlaps the stay period $(t_{i}^{in}, t_{i}^{out})$ by more than 50% is chosen as the entry time window, and the last window that overlaps the stay period by more than 50% is chosen as the departure time window. For each time window, we then overlay the user’s stop points from multiple weekdays and calculate the LSI of each stay point using the following formula:

$${LSI}_{i} = \frac{cover(i, r)}{N_{day}} \qquad (10)$$

where $N_{day}$ is the total number of days analyzed and $cover(i, r)$ is the number of stop points covered by the circle centered at $m_{i}$ with radius $r$.

The stay point with the maximum LSI, provided that it is no less than the minimum threshold (a value of 0.6 is suggested), is chosen as the multi-weekday stay point for that time window (Figure 2). Two multi-weekday points in adjacent time windows are fused if their linear distance is less than the spatial threshold, following equations (7) and (8) in this section.
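Equation (10) and the selection rule can be illustrated with a short sketch for a single time window. It assumes that stop points are given in a projected coordinate system with units of kilometers, so that Euclidean distance is adequate, and that a point counts itself in $cover(i, r)$. The radius `r_km` is left as a required argument because the text does not state a value. The function names `lsi` and `stable_point` are hypothetical.

```python
import math
from typing import List, Optional, Tuple

XY = Tuple[float, float]  # projected coordinates in km (assumption)


def lsi(points: List[XY], n_days: int, r_km: float) -> List[float]:
    """Equation (10): share of days on which a stop falls within r_km of each point."""
    return [
        sum(1 for q in points if math.hypot(p[0] - q[0], p[1] - q[1]) <= r_km) / n_days
        for p in points
    ]


def stable_point(points: List[XY], n_days: int, r_km: float,
                 min_lsi: float = 0.6) -> Optional[XY]:
    """Return the stop point with the maximum LSI if it reaches min_lsi, else None."""
    if not points:
        return None
    scores = lsi(points, n_days, r_km)
    best = max(range(len(points)), key=scores.__getitem__)
    return points[best] if scores[best] >= min_lsi else None
```

For example, with five weekdays of stops in one window, four of which lie within the radius of each other, the clustered points reach an LSI of 4/5 and one of them is returned; an isolated stop scores 1/5 and is never selected.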

We consider points with the longest stay time at night (22:00–7:00) as users’ home locations and points with longest stay time in the day (7:00–22:00) as workplaces. The cumulative stay time of home or workplace needs to be more than the minimum threshold (i.e. 2 hours) to be considered as valid.

Trip flow matrix of mobile phone users. All trips are then aggregated by their origin and destination to form a trip flow matrix. For each time window $tw$ , the trip flow between area i and area j can be calculated as follows:

$$OD(i, j, tw) = \sum_{o \in i,\ d \in j,\ t_{o} \in tw} trip(u, o, d, t_{o}, t_{d}) \qquad (11)$$

The spatial statistical unit for trip flows estimated from mobile phone trajectories should not be too small. Because of base station positioning errors, the statistical error increases sharply as the unit size decreases. It is recommended that the statistical unit be no smaller than four square kilometers.
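The aggregation in equation (11) can be sketched as a simple count of trips per origin area, destination area, and time window. The sketch reads the sum as a trip count and assumes that origins and destinations are already expressed as area identifiers and that departure times are in seconds since midnight. The window length `window_s` is a free parameter with an arbitrary default, and the function name `od_matrix` is illustrative.

```python
from collections import Counter
from typing import Iterable, Tuple

# (user, origin area, destination area, departure time, arrival time)
TripRecord = Tuple[str, str, str, float, float]


def od_matrix(trips: Iterable[TripRecord], window_s: float = 3600.0) -> Counter:
    """Count trips by (origin area, destination area, time-window index)."""
    od: Counter = Counter()
    for _user, origin, destination, t_o, _t_d in trips:
        window = int(t_o // window_s)  # window that contains the departure time
        od[(origin, destination, window)] += 1
    return od


if __name__ == "__main__":
    demo = [
        ("u1", "A", "B", 7.5 * 3600, 8.2 * 3600),
        ("u2", "A", "B", 7.9 * 3600, 9.0 * 3600),
        ("u3", "B", "A", 18.0 * 3600, 19.0 * 3600),
    ]
    for key, count in sorted(od_matrix(demo).items()):
        print(key, count)
```

A sparse counter keyed by `(i, j, tw)` avoids allocating a dense matrix when most area pairs have no trips in a given window.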

Taxi trajectory extracting and visualization based on taxi GPS data

Taxi trajectory data are also pre-processed given their unstructured nature. The processing steps are as follows:

– Taxis reflect people’s purposeful activities and commuting in a city only when they carry passengers. Therefore, trajectories of taxis carrying passengers are extracted according to the traveling state.

– From the trajectories of taxis carrying passengers, the starting point, arrival point, and time of each trajectory are acquired.

– The starting area, arrival area, and time (in hours) are obtained by plotting each point on the map, identifying the area that contains its coordinates, and binning time by hour.

As a result, this section basically achieves the identification and spatialization of the taxi GPS data-based origin (O) points, destination (D) points, and the corresponding time (T), with a total of 2,253,626 O points and 2,253,551 D points.
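The first two processing steps can be sketched as follows, assuming each GPS fix carries a taxi identifier, a time in seconds since midnight, coordinates, and a Boolean occupancy flag. This record layout is hypothetical, since the text does not list the fields of the taxi data. The origin is taken as the first occupied fix of a run and the destination as the last occupied fix, and a run that is still open when the data end is dropped. Both choices are assumptions. The mapping of points to areas is omitted because it depends on the boundary data used.

```python
from itertools import groupby
from typing import Iterable, List, NamedTuple, Tuple


class Fix(NamedTuple):
    taxi_id: str
    t: float        # seconds since midnight (assumed time format)
    lat: float
    lon: float
    occupied: bool  # True while the taxi carries a passenger


class ODTrip(NamedTuple):
    taxi_id: str
    origin: Tuple[float, float]
    destination: Tuple[float, float]
    hour_o: int
    hour_d: int


def extract_od(fixes: Iterable[Fix]) -> List[ODTrip]:
    """Extract one OD pair per occupied run of each taxi."""
    trips: List[ODTrip] = []
    ordered = sorted(fixes, key=lambda f: (f.taxi_id, f.t))
    for taxi_id, group in groupby(ordered, key=lambda f: f.taxi_id):
        start = last = None
        for f in group:
            if f.occupied:
                if start is None:
                    start = f  # pick-up: first occupied fix of the run
                last = f
            elif start is not None:
                # Drop-off: the run ended at the last occupied fix.
                trips.append(ODTrip(
                    taxi_id,
                    (start.lat, start.lon),
                    (last.lat, last.lon),
                    int(start.t // 3600),
                    int(last.t // 3600),
                ))
                start = last = None
        # A run still open at the end of the data is dropped (no D point observed).
    return trips
```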

Framework implementation

With the pre-processed taxi trajectory and cell phone data, we use the proposed framework to showcase how different combinations of spatiotemporal and network analysis can help investigate human movement patterns. Two different combinations of analysis units were used for the Beijing taxi trajectory data (examples 1 and 2 in Table 2). In example 1, we combine individual-scale space, individual-scale time, and local-scale network. That is, we used individual-level taxi trajectory data and formed them into origin–destination (OD) trips among places. As a result, a network is derived at the local scale that represents how a place is connected to other parts of the city. Figure 3(a) and (b) shows the resulting network of taxi trips going to and leaving from Financial Street in Beijing within a one-hour period on 27 February 2013. The two maps show similar patterns of in- and out-trips in that most people travel between Financial Street and the area to the north of the street. Although the trips were spread throughout the city, Financial Street is more connected by taxi to the northern part of the city than to the southern part. In example 2, we aggregated all the taxi trajectory data of Beijing in 2011 and derived a network at the global level that shows interactions among different places in the city (see Figure 4). This network provides an overview of the extent to which any two areas within the city are connected by taxi and which parts of the city are more socially connected. Vibrant areas that are more connected than others, such as the Zhongguancun business district, Financial Street in Xicheng District, the CBD, and the Wangjing area, can also be identified from the network.

Table 2.

Examples for unit of analysis framework (spatio-temporal dynamic of urban trajectory).

		Examples
		Example 1	Example 2	Example 3
Distributions	Spatio-temporal network	A1 + A5 + A10	A4 + A8 + A12	A4 + A8 + A11

Figure 3.

Taxi trajectories in Financial Street area of Beijing from 7 a.m. to 8 a.m. 27 February 2013.

Figure 4.

Counts of OD trips within Beijing in 2011.

Example 3 uses cell phone data from Chongqing. Cell phone data are aggregated at the county level (example 3 in Table 2) to examine the connectedness of the city. Although similar county-level OD data can be derived from railway or bus records, they are limited to certain transportation modes. Moreover, their high cost makes these data less accessible to researchers. Estimating trip volumes from cell phone data provides an additional channel to examine travel patterns within the city and may also supplement less frequently collected transportation census data. Figure 5(a) shows that counties adjacent to the main city have the largest trip volumes (more than 40 thousand per day) and the highest share (more than 50%). This indicates that these counties have a very close relationship with the main city. The counties of Qijiang and Fulin also have significant numbers of trips related to the main city, but with a lower share (less than 35%). This contrast may result from the fact that these two counties are economically independent from each other. Counties in the northeast or southeast have very few trip flows to or from the main city because of the long distance and inconvenience of travel.

Figure 5.

Trip flow distribution in Chongqing City within one day. (a) Trip flow distribution between the main city and suburban counties. (b) Trip flow distribution between suburban counties.

Figure 5(b) (example 3 in Table 2) takes each county's external trip flow volume (excluding trips related to the main city) as an indicator of its regional dominance. High dominance means strong economic impact, attraction to surrounding counties, and more trips to or from external areas. The results indicate that in the northeast of Chongqing, Wanzhou shows very significant regional dominance and can be taken as the regional center. Yongchuan and Fulin, similarly, can be considered the regional centers to the west and east of the main city, respectively. In the southeast, however, there is no significant regional center, even though Qianjiang was defined as the regional center in the previous master plan. Jiangjin, Bishan, and Hechuan have lost their regional dominance and no longer need to be defined as regional centers. Instead, they have become integrated parts of the main city.
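The dominance indicator described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: dominance is taken to be the sum of a county's inbound and outbound trips with other counties, skipping any trip that touches the main city. The county labels, trip counts, and the function name `regional_dominance` are hypothetical.

```python
# Hypothetical daily OD flows: (origin, destination) -> number of trips.
# "Main" stands for the main city; W, X, Y, Z are made-up counties.
FLOWS = {
    ("W", "X"): 5000, ("X", "W"): 4000,
    ("W", "Y"): 6000, ("Y", "W"): 5500,
    ("X", "Y"): 500,
    ("W", "Main"): 2000, ("Main", "X"): 30000,
    ("Z", "Main"): 25000,
}


def regional_dominance(flows, main="Main"):
    """External trip volume per county, ignoring trips that touch the main city."""
    dominance = {}
    for (origin, dest), trips in flows.items():
        # Register every county, so one with only main-city trips scores 0.
        for zone in (origin, dest):
            if zone != main:
                dominance.setdefault(zone, 0)
        if origin == dest or main in (origin, dest):
            continue
        dominance[origin] += trips
        dominance[dest] += trips
    return dominance


dominance = regional_dominance(FLOWS)
ranked = sorted(dominance.items(), key=lambda item: item[1], reverse=True)
for county, volume in ranked:
    print(county, volume)
```

In this toy input, W plays the role of a regional center, whereas Z exchanges trips only with the main city and therefore scores zero, loosely analogous to a county that has become an integrated part of the main city.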

As discussed above, cellular phone and taxi trajectory data reflect human activity and can therefore be used to form networks of human movement patterns. In our case studies, we extracted human movement patterns and incorporated them into the analysis of the urban built environment through an integrated spatiotemporal network analysis framework. The results suggest that the social network of activities can help us understand urban functions (e.g. areas of high traffic demand and dominant regions) at different scales. Such an understanding can inform better planning strategies.

Summary

With the fast growth of urban trajectory data, new opportunities are emerging for researchers to study the urban built environment through real-time human activities. Nevertheless, how datasets of unprecedented breadth and depth may facilitate studies of the built environment and further support effective urban planning requires new spatiotemporally explicit frameworks and methods that have not been fully addressed in current scholarship. This paper aims to bridge this gap by suggesting a framework that considers different spatiotemporal and network scales through the notion of analysis units. By combining different analysis units, researchers can answer questions about the built environment by exploring and comparing the spatiotemporal patterns of human activities and interactions across various scales and dimensions. The case studies indicate how such combinations can help answer questions about the spatiotemporal social network dimension using units of mixed scales. First, the presented examples demonstrate the advantage of analysis units: they provide both flexibility for analyzing data with different resolutions and a comprehensive understanding of urban dynamics at various scales. Researchers can adopt similar approaches and develop their own combinations of units according to the purpose of their analysis.

Secondly, the proposed framework has its roots in spatiotemporal analysis and network analysis. Researchers can combine methods from these two fast-growing domains and develop integrated approaches to address their research needs. Thirdly, the framework of multi-scale spatiotemporal network analysis encourages broader thinking about the role of dimensions and scales at different stages of urban dynamics, which supports more in-depth study. In other words, the current work is mainly exploratory, and it can motivate urban scientists to design a series of tasks and formulate new hypotheses from theoretical and policy perspectives. This space–time work contributes to the current spatial science and urban studies literature, which lacks frameworks for integrated spatiotemporal network analysis. Although the proposed framework arose from the analysis of human activities and interactions, it can also be applied to a wide set of socioeconomic processes with geo-referenced data measured over time.

This paper notes that multi-scale and multi-dimensional methods can expose hidden patterns and trends that would otherwise be very difficult to detect. This research presents a general framework for pattern discovery and hypothesis exploration in urban trajectory datasets. On this basis, the framework and a specific domain can benefit from each other through the following procedure. First, the analyst has a specific reason for investigating issues related to the urban built environment, which can be expressed as a general question or a set of general questions. Second, the nature of the investigation is checked against the task topology of the dataset. Third, the analyst carries out the matched tasks and detects something both interesting and relevant to the investigation. Fourth, new and more specific questions might appear, motivating the analyst to look for more details. These questions affect which details will be viewed and in what ways. Lastly, the general questions from the first step are revised and the investigator goes through the procedure again. In this way, explanations of various urban dynamics can be grounded in rigorous analysis, and policy interventions can then be proposed in light of the understanding gained from the space–time-network dataset, which will open up a rich empirical context for the social sciences (Ye et al., 2016).

Footnotes

Acknowledgements

The authors acknowledge the contribution of Dr. Jingyuan Wang and Dr. Jianhui Lai for providing necessary data support.

Authors' note

Zhenjiang Shen is also affiliated with Fuzhou University, China.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The research study was supported by the National Natural Science Foundation of China (No. 41501181).

Miaoyi Li is a PhD candidate in the School of Environmental Design, Kanazawa University. His research interests include spatial/urban planning, urban big data analysis, geospatial analysis and geo-simulations. He served as a Researcher in the Information Center of Tsinghua Tongheng Planning and Design Institute (THUPDI), which is affiliated with the School of Architecture, Tsinghua University. He received his Master's degree in 2012 and was awarded the outstanding Master's graduate honor for his performance on his Master's dissertation. During his three-year Master's course, he published five research papers in top Chinese journals.

Xinyue Ye is an associate professor of Geography, Kent State University, and a visiting professor in the Center for Geographical Analysis, Harvard University. Dr. Ye's major expertise is in modeling the geographical perspective of socioeconomic inequality and human dynamics. He develops and implements new methods for spatiotemporal-social network analysis/modeling/simulation for different application domains such as economic development, disaster response, land use, public health, and urban crime. His work won the national first-place award of research and analysis from the University Economic Development Association in 2011 and received the emerging scholar award from AAG's Regional Development and Planning Specialty Group in 2012.

Shanqi Zhang is a PhD candidate at Dept. of Geography and Environment Management, University of Waterloo, Canada. Her research focuses broadly on the use of spatial information technology to investigate human activity patterns and city dynamics, data mining techniques to understand public perception from geosocial media, and the role of Web 2.0 and location-aware social media in improving government-citizen interaction.

Xiaoyong Tang is a postdoctoral candidate at the School of Traffic & Transportation at Chongqing Jiaotong University, and also a senior engineer in transportation planning at Chongqing Transport Planning Institute. His research interests are the application of mobile phone data to topics in urban planning such as trip characteristics and OD flow extraction, land use planning and layout planning of transportation infrastructures. His first PhD degree was in Transportation Engineering at Southeast University of China. Contact him at Yanghe 2 Rd. 18th, Jiangbei District, Chongqing, P.R.C; tangxiaoyong001@163.com.

Zhenjiang Shen is a professor of Environmental Design, Kanazawa University and a Visiting Professor of Architecture, Fuzhou University. His research interests include policy-making support systems for planning and design using GIS & VR. He served as commissioner of the Chugoku Branch of the Architectural Institute of Japan and as planning advisor in local cities in Japan such as Nanao city and Kanazawa city, and developed on-line design tools for enhancing public participation. Dr. Shen also participated in spatial strategic planning of local cities in China, and cooperated with the Beijing Municipal Commission of Urban Planning on metropolitan growth simulation. He is a commission member of the Commission on Geospatial Analysis and Modeling of the International Cartographic Association (ICA) and of the City Planning Institute of Japan (CPIJ), and also works as a joint member of Fudan University and PhD instructor at Tsinghua University, China. Dr. Shen is Editor-in-chief of IRSPSD International (Indexed in SCOPUS), Managing Editor of IJSSoc (Indexed in SCOPUS) and IJSSS (EI/Inspec), and is organizing an International Community on Spatial Planning and Sustainable Development.

References

Agryzkov T, Martí P, Tortosa L, et al. (2016) Measuring urban activities using Foursquare data and network analysis: A case study of Murcia (Spain). International Journal of Geographical Information Science 8816(May): 1–22. http://doi.org/10.1080/13658816.2016.1188931.

Al-Dohuki, Kamw, Zhao, et al. (2017) SemanticTraj: A new approach to interacting with massive taxi trajectories. IEEE Transactions on Visualization and Computer Graphics 23(1): 11–20.

Andrienko, Keim, et al. (2011) Editorial: Challenging problems of geospatial visual analytics. Journal of Visual Languages and Computing 22(4): 251–256. DOI: 10.1016/j.jvlc.2011.04.001.

Andris (2016) Integrating social network data into GISystems. International Journal of Geographical Information Science 8816(March): 1–23. http://doi.org/10.1080/13658816.2016.1153103.

Chen and Kwan (2012) Choice set formation with multiple flexible activities under space–time constraints. International Journal of Geographical Information Science 26(5): 941–961. DOI: 10.1080/13658816.2011.624520.

Cranshaw J, Hong JI and Sadeh N (2012) The livehoods project: Utilizing social media to understand the dynamics of a city. In: Proceedings of the sixth international AAAI conference on weblogs and social media, pp.58–65.

Dempwolf and Lyles (2012) The uses of social network analysis in planning: A review of the literature. Journal of Planning Literature 27(1): 3–21. DOI: 10.1177/0885412211411092.

Farber, Neutens, Miller, et al. (2012) The social interaction potential of metropolitan regions: A time-geographic measurement approach using joint accessibility. Annals of the Association of American Geographers 103(3): 483–504. DOI: 10.1080/00045608.2012.689238.

Girardin et al. (2008) Digital footprinting: Uncovering tourists with user-generated content. IEEE Pervasive Computing 7(4): 36–43. DOI: 10.1109/MPRV.2008.71.

Gong L, Liu X, Wu L, et al. (2015) Inferring trip purposes and uncovering travel patterns from taxi trajectory data. Cartography and Geographic Information Science 406(June): 1–12. Available at: http://doi.org/10.1080/15230406.2015.1014424.

González, Hidalgo and Barabási (2008) Understanding individual human mobility patterns. Nature 453(7196): 779–782.

Guo, Zhu, Jin, et al. (2012a) Discovering spatial patterns in origin-destination mobility data. Transactions in GIS 16(3): 411–429. DOI: 10.1111/j.1467-9671.2012.01344.x.

Handy, Boarnet, Ewing, et al. (2002) How the built environment affects physical activity: Views from urban planning. American Journal of Preventive Medicine 23(2 Suppl. 1): 64–73. http://doi.org/10.1016/S0749-3797(02)00475-0.

Harvey C and Aultman-Hall L (2016) Measuring urban streetscapes for livability: A review of approaches. The Professional Geographer 68(1): 149–158. http://doi.org/10.1080/00330124.2015.1065546.

Hao T, Ma XJ, Han W, et al. (2010) A novel approach to estimate human space-time path based on mobile phone call records. In: 18th international conference on geoinformatics, Beijing, China, IEEE, pp.1–6.

Hollenstein L and Purves R (2010) Exploring place through user-generated content: Using Flickr to describe city cores. Journal of Spatial Information Science 1(1): 21–48. http://doi.org/10.5311/JOSIS.2010.1.3.

Horner, Zook and Downs (2012) Where were you? Development of a time-geographic approach for activity destination re-construction. Computers, Environment and Urban Systems 36(6): 488–499. DOI: 10.1016/j.compenvurbsys.2012.06.002.

Huang and Wong DWS (2016) Activity patterns, socioeconomic status and urban spatial structure: What can social media data tell us? International Journal of Geographical Information Science 8816(March): 1–26. http://doi.org/10.1080/13658816.2016.1145225.

Huang, Zhao, Yang, et al. (2016) TrajGraph: A graph-based visual analytics approach to studying urban network centralities using taxi trajectory data. IEEE Transactions on Visualization and Computer Graphics 22(1): 160–169.

Huang, Zhu, et al. (2016) Characterizing street hierarchies through network analysis and large-scale taxi traffic flow: A case study of Wuhan, China. Environment and Planning B: Planning and Design 43(2): 276–296.

Kuijpers, Grimson and Othman (2011) An analytic solution to the alibi query in the space–time prisms model for moving object data. International Journal of Geographical Information Science 25(2): 293–322. DOI: 10.1080/13658810902967397.

Kuijpers, Miller, Neutens, et al. (2010) Anchor uncertainty and space-time prisms on road networks. International Journal of Geographical Information Science 24(8): 1223–1248. DOI: 10.1080/13658810903321339.

Lathia N and Capra L (2011) How smart is your smartcard? Measuring travel behaviours, perceptions, and incentives. In: Proceedings of the 13th international conference on ubiquitous computing, UbiComp’11, Beijing, China, 17–21 September, pp.291–300.

Lee (1968) Urban neighbourhood as a socio-spatial schema. Human Relations 21(3): 241–267.

Li, Dong, Shen, et al. (2017) Examining the interaction of taxi ridership and subway for sustainable urbanization. Sustainability 9(2): 242.

Liu, Andris and Ratti (2010) Uncovering cabdrivers’ behavior patterns from their digital traces. Computers, Environment and Urban Systems 34(6): 541–548. DOI: 10.1016/j.compenvurbsys.2010.07.004.

Liu Y, Sui Z, Kang C, et al. (2014) Uncovering patterns of inter-urban trip and spatial interaction from social media check-in data. PLoS One 9(1): e86026. http://doi.org/10.1371/journal.pone.0086026.

Luo F, Cao G, Mulligan K, et al. (2015) Explore spatiotemporal and demographic characteristics of human mobility via Twitter: A case study of Chicago. Applied Geography 70: 11–25. http://doi.org/10.1016/j.apgeog.2016.03.001.

Luo W and MacEachren AM (2014) Geo-social visual analytics. Journal of Spatial Information Science 8(8): 27–66. http://doi.org/10.5311/JOSIS.2014.8.139.

Miller (2005) A measurement theory for time geography. Geographical Analysis 37(1): 17–45. DOI: 10.1111/j.1538-4632.2005.00575.x.

Murray, Liu, Rey, et al. (2012) Exploring movement object patterns. The Annals of Regional Science 49(2): 471–484. DOI: 10.1007/s00168-011-0459-z.

Newman MEJ (2003) The structure and function of complex networks. Society for Industrial and Applied Mathematics (SIAM) Review 45(2): 167–256.

Robertson C and Feick R (2015) Bumps and bruises in the digital skins of cities: Unevenly distributed user-generated content across US urban areas. Cartography and Geographic Information Science 43(4): 283–300.

She and Duque (2016) The Network-Max-P-Regions Model. International Journal of Geographical Information Science 31(5): 962–981.

Song, Blumm, et al. (2010) Limits of predictability in human mobility. Science 327(5968): 1018–1021.

Spielman and Thill (2008) Social area analysis, data mining, and GIS. Computers, Environment and Urban Systems 32(2): 110–122.

Troped, Wilson, Matthews, et al. (2010) The built environment and location-based physical activity. American Journal of Preventive Medicine 38(4): 429–438.

Wang, Jiang, Liu, et al. (2016a) Evaluating trade areas using social media data with a calibrated Huff model. ISPRS International Journal of Geo-Information 5(7): 112.

Wang and Tsou (2016b) Spatial, temporal, and content analysis of Twitter for wildfire hazards. Natural Hazards 83(1): 523–540.

Winter S and Raubal M (2006) Time geography for Ad Hoc shared-ride trip planning. In: Proceedings of the 7th international conference on mobile data management, Washington.

Woodard, Nogin, Koch, et al. (2017) Predicting travel time reliability using mobile phone GPS data. Transportation Research Part C: Emerging Technologies 75: 30–44.

Yao X and Zhang S (2014) Social-spatial structure of Beijing: A spatial-temporal analysis. International Journal of Society Systems Science 6(1): 18–33.

Ye, Huang, et al. (2016) Integrating big social data, computing, and modeling for spatial social science. Cartography and Geographic Information Science 43(5): 377–378.

Ye and Rey (2013) A framework for exploratory space-time analysis of economic data. Annals of Regional Science 50(1): 315–339.

Yuan J, Zheng Y and Xie X (2012) Discovering regions of different functions in a city using human mobility and POIs. In: Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining – KDD '12, Beijing, China, p.186. http://doi.org/10.1145/2339530.2339561.

Zhang, Zhu, Guo, et al. (2016) Analyzing urban human mobility patterns through thematic model at the finer scale. ISPRS International Journal of Geo-Information 5(6): 78. DOI: 10.3390/ijgi5060078.

Zhao, Qin, et al. (2017) A trajectory clustering approach based on decision graph and data field for detecting hotspots. International Journal of Geographical Information Science 31(6): 1101–1127.