Abstract
The improved temporal and spatial granularity of data now available from current information technologies offers an opportunity to study previously unexplored dimensions of the relationship between the built environment and social outcomes. Within the field of urban studies, an old question worth revisiting with these new technologies is how best to trace the spatial boundaries that circumscribe a place or location to explore non-work activity. In this study, we explore a data-driven definition of places as units of analysis that can be used to explore non-work activity in Singapore. Such a definition of place characterizes an urban space in terms of its concentration of activity and the topology of the built environment, features that are especially important to urban planners. We utilize available smartphone data to develop a systematic framework to identify locations with a concentrated human presence. Using a cylinder moving over a grid representing Singapore, we scan aggregated smartphone locational requests (by time and cell), identifying areas with atypically high concentrations at a given time. Our tool identified 93 places with a concentrated human presence. Direct observation of six of these places at the selected times, in conjunction with additional transportation and population data, indicated that the topology of commercial establishments provided a strong approximation of non-work activity at a given time and place. Having established the relevance of commercial establishments in approximating non-work activity, we then used points-of-interest data within the 93 derived places to propose a typology of commercial patches based on their spatial configuration. Nine metrics of the geometry and topology of patches of establishments, such as compactness and their dependence on proximity to shopping malls, were developed. These combined variables revealed more temporal and spatial variety within locations than had previously been recognized.
The most popular places for non-work activity were densely configured with various commercial sub-spaces or patches appealing to different lifestyles and income groups. This study suggests that a location/place can be best defined as a highly detailed, multi-faceted, and always evolving area of activity rather than as a fixed location with temporal and unmovable boundaries. Suggesting a dynamic redefinition of location/place that builds on other recent work, this work offers potential contributions to locational models for non-work activity.
Keywords
Introduction
Motivation
It is the actions of individuals as they adapt to the wider urban context and to each other that shape and define a city. The nature of these actions is locational because they entail many individuals making spatial choices every day and at various time scales. For instance, people make quick decisions about which route to take to work, or where to go for shopping or dining. Their long-term decisions involve things like where to work or which neighborhood to live in. Both types of choices have a temporal and spatial dimension because they occur at a given time and within a specific territory. Michael Batty (2018) defines cities not as fixed locations, but, rather, as the outcome of flowing patterns of movement and interactions. In his view, a city is a complex system within which locations emerge as temporal and spatial outcomes from such flows, interactions, and the aggregation of individual locational choices made by people. A place, thus, is not static, but, rather, emerges as the product of individuals’ interactions with each other and with the urban space.
As a result, places are not fixed, and the boundaries of locations should evolve in response to their associated daily interactions and adaptations to human activity. Nevertheless, defining locations or places as the outcome of evolving flows poses the problem of how to document the more permanent geographical boundaries of a place or location. Given that “place” or “location” are the typical units of analysis used when tracking neighborhood or zonal change, the definitions and boundaries of these terms lie at the core of the disciplines of geography and urban planning. However, definitions of the boundaries of “place” and “location” generally tend to be arbitrary or to rely on fixed administrative units such as census tracts or traffic analysis zones. Thus, two questions emerge: (1) How can we systematically trace such boundaries of place or location from real data that reflects human interactions, flows of people, and observed locational choices? and (2) How can we build a unit of analysis that has flexible boundaries to explore non-work activity across space–time scales?
Literature review
There has been a growing interest within transportation studies in creating a model to represent the behavioral process through which individuals select non-work destinations, especially for shopping or leisure trips (Huang and Levinson, 2015). People form mental maps of the city, that is, abstract representations of the geographical boundaries of a “location” or “place”. These boundaries can evolve throughout a single day and/or across several days to accommodate new information and previous knowledge or experience of ongoing activity in space. The problem arises when we move from clearly defined “locations” with strict boundaries, such as a workplace or home, to more ambiguously defined places associated with activities such as leisure or shopping, which require us to model how a traveler considers multiple places and then chooses a destination.
Many studies have sought to explore urban form and function using big data combined with machine learning techniques. These studies can be divided into three groups: (1) methods that infer form and function in the urban context from social media data and/or points of interest (POI) (Gao et al., 2017; Hu et al., 2020; Li et al., 2016; Zhai et al., 2019; Zhi et al., 2016; Zhong et al., 2018); (2) methods that discover urban functional regions through Call Detail Records (CDR) or GPS trajectories (Yuan et al., 2015; Zhang et al., 2016); and (3) methods that use network science to discover spatial interaction communities from CDR and transit records (Gao et al., 2013; Xia et al., 2020; Zhong et al., 2014). Several of these studies rely on Natural Language Processing methods such as Latent Dirichlet Allocation (LDA) to discover form and function. This body of work typically operates at the scale of the metropolitan region, with the aim of identifying sub-centers of activity.
Our work differs from previous research in two ways. First, we are interested in the spatial scale that captures daily non-work activity, which typically occurs at the level of zones or neighborhoods. Our work does not identify the larger sub-centers of activity, which relate more to explorations of urban form. Second, given our interest in a finer geographical scale, it is critical to avoid relying on pre-existing administrative units or creating arbitrary units of analysis. We need a unit of analysis that emerges from the activity itself and that operates at the street level or at the scale of shopping malls. Ours is a bottom-up approach that considers the density of activity and the density of establishments to identify such places and to characterize them at a finer geographical scale, one that is suitable for exploring people’s locational choices for non-work activity.
Data
Call detail records
The SpotRank dataset we use summarizes the location of smartphone requests during 1 January to 31 July 2012, reporting them as a space–time score of the number of requests for 100 × 100 meter grid cells during each hour of the week. The dataset aggregates the requests by time and place of occurrence, reporting the score by time slot and cell grid. SpotRank normalizes the number of requests by the total number in the metropolitan area, reporting them as a score. The dataset contains data from all the users with smart phones (and locational services) containing software provided by the Skyhook company, which collected and owns the data. During this time (2012), Skyhook provided location-finding services for Samsung and Apple, which together had the largest market share of smart phones. The data are reported as a spatial grid with temporal resolution. The dataset contains 168 scores for each 100 × 100 meter cell, representing the average smart phone activity for a particular place and time during the six months of data collection. Skyhook averaged the number of requests for the six-month period, generating the 0–9 index for the time slots for each week. Thus, the score ranges from 0 (“no requests”) to 9 (“many requests”). For instance, the score for time slot Monday 9–10 a.m. for cell
Data of points of interest
The POI database from Google Place, accessed through the public API between June 2016 and April 2017, is the source of the data for our characterization of places. It is a list of establishments that includes their latitude and longitude and a category indicating the type of POI. The focus of our analysis is on commercial establishments: shops, services, retail, and dining. We exclude offices, places of worship, and tourist attractions. We are aware that the configuration of retail in the city may have changed between 2012, when the SpotRank data were collected, and 2016, when the Google Place records were accessed. However, such changes are likely to be limited at the already well-established, major non-work destinations on the island.
Case study
Our case study is Singapore. This city-state offers an exceptional combination of a polycentric structure (Zhong et al., 2014), easily accessible transportation and mixed land use (Haila, 2015; Soh and Yuen, 2011), and a high income and diversity of lifestyles (Singapore Department of Statistics, 2013). In summary, Singapore offers activities across the island, these activities are accessible from anywhere on the island, and people have enough disposable income to enjoy even the activities offered in outlying areas. The SpotRank grid for Singapore contains 30,649 cells, each with 168 time slots, one for each hour of the week (24 hours × 7 days). Supp Figure 1 shows the SpotRank grid for the city.
Methodology
Detection of space–time clusters with peaks of activity
A space–time scan statistic based on a normal probability model serves to detect the places and times of greatest activity (Kulldorff et al., 2009). This tool used the SpotRank score as the input measure to identify peaks occurring at specific times and locations on the grid, enclosing the geometry and temporal frame of such high occurrence. The method used a cylindrical window scanning in three dimensions: longitude, latitude, and time (Kulldorff, 2010). A moving cylinder tested inputs at different radiuses and heights to detect unusually high SpotRank scores on the grid. The radius varied continuously from zero to an upper limit of 1 km, which we set as the threshold because we wanted to characterize places at the scale of neighborhoods, and 1 km seemed a reasonable maximum distance for walking between places in Singapore. We also set the temporal window for scanning to two-hour intervals. A sensitivity analysis tested temporal thresholds from one to five hours. The larger the time frame, the fewer the clusters and the larger their radius. The difference in the results between one and three hours was negligible; we ultimately selected a value of two hours because it is a reasonable time frame to bound a non-work activity. The technique calculates a log-likelihood ratio test based on a likelihood computed under the null hypothesis (of no spatial–temporal clustering) and under an alternative hypothesis. The tool examines several radiuses and heights for the cylinder, computing a ratio test for each combination. Under the null hypothesis, all observations come from the same distribution. Under the alternative, there is one cluster location at cylinder z where the observations have either a larger or a smaller mean than outside of the cluster.
The log-likelihood under the null hypothesis is as follows
N is the number of cells in the grid;
Under the null hypothesis, the maximum likelihood estimates of the mean and variance are
The statistics in the null hypothesis depend on the values of the entire grid. In contrast, with the alternative, the log-likelihood is specific to each cylinder z tested by the scanning algorithm
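As a hedged sketch, the expressions referred to in this subsection can be written in the form used for the normal-model scan statistic (Kulldorff et al., 2009). The notation is an assumption introduced here for illustration: x_i is the score of observation i, X is the sum of all scores, n_z and X_z are the number of observations and the sum of scores inside cylinder z, and λ_z is the mean outside the cylinder.

```latex
% Null hypothesis: one common mean and variance for the whole grid
\ln L_0 = -N \ln\sqrt{2\pi} \;-\; N \ln\sigma \;-\; \sum_{i=1}^{N} \frac{(x_i - \mu)^2}{2\sigma^2},
\qquad
\mu = \frac{X}{N},
\qquad
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2 .

% Alternative hypothesis: separate means inside and outside cylinder z, common variance
\mu_z = \frac{X_z}{n_z},
\qquad
\lambda_z = \frac{X - X_z}{N - n_z},
\qquad
\sigma_z^2 = \frac{1}{N}\left[\sum_{i \in z} (x_i - \mu_z)^2 + \sum_{i \notin z} (x_i - \lambda_z)^2\right],

\ln L(z) = -N \ln\sqrt{2\pi} \;-\; N \ln\sqrt{\sigma_z^2} \;-\; \frac{N}{2} .
```

Evaluated at the maximum likelihood estimates, the last term of ln L_0 also reduces to N/2, so the ratio between the two hypotheses depends only on how much the within-group variance σ_z² falls below the overall variance σ². The most likely cluster is therefore the cylinder that minimizes σ_z².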
The method uses Monte Carlo permutation of the data to run a significance test. It randomly permutes the observed values across the grid, generating random datasets. For each of these random datasets, the log-likelihood ln L(z) is calculated for each potential cylinder, identifying the most likely cluster and recording its log-likelihood ratio. Then, the log-likelihood ratio of the most likely cluster in the observed data is compared with those from the random datasets to compute the resulting significance level (Kulldorff et al., 2009).
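The permutation test can be illustrated with a minimal sketch. It is a simplification under stated assumptions, not the software used in the study: the grid is reduced to a one-dimensional row of cells, the candidate "cylinders" are fixed-length sliding windows, and the function names, toy scores, and the choice of 199 replications are all hypothetical.

```python
import math
import random


def log_likelihood(n, variance):
    """Normal log-likelihood evaluated at its maximum-likelihood estimates."""
    variance = max(variance, 1e-12)  # guard against log(0) on degenerate data
    return (-n * math.log(math.sqrt(2 * math.pi))
            - n * math.log(math.sqrt(variance)) - n / 2)


def llr_for_window(values, window):
    """Log-likelihood ratio ln L(z) - ln L0 for one window (a set of indices)."""
    n = len(values)
    inside = [values[i] for i in window]
    outside = [v for i, v in enumerate(values) if i not in window]
    if not inside or not outside:
        return 0.0
    mean_all = sum(values) / n
    mean_in = sum(inside) / len(inside)
    mean_out = sum(outside) / len(outside)
    if mean_in <= mean_out:  # assumption: only high-activity clusters are of interest
        return 0.0
    var_null = sum((v - mean_all) ** 2 for v in values) / n
    var_alt = (sum((v - mean_in) ** 2 for v in inside)
               + sum((v - mean_out) ** 2 for v in outside)) / n
    return log_likelihood(n, var_alt) - log_likelihood(n, var_null)


def most_likely_cluster(values, windows):
    """Return (largest log-likelihood ratio, window that attains it)."""
    return max(((llr_for_window(values, w), w) for w in windows),
               key=lambda pair: pair[0])


def monte_carlo_p_value(values, windows, n_sims=199, seed=0):
    """Rank the observed maximum ratio among ratios from shuffled data."""
    rng = random.Random(seed)
    observed = most_likely_cluster(values, windows)[0]
    shuffled = list(values)
    exceed = 0
    for _ in range(n_sims):
        rng.shuffle(shuffled)  # permute the scores across cells
        if most_likely_cluster(shuffled, windows)[0] >= observed:
            exceed += 1
    return (exceed + 1) / (n_sims + 1)


# Toy data: background scores between 0 and 3, with a peak of 9s in cells 10-14.
scores = [(i * 7) % 4 for i in range(40)]
for i in range(10, 15):
    scores[i] = 9
windows = [frozenset(range(start, start + 5)) for start in range(36)]

best_llr, best_window = most_likely_cluster(scores, windows)
p_value = monte_carlo_p_value(scores, windows)
```

In this toy setting, the window covering the peak gives the largest ratio, and very few shuffled datasets should match it, so the p-value is small. The study itself scans cylinders over longitude, latitude, and time rather than windows over a single row.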
To conduct the analysis, we split the seven days of SpotRank data into files for each day and ran the algorithm separately on each file. The purpose of analyzing the data by day was to explore the variation within the geographical and temporal boundaries of locations across days, especially when comparing weekends and weekdays. This generated a geometry of circles that enclosed the areas with the peaks of human concentration.
Characterization of places using POI data
We performed a spatial clustering and then a point-pattern analysis of the commercial establishments within the SpotRank places identified through the previous step, including four periods for each place: (1) Monday to Thursday, (2) Friday, (3) Saturday, and (4) Sunday. As the boundaries from Monday to Thursday exhibited a large degree of uniformity, we dissolved them to explore the commercial establishments along only four buffers instead of seven. This spatial clustering enabled us to first characterize the spatial structure of subzones or patches of establishments in order to later characterize them by topology and diversity of shops. The reader can refer to Ponce Lopez (2018) for the technical description about the processing of the data of commercial establishments. The goal of the spatial clustering was to identify the internal spatial structure of subzones within the broader SpotRank places and at the scale of neighborhoods.
The Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm grouped the commercial establishments using the density and location of points (Ester et al., 1996). The intuition behind DBSCAN is that clusters are dense regions in the data space, separated by regions with a lower density of points. Points that fall in these lower-density regions between clusters are treated as outliers or noise (Ester et al., 1996).
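The density-based grouping can be illustrated with a compact, brute-force version of the DBSCAN procedure. This is a sketch for intuition only: the coordinates are hypothetical projected positions, and the values of eps (neighborhood radius) and min_pts (minimum neighbors, counting the point itself) are assumptions, since the parameters used in the study are not reported here.

```python
import math

NOISE = -1


def region_query(points, idx, eps):
    """Indices of all points within eps of points[idx] (including itself)."""
    px, py = points[idx]
    return [j for j, (qx, qy) in enumerate(points)
            if math.hypot(px - qx, py - qy) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # core point: keep expanding
                queue.extend(j_neighbors)
    return labels


# Hypothetical establishments: an L-shaped strip, a compact block, one isolated shop.
shops = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1), (3, 2), (3, 3),
         (10, 10), (10, 11), (11, 10), (11, 11),
         (20, 0)]
labels = dbscan(shops, eps=1.1, min_pts=3)
```

Because clusters grow by chaining dense neighborhoods, the L-shaped strip ends up as a single patch even though its two ends are far apart, while the isolated shop is labeled as noise.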
Figure 1 exemplifies the method by showing the spatial clustering for the SpotRank location of Orchard Road in the buffer from Monday to Thursday. The DBSCAN identified 38 distinct patches of establishments in the interior of the place (color code denotes group membership to a spatial cluster). The gray dots do not belong to any cluster but are isolated points that the algorithm classified as noise. Unlike other methods of spatial clustering, the DBSCAN algorithm can capture irregular geometrical forms of clusters, such as a strip of shops or an “L”-shaped establishment between the intersection of Orchard and Scotts Road.

Figure 1. The 38 clusters resulting from the DBSCAN algorithm for Orchard Road, in the buffer from Monday to Thursday. Gray points indicate outliers or noise. Color-coding indicates group membership.
We ran the DBSCAN algorithm on the entire set of locations for the four temporal frames. Then, we validated the quality of the representation of the internal patches or clusters for the SpotRank places and analyzed the results on the map, one by one, contrasting the resulting patches against our first-hand knowledge of the places. It is notable that in most of the cases, the algorithm accurately identified the boundaries of commercial zones based on the geometrical density of the points representing establishments.
Results and characterization of the SpotRank places
This section presents the results of our analysis and uses a transportation survey to assess the quality of our method in tracing the boundaries of units of analysis enclosing non-work activity. In sum, the method generated a valid geometry of “locations” that not only had the highest concentration of human activity at the scale of areas less than 2 km in diameter, but also were the most popular destinations for daily activities.
Places
The number of clusters or “places” is stable from Tuesday to Friday. Supp Table 1 summarizes the number of places identified on each day of the week, and Supp Figure 2 compares the boundaries of the places on Tuesday and Saturday. Monday has fewer clusters (n = 106), but with larger average radiuses. One possible explanation for this pattern is that, on Mondays, people tend either to go home right after work or to go to places closer to their work or home rather than to other locations, so that activity concentrates around fewer, but larger, areas. The pattern for Saturday shows an increase in the number of identified “places” (n = 145), indicating that on this day people tended to concentrate within more specific and narrower areas than on weekdays or Sunday. Sunday exhibited a very different temporal and spatial pattern from Saturday: people concentrated within fewer clusters with wider areas on Sunday. The peaks of activity on Sunday also contrasted with those of other days, including Saturday, because most of Sunday’s peaks occurred in the morning instead of the afternoon or evening.
Regarding the time frame of activity, the greatest peak in the city occurs between 4 and 9 p.m. on Monday to Saturday. These peaks reflect the combined effect of individuals leaving work and then beginning activities after work. The clusters seen after 9 p.m. primarily correspond to zones with bars, dining, and clubbing options, such as in Bugis or Geylang. By contrast, the clusters seen before 4 p.m. correspond to business areas, such as downtown; these peaks coincide with either the morning rush hour (8–10 a.m.) or with lunchtime (12–2 p.m.).
The boundaries and radiuses of the clusters fluctuate across days. This fluctuation demonstrates that places, or the spatial discretization of the urban space expressed through locations, are not fixed over time. It is not only that urban form changes, but the definition of a place at a finer geographical scale changes as well.
Assessment of the identified SpotRank places as containers of activity
Household Interview Transportation Survey 2012
To assess the quality with which our method identified locations with high concentrations of non-work activity, we compared the boundaries of the resulting clusters against the destinations of trips in a 2012 transportation survey, the Household Interview Transportation Survey (or HITS). The survey results were based on the travel logs of 35,714 individuals during a day of activity, totaling 70,987 trips. Each trip recorded in the HITS lists an origin and a destination by postcode. In Singapore, a postal code corresponds, roughly, to a building. Of the 70,987 trips, 60,765 (85.6%) had a destination falling within the boundaries of the SpotRank clusters for Monday to Friday. About 90% of the trips with a purpose of shopping or dining ended within our “places”. In summary, our method was effective in identifying and drawing the boundaries of places.
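The comparison amounts to a point-in-circle test for each trip destination. The sketch below assumes that destinations and cluster centers are already expressed in a projected coordinate system in meters; the circles, destinations, and function names are hypothetical, and only the two trip totals come from the text.

```python
import math


def inside_any_circle(point, circles):
    """True if the point lies within at least one (center_x, center_y, radius) circle."""
    x, y = point
    return any(math.hypot(x - cx, y - cy) <= r for cx, cy, r in circles)


def coverage_share(destinations, circles):
    """Fraction of trip destinations that fall inside the cluster boundaries."""
    if not destinations:
        return 0.0
    hits = sum(1 for p in destinations if inside_any_circle(p, circles))
    return hits / len(destinations)


# Hypothetical clusters and destinations (meters).
circles = [(0.0, 0.0, 500.0), (2000.0, 0.0, 300.0)]
destinations = [(100.0, 100.0), (2100.0, 50.0), (1000.0, 1000.0), (0.0, 499.0)]
toy_share = coverage_share(destinations, circles)  # 3 of the 4 toy destinations

# Arithmetic check of the share reported in the text.
reported_share = 60765 / 70987
```

Rounding reported_share to one decimal place as a percentage reproduces the 85.6% quoted above.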
A limitation of our method is that “places” capture not only non-work destinations, but also work, residential, and commercial locations with high concentrations of human activity. This indicated the need for additional data to discriminate and characterize non-work destinations with more precision.
Geometry of the resulting SpotRank places
We explore the geometrical and temporal boundaries generated across the seven days, represented by the shift, radius, and height of the cylinders. This allowed us to identify 93 areas of activity on the island, which we call “locations” or “places”. Each of these locations has between one and seven circles that exhibit a certain degree of geometrical replicability across days. Interestingly, the boundaries of these locations adapt to the activity occurring in them and change with human concentration; the temporal boundaries also vary with the timing of the peaks. The most interesting finding is the regularity of spatial and temporal boundaries on weekdays and their irregularity on the weekend: the circles are regular on weekdays but enlarge to cover a wider area on Saturday and/or Sunday. Figure 2 compares the weekday versus weekend boundaries of three neighborhoods in Singapore: Jurong East, Kovan, and Yishun. The orange color shows the boundary of the circles from Monday to Thursday; the green color illustrates the cluster on Friday; the purple color corresponds to the boundaries on the weekend.

Figure 2. Daily clusters from SpotRank for Jurong East (left), Kovan (center), and Yishun (right).
In Jurong East (left pane), the boundaries of the clusters are the same from Monday to Saturday; the cluster expands and shifts northwest on Sunday. The expansion and shift of the Sunday cluster is even more dramatic for Kovan (center pane), whose Sunday circle enlarges to encompass an area of high-density social housing. Finally, the clusters from Monday to Thursday are identical in Yishun (right pane), while the cluster shrinks to a smaller area on Friday. The zone to which this place shrinks is a commercial mall and retail area outside a train station. The clusters at Yishun on Saturday and Sunday expand to cover a broader space of high-density social housing with the presence of street markets on the weekends.
The degree of regularity in the geometry of the 93 SpotRank clusters across days reflects the underlying activity and land use that is prevalent within such closely circumscribed spaces. For instance, the highest degree of regularity in the boundaries of the cylinders was found in high-density residential locations with mixed retail on the ground floor.
The boundaries of the places emerge from activities without relying on pre-existing administrative units from a census. Although it is not possible to access the mental maps that each individual makes to approximate her own definition of place, aggregate data from smartphones offer a good approximation of it and enable us to define the boundaries of places according to human interaction and daily activity. Additionally, this approximation offers a way to generate categorical alternatives from the urban space for exploring non-work destination choice.
Findings from a more detailed exploration of non-work activity at six locations
One limitation of this study is that our data and method do not indicate the degree to which a location represents a potential non-work destination for visitors. Motivated by this constraint, we conducted an observational study of six locations and visited each area within the clusters during the busiest hours (those having the highest scores) to observe the establishments, note the clientele, and walk along the shifting boundaries of the SpotRank location. The six places or clusters observed were: Changi Business Park, Khatib, Marine Parade, Tampines, Toa Payoh, and Downtown. These six places were selected because they represent well-known communities with different types of land use and are located in different subcenters of activity.
In addition to visiting these places, we acquired four datasets to obtain a sense of the dynamics underlying the spatial patterns of activity: (a) a high-resolution synthetic population of households and residential units by building (Zhu and Ferreira, 2014, 2015); (b) a high-resolution synthetic population of jobs by building (Le et al., 2016); (c) transit EZ-Link records for a week of ridership in 2011; and (d) Google Place API data showing the location of commercial establishments by type. The reader can refer to Ponce Lopez (2018) for a detailed profile of these six places.
In each of the six places visited, the combination of leisure activities, jobs, and dwellings seemed to define both the character of the place and the type of people patronizing the establishments, whether residents, workers in the area, or visitors from elsewhere. Generally, these places mix land uses and attract at least a few visitors each day for non-work purposes, making it hard to differentiate between work and non-work spaces.
One remarkable element, noticeable in both the data and the field visits, is the different spatial configurations of stores and retail in the six visited places. The data on commercial establishments show what seems to be an internal spatial structure operating at a very granular scale in the location of establishments within the six clusters studied. Our exploration of retail establishments suggests that there are at least two types of places or SpotRank clusters. On the one hand, there are simple ones such as Khatib or Changi Business Park, where the cluster is composed mostly of residences or workplaces and the available non-work activities are limited, given that the businesses cater predominantly to workers or residents in the area. On the other hand, we also found complex places, such as Marine Parade, that offer more variety and encourage people with various lifestyles and from various parts of the city to visit them for non-work purposes. Interestingly, the establishments are not distributed uniformly within the SpotRank cluster; they also have an internal structure and form sub-clusters or patches within the location.
In summary, based on our visits, we learned that the combination of several features—the configuration of retail establishments, the mixture of retail and commercial establishments, the presence of anchor stores, and geometrical boundaries—made it possible to estimate the appeal of non-work visits to a place. Therefore, our resulting characterization of the places using the POIs imitates the type of knowledge that we derived through our fieldwork, measuring the diversity of establishments and the geometrical sub-spaces that they contain.
Characterization of non-work destinations using POI data
Google Place reported 193 K commercial establishments within the geometrical boundaries of the SpotRank places across the seven days of activity. After applying the DBSCAN algorithm to the POIs on the 93 places across the four days studied, we identified 3750 different and unique patches of commercial establishments representing differentiated commercial sub-zones. For example, in the temporal frame Monday to Thursday, there were 747 different patches; 1008 on Friday; 802 on Saturday; and 860 on Sunday. Using these patches of establishments, we used a set of quantitative indicators to characterize the potential appeal of non-work activity for each patch of establishments. From the visits, we learned that the mix of establishments, the presence of a shopping mall, and the configuration of the streetscape of the establishments are pieces of information that, together, indicate the type of visitors and the non-work activity taking place. We tried to imitate and replicate that information from the field visits using data to characterize the places.
The result is a typology of commercial patches for the 93 locations, built by considering the topology, the geometry, and the mixture of establishments. The boundaries of the 93 locations visited across the seven days enclose different geometrical structures of establishments. The spatial composition of the patches provides valuable information regarding the character and attraction of a place as a non-work destination. A place that has distinct sub-spaces with rich and differentiated offerings will attract more non-work activity as it has something to offer to various clienteles. In consequence, applying the topology of retail into geometrically defined sub-spaces is critical to determining the potential of a place as a non-work destination and generator of multi-purpose trips.
Characterizing the patches and building a typology
This section presents the specific measures to be computed for each of the patches identified by DBSCAN and the resulting typology of patches. Refer to Ponce Lopez (2018) for the mathematical description and a more detailed explanation of the operationalization of these measurements.
Compactness
Activities and visitors to a commercial strip are different from the visitors at a rectangular shopping mall. We suggest a measurement of compactness of the POIs in the DBSCAN patch to represent the irregularities of the geometrical shapes. The Convex-Hull algorithm is used to compute a polygon that encloses the commercial establishments of a given patch (Jarvis, 1973; McCallum and Avis, 1979). Then, for every polygon–DBSCAN cluster that we derived, we computed the Osserman measure of compactness using the area and perimeter of the Convex-Hull polygon (Osserman, 1978).
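The compactness computation can be sketched as follows, under two assumptions that should be checked against Ponce Lopez (2018). First, the Osserman measure is taken in its isoperimetric-quotient form, 4πA/P², which equals 1 for a circle and approaches 0 for elongated shapes. Second, the hull is built with Andrew's monotone-chain algorithm, swapped in here for brevity in place of the Jarvis march cited above; both return the same polygon. Coordinates are hypothetical and assumed to be in a projected system.

```python
import math


def convex_hull(points):
    """Convex hull in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def area_and_perimeter(polygon):
    """Shoelace area and perimeter of a polygon given as ordered vertices."""
    twice_area, perimeter = 0.0, 0.0
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
        perimeter += math.hypot(x2 - x1, y2 - y1)
    return abs(twice_area) / 2, perimeter


def compactness(points):
    """Isoperimetric quotient 4*pi*A / P**2 of the convex hull of the points."""
    area, perimeter = area_and_perimeter(convex_hull(points))
    if perimeter == 0:
        return 0.0
    return 4 * math.pi * area / perimeter ** 2


# Hypothetical patches: a square block of shops versus a long commercial strip.
square_patch = [(0, 0), (0, 10), (10, 0), (10, 10), (5, 5)]
strip_patch = [(0, 0), (100, 0), (100, 2), (0, 2), (50, 1)]
square_score = compactness(square_patch)
strip_score = compactness(strip_patch)
```

Under this form of the measure, the square block scores π/4, while the strip scores far lower, which is the contrast between mall-like and strip-like patches that the metric is meant to capture.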
Density and diversity of POI
A larger offering of activities is associated with more multi-purpose trips (Arentze et al., 2005). Therefore, we compute the number of establishments per unit area of the patch, using the polygons calculated in sections “Detection of space–time clusters with peaks of activity” and “Geometry of the resulting SpotRank places”, and derive both the density and the diversity (a Herfindahl–Hirschman Index) of establishments.
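Both indicators are simple to compute once each patch has an area and a list of establishment categories. In the sketch below, the category labels and the area value are hypothetical, and the Herfindahl–Hirschman Index is taken in its standard form, the sum of squared category shares.

```python
from collections import Counter


def poi_density(n_establishments, area_m2):
    """Establishments per square meter of patch area."""
    if area_m2 <= 0:
        raise ValueError("patch area must be positive")
    return n_establishments / area_m2


def herfindahl_hirschman(categories):
    """Sum of squared category shares: 1.0 means a single category dominates."""
    counts = Counter(categories)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one establishment is required")
    return sum((count / total) ** 2 for count in counts.values())


# Hypothetical patch with four establishments over 10,000 square meters.
patch_categories = ["shop", "shop", "food", "food"]
density = poi_density(len(patch_categories), 10_000.0)
hhi = herfindahl_hirschman(patch_categories)
```

Note the direction of the index: lower values indicate a more diverse mix of establishments, so a patch with many evenly represented categories scores close to zero, while a patch of a single category scores 1.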
Area under the curve
Another relevant indicator of the type of activity or visitors to a patch is whether the establishments in the area depend on an anchor shopping mall or food market. Agglomeration effects can draw establishments around food markets and shopping malls, which cater to different clienteles than a strip of shops scattered along a streetscape does. Thus, we needed a measure that captures the spatial dependency of retail on the presence of malls and food courts, and for this, we selected the area under the curve (AUC). Ponce Lopez (2018) provides the mathematical specification and operation of this measure. The intuition behind this technique is to fit a binary outcome model that predicts the presence or absence of a commercial establishment at a given pixel. The only predictor in the model is distance to a shopping mall and food court. The resulting balance of correct and incorrect classifications, summarized by the area under the curve, then measures the degree of spatial dependency of commercial establishments on the nearby location of shopping malls and food courts.
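One way to see what this measure captures is through its rank interpretation. The sketch rests on an assumption not stated in the text: that the fitted probability of finding an establishment decreases monotonically with distance to the anchor. In that case, the area under the ROC curve equals the probability that a randomly chosen occupied pixel is closer to the anchor than a randomly chosen empty pixel, so it can be computed without fitting the model. The pixel distances and labels below are hypothetical.

```python
def auc_from_distance(distances, has_establishment):
    """AUC of a classifier that ranks pixels by closeness to the anchor.

    Counts, over all (occupied, empty) pixel pairs, how often the occupied
    pixel is the closer one; ties count as half.
    """
    occupied = [d for d, y in zip(distances, has_establishment) if y]
    empty = [d for d, y in zip(distances, has_establishment) if not y]
    if not occupied or not empty:
        raise ValueError("both occupied and empty pixels are required")
    wins = 0.0
    for d_occ in occupied:
        for d_emp in empty:
            if d_occ < d_emp:
                wins += 1.0
            elif d_occ == d_emp:
                wins += 0.5
    return wins / (len(occupied) * len(empty))


# Hypothetical pixels: distance to the nearest mall (m) and presence of a shop.
strong_dependency = auc_from_distance([10, 20, 30, 40], [1, 1, 0, 0])
weaker_dependency = auc_from_distance([10, 20, 30, 40], [1, 0, 1, 0])
```

A value near 1 indicates that establishments hug the anchor, whereas a value near 0.5 indicates that distance to the anchor says little about where establishments are located.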
Number of department stores and chain supermarkets
Two additional elements to characterize non-work destinations are the number of chain supermarkets (if any) and the number of department stores (if any) in the DBSCAN patch. The variety and size of the leisure offering of a plaza or shopping mall is generally related to the presence of an anchor store. Therefore, we counted the number of supermarkets and department stores in a patch separately, including the resulting numbers in our characterization of place.
Ratio of shops and restaurants
The proportion of shops to food-dedicated establishments was important for distinguishing the clientele and the attraction of non-work activity trips to a place.
The conjunction of the variables captures important features that indicate the attraction of non-work activity to visitors to a given patch of establishments. The next step was to use these variables and a clustering algorithm to propose a typology of patches based on these attributes.
Typology of patches containing commercial establishments
Among the various algorithms and numbers of groups tested, the k-medoids method of clustering with eight groups yielded the most consistent results in terms of variance explained (Kaufman and Rousseeuw, 1990). The eight types of patches identified were as follows:
Small patches of mixed retail where food predominates (n = 194 patches)
Small patches of mixed retail along irregular geometrical shapes (n = 166 patches)
Mixed retail along compact geometrical shapes (n = 218 patches)
Agglomeration of specialized shops by business type without food supply (n = 247 patches)
Large commercial areas (n = 12 patches)
Midsize shopping malls with various retail (n = 60 patches)
Supermarket surrounded by mixed retail (n = 75 patches)
Food court market surrounded by mixed retail (n = 36 patches)
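The clustering step can be sketched as follows. This is a simple alternating variant of k-medoids (assign each patch to its nearest medoid, then re-pick each medoid), not the full swap-based PAM procedure of Kaufman and Rousseeuw (1990), and it is not necessarily the implementation used in this study. The two-feature patch vectors, the Euclidean distance, and the initialization with the first k points are assumptions made for illustration.

```python
import math

def assign(points, medoids):
    """Label each point with the index of its nearest medoid."""
    return [
        min(range(len(medoids)), key=lambda j: math.dist(p, points[medoids[j]]))
        for p in points
    ]

def k_medoids(points, k, max_iter=100):
    """Alternating k-medoids; returns medoid indices and one label per point."""
    medoids = list(range(k))  # deterministic start: first k points (assumption)
    for _ in range(max_iter):
        labels = assign(points, medoids)
        new_medoids = []
        for j in range(k):
            members = [i for i, l in enumerate(labels) if l == j]
            if not members:  # keep the old medoid if a cluster empties out
                new_medoids.append(medoids[j])
                continue
            # New medoid: the member with the smallest total distance to the others.
            new_medoids.append(
                min(members, key=lambda i: sum(math.dist(points[i], points[m]) for m in members))
            )
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(points, medoids)

# Hypothetical standardized patch features, e.g. (compactness, shop share).
patches = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (5.0, 5.0), (5.2, 4.8), (4.9, 5.3)]
medoids, labels = k_medoids(patches, k=2)
```

Unlike k-means, each cluster is represented by an actual patch, which makes the resulting types easier to inspect against real locations.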
Characterizing non-work destinations
Finally, this last section of the analysis relates the geometry of the places derived from the SpotRank data to the eight types of patches, in order to explore the mixture of types within each place. Figure 3 illustrates the distribution of retail patches by type across the 93 SpotRank places. The color coding indicates the type of patch. The circles represent the boundaries of the locations on the canonical Friday.

Figure 3. K-medoids clusters on Friday in the 93 SpotRank places.
In summary, our method resulted in a typology capable of differentiating and classifying commercial patches based on their potential to attract non-work activity. Large datasets, such as the one created using Google Place, proved useful for creating measurements that allowed us to characterize, with a high level of spatial detail, the built environment and the amount of attraction associated with a place.
An additional substantial finding is that the resulting typology of commercial patches reproduced a spatial hierarchical organization of consumption that evokes the classic Central Place Theory (Berry and Garrison, 1958). Twelve major commercial corridors, located either in the central area of town or at two satellite regional centers, occupy the top of the hierarchy. These are large corridors with street-level retail that connect various shopping malls and clusters of restaurants. Mid-size shopping malls, which are present in almost every important sub-regional center of activity, comprise the next level. Small plazas with supermarkets are in the third level, and food markets are in the fourth. Finally, four medoid groups represented small patches of various mixed retail and restaurants, which spread across almost every SpotRank place. The SpotRank places that exerted the largest attraction for visitors interested in non-work activity seem to combine both a density and a diversity of commercial patches, that is, differentiated spaces that cater to different lifestyles and include a distinctive offering of non-work activities.
The structure of the commercial patches and the SpotRank locations indicates that the city remains highly monocentric in the distribution of non-work activity, especially for shopping and dining. Finally, the iconic places that are major non-work destinations in Singapore are distinctive in combining both a diversity and a density of commercial patch types.
Conclusions and future work
“Place” and “location” are key concepts when exploring the interaction between the built environment and social behavior. A place represents the geographical and territorial scale that anchors the interaction between these two elements. A clear definition of the geometrical boundaries of a place is needed to analyze why people choose one location over another for non-work activity.
Our approach combined various techniques that are already well established in the field of machine learning in order to propose a systematic method for deriving a unit of analysis that can be used to explore non-work activity with CDR and POI data. The proposed method differs from previous approaches in the literature on urban form and function in that it characterizes a finer geographical scale, at the level of neighborhoods or zones rather than regions. Our goal has been to build a unit of analysis with the proper scale to study non-work destinations.
Our process intends to replicate what we learned from the fieldwork visits: the appeal of a non-work destination relates to its ability to offer differentiated sub-spaces that cater to a variety of lifestyles. The geometry and internal structure of patches of commercial establishments provided a sufficient approximation of a place’s potential attraction for non-work activity, since their density and diversity reflect activity and clienteles.
A pattern-detection algorithm scanned the data by hour and cell in order to identify when and where large numbers of people could be found. For each day, our method of pattern detection generated a set of cylinders, each containing a space and a time frame that corresponded to unusually high peaks of human concentration. The resulting SpotRank places had spatial–temporal boundaries that identified 93 locations. The SpotRank places included over 90% of the shopping and dining destinations reported by a transportation survey, which was used to assess how well our locations capture non-work activity. However, our SpotRank places also contained destinations for trips to industrial or open-space locations. It was clear that additional data were needed to supplement the characterization of the SpotRank locations in order to filter those places that had a high concentration of people engaged in non-work activity.
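A simplified sketch of such a cylinder scan is given below. It scores each candidate cylinder (a circular base of grid cells extended over a window of consecutive hours) with a Kulldorff-style Poisson log-likelihood ratio, which is an assumption made for illustration and not necessarily the statistic used in this study. The toy grid, the uniform baseline, and the parameter names are likewise hypothetical.

```python
import math
from itertools import product

def poisson_llr(c, e, total):
    """Log-likelihood ratio for an excess of requests inside a cylinder."""
    if e <= 0 or c <= e:
        return 0.0  # only unusually high concentrations are of interest
    llr = c * math.log(c / e)
    if total > c:
        llr += (total - c) * math.log((total - c) / (total - e))
    return llr

def scan_cylinders(counts, expected, max_radius=1, max_hours=2):
    """counts[t][r][k] and expected[t][r][k] are indexed by hour, row, column."""
    hours, rows, cols = len(counts), len(counts[0]), len(counts[0][0])
    cells = list(product(range(rows), range(cols)))
    total = sum(counts[t][r][k] for t in range(hours) for r, k in cells)
    exp_total = sum(expected[t][r][k] for t in range(hours) for r, k in cells)
    scale = total / exp_total  # make expected counts sum to the observed total
    best_llr, best_cyl = 0.0, None
    for (r0, k0), rad, t0, dur in product(
        cells, range(max_radius + 1), range(hours), range(1, max_hours + 1)
    ):
        if t0 + dur > hours:
            continue
        base = [(r, k) for r, k in cells if (r - r0) ** 2 + (k - k0) ** 2 <= rad ** 2]
        window = range(t0, t0 + dur)
        c = sum(counts[t][r][k] for t in window for r, k in base)
        e = scale * sum(expected[t][r][k] for t in window for r, k in base)
        llr = poisson_llr(c, e, total)
        if llr > best_llr:
            best_llr = llr
            best_cyl = {"center": (r0, k0), "radius": rad, "hours": (t0, t0 + dur)}
    return best_llr, best_cyl

# Toy data: 3 hours on a 3 x 3 grid, with one spike at the central cell in hour 1.
counts = [[[10] * 3 for _ in range(3)] for _ in range(3)]
counts[1][1][1] = 60
expected = [[[1.0] * 3 for _ in range(3)] for _ in range(3)]
llr, cylinder = scan_cylinders(counts, expected)
```

In this toy case the highest-scoring cylinder is the single central cell during hour 1 (the hour window is half-open, so it is reported as `(1, 2)`).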
We took these SpotRank locations as the unit of analysis and proposed a combination of established methods in data science to identify and characterize the patches of commercial establishments within them. A DBSCAN algorithm produced a spatial clustering and traced the boundaries of patches containing commercial establishments (from the Google Place API) within each SpotRank location. Then, the method computed indicators that measured the geometrical topology of the patch, the diversity and density of the establishments it contained, and the spatial dependency of establishments on major commercial outlets.
In general, the resulting SpotRank locations and the characterization of the commercial patches located within them reflect important features of human choices and activities that correspond not just to how land is used and/or how accessible a place is but also, and perhaps more importantly, to how people interact with the built environment.
The data and method had limitations that resulted in inaccurate representations of some patches of establishments. These inaccuracies were caused by missing information regarding the desirability of a place profiled earlier by visitors. The DBSCAN algorithm traced the boundaries of a patch by considering the density of the establishments in a two-dimensional space. However, we did not have data that represented the social stratification of shopping malls in Singapore. For instance, luxury shops and restaurants tend to locate on the ground floor and first floor, while cheaper shops and food courts locate below ground level. Our method did not deal with such cases. The integration of BIM data with latitude and longitude seems promising for handling this type of data in highly dense commercial areas.
We consider two future research directions. The first is the exploration of the within-day temporal dynamics of the SpotRank places. It will be relevant to explore how a place's spatial boundaries of activity change over the course of a single day in order to profile its potential as an attractor of non-work activity. Second, the amalgamation of the eight different types of patches within a SpotRank place can serve as the basis of a typology of non-work attractions to feed the choice sets of demand models of transportation.
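The DBSCAN step can be sketched in a few lines. This is a textbook version of the algorithm written for illustration, not the implementation used in this study; it assumes that establishment coordinates have already been projected to meters, and the `eps` and `min_pts` values are hypothetical.

```python
import math

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for patches, -1 for noise."""
    NOISE = -1
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbors(i):
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins the patch, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:  # core point: keep growing the patch
                queue.extend(nbrs)
    return labels

# Hypothetical establishment coordinates in meters.
shops = [(0, 0), (10, 0), (0, 10), (10, 10),
         (200, 200), (210, 200), (200, 210),
         (500, 500)]
patch_labels = dbscan(shops, eps=20, min_pts=3)
```

Here the first four establishments form one patch, the next three form a second, and the isolated establishment is left as noise; the boundary of each patch could then be traced from its member points.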
Acknowledgements
We appreciate the contributions of past and present members of the SimMobility Long-Term team. A preliminary version of this paper was presented in June 2019 at the 2nd International Conference on Urban Informatics in Hong Kong. We would like to thank both the reviewers for their constructive feedback.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was funded in part by the Singapore National Research Foundation through the Future Urban Mobility group at the Singapore-MIT Alliance for Research and Technology Center (SMART).
Supplemental material
Supplemental material for this article is available online.
