Abstract
Timely responses to emergencies are critical for urban disaster and emergency management, particularly in densely populated mega-cities. Researchers and personnel involved in urban emergency management nowadays rely on computers to carry out complex evacuation planning. Agent-based modeling, which supports the representation of interactions among individuals and between individuals and their environments, has become a major approach to simulating evacuations wherein spatial–temporal dynamics and individual conditions need attention, such as congestion in urban areas. However, the development of optimal evacuation plans based upon agent-based evacuation simulations can be very time-consuming. In this study, to shorten the computation time to provide a timely response in an efficient way, we develop a knowledge database to store evacuation plans for typical population distributions generated by mobile phone location data. Subsequently, we use the prepared knowledge database (offline) to accelerate real-time (online) processes in searching for near-optimal evacuation plans. Our experimental result demonstrates that the evacuation plans generated with a knowledge database always outperform those that are generated without a knowledge database. Specifically, the knowledge database can reduce the computation time by an average of 96.76%, with an average fitness value improvement of 21.86%. This result confirms the effectiveness of our proposed approach in improving agent-based evacuation planning. With the rapid development of human sensor data collection and analysis, the estimation of a more accurate population distribution will become easier in future. Thus, we believe that the proposed approach of developing a knowledge database based on population distribution patterns will provide a more feasible alternative solution for evacuation planning in the practice of urban emergency management.
Keywords
Introduction
Timely responses to emergencies are critical for urban disaster and emergency management. When emergent hazards such as fires, explosions, and toxic gas leakages occur in densely populated urban areas, if the affected population cannot be efficiently evacuated, such hazards often lead to great loss of life. Efficient evacuation depends on the scientific preparation of emergency evacuation plans, including accurately estimating affected populations and their locations, generating optimal evacuation paths for populations in different locations, and offering personalized evacuation guidance. Efficient solutions to evacuation planning will address the need for timely response in urgent situations.
However, the modeling of emergency evacuation planning is challenging. First, populations in urban areas are highly dynamic. Previous approaches to emergency evacuation planning have largely relied on census data or traffic survey data, which cannot afford timely assistance to the afflicted. In recent years, large-scale mobile data, particularly mobile phone location data, which provide opportunities to better understand the dynamics of urban populations (González et al., 2008; Liu et al., 2015; Shaw et al., 2016; Yue et al., 2014; Zheng et al., 2014), have been suggested as useful for emergency management. Existing studies that use mobile phone location data in fine-grained urban population estimation (Ahas et al., 2007; Kang et al., 2012; Liu et al., 2018; Ratti et al., 2006; Reades et al., 2009; Xu et al., 2016) and prediction (Chen et al., 2018), dynamic population estimation (Deville et al., 2014), urban crowd flux estimation (Fan et al., 2018), unusual population behavior detection (Dobra et al., 2015), population movement tracking (Bengtsson et al., 2011; Gething and Tatem, 2011), and population displacement prediction (Lu et al., 2012; Song et al., 2014) after disasters lay a solid foundation for the feasibility of near-real-time emergency evacuation planning.
Second, the process of emergency evacuation planning is computationally intensive. In the search for an optimal solution in emergency evacuation, the effectiveness of a candidate evacuation plan often originates from an evacuation simulation. Evacuation simulation models have been intensively studied since the 1980s (Vermuyten et al., 2016; Zheng et al., 2009). These models evolved from macro models for large spatial regions that uniformly treat evacuees to micro models in a building that consider differences between individuals. The current major evacuation models include the cellular automata model (Muramatsu et al., 1999; Von Neumann, 1966), the social force model (Chen et al., 2012; Helbing and Molnar, 1995; Parisi and Dorso, 2005), the lattice gas model (Song et al., 2006; Tajima and Nagatani, 2001), the game theoretic model (Løvas, 1995; Zheng and Cheng, 2011), the fluid-dynamic model (Hughes, 2003), and the agent-based model (Bratman, 1987; Braun et al., 2005). Among these models, the agent-based approach models a system as a collection of autonomous decision-making agents. It supports the representation of interactions among individuals and between individuals and their environments (Bonabeau, 2002). Thus, it has become a major approach to simulating evacuations where spatial–temporal dynamics and individual conditions need attention, such as congestion in urban areas (Chen and Zhan, 2008; Crooks et al., 2008; Pan et al., 2007; Shi et al., 2009). However, the agent-based model involves many evacuees as well as complex interaction rules, and it is often computationally demanding. Moreover, the process of evacuation simulation becomes even more time-consuming in searching for the optimal evacuation plan because it involves comparing combinations of evacuees, evacuation paths, and refuge selections, which creates a spatial optimization problem.
In this regard, heuristic algorithms have been widely adopted to solve optimization problems in evacuation simulation. With a given evacuee distribution and locations of refugees, such algorithms aim to determine an evacuation plan that can achieve certain objectives such as minimizing the clearance time, shortening the evacuation route, or reducing the congestion degree (Brachman and Dragicevic, 2014). The exact algorithms comparing all possible solutions to search for the optimal solution often require exponential or factorial computation time (Garey and Johnson, 1979). Heuristic algorithms can generate near-optimal solutions with significantly less computation time (Khalid and Yusof, 2018). Typical heuristic algorithms include ant colony optimization algorithms (Colorni et al., 1992; Yuan and Wang, 2007; Zong et al., 2010), genetic algorithms (GAs) (Fang et al., 2013; Holland, 1975), simulated annealing (SA) (Cepolina, 2005; Kirkpatrick and Vecchi, 1983), tabu search (Jiang et al., 2013; Xie et al., 2010), and so on. Although many heuristic algorithms have been used to determine near-optimal solutions by trading optimality for speed, an increase in the number of evacuees and the affected area size can also significantly expand the search space. Therefore, the computation time for near-optimal solutions in many real situations may still not be acceptable.
To further reduce the computation time for generating near-optimal evacuation plans using agent-based simulations, some researchers have leveraged the power of parallel computing to speed up agent-based simulation (Gong et al., 2013; Tang and Wang, 2009). Although parallel computing accelerates the process of evacuation planning, this approach consumes a large amount of computing resources. Meanwhile, another stream of research has developed knowledge-guided heuristic algorithms to shorten the computation time in searching for near-optimal solutions (Liu et al., 2013; Yin et al., 2012). Seeking to generate outcomes of knowledge via studying representative scenarios and using such knowledge, this approach can significantly reduce the solution-search space for spatial optimization problems.
This study utilizes the concept of knowledge-guided heuristic algorithms to propose a new approach that exploits mobile phone location data to improve the efficiency of emergency evacuation planning. We adopt a heuristic search strategy to develop near-optimal evacuation plans based on agent-based evacuation simulations. With the mobile phone location data, this proposed approach does not only evaluate affected populations and their locations in real-time, but also derives the hourly population distribution patterns to develop an offline knowledge database that stores evacuation plans for typical situations to reduce the online computation time of generating evacuation plans. The remainder of this paper is organized as follows. The next section introduces the proposed approach to provide real-time evacuation planning using mobile phone location data. The “Study area and data set” section describes the study area and the mobile phone location data set used in the case study. The “Results and analysis” section presents the evacuation planning results. Finally, the “Discussion and conclusion” section draws certain conclusions as well as plans for future work.
Method
Framework design
In the context of the popularity of mobile phone usage in urban areas, we make the following assumptions in developing our approach: (1) In an evacuation scenario, mobile phone signals covering the affected area are not subject to serious interference; (2) telecommunication operators can acquire approximate user locations in real-time through the mobile phone network and send back the evacuation plan to the user through short messages or other means; (3) large-scale mobile phone location data from one or several major telecommunication operators can reflect urban population distributions in fine spatiotemporal granularities (Chen et al., 2018; Xu et al., 2016); and (4) even as people move across urban spaces, there are certain regularities in population distributions in urban areas over a 24 hour period (Ratti et al., 2006).
Since there are many possible evacuation scenarios for demonstration purposes, this study focuses on outdoor urban spaces mainly occupied by pedestrians. The evacuation objects are human beings, and the planning objective is to minimize the evacuation clearance time. The inputs include an evacuation area, a given population distribution in this area, and refuge locations. The outputs include an assigned refuge for every evacuee. The proposed optimization model of the evacuation scenario can be stated as follows.
Variables:
N: number of evacuees;
M: number of refuges;
I: set of evacuees (origins),
J: set of refuges (destinations),
Decision variables
Constraints
Objective function
We design a framework that uses a knowledge database to accelerate emergency evacuation planning, as shown in Figure 1. Based on an agent-based evacuation simulation model, this study applies a heuristic algorithm to determine a near-optimal evacuation plan. In the offline environment, as the computation time is relatively tolerant, high-performance computing resources can be utilized to run the heuristic algorithm to obtain a near-optimal solution. We develop a knowledge database of evacuation plans by calculating the near-optimal solutions based on typical population distributions of every hour for a certain area. In a real-time situation, with the real-time population distribution in an emergency event, we search through the typical population distributions that best match the real-time population distribution in the knowledge database. Subsequently, we use the best-match evacuation plan to initially seed the heuristic algorithm. The knowledge-guided seeds are expected to significantly speed up the process to a convergence relative to random seeds.

Framework design of proposed approach.
Simulating evacuation with an agent-based model
Agent-based models are suitable for representing interactions among individuals and between individuals and their environments. In the initial state of the evacuation simulation, all buildings and evacuees are set as agents. Among them, buildings are agents with a speed of zero, and they remain stationary in the simulation process, while evacuees are agents with varying speeds. With a given evacuation plan, each evacuee is assigned to a refuge. The evacuation path for each evacuee is the shortest distance path between one’s start location and the assigned refuge. The path is calculated by use of Dijkstra’s algorithm based on the road network in a given area. For an evacuee’s movement, this study applies the concept of reciprocal velocity obstacles (RVOs) to an agent-based model (Van den Berg et al., 2008) to lower the risk of road blockage in evacuation situations (Sasabe et al., 2020). The RVO concept has the advantage of simulating collision-avoidance among multi-agents, which is suitable for crowd simulation. The output of the evacuation simulation is the total evacuation clearance time.
Searching for near-optimal evacuation plans by a heuristic algorithm
Based on an agent-based evacuation simulation model, this study applies a GA to search for a near-optimal evacuation plan. A GA is one of the most classical and effective population-based methods, which encodes the decision variables in a flexible way. Compared to other heuristic algorithms such as SA, the obtained solution from a GA is more stable and the required computation time is generally smaller (Deep et al., 2009). Therefore, this study adopts a GA as the primary option to demonstrate the proposed framework. The GA-based evacuation planning includes three major steps. The first step is to construct a chromosome. In this study, a chromosome represents an evacuation solution candidate. The chromosome is composed of genes. Each gene stores an assigned refuge ID for a group of evacuees that includes at least one evacuee within a certain area. The methods of grouping evacuees can be adapted to particular evacuation scenarios. For example, if the basic spatial unit offered by the mobile phone location data is the mobile phone tower (MPT) service area, we can treat evacuees within the same MPT service area as one group. Genes of a chromosome are sorted by the IDs of evacuee groups without any prior knowledge. Each gene initially stores a randomly selected refuge. These initially generated solution candidates are represented as the initial seeds of the GA. Given n evacuation groups in an evacuation area and m refuges, there are
Constructing a knowledge database with mobile phone location data
To further speed up the GA-based evacuation planning process, we propose to construct a knowledge database of evacuation plans. This study uses a mobile phone data set to estimate the regular patterns of population distributions in a region over a 24 hour period. Since the spatial resolution of most existing mobile phone location data sets is determined as the MPT service area, for our demonstration, we use the MPT service area as the spatial unit of a population distribution. In the offline environment, based on the GA-based evacuation planning process, the regular patterns of population distributions in an urban area for each hour over a 24 hour period are used to generate typical population distributions and to calculate near-optimal solutions. These typical population distributions and their evacuation solutions form a knowledge database of evacuation plans.
With the real-time population distribution in an emergency event, we search for the typical population distributions that best match the real-time population distribution in the knowledge database. Given the real-time population distribution A = {α1, … , αi, … , αn}, where αi represents the population within the ith MPT service area, given the typical population distributions in knowledge database S = {B1, … , B
k
, … , B24}, Bk = {β1
k
, … , βik, … , βnk}, where βik represents the population within the ith MPT service area during the kth time slot of a day, the matching function can be represented as equation (5)
Smaller values of D correspond to better typical population distribution matches. Next, we use the best-match evacuation plan to initially seed the GA. This initial seed can significantly speed up the process to a convergence compared to a random seed.
Study area and data set
Study area
In our study, we consider the Huaqiangbei business district in Shenzhen, China, as the emergency evacuation area (Figure 2). Shenzhen is a major city in China with its gross domestic product ranked fourth among all Chinese cities (National Bureau of Statistics of China, 2016). It is located in southern China, adjacent to Hong Kong, with 10 administrative districts covering an area of 1952 square kilometers and containing a population of 15 million as of 2017. The Huaqiangbei business district, located in the Futian district of Shenzhen, is the largest electronics market in China. It covers an area of approximately 1.45 square kilometers with the highest population density in Shenzhen. The Huaqiangbei business district is a pedestrian district, which fits well with the evacuation scenario considered in this study.

Study area.
The mobile phone data set
The mobile phone data set used in this study was acquired by the China Mobile Telecommunications Company, which accounts for approximately 75% of the entire mobile phone market in Shenzhen. Referred to as the Signaling System 7 (SS7) data set, it was generated from real-time monitoring of mobile phone signals based on the SS7 protocols. The data set covered 16 million anonymous mobile phone users during one weekday in 2012 without major events, containing the user ID, recording time, and longitude and latitude of the connected MPT at the recording time. In this data set, 5.85 million mobile phone users are recorded on an hourly basis. The user ID was encrypted for privacy protection before the data set was released for research. The data of approximately 6000 MPTs were extracted from the data set. Each MPT provides cellular coverage to an area that is often approximated by a non-overlapping Thiessen polygon, hereinafter referred to as the MPT service area. Regarding the Huaqiangbei business district, there are 66 MPTs within this region and the radius of the service areas varies from 20 to 200 meters depending on the density of the towers.
Results and analysis
Population distribution in space and time
Figure 3 shows the map of the area of interest, indicating the buildings and MPTs present. In the figure, the irregular square area denotes the outline of buildings, the black dot denotes the MPT location, and each Thiessen polygon around the black dot represents the service area of an MPT. Figure 4 illustrates the population density distribution of this area at typical time slots including night (00:00–01:00), morning (09:00–10:00), afternoon (15:00–16:00), and evening (21:00–22:00). To obtain the population density distribution, we obtain the sum of unique mobile phone users connected with an MPT over a given hour, and we divide this value by the MPT service area. The population density changes significantly throughout the day. Darker colors of the polygon correspond to higher population densities. From Figure 4, we note that the mobile phone location data can reveal fine-grained population dynamics in space and time, with which we can more accurately assess both the size and location of the population affected by emergencies.

Buildings and MPTs in the study area.

Population density distributions in the study area at different time slots.
GA parameter setting
To search for near-optimal solutions, this study designs a systematic test to tune GA parameters, including population size, crossover probability, and mutation probability. To make a large number of tests tractable, we first define a simplified testing scenario by selecting the upper right corner of the study area, which accounts for about a quarter of the study area. Fifty evacuees are randomly located in this area along with four refuges outside this area. Because 50 evacuees will not generate a crowded situation and each evacuee will move with a free-flow velocity, the theoretically shortest clearance time of the system (i.e. the exact solution) will be the longest travel time to the nearest refuge among these evacuees. Since the optimization problem becomes tangible, we can find the optimal clearance time by simulating all travel times to each refuge for each evacuee.
After we obtained the optimal solution, we designed a series of tests with combinations of different population sizes (from 30 to 300), crossover probabilities (from 0.1 to 0.9), and mutation probabilities (from 0.01 to 0.09). In addition, considering the uncertainty of a GA, we conducted five tests for each parameter combination. All these tests use a simple GA without the aid of a knowledge database. After retrieving all these results, we adopted two steps to select the appropriate parameters. The first step was to find the appropriate population size. Figure 5 illustrates that as the population size increases, the percentage of finding the optimal solution increases. When the population size reaches 200, the percentage of finding the optimal solution increases to 94%, which is adequate for the purpose here. Therefore, a population size of 200 was considered to be a good candidate when the GA starts with random seeds to search for a solution when we construct the knowledge database. The second step was to select the crossover probability and mutation probability. Based on the population size of 200, the test results in Table 1 show that the mean generation of converging is the smallest and the results are the most stable with a crossover probability of 0.7 and a mutation probability of 0.03. Accordingly, we adopted the above GA parameters for developing the knowledge database in the offline environment.

The percentages of finding an optimal solution with population size in the GA parameter test.
The test results for crossover and mutation probabilities with a population of 200.
Evacuation simulation using a knowledge database
For test purposes, four refuges are set at the four corners outside of this area (Figure 6). Assuming that an emergency occurs at 08:00, the online environment will obtain the population distribution at the 08:00–09:00 time slot and search for a typical population distribution that best matches the real-time population distribution in the knowledge database. Figure 6 displays the near-optimal evacuation plan for the 08:00–09:00 time slot stored in the knowledge database. Subsequent to sorting the refuges as 0, 1, 2, and 3, each region in the evacuation area is indicated by the color corresponding to the refuge. The output near-optimal evacuation plan for the 08:00–09:00 time slot can be represented as {3,3,3,3,3,3,3,2,2,2,2,2,2,2,2,2,3,3,1,1,1,1,3,1,1,1,3,3,3,1,1,1,3,1,3,1,3,1,1,1,3,3,1,1,1,1,0,0,2,2,0,2,0,0,0,2,0,2,0,2,2,2,2,0,0,3}. Then we use the best-match evacuation plan to initially seed the GA-based simulation. Figure 7 illustrates the evacuation simulation for this evacuation plan. The blue dot represents an agent (i.e. an evacuee) who evacuates to a refuge, given the behavior of searching for the shortest path and avoiding other agents (including buildings and other evacuees).

Optimal refuge configuration for the population distribution corresponding to 08:00–9:00.

Simulated evacuation process using the population distribution corresponding to 08:00–09:00.
Model performance evaluation
In this study, we designed six experiments and conducted tests on a generic PC with an Intel(R) Core(TM) i7-6700 CPU@3.40 GHz and RAM of 16.0 GB. We first assumed that an emergent event occurs at 00:30, 08:30, 12:30, 15:30, 18:30, and 21:30, and we used the population distributions after the random disturbance as the real-time population distributions. Specifically, the real-time population of an MPT at a given time t:30 is simulated by a random number between the population of its two adjacent time slots (t−1):00–t:00 and (t + 1):00–(t + 2):00. Next, we compared the computation times of generating evacuation plans with and without the aid of a knowledge database. Note that the GA parameters are set the same for the two groups of experiments for a fair comparison. Two indicators are defined as follows to evaluate the model performance
From Table 2, we note that based on the six experiments, the results with a knowledge database always outperform the results without a knowledge database. Specifically, the knowledge database can reduce the computation times by an average of 96.76%, with an average fitness value improvement of 21.86%. These results demonstrate that the proposed approach in this study can significantly improve the efficiency and effectiveness of agent-based evacuation planning. Moreover, the differences between the random disturbance distribution and its corresponding best-match population distribution in the knowledge database are measured by the average change percentage per MPT. As Table 2 shows, when it is in the morning rush hour around 08:30, the population changes fast, so our simulated real-time population has a relatively large difference (59.03%) with its best match in the knowledge database, while the difference becomes as small as 2.25% at 15:30 when people generally stay at their workplaces and move much less. Our test results show that the performance with the knowledge database is always better than without it despite changes in the population distribution. It is noteworthy that even if the knowledge database is used in the experiment, the computation time still amounts to several tens of minutes, which cannot meet real-time emergency response requirements. However, we note here that our experiments were conducted on a generic personal computer; the computation time will significantly reduce with computer hardware upgrades or with the use of supercomputing systems.
Comparing computation times and the best fitness values with and without the help of a knowledge database.
Discussion and conclusion
In this study, we proposed a new approach to improving the evacuation planning process by exploiting mobile phone location data corresponding to an evacuation area. Starting with an agent-based emergency evacuation simulation model, we developed a knowledge database prepared offline, which stores evacuation plans for typical population distributions to accelerate the real-time processes in searching for near-optimal evacuation plans. The result of our performance experiment demonstrates that the proposed approach can reduce the computation time by an average of 96.76%, with an average fitness value improvement of 21.86%. Our study makes two contributions. First, this approach provides an efficient and stable evacuation plan for emergency evacuation, particularly because the online computation of this approach does not depend heavily on a high-performance computing environment when an emergency occurs. Second, this study points to a way to deeply combine large-scale trajectory data with domain applications in the big data era.
However, there are limitations to this study as well. First, the mobile phone location data set used in this study was collected for only one day from a single telecommunication company, which does not sufficiently reflect the regular patterns of urban population distributions. As a city usually includes multiple telecommunication companies, the integration of population estimates from multiple mobile phone data sets to obtain a more comprehensive view of the urban population distribution is important for evacuation planning. With the rapid development of human sensor data collection and analysis, obtaining a more accurate estimation of the population distribution will become easier in the future. Second, some real-world situations can be considered such as trampling of people and overcrowded refuges to enhance the proposed model in practice. Additionally, the mobile phone data set used in this study only records the connected MPT of a user. Without a more accurate user location, it was difficult to subdivide users within the same MPT service area. Therefore, we adopted the same evacuation plan for all people within one MPT service area in this study. Since the radii of those MPT service areas in our evacuation region are not very large (from 20 to 200 meters), this setting should not greatly affect the results of this case study. However, when it comes to a much larger MPT service area in other cases, the evacuation plan will be improved if finer subgroups are divided. In the future, if more accurate locations of mobile phone users are available, the proposed framework can be extended by a finer subdivision of the evacuation groups. In summary, we believe that our approach of developing a knowledge database based on population distributions and their subdivisions will afford a more feasible alternative solution for evacuation planning in the practice of urban emergency management.
Footnotes
Acknowledgements
The authors thank Dr Linda See, Dr Kai Cao, and the anonymous reviewers for their insightful comments. The authors also thank Mr Uzwal Prakash and Ms Megha Bisht for their online assistances.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was jointly supported by the National Natural Science Foundation of China (41771441, 41571431, 41421001), the Basic Research Program of Shenzhen (JCYJ20170307164104491), the Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, and the Shenzhen Discipline Construction Project for Urban Computing and Data Intelligence.
