Abstract
Urban big data include various types of datasets, such as air quality data, meteorological data, and weather forecast data. Air quality index is broadly used in many countries as an indicator to measure the air pollution status. This indicator has a great impact on outdoor activities of urban residents, such as long-distance cycling, running, jogging, and walking. However, for routes planning for outdoor activities, there is still a lack of comprehensive consideration of air quality. In this paper, an air quality index prediction model (namely airQP-DNN) and its application are proposed to address the issue. This paper primarily consists of two components. The first component is to predict the future air quality index based on a deep neural network, using historical air quality datasets, current meteorological datasets, and weather forecasting datasets. The second component refers to a case study of outdoor activities routes planning in Beijing, which can help plan the routes for outdoor activities based on the airQP-DNN model, and allow users to enter the origin and destination of the route for the optimized path with the minimum accumulated air quality index. The air quality monitoring datasets of Beijing and surrounding cities from April 2014 to April 2015 (over 758,000 records) are used to verify the proposed airQP-DNN model. The experimental results explicitly demonstrate that our proposed model outperforms other commonly used methods in terms of prediction accuracy, including autoregressive integrated moving average model, gradient boosted decision tree, and long short-term memory. Based on the airQP-DNN model, the case study of outdoor activities routes planning is implemented. When the origin and destination are specified, the optimized paths with the minimum accumulated air quality index would be provided, instead of the standard static Dijkstra shortest path. 
In addition, a Web-GIS-based prototype has also been successfully developed to support the implementation of our proposed model in this research. The success of our study not only demonstrates the value of the proposed airQP-DNN model, but also shows the potential of our model in other possible extended applications.
Introduction
A variety of studies show that outdoor activities in an environment with good air quality are critical to people's physical and mental health (Boyan and Mitzenmacher, 2001; Panahi and Delavar, 2008; Zahmatkesh et al., 2015). Conversely, outdoor activities in poor air quality, for example under a high air quality index (AQI), have been shown to have a negative effect on human health (Kampa and Castanas, 2008), especially path-based long-distance activities such as cycling, running, jogging, and walking. Hence, it is important to scientifically plan routes with good air quality for these kinds of outdoor activities.
In general, there are two primary types of routes planning methods: routes planning with static cost of path and routes planning with dynamic cost of path (Boyan and Mitzenmacher, 2001; Panahi and Delavar, 2008; Zahmatkesh et al., 2015). A static cost is certain and fixed, e.g., the travel distance of one specific path. A dynamic cost, on the other hand, is uncertain and varies over space and time, e.g., the travel time between two specific points, which depends on the flow and conditions of traffic on the road network. With regard to the first type, i.e., routes planning with static cost of path, the standard Dijkstra algorithm is a representative method that has been widely used (Dijkstra, 1959). However, the cost in the standard Dijkstra algorithm is not time-dependent, i.e., the Dijkstra algorithm only solves the problem of finding a path with a fixed distance. Considering the AQI as a fixed cost, Zahmatkesh extended the standard Dijkstra algorithm by taking the minimum air pollution as the optimization objective for planning a route (Boyan and Mitzenmacher, 2001; Panahi and Delavar, 2008; Zahmatkesh et al., 2015). However, the AQI value is not fixed and varies over time and space, although it can be retrieved by mining urban big datasets, such as historical air quality datasets, meteorological datasets, and weather forecasting datasets (Nikolova et al., 2006; Zheng et al., 2013). Regarding the second type, i.e., routes planning with dynamic cost of path, many studies have been conducted successfully to solve this kind of complex problem. Based on an adaptive decision rule approach, Hall (1986) brought forward a method to find the minimum cost path of bus routes with the consideration of both random and time-dependent cost.
In addition, some heuristic algorithms, e.g., Genetic Algorithm, have also been utilized to help the dynamic cost-based routes planning in various applications (Chien et al., 2001; Pattnaik et al., 1998; Pellazar, 1994). Besides, probabilistic models have also been brought forward and employed in some studies for planning routes with the minimum travel time (Boyan and Mitzenmacher, 2001); uncertainty has also been considered in some studies to provide a better optimal route (Nikolova et al., 2006).
In this research, to plan routes with good air quality for these path-based outdoor activities, the dynamic AQI, or other similar indicators, could be considered. However, most of these measurements are spatially sparse, even though their temporal resolution is high. For example, in Beijing, there are only 36 monitoring stations that can provide an accurate AQI every 60 minutes, while in Singapore, there are only 4 stations that can provide a similar indicator every 30 minutes. Such dispersed datasets cannot support the planning of routes for these outdoor activities unless accurate, spatially and temporally continuous air quality datasets can be obtained, which is precisely the aim of this study.
There are many existing approaches that can be used to predict AQI values. PM2.5 is an important factor affecting AQI values, and there exists a nonlinear relationship between the AQI value and PM2.5 data. Traditional AQI predictions mainly include methods based on regression models and support vector machines. For example, Cobourn and Baker used the regression model to predict the autocorrelation of the PM2.5 concentration sequence in time (Baker and Foley, 2011; Cobourn, 2010), and Wang designed a method based on the support vector machine model to predict the concentration of PM2.5 (Wang et al., 2017). Moreover, Slini et al. (2002) successfully used the stochastic autoregressive integrated moving average model (ARIMA) (Box and Pierce, 1970) to forecast maximum ozone concentration. This model is commonly used; it is simple and requires only endogenous variables, without the need for exogenous variables. However, the model needs stationary time series data. Different from the above-mentioned methods, Zhang et al. (2012) pointed out the importance of forecasting AQI and presented a comprehensive review of real-time air quality forecasting, in which they also introduced a 3D air quality model. Zhang et al.’s model is based on artificial neural networks, and is a nonlinear model. Compared with other models, artificial neural networks can learn from the historical air quality data and predict the AQI in the future with high accuracy. For example, Microsoft Research carried out a series of studies based on artificial neural networks, and achieved very good results in predicting the AQI (Zheng et al., 2013, 2015). In addition, a deep neural network has also been proposed and successfully used by Microsoft Research to predict the air quality for the upcoming 48 hours at each monitoring station (Yi et al., 2018).
Another method, structural cross-validation, successfully used historical data to extrapolate pollutant values both spatially and temporally (Guizilini and Ramos, 2015) by using Gaussian processes and a nonparametric model. For predicting PM2.5 at a large spatial scale, Li et al. (2017) presented an ensemble spatiotemporal model by combining nonlinear associations, ensemble learning, and residual Kriging methods. Given that there are multiple factors affecting AQI values, a multi-level attention-based recurrent neural network has also been successfully introduced and utilized to predict AQI values (Liang et al., 2018).
However, there is still room for improvement in this field, given the development of state-of-the-art machine-learning techniques and the demand for higher spatial and temporal accuracy in air quality prediction. It is also noted that no effort has yet been put into the study of air quality in routes planning for outdoor activities, even though there have been many studies on air quality prediction and on routing, respectively. In this research, an air quality prediction model based on a deep neural network (called airQP-DNN) is proposed to predict the AQI for planning the outdoor activities routes. There are two components in this paper. The first component is airQP-DNN, which can predict the AQI value for the next six hours. The second component is a case study of outdoor activities routes planning in Beijing, which can infer the AQI value in a
Study area and datasets
Study area
Beijing, the capital of the People's Republic of China, is located in northern China. Beijing Municipality comprises 16 county-level administrative subdivisions and covers a land area of 16,411 km². In 2017, the resident population of Beijing was 21.707 million, and the urban population was 18.766 million, accounting for 86.5% of the total population. Many of these residents enjoy outdoor activities. However, Beijing faces a serious air pollution problem due to rapid industrialization and urbanization in recent decades (Zhang et al., 2016). According to measurements by the U.S. Embassy in 2013, the annual average PM2.5 concentration in Beijing was around 100 μg m−3 (Zhang et al., 2016). In addition, the distribution of AQI is spatially uneven. Planning routes for outdoor activities with air quality in mind is therefore challenging, given the spatiotemporal dynamics of air quality and the path-based nature of these activities.
Datasets
In order to verify the above model and method, the air quality monitoring data of Beijing and surrounding cities from April 2014 to April 2015 (over 758,000 records) are used. Readers interested in details of this dataset can find the report from Zheng et al. (2015).
Air quality data
This dataset includes air quality data acquired every hour in Beijing from May 2014 to April 2015 at 36 stations and some monitoring stations surrounding Beijing. The total number of records of air quality data is over 278,000. Figure 1 presents the location of the 36 stations in Beijing, where each black icon represents a station. According to the Chinese AQI standards (GB3095-2012), the AQIs are calculated by using six air pollutants: PM2.5, PM10, CO, NO2, O3, and SO2.

Figure 1. Research area and the distribution of Beijing air quality monitoring stations.
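To make the relation between pollutant concentrations and the AQI concrete, the following minimal sketch computes an index as the maximum of piecewise-linear sub-indices. The breakpoints are an assumption: they are the 24-hour values commonly cited from the Chinese technical regulation HJ 633-2012 and should be checked against the standard; only PM2.5 and PM10 are included, and the function names are hypothetical.

```python
# Sub-index levels shared by all pollutants (assumed from HJ 633-2012).
IAQI_LEVELS = [0, 50, 100, 150, 200, 300, 400, 500]

# 24-hour concentration breakpoints in ug/m3 (assumed; verify against the standard).
BREAKPOINTS = {
    "PM2.5": [0, 35, 75, 115, 150, 250, 350, 500],
    "PM10": [0, 50, 150, 250, 350, 420, 500, 600],
}


def sub_index(pollutant, concentration):
    """Piecewise-linear individual index for one pollutant (unrounded)."""
    bps = BREAKPOINTS[pollutant]
    c = min(max(concentration, bps[0]), bps[-1])  # clamp to the table range
    for i in range(len(bps) - 1):
        lo, hi = bps[i], bps[i + 1]
        if c <= hi:
            i_lo, i_hi = IAQI_LEVELS[i], IAQI_LEVELS[i + 1]
            return (i_hi - i_lo) / (hi - lo) * (c - lo) + i_lo
    return float(IAQI_LEVELS[-1])


def aqi(concentrations):
    """Overall index: the largest sub-index among the pollutants supplied."""
    return max(sub_index(p, c) for p, c in concentrations.items())


print(aqi({"PM2.5": 100, "PM10": 120}))
```

Because the overall index is a maximum, a single dominant pollutant (often PM2.5 in Beijing) determines the reported value.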
Meteorological data
This dataset consists of meteorological data of all districts in Beijing and some districts surrounding Beijing; the data were acquired every hour from May 2014 to April 2015. The total number of records of meteorological data is over 110,000. There are eight properties included in the meteorological data, such as ID, time, weather, wind speed, and wind direction as shown in Table 1. The property of weather can be divided into 17 sub-properties, e.g., moderate rain, heavier rain, rainstorm, thunderstorm, and sunny.
Table 1. The properties of meteorological data.
Weather forecast data
This dataset consists of forecast data of each district in Beijing; the data were acquired every three or six hours from May 2014 to April 2015. There are nine properties in the weather forecast data, such as weather, up_temperature, bottom_temperature, time_forecast, and time_future. The total number of weather forecast records is over 370,000.
Methodology
Before describing the airQP-DNN model in detail, its framework is presented. Then, the two predictors of the airQP-DNN model are described in turn.
Framework
This research aims to address the issue of air quality prediction through the development and implementation of two predictors. Figure 2 depicts the framework of airQP-DNN, the air quality prediction model used to predict AQI values over future time periods. This model consists of two predictors: a time predictor and a spatial predictor. The time predictor focuses on changes in a station’s time series, since the value of air pollution changes over time. The AQI values of a station over the past few hours, the meteorological data of the area where the station is located, the weather forecast data, and the historical AQI data of the station are used to predict the AQI value at future times. LSTM, which performs well on sequential data, is adopted for this regression task. The spatial predictor is more concerned with the spread of air pollution in space. After being discharged into the air, air pollutants are affected by meteorological factors such as wind, and spread to surrounding areas. Because manually extracting these features is difficult, a neural network is used to construct the spatial predictor; meteorological data and the historical data of surrounding stations are fed into it to learn the influence of the surrounding stations on the target station. To further improve the accuracy, the results of the time predictor and the spatial predictor are put into a decision tree to obtain a final prediction.

Figure 2. Architecture for air quality prediction model and case study of outdoor activities routes planning in Beijing.
The second component is a case study of outdoor activities routes planning in Beijing based on the AQI value, which is the output of the first component. Although the AQI values of the 36 stations in Beijing at future times can be predicted by airQP-DNN, no air quality monitoring stations are deployed in the majority (more than 99%) of the city. Hence, the entire city is divided into grid cells, and the air quality value in each cell is inferred by spatial interpolation based on the Kriging method (Li et al., 2017). In this component, the route can be planned with the minimum cost, such as the minimum AQI or minimum path length.
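The interpolation step can be illustrated with a minimal ordinary Kriging sketch. It is not the implementation used in this study: the exponential variogram, its sill and range, the zero nugget, and the planar kilometre coordinates are all assumptions made for illustration, and the function names are hypothetical.

```python
import numpy as np


def exp_variogram(h, sill=1.0, range_km=20.0):
    """Exponential semivariogram with zero nugget (assumed model and parameters)."""
    return sill * (1.0 - np.exp(-h / range_km))


def ordinary_kriging(xy, values, targets, variogram=exp_variogram):
    """Interpolate `values` observed at `xy` (n x 2) onto `targets` (m x 2)."""
    xy = np.asarray(xy, dtype=float)
    values = np.asarray(values, dtype=float)
    targets = np.asarray(targets, dtype=float)
    n = len(xy)
    # Left-hand side: station-to-station semivariances plus the unbiasedness constraint.
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = variogram(d)
    A[n, n] = 0.0
    # Right-hand side: station-to-target semivariances, one column per target.
    d0 = np.linalg.norm(xy[:, None, :] - targets[None, :, :], axis=2)
    B = np.ones((n + 1, len(targets)))
    B[:n, :] = variogram(d0)
    weights = np.linalg.solve(A, B)[:n, :]  # drop the Lagrange multiplier row
    return weights.T @ values


stations = [(0, 0), (10, 0), (0, 10), (10, 10)]  # hypothetical station coordinates (km)
aqi_values = [50, 80, 120, 150]                  # hypothetical predicted AQI per station
print(ordinary_kriging(stations, aqi_values, [(0, 0), (5, 5)]))
```

With a zero nugget the estimate reproduces the station value at a station location, and the weights always sum to one, so a grid cell far from every station tends toward a weighted average of the station values.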
The air quality prediction model: airQP-DNN
There are three key elements, i.e., time predictor, spatial predictor, and prediction aggregator, in predicting AQI values. The time predictor is to conduct the prediction from the temporal perspective based on historical data. The spatial predictor is from the spatial perspective with the consideration of the surrounding environment. Afterwards, the prediction aggregator is an integration of outputs from the abovementioned two elements. More details will be discussed as follows.
Time predictor
Intuitively, the change in AQI is time-dependent. In other words, the previous hours’ AQI data have an important impact on the future trend of the AQI. As Figure 3 shows, the weather forecast data of the target station also have an impact on the future AQI; these datasets are converted into two sub-features. Since time has been proven relevant to the AQI (Zheng et al., 2015), time is also transformed into a sub-feature. These features are then concatenated into the whole feature vector. To capture the impact of the whole feature vector on the future trend of the AQI, a mathematical model needs to be built. Different from the work by Zheng et al. (2015), who used a linear regression model, we adopt the LSTM as the model of the time predictor because LSTM has an advantage in processing and predicting sequential data (Hochreiter and Schmidhuber, 1997). The whole dataset is divided randomly into a training set (80%) and a testing set (20%). During the training procedure, the previous three hours’ AQI data, T − 2, T − 1, and T, are the input of the time predictor, and the next six hours’ AQI data, T + 1, T + 2, T + 3, T + 4, T + 5, and T + 6, are the output of the time predictor. During the testing procedure, the whole feature vector is the input of the time predictor, and the output of the time predictor is the next six hours’ predicted AQI values. Predicted values are not fed back as inputs for later predictions: each predicted value contains error, and reusing it as an input would make the cumulative error grow larger and larger. The framework of the time predictor can be seen in Figure 3. However, the time predictor has its weaknesses. For example, it might not handle the impact of the spatial dimension on the AQI well. This problem could be addressed by designing a spatial predictor, which is introduced below.

Figure 3. The framework of the time predictor.
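The input and output arrangement of the time predictor (three observed hours in, six future hours out) can be sketched as a sliding window over an hourly AQI series. This is only an illustration of how training pairs might be built; the function name is hypothetical, and the additional meteorological, forecast, and time sub-features are omitted.

```python
import numpy as np


def make_windows(series, n_in=3, n_out=6):
    """Pair each run of n_in observed hours with the n_out hours that follow it."""
    series = np.asarray(series, dtype=float)
    n_samples = len(series) - n_in - n_out + 1
    if n_samples <= 0:
        return np.empty((0, n_in)), np.empty((0, n_out))
    X = np.stack([series[i:i + n_in] for i in range(n_samples)])
    Y = np.stack([series[i + n_in:i + n_in + n_out] for i in range(n_samples)])
    return X, Y


hourly_aqi = np.arange(12)       # placeholder series standing in for station data
X, Y = make_windows(hourly_aqi)
print(X.shape, Y.shape)          # inputs are T-2..T, targets are T+1..T+6
```

Because all six targets are taken from observed data, a model trained on these pairs outputs the six hours at once rather than feeding its own predictions back in.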
Spatial predictor
Air pollution is space-dependent. In other words, the AQI values of surrounding stations are related to the predicted AQI value of the target station. Therefore, the AQI values of the surrounding stations should be considered as a feature. The surrounding stations not only include nearby stations, but also include the stations located in surrounding cities, where the distance between the target station and the surrounding stations ranges from several kilometers to hundreds of kilometers. It is noted that surrounding stations with different distances and directions to the target station have different impacts on the AQI of the target station. Referring to Zheng et al. (2015), and taking the target station as the center, three circles with radius of 30 km, 150 km, and 300 km were drawn and divided into eight sectors, as shown in Figure 4. Each sector corresponds to a region, and air quality monitoring stations in Beijing and surrounding cities will be allocated into the corresponding sectors. Different colors of sectors indicate different levels of the AQI values.

Figure 4. The framework of the spatial predictor.
The aim of the spatial predictor is to predict the AQI data of the target station from the spatial perspective with the consideration of the surrounding environment. To model the effect of the surrounding environment, we adopt a deep neural network since it is very challenging to manually extract these features. The deep neural network consists of an input layer, output layer, and two hidden layers. If the number of layers continues to increase, the computing time will rapidly increase. The following paragraphs first introduce the construction of features from three aspects, and then present the training process of the spatial predictor.
The first aspect is the influence of the surrounding environment, which can be formulated by the AQI from the target station and the surrounding stations. In Figure 4, if the color of a sector is transparent, it means that there is no air quality monitoring station deployed in the region, and therefore the feature from that area is not considered for the spatial predictor. On the contrary, if the color of a sector area is orange, it means that the AQI of the monitoring station deployed in this region is high, and thus the feature from that sector should be considered in the spatial predictor. In short, the darker the color, the higher the AQI value. If there are multiple monitoring stations in the sector, the station closer to the target station is selected as the representative station of the region. In this way, the first sub-vector can be constructed.
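A minimal sketch of this sector assignment follows, assuming station positions are given as east/north offsets in kilometres from the target station and that the eight directions are counted counter-clockwise from east. Both conventions, and the function names, are assumptions for illustration; only the ring radii and the closest-station rule come from the text.

```python
import math

RINGS_KM = (30.0, 150.0, 300.0)  # ring radii from the text
N_DIRECTIONS = 8                 # eight directional sectors per ring


def sector_of(dx_km, dy_km):
    """Return (ring, direction) for an offset from the target, or None beyond 300 km."""
    dist = math.hypot(dx_km, dy_km)
    ring = next((i for i, r in enumerate(RINGS_KM) if dist <= r), None)
    if ring is None:
        return None
    angle = math.degrees(math.atan2(dy_km, dx_km)) % 360.0
    direction = min(int(angle // (360.0 / N_DIRECTIONS)), N_DIRECTIONS - 1)
    return ring, direction


def representative_aqi(stations):
    """stations: iterable of (dx_km, dy_km, aqi). Keep the closest station per sector."""
    best = {}
    for dx, dy, aqi in stations:
        key = sector_of(dx, dy)
        if key is None:
            continue  # outside the outermost ring
        dist = math.hypot(dx, dy)
        if key not in best or dist < best[key][0]:
            best[key] = (dist, aqi)
    return {key: aqi for key, (dist, aqi) in best.items()}


print(representative_aqi([(10, 1, 80), (20, 2, 120), (5, 100, 60), (400, 0, 300)]))
```

Sectors that never appear in the returned dictionary correspond to the transparent sectors in Figure 4, i.e., regions without a monitoring station.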
The second and third aspects are the influence of meteorological data and time data, which can be formulated by two sub-vectors. For training, we use the three kinds of data mentioned above for the previous three hours, T − 2, T − 1, and T, as input, and the actual AQI values for the next six hours, T + 1, T + 2, T + 3, T + 4, T + 5, and T + 6, as output. As shown in equation (1), we need to learn the function f(x) with the deep neural network, where x is the input data.
The influence of distance and direction between the target station and the surrounding stations can be formulated by the division into sectors. This division allows us to aggregate the intra-sector data from different monitoring stations, which reduces the dimensions of the input data and thus makes the spatial predictor train faster and more accurately. For example, two stations in one sector may have opposite wind directions due to the geographical environment. If the data from both stations were fed into the spatial predictor, they might negatively affect its accuracy. Based on the division into sectors, we aggregate the intra-sector data as a whole and feed them into the spatial predictor. To express the different sectors’ effects on the AQI value of the target station, different weights are allocated to the sectors; these weights are learned by the deep neural network during training.
The above three aspects are combined into one whole feature vector and fed into the spatial predictor. Over 300,000 records participate in the training process of the spatial predictor for predicting the AQI of the target station, which is depicted as the center of the circles in Figure 4. After about 200 iterations, the predicted AQI value becomes stable, and thus a relatively accurate AQI value can be obtained from the spatial predictor.
Prediction aggregator
Since each of the two predictors gives a one-sided result, the two results should be aggregated. There are some classical ways to aggregate different models for better results, such as bagging or AdaBoost. The advantage of bagging is that it can handle overfitting during the training procedure; however, the weights of the different prediction functions are the same. AdaBoost’s advantage is that it can employ different weak classifiers; however, data imbalance may degrade its classification accuracy.
Different from bagging and AdaBoost, our proposed prediction aggregator uses the classification and regression tree to aggregate data from the time predictor and spatial predictor because it can automatically judge the importance of features with less computational cost. We first combine the results of the time predictor and the spatial predictor with the weather forecasting data and meteorological data into a vector. The prediction aggregator builds a tree using these features and then prunes the tree by learning from the samples. Lastly, it chooses the optimized subtree with the minimum error and outputs the predicted AQI value. When a piece of data is entered into the prediction aggregator, the model assigns different weights to the time predictor and the spatial predictor based on the meteorological and weather forecast data of the target station.
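The idea of aggregating the two predictors with a regression tree can be sketched as follows. This is a simplified stand-in, not the aggregator used in this study: it grows a small tree by greedy squared-error splits and limits depth instead of pruning, and the three-column feature layout (time prediction, spatial prediction, wind speed) and all sample values are hypothetical.

```python
import numpy as np


def build_tree(X, y, depth=0, max_depth=3, min_leaf=2):
    """Grow a regression tree by greedily minimising the squared error of each split."""
    node = {"value": float(np.mean(y))}
    if depth >= max_depth or len(y) < 2 * min_leaf:
        return node
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j])[:-1]:  # candidate thresholds for feature j
            left = X[:, j] <= t
            n_left = int(left.sum())
            if n_left < min_leaf or len(y) - n_left < min_leaf:
                continue
            sse = (((y[left] - y[left].mean()) ** 2).sum()
                   + ((y[~left] - y[~left].mean()) ** 2).sum())
            if best is None or sse < best[0]:
                best = (sse, j, float(t), left)
    if best is None:
        return node
    _, j, t, left = best
    node["feature"] = j
    node["threshold"] = t
    node["left"] = build_tree(X[left], y[left], depth + 1, max_depth, min_leaf)
    node["right"] = build_tree(X[~left], y[~left], depth + 1, max_depth, min_leaf)
    return node


def predict_one(node, x):
    """Walk from the root to a leaf and return the leaf mean."""
    while "feature" in node:
        go_left = x[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["value"]


# Hypothetical rows: [time prediction, spatial prediction, wind speed]; target is observed AQI.
X = np.array([[50.0, 60.0, 1.0], [50.0, 60.0, 1.0], [50.0, 90.0, 8.0], [50.0, 90.0, 8.0]])
y = np.array([50.0, 52.0, 90.0, 88.0])
tree = build_tree(X, y)
print(predict_one(tree, [50.0, 95.0, 9.0]))
```

Each leaf returns the mean observed AQI of the training samples that reach it, so the split features chosen by the tree indicate which inputs it found most informative.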
The case study of outdoor activities routes planning in Beijing
Conventional route planning, such as that via the standard Dijkstra algorithm, can find the path between a specific travel origin and destination with the fixed minimum cost. This cost may be computed according to one objective or multiple objectives. However, sometimes the cost of each path is time-dependent. Previous researchers (Boyan and Mitzenmacher, 2001; Nikolova et al., 2006) addressed the general problem of time-dependent cost and optimized the standard Dijkstra algorithm by using a stochastic model, but they ignored specific application objectives, such as travel time and air quality, whose costs are also time-dependent. Zahmatkesh et al. (2015) took the minimum air pollution as the optimization objective to plan a route, but regarded the AQI as a static cost. In other words, they did not consider that the AQI value is time-dependent. Although all the above methods are capable of finding a route, none of them can plan a route with a minimum AQI value that is time-dependent. In this section, a case study of outdoor activities routes planning in Beijing is presented that considers not only static path length and travel time but also dynamic air quality, in order to find optimal routes for path-based outdoor activities.
Unfortunately, the travel origin, the destination, and the locations along the path are arbitrary points, and do not necessarily have the needed AQI values because the airQP-DNN model only predicts the future AQI values of the 36 stations. Therefore, before planning routes for outdoor activities, it is necessary to infer the unknown AQI values from the known data of the 36 stations. Note that the change of AQI values conforms to the first law of geography; in other words, values of an unknown variable are spatially related, and the values from closer locations are more similar than those from locations farther apart. In this research, the Kriging method is employed to interpolate the AQI value across the entire research area based on a
Experiments and results
A Web-GIS-based prototype has been developed to conduct the experiments in this case study, in which the AQI data, meteorological data, and weather forecast data of Beijing are used. The prototype contains three parts: a prediction server, map server (AMap: https://www.amap.com/), and browser. The prediction server provides the AQI prediction function, from which we can get the AQI value of a future time. The map server provides the map function and the route planning function. In addition, the user can input an origin and destination, and the browser will show routes provided by the map server.
Results of the time predictor
First, A stands for the historical AQI data, T represents the hour of the day and the day of the week, M denotes the meteorological data, and F represents the weather forecast data. The prediction of each hour
Then, the different features A, T, M, and F are added in turn in order to train the time predictor. From Table 2, it can be noted that the lowest accuracy is 78.6% and the highest accuracy is 79.5% (when using all features A, T, M, and F), and that the MAE decreases from 25.8 to 21.8.
Table 2. Results of the time predictor.
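For readers who wish to reproduce the two reported metrics, a minimal sketch is given below. The mean absolute error is standard; the accuracy formula is an assumption (one minus the summed absolute error divided by the summed observed AQI), since the definition used in this paper does not appear in this section. The sample values are hypothetical.

```python
def mae(actual, predicted):
    """Mean absolute error between observed and predicted AQI values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def relative_accuracy(actual, predicted):
    """ASSUMED accuracy definition: 1 - sum|predicted - actual| / sum(actual)."""
    total_error = sum(abs(a - p) for a, p in zip(actual, predicted))
    return 1.0 - total_error / sum(actual)


observed = [100, 200, 100]   # hypothetical observed AQI values
forecast = [90, 220, 110]    # hypothetical predicted AQI values
print(mae(observed, forecast), relative_accuracy(observed, forecast))
```

Under this assumed definition, large errors on heavily polluted hours reduce the accuracy in proportion to their share of the total observed AQI.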
Results of the spatial predictor
When training the spatial predictor, the features A, T, and M are added in turn, and it is noted that the prediction accuracy is effectively improved. The time of day and whether it is a working day help to reveal the pattern of AQI variation. Thus, after adding feature T, the accuracy improves from 73.5% to 73.8%, as shown in Table 3. Since meteorological data play an important role in the variation of AQI, adding feature M to the spatial predictor increases the prediction accuracy from 73.8% to 74.3% and causes the MAE to drop by 1.3.
Table 3. Results of the spatial predictor.
Results of the prediction aggregator
After obtaining the results of the time predictor and the spatial predictor, the results are combined through the prediction aggregator based on the decision tree theory, which can automatically adjust the weights between results according to the current meteorological conditions, such as wind speed and rainfall. As shown in Table 4, using the prediction aggregator can further improve the accuracy of the prediction from 74.3% to 80.6%, and significantly reduce the error from 31.2 to 20.9 because the prediction aggregator has the ability to learn from the samples.
Table 4. Results of the prediction aggregator.
Comparative results
In this section, the following comparative experiments have been conducted. The first is to evaluate the airQP-DNN prediction accuracy of the time-dependent cost of the planned routes, i.e., the AQI. The second is to implement the case study of outdoor activities routes planning in Beijing based on our airQP-DNN.
Comparative experiment 1
There are some classical models to predict values from a time series with good results. ARIMA is a well-known time series prediction method that is often used in predicting air quality. The gradient boosted decision tree (GBDT) is another commonly used model (Friedman, 2001), and is a machine-learning model for regression and classification. ARIMA, GBDT, and LSTM are applied to act as baselines to evaluate our airQP-DNN.
As shown in Table 5, compared to the three common time series models ARIMA, GBDT, and LSTM, it can be seen that the prediction accuracy of our airQP-DNN is greater by 5.5%, 1.4%, and 1.1%, respectively. The experimental result shows that our airQP-DNN is superior to the above models in AQI prediction.
Table 5. Comparative results of AQI prediction accuracy.
Comparative experiment 2
There are different costs in the outdoor activities routes planning: the path length, travel time, and air quality. As shown in Figure 5, the first experiment is conducted to find the shortest path length (Route 1, green line with a red border). Then, another experiment is conducted to find the shortest travel time (Route 2, green line). Finally, the AQI values of each route are separately computed, and the optimized route with the minimum AQI value and the shortest travel time (Route 2, green line) is output.

Figure 5. Distribution of Beijing AQI values and route planning with the minimum AQI.
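The difference between a shortest-time route and a minimum accumulated AQI route can be sketched with a Dijkstra search over (node, elapsed time) states, so that the edge cost may depend on the hour at which the edge is entered. The cost definition (forecast AQI at the node being entered multiplied by travel minutes), the six-hour horizon handling, the toy graph, and all names are assumptions for illustration; the prototype in this study relies on the map server for routing.

```python
import heapq


def min_exposure_route(graph, aqi_forecast, origin, destination, horizon_min=360):
    """
    graph: {node: [(neighbor, travel_minutes), ...]}
    aqi_forecast: callable (node, hour_index) -> predicted AQI
    Returns (accumulated AQI * minutes, path), or None if unreachable within the horizon.
    """
    start = (origin, 0)
    best = {start: 0.0}
    parent = {start: None}
    heap = [(0.0, 0, origin)]
    while heap:
        cost, elapsed, node = heapq.heappop(heap)
        if cost > best.get((node, elapsed), float("inf")):
            continue  # stale heap entry
        if node == destination:
            path, state = [], (node, elapsed)
            while state is not None:
                path.append(state[0])
                state = parent[state]
            return cost, path[::-1]
        for nxt, minutes in graph.get(node, []):
            arrival = elapsed + minutes
            if arrival > horizon_min:
                continue  # beyond the forecast horizon
            # Assumed exposure: AQI forecast for the departure hour at the node entered.
            new_cost = cost + aqi_forecast(nxt, elapsed // 60) * minutes
            if new_cost < best.get((nxt, arrival), float("inf")):
                best[(nxt, arrival)] = new_cost
                parent[(nxt, arrival)] = (node, elapsed)
                heapq.heappush(heap, (new_cost, arrival, nxt))
    return None


# Toy network: the route via B is faster (60 min) but passes through higher AQI.
graph = {"A": [("B", 30), ("C", 40)], "B": [("D", 30)], "C": [("D", 40)]}
table = {"A": [80, 80], "B": [200, 200], "C": [50, 50], "D": [60, 60]}
forecast = lambda node, hour: table[node][min(hour, len(table[node]) - 1)]
print(min_exposure_route(graph, forecast, "A", "D"))
```

In this toy network the search returns the slower route through C, because its accumulated exposure (4400) is lower than that of the faster route through B (7800).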
Discussion and conclusion
Despite the increasing number of studies on routes planning, little effort has been put into routes planning for outdoor activities, such as long-distance cycling, running, jogging, and walking, that considers a dynamic cost, e.g., air quality. In addition, accurately predicting air quality spatially and temporally is a very challenging task that still leaves room for improvement. In this paper, an AQI prediction model (namely airQP-DNN) is proposed to address the issue. This paper primarily consists of two components. The first component is the airQP-DNN model, which can predict the future AQI based on a deep neural network, using historical air quality datasets, current meteorological datasets, and weather forecasting datasets. The second component refers to the case study of outdoor activities routes planning in Beijing, which can help plan the routes for outdoor activities based on the airQP-DNN model, and allow users to enter the origin and destination of the route for the optimized path with the minimum accumulated AQI. The air quality monitoring datasets of Beijing and surrounding cities from April 2014 to April 2015 (over 758,000 records) are used to verify the proposed airQP-DNN model. The experimental results demonstrate that our proposed model is better than other commonly used methods in terms of prediction accuracy, including ARIMA, GBDT, and LSTM. As for the case study of outdoor activities routes planning in Beijing, when the origin and destination are specified, the optimized paths with the minimum accumulated AQI are provided, instead of the standard static Dijkstra shortest path. In addition, a Web-GIS-based prototype has also been successfully developed and implemented to test our proposed model in this research. The success of our study not only demonstrates the value of the proposed airQP-DNN model, but also shows the potential of our model in other possible extended applications.
On the other hand, there is still room for improvement in this research. First, Kriging spatial interpolation is applied to infer the AQI value. This step could be further improved by semi-supervised models that consider ancillary datasets, such as meteorological data and road networks, which will be one of the major directions of our future research. Second, the travel time criterion considered for routing is assumed to be fixed, while it should be dynamic due to the variation of traffic conditions or other factors. We will also address this point in our future study.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work is supported by the National Natural Science Foundation of P. R. China (Nos. 41571389, 61472193, and 41501431), the Key Laboratory of Spatial Data Mining & Information Sharing of Ministry of Education, Fuzhou University (No. 2016LSDMIS07), the Jiangsu High Technology Research Key Laboratory for Wireless Sensor Networks (No. WSNLBZY201519), and the NJUPT Natural Science Foundation (No. NY215116).
