Abstract
Private transport has become a viable and increasingly popular option for urban transportation. However, with this growth, an old and recurring problem becomes more pronounced: the mismatch between passenger demand and taxi supply. This problem calls for techniques that reduce the gap between passenger demand and the number of vehicles needed to meet it. This work introduces a new approach to forecasting and classifying taxi passenger demand. The proposed approach uses historical data from taxi rides and meteorological data. The Kruskal-Wallis method identifies the most relevant variables, and an evolving fuzzy system performs demand forecasting/classification. Five evolving systems are evaluated with our approach: Autonomous Learning Multi-Model (ALMMo), evolving Multivariable Gaussian Fuzzy System (eMG), evolving Fuzzy with Multivariable Gaussian Participatory Learning and Recursive Maximum Correntropy (eFCE), evolving Fuzzy with Multivariable Gaussian Participatory Learning and Multi-Innovations Recursive Weighted Least Squares (eFMI), and evolving Neo-Fuzzy Neuron (eNFN). Computational experiments using real-world data were conducted to evaluate and compare the performance of the proposed approach. The results show that it performs better than or comparably to state-of-the-art methods. Therefore, the experimental results suggest that the proposed approach is a promising alternative for forecasting and classifying taxi passenger demand.
Introduction
Taxi rides are an attractive alternative to traditional public transport, as they offer efficient and convenient services to passengers, allowing for a personalized and easily accessible experience [1]. However, a pertinent and constant problem for this transport system has not yet been resolved: the supply and demand of passengers. Passengers may be left without service due to a high demand in a specific region, and, on the other hand, there are places where the offer of vehicles is large, but few customers will request rides [2]. By knowing in advance what the demand of a particular region is, taxi companies can intelligently organize their fleet, increasing or decreasing the offer according to the need. As a result, the waiting time for passengers and the idleness of taxis is reduced, and consequently, taxi drivers can serve more customers and optimize their profits [3]. In addition, vehicle depreciation and time spent in transit are also reduced, leading to reduced fuel and maintenance costs [4]. Research shows that if the supply of taxis were able to meet the desired demand, the inefficiency of the private transport market would fall by more than 60% [5].
Various machine-learning techniques have been used to address the issue of passenger supply and demand. Autoregressive Integrated Moving Average (ARIMA) [6, 7], Convolutional Neural Networks (CNN) [8, 9], Long Short-Term Memory (LSTM) [10, 11], and Support Vector Machine (SVM) [1, 12] are examples of some of them. However, these techniques, for the most part, present certain limitations inherent to their nature, namely: needing periodic revisions of the models, training them, and adapting them again to the reality of the data [13]; not dealing with situations in which there are changes in environmental conditions, typical of online configurations with non-stationary data [14]; rarely being able to handle complex systems where there are multiple modes of operation [15].
These limitations generate methodological gaps that make it challenging to predict taxi demand accurately. This is mainly because the demand is related simultaneously to several exogenous factors with high volatility. Temperature, time of day, important events in the region, and traffic are some examples among them [16].
Furthermore, extracting knowledge from datasets with these characteristics is a complex and unintuitive task, which adds uncertainty to the interpretation of such information [17]. Given this context, it is desirable that the techniques used to forecast taxi demand learn continuously and incrementally, adapting to external factors related to traffic. Additionally, these systems should, preferably, overcome the limitations inherent to the aforementioned traditional techniques.
Evolving Fuzzy Systems (EFS) have been proposed for the incremental processing of large data streams, in which samples are presented only once to the model and then discarded [15]. These systems are mainly characterized by developing their structure (in this case, neurons, clusters, data clouds, and/or fuzzy rules) and updating their parameters online, possibly in real time, as new information is received from a continuous data stream [18, 19]. In addition, EFS are able to deal with uncertainty in the information, which is inherent to the problem in question. EFS have been successfully used in several applications, such as forecasting in the financial market [18, 21], handling missing data [22, 23], urban mobility and traffic management [24, 25], forecasting and monitoring weather data [26, 27], pattern recognition [19, 28], and diagnostics in health care [29], to name a few.
In this context, this work introduces a new approach to forecasting and classifying taxi passenger demand. The forecasting approach can be summarized in four steps. The first step is building the dataset, which contains taxi ride information and meteorological data for a city or region. After the data are obtained, the area is split into microregions, named zones, and the rides in each zone are aggregated to characterize the taxi demand. The second step is feature construction: the input variables are built from historical values of taxi demand and from meteorological data, yielding a set of 26 input variables, of which 21 are extracted from taxi rides and 5 from meteorological data. The third step is feature selection, performed with the Kruskal-Wallis statistical method. Finally, the fourth step is demand forecasting, carried out by an evolving fuzzy system that forecasts one step ahead.
Taxi demand classification is performed by transforming demand forecasting into a classification problem. In other words, the classification is carried out using the outputs of the demand forecast. The steps to create the dataset, construct the features, and select the features are therefore the same as those used to forecast demand and are not described again. The remaining steps of the classification approach are as follows. The first step is the definition of classes: based on the dataset created in the first steps, the rides are divided into 4 classes (Very Low, Low, Medium, and High). The second step is the classification itself, performed by converting the results obtained in the forecasting step into classes. The third and last step is the visualization of the classes in a heatmap, updated at every new prediction.
In summary, the main contributions of this work are: The description of a new approach to forecasting taxi passenger demand using evolving fuzzy systems, historical ride data, and weather information. The proposition of a new approach to the classification of taxi demand. The proposed approach uses the demand predicted by the evolving model to perform the classification. The proposed approach was developed for online configurations with volatile environmental conditions and non-stationary data. In these environments, data arrives as a stream, has unlimited size, and is subject to variations and changes in the concept. These are typical characteristics of the problem at hand. The representation of taxi demand in heatmaps. This visualization is more intuitive, allowing a quick visualization of the demand in different city areas and facilitating decision-making.
The rest of the article is organized as follows. Section 2 provides a brief literature review of works related to demand forecasting. Section 3 presents the concepts of evolving fuzzy systems and describes the evolving models used in this work. In Section 4, the proposed approach is introduced and detailed. The computational experiments and their respective results are presented in Section 5. Finally, Section 6 presents the final considerations and proposals for future studies.
Literature review
Taxi passenger demand forecasting has been the subject of several studies, encouraging the modeling and application of various techniques to solve this problem. Initial studies used consolidated statistical methods for time series and historical data from the rides for demand forecasting. For example, Moreira-Matias et al. [6] created a hybrid statistical model to forecast passenger demand for the next 30 minutes at 63 taxi stands in Porto, Portugal. This model is extended in [7] to simulate real-time demand forecasts. Zander [30] forecasts taxi demand in Stockholm, Sweden, using Artificial Neural Networks (ANN). In Qu et al. [31], historical data and a Least Squares Support Vector Machine (LS-SVM) are used to forecast demand for the next 10 minutes in China.
Some studies suggest including exogenous variables, such as time of day, weather conditions, regional events, and day of the week, to improve the models’ performance [10, 32]. Ke et al. [10] performed the taxi demand forecast in Hangzhou, China, using a model based on deep learning named Fusional Convolutional Long Short-Term Memory (FCL-Net). The FCL-Net uses historical data and meteorological information as input. Vanichrujee et al. [32] proposed an approach that combines the results of three models, Gated Recurrent Unit (GRU), eXtreme Gradient Boosting (XGBoost), and Long Short-Term Memory (LSTM), to perform demand forecasting in 7 regions of Bangkok, Thailand. The hybrid model uses taxi rides and weather conditions (rainy, sunny, etc.) as input. In work conducted by Tong et al. [33], a linear regression model named Linear Unit Original Taxi Demand (LinOUTD) is proposed. LinOUTD was designed to process variables from different sources, such as points of interest, discounts applied to rides, and weather conditions, for example. The experiments were conducted in Beijing and Hangzhou, China, performing demand forecasts with 1-hour intervals.
Yu et al. [34] developed a model based on a modified Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm, using points of interest and historical demand to perform demand forecasts at 1-hour intervals in Beijing, China. Rodrigues et al. [35] introduced a framework based on Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM). The framework uses taxi rides and event information to predict demand at two major entertainment centers in New York City. Liu et al. [36] developed a model using Random Forest and Ridge Regression techniques to predict passenger demand at points of interest in Xi'an, China, at 1-hour intervals. Finally, Kong et al. [37] developed a framework named TBI2Flow, which aggregates information from passenger applications, weather, local traffic, and vehicle location to perform the forecast in Shanghai at 1-hour intervals.
Other approaches in the literature extract knowledge from maps (spatial) combined with historical ride data (temporal) to create hybrid models, known as Spatio-Temporal models. Zhao et al. [38] proposed the Unified Spatial-Temporal Network (USTN), which uses maps processed in convolutional layers, along with historical data from Uber and information from New York yellow taxis, to predict demand at 1-hour intervals. Liu et al. [39] created a model named the Context-Aware Attention-Based Convolutional Recurrent Neural Network (CACRNN), combining Points Of Interest (POI) with meteorological data to predict demand for the next 15 minutes in Chengdu and New York. In Luo et al. [40], a feature selector based on a statistical method known as Augmented Dickey-Fuller (ADF) is used. After selection, the Multi-Task Deep Learning (MTDL) model forecasts demand for the next 10 minutes in New York. A model called Multi-Level Recurrent Neural Networks (MLRNN), based on Long Short-Term Memory (LSTM), is the subject of the work by Zhang et al. [41]. The model receives meteorological information and historical data to predict demand for the next 30 minutes in Manhattan, New York.
Zhu et al. [42] proposed a framework combining Multilayer Perceptron (MLP) and Long Short-Term Memory (LSTM) to capture the temporal dependencies of rides. At the same time, convolutional layers are used to interpret event data and convert it into information to predict demand in New York at two venues where significant events occur (Barclays Center and Terminal 5). Lin et al. [43] use weather data, moving averages, and historical rides to feed a model which uses three deep learning techniques: Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU). This study predicts demand at points of interest (tourist attraction sites, hospitals, etc.) in Kaohsiung City, Taiwan.
Complementarily, Table 1 summarizes all the works described in this section: techniques and models used, time intervals, input variables, and cities where the forecasts occur.
Overview of the state-of-the-art
Evolving fuzzy systems were conceived to fill a methodological gap in the context of adaptive modeling, considering the online processing of a data stream [44]. Generally, this kind of data has the following characteristics:
Taxi demand can be understood as a data stream because ride information (Pick-Up and Drop-Off) is constantly generated and is subject to environmental changes (traffic, weather, etc.). Therefore, evolving fuzzy systems are suitable for taxi demand forecasting and classification.
After presenting a brief introduction to evolving fuzzy systems and their promising applications in taxi demand data, the subsequent sections regard a technical description of the five models used in this work: ALMMo (Autonomous Learning Multi-Model System), eMG (evolving Multivariable Gaussian Fuzzy System), eFCE (evolving Fuzzy with Multivariable Gaussian Participatory Learning and Recursive Maximum Correntropy), eFMI (evolving Fuzzy with Multivariable Gaussian Participatory Learning and Multi-Innovations Recursive Weighted Least Squares), and eNFN (evolving Neo-Fuzzy Neuron).
ALMMo - autonomous learning multi-model system
ALMMo, proposed in [18], is an evolving multi-model fuzzy system with fuzzy rules of the AnYa type [47]. The antecedent of the AnYa rules is based on the concept of data clouds. The principle of data clouds is similar to that of clustering algorithms. In other words, each cloud represents a set of samples with similar characteristics. Each cloud is shaped by its constituent data, and focal points represent its centroids. ALMMo rules can be described by:
in which $x_t = [x_1, x_2, \ldots, x_M]^T$ is a data sample in the Euclidean space $\mathbb{R}^M$.
ALMMo structure evolves based on the data clouds, creating or deleting them as new samples are added to the model. With the arrival of new samples, the scalar means of products
and
Then the unimodal density
For each sample $x_t$, the ALMMo structure is evaluated, and the model may create a new rule, merge two rules, delete a rule, or update the parameters. Thus, for each sample received, Condition 1 is checked to decide whether a new rule should be created.
If Condition 1 is not satisfied, the sample is assigned to the closest data cloud, whose parameters are updated. Otherwise, if Condition 1 is satisfied, a new rule is created with the focal point at $x_t$, and Condition 2 is tested. This condition checks if the created cloud overlaps with any existing clouds.
If Condition 2 is satisfied, the overlapped cloud is replaced by the newly created one, and the new cloud inherits the consequent parameters of the overlapped cloud. Otherwise, if Condition 2 is not satisfied, the consequent parameters of the new cloud are initialized.
Afterward, the quality of the rule base is evaluated by a utility measure which aims to exclude unrepresentative rules. The utility measure is calculated based on the cumulative sum of the rule’s contribution to the output calculation from its creation to the current sample. In other words, it measures the importance of a given fuzzy rule relative to the other rules ($i = 1, 2, \ldots, G_{t+1}$). It is calculated by:
where $activated_i$ represents the time that the rule/data cloud was activated and
Let $\eta_0$ be a constant which defines a tolerance value. Rules with low utility are excluded according to Condition 3. If Condition 3 is met, the j-th rule/data cloud is deleted, together with its consequent parameters.
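The utility-based exclusion described above can be illustrated with a minimal sketch. It assumes that a rule's contribution at each step is its normalized firing level, that the utility is the cumulative contribution divided by the rule's lifetime, and that a rule is removed when its utility falls below the tolerance. The names `Rule`, `firing_sum`, `utility`, and `prune_rules` are hypothetical and are not taken from the ALMMo reference implementation.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Rule:
    """Bookkeeping for one fuzzy rule/data cloud (hypothetical structure)."""
    activated: int            # step at which the rule was created
    firing_sum: float = 0.0   # cumulative firing level since creation (assumed contribution)


def utility(rule: Rule, t: int) -> float:
    """Average contribution of the rule from its creation up to step t."""
    lifetime = t - rule.activated
    if lifetime <= 0:
        return 1.0  # assumption: a rule created at the current step is never pruned
    return rule.firing_sum / lifetime


def prune_rules(rules: list[Rule], t: int, eta0: float) -> list[Rule]:
    """Keep only the rules whose utility is at least the tolerance eta0."""
    return [r for r in rules if utility(r, t) >= eta0]


# Illustrative values only: the second rule has barely contributed since step 10.
rules = [Rule(activated=0, firing_sum=45.0), Rule(activated=10, firing_sum=0.9)]
kept = prune_rules(rules, t=100, eta0=0.1)
```

In this example the first rule has utility 0.45 and is kept, while the second has utility 0.01 and is removed.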
In addition to rule exclusion, ALMMo implements a method for excluding inter-correlated entries to improve processing time, reduce memory usage, and improve the algorithm’s overall performance. The proposed method is based on the normalized cumulative sum of the input parameter values.
eMG [48] is an evolving fuzzy model that uses first-order Takagi-Sugeno rules and multivariable Gaussian membership functions. The rule base is built under an evolving clustering algorithm with participatory learning. The main characteristic of participatory learning is to use what has already been learned to evaluate the impact of including a new sample in the model. In short, the relevance of a new sample is evaluated concerning what has already been consolidated as knowledge [49].
The eMG structure evolves by updating, creating, or merging clusters/rules, and the output is obtained by the weighted average of the contributions of each rule. The rules are represented by their respective clusters, and each new sample presented to eMG updates its structure by a compatibility measure computed for each cluster. The compatibility measure
where the distance M is calculated by:
in which $x_t$ represents the current sample and
where
An alert mechanism is used to identify when the structure of the current clusters does not adequately represent the current knowledge and needs revision [48]. For each new sample $x_t$, the alert index
where
α being a parameter that defines the level of significance and ω the observation window for calculating the alert index.
A new cluster is created when the compatibility measure
Otherwise, if
then the sample is assigned to the most compatible cluster, which has its center updated using:
where λ ∈ [0, 1] is the learning rate. The eMG consequent parameters are updated by a weighted recursive least squares algorithm.
eFCE [50] builds its structure based on a recursive clustering algorithm with participatory learning and multivariable Gaussian membership functions. The eFCE structure evolves by including, merging, or excluding clusters and rules. As in eMG, clusters are created using a compatibility measure and an alert index. However, in eFCE, the compatibility measure is computed by Euclidean and Mahalanobis distances. Using two distances avoids the singularity problem when calculating the inverse of the dispersion matrix of clusters with a small number of samples. The distance measure is defined as follows: If
If
in which
The exclusion of clusters and rules is based on the concept of age and population. The cluster’s age is used to define the inactivity time of the cluster, and the population represents the number of samples attributed to a cluster. Thus, we find
in which t is the number of samples.
The consequent parameters of eFCE are updated using Recursive Maximum Correntropy. Correntropy is a generalized similarity measure between two random variables; the method minimizes the output error by maximizing the probability density of the source error.
Like eMG and eFCE, the evolving Fuzzy with Multivariable Gaussian Participatory Learning and Multi-Innovations Recursive Weighted Least Squares (eFMI) [51] is an evolving system that uses a clustering algorithm based on participatory learning and multivariable Gaussian membership functions. The clustering algorithm uses a recursively computed compatibility measure to decide when to add a new cluster. In addition, the concepts of age and population are used to exclude inactive clusters and rules.
The eFMI uses a merge procedure based on the significant overlap of a pair of clusters. If the norm of the difference between the centers $\mu_{i^*}$ and $\mu_i$ of two clusters $i^*$ and $i$ is less than or equal to a predefined threshold, the clusters are merged. More formally,
$$\|\mu_{i^*} - \mu_i\| \le \rho$$
in which $\rho$ is the threshold for merging clusters. The center of the new cluster is defined by the weighted average as follows:
$$\mu_{new} = \frac{n_{i^*}\,\mu_{i^*} + n_i\,\mu_i}{n_{i^*} + n_i}$$
in which $n_{i^*}$ and $n_i$ are the numbers of samples of the two clusters, so the new center depends on the number of samples of the merged clusters. The dispersion matrix of the resulting cluster is set to the average of the dispersion matrices of the merged clusters:
$$\Sigma_{new} = \frac{\Sigma_{i^*} + \Sigma_i}{2}$$
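The merge procedure described above can be sketched in a few lines. This is a minimal illustration under the assumptions that the norm is Euclidean and that the merged cluster's sample count is the sum of the two counts; the function names `should_merge` and `merge_clusters` are hypothetical, and vectors and matrices are represented as plain Python lists.

```python
import math


def should_merge(mu_a, mu_b, rho):
    """Merge test: Euclidean norm of the center difference is at most rho."""
    return math.dist(mu_a, mu_b) <= rho


def merge_clusters(mu_a, n_a, sigma_a, mu_b, n_b, sigma_b):
    """Return (center, sample count, dispersion matrix) of the merged cluster."""
    n = n_a + n_b  # assumption: the merged cluster keeps all samples of both
    # Center: average of the two centers weighted by their sample counts.
    mu = [(n_a * a + n_b * b) / n for a, b in zip(mu_a, mu_b)]
    # Dispersion matrix: element-wise average of the two matrices.
    sigma = [
        [(a + b) / 2 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(sigma_a, sigma_b)
    ]
    return mu, n, sigma


# Illustrative values only.
mu_a, n_a, sigma_a = [0.0, 0.0], 30, [[1.0, 0.0], [0.0, 1.0]]
mu_b, n_b, sigma_b = [4.0, 0.0], 10, [[3.0, 0.0], [0.0, 1.0]]

if should_merge(mu_a, mu_b, rho=5.0):
    merged_mu, merged_n, merged_sigma = merge_clusters(
        mu_a, n_a, sigma_a, mu_b, n_b, sigma_b
    )
```

Because the first cluster holds three times as many samples, the merged center lies closer to it (at 1.0 rather than 2.0 on the first axis).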
The eFMI consequent parameters are updated by the MI-WRLS (Multi-Innovation Weighted Recursive Least Squares) algorithm. MI-WRLS introduces the concept of innovation length, which sets the number of entries in the innovation vector, that is, how many past errors are used to update the consequent parameters.
eNFN [44] has a structure composed of n zero-order Takagi-Sugeno models, one for each input variable. The domain of the input variables is partitioned by triangular and complementary membership functions, and its structure evolves based on the modeling error computed recursively. The consequent parameters are updated by an algorithm based on gradient descent with an optimal learning rate.
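To make the structure concrete, the following sketch computes the output of a neo-fuzzy neuron with triangular, complementary membership functions. It assumes that the membership functions of each input are defined by sorted modal values, that inputs outside the partitioned domain are clamped to the nearest function, and that the overall output is the sum of the per-input zero-order models. The names `memberships` and `enfn_output` are hypothetical, and the structure evolution and gradient-based update are not shown.

```python
from bisect import bisect_right


def memberships(x, modal):
    """Degrees of x in triangular, complementary functions with sorted modal values."""
    mu = [0.0] * len(modal)
    if x <= modal[0]:
        mu[0] = 1.0                      # assumption: clamp below the domain
    elif x >= modal[-1]:
        mu[-1] = 1.0                     # assumption: clamp above the domain
    else:
        k = bisect_right(modal, x) - 1   # modal[k] <= x < modal[k + 1]
        right = (x - modal[k]) / (modal[k + 1] - modal[k])
        mu[k], mu[k + 1] = 1.0 - right, right  # at most two active, summing to 1
    return mu


def enfn_output(sample, modal_values, weights):
    """Sum of the zero-order Takagi-Sugeno models, one per input variable."""
    return sum(
        m * w
        for x, modal, q in zip(sample, modal_values, weights)
        for m, w in zip(memberships(x, modal), q)
    )


# Illustrative values only: two inputs with three and two membership functions.
sample = [0.25, 1.0]
modal_values = [[0.0, 0.5, 1.0], [0.0, 1.0]]
weights = [[2.0, 4.0, 6.0], [1.0, 3.0]]
y = enfn_output(sample, modal_values, weights)
```

Since the functions are complementary, only two adjacent functions per input are active for any sample, which keeps the output computation cheap.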
The eNFN starts its learning with two membership functions for each input variable. New functions can be added based on the relationship between the global average error of the model and the average local error of the most active membership function. The mean value
and
where
A limiter τ is used to allow control over the number of rules, avoiding complex models and overfitting. In eNFN, this limiter is compared to the smallest distance (dist) allowed between the modal value of the function to be created and adjacent functions. A new membership function will be created and inserted into the model if
On the other hand, the exclusion of membership functions follows the concept of age. A function will be excluded from the model if it remains inactive for a long time [14]. The age of a particular membership function is calculated by:
$$age_j = t - activated_j$$
where $t$ is the current step and $activated_j$ represents the step at which the $j$-th membership function was activated. The function is excluded if
$$age_j > \omega$$
where $\omega$ is a parameter that indicates the time limit for deleting a function.
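The age-based exclusion can be sketched as follows. This is an illustration only: it assumes that the age is the number of steps since the function's most recent activation and that a function is flagged when its age strictly exceeds the time limit. The name `inactive_functions` is hypothetical.

```python
def inactive_functions(activated, t, omega):
    """Indices of membership functions whose age (t - activated_j) exceeds omega."""
    return [j for j, step in enumerate(activated) if t - step > omega]


# Illustrative values only: activation steps of three membership functions.
activated = [95, 40, 99]
to_delete = inactive_functions(activated, t=100, omega=50)
```

Here the ages are 5, 60, and 1, so only the second function (index 1) is flagged for deletion.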
In this Section, we show in Table 2 a short comparison between the evolving algorithms used in this work. The table presents the membership functions, fuzzy rule type, the methods to add, merge, and delete rules and/or clusters, and the approach to updating the consequent parameters. In Table 2, WRLS is Weighted Recursive Least Squares, FWRLS is Fuzzy Weighted Recursive Least Squares, MI-WRLS is Multi-Innovation Weighted Recursive Least Squares, RMC is Recursive Maximum Correntropy, and GD is Gradient Descent with Optimal Learning Rate.
Comparison between the evolving models
This section presents the proposed approach to taxi demand forecasting and classification. Section 4.1 details the approach for demand forecasting, while Section 4.2 describes the methodology for demand classification.
Demand forecasting
The proposed approach for taxi demand forecasting is illustrated in Figure 1. As can be seen, the approach consists of 4 steps. The first step is extracting the rides database, detailed in Section 4.1.1. Then, Section 4.1.2 describes the input variables and the construction of the dataset. Section 4.1.3 details the third step, which is the feature selection. Finally, the fourth and final step is the demand forecasting presented in Section 4.1.4.

Steps of the proposed approach to demand forecasting.
The rides database contains information on taxi rides completed in a given city or region, provided by companies which operate in the field. Generally, this information is extracted from the monitoring systems installed in the cars (GPS devices and other mobile applications) and presented in the following format: Pick-Up DateTime, Drop-Off DateTime, Pick-Up Location, and Drop-Off Location. Additionally, these datasets can provide complementary information such as the route of the rides, the amount paid, payment type (credit card, cash, etc.), number of passengers, and type of vehicle.
After obtaining the rides dataset, the extraction of information used in the following steps is performed. Thus, only the following information is selected: Pick-Up DateTime, Drop-Off DateTime, Pick-Up Location, and Drop-Off Location.
The next step divides the region of interest into microregions, called zones. This division aims to frame the study region within pre-defined limits and to cluster the rides’ locations. Several studies make this division by delimiting the maximum latitudes and longitudes of a region R and dividing the region into K similar polygons, with rectangles or hexagons being the commonly used shapes [8, 53]. Each vertex of a polygon is defined by a latitude and a longitude. Therefore, a zone z can be described as $R_z$, where 1 ≤ z ≤ K. There are other ways of delimiting zones, such as the selection of tourist attractions (stadiums, museums, etc.) [32, 35], the spatial division by neighborhoods, or irregular shapes pre-defined, for example, by the local government [9]. In this work, the QGIS tool was used to divide the region into zones. However, several other software tools can assist in the delimitation and division of zones, such as ArcGIS and the GeoPandas library.
Once the region was divided into zones, the locations (longitude and latitude of Pick-Up and Drop-Off) were converted to their respective zone number. A pair of coordinates will belong to a zone z, if its latitude and longitude are within the limits of the coordinates of all vertices of the polygon referring to that zone. It is noteworthy that the conversion of latitudes and longitudes into zones can also be performed with the support of the previously mentioned software, which can create polygons.
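For the simplest case of a region divided into equal rectangles, the conversion from coordinates to a zone number can be sketched without GIS software. The sketch below assumes a rectangular region split into a grid of `rows` by `cols` zones numbered from 1 row by row; the function name `zone_of` and the numbering scheme are hypothetical, and irregular polygons would require a point-in-polygon test instead.

```python
def zone_of(lat, lon, lat_min, lat_max, lon_min, lon_max, rows, cols):
    """Return the zone number in 1..rows*cols, or None if outside the region R."""
    if not (lat_min <= lat <= lat_max and lon_min <= lon <= lon_max):
        return None
    # Clamp so that points on the upper boundary fall in the last row/column.
    r = min(int((lat - lat_min) / (lat_max - lat_min) * rows), rows - 1)
    c = min(int((lon - lon_min) / (lon_max - lon_min) * cols), cols - 1)
    return r * cols + c + 1


# Illustrative values only: a 10 x 10 degree region split into 2 x 2 zones.
z_inside = zone_of(2.5, 7.5, 0.0, 10.0, 0.0, 10.0, rows=2, cols=2)
z_corner = zone_of(10.0, 10.0, 0.0, 10.0, 0.0, 10.0, rows=2, cols=2)
z_outside = zone_of(11.0, 5.0, 0.0, 10.0, 0.0, 10.0, rows=2, cols=2)
```

Rides whose coordinates fall outside the region return `None` and can be discarded before the aggregation step.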
Considering a scenario in which the demand forecast is performed for both Pick-Up and Drop-Off, a Pick-Up and a Drop-Off dataset are created for each of the K zones. For example, the Pick-Up dataset of zone z is composed of the Pick-Up DateTime of all rides which occurred within the boundaries of that zone. After each ride’s zone is defined and the dataset for each taxi zone is constructed, the next step is the aggregation of rides to characterize taxi demand. First, continuous time (days and hours) is partitioned into identical sequential periods, defined as time intervals. Then, all rides completed in a given time interval are grouped to characterize the taxi demand for that interval. In other words, the taxi demand D in a zone z is given by the number of rides in that zone in a given time interval. Thus, at the end of this step, there is a dataset containing the time intervals and their respective taxi demands for each zone.
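The aggregation step amounts to counting timestamps per interval. A minimal sketch for a single zone is given below, assuming the Pick-Up DateTime values are available as `datetime` objects; the function name `demand_series` and the 30-minute interval are illustrative choices, not values prescribed by the approach.

```python
from datetime import datetime, timedelta


def demand_series(pickup_times, start, end, interval_minutes):
    """Count rides per time interval in [start, end); index 0 is the first interval."""
    width = timedelta(minutes=interval_minutes)
    n_intervals = (end - start) // width
    demand = [0] * n_intervals
    for ts in pickup_times:
        idx = (ts - start) // width      # interval index of this ride
        if 0 <= idx < n_intervals:       # ignore rides outside the period
            demand[idx] += 1
    return demand


# Illustrative values only: four rides in one zone, 30-minute intervals.
start = datetime(2016, 11, 1, 0, 0)
end = datetime(2016, 11, 1, 1, 0)
pickups = [
    datetime(2016, 11, 1, 0, 5),
    datetime(2016, 11, 1, 0, 29),
    datetime(2016, 11, 1, 0, 30),
    datetime(2016, 11, 1, 1, 0),   # falls outside [start, end)
]
demand = demand_series(pickups, start, end, interval_minutes=30)
```

Intervals without rides keep a demand of zero, so the resulting series has no gaps.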
Feature construction
A time series is a set of observations generated sequentially over time [54]. Given these characteristics, taxi demand can be treated as a time series, and its forecasting can be described as a nonlinear dynamic problem, represented as in [51, 55] by:
in which the forecast at step $t$ is performed using $n_y$ lags to model the series, $t$ being the current step, $y$ the lagged values of the series,
In this study, the historical data used to model the series are characterized by:
for the same relative time interval, in which tr represents the previous day’s relative time interval of the demand to be predicted.
Thus, 18 variables extracted from historical values of taxi demand (series lags) are generated. In addition, using information extracted from taxi rides, the following 3 variables are obtained:
The exogenous variables are obtained from meteorological information, as suggested in [10, 33]. The meteorological data were extracted from the Weather Underground platform. On this platform, meteorological data are available every hour; the dataset uses the values from the interval preceding the one whose demand is predicted. The following list details the 5 variables used in this work:
Therefore, there are 26 variables, 21 of which are extracted from information contained in the rides dataset and 5 exogenous variables obtained from meteorological information. Thus, at the end of this step, there is a dataset containing the indexes of the intervals, the 26 input variables, and their respective taxi demands for each zone.
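As an illustration of how lag-based inputs can be built from a zone's demand series, the sketch below pairs each target demand with recent lags and with lags taken around the same relative interval of the previous day. The lag choices, the reading of the previous-day index as `t - intervals_per_day`, and the names `build_samples`, `recent_lags`, and `prev_day_offsets` are assumptions for illustration; they do not reproduce the 26 variables used in this work, and the calendar and meteorological variables are omitted.

```python
def build_samples(demand, intervals_per_day, recent_lags=(1, 2), prev_day_offsets=(0, 1)):
    """Return (inputs, targets) built from a demand series of one zone."""
    # Earliest step for which every requested lag exists.
    max_back = max(max(recent_lags), intervals_per_day + max(prev_day_offsets))
    inputs, targets = [], []
    for t in range(max_back, len(demand)):
        tr = t - intervals_per_day                       # same interval, previous day (assumed)
        row = [demand[t - k] for k in recent_lags]       # recent lags
        row += [demand[tr - k] for k in prev_day_offsets]  # previous-day lags
        inputs.append(row)
        targets.append(demand[t])
    return inputs, targets


# Illustrative values only: a toy series with 4 intervals per day.
series = list(range(10))
X, y = build_samples(series, intervals_per_day=4)
```

Each row of `X` is one input vector and the matching entry of `y` is the demand to be forecast one step ahead.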
Once the datasets with the input variables and their respective desired outputs (taxi demands) have been created, a feature selection step is performed. This step identifies and selects the most relevant input variables for demand forecasting. In this work, the selection is performed using the Kruskal-Wallis [57] statistical method, which orders the variables according to their degree of relevance.
The selection of variables is necessary since a high number of dimensions presented to the models causes two main problems: (i) longer execution time of the models; and (ii) the curse of dimensionality, a term used to explain the problem caused by the exponential increase in the volume of data associated with the inclusion of extra dimensions in the Euclidean space [58]. In practical terms, once the number of variables exceeds a certain threshold, the models will likely lose performance, consequently affecting their results.
Once sorted, the most relevant N variables which will compose the new datasets are selected. Unfortunately, the Kruskal-Wallis method does not explicitly indicate which N is optimal for the best performance. Therefore, this value must be found empirically or by some other method. At the end of the feature selection step, for each zone, there is a dataset with the indexes of the intervals, the most relevant N variables, and their respective taxi demands.
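A possible way to rank variables with the Kruskal-Wallis statistic is sketched below. It assumes that samples are grouped by a discretized demand level and that variables are ordered by decreasing H statistic; this grouping is an assumption, since the exact procedure is not detailed here. The H statistic is computed without tie correction, and the function names are hypothetical. In practice, a library routine such as `scipy.stats.kruskal` could replace the hand-written statistic.

```python
def average_ranks(values):
    """1-based ranks of the values, with ties receiving their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1            # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_h(values, groups):
    """Kruskal-Wallis H statistic (no tie correction) of values split by group."""
    n = len(values)
    rank_sums, sizes = {}, {}
    for r, g in zip(average_ranks(values), groups):
        rank_sums[g] = rank_sums.get(g, 0.0) + r
        sizes[g] = sizes.get(g, 0) + 1
    total = sum(rank_sums[g] ** 2 / sizes[g] for g in rank_sums)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)


def rank_features(features, groups):
    """Names of the variables sorted by decreasing H statistic."""
    scores = {name: kruskal_h(vals, groups) for name, vals in features.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Illustrative values only: variable "a" separates the two groups, "b" does not.
groups = ["low"] * 3 + ["high"] * 3
features = {"b": [1, 4, 5, 2, 3, 6], "a": [1, 2, 3, 4, 5, 6]}
ranking = rank_features(features, groups)
top_n = ranking[:1]   # keep the N most relevant variables (N chosen empirically)
```

The ranking only orders the variables; as noted above, the number N of variables to keep still has to be chosen by other means.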
Demand forecasting
In the proposed approach, demand forecasting is performed by an evolving fuzzy system which performs the forecast one step ahead, that is, for the next time interval. Then, the forecast is obtained individually for each Pick-Up and/or Drop-Off zone using the set of variables selected in the previous phase as input.
Therefore, if the most relevant variables for a given dataset are $hdi_{t-1}$, $hdi_{t-2}$, $dhd_{tr-1}$, $dhd_{tr-5}$, $wk$ and $td$, the model for this dataset will be defined by:
Figure 2 summarizes the demand classification approach, illustrating its main steps. The steps of creating the dataset and demand forecasting are, mutatis mutandis, as described in Section 4.1 and therefore will not be described again. Section 4.2.1 describes the definition of classes. In this work, taxi demand classification is performed by transforming the demand forecasting task into a classification problem, as suggested in [59]. Thus, the classification is performed using the outputs of the forecast step. The demand classification step is detailed in Section 4.2.2. Finally, Section 4.2.3 illustrates the process of heatmaps construction.

Steps for classification and heatmaps generation.
Initially, the number of classes is defined. We chose to work with 4 classes of demand, namely: (i) Very Low; (ii) Low; (iii) Medium; (iv) High. Next, it is necessary to specify the domain of each class, i.e., the range of values that determines its lower and upper limits. It is desirable to define these limits so that the number of rides is balanced across classes, which avoids the problems caused by unbalanced classes [60]. Finally, class boundaries are identified based on the demands of all zones. In this work, the boundaries of all classes are always multiples of 5. The limits can be identified with a histogram of the demands of all zones in a rides dataset.
Demand classification
As described in the previous section, classification is performed by transforming a forecasting task into a classification task. Thus, the outputs obtained in the forecasting step (see Section 4.1.4) are used in the demand classification. The values predicted by the evolving model are interpreted and converted into one of the classes. The conversion is defined as follows:
where LB and UP are the lower and upper bounds of the class, respectively.
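A minimal sketch of this conversion is given below. It assumes that a forecast belongs to a class when LB ≤ forecast < UP, that negative forecasts are treated as zero demand, and that the classes are encoded by the indexes 0 to 3; the cut points used in the example are hypothetical values, not the ones adopted in the experiments.

```python
from bisect import bisect_right

CLASS_NAMES = ["Very Low", "Low", "Medium", "High"]


def to_class(forecast, cuts):
    """Map a forecast demand to a class index, assuming LB <= forecast < UP.
    `cuts` holds the increasing boundaries shared by neighboring classes."""
    return bisect_right(cuts, max(forecast, 0.0))


# Hypothetical boundaries (multiples of 5) for illustration only.
example_cuts = [25, 50, 75]
example_labels = [CLASS_NAMES[to_class(v, example_cuts)] for v in (10.2, 25.0, 74.9, 120.0)]
```

With these hypothetical boundaries, the four example forecasts are labeled Very Low, Low, Medium, and High, respectively.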
Subsequently, the demands converted into classes will compose a heatmap. Heatmaps are graphical representations of taxi demands in a city or region. Through heatmaps, it is possible to visualize and interpret the demand for taxis simultaneously in all areas of a city or region in a simple, intuitive, and easy way.
Each zone is represented by colors on the map, with the highest demands identified by warmer colors and those with lower demands by cooler colors. In this work, the heatmap is generated considering the 4 classes defined in Section 4.2.1.
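The arrangement of the per-zone classes into a map can be sketched as follows for a rectangular grid of zones. The color palette and the row-major zone numbering are assumptions of this sketch; rendering the colors on an actual map is left to any plotting tool.

```python
CLASS_COLORS = ["blue", "green", "orange", "red"]  # assumed cool-to-warm palette


def heatmap_grid(zone_classes, n_cols):
    """Arrange per-zone class indexes (zone 1 first, row-major order assumed)
    into rows of color names, from the coolest (Very Low) to the warmest (High)."""
    colors = [CLASS_COLORS[c] for c in zone_classes]
    return [colors[i:i + n_cols] for i in range(0, len(colors), n_cols)]
```

For irregular zones, such as administrative districts, the same class-to-color mapping applies, but each color would be assigned to the polygon of the zone rather than to a grid cell.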
Experiments and results
This section presents the computational experiments to evaluate and compare the performance of the proposed approach in Pick-Up and Drop-Off demand forecasting and classification tasks. Section 5.1 illustrates demand forecasting and classification for Chengdu (China). The forecasting and demand classification experiments for New York City (United States of America) are shown in Section 5.2. The datasets used in experiments are available at the URL: https://shorturl.at/nPX89. The experiments were conducted using the following methodology:
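The classification metrics are Accuracy and F1-Score:

$$\mathrm{Accuracy} = \frac{cc}{S}$$

and

$$\mathrm{F1\text{-}Score} = 2 \cdot \frac{\mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$

where cc is the number of samples correctly classified, and S is the total number of samples. The Precision is defined by (33) and the Recall by (34):

$$\mathrm{Precision} = \frac{TP}{TP + FP} \qquad (33)$$

$$\mathrm{Recall} = \frac{TP}{TP + FN} \qquad (34)$$

in which TP is True Positive, FP is False Positive, and FN is False Negative. In addition to Accuracy and F1-Score, the performance of the best classifier is represented by a Confusion Matrix and a Heatmap.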
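The classification metrics can be computed as sketched below. Precision and Recall follow the TP, FP, and FN counts of one class, and F1-Score is taken as their harmonic mean. How the per-class scores are aggregated over the 4 classes is not stated here, so the macro average in `macro_f1` is an assumption, as are the function names.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples correctly classified (cc / S)."""
    cc = sum(t == p for t, p in zip(y_true, y_pred))
    return cc / len(y_true)


def precision_recall_f1(y_true, y_pred, label):
    """Precision, Recall, and F1-Score of one class, treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1-Scores (assumed aggregation)."""
    return sum(precision_recall_f1(y_true, y_pred, c)[2] for c in labels) / len(labels)
```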
This section describes the computational experiments performed on the dataset of rides completed in Chengdu, China. The database is provided by Didi Chuxing. The available files contain rides from 11/01/2016 to 11/30/2016.
The study site is a region of Chengdu covering approximately 65 km². The region’s boundaries range from 104.043 to 104.129 degrees East in longitude and from 30.653 to 30.726 degrees North in latitude. The area was partitioned into 25 rectangular zones of approximately 1.6 km on each side, as illustrated in Figure 3. The first 80% of the dataset (11/01/2016 to 11/23/2016) was used for feature selection by Kruskal-Wallis and for the definition of the value ranges of the classes. The remaining 20% (11/24/2016 to 11/30/2016) was reserved for performance evaluation and comparison. The datasets were normalized to the interval [-1, 1]. To allow a direct comparison of results, these settings are the same as those used by Zhang et al. [8].

Division of 25 zones in Chengdu.
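The assignment of a ride to one of the 25 zones can be sketched from the boundaries given above. The sketch assumes a 5 x 5 grid with equal spans in degrees and a numbering that runs from west to east and then from south to north; the actual numbering used in Figure 3 may differ, and the function name `zone_of` is hypothetical.

```python
LON_MIN, LON_MAX = 104.043, 104.129  # degrees East
LAT_MIN, LAT_MAX = 30.653, 30.726    # degrees North
GRID = 5                             # 5 x 5 = 25 zones


def zone_of(lon, lat):
    """Return a zone number in 1..25, or None outside the study area.
    Numbering (west to east, then south to north) is an assumption."""
    if not (LON_MIN <= lon <= LON_MAX and LAT_MIN <= lat <= LAT_MAX):
        return None
    col = min(int((lon - LON_MIN) / (LON_MAX - LON_MIN) * GRID), GRID - 1)
    row = min(int((lat - LAT_MIN) / (LAT_MAX - LAT_MIN) * GRID), GRID - 1)
    return row * GRID + col + 1
```

Counting the rides whose pick-up or drop-off coordinates fall in each zone within each time interval then gives the demand series of that zone.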
Zhang et al. [8] evaluated demand forecasting performance by comparing 11 models with offline learning: SVM, XGBoost, STL-LSTM, STL-ConvLSTM, STL-TCNN, MTL-LSTM, MTL-ConvLSTM, FLC-Net, FTCNN-Net, MTL-TCNN, and MTL-TCNN (ST-DTW). The best results were obtained by MTL-TCNN (ST-DTW), which is used as the benchmark in this work.
Table 3 reports the performance of the models in forecasting Pick-Up and Drop-Off demand at 15-minute intervals. The table presents the RMSE, NDEI, and R² of the best results obtained by each model. It suggests that the best performance was achieved by ALMMo (with 5 input variables), followed by Zhang et al. [8], eFCE (with 5 input variables), eMG (with 5 input variables), eNFN (with 10 input variables), and eFMI (with 5 input variables). Figure 4 presents the forecast of ALMMo (with 5 input variables) at 15-minute intervals in zone 15, in which its fast convergence can be seen.
Performance on demand forecast at 15-minute intervals for Chengdu
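The three forecasting metrics can be computed as sketched below, assuming their usual definitions: RMSE as the root of the mean squared error, NDEI as the RMSE divided by the standard deviation of the observed demand, and R² as one minus the ratio between the residual and the total sums of squares. The use of the population standard deviation in NDEI is an assumption of this sketch.

```python
import math


def rmse(y, y_hat):
    """Root mean squared error between observed and forecast demand."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))


def ndei(y, y_hat):
    """Non-dimensional error index: RMSE over the standard deviation of y."""
    mean = sum(y) / len(y)
    std = math.sqrt(sum((a - mean) ** 2 for a in y) / len(y))
    return rmse(y, y_hat) / std


def r2(y, y_hat):
    """Coefficient of determination."""
    mean = sum(y) / len(y)
    sse = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    sst = sum((a - mean) ** 2 for a in y)
    return 1.0 - sse / sst
```

Under these definitions NDEI² + R² = 1, so a lower NDEI always corresponds to a higher R² on the same series.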

ALMMo forecasts (with 5 input variables) at 15-minute intervals, for zone 15, from 11/24/2016 to 11/30/2016.
Table 4 presents the best results of each model in predicting Pick-Up and Drop-Off for the 30-minute intervals. In this experiment, the best performance was achieved by ALMMo, followed by eFMI, both with 5 input variables. The eMG (with 5 input variables) obtained the third-best performance for Pick-Up and the eFCE (with 5 input variables) for Drop-Off. The model with the worst performance was eNFN (with 10 input variables). Figure 5 illustrates the forecast of Pick-Up and Drop-Off of ALMMo (with 5 input variables) at 30-minute intervals for zone 15. As in Figure 4, the fast convergence of ALMMo can be seen in Figure 5.
Performance on demand forecast at 30-minute intervals for Chengdu.

ALMMo forecasts (with 5 input variables) at 30-minute intervals, for zone 15, from 11/24/2016 to 11/30/2016.
Table 5 shows, for Pick-Up and Drop-Off, the range of values of each class for the 15 and 30-minute intervals, in which D represents the demand. In addition to the range of values, the table highlights (in parentheses) the number of samples in each class.
Table 6 reports the performance of the models in classifying demand by value range at 15-minute intervals for Chengdu, in terms of Accuracy and F1-Score with their respective Standard Deviations. The best performance for Pick-Up and Drop-Off was achieved by ALMMo (with 5 input variables), followed by eFCE (with 5 input variables). The eMG achieved the third-best performance for Pick-Up and the eFMI for Drop-Off, both with 5 input variables. The eNFN (with 10 input variables) obtained the worst performance.
Definition of demands in value ranges for Chengdu
Performance on demand classification at 15-minute interval for Chengdu
Figure 6 shows the confusion matrices for Pick-Up and Drop-Off at 15-minute intervals. ALMMo (with 5 input variables) achieved its best performance in the Very Low and High classes, with 89% and 91%, respectively. In the intermediate classes, the hit rate was equal to or above 74%, with the highest hit rate obtained in the Medium class for Pick-Ups, with 82%. In addition, note that incorrect classifications, when they occur, fall only in adjacent classes.

Confusion Matrices at 15-minute intervals for ALMMo.
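A confusion matrix of this kind, together with a check that the errors fall only in adjacent classes, can be sketched as follows. The sketch assumes that the classes are encoded as the indexes 0 (Very Low) to 3 (High), that rows hold the desired classes, and that the percentages are computed per row; these conventions and the function names are assumptions.

```python
def confusion_matrix(y_true, y_pred, n_classes=4):
    """Count matrix with desired classes in rows and predicted classes in columns."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def row_percentages(matrix):
    """Express each row as percentages of the samples of that desired class."""
    return [[100.0 * v / sum(row) if sum(row) else 0.0 for v in row] for row in matrix]


def errors_only_adjacent(matrix):
    """True when every misclassification is off by exactly one class."""
    return all(
        v == 0
        for i, row in enumerate(matrix)
        for j, v in enumerate(row)
        if abs(i - j) > 1
    )
```

With this layout, the diagonal of `row_percentages` gives the hit rate of each class.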
Figure 7 illustrates the heatmaps for the 25 zones of Chengdu. Figure 7 (A) shows the map generated based on the ALMMo classification for Pick-Up, and Figure 7 (B) shows the desired map. Figure 7 (C) depicts the map for Drop-Off obtained based on the ALMMo results, and Figure 7 (D) the desired one. As can be seen, there are few visual differences between the predicted and desired maps, which indicates the good performance of this evolving model. The black highlights mark the zones where the predicted class does not match the desired class. These highlights show that, even where the model did not classify correctly, the predicted class is off by only one adjacent class.

Heatmaps in the 25 zones in Chengdu - day 11/30/2016 in the range from 07:00 to 07:15: (A) map obtained by ALMMo {N = 5} for Pick-Up; (B) desired map for Pick-Up; (C) map obtained by ALMMo {N = 5} for Drop-Off; and (D) desired map for Drop-Off.
Table 7 shows the Accuracy and F1-Score, as well as their Standard Deviations, for Chengdu at 30-minute intervals. The best performance for Pick-Up and Drop-Off was obtained by ALMMo (with 5 input variables), followed by eFMI (with 5 input variables). The next best performance was achieved by eMG (with 5 input variables) for Pick-Up and by eFCE (with 5 input variables) for Drop-Off.
Performance on demand classification at 30-minute interval for Chengdu
Figure 8 illustrates the confusion matrices for Pick-Up and Drop-Off at 30-minute intervals. Again, the highest hit rates obtained by ALMMo (with 5 input variables) were in the Very Low and High classes, ranging from a minimum of 88% to a maximum of 94%. The hit rate is equal to or above 78% in the intermediate classes, with the highest value, 86%, achieved in the Medium class. As with the 15-minute matrices, the errors occur only in adjacent classes.

Confusion Matrices at 30-minute intervals for ALMMo.
The heatmaps generated for the 25 zones of Chengdu, considering the interval from 07:00 to 07:30 a.m. on 11/30/2016, are presented in Figure 9. The map composed from the ALMMo results for Pick-Up is shown in Figure 9 (A) and the desired one in Figure 9 (B). The map for Drop-Off generated based on the ALMMo forecasting is illustrated in Figure 9 (C), and the desired map for Drop-Off is in Figure 9 (D). As in the maps in Figure 7, it can be seen that the predicted and desired maps are similar. Once more, the black highlights show examples of zones where there is a divergence between the predicted and the desired classification. The difference between the classes is, again, just one adjacent class.

Heatmaps in the 25 zones in Chengdu - day 11/30/2016 in the range from 07:00 to 07:30 a.m.: (A) map obtained by ALMMo {N = 5} for Pick-Up; (B) desired map for Pick-Up; (C) map obtained by ALMMo {N = 5} for Drop-Off; and (D) desired map for Drop-Off.
This section presents the computational experiments on the rides dataset from New York City in the United States of America. The database comes from the NYC Taxi and Limousine Commission and has daily records of the trips of three types of taxis: yellow, green, and FHV (For-Hire Vehicle). In this dataset, the city of New York is divided into 263 zones (microregions) pre-defined by the city government. The experiments were carried out in a subset of 63 zones of Manhattan, New York [8], as illustrated in Figure 10, and cover the period from 07/01/2017 to 06/30/2020. The first 20% of the samples (07/01/2017 to 12/31/2017) were reserved for feature selection by Kruskal-Wallis and the definition of the value ranges of the classes. The remaining 80% (01/01/2018 to 06/30/2020) were used to evaluate the models. This period also includes the COVID-19 pandemic, which negatively affected the demand for private ride services. Datasets were normalized by the interval [1].

Division of 63 zones in New York.
Table 8 shows the best performance of the models in New York for the 15-minute interval in Pick-Up and Drop-Off. The best result was presented by ALMMo (with 20 input variables), followed by eMG (with 10 input variables), eFCE (with 10 input variables), eNFN (with 5 input variables), and eFMI (with 26 input variables). Figure 11 illustrates ALMMo’s Pick-Up and Drop-Off forecasts (with 20 input variables), from 12/10/2019 to 06/30/2020 at Yorkville West (zone 263), at 15-minute intervals. The graph shows a drastic decrease in demand at instant 9000. This decrease follows the decree that established the lockdown during the COVID-19 pandemic. Notably, the model was able to adapt quickly to this change in the data dynamics.
Performance on demand forecast at 15-minute intervals for New York

ALMMo forecasts (with 20 input variables) at 15-minute intervals for Yorkville West (zone 263) from 12/10/2019 to 06/30/2020.
Table 9 reports the best results of the models for New York for the 30-minute intervals in Pick-Up and Drop-Off. ALMMo (with 20 input variables) presented the best results, followed, in that order, by eMG (with 10 input variables), eFCE (with 10 input variables), eFMI (with 26 input variables), and eNFN (with 5 input variables). Figure 12 depicts the ALMMo Pick-Up and Drop-Off forecasts (with 20 input variables) from 12/10/2019 to 06/30/2020 at Yorkville West (zone 263), at 30-minute intervals. As seen in Figure 11, there is a drastic change in the trend pattern of demand. Again, the model kept its forecasts adequate to the new dynamics of the data, indicating good adaptability and accuracy in both time intervals.
Performance on demand forecast at 30-minute intervals for New York

ALMMo forecasts (with 20 input variables) with 30-minute intervals for Yorkville West (zone 263) from 12/10/2019 to 06/30/2020.
Table 10 shows the range of values of each of the 4 classes for the New York dataset, in which D represents the demand and the values in parentheses are the number of samples in each class.
Definition of demands in value ranges for New York
The classification performance for Pick-Up and Drop-Off obtained by each model for the 15-minute interval in New York is described in Table 11 by the Accuracy, the F1-Score, and their respective Standard Deviations. The best performance was obtained by ALMMo (with 20 input variables), followed by eMG (with 20 input variables), eFCE (with 20 input variables), eNFN (with 5 input variables), and eFMI (with 26 input variables).
Performance on demand classification at 15-minute interval for New York
Figure 13 illustrates the confusion matrices for Pick-Up and Drop-Off at 15-minute intervals. ALMMo (with 20 input variables) performed best in the Very Low class, with a hit rate of 95%, followed by the High class, with 88%. In the Low and Medium classes, ALMMo had hit rates equal to or above 78%. It is noteworthy that the incorrect classifications have a low percentage in the matrix, and the errors are found only in adjacent classes.

Confusion Matrices at 15-minute intervals for ALMMo.
Figure 14 shows the heatmaps for the 63 zones of New York on 01/01/2020 for the range from 6:00 to 6:15 p.m. Figure 14 (A) illustrates the map obtained based on the ALMMo results for Pick-Up, and Figure 14 (B) the desired map for Pick-Up. Figure 14 (C) depicts the map for Drop-Off built based on the results of ALMMo, and Figure 14 (D) the desired one for Drop-Off. The black highlights show some areas where there were differences in classification. As can be seen, the few errors in the classification remain in adjacent classes.

Heatmaps for 63 zones in New York - 01/01/2020 in the range of 18:00 to 18:15: (A) map obtained by ALMMo {N = 20} for Pick-Up; (B) desired map for Pick-Up; (C) map obtained by ALMMo {N = 20} for Drop-Off; and (D) desired map for Drop-Off.
The results of the 30-minute interval in New York are shown in Table 12. ALMMo (with 20 input variables) was the model with the best performance, followed by eFCE (with 10 input variables), eMG (with 10 input variables), eNFN (with 5 input variables), and eFMI (with 26 input variables).
Performance on demand classification at 30-minute interval for New York
Figure 15 illustrates the confusion matrices for Pick-Up and Drop-Off at 30-minute intervals. The highest hit rates of ALMMo (with 20 input variables) occurred in the Very Low and High classes, with 96% (Pick-Up) and 95% (Drop-Off) in the Very Low class, and 91% (Pick-Up and Drop-Off) in the High class. In the intermediate classes, ALMMo obtained hit rates equal to or above 78%. The same pattern as in the other experiments can be seen here, i.e., a low error rate, with the errors occurring only in adjacent classes.

Confusion Matrices at 30-minute intervals for ALMMo.
The heatmaps generated for the 63 zones of New York, covering the range from 6:00 to 6:30 p.m. on 01/01/2020, are shown in Figure 16. The map constructed using the ALMMo results for Pick-Up is shown in Figure 16 (A) and the desired one in Figure 16 (B). The map for Drop-Off generated based on the ALMMo forecasts is illustrated in Figure 16 (C), and the desired map for Drop-Off is in Figure 16 (D). As with the previous heatmaps, the black markings exemplify zones with a divergence between classifications. Again, as in the other maps, the classification errors are between adjacent classes.

Heatmaps for 63 zones in New York - 01/01/2020 in the range of 18:00 to 18:30: (A) map obtained by ALMMo {N = 20} for Pick-Up; (B) desired map for Pick-Up; (C) map obtained by ALMMo {N = 20} for Drop-Off; and (D) desired map for Drop-Off.
This work presented a new approach to taxi demand forecasting and classification. The proposed approach considers historical data from taxi rides and meteorological information and uses the Kruskal-Wallis statistical method to identify the most relevant variables. Then, the forecast is performed by evolving fuzzy systems, and the classification is carried out using the forecast outputs.
Computational experiments were carried out to evaluate the performance of the proposed approach considering ALMMo, eMG, eFCE, eFMI, and eNFN, to predict and classify demand for Pick-Up and Drop-Off at 15 and 30-minute intervals in Chengdu and New York. The computational results of the evolving algorithms were compared with each other and with the state of the art. The algorithms of the proposed approach proved to be competitive with the state of the art, achieving superior or comparable performance. In the classification experiments, the comparison was performed only between the evolving models. Among the evolving models, ALMMo presented the best results.
The computational results suggest that the proposed approach is promising for the forecasting and classification of passenger demand since: (i) it processes data online, unlike most works on the same theme in the literature; (ii) it obtained better or similar results compared to state-of-the-art approaches; (iii) it showed consistency in forecasts over extended periods. In addition, the proposal uses heatmaps as a more intuitive way of visualizing the data, which supports more assertive decision-making.
The limitations of the research include: the lack of similar works on passenger demand involving evolving systems for comparison with the proposed approach; the lack of real-time experiments to assess the evolving characteristics of the systems; and the inaccessibility of ride databases with more recent data, because taxi companies keep this information private.
As a perspective for future work, it is suggested to apply new techniques of region division to the proposed approach. Another possible future work would incorporate methodologies that analyze and decode images to capture spatial information from the regions studied. Finally, evaluating the proposed approach’s robustness and scalability is important.
Acknowledgment
The authors acknowledge CAPES, Brazilian Ministry of Education, code 001. In addition, the authors would like to thank Didi Chuxing for the dataset. Data source: DiDi Chuxing GAIA Open Dataset Initiative.
Footnotes
Places reserved to taxi requests.
The time window of the historical data has been set to 8 past intervals as in [9, ].
The historical number chosen was 8, as it also includes all the days of the previous week, allowing the models to learn the specific trends of each day.
Most works identify the moving average of previous days as the historical average. Therefore, this work will follow the same nomenclature.
The time division was based on results obtained from empirical observations of local traffic in the regions, provided by municipal governments.
