Abstract
In this paper, based on the fusion of Lidar and Radar measurement data, a real-time road-Object Detection and Tracking (LR_ODT) method for autonomous driving is proposed. The lidar and radar devices are installed on the ego car, and a customized Unscented Kalman Filter (UKF) is used for their data fusion. Lidars are accurate in determining objects’ positions but significantly less accurate on measuring their velocities. However, Radars are more accurate on measuring objects velocities but less accurate on determining their positions as they have a lower spatial resolution. Therefore, the merits of both sensors are combined using the proposed fusion approach to provide both pose and velocity data for objects moving in roads precisely. The Grid-Based Density-Based Spatial Clustering of Applications with Noise (GB-DBSCAN) clustering algorithm is used to detect objects and estimate their centroids from the lidar and radar raw data. Then, the estimation of the object’s velocity as well as determining its corresponding geometrical shape is performed by the RANdom SAmple Consensus (RANSAC) algorithm. The proposed technique is implemented using the high-performance language C
Introduction
Improving safety, lowering road accidents, boosting energy efficiency, enhancing comfort, and enriching driving-experience are the most important driving forces behind equipping present-day cars with Advanced Driving Assistance Systems (ADAS) [1, 2]. Many ADAS functions represent incremental steps toward a hypothetical future of safe fully autonomous cars [3, 4, 5, 6, 7, 8, 9, 10, 11, 12].
A critical component of the various ADAS features that are also highly required in autonomous cars is the recognition and accurate assessment of the surroundings [5, 6, 7, 8]. This component depends on data observed from sensors mounted on the ego car [9]. If there is an object close by, it is of interest to know where that object is, what the object’s velocity is, and if the object can be described by a plain geometric shape [12]. Lidar and radar are ones of the sought-after sensors for exploiting in ADAS and autonomous-car features [13]. A lidar always returns many concentrated detection points (point-cloud) that describe each detected object [14, 15]. Likewise, a radar often returns multiple detections per target but not as dense as a lidar [16]. This means that it is necessary to group detections originating from the same target, i.e. to cluster the detections, to obtain information about the surroundings [16, 17].
The ego car equipped with a lidar and radar receives a collection of raw data of sensors measurements that include information of detected road objects. Then, the proposed LR_ODT method employs a two-step approach to find and identify these road objects within the received data. The first step is to coarse cluster the lidar/radar raw data separately to detect objects within using the Grid-Based Density-Based Spatial Clustering of Applications with Noise (GB-DBSCAN) algorithm [18]. Accordingly, each object is then represented with its core point (centroid). The second step is the estimation of the object’s cluster velocity as well as determining its corresponding geometrical shape, which is performed using the RANdom SAmple Consensus (RANSAC) iterative algorithm [19].
As mentioned before, it is mandatory to have continuous, precise, and accurate velocity and position information about the objects surrounding the ego car. In this paper, this is accomplished by combining data from lidar and radar sensors.
Recent lidars have a large range (up to 200 m) and a wide field of view, and can thus track objects even at big distances (necessary at high speeds) as well as in curves (i.e. very accurate in position measurement) [13]. Their main drawback is that they completely lack dynamic information about the detected objects (velocity measurement). Radar sensors, on the other hand, have a relatively narrow field of view and reduced angular resolution (less accurate in position measurement), but use the Doppler effect to directly provide velocity information. The fusion of the data from both sensors can thus benefit from the combination of their merits [20].
Accordingly, sensor fusion of lidar and radar that combines the strengths of both sensor types is a logical step. This step has been investigated earlier in the literature, with promising prospects in the automotive industry [21].
As an early endeavor, Gohring et al. [20] apply lidar-radar fusion for the application of car following on highways based on the Kalman Filter (KF) [22]. To test the performance of their fusion technique, the authors have formed the ground truth by computing the mean square errors of relative distances and velocities in a highway-tracking scenario using a least-squares polynomial approximation of sensors data.
Moreover, to improve the perceived model of the environment, Chavez-Garcia et al. [23] include the objects’ classification from multiple sensors (lidar, radar, and camera) detections as a key component of the object’s representation and the perception process that is based on a framework derived from evidence theory. The fusion approach is tested using real data from different driving scenarios and focusing on four objects of interest: pedestrian, bike, car, and truck.
For tracking multiple objects, Rangesh et al. [24] propose a modular framework capable of accepting object proposals from different sensor modalities (cameras and lidars) and fuse them. The approach is tested on real-world highway data, showing its effectiveness to track objects through entire maneuvers around the ego-vehicle.
For obstacle detection, Hajri et al. [25] employ the global nearest neighbor standard filter (GNN) on the fused lidar/radar sensors data for associating new measurements with the underlying observed objects. The benefits of data fusion comparing with the use of a single sensor are illustrated through several tracking scenarios (on a highway and a bend).
The emphasis of this paper is on the data fusion between lidar and radar for road-objects’ tracking. The proposed technique is the acronym: a Lidar/Radar-based road-Object Detection and Tracking technique (LR_ODT). According to the presented state of the art, two different general fusion methods can be distinguished and both were applied for lidar and radar: Kalman filter and evidence theory. However, approaches that apply any of these methods agree on the main steps. Meanwhile, the technique that has been adopted in this work is based on using Kalman filters. Despite the previously mentioned works [26], the literature still clearly lacks the investigation of employing the UKF [27, 28] for lidar/radar fusion while applied for road-object tracking for autonomous driving. Therefore, this paper will mainly focus on the tailoring and implementation of UKF for tracking various road objects. The UKF design will be validated by tracking three road-objects: car, bicycle, and pedestrian. A quantitative comparison between the performance of the UKF versus that of the EKF [29, 30], as well as a quantitative comparison between the performance of the LR_ODT with and without the employment of lidar/radar fusion, are carried out.
The contribution of this paper can be enumerated as follows:
Tailoring the UKF as well as the EKF algorithms to fuse multiple radars and lidars data to achieve more accurate pose data for moving objects around the ego car, proving that the UKF-based method has better performance but the EKF-based method is less computationally demanding. Employing a high-order-generic-object-motion model (5 state variables that suits the most common road-objects: car, bicycle, and pedestrian) in the development of the UKF and EKF to generate more accurate estimates and improve the overall performance. Carrying out a quantitative comparative study between the sensor fusion performance using EKF and UKF using the same use cases. Evaluating the gain of fusion by testing the UKF on three different cases (lidar Evaluating the real-time performance of both the EKF and UKF on a moderate computational platform. Employing the GB-DBSCAN clustering algorithm to detect potential objects from the lidar/radar raw data, and finding their centroids. Employing the RANSAC algorithm and object proposals for bicycle, car, and pedestrian for estimation of the detected object’s cluster velocity as well as determining its corresponding geometrical shape.
The Kalman filter [22] is a system of equations working together to form a predictor-update cyclic optimal estimator that minimizes the estimated error covariance. The KF estimates the state
where
The KF estimation process works in two steps:
The predication step: estimates the next state as follows:
The measurement update step: works as follows
where
However, the above equations are only limited to linear processes, and accordingly, it is not suitable to the radar measurement process which is inherently nonlinear. Therefore, the extended Kalman filter [29] is introduced. The EKF estimation process is represented by Eq. (2) instead of Eq. (2) as follows:
where
The derivatives of
Accordingly, the EKF process is represented as well by the prediction and update Eqs (1) and (2) respectively after replacing
Due to the EKF is using only the first-order derivative in the linearization process and ignoring the higher-order terms, errors are accumulated in the state and covariance estimation. The unscented Kalman filter is introduced [27] to overcome this limitation. The UKF is a derivative-free alternative to EKF that uses a deterministic sampling approach. The UKF utilizes as well the predict-update two-step process, however, they are now augmented with further steps like generation and prediction of sigma points as shown in Fig. 1.
UKF roadmap.
In the UKF process, the state Gaussian distribution is represented using a minimal set of carefully chosen sample points, called sigma points.
where
In the sigma-point prediction step, each generated sigma point is inserted in the UKF nonlinear process model given in Eq. (8) to produce the matrix of the predicted (estimated) sigma points, which has an
In the next step, the predicted state-mean and covariance matrices are calculated from the predicted sigma points as given in Eq. (9):
where
In the measurement prediction step, each generated sigma point is inserted in the UKF nonlinear measurement model given in Eq. (11) to produce the matrix of the predicted measurement sigma points, which has an
In the next step, the predicted measurement-mean-and-covariance matrices are calculated from the predicted sigma points as well as the measurement noise covariance matrix
where
The final step is the UKF state update, where the UKF gain matrix (
The state of the moving object [31] is determined by the five variables grouped into the state vector
The nonlinear
where
An object motion model.
If the
Lidar and radar data fusion using EKF.
Figure 3 presents the lidar and radar data fusion technique employing the EKF. The received sensor raw data (either lidar or radar) is getting processed before being supplied to the EKF.
The processing is performed using clustering and association algorithms. DBSCAN [32] is an unsupervised clustering algorithm that groups together data points if the density of the points is high enough. It requires two parameters to determine the density. The first parameter is
While GB-DBSCAN is used for coarse clustering, the RANSAC [19] is used to fine-tune the clustering and associate geometrical shape proposals to potential coarse clusters. The output of the RANSAC is a fine-tuned object centroid (
The processed lidar measurement vector includes the moving object centroid position (
The EKF state nonlinear model is the moving object model that is described in detail in Section III and mathematically represented by Eqs (15)–(3). The Jacobian of this state model (
Equation (4) presents the lidar measurement model (
where
Lidar and radar data fusion using UKF.
Additionally, Eq. (25) shows the derived Jacobian matrix (
where
As per the above presentation, each sensor has its own prediction update scheme, however, both sensors share the same state prediction scheme. The belief about the object’s position and velocity is updated asynchronously each time the measurement is received regardless of the source sensor.
Both distinct update schemes are getting updated by any received measurement data (either from lidar or from radar). The state vector (
Figure 4 presents the lidar and radar data fusion technique employing the UKF. After computing the elapsed time between consecutive sensor reading (
Then the fusion technique thus branches into two directions based on the source of the last sensor data measurement. If the source is a radar, and employing the nonlinear radar measurement model (Eq. (20)), the predicted measurement sigma points (
If the measurement data source is a lidar, and employing the linear lidar measurement model (
Implementation of Kalman filters
Both Kalman filters (EKF and UKF) are implemented using the high-performance language GCC C
In the following sections, several important matters that have a crucial effect on the Kalman filters design process will be highlighted and discussed.
Noise parameters setting
The object motion model described by Eqs (14)–(3) includes several noise parameters that need to be carefully set. To set the two process noise parameters as an example: the longitudinal acceleration noise standard deviation
The Kalman filters noise parameters
The Kalman filters noise parameters
The KF design is considered consistent if the estimation error (
The radar’s measurement is a three-dimensional vector (three degrees of freedom), therefore, for a consistent EKF or UKF design the values of
In both sensors, the
The proper initialization of the Kalman filter is very crucial to its subsequent performance [38]. The main initialized variables are the estimate state vector (
The first two terms of the state vector
The state covariance matrix is initialized as a diagonal matrix that contains the covariance of each variable estimate (Eq. (29)). The initialization logic works as follows: little or almost no correlation among the state variables (independent variables) is assumed, therefore, the off-diagonal terms (covariances between variables) are initialized to zeros. Each diagonal term represents the variance (confidence) of each state element estimate as shown in Eq. (29). The variance of each element is initialized depends on the a priori information about this element. Since the first two elements of the state vector (
Initialization of Kalman filters states
To check the performance of the KF, in terms of how far the estimated results from the true results (ground truth). There are many evaluation metrics [39], but perhaps the most common one is the Root Mean Squared Error (RMSE) given in Eq. (30). The metric is calculated over a moving window of measurements of length
Performance evaluation of Kalman filters
Test track for a moving bicycle.
Extensive trials-and-errors attempts are used to tune the many hyper-parameters of the LR_ODT. However, to be more consistent and accurate, numerical Key Performance Indicators (KPIs) are constructed and coded as in Eqs (27) and (30) to evaluate the performance of the fusion technique under the given set of hyper-parameters.
Several test tracks have been used to evaluate the performance of the LR_ODT under different sets of hyper-parameters in an iterative tuning process. Examples of these test tracks are shown in Figs 5–7. These tracks are representing three different moving objects, bicycle, car, and pedestrian respectively, to emulate various motion profiles and velocities.
Consistency evaluation of Kalman filters
Consistency evaluation of Kalman filters
Test track for a moving car.
Test track for a walking pedestrian.
Table 3 presents the testing results of the LR_ODT fusion algorithm that uses the UKF and compares it with that uses the EKF. The performance evaluation is carried out on the three test tracks. The RMSE KPI (Eq. (30)) is used to compare both UKF and EKF performances on the five state variables:
Sensor fusion evaluation of the UKF (Bicycle Track)
Performance of the UKF position estimation on the bicycle track.
UKF-velocity-estimation performance on the bicycle track.
The UKF design-consistency indicator (NIS) results are reported in Table 3. The table reports values for the NIS by taking into consideration the estimated measurements by lidar alone, the estimated measurements by radar alone, and after combining both lidar and radar measurements. The values reported in the 5
Sensor fusion execution time of EKF and UKF
LR_ODT execution time for single object
EKF-velocity-estimation performance on the bicycle track.
To assess the significance of the fusion between lidar and radar in tracking. The UKF is tested in one time with measurements from lidar alone, and another time with measurements from radar alone. The results reported in Table 5 show how fusion makes the difference and substantially improves accuracy. The estimation of all state variables is spectacularly improved. For example, the RMSE of
Comparing UKF performance with that of the EKF, Table 3 details the results. It is obvious that UKF outperforms EKF at all velocity and motion profiles. The accuracy of all the estimated states is sustainably higher using the UKF. States that got affected more with non-linearities in the object model (e.g.
UKF-yaw-angle estimation performance on the bicycle track.
UKF-yaw-rate estimation performance on the bicycle track.
Lidar NIS values for UKF on the bicycle track.
Radar NIS values for UKF on the bicycle track.
The LR_ODT proved to be well enough fast in execution to be used in real-time. Using an Intel Core i5 with 1.6 GHz and 8 GB RAM which is a very moderate computational platform, the following measurements (Table 6) are collected for both EKF and UKF.
Table 6 shows how the EKF is twice faster than UKF. By considering that the lidar/radar measurements are collected at approximately 30 fps rate. Then the measurement cycle is 33.3 msec which is large enough to be utilized for tracking 25 objects using EKF or 13 objects using UKF according to the data in Table 7.
In this paper, a real-time road-object detection and tracking method (LR_ODT) for autonomous cars is proposed, implemented and described in detail. The method uses a tailored unscented Kalman filter to perform data fusion for the mounted lidar and radar devices on the ego car. The raw data of the lidar/radar are getting clustered using both GB-DBSCAN and RANSAC algorithms to produce the raw object’s pose and to determine its geometrical shape. The LR_ODT method is fully implemented using GCC C
The validation results show that the proposed method is reliably able to detect and track three types of street objects: bicycle, car, and pedestrians on three different tracks and speed profiles. The employed generic object motion model is comprehensive and is described using five state variables. The UKF has outperformed the EKF on all test cases and all the state variable levels (
Comparing the validation results of the UKF applied to a single sensor to the one employing the fusion of multiple sensors, show how outstanding is the improvement in tracking performance using the later (
The measured throughput (execution time) using an affordable CPU proved that the LR_ODT method is very suitable for real-time multi-object detection and tracking.
In the future, it is intended to add a front-camera to the presented fusion technique and further investigate the benefits it will add to the overall tracking performance. Furthermore, the LR_ODT will be augmented with other road objects like guardrails, trucks, animals, etc.
