Abstract
Technological development is bringing the era of widely available self-driving cars ever closer. However, there will presumably be a period when human drivers and self-driving cars share the same roads. In the current paper, we propose a cognitive warning system that uses information collected from the behaviour of the human driver and sends warning signals to self-driving cars in case of a human-related emergency. We demonstrate that such risk detection can identify danger earlier than an external sensor that relies on the behaviour of the human-driven vehicle. We used data from a simulator experiment in which 21 participants slalomed between road bumps in a virtual reality environment. Occasionally, they had to react to dangerous roadside stimuli with large steering movements. We used a one-class SVM to detect emergency behaviour in both steering and vehicle trajectory data. Emergencies were detected earlier from steering wheel data than from vehicle trajectory data. We conclude that by tracking cognitive variables of the human driver we can exploit the outstanding power of the brain to evaluate external stimuli. Information about the result of this evaluation (be it a steering action or a saccade) could be the basis of a warning signal that is readily understood by the computer of a self-driving car.
Introduction
Since 2009, when Google started testing cars driven by Google Chauffeur, these cars have driven over 1.5 million miles with only 22 documented minor accidents [1]. Interestingly, human error was found to underlie all but one of these [2]. This indicates that, although self-driving cars are a safer mode of transportation [3], hybrid traffic of human-driven and self-driving cars is still prone to human faults. Human drivers are subject to biological limitations (e.g. drowsiness) and tend to multitask in the car, and thus respond suboptimally in emergency situations [4]. Several in-car warning system designs have been implemented in order to reduce the risk of fatal outcomes [5]. In the present paper, we propose that these warning systems should not only raise the driver’s attention, but could also be used to inform other traffic participants, namely self-driving cars.
The widespread availability of passenger cars in the middle of the 20th century drew attention to traffic safety [9]. Since then, several different kinds of accident risk evaluation systems have been proposed. Among these we can distinguish three main types based on the source of data they use for estimation: (1) traffic data-based, (2) car position-based, and (3) driver behaviour-based approaches. Traffic data-based approaches typically rely on traffic surveillance data and use them to evaluate the risk of accident depending on time slot, traffic frequency and area (highway, intersection) [10, 11, 12, 13, 14]. Not entirely different from these systems [15, 16, 17] are those that work on a single-car basis and use the sensors of the master vehicle to predict the risks posed by its peers. Current self-driving concept cars rely mostly on this technology [18]. The third type of risk evaluation system collects information from the driver. Driver behaviour-based models use gaze [19, 20], facial coding [19, 21], EEG [22, 23], and motion trajectories [24, 25, 26] recorded with various sensors. These solutions give very good real-time estimates that can be used to warn the driver of the risk of falling asleep [24, 27] or of driving through a red light [26], or to suggest an optimal lane-changing trajectory [28]. Here, we propose that these warnings could help the hybrid traffic of human-driven and self-driving cars in the future. This way they work more as a communication channel between two agents and not as a one-way sensor, hence the term cognitive in the title.
While a human driver may not be able to evaluate a warning message from a lead car within a few milliseconds, this is not a problem for the processor of a self-driving car. Automated vehicles constantly monitor their surroundings with several sensors to provide the safest transportation possible [29]. Nonetheless, information collected inside the car’s cockpit may precede the externally detectable risk by tens or, sometimes, hundreds of milliseconds. This holds even for the steering wheel, where there is a delay of a few milliseconds between the steering action and the chassis response [30]. Thus, these warnings may be extremely helpful for self-driving cars.
The proposed solution could be a good example of how biological and artificial cognitive agents could co-evolve [31, 32], resulting in a safer traffic infrastructure. The current proposal is not the first to promote the consideration of cognitive factors in traffic safety [9, 33, 34], or increased communication between traffic participants [35, 36]. However, it is unique in its emphasis on human-to-machine information flow. Ongoing research [17, 29, 36, 37] focuses on the design of optimal wireless communication between vehicles (vehicle to vehicle, V2V) and between vehicles and road-side units (vehicle to infrastructure, V2I). These communication links efficiently support the drivers’ situational awareness. Although situational awareness often refers to human situational awareness [38], it is also relevant in human-machine (or possibly even in machine-machine) situations as a general concept of information availability and use in an interaction [39]. To demonstrate that we can potentially facilitate the situational awareness of a machine, we validate our idea by predicting abrupt steering wheel turns of a human driver in a virtual reality simulator paradigm. Here, from time to time the driver had to make emergency steering movements in response to roadside stimuli [40]. In the present analysis we used the car trajectory and the steering wheel angle data to investigate how early we can detect the initiation of an emergency steering behaviour based solely on data from either sensor.
In the current proof-of-concept implementation we used a one-class support vector machine (OC-SVM). SVMs [41, 42, 43] are a family of machine learning models that use support vectors (i.e. the training points that define the separating hyperplane) in a high-dimensional space for classification and regression problems. Our choice of model was motivated by three main reasons. First, SVM solutions are fast and are often used in real-time applications [44]. Second, such a model can be extended. For example, a recent study presented a hybrid model of an OC-SVM and a deep belief network that outperformed a deep autoencoder in terms of speed on an anomaly detection task in high-dimensional data [45]. Third, an SVM can be trained even on computers with modest processing power. This last argument is important since the current ideas may later give birth to an actual product. Presumably, people who cannot afford to buy new self-driving cars would continue to use human-driven cars, and thus would be the target audience of such an instrument. The modest computational demands of SVMs facilitate the design of an efficient yet inexpensive device.
We hypothesized that abrupt steering movements can be readily detected using both steering and car trajectory data. Moreover, we predicted that emergency events are detected earlier from steering data than from trajectory data. We aimed to propose a general anomaly detection system that could potentially use multidimensional data (e.g. EEG, eye-tracking etc.). These sensors could provide even earlier detection of an emergency [46]. Therefore, the model included no prior expectation about the dangerous events and was trained only on normal driving data, hence the use of an OC-SVM.
The experimental design. (a) Participants had to slalom through road bumps on a rural road. (b) From time to time, a deer raised its head from the bushes. If the animal was facing the road, participants had to steer to the other side of the road. If the deer looked in the other direction, they did not have to do anything. The red rectangle serves illustrative purposes.
Participants
Twenty-three participants took part in the virtual reality experiment. Two of them experienced simulator sickness; therefore, their data were excluded. The training and test data were extracted from the steering and trajectory data of the remaining 21 participants (age
Experiment
The experiment took place in a cave automatic virtual environment (CAVE [47]) at the Centre de Réalité Virtuelle de la Méditerranée (CRVM), Aix-Marseille University. The CAVE consisted of three back-projected side screens of 3 by 4 meters and a fiberglass floor screen of 3 by 3 meters. Two Barco 5000 lumen projectors illuminated each screen. Participants sat in a custom-built car simulator consisting of a car seat frame and a force feedback steering wheel (Logitech G27). Sound was played through two loudspeakers placed on either side of the car frame.
We designed a driving simulator game in Unity 3D, where participants were told to drive on a rural road bounded by bushes on both sides. The road was flat and the scene did not contain other landmarks that might have distracted the driver’s attention. The experiment contained two kinds of tasks. Most of the time participants had to slalom between road bumps, which required continuous left/right steering movements. The road bumps appeared on both sides of the road to guarantee that only small steering movements were used, and a trial was only successful if the participant passed between the two road bumps (see Fig. 1). A green disk placed between the road bumps indicated the ideal passing position. Running over a road bump was signalled by a small vibration of the steering wheel. This task was sometimes interrupted by an emergency event.
The emergency event was the appearance of a deer in the bushes, either on the left or on the right side of the road. The orientation of the deer’s jaw signalled whether a response was required or not (Go-NoGo task). If the deer was facing the road, it signalled an emergency (Go signal); if it was turned away, no response was required (NoGo signal). In case of emergency, participants were instructed to steer to the other side (i.e. a large steering movement) in order to avoid a collision. If the orientation of the deer did not indicate an emergency, the participants were instructed to continue the primary task and not to react to the deer.
Procedure
The experiment started with a practice phase in which participants were familiarized with the task. We looked for signs of simulator sickness to avoid unwanted discomfort caused by performing the task for a prolonged period. The data used in the current analysis were collected from four 5-minute blocks. The participants were free to rest, stand up, walk and drink between the blocks. The total duration of the experiment was approximately one hour, including breaks.
During the experiment, emergency events appeared with a probability of 20%. The time between road bumps varied between 300 and 1700 msec (distance: 5.9 m to 34 m at 70 km/h speed). Emergency events always followed a road bump by 650 to 700 msec, and when they appeared they were the closest visual target stimuli. Emergency events were in turn followed by a road bump after 300 to 350 msec. This way the distance between the two road bumps bounding the emergency event was equal to the average distance between two road bumps. We used this configuration to prevent participants from anticipating the emergency events.
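The timing argument above can be checked with a few lines of arithmetic. The sketch below uses only the intervals quoted in the text; taking the midpoint of the 300 to 1700 msec range as the "average" interval is an assumption (it holds if the intervals are spread symmetrically over that range), and all variable names are illustrative.

```python
# Timing values quoted in the text (all in milliseconds).
bump_interval_range = (300, 1700)   # time between consecutive road bumps
bump_to_event_range = (650, 700)    # preceding road bump -> emergency event
event_to_bump_range = (300, 350)    # emergency event -> following road bump

# Assumption: the average interval is the midpoint of the quoted range.
mean_bump_interval = sum(bump_interval_range) / 2

# Interval between the two road bumps that bound an emergency event.
bounding_min = bump_to_event_range[0] + event_to_bump_range[0]
bounding_max = bump_to_event_range[1] + event_to_bump_range[1]
bounding_mid = (bounding_min + bounding_max) / 2

print(mean_bump_interval, bounding_min, bounding_max, bounding_mid)
```

Under this assumption the bounding interval (950 to 1050 msec) is centred on the same 1000 msec as the ordinary bump-to-bump interval, which is why the spacing of the bumps alone gives no cue that an emergency event is coming.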
Data preprocessing
Data preprocessing and modelling were done in Python [48] using Pandas [49] and Scikit-learn [50]; visualisation was done using Matplotlib [51] and Seaborn. Trajectory and steering angle data were logged every 50 msec with high precision, according to the internal physics of the Unity environment. Normal driving data were extracted from the trajectories by selecting data points outside the emergency events. Emergency event onsets were defined as the moment when the deer became visible.
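As a rough illustration of this selection step, the following Pandas sketch separates a driving log into points inside and outside emergency windows. The column names (`time_ms`, `steering_angle`, `lateral_pos`), the helper `split_normal_and_event`, and the window bounds in the toy call are all hypothetical; they are not taken from the original analysis code.

```python
import pandas as pd

def split_normal_and_event(log, onsets_ms, window_start_ms, window_end_ms):
    """Return (normal, event) rows of a driving log.

    A row counts as 'event' if its timestamp lies inside
    [onset + window_start_ms, onset + window_end_ms] for any onset.
    Both bounds are inclusive.
    """
    in_event = pd.Series(False, index=log.index)
    for onset in onsets_ms:
        in_event |= log["time_ms"].between(onset + window_start_ms,
                                           onset + window_end_ms)
    return log[~in_event], log[in_event]

# Toy log sampled every 50 msec, matching the logging rate in the text.
log = pd.DataFrame({
    "time_ms": list(range(0, 1000, 50)),
    "steering_angle": 0.0,
    "lateral_pos": 0.0,
})

# Hypothetical onset and window bounds, chosen only for the demonstration.
normal, event = split_normal_and_event(log, onsets_ms=[400],
                                       window_start_ms=0, window_end_ms=200)
print(len(normal), len(event))
```

Building a boolean mask first and indexing once keeps the normal and event subsets complementary by construction.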
We defined the time window of the emergency events from
where
Consequently, we had a four-dimensional vector available for every time point, which was used as the input of the risk prediction model. This way the model was able to handle short-range dependencies of the time-series data.
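One common way to give a model access to short-range temporal structure is to stack each sample with its recent history. The sketch below is a hypothetical construction only: it assumes that the four dimensions are the current sample and its three predecessors, which may differ from the features actually used in the study, and `lag_embed` is an illustrative helper name.

```python
import numpy as np

def lag_embed(signal, n_lags=4):
    """Stack each sample with its n_lags - 1 predecessors.

    Returns an array of shape (len(signal) - n_lags + 1, n_lags).
    Row i holds signal[i], ..., signal[i + n_lags - 1], so the last
    column is the most recent sample of each window.
    """
    signal = np.asarray(signal, dtype=float)
    n_rows = len(signal) - n_lags + 1
    return np.column_stack([signal[i:i + n_rows] for i in range(n_lags)])

# Hypothetical steering-angle samples, one every 50 msec.
X = lag_embed([1, 2, 3, 4, 5, 6])
print(X.shape)
```

With such an embedding, a sudden change between consecutive samples shows up within a single input vector, so a model that treats each vector independently can still react to it.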
In the following we will refer to the normal driving data as no event and to the emergency data as event. Thus data points were in theory either normal (
This means that we could have used the
where
that is subject to
here,
which yields positive values for
Detection time of emergency events from steering wheel and lateral position data. We were able to predict emergencies from steering data earlier than from lateral position because of the non-linear relation between steering angle and vehicle position. Whiskers show 95% confidence intervals for the mean.
Relationship between steering angle and vehicle position. It can be seen on the two-dimensional histogram that the position of the vehicle changes in a rather curvilinear manner relative to the steering angle (nova from the centres). The two dense centres result from the slaloming task, where the car was going either slightly left or slightly right; the smaller circular pattern around the centres also results from the slaloming task. The histogram uses the jet colormap, which goes from blue through green to red.
As a first step, we divided the whole no event data into training and validation sets by randomly assigning half of the time points to each. Because our aim was to build a model that uses both general and personalized information, we did not split the data into two pools of participants. The model produced few false alarms on the validation set: 4.86% for the steering angle data and 4.06% for the trajectory data. After this, we used the support vectors of this model to detect the earliest anomaly point in the event data. We expected a high detection rate for the emergency events, and earlier detection of anomalies in the steering wheel data than in the trajectory data.
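The procedure described above can be sketched with Scikit-learn's `OneClassSVM`. This is a minimal illustration on synthetic data, not the original analysis: the RBF kernel, `nu=0.05`, `gamma="scale"`, the Gaussian stand-in features, the synthetic ramp and the helper `earliest_detection` are all assumptions made for the example.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Synthetic stand-in for four-dimensional "no event" feature vectors.
no_event = rng.normal(0.0, 1.0, size=(2000, 4))

# Random half/half split over time points into training and validation.
order = rng.permutation(len(no_event))
train, valid = no_event[order[:1000]], no_event[order[1000:]]

# Assumed hyperparameters; the original settings are not reproduced here.
model = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale").fit(train)

# predict() returns +1 for inliers and -1 for anomalies, so the share of
# -1 labels on held-out normal data is the false alarm rate.
false_alarm_rate = float(np.mean(model.predict(valid) == -1))

def earliest_detection(model, event_window, dt_ms=50):
    """Time (msec from window start) of the first sample flagged anomalous,
    or None if no sample in the window is flagged."""
    flags = model.predict(event_window) == -1
    if not flags.any():
        return None
    return int(np.argmax(flags)) * dt_ms

# Synthetic "event": features drift steadily away from the normal region.
ramp = np.linspace(0.0, 10.0, 20)[:, None] * np.ones((1, 4))
detection_ms = earliest_detection(model, ramp)
print(false_alarm_rate, detection_ms)
```

Running `earliest_detection` separately on steering-based and trajectory-based features for the same event windows would give the two detection times whose difference is the quantity of interest here.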
Emergencies were detected 645.15 (
t-SNE embedding of no event and earliest detected emergency event data. The embedding method clearly visualizes the decision boundaries between event and no event data. Only a fraction of the 30,000 data points is displayed.
We visualized the anomaly detection thresholds based on the validation set and emergency event data points using the t-Distributed Stochastic Neighbour Embedding (t-SNE) method [55]. This method visualizes high-dimensional data by finding a low-dimensional embedding whose joint probabilities of pairwise similarities match those of the original data. The transformation was run using the Barnes-Hut approximation in order to perform the calculation in quasi-linear time. The results of the t-SNE show that the no event and emergency event data points are easily differentiable (see Fig. 4).
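For readers who wish to reproduce this kind of visualization, the sketch below runs Scikit-learn's `TSNE` with the Barnes-Hut method on synthetic data. The two Gaussian clusters, the perplexity value and the PCA initialization are assumptions made for the example and do not reflect the settings of the original analysis.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Synthetic stand-ins: a "no event" cloud and a shifted "event" cloud.
no_event = rng.normal(0.0, 1.0, size=(150, 4))
event = rng.normal(6.0, 1.0, size=(50, 4))
X = np.vstack([no_event, event])
labels = np.array([0] * len(no_event) + [1] * len(event))

# method="barnes_hut" selects the approximate O(N log N) gradient.
tsne = TSNE(n_components=2, method="barnes_hut", perplexity=30.0,
            init="pca", random_state=0)
embedding = tsne.fit_transform(X)
print(embedding.shape)
```

The rows of `embedding` can then be passed to a scatter plot and coloured by `labels`. Distances between clusters in a t-SNE plot are not metrically meaningful, so the plot is best read as a qualitative check of separability.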
In summary, we found that emergency events were readily detected both in wheel angle and in trajectory data using an OC-SVM. Steering data allowed earlier detection of emergency events than trajectory data.
In the current work we proposed an in-car risk detection and warning system that could inform automated vehicles on the road about risky actions of the human driver (e.g. abrupt steering movement, falling asleep). We illustrated the benefits of the risk detection component by predicting dangerous steering movements earlier from wheel angle data than from vehicle trajectory data, which is possible because of the non-linear relationship between steering angle and vehicle lateral position [56, 57].
We used a one-class support vector machine for learning and prediction. This type of model is common in outlier detection scenarios for various problems [45, 58, 59]. Note that by controlling the sparsity parameter of the SVM we can limit the number of support vectors used for prediction [54]; there are even solutions for finding the optimal number of support vectors for a given problem [60]. Moreover, while training an SVM (and potentially a multitude of SVMs, one for each car on the road) would be infeasible inside a master vehicle, our proposal leads to computational efficiency since training and prediction could run on the individual peer vehicles. This opens the door to highly individualized models.
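The effect of such a parameter on model size can be illustrated with Scikit-learn's `OneClassSVM`, where `nu` is a lower bound on the fraction of training points that become support vectors. Treating `nu` as the sparsity parameter meant above is an assumption, as are the synthetic data and the two values compared.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 2))  # synthetic "normal driving" features

# Fit the same model with a small and a large nu and count support vectors.
n_support = {}
for nu in (0.05, 0.5):
    model = OneClassSVM(kernel="rbf", nu=nu, gamma="scale").fit(X)
    n_support[nu] = len(model.support_vectors_)

print(n_support)
```

Fewer support vectors mean fewer kernel evaluations per prediction, which matters for a low-cost in-car device. The trade-off is that `nu` also bounds the fraction of training points treated as outliers, so it cannot be tuned for speed alone.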
We found earlier detection of risk in wheel angle data than in trajectory data. Although this is in line with expectations (i.e. because of steering backlash, vehicle inertia, tire stiffness), a limitation of the current study is that it was done in virtual reality. While reactions in virtual reality are comparable to those in the real world [61], the physics of the virtual environment are simpler than reality, not to mention the large variance of normal driver behaviour in real-world scenarios. While in our case there were only two tasks, outside the simulator the driver faces all the challenges of traffic. This necessitates further exploration under more naturalistic circumstances. Nonetheless, our choice of virtual reality was motivated by the fact that only in this way were we able to generate a large amount of clean and labelled data for training and testing without a real risk of accident. Further studies should evaluate the effectiveness of such a system with more degrees of freedom. Here participants were only able to control the steering wheel angle but not the speed of the car. In reality, steering wheel angle changes also depend on the speed of the car, and manufacturers apply speed-sensitive steering solutions in today’s cars [56].
It is worth noting that the change of the steering wheel angle is indicative of rather distant elements of the perception-action cycle. Hence, such a model would presumably yield more benefit when more proximal cognitive variables are tracked. Eye and face tracking in the cockpit could help detect drowsiness very early [21] and, in situations like the current experiment, could also help identify saccades to certain stimuli inside and outside the car [8]. Wearable sensors can monitor heart rate, and therefore can be used to inform traffic peers of a medical emergency. Moreover, given the increasing availability of consumer EEG headsets, it is promising that research shows electrophysiological patterns can be extremely helpful as well [22, 23].
Another interesting field of exploration is the study of information transmission and potentially further propagation of data in a vehicle network [17, 62, 63]. This way the risk information is not only locally useful but can change the state of the global network. For example, the network could start organizing detours even before an inevitable accident has happened. On the one hand, creating such a one-directional inter-cognitive link between an artificial and a biological cognitive system is an important step forward from the perspective of the applied field of cognitive infocommunications [31]. On the other hand, it raises important concerns regarding privacy and security. These systems would monitor the driver’s reactions, and while communication is only intended in case of risk, this is still a potential source of data breaches. Moreover, a malicious attack against the automated car is also possible by sending a large number of risk notifications. The communication link must therefore be secured. Indeed, current research on intelligent automated traffic, smart cities and situation awareness of self-driving cars is aware of these challenges [17, 35, 64, 65].
Researchers working on self-driving cars say that fully automated cars are still years or even decades away [29, 66]. Meanwhile, semi-automatic solutions are increasingly available (automatic parking, highway autopilot) [67, 68]. Thus, roads are increasingly becoming a shared niche of biological and artificial drivers. In this situation we may want artificial cognitive agents to co-evolve with our biological cognitive systems. In the present work we detailed one aspect of this endeavour, namely inter-cognitive warning systems. The core of our argument was the importance of communicating the human drivers’ cognitive and behavioural states to self-driving cars to increase road safety in the future.
Acknowledgments
The research leading to these results has received funding from the European Community’s Research Infrastructure Action – grant agreement VISIONAIR 262044-under the 7
