Abstract
Intelligent Transportation Systems (ITS) aim to reduce the risks associated with road transportation, as road accidents are becoming one of the primary causes of death in developing countries. Monitoring driver behavior is one of the key areas of ITS and supports vehicle safety systems. It has gained importance as a means of reducing traffic accidents and ensuring the safety of all road users, from drivers to pedestrians. In this work, we present a context-aware system that considers the vehicle, the driver and the environment to classify driver behavior as safe, fatigue or unsafe (the last representing any other unsafe driving behavior, such as drunk or reckless driving) using a Dynamic Bayesian Network (DBN). We have designed a questionnaire to obtain the factors that influence safe, unsafe and fatigue driving behavior. The collected data has been analyzed using the Statistical Package for Social Sciences (SPSS). Several techniques have previously been proposed for driver behavior classification or detection; they rely on specialized sensors or hardware devices, inbuilt smartphone sensors (such as the gyroscope, accelerometer, magnetometer and GPS), or complex sensor fusion algorithms. The novelty of our work lies in designing and developing a context-aware system based on an Android smartphone that considers the complete driving context (driver, vehicle and surrounding environment) and classifies driver behavior using a DBN. Driver fatigue is identified using results from the designed questionnaire and previous research studies, without the need for special hardware devices. A DBN that combines all the contextual information has been created using GeNIe Modeler. The DBN has been learned using the Expectation–Maximization (EM) algorithm.
The real-time data for DBN learning and testing has been collected on the Chandigarh-Patiala National Highway, India, using an Android smartphone. The proposed system yields an overall classification accuracy of 80–83%. The focus of this paper is to develop a cost-effective context-aware driver behavior classification system to promote ITS in developing countries.
Introduction
According to the Global Status Report on Road Safety by the World Health Organization (WHO), road accidents contribute to 74% of deaths in middle-income countries [1]. In total, 90% of the road fatalities in the world take place in low-income and middle-income countries [2].
Driving maneuvers such as driving at high speed, sudden braking, drunk driving and sharp or improper turns, along with factors such as sleep deprivation and fatigue, are among the prevailing causes of road accidents [3]. Substantial research on driver behavior detection is being conducted in developed nations such as the USA and the UK, where the transportation system is technologically advanced. However, in developing countries such as India, Bangladesh and Bhutan, such facilities are not available in the cars owned by the majority of the population. It is therefore necessary to have a system in place that helps prevent accidents to some extent.
Context-aware systems and Advanced Driver Assistance Systems (ADAS) are emerging research fields that act as the next step towards road safety [4, 5]. Researchers have integrated the study of the behavior of the driver, the vehicle and the environment, which has led to the development of context-aware applications [6–9]. The three main stages of a context-aware system are sensing, processing and reacting [10]. The information obtained in the sensing phase is often prone to failure, so context-aware applications should be supported by adequate modeling and reasoning techniques [11].
Driving is considered a complex decision-making process [12]. Real-time driver behavior detection is gaining research importance for monitoring the driver’s state [13]. Different methods, such as ADAS packages, specialized driving simulators [14], special cameras in vehicles [15] and smartphone sensors [16, 17], have been used in the literature to study the behavior of drivers in a real-time environment. Among the available methods, the use of a smartphone as a sensing device appears to be a cost-effective solution for developing countries, where the number of smartphone users is increasing every day [18]. The smartphones available in the market run different operating systems; however, smartphones running Android hold the world’s largest market share of 87.7% [19]. Hence, we focus only on Android smartphones, although the approach can be extended to smartphones running other operating systems. Inbuilt smartphone sensors such as the accelerometer, gyroscope, magnetometer and GPS can be used to collect the real-time data [20].
The temporal and static aspects of driver behavior can be acquired by the inbuilt smartphone sensors. This information contains uncertain high-level contextual information, and effective reasoning techniques such as fuzzy logic, Hidden Markov Models (HMM), Bayesian Networks and DBNs need to be applied to deal with this uncertainty [11, 22].
In this research work, data on the contextual features, which include the circadian/daily cycle (daytime/night time), the number of driving hours, the way of driving the vehicle (acceleration, deceleration and turns) and road conditions, has been collected using a questionnaire to gauge the importance of various factors from the driver’s perspective. The system collects the real-time data set using inbuilt smartphone sensors and then classifies the driver as safe, fatigue or unsafe using a DBN. No specialized devices have been installed to determine fatigue. The factors that influence driver fatigue have been obtained from the questionnaire and previous research studies.
The key contributions of this research work are: (1) a questionnaire has been designed to study the relevant factors contributing to driving behavior; (2) a smartphone-based context-aware system has been designed to classify the behavior of the driver as safe, fatigue or unsafe; (3) an experimental setup has been created for real-time data collection using inbuilt smartphone sensors (accelerometer and gyroscope), with details given in Section 5; and (4) a DBN (using GeNIe) has been designed to implement the context-aware system, which combines sensing data from the smartphone with other contextual information.
Various smartphone-based solutions have been proposed in the literature [6, 22]. However, they do not consider the complete context (i.e. the vehicle, the driver and the environment); rather, they consider only a partial context or use specialized hardware or sensors. Therefore, to our knowledge, this is the first system that considers the complete context and does not use any special device (other than a smartphone as the sensing device) to classify driver behavior as safe, fatigue or unsafe using a DBN.
The remainder of the paper is organized as follows. Section 2 describes the state-of-the-art work in the area of context-aware systems and driver behavior systems. Section 3 presents the questionnaire design and the results obtained from the questionnaire. Section 4 describes the context-aware system architecture used to classify driver behavior with a smartphone as the sensing device. Section 5 explains our experimental setup for data acquisition, which generates the learning and testing data sets. Section 6 details the design and implementation of the Static Bayesian Network (SBN) and the DBN using GeNIe/Structural Modeling, Inference, and Learning Engine (SMILE). System evaluation is given in Section 7, and Section 8 concludes the paper.
Related work
Questionnaires have been used as an effective research instrument in the literature by various researchers. In [23], a questionnaire has been used to survey the grabbing style of the steering wheel, and the results validate the analysis of driving behavior obtained by the driving simulator system. An interview-based questionnaire has been used in [24] to analyze the mental and physical state of the driver before an accident. In [25], a questionnaire has been filled in by 354 drivers to study drivers’ sensitivity towards the use of advanced driver assistance and safety systems. [26] recognized the driving style on the basis of a driver behavior questionnaire. [27] presented a modification of the Manchester Driver Behavior Questionnaire (MDBQ) for Australian drivers to determine professional driving behavior. A semi-structured questionnaire to understand driving-related aggression has been given in [28]. [29] used a questionnaire to identify the significant distraction factors for drivers that lead to crashes. In [30], a questionnaire has been used to obtain socio-economic characteristics of the drivers, such as age, education level and income level. The MDBQ [31] has been used by various researchers to measure driver behavior [3]. In [32], a study has been conducted using a questionnaire filled in by young people in Alabama, U.S.A.; it concluded that texting while driving and drunk driving are the prominent reasons for road crashes. In [33], a Driving Style Questionnaire has been developed to understand driving behavior. In [34], a questionnaire survey has been used to validate the results obtained from a driving simulator for driver behavior.
Several context-aware systems have been proposed in the literature that consider the context as the vehicle, the environment or the driver [35–37]. The layered conceptual design architecture has been discussed in [9], which concludes that separating context data acquisition from processing allows the context information to serve a multitude of clients and applications. A review of the basic concepts related to types of context information, context modeling techniques and their relationship, and the uncertainty associated with the obtained context information has been given in [11].
Different driver behavior detection systems for ITS have been proposed in the literature and reviewed in [38]. That review concludes that the use of a smartphone as a sensing device is a cost-effective technique for driver behavior detection.
Various modeling and machine learning techniques, such as Artificial Neural Networks (ANN), HMM, Gaussian Mixture Models (GMM) and Fuzzy Inference Systems, have been used in the literature for robust identification or classification of driver behavior [39]. [40, 41] presented surveys of smartphone-based techniques for driver behavior detection, and the techniques have been compared in [40]. In [42], an efficient system to detect reckless driving behavior using the road design principles of civil engineering has been presented. In [43], a driving simulator has been used to characterize the driver using pattern recognition techniques. [44] monitored the driver’s vigilance using a fuzzy classifier to infer the driver’s level of attentiveness. HMM has been used for the generation and recognition of driving patterns in [45–48]. In [49], a vehicle fitted with special sensors has been used, and HMM along with a Hybrid State System (HSS) has been implemented to monitor driver behavior at intersections. [39, 50] and [51] have used neural networks as a machine learning technique to detect or classify driver behavior. [52] used GMM for driver feature extraction and recognition. [53] detected drowsy drivers by passing the collected data to a machine learning algorithm such as AdaBoost. In [54], a mobile phone has been used to gather data using the inbuilt sensors, and a Fuzzy Inference System has been used to assign a score to each driving maneuver for identification purposes. [55] presented a system that detects car accidents on the basis of data collected using smartphone sensors; Dynamic Time Warping (DTW) is used to distinguish between accident states and HMM is used to analyze the data. [56] presented a complete review of different technologies used to monitor driver inattention, which is classified as distraction or fatigue.
Different approaches for driver behavior detection have been surveyed in [57]; with the help of specialized hardware devices, they achieve classification accuracies ranging from 77% to 97%. Smartphone-based driver behavior detection systems that use machine learning or modeling techniques have been proposed in the literature by various researchers [58–70].
The approaches described above use smartphones for driver behavior detection. However, they consider only a partial context, and the context-aware systems in the literature use specialized devices to detect driver fatigue. In contrast, our proposed system considers the complete context (driver, vehicle and environment) and classifies driver behavior using a DBN, with inbuilt smartphone sensors used for network parameterization and data collection. The proposed system detects driver fatigue on the basis of the circadian cycle, temperature and number of driving hours. Hence, our work eliminates the extra cost of specialized hardware by relying on the sensors that are now available in smartphones.
Questionnaire results
The proposed system gathers information on the driver’s perspective and psychological information using a questionnaire. The questionnaire comprises 18 items, given in Table 1. The items reflect the context related to the environment, the driver and the vehicle, and the questionnaire has been filled in by 268 drivers using online Google Forms. The items have been evaluated using a five-point Likert scale, i.e. strongly agree, agree, indifferent/neutral, disagree and strongly disagree. The statistical analysis has been done using SPSS, which is one of the most widely used tools for processing and analyzing survey data [71].
To exclude invalid data points and maintain consistency in the data, preliminary checks have been performed to verify the format of the input fields and to treat missing responses. The standard deviation of the scores has been calculated for each response in order to remove responses in which every score equaled the mean (i.e. responses with no variation), thereby discarding uninformative entries. The survey covered diverse respondents in terms of gender, age and driving experience. Table 2 gives the cross-tabulation of age and driving experience by gender.
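The screening step described above can be illustrated with a short Python sketch. It assumes that each response is a list of Likert scores coded 1 to 5 and that a response whose scores all equal their mean (zero standard deviation) is the kind of entry being removed; the function name `drop_constant_responses` is hypothetical and is not taken from the SPSS workflow used in the study.

```python
import statistics


def drop_constant_responses(responses):
    """Keep only responses whose item scores show some variation.

    Assumption: a response in which every score equals the mean
    (population standard deviation of zero) carries no information.
    """
    return [row for row in responses if statistics.pstdev(row) > 0]


# Hypothetical responses to four items on a five-point scale.
sample = [
    [3, 3, 3, 3],  # identical answers: removed
    [1, 2, 4, 5],  # varied answers: kept
    [5, 5, 5, 5],  # identical answers: removed
]
cleaned = drop_constant_responses(sample)
```

Only the second response survives in this example, because it is the only one with a non-zero standard deviation.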
The driver behavior questionnaire
Gender age and experience crosstabulation
In the first phase, Principal Component Analysis (PCA), a dimension reduction technique, has been applied. Its suitability has been checked using the Kaiser–Meyer–Olkin (KMO) statistic and Bartlett’s Test of Sphericity (BTS). The KMO test measures the adequacy of the sample and returns a value between 0 and 1. BTS tests the null hypothesis that the variables are uncorrelated, i.e. that the correlation matrix is an identity matrix. The KMO measure of sampling adequacy is given by Equation (1), where $R = [r_{ij}]$ represents the correlation matrix and $U = [u_{ij}]$ is the partial covariance matrix:
$$\mathrm{KMO} = \frac{\sum_{i \neq j} r_{ij}^{2}}{\sum_{i \neq j} r_{ij}^{2} + \sum_{i \neq j} u_{ij}^{2}} \qquad (1)$$
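As a rough illustration of the KMO statistic described above, the following Python sketch computes it from a correlation matrix. It assumes that the entries $u_{ij}$ are the partial correlations obtained from the inverse of the correlation matrix, which is the usual textbook construction; the helper names `invert` and `kmo` are hypothetical, and the study itself obtained the value from SPSS.

```python
import math


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def kmo(corr):
    """KMO = sum r_ij^2 / (sum r_ij^2 + sum u_ij^2) over all i != j."""
    n = len(corr)
    inv = invert(corr)
    r2 = 0.0
    u2 = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                # Assumed definition: partial correlation from the inverse.
                partial = -inv[i][j] / math.sqrt(inv[i][i] * inv[j][j])
                r2 += corr[i][j] ** 2
                u2 += partial ** 2
    return r2 / (r2 + u2)


# Hypothetical 3 x 3 correlation matrix with all correlations equal to 0.5.
example = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]
value = kmo(example)
```

For this toy matrix the partial correlations are all 1/3, so the statistic equals 9/13, which is about 0.69.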
The results of the tests are given in Table 3. A KMO value greater than 0.6 indicates adequate sampling, and a BTS significance (p-value) of less than 0.05 indicates that factor analysis is valid [72]. The results therefore indicate that factor analysis can be used to analyze the data. In the second phase, to account for negatively worded questions, their scores have been reversed so that all items yield consistent observations for the analysis. This changes the values of the adequacy tests, but they remain sufficiently above the threshold to continue with the stated method. To identify the components that account for correlation and hierarchical representation, a variance-based analysis has been conducted and the components with eigenvalues of one and above are retained. Table 4 gives the total variance for all the components, of which 7 factors have eigenvalues greater than 1. A scree plot, which depicts the variance contributed by each principal component, helps decide which factors to retain. Figure 1 shows the scree plot obtained for the factors under consideration. Table 5 shows the rotated component matrix, in which the questionnaire items of interest 2, 6 and 10 (loaded on factor 4) have related and favorable measures. In addition, for the scope of this paper, questionnaire items 8, 14 and 16 have been considered in further computations. The relevant components thus obtained capture the importance of driver behavior in making sudden lane changes, improper turns and exceeding the speed limit. This research further considers these factors along with extrinsic variations due to the time of driving (Time of day) and hours of driving for fatigue detection. According to the responses obtained with respect to time of day, a driver is most likely to feel drowsy or fatigued at night.
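The eigenvalue-based retention rule described above can be sketched as follows. The function takes a list of eigenvalues (such as those reported by a PCA tool), keeps the components with eigenvalues of one and above, and reports the percentage of variance and cumulative percentage for each. The function name and the eigenvalues in the example are hypothetical and are not the values from Table 4.

```python
def retained_components(eigenvalues, threshold=1.0):
    """Return (rank, eigenvalue, % variance, cumulative %) for each
    component whose eigenvalue is at least `threshold`."""
    total = sum(eigenvalues)
    rows = []
    cumulative = 0.0
    for rank, value in enumerate(sorted(eigenvalues, reverse=True), start=1):
        if value < threshold:
            break  # remaining eigenvalues are smaller still
        share = 100.0 * value / total
        cumulative += share
        rows.append((rank, value, share, cumulative))
    return rows


# Hypothetical eigenvalues of a 4-item correlation matrix (they sum to 4).
table = retained_components([1.8, 0.2, 1.5, 0.5])
```

In this toy case two components are retained, and together they account for 82.5% of the total variance.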
The output obtained from the questionnaire is used to set the initial probabilities for the proposed probabilistic Bayesian Network model.
KMO and Bartlett’s Test results
Principal component analysis: Total variance
Extraction Method: Principal Component Analysis. (* –: signifies values corresponding to Eigenvalue less than 1.)
Rotation component matrix
Extraction Method: Principal Component Analysis. Rotation Method: Varimax with Kaiser Normalization. a. Rotation converged in 7 iterations.
Context-aware systems are able to adapt their operations to the existing context without user intervention. The context-aware system architecture for driver behavior classification is given in Fig. 2. The system consists of three subsystems: sensing, reasoning and reacting. The sensing subsystem collects all the contextual information using inbuilt smartphone sensors, making it a cost-effective solution for data collection. The reasoning subsystem uses a DBN that fuses information from the sensing subsystem with other contextual information obtained from the questionnaire and the research literature to classify the driver. The reacting subsystem generates an on-screen message that describes the driving behavior as safe, fatigue or unsafe. The reacting subsystem requires converting the DBN (generated by GeNIe) for use on an Android smartphone; this can be done with the help of the freely available SMILE wrappers and is outside the scope of this article.
Data acquisition
Real-time data has been collected for the sensing phase of the context-aware system and for learning and testing of the DBN. Mid-budget and low-budget smartphones nowadays include a GPS sensor but do not offer an orientation sensor or magnetometer. The majority of the approaches proposed in the literature use either a magnetometer/orientation sensor or fused data from the accelerometer, gyroscope and magnetometer to evaluate driving maneuvers. We propose to use only the accelerometer and gyroscope, which are basic smartphone sensors, thus making our approach compatible with any low-budget smartphone that has both. Filtering and smoothing of the data have been accomplished using a high-pass filter and an exponential moving average respectively, as described in our previous work [73].
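A minimal Python sketch of the two preprocessing steps named above is given below. The exact filter design and coefficients used in [73] are not stated here, so the first-order high-pass filter (the signal minus a running low-pass baseline) and the smoothing factors are assumptions chosen only for illustration; the function names are hypothetical.

```python
def high_pass(samples, alpha=0.8):
    """First-order high-pass filter: subtract a slowly varying baseline
    (for example the gravity component) from each sample.

    `alpha` is an assumed smoothing factor for the baseline estimate.
    """
    filtered = []
    baseline = None
    for x in samples:
        baseline = x if baseline is None else alpha * baseline + (1 - alpha) * x
        filtered.append(x - baseline)
    return filtered


def exponential_moving_average(samples, alpha=0.5):
    """Smooth a signal; larger `alpha` gives more weight to new samples."""
    smoothed = []
    previous = None
    for x in samples:
        previous = x if previous is None else alpha * x + (1 - alpha) * previous
        smoothed.append(previous)
    return smoothed


# Hypothetical accelerometer readings (m/s^2) along one axis.
raw = [9.8, 9.8, 11.3, 9.9, 9.8]
clean = exponential_moving_average(high_pass(raw))
```

A constant input such as gravity alone is mapped to zero by the high-pass step, so only changes caused by the vehicle's motion remain for the smoothing step.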

Scree plot for the factors under consideration.
Data acquisition parameters
An experimental setup has been created to acquire data. It consists of a smartphone placed inside the vehicle (car) and fixed to the windshield. An Android application has been developed and installed on the smartphone. The application has been developed for: (1) probability generation, to set the initial probabilities of the Bayesian Network (when the “Generate Probabilities” button is pressed), and (2) real-time data set generation for learning and testing of the DBN (when the “STOP” button is pressed). The user interface of the Android application is given in Fig. 3. The device used for data collection (with an inbuilt accelerometer, gyroscope, magnetometer and GPS) is a Xiaomi Redmi Prime 2 (API level 19). Extensive road experiments have been conducted for learning of the DBN. The experiments were conducted on the Chandigarh-Patiala Highway, India (NH-64) over a period of 20 days, covering approximately 45 km per day. The map of the road on which data has been acquired is given in Fig. 4 (indicated by arrows). Table 6 specifies the details of the data acquisition parameters.

The proposed context-aware system.
Driver behavior has been classified as safe (non-aggressive), fatigue or unsafe (aggressive) at a sampling rate of 5 Hz. The driver behavior (B) can be represented by the transitions between states during the course of driving and is given by Equation (2):
($S_i$ is the state of the driver at each time step $i$)

Snapshot of Android application used for probability generation and data set generation.

Chandigarh-Patiala National Highway, India (NH-64).
The information obtained from the questionnaire and previous research studies (driver’s context) is fused with the information from the sensing system (vehicle’s context) to get more reliable results. Several information fusion techniques, such as fuzzy logic, Dempster-Shafer theory and ANN, have been used by researchers to identify driver behavior. However, we use a DBN to capture the temporal and static aspects of driver behavior. A DBN is a suitable approach for combining uncertain contextual information from different sources [21] and, compared to a Support Vector Machine (SVM), DBNs are easier to interpret and suitable for multi-label classification [74].
A DBN is a probabilistic time series model that describes a dynamic system as a stochastic process and consists of interconnected SBNs. A first-order HMM has been used to model the relationship between two consecutive SBNs, i.e. the random variables at time (t) are influenced by other random variables at time (t) as well as by the variables at time (t–1). A two-slice DBN can be given by Equation (3) [21]:
where N is the number of nodes.
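For reference, the two-slice factorization referred to above is commonly written in the DBN literature in the form below. The notation is an assumption and may differ from the authors' Equation (3): $Z_t^i$ denotes the $i$-th node in time slice $t$, $\mathrm{Pa}(Z_t^i)$ denotes its parents (which may lie in slice $t$ or $t-1$), and $T$ is the number of unrolled time slices.

```latex
% Two-slice transition model: each of the N nodes in slice t
% depends only on its parents in slices t and t-1.
P(Z_t \mid Z_{t-1}) = \prod_{i=1}^{N} P\!\left(Z_t^{i} \mid \mathrm{Pa}(Z_t^{i})\right)

% Joint distribution obtained by unrolling the network for T slices.
P(Z_{1:T}) = \prod_{t=1}^{T} \prod_{i=1}^{N} P\!\left(Z_t^{i} \mid \mathrm{Pa}(Z_t^{i})\right)
```

The second expression is what unrolling the network over several time steps amounts to: the same per-node conditional distributions are repeated in every slice.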
To construct a DBN, we first determine the nodes and then their probabilities. Two types of nodes need to be identified: hypothesis nodes, and information/evidence nodes that tell something about the hypothesis. The hypothesis node in our network is the Driver_Status node, with three mutually exclusive states: safe, fatigue and unsafe. The information/evidence nodes (Time_of_day, Hours_driving, Temperature, Acceleration, Deceleration and Turns) represent vehicle, environment and driver-related information. They have been selected on the basis of the prevailing factors obtained from the responses to the questionnaire filled in by the drivers. The evidence node Temperature has been made part of the Bayesian network as it has a key effect on driver fatigue [10, 21]. Table 7 gives the description and state space of all the nodes.
Bayesian Network nodes and their states
GeNIe Modeler, a simple interface to SMILE, has been used to design and implement the DBN. SMILE, implemented in C++, is a platform-independent library of functions for implementing graphical probabilistic and decision-theoretic models such as Bayesian Networks, influence diagrams and DBNs. SMILE is the reasoning engine for probabilistic models, and GeNIe is its outer shell [75, 76]. It has been used by researchers to build and define BNs and DBNs [77–81]. The steps for creating a DBN using GeNIe are as follows [82]:
Step 1: Create a SBN and add static probabilities
The SBN with the selected states is given in Fig. 5. To parameterize the Bayesian network (i.e. to choose the values of the conditional probability tables), we use the probability metrics obtained from the App for a safe/non-aggressive driver on the basis of the real-time collected data. The initial conditional probabilities for acceleration, deceleration and turns have been obtained using the Android App described in Section 5. The likelihoods of the states of Time_of_day and Hours_driving have been obtained from the questionnaire responses with respect to safe driving behavior. The probabilities for the Temperature states have been taken from research studies on driver behavior systems [10, 21]. Tables 8–13 give the initial probabilities for the nodes of the Bayesian Network.
Step 2: Validation and Learning in SBN
The next step is to use the “Update Beliefs” option of GeNIe with the clustering algorithm, which compiles the directed graph and updates the probabilities. The EM algorithm has been used in GeNIe to learn the parameters from the given data set. K-fold cross-validation (K = 10) has been used as the evaluation method. The SBN achieves an accuracy of 77–79% in classifying driver behavior. The updated probability values of the validated SBN are then used as the initial probabilities for the DBN.
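The K-fold evaluation scheme mentioned above can be sketched independently of GeNIe as follows. The function only produces the train/test index splits; the shuffling seed and the function name are assumptions made for illustration, and the learning step itself (EM in GeNIe) is not reproduced.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for K-fold cross-validation.

    Every sample appears in exactly one test fold. `seed` is an assumed
    parameter that makes the shuffle reproducible.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical data set of 25 records split into 10 folds.
splits = list(k_fold_indices(25, k=10))
```

In each of the ten rounds, the model would be learned on the nine training folds and scored on the held-out fold, and the reported accuracy would be the average over the rounds.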
Step 3: Create DBN from SBN

Static Bayesian network at t = 0.
Initial probability for Time_of_day node
Initial probability for Hours_driving node
Initial probability for temperature

Unrolled DBN for two time steps.
Conditional probability for acceleration node given its parent node
Conditional probability for deceleration node given its parent node
Conditional probability for turns node given its parent node
To convert the SBN into a DBN, a temporal plate and temporal arcs are added and the time step count is set. The temporal plate contains all the nodes of the SBN. The time step count has been set to 25, as the state of the driver is observed every 5 seconds at a sampling frequency of 5 Hz (5 s × 5 Hz = 25 samples). The temporal arc has been added with a delay of one time step, which means that the state of the driver at t = n depends on the state of the driver at t = n–1 (each time slice t spans 200 ms). Thus, we get the status of the driver as safe, fatigue or unsafe after 25 time steps, which correspond to 5 seconds.
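The effect of a one-step temporal arc over 25 time slices can be illustrated with a small forward-filtering sketch in Python. All numbers below are hypothetical, and the structure is deliberately simplified to a single evidence node (Acceleration with the assumed states "normal" and "harsh") that depends on Driver_Status; it is not the network or the probability tables used in the paper.

```python
STATES = ("safe", "fatigue", "unsafe")

# Hypothetical transition model P(status at t | status at t-1).
TRANSITION = {
    prev: {cur: 0.8 if cur == prev else 0.1 for cur in STATES}
    for prev in STATES
}

# Hypothetical evidence model P(acceleration reading | status).
EMISSION = {
    "safe": {"normal": 0.9, "harsh": 0.1},
    "fatigue": {"normal": 0.7, "harsh": 0.3},
    "unsafe": {"normal": 0.3, "harsh": 0.7},
}


def filter_driver_status(observations, prior=None):
    """Return the belief over Driver_Status after the last observation."""
    belief = dict(prior) if prior else {s: 1.0 / len(STATES) for s in STATES}
    for step, obs in enumerate(observations):
        if step > 0:
            # Temporal arc: propagate the belief one time slice forward.
            belief = {
                cur: sum(belief[prev] * TRANSITION[prev][cur] for prev in STATES)
                for cur in STATES
            }
        # Condition on the evidence observed in this time slice.
        unnormalized = {s: belief[s] * EMISSION[s][obs] for s in STATES}
        total = sum(unnormalized.values())
        belief = {s: v / total for s, v in unnormalized.items()}
    return belief


# 25 time slices of 200 ms each, i.e. one 5 second window.
belief_after_window = filter_driver_status(["harsh"] * 25)
```

With these illustrative numbers, a window of consistently harsh readings shifts the belief towards the unsafe state, while a window of normal readings shifts it towards the safe state.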

(a) ROC (Safe Driver); (b) ROC (Unsafe Driver); (c) ROC (Fatigue Driver).
Step 4: Adding probabilities and learning of DBN
The initial probabilities for the DBN have been set using the probabilities obtained from the SBN after validation. The data produced by the accelerometer and gyroscope sensors ($M$) cannot be used directly as the learning or testing data set for GeNIe. To convert it into the format accepted by GeNIe ($M_G$) for learning and testing, the App automatically follows the process given by Equation (4) every 5 seconds.
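The conversion from raw sensor samples to discrete records can be pictured with the Python sketch below. It does not reproduce Equation (4): the threshold, the state labels "Normal" and "Harsh", and the function name `to_records` are all hypothetical, and the sketch only shows the idea of turning each 5 second window (25 samples at 5 Hz) into one record of discrete states.

```python
SAMPLES_PER_WINDOW = 25  # 5 seconds at a sampling rate of 5 Hz


def to_records(samples, threshold, window=SAMPLES_PER_WINDOW):
    """Discretize filtered sensor samples and group them into windows.

    Each returned record holds one discrete label per time slice.
    Samples left over after the last full window are ignored.
    `threshold` and the labels are assumptions made for illustration.
    """
    labels = ["Harsh" if abs(x) > threshold else "Normal" for x in samples]
    return [
        labels[start:start + window]
        for start in range(0, len(labels) - window + 1, window)
    ]


# Hypothetical filtered acceleration values: 60 samples give 2 full windows.
records = to_records([0.1] * 50 + [3.5] * 10, threshold=2.0)
```

Each record could then be written out as one row of a learning or testing file, with one column per time slice.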
The DBN is trained using the EM algorithm on data collected with the App on different days. To validate the DBN, it needs to be unrolled as shown in Fig. 6; on the test data, an accuracy of 80–83% has been achieved. K-fold cross-validation with K = 10 has been used for validation. The Receiver Operating Characteristic (ROC) curves for Driver_Status as Safe, Unsafe and Fatigue are given in Fig. 7a–c respectively. The Area Under the Curve (AUC) is approximately 0.8 in all cases. An AUC of 0.5 corresponds to random guessing and an AUC of 1 to perfect classification; a value of about 0.8 therefore supports the discriminative ability of the proposed classification approach.
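To make the AUC figure concrete, the sketch below computes the area under the ROC curve for one class treated as positive against the rest, using the equivalent rank-based definition (the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one). The function name and the example scores are hypothetical and are not taken from the experiments.

```python
def auc(labels, scores):
    """Area under the ROC curve for binary labels (1 = positive class).

    Counts, over all positive/negative pairs, how often the positive
    example has the higher score; ties count as half.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical posterior probabilities of the "safe" state for four windows.
example_auc = auc([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1])
```

In this example three of the four positive/negative pairs are ranked correctly, which gives an AUC of 0.75.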

(a) Temporal Belief for Safe driver; (b) Temporal Belief for Fatigue driver.
Comparison of driver behavior classification techniques
Performance comparison of the proposed system
(*NA- Not available)
Safety on roads is the key focus area of ITS. Providing information to drivers about their driving behavior is the next step towards improving safety by preventing accidents and saving lives on roads. The main challenge is posed by developing countries, where road accidents are one of the major causes of death and there is thus a need to enhance safety on roads. Therefore, we developed a context-aware system that classifies driver behavior as safe, fatigue or unsafe. A questionnaire has been designed to obtain the relevant factors that reflect safe or unsafe driving behavior. An experimental setup has been created to obtain real-time data pertaining to the major influencing factors, using an Android application developed and installed on a smartphone. The smartphone has been kept in a vehicle that was test run on the Chandigarh-Patiala National Highway, India. The driver behavior has been classified using a DBN (designed using GeNIe) by fusing real-time data and other contextual information. The system has been compared with other systems and yields a classification accuracy of 80–83%, keeping in mind the cost constraint and the promotion of ITS in developing countries. The proposed system is suitable for urban areas, rural areas and highways. It can be installed on the driver’s smartphone, and an alert can be generated if the driver is not driving safely or is feeling drowsy. The proposed system can be extended to detect other driving behaviors and maneuvers, such as drunk driving and abrupt lane changes, in an efficient manner.
