Abstract
Driver drowsiness is one of the major causes of road accidents, leading to severe physical injuries, deaths, and significant economic losses. Driver drowsiness detection systems are therefore needed as a countermeasure to prevent sleepiness-related accidents. This paper performs drowsiness detection using the driver’s eye state, head pose, and mouth state. The input data were collected from the public drowsy driver database. The Camera Response Model (CRM) was then applied to improve the quality of the collected data. The Viola-Jones and Kanade-Lucas-Tomasi (KLT) approaches were used to detect and track the driver’s face, eye, and mouth regions in the input video. The Online Region-Based Active Contour Model (ORACM) algorithm was used to segment the driver’s mouth region in order to obtain the threshold value. Next, two feature extraction methods, Histogram of Oriented Gradients (HOG) and Local Binary Pattern (LBP), were applied to the detected eye region. The extracted eye-region features were combined with the mouth-region threshold value and the head pose angle. The infinite algorithm was then used to select the relevant feature vectors. Finally, the selected features were classified using a Support Vector Machine (SVM) to determine the stage of drowsiness. Simulation results showed that the proposed system improved classification accuracy by 5.52 percentage points compared with a hybrid Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) model.
Introduction
In recent decades, driver drowsiness has been one of the biggest safety issues in road transportation. To prevent on-road and run-off-road accidents, an on-board driver drowsiness detection system is necessary in vehicles [1, 2]. Drowsiness detection assesses different measures such as visual features, vehicle behaviour, and physiological features. Vehicle-based measures use a number of metrics to detect driver drowsiness, such as steering wheel movement, lane departure, and pressure on the acceleration pedal [3]. The main issue with this technique is that its accuracy depends on the individual properties of the vehicle and the driver. Techniques based on visual features such as yawning, facial expressions, head movement, and eye state have shown effective performance in driver drowsiness detection because they are non-contact in nature [4, 5, 6], and they have emerged as a promising field of research. Techniques based on yawning cannot predict the onset of drowsiness, because this feature does not represent drowsiness [7, 8, 9]. In contrast, eye state information (eye closed or eye open), head pose, and mouth state are well suited to drowsiness detection, because unusual eye blinking patterns and the opening and closing of the eyes directly indicate the onset of drowsiness [10, 11, 12]. This paper addresses driver drowsiness detection using behavioural methods based on machine learning approaches to classify the sub-stages of driver drowsiness.
In this work, a new system is proposed to improve the accuracy of driver drowsiness detection. The data were collected from the public drowsy driver dataset of the ACCV 2016 competition. After data collection, CRM was applied to enhance the quality of the collected data. The Viola-Jones and KLT methods were then used to detect and track the driver’s face, mouth, and eye regions in the input video. In addition, the ORACM algorithm was used to segment the driver’s mouth region and obtain the threshold value (height and width of the mouth region). Two feature extraction methods, HOG and LBP, were then applied to extract local and global features from the detected eye region. The extracted eye-region features were combined with the mouth-region threshold value (the detection threshold value is 90 on the input frame) and the head pose angle. After the feature and threshold values were obtained, the infinite algorithm was used to reduce the dimension of the extracted features by eliminating irrelevant features. The selected feature values were classified by an SVM into two stages of driver drowsiness: drowsy or non-drowsy. Finally, the performance of the infinite algorithm with SVM was compared with a hybrid CNN-LSTM in terms of specificity, f-score, sensitivity, and classification accuracy.
Section 2 surveys a few research papers on driver drowsiness detection. Section 3 briefly explains the proposed system with its mathematical expressions. Section 4 presents the simulation results of the proposed system. Section 5 gives the conclusion.
Literature survey
Researchers have carried out numerous studies on different stages of driver drowsiness detection. Some key contributions from the existing literature are presented here.
de Naurois et al. [13] used Artificial Neural Networks (ANNs) to detect the driver’s drowsiness level and to predict the onset of an impaired driving state. Two ANN-based approaches were used: one to predict the level of drowsiness, and one to estimate how long the driver takes to reach a moderately drowsy state. The drowsiness detection performance of the developed methodology improved by approximately 80% in detection and 40% in prediction compared with existing systems. The subject-specific adaptation of the driver’s data gave a better response to the issues of high inter- and intra-individual variability. However, the work did not consider different road conditions and times.
Panicker and Nair [14] developed a new drowsiness detection system that comprises three main phases. The first phase was face detection, accomplished using a template matching approach and an elliptical approximation technique. In the second phase, iris-sclera pattern analysis was used to detect the open eye. In the third phase, the PERCLOS measure was used to determine the driver’s drowsiness state. The developed system did not depend on any dataset for eye or face detection, and it used morphological and Laplacian operations for open-eye detection. However, when the driver looks outside, the iris lies at the extreme right or left of the eye, and in such conditions the system failed to detect the sclera symmetry.
McDonald et al. [15] evaluated temporal and contextual algorithms for detecting drowsiness-related lane departures. The developed method takes pedal input, steering angle, acceleration, and vehicle speed as input. Acceleration and speed were used to develop a real-time measure of driving context. These measures were combined with a dynamic Bayesian network, which accounted for the time dependencies in the transition between the awake and drowsy states. The study has a few limitations: the scope of the ground-truth drowsiness, the use of a driving simulator, and the size of the test dataset.
Guo and Markoni [16] developed a hybrid classifier, combining CNN and LSTM, for driver drowsiness prediction. The hybrid model achieved low computational cost and better classification accuracy. The developed model was tested on the public drowsy driver database [12], and its performance was evaluated in terms of classification accuracy. When the dimension of the extracted features was high, classification became difficult. Additionally, the performance of the CNN model depends on the amount of input data: with less data, the CNN model performs poorly.
Zhao et al. [17] developed a new system for recognizing driver drowsiness expressions using a Deep Belief Network (DBN) and facial dynamic fusion information. The textures and landmarks of the facial regions were first extracted from videos captured with a high-definition camera. The DBN was then used to classify the driver’s facial drowsiness expressions. The experimental results demonstrated the superiority of this system. However, the study did not address occlusion and large head rotation, which significantly diminish the efficiency of the developed system.
A new system is proposed in this paper to address the above-mentioned issues and to improve the detection of driver drowsiness.
Proposed system
The proposed system contains six stages: data collection, pre-processing, object detection and tracking, feature extraction, selection of optimal features, and classification of the driver’s drowsiness. Figure 1 shows the flow diagram of the proposed system, which is briefly explained below.
Flow diagram of infinite algorithm with SVM classifier.
First, the data are acquired from the public drowsy driver database of the ACCV 2016 competition [12]. In this dataset, the video frames were captured using a D-Link DCS-932L with a resolution of 640
Sample images of public drowsy driver dataset.
After data acquisition, pre-processing is performed using CRM in order to reduce noise. Camera manufacturers normally apply a few nonlinear operations in the camera, for instance demosaicing and white balance, to enhance the visual quality of the images. The CRM contains two major components: the Brightness Transform Function (BTF) and the Camera Response Function (CRF). The parameters of the CRF are determined by the camera alone, whereas the BTF is determined by both the exposure ratio and the camera. The BTF is first calculated from the observation of two images with different exposures. The corresponding CRF is then derived by solving the comparametric equation. These two functions are described mathematically in Eqs (1) and (2).
Brightness transform function
At first, brightness transform function selects two frames such as,
Where,
The CRF calculates the relationship between BTF parameters such as
If
Sample pre-processed images of public drowsy driver dataset.
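To make the role of the BTF concrete, the following is a minimal sketch of one commonly used parametric form, the beta-gamma model, in which a pixel value P is mapped to β·P^γ with β and γ depending on the exposure ratio k. Whether the proposed system uses exactly this form is an assumption; the function name `btf_beta_gamma` and the default values of `a` and `b` are illustrative and are not taken from this paper.

```python
import math

def btf_beta_gamma(pixel, k, a=-0.3293, b=1.1258):
    """Map a normalized pixel value in [0, 1] to its value at exposure ratio k.

    Assumed beta-gamma BTF: g(P, k) = beta * P**gamma, with
    gamma = k**a and beta = exp(b * (1 - k**a)).
    The defaults for a and b are illustrative assumptions.
    """
    gamma = k ** a
    beta = math.exp(b * (1.0 - gamma))
    # Clip so that the result stays a valid normalized intensity.
    return min(1.0, beta * pixel ** gamma)

# With k = 1 the image is unchanged; with k > 1 dark pixels are brightened.
unchanged = btf_beta_gamma(0.2, 1.0)
brightened = btf_beta_gamma(0.2, 4.0)
```

In this sketch the exposure ratio `k` is the only quantity that varies per frame, which matches the description of the BTF as depending on both the camera and the exposure ratio.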
After pre-processing the collected data, the Viola-Jones and KLT methods are used to detect and track the driver’s face, mouth, and eye regions in the input video. The head pose angle is then estimated from the driver’s face region for every video frame. In addition, the ORACM algorithm is used to segment the driver’s mouth region and obtain the threshold value on the basis of the height and width of the mouth region. ORACM is a region-based active contour method that does not require additional parameters, which gives it better segmentation accuracy than conventional ACM approaches. In every iteration, ORACM performs a kind of block thresholding procedure. This thresholding procedure generates many minor particles and rigid boundaries, so an effective morphological operation is applied to obtain a proper, smooth object contour and to eliminate the minor particles.
Like other ACM algorithms, ORACM uses a user-defined active contour at the initialization step and then continuously updates it. Level set function
Where,
a) Pre-processed image, b) extracted face region, c) extracted eye region, and d) extracted mouth region.
a) Face image, b) eye blink, c) yawning, and d) head bending.
After eye region detection, feature extraction is performed on the detected eye regions. In this study, high-level texture features (HOG and LBP) are used to extract features from the detected eye regions. HOG and LBP are briefly described below.
Histogram of oriented gradients
In HOG descriptor, a gradient operation
Then, the windows in the input images are divided into several spatial regions, which are called cells. In the HOG feature descriptor, the gradient magnitude of each pixel is associated with the orientation of the edge. The gradient magnitude of the pixel
Where,
Where,
Output image of HOG.
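The cell-level HOG computation described above can be sketched in a few lines. This is a minimal illustration and not the authors’ implementation: it assumes central-difference gradients, unsigned orientations in [0°, 180°), 9 orientation bins, and hard (non-interpolated) binning. The function name `hog_cell_histogram` and all parameter values are hypothetical choices for the example.

```python
import math

def hog_cell_histogram(cell, n_bins=9):
    """Orientation histogram of one grayscale cell given as a list of rows.

    Each interior pixel votes with its gradient magnitude into the bin
    of its unsigned gradient orientation (assumed range: 0 to 180 degrees).
    """
    rows, cols = len(cell), len(cell[0])
    hist = [0.0] * n_bins
    bin_width = 180.0 / n_bins
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]   # horizontal gradient
            gy = cell[y + 1][x] - cell[y - 1][x]   # vertical gradient
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # The final modulo guards against a rounding result of exactly 180.
            hist[int(angle // bin_width) % n_bins] += magnitude
    return hist

# A vertical edge: intensity changes only along x, so all votes fall in bin 0.
cell = [[0, 0, 10, 10] for _ in range(4)]
hist = hog_cell_histogram(cell)
```

A full descriptor would additionally group cells into blocks and normalize each block; that step is omitted here to keep the sketch short.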
On the basis of luminance value, LBP converts the images into labels. Hence, the gray scale invariance is a vital factor in LBP, which is based on texture and local patterns. The pixel position is stated as
Where,
In LBP, p-neighbourhood delivers
Where,
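As an illustration of how LBP turns local luminance patterns into labels, the sketch below computes the basic 8-neighbour, radius-1 code for one pixel and a 256-bin histogram over an image. The neighbour ordering, the use of `>=` as the threshold test, and the function names `lbp_code` and `lbp_histogram` are assumptions made for the example; the variant used in the proposed system may differ.

```python
def lbp_code(img, y, x):
    """Basic 8-neighbour LBP label of the pixel at (y, x)."""
    center = img[y][x]
    # Assumed ordering: clockwise, starting at the top-left neighbour.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for p, (dy, dx) in enumerate(offsets):
        if img[y + dy][x + dx] >= center:
            code |= 1 << p          # set bit p when the neighbour is not darker
    return code

def lbp_histogram(img):
    """256-bin histogram of LBP labels over all interior pixels."""
    hist = [0] * 256
    for y in range(1, len(img) - 1):
        for x in range(1, len(img[0]) - 1):
            hist[lbp_code(img, y, x)] += 1
    return hist

patch = [[5, 9, 1],
         [4, 6, 7],
         [2, 3, 8]]
code = lbp_code(patch, 1, 1)

# Adding a constant brightness offset leaves the label unchanged.
brighter = [[v + 50 for v in row] for row in patch]
code_brighter = lbp_code(brighter, 1, 1)
```

Because each label depends only on the sign of the difference between a neighbour and the centre, a uniform brightness shift does not change it, which is the gray-scale invariance property mentioned in the text.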
In this study, infinite algorithm is employed to choose the optimal features. Given a set of features
Where,
Where,
In practice,
Let
Where,
The standard matrix algebra is given in the Eq. (15).
Now,
Therefore, the first idea of the feature selection method is to eliminate the non-redundant feature sub-sets. Unfortunately, the computation of
The passage to infinity yields a new type of single-feature score, which is given mathematically in Eq. (17).
Let,
Where,
Where,
The final energy score of each feature is then obtained by marginalizing this quantity, as given in Eq. (21).
A rank for the feature is selected by decreasing the order of
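The scoring and ranking steps can be sketched as follows, under stated assumptions. The sketch takes a precomputed, non-negative feature-to-feature weight matrix `A` as input (how `A` is built from the features is not shown), sums the contributions of paths of every length through a regularized geometric series, and ranks the features by decreasing score. The regularization factor, chosen here from the maximum row sum so that the series converges, and the names `inf_fs_scores` and `rank_features` are hypothetical choices for the example.

```python
def inf_fs_scores(A, damping=0.9, n_iter=200):
    """Score features by summing weighted paths of all lengths.

    Computes s = sum_{l >= 1} (r * A)^l * e, where e is the all-ones
    vector, through the fixed-point iteration x <- e + r * A * x,
    which converges to (I - r * A)^(-1) * e.
    """
    n = len(A)
    max_row_sum = max(sum(abs(v) for v in row) for row in A)
    # Assumed regularization: r * max_row_sum < 1 guarantees convergence.
    r = damping / max_row_sum if max_row_sum > 0 else 0.0
    x = [1.0] * n
    for _ in range(n_iter):
        x = [1.0 + r * sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    return [xi - 1.0 for xi in x]   # drop the zero-length path term

def rank_features(scores):
    """Feature indices sorted by decreasing score."""
    return sorted(range(len(scores)), key=lambda i: -scores[i])

# Toy graph: feature 1 is connected to both other features.
A = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 1.0],
     [0.0, 1.0, 0.0]]
scores = inf_fs_scores(A)
ranking = rank_features(scores)
```

The iteration avoids an explicit matrix inverse, which keeps the example dependency-free; for large feature sets a linear solver would normally be used instead.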
After the selection of optimal features, classification is performed using SVM to separate non-drowsy from drowsy drivers. By developing a relaxed classification error bound, the SVM classifier reduces the size of the resulting dual problem. In addition, the SVM classifier speeds up the training and testing processes while preserving competitive classification accuracy. The SVM is a discriminative classification approach that is represented by a separating hyper-plane. In recent decades, SVM classification has been used extensively in applications such as signal processing, bio-informatics, and computer vision, because it performs well on high-dimensional data. The SVM classifier performs well on two-class problems and is grounded in Vapnik-Chervonenkis theory and the structural risk minimization principle. The formula to calculate the linear discriminant function is represented as
Then, reduce
Finally, interchange the interior product
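For intuition about the linear discriminant function and the margin objective, the sketch below trains a linear SVM on a toy two-class problem. It is a swapped-in simplification, not the solver used in the experiments: it minimizes the regularized hinge loss in the primal by sub-gradient descent and uses no kernel. The labels (+1 for drowsy, -1 for non-drowsy), the toy data, and the hyper-parameters `lam`, `lr`, and `epochs` are all assumptions for illustration.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Sub-gradient descent on lam/2 * ||w||^2 + hinge loss (labels in {-1, +1})."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:
                # Margin violated: regularization shrink plus a push toward yi * xi.
                w = [wj - lr * (lam * wj - yi * xj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:
                # Margin satisfied: only the regularization term acts.
                w = [wj - lr * lam * wj for wj in w]
    return w, b

def predict(w, b, x):
    """Sign of the linear discriminant function w . x + b."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# Hypothetical, linearly separable toy features.
X = [(2.0, 2.0), (3.0, 3.0), (2.5, 3.0), (-2.0, -2.0), (-3.0, -3.0), (-2.5, -3.0)]
y = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(X, y)
train_predictions = [predict(w, b, xi) for xi in X]
```

Replacing the inner product in `predict` with a kernel evaluation over support vectors would give the non-linear form; that extension is left out to keep the example small.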
The experiments were run in MATLAB 2018a on a 3.2 GHz Intel Core i5 processor under the Windows 10 operating system. The performance of the proposed infinite algorithm with SVM classifier was compared with that of the hybrid CNN-LSTM [16] to analyse the efficiency of the proposed system. The proposed infinite algorithm with SVM classifier was evaluated in terms of f-score, specificity, accuracy, and sensitivity on a widely used database: the public drowsy driver dataset of the ACCV 2016 competition. The mathematical expressions for f-score, specificity, accuracy, and sensitivity are given in Eqs (25)–(28).
Where, false negative is denoted as
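The four performance measures follow from the confusion-matrix counts. The sketch below uses their standard definitions; the function name `classification_metrics`, the argument names, and the example counts are hypothetical and are not results from this paper.

```python
def classification_metrics(tp, tn, fp, fn):
    """Standard measures from true/false positive and negative counts."""
    sensitivity = tp / (tp + fn)                   # true positive rate
    specificity = tn / (tn + fp)                   # true negative rate
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    f_score = 2 * tp / (2 * tp + fp + fn)          # harmonic mean of precision and recall
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f_score": f_score,
    }

# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=8, tn=9, fp=1, fn=2)
```

Multiplying each value by 100 gives the percentages in the form reported in the tables.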
Performance investigation of the proposed system with dissimilar classifiers
In this scenario, the public drowsy driver database is used to analyse the performance of the hybrid CNN-LSTM [16] and the proposed system. The public drowsy driver database comprises 22 subjects, of which four (004, 022, 026, and 030) are used for testing and the remaining subjects are used for training.
The mean classification accuracy of SVM is 90.37%, whereas the other classification methods, random forest, K-Nearest Neighbour (KNN), and Neural Network (NN), deliver mean classification accuracies of 66.55%, 82.31%, and 77.9%, respectively. The mean sensitivity of SVM is 91.70%, and the other classifiers attain 76.27%, 85.29%, and 76.59%. The mean specificity of SVM is 89.80%, and the other classifiers deliver 81.25%, 83.83%, and 79.04%. The mean f-score of SVM is 90.61%, and random forest, KNN, and NN deliver 76.16%, 84.12%, and 82.67%. Tables 1 and 2 show that the infinite algorithm with SVM detects driver drowsiness more effectively than the other classifiers on the ACCV 2016 dataset. The performance of the proposed system is plotted in Fig. 7, and its comparison with the other classifiers is plotted in Fig. 8.
Mean value of the proposed system with different classifiers
Mean value of the proposed system with different classifiers
Graphical representation of the proposed system performance.
Graphical comparison of proposed system with dissimilar classifier.
Table 3 reports the performance of the proposed system with and without the infinite algorithm. With the infinite algorithm, the SVM improved the classification accuracy of driver drowsiness detection by 2.86% on average compared with the configuration without it. In this work, HOG and LBP effectively capture the linear and non-linear properties of the driver’s face, mouth, and eye regions and also preserve the relation between lower-level and higher-level features. The performance measures (f-score, specificity, accuracy, and sensitivity) confirm that the proposed infinite algorithm with SVM classifier performs well in driver drowsiness detection compared with the existing system.
Accuracy evaluation of proposed infinite algorithm with SVM classifier
Cross validation of proposed infinite algorithm with SVM classifier
The cross validation of the proposed infinite algorithm with SVM classifier is given in Table 4. On average, the infinite algorithm with SVM classifier attained a specificity of 89.80% and a sensitivity of 91.70%. The average classification accuracy and f-score of the proposed system in driver drowsiness detection are 90.37% and 90.61%, respectively. These simulation results indicate that the infinite algorithm with SVM classifier can contribute to active safety systems.
Table 5 presents the comparative investigation of the proposed and existing systems. Guo and Markoni [16] developed a hybrid CNN-LSTM model for driver drowsiness detection and evaluated it on the public drowsy driver database. In that work, the hybrid CNN-LSTM attained an accuracy of 84.85% in drowsiness detection. The proposed infinite algorithm with SVM achieved a classification accuracy of 90.37%, which is 5.52 percentage points higher than the hybrid CNN-LSTM model. In the proposed research, the selection of optimal features is a vital part of driver drowsiness detection. Every video sequence contains many feature vectors, which leads to the “curse of dimensionality” issue [18]. The infinite algorithm is therefore needed to reduce the extracted feature vectors to those that are relevant for driver drowsiness detection. In addition, HOG and LBP effectively capture the linear and non-linear properties of the video frames and also preserve the relation between lower-level and higher-level feature vectors. The effect of the infinite algorithm is shown in Table 3.
Comparative investigation of the proposed and existing system
As discussed in Section 3, feature selection is an integral part of driver drowsiness detection. Because HOG and LBP produce numerous feature values, the infinite algorithm is applied to choose the feature values that are relevant to driver drowsiness detection. The effect of feature selection is given in Table 3, where the accuracy with the infinite algorithm is 2.86% higher than without it. The proposed model attained better classification performance than the existing model in terms of f-score, specificity, accuracy, and sensitivity. The proposed model therefore has the potential to help prevent major and minor run-off-road accidents.
Conclusion
In this work, a new system is proposed for detecting driver drowsiness. The aim of this study is to propose a superior feature selection methodology for classifying the stages of drowsiness (drowsy or non-drowsy). The infinite algorithm is used to select the relevant feature vectors, and the selected feature vectors are classified using an SVM classifier. Compared with the hybrid CNN-LSTM model, the proposed system achieved better performance in driver drowsiness detection in terms of f-score, specificity, accuracy, and sensitivity. In the simulations, the proposed system attained an accuracy of 90.37% at a speed of 12 frames per second on the public drowsy driver database, whereas the hybrid CNN-LSTM model obtained an accuracy of 84.85%. In future work, a new system can be proposed with an optimization algorithm to further improve the accuracy of driver drowsiness detection.
Acknowledgments
This work was carried out under a scholarship from the Visvesvaraya Ph.D. Scheme for Electronics and IT, Government of India.
