Abstract
Fatigue impairs normal work and can even cause accidents. To reduce the impact of fatigue, we propose a method for real-time fatigue detection. The method comprises the following steps. First, we use the Active Shape Model (ASM) to detect the face and extract Histogram of Oriented Gradients (HOG) features of the eyes and mouth. Second, we use a Support Vector Machine (SVM) to classify the states of the eyes and mouth, and the Pose from Orthography and Scaling with Iterations (POSIT) algorithm to estimate the head pose. Third, based on the facial states, we obtain the fatigue decision indices, whose weights are calculated by the entropy-weighting method. Finally, we apply a Bayesian method to evaluate the driver’s fatigue level from the calculated fatigue decision indices. The mean accuracy of the method is 83.3%.
Introduction
With economic development in China, people’s material life is improving day by day. Transportation in China has shifted from the ubiquitous bicycle to motor vehicles, which greatly facilitates people’s lives. However, as the number of vehicles increases, the probability of car accidents has also increased greatly, resulting in considerable property loss and personal injury [1]. A survey reveals that fatigued driving is the leading cause of traffic accidents [2]. As the number of cars continues to increase, the number of such accidents will continue to rise. Many statistics show the dangers of fatigued driving and warn people not to drive while fatigued [3]. It is therefore urgent to detect the driver’s fatigue level and alert the driver in real time.
Fatigue is defined as the combined psychological and physiological disinclination to continue the task at hand or to take on a new one [4]. Driving fatigue refers to a driver responding slowly because of insufficient sleep or prolonged driving; it mainly manifests as drowsiness, dozing, operating errors and loss of driving ability [5]. Long periods of driving require the driver to focus continuously on the road conditions and the vehicle itself, which consumes a great deal of energy. Prolonged driving can cause fatigue, accompanied by constant yawning, blinking, eye rubbing and reduced attention. As fatigue develops, the driver’s eyes close and the head nods. At this point, the driver can no longer pay attention to the driving situation of the vehicle.
At present, research on fatigue detection can be divided into three categories.
The first category is based on the driving behavior of the vehicle [6], such as steering-wheel operation, driving direction and speed changes. However, these methods are difficult to test, susceptible to interference and have low detection accuracy. The second category detects fatigue by measuring the driver’s physiological parameters, such as the electrocardiogram (ECG) [7], electroencephalogram (EEG) [8] and electrooculogram (EOG) [9]. However, most of the devices used are expensive, specialized equipment that requires direct contact with the driver, which puts psychological pressure on the driver and affects the test result. The third category is based on machine vision [10]. People show obvious changes in appearance when they are tired, and the eye features are the most obvious among them. This approach is currently the more common one, and its detection performance keeps improving as facial detection technology matures. It has the advantage of not requiring contact with the driver [11], which overcomes the shortcomings of the first two categories.
Past fatigue studies have mostly concentrated on road transport, shipping and similar fields; fatigue among fixed-position staff, such as dispatchers and visual display terminal operators, has been studied less [12]. Currently, many methods focus on analyzing a person’s state by detecting the state of the eyes. However, fatigue is not reflected in the eyes alone. As is well known, the signs of fatigue also include constant yawning and reduced head movement. As fatigue increases, the person’s head nods frequently and the mouth opens. Such actions can be captured by a camera, and adding these indicators to the fatigue detection system improves the evaluation of the fatigue level. In addition, earlier research focused on whether the driver is fatigued and did not give the fatigue level. This paper considers the fatigue level.
In this paper, the method detects images taken at different angles, including the motion of the eyes and the mouth, using OpenCV and Visual Studio 2012. Specifically, the system detects and accurately locates the eyes and mouth with the ASM algorithm. HOG features and an SVM are then used to recognize the state of the eyes and mouth. The facial detection result also helps to estimate the head posture. Additionally, the system can detect the state of the eyes accurately at large head angles and copes with varying brightness of the input image. This paper classifies fatigue into three levels using six indicators that represent a person’s different conditions. To assign weights to these six indicators, this paper introduces the entropy-weighting method. The degree of fatigue is then determined from the values of the six indicators and the Bayesian method. These steps are shown in Fig. 1.
Fig. 1. System processing flow.
Facial and facial features detection
The definition of ASM
To detect the state of the eyes and the mouth, we first need to detect the face in the input image, which may be taken at many different angles. We use the ASM algorithm to accomplish this. The algorithm is based on the point distribution model: a shape model is trained on a manually annotated training set, and specific objects are then matched through these key points. The advantage of this algorithm is that it uses statistical theory to find the most suitable position of the selected features from the training set [13].
The algorithm can be described in terms of the relationship between specific points and the other points on the face. ASM has two sub-models: the descriptor model and the shape model. There is a descriptor model for each landmark, which describes the features of the image around the landmark. The model specifies what the image is expected to ‘look like’ around the landmark [13]. The descriptor model of a landmark is built by sampling the area around the landmark in the training images. During the search, we sample the area around the initial location of the landmark and move the landmark to the position where the image best matches the descriptor model. This process is applied to all landmarks to generate preliminary new positions for the landmarks, called the suggested shape. The descriptor model attempts to locate each landmark independently. If necessary, we use the shape model to correct the positions.
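The shape model represents all shapes in the training set with a common model, given by Eq. (1) [14].
$$x = \bar{x} + P b \tag{1}$$
where $\bar{x}$ is the mean shape, $P$ is the matrix whose columns are the main modes of variation of the training shapes, and $b$ is the vector of shape parameters.
An instance of the model in an image is described by the shape parameter $b$ combined with a transformation from the model co-ordinate frame to the image co-ordinate frame. The transformation defines the position $(X_t, Y_t)$, orientation $\theta$ and scale $s$ of the model in the image.
The positions of the model points in the image, $x$, are then given by
$$x = T_{X_t, Y_t, s, \theta}(\bar{x} + P b)$$
where the function $T_{X_t, Y_t, s, \theta}$ performs a rotation by $\theta$, a scaling by $s$ and a translation by $(X_t, Y_t)$.
To find the best pose and shape parameters that match a model instance $x$ to a new set of image points $Y$, we minimize the sum of squared distances between corresponding model and image points:
$$\left| Y - T_{X_t, Y_t, s, \theta}(\bar{x} + P b) \right|^2$$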
The AdaBoost algorithm improves the performance of classifiers. Its core idea is to train different weak classifiers on the same training set and then combine them into a final strong classifier. However, it cannot find faces under large rotations and steep viewing angles, which is a serious limitation in practical use. The ASM algorithm can flexibly change the shape of the model to adapt to the uncertain shape of the target while keeping the shape variation within the range allowed by the model. As a result, when the model deforms it does not produce unreasonable shapes under the influence of various factors. Relying on the training set, the ASM algorithm can accurately locate the face at different angles. The eyes and mouth are each made up of a sequence of points with particular serial numbers.
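As a minimal sketch of how a shape model keeps the variation within an allowable range, the following Python code generates a shape from the mean shape and one mode of variation, after clamping the shape parameters. The function names, the toy two-landmark model and the limit of three standard deviations per mode are illustrative assumptions, not values taken from this paper.

```python
import math

def clamp_shape_params(b, eigenvalues, limit=3.0):
    """Clamp each shape parameter b_i to +/- limit * sqrt(lambda_i).

    The limit of 3 standard deviations is an assumed, commonly used bound.
    """
    clamped = []
    for b_i, lam in zip(b, eigenvalues):
        bound = limit * math.sqrt(lam)
        clamped.append(max(-bound, min(bound, b_i)))
    return clamped

def generate_shape(mean_shape, modes, b):
    """Return mean_shape + P b, where modes[k] is the k-th column of P."""
    shape = list(mean_shape)
    for mode, b_k in zip(modes, b):
        for i, p in enumerate(mode):
            shape[i] += b_k * p
    return shape

# Hypothetical toy model: two landmarks stored as (x1, y1, x2, y2).
mean_shape = [0.0, 0.0, 2.0, 0.0]
modes = [[-0.5, 0.0, 0.5, 0.0]]   # one mode that moves the landmarks apart
eigenvalues = [4.0]               # variance of that mode in the training set

b = clamp_shape_params([10.0], eigenvalues)   # 10 exceeds 3 * sqrt(4) = 6
print(b)                                      # [6.0]
print(generate_shape(mean_shape, modes, b))   # [-3.0, 0.0, 5.0, 0.0]
```

Clamping the parameters before generating the shape is what prevents the search from proposing a shape that the training set could not plausibly produce.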
Image preprocessing
The acquired facial images contain shadows, uneven brightness and a lot of noise. This is harmful to feature extraction and introduces errors into the extracted features. To minimize the impact of image quality on the system, we preprocess the acquired images to facilitate subsequent processing. The steps of image preprocessing are shown in Fig. 2.
Fig. 2. The steps of image preprocessing.
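The individual preprocessing steps are not listed in the text. As an assumption, the sketch below shows histogram equalization, one common way to reduce uneven brightness in a grayscale image; it is an illustration only and may differ from the steps actually used. The function name and the tiny test image are hypothetical.

```python
def equalize_histogram(image, levels=256):
    """Histogram-equalize a grayscale image given as a list of rows of ints."""
    pixels = [p for row in image for p in row]
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution of gray levels.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    cdf_min = min(c for c in cdf if c > 0)
    n = len(pixels)
    if n == cdf_min:                      # constant image: nothing to stretch
        return [row[:] for row in image]

    # Map each gray level so that the output levels span the full range.
    lut = [max(0, round((c - cdf_min) / (n - cdf_min) * (levels - 1)))
           for c in cdf]
    return [[lut[p] for p in row] for row in image]

# A dark, low-contrast 2 x 3 image is stretched to the full 0..255 range.
dark = [[50, 50, 50],
        [100, 100, 150]]
print(equalize_histogram(dark))   # [[0, 0, 0], [170, 170, 255]]
```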
The MUCT face dataset consists of 3755 face images that vary in race, lighting, age, etc., with the landmarks already annotated. The dataset is used to train the ASM. By sampling the landmarks and their surroundings, we establish the descriptor model of each landmark. The test results are shown in Fig. 3.
Fig. 3. Model testing results: (a) wearing glasses, (b) not wearing glasses, with eye borders, (c) closed eyes, (d) detection when the eyes are tired.
HOG feature extraction
The Histogram of Oriented Gradients (HOG) is a feature descriptor used for object detection in computer vision [16]. It computes features by calculating and counting histograms of gradient directions in local areas of the image. The appearance and shape of a local target can be described well by the distribution of gradients or edge directions [17]. The image is divided into small connected areas called cells, and a histogram of the gradient or edge directions of the pixels in each cell is collected. These directional histograms are combined into a feature descriptor. Because HOG operates on local cells of the image, it remains largely invariant to geometric and photometric deformations of the image.
Because the ASM algorithm marks each part of the face with determined points, the eyes and the mouth are each made up of a particular sequence of points. In this paper, the features of the eyes and the mouth are described with HOG. This avoids describing regions outside the region of interest (ROI), which reduces the noise interference from those regions and greatly reduces the computation.
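A HOG feature consists of gradient magnitude and direction. The gradient magnitude of the pixel at coordinates $(x, y)$ is
$$G(x, y) = \sqrt{G_x(x, y)^2 + G_y(x, y)^2}$$
The horizontal gradient $G_x(x, y)$ and the vertical gradient $G_y(x, y)$ are computed from the pixel values $H(x, y)$:
$$G_x(x, y) = H(x+1, y) - H(x-1, y), \qquad G_y(x, y) = H(x, y+1) - H(x, y-1)$$
The gradient direction of the pixel at coordinates $(x, y)$ is
$$\alpha(x, y) = \arctan\left(\frac{G_y(x, y)}{G_x(x, y)}\right)$$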
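The gradient computation can be sketched in a few lines of Python. The code below is a simplified illustration, assuming central differences, unsigned orientations in [0, 180) degrees and 9 bins per cell; it omits the vote interpolation and block normalization of a full HOG implementation, and the function names are hypothetical.

```python
import math

def gradients(img, x, y):
    """Return (magnitude, direction in degrees) of the gradient at (x, y)."""
    gx = img[y][x + 1] - img[y][x - 1]   # horizontal central difference
    gy = img[y + 1][x] - img[y - 1][x]   # vertical central difference
    magnitude = math.hypot(gx, gy)
    angle = math.degrees(math.atan2(gy, gx)) % 180.0   # unsigned orientation
    return magnitude, angle

def cell_histogram(img, bins=9):
    """Magnitude-weighted orientation histogram over the interior of one cell."""
    hist = [0.0] * bins
    width = 180.0 / bins
    for y in range(1, len(img) - 1):
        for x in range(1, len(img[0]) - 1):
            mag, ang = gradients(img, x, y)
            hist[int(ang // width) % bins] += mag
    return hist

# A 4 x 4 cell containing a vertical edge: all gradient energy is horizontal.
cell = [[0, 0, 10, 10] for _ in range(4)]
print(gradients(cell, 1, 1))   # (10.0, 0.0)
print(cell_histogram(cell))    # first bin is 40.0, all other bins are 0.0
```

Concatenating such histograms over all cells of the eye or mouth region gives the feature vector that is passed to the classifier.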
After extracting the features of the eyes and the mouth, we need to classify them to distinguish the state of the eyes and the mouth. Cortes and Vapnik proposed the Support Vector Machine (SVM) in 1995 [19]. It is based on the VC-dimension theory of statistical learning and the structural risk minimization principle [20]. Based on limited sample information, it seeks the best compromise between the complexity of the model (i.e., the learning accuracy on a particular training sample) and the learning ability (i.e., the ability to recognize an arbitrary sample without error) in order to achieve the best generalization. Compared with traditional machine learning methods, SVM effectively handles small sample sizes, non-linearity and high dimensionality. It has strong generalization ability and has been widely used in machine learning and other fields.
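To make the classification step concrete, the sketch below trains a linear SVM by subgradient descent on the regularized hinge loss. This is a simplified stand-in for a library SVM solver, not the implementation used in this paper. The two-dimensional toy features, the labels (+1 and -1 for two states such as open and closed) and the hyperparameters are hypothetical.

```python
def train_linear_svm(samples, labels, lam=0.01, epochs=200, lr=0.1):
    """Train weights w and bias b on the L2-regularized hinge loss."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):          # y is +1 or -1
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # The regularization term shrinks w on every step.
            w = [wi - lr * lam * wi for wi in w]
            if margin < 1:                         # sample violates the margin
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def predict(w, b, x):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1

# Hypothetical, linearly separable toy features for two states.
samples = [[2, 2], [3, 3], [2, 3], [-2, -2], [-3, -3], [-2, -3]]
labels = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(samples, labels)
print(predict(w, b, [3, 2]))     # 1
print(predict(w, b, [-3, -2]))   # -1
```

In practice the inputs would be the HOG feature vectors of the eye and mouth regions rather than two-dimensional points.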
When classifying the eyes and the mouth, the main concern is whether the eyes are open or closed and whether the mouth is wide open or closed. In the data classification process, we define the data layer as: open eyes
Head posture estimation
The driver’s head posture differs between states, so we can add the head state when estimating the driver’s fatigue level. Head pose estimation needs to obtain the three-dimensional pose parameters of the user from the image. Considering the complexity of head movement in real situations, we describe the head pose with Euler angles, a set of three angular parameters (yaw, pitch and roll). The X-axis is horizontal, and rotation of the head around the X-axis is the pitch. The Y-axis points vertically downward, and rotation of the head around the Y-axis is the yaw. The Z-axis follows from the right-hand rule, and rotation of the head around the Z-axis is the roll.
There are many ways to estimate the head pose. Here, we use the Pose from Orthography and Scaling with Iterations (POSIT) algorithm. POSIT is a fast and accurate iterative algorithm for finding the 6DOF pose (orientation and translation) of a 3D model or scene with respect to a camera, given a set of correspondences between 2D image points and 3D object points [21]. Its main advantage is that it needs only a few points (at least 4 non-coplanar points) and bypasses the solution of nonlinear equations, which makes the solution simpler. Given only a few world coordinates of object points and their coordinates in the image plane, the algorithm iteratively approaches the transformation between the world coordinate system and the camera coordinate system, which consists of a rotation matrix and a translation vector. Because the camera is the origin of the camera coordinate system, its coordinates in the world coordinate system can then be calculated.
Thus, the POSIT algorithm relates the head pose in the 3D world to points in the 2D camera image.
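Once a rotation matrix has been estimated, the three head angles can be read from it. The sketch below assumes the composition order R = Rz(roll) · Ry(yaw) · Rx(pitch), which is one common convention and is not stated in the text; with a different order the extraction formulas change. The function names are hypothetical.

```python
import math

def rotation_from_euler(pitch, yaw, roll):
    """Build R = Rz(roll) * Ry(yaw) * Rx(pitch); angles in radians."""
    cx, sx = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    cz, sz = math.cos(roll), math.sin(roll)
    return [
        [cy * cz, sx * sy * cz - cx * sz, cx * sy * cz + sx * sz],
        [cy * sz, sx * sy * sz + cx * cz, cx * sy * sz - sx * cz],
        [-sy,     sx * cy,                cx * cy],
    ]

def euler_from_rotation(R):
    """Recover (pitch, yaw, roll) in radians; valid while |yaw| < 90 degrees."""
    yaw = math.asin(max(-1.0, min(1.0, -R[2][0])))
    pitch = math.atan2(R[2][1], R[2][2])
    roll = math.atan2(R[1][0], R[0][0])
    return pitch, yaw, roll

# Round trip: a head pitched 10 degrees, turned -20 degrees, tilted 5 degrees.
angles = [math.radians(a) for a in (10.0, -20.0, 5.0)]
recovered = euler_from_rotation(rotation_from_euler(*angles))
print([round(math.degrees(a), 6) for a in recovered])   # [10.0, -20.0, 5.0]
```

Tracking the pitch angle over time is one way to detect the nodding described later in the text.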
Fatigue decision-making
Fatigue decision-making indicators
Since each person’s perception of fatigue is different, fatigue is a vague concept. To detect fatigue, we need to quantify it. The Federal Highway Administration (FHWA) and the National Highway Traffic Safety Administration (NHTSA) consider the Percentage of Eyelid Closure over the Pupil over Time (PERCLOS) the best measure for real-time vehicle fatigue detection systems. PERCLOS is the percentage of time within a unit time during which the eyes are closed, and it has three standards: P70, P80 and EM [22]. Under P70, the eyes are recorded as closed when the eyelid covers more than 70% of the pupil, and we count the proportion of time the eyes are closed in a given period. P80 is defined in the same way with a threshold of 80%. Under EM, the eyes are recorded as closed when the eyelid covers more than half of the pupil, and the proportion of closed-eye time is counted in the same way.
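PERCLOS is defined as Eq. (8) [22].
$$\mathrm{PERCLOS} = \frac{\text{eye closure time}}{\text{total detection time}} \times 100\% \tag{8}$$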
The relationship between the eye opening and time is shown in Fig. 4.
Fig. 4. The relationship between time and eye opening [22].
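According to Fig. 4, P80 is given by Eq. (10) [22].
$$f = \frac{t_3 - t_2}{t_4 - t_1} \times 100\% \tag{10}$$
where $t_1$ is the time at which the closing eye reaches 80% of its maximum opening, $t_2$ is the time at which it reaches 20%, $t_3$ is the time at which the reopening eye reaches 20%, and $t_4$ is the time at which it reaches 80% again.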
Although PERCLOS is very effective for fatigue detection, it has some shortcomings. For example, sunglasses make the measurement difficult. Therefore, the blink frequency is also introduced. According to ophthalmologists, a driver may be fatigued if the blink frequency stays below 7 blinks per minute over a period of time. In addition, the longest eye-closure time and the eye-opening speed are added as parameters.
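The eye-based indicators can be computed from a per-frame sequence of eye states. The following sketch assumes that each frame has already been classified as closed or open and that the frame rate is known; the function name, the dictionary keys and the example sequence are hypothetical.

```python
def eye_indicators(closed, fps):
    """Compute eye indicators from per-frame flags (True = eye closed)."""
    n = len(closed)
    blinks = 0       # number of closed-eye runs
    longest = 0      # longest closed-eye run, in frames
    run = 0
    for is_closed in closed:
        if is_closed:
            run += 1
            if run == 1:
                blinks += 1
            longest = max(longest, run)
        else:
            run = 0
    minutes = n / fps / 60.0
    return {
        "perclos": sum(closed) / n,            # fraction of closed frames
        "blink_frequency": blinks / minutes,   # blinks per minute
        "max_close_duration": longest / fps,   # seconds
    }

# One minute of video at 10 frames per second with two eye closures.
frames = [False] * 600
for i in range(100, 103):      # a short blink of 0.3 s
    frames[i] = True
for i in range(300, 312):      # a longer closure of 1.2 s
    frames[i] = True

print(eye_indicators(frames, fps=10))
# {'perclos': 0.025, 'blink_frequency': 2.0, 'max_close_duration': 1.2}
```

In this example the blink frequency of 2 per minute is below the threshold of 7 mentioned above, so the sequence would be flagged as a possible sign of fatigue.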
Besides the eyes, the main visual fatigue feature is the mouth, since fatigue is often accompanied by frequent yawning. In normal driving, the driver’s mouth is basically closed; when the driver speaks to others, the mouth opens normally; when the driver yawns, the mouth opens wide [23]. Yawning is normally a relatively long process during which the mouth opens wider. When the mouth stays wide open for a period of time (usually 5 s), it can be judged as a yawn.
In addition, when a person enters suspected fatigue, head movement may slow down or stop. As time goes on, the driver becomes drowsy and the head nods frequently.
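The duration rule for yawning can be sketched as a run-length check over per-frame mouth states. The code assumes each frame has already been classified as wide open or not; the function name and the example sequence are hypothetical, while the 5 s threshold follows the text.

```python
def count_yawns(mouth_open, fps, min_seconds=5.0):
    """Count runs of wide-open mouth frames lasting at least min_seconds."""
    min_frames = int(min_seconds * fps)
    yawns = 0
    run = 0
    for is_open in mouth_open:
        if is_open:
            run += 1
            if run == min_frames:   # count each long run exactly once
                yawns += 1
        else:
            run = 0
    return yawns

# At 10 frames per second: 2 s of speaking, a pause, then a 6 s yawn.
sequence = [True] * 20 + [False] * 10 + [True] * 60 + [False] * 10
print(count_yawns(sequence, fps=10))   # 1
```

The short opening caused by speaking is ignored because it never reaches the minimum duration.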
To sum up, we divide the fatigue decision-making indicators into six: PERCLOS, blinking frequency (BF), maximum close duration (MCD), blinking time (BT), yaw rate (YR) and head posture (HP). The driver’s states are divided into three types: awake, suspected fatigue and drowsiness. A suitable threshold for each parameter should be determined by statistical analysis. The representation of fatigue is given in Table 1.
Table 1. The representation of fatigue
In 1948, C. E. Shannon put forward the concept of “information entropy” in his famous paper “A Mathematical Theory of Communication”, which solved the problem of measuring information quantitatively [24]. Shannon associated entropy with the amount of information and pointed out that eliminating the uncertainty within a system requires adding information.
Fatigue is defined as a subjective feeling of discomfort in which a person loses the ability to carry out normal activity or work under the same conditions. This definition is very vague, and no single physical quantity can determine a person’s fatigue, which makes fatigue testing difficult. However, information entropy is a good way to describe the uncertainty of fatigue.
Fatigue can be judged as awake, suspected fatigue or drowsiness. Each state has a number of evaluation indicators. The evaluation process needs to address two issues: one is the evaluation matrix (usually a fuzzy matrix); the other is the weight of each evaluation indicator in the integrated evaluation. Weights can be divided into subjective weights, objective weights (entropy weights) and mixed weights. Subjective weights are determined by expert judgment or prior experience. Entropy weights are determined by information entropy. Mixed weights combine the two.
An entropy weight does not express the practical importance of an indicator; it expresses the relative intensity of each index in the competition. The entropy is therefore closely related to the objects being evaluated. In particular, the entropy weight represents the effectiveness of the indicator. The maximum value of the entropy is 1, in which case the entropy weight takes its minimum value of 0; this means that the indicator provides no useful information to the decision-maker and can be dropped [25]. A small entropy value gives a large entropy weight, which shows that the index provides much useful information and should be given emphasis. With the entropy method, we obtain the contribution of each index to the fatigue decision.
The entropy weight of each index is calculated by the entropy-weighting method in the following steps:
2.4.2.1. Standardization of data matrix
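The problem of judging fatigue can be modeled with $m$ indicators ($m = 6$ in this paper) and $n$ image frames, which form the decision matrix $X = (x_{ij})_{m \times n}$, where $x_{ij}$ is the value of the $i$-th indicator in the $j$-th frame.
To eliminate the different dimensions of the indexes, the decision matrix is standardized to obtain the standardized matrix $R = (r_{ij})_{m \times n}$.
For an incremental index, $R$ is calculated by Eq. (12) [26].
$$r_{ij} = \frac{x_{ij} - \min_j x_{ij}}{\max_j x_{ij} - \min_j x_{ij}} \tag{12}$$
For a diminishing index, $R$ is calculated by Eq. (13) [26].
$$r_{ij} = \frac{\max_j x_{ij} - x_{ij}}{\max_j x_{ij} - \min_j x_{ij}} \tag{13}$$
2.4.2.2. The definition of entropy
In the evaluation of the 6 indexes over $n$ image frames, the entropy of the $i$-th index is defined as Eq. (14) [26].
$$H_i = -k \sum_{j=1}^{n} f_{ij} \ln f_{ij} \tag{14}$$
where $f_{ij} = r_{ij} / \sum_{j=1}^{n} r_{ij}$, $k = 1 / \ln n$, and $f_{ij} \ln f_{ij}$ is taken as 0 when $f_{ij} = 0$.
2.4.2.3. Entropy-weighting
The entropy weight of the $i$-th index is given by Eq. (15) [26].
$$w_i = \frac{1 - H_i}{m - \sum_{i=1}^{m} H_i} \tag{15}$$
where $0 \le w_i \le 1$ and $\sum_{i=1}^{m} w_i = 1$.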
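The three steps of the entropy-weighting method can be sketched as follows. The code uses the standard min-max standardization and entropy-weight formulas; the function name, the treatment of a constant indicator (given entropy 1 and therefore weight 0) and the toy data are assumptions made for illustration.

```python
import math

def entropy_weights(matrix, increasing):
    """Entropy weights for indicators.

    matrix[i][j] is the value of indicator i in frame j.
    increasing[i] is True for an incremental index, False for a diminishing one.
    """
    m, n = len(matrix), len(matrix[0])

    # Step 1: min-max standardization of each indicator.
    R = []
    for row, inc in zip(matrix, increasing):
        lo, hi = min(row), max(row)
        if hi == lo:
            R.append([1.0] * n)   # constant indicator carries no information
        elif inc:
            R.append([(x - lo) / (hi - lo) for x in row])
        else:
            R.append([(hi - x) / (hi - lo) for x in row])

    # Step 2: entropy of each indicator, with 0 * ln(0) taken as 0.
    k = 1.0 / math.log(n)
    H = []
    for r in R:
        total = sum(r)
        h = 0.0
        for v in r:
            f = v / total
            if f > 0:
                h -= k * f * math.log(f)
        H.append(h)

    # Step 3: entropy weights (assumes not all indicators are constant).
    denom = m - sum(H)
    return [(1.0 - h) / denom for h in H]

# Hypothetical values of two indicators over four frames.
data = [[1.0, 2.0, 3.0, 4.0],    # changes gradually: higher entropy
        [0.0, 0.0, 0.0, 1.0]]    # concentrated in one frame: lower entropy
weights = entropy_weights(data, increasing=[True, True])
print(weights)   # the second indicator receives the larger weight
```

An indicator whose standardized values are spread evenly has entropy close to 1 and therefore a weight close to 0, in line with the description of entropy weights above.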
Fatigue decision-making is an inference problem: from known statistical information about the driver’s states, the final decision is made according to the maximum-probability principle. The decision depends on the amount of available information and on the probabilities. Considering the possible unknown links between the indicators, we choose the Bayesian method for fatigue decision-making.
Bayesian decision theory is the basic method of decision-making within a probabilistic framework. For classification tasks, when all relevant probabilities are known, Bayesian decision-making considers how to choose the optimal class label based on these probabilities and the misclassification losses. A Bayesian classifier can represent the interdependence between attributes. It can combine prior knowledge with sample data when training the classifier and avoids the “black box” defect of other classifiers.
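Bayesian decision-making depends on Bayes’ formula. When the sample size is close to the total population, the probability of an event calculated from the sample by Bayes’ formula will be close to the probability of the event in the whole population.
When the events $B_1, B_2, \ldots, B_n$ are mutually exclusive and exhaustive and $P(A) > 0$, Bayes’ formula is
$$P(B_i \mid A) = \frac{P(A \mid B_i) P(B_i)}{\sum_{j=1}^{n} P(A \mid B_j) P(B_j)}$$
After entropy weighting, we obtain the weight coefficient $w_i$ of each indicator.
We can then obtain the probability of each fatigue level. The level with the maximum probability is taken as the decision.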
Bayesian decision-making can make a scientific judgment about the value of information and about whether new information needs to be gathered. It can quantify how far the results of a survey can be trusted, unlike general decision-making methods, which either believe the survey results completely or not at all. Survey results are not completely accurate, and prior knowledge and subjective probabilities are not entirely reliable either; Bayesian decision-making skillfully integrates the two kinds of information. It can be applied repeatedly during decision-making as circumstances change, so that the decision is gradually refined and becomes more scientific.
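The maximum-probability decision can be illustrated with a direct application of Bayes’ formula. The sketch below is not the paper’s exact procedure: how the entropy weights enter the likelihoods is not specified in the text, so the likelihood of the observed evidence under each level is simply taken as given. The function names, priors and likelihood values are hypothetical.

```python
def posterior(priors, likelihoods):
    """Posterior probability of each level given the observed evidence.

    priors[level] is P(level); likelihoods[level] is P(evidence | level).
    """
    joint = {level: priors[level] * likelihoods[level] for level in priors}
    evidence = sum(joint.values())
    return {level: p / evidence for level, p in joint.items()}

def decide(priors, likelihoods):
    """Return the level with the maximum posterior probability."""
    post = posterior(priors, likelihoods)
    return max(post, key=post.get)

# Hypothetical numbers: the evidence is most likely under drowsiness.
priors = {"awake": 0.5, "suspected fatigue": 0.3, "drowsiness": 0.2}
likelihoods = {"awake": 0.1, "suspected fatigue": 0.4, "drowsiness": 0.8}

print(decide(priors, likelihoods))   # drowsiness
```

Although drowsiness has the smallest prior in this example, the evidence outweighs it, which is the behavior expected from a maximum-posterior decision.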
The experimental environment is listed in Table 2. The indoor light and temperature were not adjusted.
Table 2. The experiment environment
Our final goal is to provide a reliable model, so accuracy is the most important criterion. We use 100 sets of awake samples, 100 sets of suspected fatigue samples and 100 sets of drowsiness samples to form the test set. Age, gender, physical fitness and weight all bear on fatigue [12], and the participants differ in these respects. Ten students were selected as experimental subjects; their basic information is shown in Table 3.
Table 3. Basic information of the participants
The results of the verification are shown in Table 4.
Table 4. Results of the verification
The experimental results show that the detection of drowsiness is the most accurate and the detection of suspected fatigue is the least accurate, because the features of suspected fatigue are the vaguest and the algorithm cannot reliably distinguish them. The mean accuracy of the method is 83.3%.
The fatigue detection method has wide application value in future automotive manufacturing, the development and control of instruments, early-warning modules for fatigued driving, warning apps and so on. At present, fatigue detection is mainly used in transportation. Future research can improve data collection and fatigue judgment. With the growth of big data and the integration of multi-source data, developing an efficient, reliable and real-time fatigue detection system is an inevitable trend. It is necessary to develop such fatigue detection systems and bring them to market.
To address a practical problem in daily life, namely how to reduce the risk of fatigued driving, this paper presents a fatigue decision system developed with OpenCV on the Visual Studio platform. The ASM algorithm is used to detect the face and obtain its key points, from which the eyes and mouth are located. HOG features of the eyes and mouth are then extracted, and an SVM is used to identify the state of the eyes and the mouth. In addition, the head posture is estimated from the result of facial detection. We divide fatigue into three grades and use six indicators to describe the state of a person. The entropy-weighting method is used to calculate the weight of each indicator, and the values of the six indicators together with the Bayesian method determine the fatigue level of the person.
Acknowledgments
This paper is supported by the local special program of the Shaanxi Provincial Department of Education (No. 16JF012) and the National Natural Science Foundation of China (No. 61572392).
