Abstract
Autonomous vehicle technology is an emerging area that has attracted considerable attention in recent times. Accident-free driving has always been the focal point of autonomous vehicles. Autonomous vehicles have the potential to eliminate human errors while driving, which have been argued to be the predominant cause of traffic accidents. A variety of efforts have been made in autonomous vehicle technologies to remove the need for human drivers. Fully eliminating the human driver is not possible at this moment, but some tasks can be automated to assist drivers. In this paper, we investigate the leading causes of accidents based on UK vehicle safety data from 2017-2018. We analyze the data and investigate the leading factors that cause traffic crashes. Based on the leading features in the dataset, we then run different prediction algorithms to predict the severity of accidents for a given input feature set. The accuracies of models based on Decision Tree, Random Forest, and Logistic Regression classifiers are compared, and Random Forest is found to perform best, with 95% accuracy. The trained Random Forest model is deployed on an Internet of Things server based on Arduino, and a lightweight application is developed to collect vital data from the driver. The data is applied to the trained model to predict the risk index of driving. The application is lightweight yet provides a significant contribution to safety in autonomous vehicles.
Introduction
The Internet of Things (IoT) and smart spaces are exciting research topics, and many studies have been devoted to enabling context-aware applications for the sustainable development of society [1–4]. Numerous sensors and actuators form a network in which information is communicated and analyzed, forming a whole new paradigm known as the IoT.
The advancement in IoT has further paved the way for many technologies to embrace it and orchestrate their processes by leveraging the capabilities of humans as well as things [5–8]. For instance, conventional real-time systems are now becoming IoT-empowered real-time systems (RT-IoT), with the additional capabilities that come with IoT [9, 10]. One such evolution is the effort towards a smart transportation system. Autonomous vehicles and electric vehicles have been introduced to encourage risk-free and green transportation systems, respectively. In the former case, the motivation is the reduction of fatalities and accidents, which are mostly caused by drivers’ errors. It is estimated that more than 90% of traffic crashes are due to drivers’ errors, such as fatigue; thus, eliminating these errors could reduce accidents to roughly 10% of their current number.
Reducing human intervention is the sole aim of autonomous vehicles (AVs), which is very rewarding; at the same time, however, it can lead to catastrophic hazards if any failure occurs. AV testing and piloting have begun in various countries. By 2014, AV testing on roadways had been legalized in four US states. In Australia, AV testing was first introduced on South Australia’s roads in 2016 [11]. The market penetration rate of AVs is estimated to be between 24% and 87% by 2045 [12, 13]. However, no vehicle has yet reached level 5, which corresponds to a fully autonomous vehicle. Although Tesla claims that it will introduce level 5 vehicles by 2020, this is still far from reality. So, in order to address the causes of the majority of traffic crashes, automakers have introduced automated features such as self-parking, automatic braking, forward collision warning, lane departure warning, and blind-spot monitoring [14, 15].
The problem with these features is that, although they are sophisticated and can assist the driver with close to 100% accuracy, each feature requires its own dedicated embedded hardware in the vehicle, which is not only expensive but also puts a load on the vehicle’s battery. In this paper, we model vehicle safety from a completely different perspective, as a collection of real-time tasks that must be executed within their deadlines. Instead of dedicated and expensive hardware, we use a single generic IoT hardware platform that communicates with sensors to detect the driver’s state, and based on this state, the risk index is predicted. We prioritize the tasks based on the UK road safety dataset for 2017-2018. The dataset is analyzed, and the vital factors affecting vehicle safety are investigated. Based on these vital features, a model is trained to predict the risk index for a given set of features. The trained model is then deployed; it receives real-time data and computes the risk index. If the risk index is high, actions such as alerts and notifications are issued to the driver. The assurance that real-time tasks execute on time can be formally established by carrying out a feasibility analysis. Once it has been guaranteed that no critical task will miss its deadline, the system can be deployed in a real environment.
The remainder of this paper is organized as follows. In the next Section, the motivation and parameter prioritization are discussed, followed by the methodology, interaction modeling, implementation stack, and execution results. Finally, the model is evaluated with respect to different parameters, and the paper is concluded with a summary of key findings and directions for future research.
Motivation
In this Section, the motivation of this paper is discussed in detail. The primary impulse behind this work is the fact that accidents and traffic crashes not only result in human losses but also consume a considerable portion of economic resources. For instance, in the UK, the amount spent on avoiding fatalities was about 1.25 million in 2002, despite the fact that it is among the safest countries in the world [16]. This suggests that the amount spent in other countries is considerably higher than in the UK. Similarly, in the United States, about 1.6% of annual GDP is spent on handling traffic crashes, which amounted to around 838 billion USD in 2010, and this figure keeps growing every year [17]. Fig. 1 shows the cost of individual crashes due to drivers’ errors and the external environment, which takes up a significant portion of the economy in handling and avoiding crashes. Therefore, the goal of this paper is to make use of newer technologies such as IoT and RT-IoT to build a generic system that not only avoids traffic crashes but also benefits the economy significantly. Other similar systems are accurate but not cost-effective, and each is specific to a single role, so multiple chips are needed to provide a complete system, which increases not only the cost but also the load on the vehicle. The proposed system is generic and cost-effective, and only a single piece of general-purpose hardware needs to be installed. This is the leading advantage of this work over state-of-the-art solutions.

Statistics of Economic Contribution towards handling traffic crashes.
As part of the work, we prioritize the specific parameters that affect road safety the most. Certain parameters are related to the environment, such as bad weather and surface friction, whereas others are related to drivers’ biological behaviors. In this work, we consider the driver’s biological behavior and the environmental conditions that can make drivers uncomfortable and error-prone, which is considered the leading factor in traffic crashes. Artificial intelligence and machine learning are used to find patterns in historical data. According to recent research studies, different factors contribute to road safety. In order to systematically investigate the parameters that contribute the most to road hazards and accidents, we analyze the standard UK road safety dataset for 2017-2018 [18]. The data are collected by the UK police forces at the scene of the accident. The data fields contain records of accident severity, weather, speed, age of the driver, drowsiness of the driver, number of fatalities, and vehicle type. Fig. 2 shows the architecture for data analysis and parameter prioritization.

System Model for Risk Index Prediction based on Trained Model and Drivers Biological Data.
The model accuracy is 95.67% with Random Forest; further analysis revealed that the preprocessed dataset features are a better match for Random Forest than for the other algorithms. The output is a risk index that reflects the chance of an accident and its severity. Fig. 3 shows the top 5 factors having the strongest correlation with accident severity. It is clear that two of the factors are related to the driver’s state, whereas the topmost contributor to accidents is the weather conditions.

Selected features with the leading correlation with accident risk.
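The feature prioritization described above can be illustrated with a minimal sketch. It assumes that "correlation" means the Pearson coefficient and that all fields are already numerically encoded; neither is stated in the text. The column names and records are hypothetical and are not taken from the UK road safety dataset.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_x * var_y)


def rank_features(records, target, top_k=5):
    """Rank numeric features by absolute correlation with the target column."""
    features = [name for name in records[0] if name != target]
    target_values = [row[target] for row in records]
    scores = {
        name: abs(pearson([row[name] for row in records], target_values))
        for name in features
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:top_k]


# Hypothetical, hand-made records (assumption: NOT real dataset rows).
records = [
    {"weather": 1, "speed_limit": 30, "driver_age": 45, "severity": 3},
    {"weather": 4, "speed_limit": 60, "driver_age": 23, "severity": 1},
    {"weather": 2, "speed_limit": 40, "driver_age": 37, "severity": 3},
    {"weather": 5, "speed_limit": 70, "driver_age": 19, "severity": 1},
    {"weather": 3, "speed_limit": 50, "driver_age": 52, "severity": 2},
]

# List of (feature name, absolute correlation) pairs, strongest first.
ranking = rank_features(records, target="severity", top_k=2)
```

Taking the absolute value ranks strongly negative and strongly positive relationships alike; categorical fields such as weather would need a suitable encoding before a linear correlation is meaningful.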
In this paper, we consider three different aspects in modeling a semi-autonomous vehicle that is safe for its users, for the road, and for other vehicles on the road. The primary motivation of this work is the task modeling of all the scenarios that can potentially lead to hazardous situations. Based on these scenarios, actions are performed to avoid such situations. The methodology of the system is exhibited in Fig. 4. First, we generate tasks based on two scenarios, i.e., environment monitoring and driver biological monitoring. The tasks for the first scenario are getting the temperature, humidity, light conditions, and weather conditions, while the tasks for the second scenario are getting the ECG, pulse, and oxygen saturation. These tasks are mapped onto their respective sensors based on their correlation with the sensor. The task mappings are stored in a central repository, from which they are consumed by IoT devices. In this setup, Arduino is used for monitoring the driver’s biological behavior, such as drowsiness and fatigue, while Raspberry Pi is employed to handle the environmental conditions. The contextual data sensed by the sensors are published to the central repository, from where they are consumed by the intelligent control module. Intelligent control checks the values of the contextual data to predict the risk of accidents. If the risk is above a predefined threshold, it triggers control tasks. The control tasks are likewise mapped onto the corresponding actuators and follow the same pattern. Eventually, the tasks are deployed on actuators connected to IoT devices.

Methodology of Proposed Safety Modeling based on Task Management.
Modeling IoT applications at the task level provides flexibility and a process-aware approach. One benefit of this is the involvement of the stakeholders in the process. In this work, we also consider modeling a safe semi-autonomous vehicle based on the real-time system theory of tasks. A task is a single function, or a combination of functions, that is performed on sensors and actuators. Thus, in the IoT context, tasks can be classified into two broad categories: sensor tasks and actuator tasks. Sensor tasks are commonly periodic tasks, while actuator tasks are control tasks that are triggered by the occurrence of some event. In IoT systems, tasks are not different from those in conventional real-time systems; thus, the design of tasks is influenced by conventional real-time tasks. A task can have attributes such as start time, execution time, type, and urgency. These attributes are summarized in Table 1. A task $\tau$ is mapped onto a virtual object $V_o$ to perform some action on a sensor or actuator, commonly called an IoT node.
Task Modelling and Parameters Description
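As an illustration of the task model, the sketch below encodes a task as a small record. The attributes named in the text (start time, execution time, type, urgency) are taken from the description; the `period` field, the units, the urgency scale, and the example values are assumptions. The feasibility check is likewise an assumption: it applies the classical utilization test for independent periodic tasks scheduled by earliest-deadline-first on a single processor with deadlines equal to periods, which is not necessarily the analysis used in this work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    kind: str              # "sensor" (periodic) or "actuator" (event-triggered)
    start_time: float      # seconds
    execution_time: float  # seconds, assumed to be a worst-case estimate
    period: float          # seconds between activations (assumed field)
    urgency: int           # assumed convention: higher means more urgent

    @property
    def utilization(self) -> float:
        """Fraction of processor time the task needs."""
        return self.execution_time / self.period


def is_feasible(tasks) -> bool:
    """EDF utilization test: total utilization must not exceed 1."""
    return sum(t.utilization for t in tasks) <= 1.0


# Hypothetical sensing tasks with illustrative timing values.
sensing_tasks = [
    Task("getECG", "sensor", 0.0, 0.02, 0.1, urgency=3),
    Task("getPulse", "sensor", 0.0, 0.01, 0.5, urgency=2),
    Task("getTemperature", "sensor", 0.0, 0.05, 1.0, urgency=1),
]

total_utilization = sum(t.utilization for t in sensing_tasks)  # about 0.27
feasible = is_feasible(sensing_tasks)
```

Event-triggered actuator tasks do not fit this periodic test directly; a common workaround is to treat their minimum inter-arrival time as a period.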
One advantage of modeling the vehicle at the task level is that it offers a flexible way to model energy and power consumption and to optimize them accordingly. For instance, if a task $\tau_i$ with a period $P$ is deployed on a physical object $v_m$, the energy consumed is given by
The power and energy must be within the permissible range to avoid any circuit breakage, which can cause hazards in many cases. In this paper, tasks considered vital for safety are summarized along with their mapped virtual objects in Table 2.
Tasks Generation and Mapping in Electric Vehicle Safety Modeling
In this Section, the interaction and flow of the system are described in detail. The proposed system first analyzes the safety scenarios, which are described in a service requirement. Once the scenarios are analyzed, tasks are generated for them. These tasks are then mapped onto virtual objects and allocated to the corresponding IoT nodes. The data collected from sensor nodes are persisted in a central repository, from where they are consumed by a module named intelligent control. The module checks the values of the sensors’ data and classifies them as normal or abnormal. In the case of abnormal values, action needs to be taken, in the form of control task generation and deployment. For instance, if the ECG values suggest that the driver is tired or drowsy, a control task with a sound notification is generated and deployed on the corresponding actuator. The flowchart in Fig. 5 portrays this process.

Flowchart of Proposed Safety Modeling based on Task Management.
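The normal/abnormal check performed by intelligent control can be sketched as a simple range test. All reading names, ranges, and control task names below are hypothetical placeholders chosen for illustration; real limits would need clinical and engineering validation and are not given in the text.

```python
# Hypothetical normal ranges (assumption: illustrative values only).
NORMAL_RANGES = {
    "pulse_bpm": (50, 110),
    "spo2_percent": (94, 100),
    "temperature_c": (-5, 40),
}

# Hypothetical mapping from an abnormal reading to a control task name.
CONTROL_TASKS = {
    "pulse_bpm": "soundNotification",
    "spo2_percent": "soundNotification",
    "temperature_c": "turnOnWarningLed",
}


def classify(reading_name, value):
    """Label a reading as 'normal' or 'abnormal' against its configured range."""
    low, high = NORMAL_RANGES[reading_name]
    return "normal" if low <= value <= high else "abnormal"


def generate_control_tasks(readings):
    """Return the control tasks to deploy, without duplicates, in reading order."""
    tasks = []
    for name, value in readings.items():
        if classify(name, value) == "abnormal":
            task = CONTROL_TASKS[name]
            if task not in tasks:
                tasks.append(task)
    return tasks


readings = {"pulse_bpm": 44, "spo2_percent": 97, "temperature_c": 45}
control_tasks = generate_control_tasks(readings)
# → ["soundNotification", "turnOnWarningLed"]
```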
The sequence of operations in the proposed work is exhibited in the form of a UML sequence diagram, as shown in Fig. 6. There are two main planes: the IoT Embedded Plane and the Task Management and Orchestration Application Plane. The IoT Embedded Plane hosts an IoT server, which first initializes and registers the physical nodes connected to it. These nodes can be either sensors or actuators. The registry information is passed on to the application. The device information is sent to the virtualizer module, where the virtual objects are formed. Virtual objects (VOs) are the in-system representation of physical devices. The task generator creates the tasks, and the tasks and VOs are passed to the task orchestrator module. Once tasks are created, their data is loaded into the mapper. The physical device data is also loaded into the mapper, where the tasks are mapped onto virtual objects. The mapping information is sent to the monitoring center. The monitoring center deploys sensing tasks periodically and assesses the risk index. When the risk index exceeds the threshold, it sends an acknowledgment to the task generator. The task generator then generates control tasks and deploys them to mitigate the risk by notifying the driver with alerts. The information is visualized on a remote simulator, so sensor data can be viewed without being on-premises.

Sequence of Interaction among different components.
In this Section, the implementation stack is illustrated. The machine learning and model training part of the work is implemented using the Pandas and TensorFlow libraries of Python. For model training with different algorithms, the Scikit-learn (sklearn) library is used. This library has built-in modules for different algorithms, e.g., Decision Tree, Random Forest, and Logistic Regression, to name a few. Once the model is trained, it is deployed in a web application developed with the Python micro-framework Flask. For the visual appearance, Bootstrap 3 and JavaScript are used. On the hardware side, we use Arduino, Raspberry Pi, sensors such as temperature, humidity, and wind sensors, and actuators such as a buzzer and LEDs for alerts. The implementation stack is summarized in Table 3.
Implementation Stack
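A minimal sketch of how the three classifiers might be trained and compared with scikit-learn is given below. It is not the authors' training script: the data is a synthetic stand-in generated with `make_classification`, `StandardScaler` is an assumed choice of normalization, and all hyperparameters are illustrative, so the resulting scores say nothing about the accuracy reported in this paper.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the preprocessed accident data (three severity classes).
X, y = make_classification(
    n_samples=600, n_features=5, n_informative=3, n_redundant=0,
    n_classes=3, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Normalize using statistics from the training split only (assumed scheme).
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

models = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}

# Fit each model and record its accuracy on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))
```

Fitting the scaler on the training split alone keeps information from the test split out of training, which matters when accuracies are compared across algorithms.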
In this Section, the execution results of the proposed system are discussed. The main components described in the earlier sections are implemented using the technologies mentioned above. As discussed, there are two planes: the embedded plane and the task orchestration plane. The deployment is shown in Fig. 7. Conceptually, there is a client application, which, in this case, exposes safety services in a typical electric vehicle. The client application consumes services that are based on the task management and application platform. The platform is the means by which the IoT devices interact with the client. The IoT devices, as described earlier, are Raspberry Pi and Arduino. The physical setup is shown in Fig. 7b. The central focus of the figure is the task manager and the virtual object manager. The task manager generates tasks and persists them in the tasks database. Similarly, the virtual object manager generates virtual objects. Virtual objects and tasks are supplied to the mapper module, which maps the tasks onto virtual objects. The tasks are then allocated in the embedded plane. It can be seen from Fig. 7b that Arduino has a pulse oximeter and an ECG sensor attached to the Libelium sensor platform shield. In contrast, fan motors and BMP sensors are attached to Raspberry Pi.

Detailed Schematic Representation of Sub-modules of proposed framework.
The task management module is responsible for adding, editing, and deleting tasks and storing them in the database. Firstly, tasks are added by the task management module and stored in the database. Secondly, the virtual objects are generated by the virtual object manager. The data from the device registry is available to this module, which uses it to generate a virtual object for the corresponding physical device. The virtual objects are also persisted in the database. Next, the mapping is done in the mapping plane, in which tasks are populated on one side and virtual objects on the other. The tasks are mapped by drag-and-drop using the jsPlumb library, and the configuration is persisted. Finally, the task-VO pair is deployed on physical devices. Fig. 8 shows the execution flow of the process described above.
The virtual object manager handles virtual object generation based on the device registry information. Virtual objects are assigned tags and matching names, which help in the mapping process. Similarly, a task can be named in such a way that the name indicates its type. For instance, the names of sensor tasks are prefixed with get. The virtual object manager module interfaces are shown in Fig. 8.

Execution flow and pictorial representation of various interfaces.
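The naming convention above lends itself to a simple matching helper. The sketch below is an assumption about how names and tags could be used to suggest task-to-VO pairs; in the described system the mapping itself is done by drag-and-drop, and the `VirtualObject` fields, tag values, and task names here are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VirtualObject:
    name: str        # e.g. "ECG"
    kind: str        # "sensor" or "actuator"
    tags: frozenset  # lowercase keywords describing the device


def task_kind(task_name):
    """Sensor tasks are prefixed with 'get'; all others are treated as control tasks."""
    return "sensor" if task_name.startswith("get") else "actuator"


def suggest_mapping(task_names, virtual_objects):
    """Suggest one virtual object per task by matching its name or tags."""
    mapping = {}
    for task in task_names:
        kind = task_kind(task)
        # Strip the 'get' prefix from sensor tasks before matching.
        key = task[3:].lower() if kind == "sensor" else task.lower()
        for vo in virtual_objects:
            if vo.kind == kind and (key == vo.name.lower() or key in vo.tags):
                mapping[task] = vo.name
                break
    return mapping


virtual_objects = [
    VirtualObject("ECG", "sensor", frozenset({"ecg", "heart"})),
    VirtualObject("PulseOximeter", "sensor", frozenset({"pulse", "oxygensaturation"})),
    VirtualObject("Buzzer", "actuator", frozenset({"soundnotification", "alert"})),
]
tasks = ["getECG", "getPulse", "soundNotification", "getHumidity"]
mapping = suggest_mapping(tasks, virtual_objects)
# "getHumidity" stays unmapped because no matching virtual object is registered.
```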
Once the tasks are deployed on virtual objects, monitoring starts, and the risk index is computed periodically based on the model output. If the risk index exceeds 2, the vehicle and driving are considered to be in a hazardous state. At this point, control tasks should be triggered to inform the driver about the associated risk of accidents. Fig. 9 illustrates this pictorially.

Vehicles Safety Monitoring and Control.
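The periodic check described above can be sketched as a small monitoring loop. The threshold of 2 comes from the text; the predictor, the sample fields, and the callback are hypothetical stand-ins, and a real deployment would call the trained model and wait for the sensing period between iterations.

```python
RISK_THRESHOLD = 2  # from the text: an index above 2 is treated as hazardous


def monitor(samples, predict_risk, trigger_control_task):
    """Evaluate each sample in turn; trigger a control task when risk is too high."""
    alerts = 0
    for sample in samples:
        risk = predict_risk(sample)
        if risk > RISK_THRESHOLD:
            trigger_control_task(sample, risk)
            alerts += 1
    return alerts


def toy_predictor(sample):
    """Stand-in for the trained model (assumption: simply sums two scores)."""
    return sample["weather_score"] + sample["drowsiness_score"]


alert_log = []
samples = [
    {"weather_score": 0.5, "drowsiness_score": 0.5},  # risk 1.0, no alert
    {"weather_score": 1.5, "drowsiness_score": 1.0},  # risk 2.5, alert
    {"weather_score": 1.0, "drowsiness_score": 1.0},  # risk 2.0, not above 2
]
alerts = monitor(samples, toy_predictor, lambda sample, risk: alert_log.append(risk))
```

Passing the predictor and the alert action as callables keeps the loop independent of both the model and the actuator, so either can be swapped without touching the monitoring logic.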
In this Section, we evaluate the performance of the model based on three algorithms: Random Forest, Decision Tree, and Logistic Regression. The model output is one of three risk severity classes: 1 represents fatality, 2 represents casualties, and 3 represents no harm. The input is the set of data parameters. The data is normalized before being passed to the training phase. Table 4 summarizes the model evaluation in terms of various attributes. The recall and precision of the different classes are higher for Random Forest than for the other algorithms. The overall accuracy of Random Forest is also the highest among the algorithms. The micro-average for Logistic Regression is the highest, which means that this algorithm is less accurate, but its performance is balanced across all classes.
Analysis of Model Training Results of various algorithms
Fig. 10 shows the graph of true predictions and false predictions. The number of true predictions is comfortably higher than the number of false predictions. Across the algorithms, the true predictions of Random Forest are slightly higher than those of the other algorithms.

Model Evaluation in terms of Confusion Matrix.
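For readers who want to relate precision, recall, and averaged scores to a confusion matrix, the following sketch derives them from first principles. The matrix is a hypothetical three-class example and does not reproduce the results of this paper.

```python
def per_class_metrics(cm):
    """Return (precision, recall) per class; cm[i][j] counts true class i predicted as j."""
    n = len(cm)
    metrics = []
    for k in range(n):
        true_positive = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum
        actual_k = sum(cm[k])                          # row sum
        precision = true_positive / predicted_k if predicted_k else 0.0
        recall = true_positive / actual_k if actual_k else 0.0
        metrics.append((precision, recall))
    return metrics


def accuracy(cm):
    """Share of correct predictions: diagonal total over grand total."""
    total = sum(sum(row) for row in cm)
    return sum(cm[k][k] for k in range(len(cm))) / total


def macro_average(values):
    """Unweighted mean, so every class counts equally regardless of its size."""
    return sum(values) / len(values)


# Hypothetical confusion matrix (rows: true class, columns: predicted class).
cm = [
    [8, 2, 0],
    [1, 15, 4],
    [0, 3, 67],
]

metrics = per_class_metrics(cm)
macro_recall = macro_average([recall for _, recall in metrics])
# In single-label multi-class classification, micro-averaged precision and
# recall both equal the overall accuracy.
micro_average = accuracy(cm)  # 90 correct out of 100 → 0.9
```

The macro average is the figure to consult when balance across classes matters, since a rare class such as fatalities weighs as much as a common one.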
The related work is closely linked with the motivation, architecture, and design of this work. Numerous efforts have been made on the road safety of autonomous vehicles. Fagnant et al. [19] considered the near elimination of errors linked with human intervention; such human errors account for 90% of fatalities. In another study, Rau et al. [20] proposed a mechanism that categorizes AV functions into four groups in order to identify the cause of fatalities: the pre-crash scenario, the location of the crash, the driver’s biological condition, and the speed of the vehicle. Recent studies [21–24] have performed many tests on real-world data and found that the less the human intervention and the greater the automation, the lower the crash probability.
It is predicted that offloading the human from autonomous vehicles will make transportation smart and the chance of a crash much lower; however, a fully automated car is yet to be deployed by any organization. Therefore, warning systems and driver assistance systems are currently in high demand; they assist drivers with alerts and notifications by executing real-time tasks remotely from edge nodes [25–27]. These devices can detect numerous states of drivers and environments, such as drowsiness, alcohol and drug use, and driver fatigue [28]. Commercial systems such as LiDAR, anti-lock braking systems (ABS), and adaptive headlights are among the few notable systems; however, each of these is dedicated to a single job and is very costly [27]. In this paper, a generic approach to finding the most pressing parameters linked with crashes is investigated, and a lightweight system is developed to give alerts and notifications remotely in real time. To the best of the authors’ knowledge, this is the first such attempt.
Conclusion
In this paper, we have proposed an Internet of Things application that assists semi-autonomous vehicles and helps them reduce crashes and fatalities. This paper considers the human driving job as a collection of tasks that have to be performed within a span of time. If the real-time tasks can be guaranteed to execute within their allocated time, a warning system can be developed to alert drivers when they make an error that could lead to a potential accident. We have used UK safety data to train models based on a variety of algorithms and deployed the resulting model in the embedded IoT application. The model also helped in prioritizing the parameters that affect fatalities the most. We have used weather conditions and drivers’ biological conditions, and deployed different sensors that periodically sense these conditions and compute the risk of driving. If the risk of driving exceeds a threshold, the system issues alerts and notifications. Finally, the model is evaluated, and the execution results are illustrated with interfaces. The results suggest that this work can serve as a starting point for redefining the mechanism of driver assistance systems and for replacing costly commercial devices that each perform only a dedicated job.
