Abstract
Despite improvements in traffic safety measures, traffic accidents remain frequent. Developing countries are significantly affected by traffic accidents with respect to fatalities and property damage. Traffic accidents happen for multiple reasons, including traffic conditions, driving violations, and driver misjudgments. Severe accidents may lead to fatalities; therefore, accident severity prediction might help reduce the chances of fatalities. This research makes use of a U.S. road accident dataset that contains the 32 most relevant factors related to accidents. To obtain accurate predictions of traffic accident severity, this research proposes an ensemble of random forest and support vector classifiers that is trained using deep convoluted features. Features are extracted from the road accident dataset using a convolutional neural network (CNN). A comparison of models trained on the original features and on the CNN features shows the superiority of the convoluted features. Experimental results involving several well-known machine learning models indicate that the proposed model can obtain an accuracy of 99.99% for traffic accident severity prediction. The efficacy of the proposed model is validated against existing state-of-the-art approaches.
Introduction
Annually, a considerable number of individuals sustain injuries or lose their lives because of road accidents globally, leading to significant human and financial losses. To effectively decrease the fatalities and damages resulting from road traffic accidents (RTAs), it is essential to comprehend the causes of such accidents and the severity of the injuries. The increasing complexity of road systems, coupled with the rising number of vehicles on the road, demands a data-driven approach to understanding accident patterns and identifying potential risk factors. It is important to continue researching the causes of RTAs and implementing effective strategies to reduce their occurrence and severity ( 1 ). According to a recent report on global road safety published by the World Health Organization (WHO), traffic accidents account for over 1.3 million fatalities worldwide. The report also reveals that car accidents are the primary cause of death among young adults and teenagers on a global scale ( 2 ). In addition, in developing countries, 80% of all deaths from traffic accidents involve men. These statistics highlight the need for continued efforts to improve road safety and reduce the number of accidents and fatalities. The U.S.A. has faced significant traffic safety issues, particularly in 2020, with 35,766 fatal motor vehicle crashes resulting in 38,824 deaths ( 3 ). The country also suffers significant economic losses associated with RTAs, with an estimated annual loss of US$242 billion ( 4 ). The healthcare system is also affected by these accidents, as RTAs can occupy a significant portion (30%–70%) of orthopedic beds in hospitals, according to one study ( 5 ). To effectively address and reduce the frequency and severity of road trauma, it is essential to comprehend the factors that contribute to RTAs.
The high rates of traffic accidents, combined with the significant number of passengers involved, make it important to investigate passenger safety thoroughly ( 6 ). In an intelligent transportation system, accurate and quick prediction of the severity of traffic accidents is crucial for providing appropriate medical care and transportation in the event of an accident. The ability to predict the severity of reported incidents or forecast the severity of future crashes can be beneficial for different organizations, such as emergency responders, transportation planners, and insurance companies ( 7 ). This is an important aspect of managing traffic accidents. To identify the risk factors, researchers use the data related to traffic accidents and implement strategies to improve road safety ( 8 , 9 ). Factors such as road conditions, driver behavior, vehicle attributes, weather conditions, and crash details can all affect the severity and frequency of crashes ( 10 , 11 ). Accident data is collected and stored in a database that includes various crash-related characteristics. Different analytical techniques have been used to analyze this data in previous studies.
Unexpected traffic accidents are a common occurrence, but providing drivers with relevant information can help decrease the chances of accidents. Forecasting potential accidents and identifying contributing factors can aid in preventing accidents and making roads safer. To achieve this, different models have been created to pinpoint important factors that contribute to accidents. These models use techniques such as statistical analysis, machine learning (ML), and data mining to study past accident data and recognize patterns and trends. The goal is to develop models that can accurately predict future accidents and identify the factors that contribute to them so that appropriate interventions can be implemented to improve road safety ( 12 ).
Data mining employs ML algorithms to grasp the intricate and non-linear relationships among different factors that influence traffic accidents ( 13 , 14 ). ML techniques have emerged as a promising solution in this domain, offering the capability to uncover intricate relationships within accident data and provide more accurate predictions than conventional statistical methods. These algorithms help create models that can predict future accidents with high precision and recognize contributing factors to decrease prediction errors and enhance results. As the amount of traffic data increases, it is crucial to use ML algorithms to aid transportation management departments in analyzing data and making accurate and precise decisions that enhance road safety. The following contributions are made in this research work for traffic accident prediction.
This study proposes the implementation of an ensemble model that incorporates convoluted features derived from a convolutional neural network (CNN) model to predict the severity of traffic accidents. The proposed ensemble model utilizes a random forest (RF) and support vector machine (SVM) with a voting system to determine the final prediction.
Experiments are performed using the original features from the data, as well as the features extracted from the CNN model, and the performance of models trained on the original features is compared with that of models trained on the convoluted features.
A performance comparison is conducted using various ML models, such as the RF, k-nearest neighbor (k-NN), gradient boosting machine (GBM), SVM, logistic regression (LR), decision tree (DT), extra tree classifier (ETC), Gaussian naive Bayes (GNB), and stochastic gradient descent (SGD). Furthermore, the proposed model’s effectiveness is analyzed by comparing its performance to state-of-the-art approaches with respect to accuracy, precision, recall, and F1 score.
The remainder of the paper is structured as follows. The second section reviews related research in the area of traffic accident severity, giving a summary of the literature. The third section explains the dataset and the various techniques used in the study. The fourth section presents and examines the results. Lastly, in the fifth section, the study is concluded with suggestions for further research.
Related Work
In recent years, ML has gained popularity in predicting accident severity because of its capability to identify intricate relationships and provide more accurate results than conventional statistical methods. Traditional statistical methods for accident severity prediction can have limitations such as low precision and unrealistic assumptions. ML, on the other hand, does not need prior knowledge of the variables or the process that creates them, making it a more dependable method for predictions. This section provides an overview of some of the ML-based approaches used for traffic accident severity prediction.
Al Mamlook et al. ( 15 , 16 ) conducted two studies utilizing ML techniques to analyze crash injury severity and injury prediction in road accidents. In their first study ( 15 ), the LR, DT, naive Bayes (NB), RF, and light GBM are used to examine factors affecting injury severity among elderly drivers, and the synthetic minority oversampling technique (SMOTE) method is employed to handle class imbalance. The light GBM model yielded an accuracy of 87%. In the second study ( 16 ), the authors evaluated the effectiveness of different ML models, including the NB, RF, adaptive boosting (ADA), and LR to predict injury severity for road accidents. The RF model demonstrated the highest accuracy rate of 75.5%.
Sameen and Pradhan ( 17 ) proposed a system for predicting accident severity based on deep learning models such as the multi-layer perceptron (MLP), Bayesian linear regression (BLR), and recurrent neural network (RNN). The study revealed that the RNN model produced an accuracy of 71.77%. Meanwhile, Aldhari et al. ( 18 ) suggested an ML-based system for traffic accident severity prediction in Saudi Arabia. The system employed three ML models, the RF, LR, and XGBoost, and addressed bias issues using SHAP software. Experiments were conducted in two scenarios: binary classification and multi-class classification. For multi-class classification, XGBoost demonstrated the highest accuracy of 71%, while for binary classification, XGBoost once again had the highest accuracy score of 94%.
In their study, Jamal et al. ( 19 ) presented an approach that utilizes several ML models, including the RF, LR, DT, and XGBoost, to improve the prediction accuracy of road accident severity. The authors found that the XGBoost model exhibited superior performance compared to the other models with respect to individual class accuracy and overall predictive performance. In addition, through feature importance analysis, the authors identified specific factors that have a significant impact on the severity of traffic accidents. The proposed XGBoost model achieved an accuracy score of 93%. Similarly, Manzoor et al. ( 20 ) suggested an ensemble learning model, the RFCNN, that combines ML and deep learning to identify the influential factors for road accident severity. The study revealed that on the 20 most significant features, the proposed RFCNN model obtained an accuracy score of 99.1%.
One study ( 21 ) proposed a simple Classification and Regression Tree (CART) model to predict the severity of motorcycle accidents. In addition, the Probability and Regression Tree (PART) and Multilayer Perceptron (MLP) models were utilized in the study. The influential factors linked to motorcycle crash injury severity were also identified. According to the findings, the CART model achieved an accuracy score of 73.81%, followed by the PART model with an accuracy score of 73.45%. Gan et al. ( 22 ) proposed a deep forest algorithm for traffic accident severity prediction. To evaluate its efficiency, the light GBM, k-NN, DT, RF, deep neural network (DNN), and XGBoost ML models were also employed. The proposed deep forest model exhibited good stability and achieved an accuracy score of 90.69%.
Lin et al. ( 23 ) proposed a deep learning-based system for traffic accident prediction for the Internet of Vehicles. For accident risk prediction, the authors used learning models such as the DNN, DT C4.5, NB, deep belief network (DBN), MLP, and Bayesian network. The results of the study showed that the DNN outperformed the other models and performed well for stage one and stage two clustering. Bahiru et al. ( 24 ) compared the performance of several ML algorithms, including the ID3, NB, J48, and CART. The study reported a 96% accuracy using the J48 ML model. Table 1 shows an analytical summary of the discussed research works.
Comparison of Existing Works
Note: RF = random forest; NB = naive Bayes; LR = logistic regression; DT = decision tree; GBM = gradient boosting machine; ADA = adaptive boosting; MLP = multi-layer perceptron; BLR = Bayesian linear regression; RNN = recurrent neural network; ETC = extra tree classifier; CNN = convolutional neural network; SGD = stochastic gradient descent; DNN = deep neural network; k-NN = k-nearest neighbor; DBN = deep belief network; CART = Classification and Regression Tree; PART = Probability and Regression Tree; VC = Voting Classifier; SHAP = SHapley Additive exPlanations; AB = AdaBoost.
The presented existing works on accident severity prediction show the models used, datasets employed, and the achieved accuracy scores. Most of the studies are directly relevant to accident severity prediction, as they utilize datasets specifically focused on traffic accidents and accident records. These results hold significant value for traffic engineers and field engineers, as they can inform targeted safety interventions, prioritize resources, and optimize traffic management strategies to mitigate accident severity and improve overall road safety. However, the analysis also reveals certain research gaps, such as the need for studies explicitly focused on accident severity prediction in specific regions or using larger datasets with diverse accident scenarios. In addition, some studies lack detailed explanations of how the model results can be translated into practical applications for practitioners. Nevertheless, the insights from these studies present a valuable contribution to the development of data-driven and effective accident severity prediction models with real-world applications for practitioners in the field. Addressing these research gaps will further advance the understanding and application of predictive models, ultimately leading to enhanced road safety measures and reduced accident severity.
Methods and Techniques
The proposed methodology and procedures are discussed in this section. It also covers the dataset description, preparation procedures, classifiers, and performance assessment metrics.
Dataset Description
The “US accidents (2016–2021)” dataset ( 25 ) was obtained from Kaggle and comprises more than 2.8 million records of traffic accidents. This countrywide car accident dataset covers 46 states throughout the U.S.A. and records accidents that occurred from February 2016 up until December 2021. The dataset contains 47 attributes describing the factors that contributed to car crashes in the country, which are Start_Lng, Number, State, Weather_Timestamp, Visibility (mi), Amenity, No_Exit, Traffic_Calming, Nautical_Twilight, Severity (target variable), End_Lat, Street, Zipcode, Temperature (F), Wind_Direction, Bump, Railway, Astronomical_Twilight, Start_Time, End_Lng, Side, Country, Wind_Chill(F), Wind_Speed (mph), Crossing, Roundabout, Turning_Loop, End_Time, Distance (mi), City, Timezone, Humidity (%), Precipitation (in.), Give_Way, Station, Sunrise_Sunset, Start_Lat, Description, County, Airport_Code, Pressure (in.), Weather_Condition, Junction, Stop, Civil_Twilight, ID and Source. The complete description of the dataset is shown in Figure 1; a more detailed description of the dataset can be found in Gan et al. ( 22 ).

Description of the dataset.
Convolutional Neural Network
In this study, a CNN ( 26 , 27 ) was employed as a feature engineering method for predicting traffic accident severity. The CNN model comprises four layers, namely an embedding layer, a one-dimensional (1D) convolutional layer, a max-pooling layer, and a flattened layer. The embedding layer employs all the features of the traffic accident dataset, with an embedding size of 20,000 and an output dimension of 300. The 1D convolutional layer consists of 5000 filters, a kernel size of 2 × 2, and a rectified linear unit (ReLU) activation function. Subsequently, a max-pooling layer of size 2 × 2 is utilized to extract crucial feature maps from the second layer. Finally, the flattened layer is used to transform the feature maps back to a 1D array.
To convert the traffic accident dataset into the desired input format, an embedding layer is used, and its output serves as the input to the convolutional layer. The model is designed to accept inputs ranging from 0 to 20,000, with an embedding size set at 20,000 and an output dimension of 300.
To extract features from the output of the embedding layer, a 1D convolutional layer is used in this study. The CNN consists of 500 filters, each with a kernel size of 2 × 2. The ReLU activation function is implemented, which sets all negative values in the convolutional output to zero, that is, $f(x) = \max(0, x)$.
A max-pooling layer is employed to identify significant features in the output of the CNN. A pool of 2 × 2 is used for feature mapping, resulting in a reduced feature map.
A flattened layer is used to convert the three-dimensional (3D) data into one dimension. This is done to improve the efficiency of ML algorithms, as they perform better with 1D data. By implementing these steps, 25,000 features are obtained for training ML models.
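The four stages described above (embedding, 1D convolution with ReLU, max pooling, and flattening) can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the weights are random rather than trained, the toy sizes (`VOCAB`, `EMB_DIM`, `N_FILTERS`, `SEQ_LEN`) are hypothetical stand-ins for the much larger values reported in the text, and the kernel is assumed to span two consecutive positions along the feature sequence.

```python
import random

def embed(tokens, table):
    """Look up one embedding vector per integer-encoded feature value."""
    return [table[t] for t in tokens]

def conv1d_relu(seq, filters, kernel_size=2):
    """Valid 1D convolution along the sequence axis, followed by ReLU.

    seq:     list of L vectors, each of length D
    filters: list of F filters, each a kernel_size x D weight matrix
    returns: list of (L - kernel_size + 1) vectors, each of length F
    """
    out = []
    for start in range(len(seq) - kernel_size + 1):
        window = seq[start:start + kernel_size]
        row = []
        for w in filters:
            s = sum(w[k][d] * window[k][d]
                    for k in range(kernel_size)
                    for d in range(len(window[k])))
            row.append(max(0.0, s))  # ReLU: negative responses become zero
        out.append(row)
    return out

def max_pool1d(fmap, pool=2):
    """Non-overlapping max pooling along the sequence axis."""
    pooled = []
    for start in range(0, len(fmap) - pool + 1, pool):
        block = fmap[start:start + pool]
        pooled.append([max(col) for col in zip(*block)])
    return pooled

def flatten(fmap):
    """Concatenate all rows into a single 1D feature vector."""
    return [v for row in fmap for v in row]

# Hypothetical toy sizes; the paper reports far larger settings.
rng = random.Random(0)
VOCAB, EMB_DIM, N_FILTERS, SEQ_LEN = 50, 4, 3, 9
table = [[rng.uniform(-1, 1) for _ in range(EMB_DIM)] for _ in range(VOCAB)]
filters = [[[rng.uniform(-1, 1) for _ in range(EMB_DIM)] for _ in range(2)]
           for _ in range(N_FILTERS)]

record = [rng.randrange(VOCAB) for _ in range(SEQ_LEN)]  # one encoded accident record
features = flatten(max_pool1d(conv1d_relu(embed(record, table), filters)))
```

With these toy sizes the convolution yields 8 positions, pooling halves them to 4, and flattening gives 4 × 3 = 12 values per record; this vector plays the role of the convoluted features passed to the ML models.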
Machine Learning Algorithms for Traffic Accident Severity Prediction
In this study, multiple traffic accident severity prediction models were developed using various parameters and ML algorithms, such as the DT, LR, SVM, SGD, GNB, RF, ETC, k-NN, and GBM. The models were utilized to predict traffic accidents based on road surface conditions, using a dataset of multiple traffic accidents. The implementation of the ML models was carried out using the scikit-learn library.
Decision Tree
The DT ( 28 ) ML model is well-known and often used for classification and regression tasks. By recursively splitting the data into subsets depending on the input feature values, this algorithm generates a tree-like model of decisions and potential results. The interior nodes of the tree represent features, whereas the leaf nodes represent class labels. Beginning at the root node, the algorithm makes decisions based on the input feature’s value, eventually directing the input to a specific leaf node. The primary objective is to construct a tree that can accurately predict the class label of new input data. The DT model is easy to comprehend and interpret and can handle both categorical and numerical data.
Support Vector Machine
The SVM ( 29 ) is an efficient model for handling classification and regression problems. The SVM aims at locating the optimal boundary, called a hyperplane, that can effectively separate the data into different classes. The selected boundary maximizes the margin, which is the distance between the boundary and the closest data points from each class, also known as support vectors. The SVM is a good choice when the data is not linearly separable, because it uses the kernel trick to map the data into a higher-dimensional space in which a separating hyperplane can be found.
Random Forest
The RF ( 28 ) is an ensemble learning technique that can be used for classification and regression. This method involves multiple DTs, each constructed using a random subset of both the data and the features. The final prediction is determined by averaging or taking the mode of the predictions generated by each tree in the ensemble. The main idea is to average out the errors made by individual DTs, which are often high-variance models, and create a more robust model. The RF also has a built-in feature selection mechanism, where it selects the most important features for the DTs by considering the decrease in impurity (e.g., Gini, information gain) when a feature is used to split a tree.
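The two ingredients mentioned above, the impurity decrease used to score a split and the mode taken over the trees' predictions, can be sketched as follows. This is an illustrative standard-library example, not the scikit-learn implementation used in the study; the function names are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def impurity_decrease(parent, left, right):
    """Reduction in Gini impurity when `parent` is split into `left` and `right`."""
    n = len(parent)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted

def majority_vote(tree_predictions):
    """Mode of the class labels predicted by the individual trees."""
    return Counter(tree_predictions).most_common(1)[0][0]
```

For example, a split that separates the labels `[1, 1, 2, 2]` perfectly into `[1, 1]` and `[2, 2]` removes all of the parent's impurity, and `majority_vote([2, 3, 2])` returns class 2.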
Logistic Regression
LR ( 30 ) is a statistical method used for both classification and probability prediction in ML. It is a type of generalized linear model (GLM) that is used when the response variable is binary or categorical. LR uses the logistic function, also called the sigmoid function, to model the probability of a binary response variable. The logistic function maps any real-valued number to a value between 0 and 1, which corresponds to the probability of a certain event.
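As a small illustration of the logistic function described above, the following sketch maps a linear score to a probability. The weights and bias passed to `predict_proba` are hypothetical placeholders; in practice they are learned from the training data.

```python
import math

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(weights, bias, x):
    """Probability of the positive class for one feature vector x."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return sigmoid(score)
```

A score of zero corresponds to a probability of exactly 0.5, which is the usual decision threshold for a binary response.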
Extra Tree Classifier
The ETC is an ensemble learning method that uses randomized trees ( 31 ) to generate a final classification output by combining uncorrelated trees in a forest of DTs. The underlying concept of the ETC is similar to the RF, but the method of constructing DTs in the forest is different. In the ETC, each split decision is made from a random sample of the K best features, and the best split among them is selected using the Gini index.
Gaussian Naive Bayes
The GNB model ( 32 ) is a classification algorithm that is based on the Bayes theorem and the assumption of independence between features. It is a probabilistic classifier that is specifically designed for continuous data, and it makes the assumption that the data for each feature is drawn from a Gaussian distribution. The Bayes theorem states that the probability of an event $A$ given an observed event $B$ is $P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$.
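A minimal sketch of this classifier, assuming per-class Gaussian likelihoods and log-probabilities for numerical stability, is given below. It is illustrative only: the helper names and the small variance floor `var_floor` are assumptions, not part of the original study, which relied on scikit-learn.

```python
import math
from collections import defaultdict

def fit_gnb(X, y, var_floor=1e-9):
    """Estimate the class prior and per-feature mean/variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        cols = list(zip(*rows))
        means = [sum(c) / len(c) for c in cols]
        variances = [sum((v - m) ** 2 for v in c) / len(c) + var_floor
                     for c, m in zip(cols, means)]
        model[label] = (len(rows) / len(X), means, variances)
    return model

def predict_gnb(model, x):
    """Return the class with the highest log-posterior (up to a shared constant)."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for xi, m, v in zip(x, means, variances):
            # Log of the Gaussian density; features are treated as independent.
            score += -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

The denominator $P(B)$ of the Bayes theorem is the same for every class, so it can be dropped when only the most probable class is needed.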
K-Nearest Neighbors
The k-NN model ( 33 ) is an ML algorithm that can be used for both classification and regression problems. It is a non-parametric, instance-based, or lazy learning method. The k-NN algorithm locates the k training examples that are nearest to the new data point and then classifies the new data point based on the majority class among those k nearest neighbors. It makes predictions based on the similarity between the input data and the training data.
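The procedure can be sketched in a few lines, assuming Euclidean distance as the similarity measure (the text does not state which distance was used) and a hypothetical default of k = 3.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    neighbors = sorted(
        (euclidean(row, query), label) for row, label in zip(train_X, train_y)
    )
    nearest_labels = [label for _, label in neighbors[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]
```

Because no model is fit in advance, all of the cost is paid at prediction time, which is why the method is called lazy.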
Gradient Boosting Machine
The GBM ( 34 ) is an ML algorithm that is suitable for solving both classification and regression problems. It is a member of the boosting family of ensemble learning methods, which entails combining the predictions of several weak models (e.g., DTs) to create a single robust model. The basic idea behind gradient boosting is to iteratively train weak models, such as DTs, and add them to the ensemble one at a time. Each new tree is trained to correct the mistakes of the previous trees by fitting the negative gradient of the loss function (the residual errors) of the current ensemble. The predictions of all trees are then combined to make the final prediction. This process is repeated until a pre-determined number of trees is reached or the performance of the ensemble on a validation set stops improving.
Stochastic Gradient Descent
SGD ( 35 ) is an optimization technique used in ML to find the best set of parameters for a model. The algorithm works by making small adjustments to the parameters in the direction that minimizes the loss function. The use of a random subset of the data, called a mini-batch, to estimate the gradient at each iteration is what makes it “stochastic.” The algorithm updates the parameters by using the gradient of the loss function with respect to the parameters from the mini-batch and a step size called the learning rate. The process is repeated until the loss function converges or reaches a pre-determined number of iterations. SGD is computationally efficient and can handle large datasets.
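The update rule can be illustrated on a toy problem. The sketch below fits a one-variable linear model with a squared-error loss using mini-batch SGD; the loss, learning rate, batch size, and epoch count are hypothetical choices for illustration and are not the settings of the SGD classifier used in the study.

```python
import random

def sgd_linear(xs, ys, lr=0.05, epochs=200, batch_size=4, seed=0):
    """Fit y ~ w * x + b by mini-batch SGD on the mean squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # a new random batch order in every epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            grad_w = grad_b = 0.0
            for i in batch:
                error = (w * xs[i] + b) - ys[i]
                grad_w += 2 * error * xs[i]
                grad_b += 2 * error
            # Step against the gradient estimated from this mini-batch only.
            w -= lr * grad_w / len(batch)
            b -= lr * grad_b / len(batch)
    return w, b

# Noise-free toy data generated from y = 2x + 1.
xs = [i / 10 for i in range(20)]
ys = [2 * x + 1 for x in xs]
w, b = sgd_linear(xs, ys)
```

On this noise-free data the estimates approach the generating values w = 2 and b = 1; with real data the iterates fluctuate around the minimum, and the learning rate controls the size of that fluctuation.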
Proposed Framework
The use of ensemble models has become more prevalent in recent years, resulting in improved accuracy and efficiency in classification outcomes. By combining multiple classifiers, performance can be enhanced beyond what individual models can achieve. In this research, an ensemble learning technique is employed to enhance the prediction of road accident severity. The proposed approach is a voting classifier that combines the RF and SVM models using soft voting criteria, in which the final outcome is the class with the highest average predicted probability across the two models. Figure 2 illustrates the complete workflow of the proposed model. The proposed ensemble model is outlined in Algorithm 1:

Workflow methodology diagram.
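The prediction probabilities for each test sample are obtained by running it through both the RF and SVM models, denoted by $P_{RF}$ and $P_{SVM}$, respectively. The final prediction is

$$\mathrm{VC}(\mathrm{RF} + \mathrm{SVM}) = \arg\max_{c} \; \frac{1}{2}\left(P_{RF}(c) + P_{SVM}(c)\right),$$

where VC(RF + SVM) determines the final class by combining the projected probabilities from both classifiers and selecting the class with the highest average probability.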

Architecture of the proposed VC(RF + SVM) model.
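The soft-voting rule of the VC(RF + SVM) model can be sketched as follows. The probability vectors in the example are made-up values for a single test sample with four severity classes; in the actual pipeline they would come from the trained RF and SVM models.

```python
def soft_vote(proba_rf, proba_svm):
    """Average the two class-probability vectors and pick the most probable class."""
    averaged = [(p_rf + p_svm) / 2 for p_rf, p_svm in zip(proba_rf, proba_svm)]
    best_class = max(range(len(averaged)), key=averaged.__getitem__)
    return best_class, averaged

# Hypothetical probabilities for one sample (class indices 0..3).
proba_rf = [0.10, 0.55, 0.25, 0.10]   # the RF alone would choose class 1
proba_svm = [0.05, 0.25, 0.65, 0.05]  # the SVM alone would choose class 2
predicted_class, averaged = soft_vote(proba_rf, proba_svm)
```

Here the two models disagree, so a hard majority vote would be tied; averaging the probabilities resolves the disagreement in favor of class 2, the class about which the SVM is more confident.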
The proposed framework utilizes an ensemble model called VC(RF + SVM) for traffic severity prediction that combines two ML models. The experiments involve the U.S. accidents (2016–2021) dataset from Kaggle. A label encoder is used that converts the categorical data into numerical form. The proposed model has been tested on the U.S. accidents dataset for the years 2016–2021 in two steps. The first step is to predict the severity of road accidents using all 47 features of the dataset. During the second step of the experiment, the dataset is pre-processed for the ML models by utilizing convolutional features. The data is split into two subsets, with 70% used for training and 30% reserved for testing. This approach, known as the training–testing split, is commonly employed in ML to evaluate the accuracy of the model on previously unseen data. The implementation code for the employed models can be found at https://github.com/MUmerSabir/SAGETRR, while the dataset used in this study can be found at https://zenodo.org/badge/latestdoi/668118587.
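The two preparation steps mentioned above, label encoding of categorical values and the 70%/30% training–testing split, can be illustrated with the following standard-library sketch. The helper names and the fixed seed are hypothetical, and the example values are invented; the study's own code is in the repository linked above.

```python
import random

def label_encode(values):
    """Map each distinct category to an integer code, in sorted order."""
    classes = sorted(set(values))
    mapping = {category: code for code, category in enumerate(classes)}
    return [mapping[v] for v in values], mapping

def train_test_split(rows, test_ratio=0.3, seed=42):
    """Shuffle the rows and reserve `test_ratio` of them for testing."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(rows) * test_ratio)
    test = [rows[i] for i in indices[:n_test]]
    train = [rows[i] for i in indices[n_test:]]
    return train, test

# Invented example values for a categorical weather attribute.
codes, mapping = label_encode(["Rain", "Clear", "Rain", "Fog"])
train_rows, test_rows = train_test_split(list(range(10)))
```

Shuffling before the split avoids a test set made up only of the most recent records, and fixing the seed keeps the split reproducible across runs.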
Experimental Setup
For this experiment, we used an Intel Core i7 CPU and an NVIDIA graphics processing unit (GPU) to train the models. The models are implemented using a Python 3.7 programming environment. Table 2 provides the details of the software and hardware specifications used in the experiment.
Experimental Setup for the Proposed System
Note: RAM = Random Access Memory; OS = Operating System; CPU = Central Processing Unit; GPU = Graphics Processing Unit.
Evaluation Parameters
To evaluate the effectiveness of the proposed system for traffic accident severity prediction, various performance metrics are used in this study. These evaluation parameters include accuracy, precision, recall, and F1 score. These metrics are commonly used in ML to measure the model’s performance. The following formulas are used for these metrics:
True positive (TP): TP refers to the number of positive instances in the dataset that are correctly predicted as positive by the ML model. In other words, TP represents the number of instances where the model correctly identified a positive outcome.
True negative (TN): TN refers to the number of negative instances in the dataset that are correctly predicted as negative by the ML model. In simpler terms, TN represents the number of instances where the model correctly identified a negative outcome.
False positive (FP): FP refers to the number of negative instances in the dataset that are incorrectly predicted as positive by the ML model. In this case, the model makes a mistake by predicting a positive outcome when it should have been negative.
False negative (FN): FN refers to the number of positive instances in the dataset that are incorrectly predicted as negative by the ML model. This means that the model fails to recognize a positive outcome and, instead, predicts it as negative.
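These terms are combined into the performance metrics used to evaluate a model:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

$$\text{Precision} = \frac{TP}{TP + FP}$$

$$\text{Recall} = \frac{TP}{TP + FN}$$

$$\text{F1 score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$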
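As a worked illustration, the four metrics can be computed directly from the TP, TN, FP, and FN counts. The counts in the example are invented for demonstration and are not results from this study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall, and F1 score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Invented counts: 90 TP, 80 TN, 10 FP, 20 FN.
metrics = classification_metrics(tp=90, tn=80, fp=10, fn=20)
```

With these counts the accuracy is 170/200 = 0.85, the precision is 90/100 = 0.90, the recall is 90/110 ≈ 0.818, and the F1 score is 6/7 ≈ 0.857. For a multi-class target such as severity, the counts are typically computed per class and the resulting scores averaged.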
Results and Discussion
Extensive experimentation has been carried out to predict the severity of traffic accidents, with ongoing efforts to create a more effective method for analyzing RTAs. ML models have been utilized on both the original and convoluted features, and the results have been discussed. In addition, an ensemble of the four highest-performing individual ML models has been employed in experiments on both feature sets.
Results of Individual Machine Learning Models on Original Features and Convoluted Features
The present study uses nine ML models with different parameters. To attain high performance, the hyperparameters are set empirically. On the original features, the RF performs best with an accuracy score of 89%, followed by the ETC, which achieves an accuracy score of 87%. The k-NN is the worst performer, achieving an accuracy score of 79%. The accuracy scores of all the classifiers when used with original features are displayed in Table 3.
Results of Models Using Original Features
Note: RF = random forest; ETC = extra tree classifier; LR = logistic regression; SVM = support vector machine; GNB = Gaussian naive Bayes; k-NN = k-nearest neighbor; GBM = gradient boosting machine; DT = decision tree; SGD = stochastic gradient descent.
Table 4 shows the classification accuracy of different classifiers when used with convoluted features. Experimental results show that the RF and ETC outperform the other learning models and achieve an accuracy score of 91%. The SVM also gives a higher accuracy score than the remaining classifiers. It is found that the performance of ML models tends to improve when CNN-based features are used for training. For example, the accuracy of the RF improves from 89.41% with the original features to 91.92% with the CNN features. Similarly, all other models show better performance when used with CNN features.
Results of Models Using Convoluted Features
Note: RF = random forest; ETC = extra tree classifier; LR = logistic regression; SVM = support vector machine; GNB = Gaussian naive Bayes; k-NN = k-nearest neighbor; GBM = gradient boosting machine; DT = decision tree; SGD = stochastic gradient descent.
Results of Ensemble Models on Original Features
Initially, individual models are applied to the original features and convoluted features and the results of the models are shown in Tables 3 and 4. Out of the nine ML models, four models, the RF, ETC, LR, and SVM, achieve the best results on both feature sets. In this part of the experiment, the ensemble of these ML models is tested on the original features. Results of the ensemble learning models show that the proposed ensemble model RF + SVM outperforms other ensemble models with respect to accuracy, with 93%, which is 2% higher compared with the other ensemble learning models. It is followed by the SVM + ETC, which achieves an accuracy score of 92%. In addition, the RF + SVM achieves 93% precision, 96% recall, and a 94% F1 score for traffic accident severity. Results of the ensemble learning models on the original feature set are shown in Table 5.
Performance of Ensemble Models Using Original Features
Note: RF = random forest; SVM = support vector machine; ETC = extra tree classifier; LR = logistic regression.
Performance of Ensemble Models Using CNN Features
Ensemble models are also trained and tested using CNN-extracted features to analyze the change in their performance when switched from the original features. Table 6 shows the results with respect to accuracy, precision, recall, and F1 score. It can be observed that the proposed RF + SVM surpasses the performance of the other models with 99% accuracy, and 100% for precision, recall, and F1 score. When trained on the CNN features, the proposed model obtains 6% higher accuracy than when trained on the original features. With the large feature set obtained using the CNN model, the ensemble model fits the data better and shows better performance. The ETC + LR shows the lowest results with 93% accuracy. Overall, the ensemble learning models perform better on the convoluted features than on the original features.
Results of Ensemble Models Using Convoluted Features
Note: RF = random forest; SVM = support vector machine; ETC = extra tree classifier; LR = logistic regression.
Results of k-Fold Cross-Validation
A k-fold cross-validation is also performed to analyze and verify the performance of the proposed model. Cross-validation aims at validating the results from the proposed model and verifying its robustness. Cross-validation is performed to analyze if the model performs well on all the subsets of the data. This study makes use of five-fold cross-validation and the results are given in Table 7. Cross-validation results reveal that the proposed ensemble model provides an average accuracy score of 99.99%, while the average scores for precision, recall, and F1 are 99.98%, 99.96%, and 99.98%, respectively.
Table 7. Results for k-Fold Cross-Validation of the Proposed Ensemble Model
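The five-fold procedure can be sketched as follows: the records are partitioned into five folds, each fold serves once as the test set while the remaining four are used for training, and the per-fold scores are averaged. The labels and the majority-class baseline below are hypothetical stand-ins for the dataset and the proposed ensemble. The folds here are contiguous for simplicity, whereas shuffled or stratified folds are common in practice.

```python
from statistics import mean
from typing import List, Sequence, Tuple

Split = Tuple[List[int], List[int]]


def k_fold_indices(n_samples: int, k: int = 5) -> List[Split]:
    """Partition range(n_samples) into k contiguous folds of near-equal size."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    base, extra = divmod(n_samples, k)
    splits: List[Split] = []
    start = 0
    for fold in range(k):
        stop = start + base + (1 if fold < extra else 0)
        test = list(range(start, stop))
        train = list(range(start)) + list(range(stop, n_samples))
        splits.append((train, test))
        start = stop
    return splits


def majority_baseline_accuracy(labels: Sequence[int],
                               train: List[int],
                               test: List[int]) -> float:
    """Hypothetical stand-in model: predict the most frequent training label."""
    train_labels = [labels[i] for i in train]
    majority = max(set(train_labels), key=train_labels.count)
    return sum(labels[i] == majority for i in test) / len(test)


# Hypothetical severity labels for ten records.
labels = [0, 1, 1, 1, 0, 1, 1, 0, 1, 1]
fold_scores = [majority_baseline_accuracy(labels, train, test)
               for train, test in k_fold_indices(len(labels), k=5)]
average_accuracy = mean(fold_scores)
print(fold_scores, average_accuracy)
```

Reporting the mean over folds, as in Table 7, reduces the dependence of the score on any single train/test split.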
Significance of the Proposed Approach
To demonstrate the significance of the proposed approach, this study performs experiments on another dataset, the Road Accident (United Kingdom [UK]) Dataset ( 36 ), obtained from Kaggle, which comprises more than 1.8 million records of traffic accidents. Table 8 presents the results of the ensemble models using convoluted features on the UK dataset for predicting accident severity. The performance metrics evaluated for each combination of models are accuracy, precision, recall, and F-score.
Table 8. Results of Ensemble Models Using Convoluted Features on UK Dataset
Note: RF = random forest; SVM = support vector machine; ETC = extra tree classifier; LR = logistic regression.
The results reveal that the RF performs well in combination with the SVM, the ETC, and LR, while LR in combination with the ETC and the SVM shows the lowest results on the UK dataset. The RF is an ensemble learning method that combines multiple DTs to make predictions, while the SVM is a powerful supervised learning algorithm used for classification tasks. The CNN's representation and feature extraction capabilities result in improved prediction performance of the RF + SVM for traffic accident severity. The RF + SVM combination achieves the highest performance, with 98.50% accuracy, 99.17% precision, 99.50% recall, and a 99.30% F-score, indicating that it is the most effective ensemble model for predicting accident severity on the UK dataset.
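The four metrics reported for each model combination can be computed from true and predicted labels as in the sketch below. It treats one severity class as the positive class. How the study aggregates the metrics across severity classes is not stated in this section, so that choice, along with the example labels, is an assumption.

```python
from typing import Dict, Sequence


def classification_scores(y_true: Sequence[int],
                          y_pred: Sequence[int],
                          positive: int) -> Dict[str, float]:
    """Accuracy plus precision, recall, and F-score for one positive class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label sequences must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # The F-score is the harmonic mean of precision and recall.
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_score": f_score}


# Hypothetical true and predicted severity labels for eight records.
y_true = [2, 2, 3, 2, 3, 3, 2, 2]
y_pred = [2, 2, 3, 3, 3, 2, 2, 2]
scores = classification_scores(y_true, y_pred, positive=3)
print(scores)
```

In this example six of eight predictions are correct, so accuracy is 0.75, while precision, recall, and F-score for class 3 are each 2/3. Accuracy and the class-level metrics can therefore differ, which is why all four are reported.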
Discussion
This section provides an analysis of the findings of this study and the implications of improved prediction accuracies for accident severity. The results have demonstrated the effectiveness of ensemble models, particularly when leveraging CNN-extracted features, in enhancing the precision of traffic accident severity prediction. CNNs are excellent at capturing hierarchical patterns in data. Traffic accident data may have complex spatial patterns and relationships between different features. By applying a CNN, the model can identify and extract relevant spatial patterns that are difficult to capture with traditional methods. Traffic accident data can have various sources of noise and variations because of factors such as road conditions, lighting, camera angles, and so forth. CNNs have been shown to be robust in handling such variations, which can lead to more robust and accurate predictions. CNNs are capable of learning complex non-linear relationships between the input data and the target variable. Overall, the ensemble model benefits from the CNN’s representation and feature extraction capabilities, resulting in improved prediction performance for traffic accident severity.
Accurate accident severity predictions enable traffic engineers to proactively identify high-risk locations and prioritize safety interventions effectively. By understanding the influential factors contributing to accident severity, traffic engineers can design targeted and data-driven traffic management strategies, implement appropriate road safety measures, and optimize traffic infrastructure to enhance overall road safety. For field engineers, the study’s potential applications are equally significant. Accurate predictions of accident severity can aid field engineers in identifying and addressing potential road hazards, thereby minimizing the risk of severe accidents. This predictive capability allows field engineers to conduct targeted road inspections, implement timely maintenance measures, and improve road infrastructure in critical areas.
Limitations of the Proposed Framework
This research presents a valuable contribution to the field of traffic accident severity prediction, but it is crucial to acknowledge several inherent limitations in its approach. Firstly, deep learning models, such as CNNs, are renowned for their black-box nature, making it challenging to explain or visualize how predictions are made. The lack of interpretability can hinder the model’s transparency and usability, especially in applications where understanding the rationale behind predictions is vital. Secondly, the paper may not extensively validate the model’s performance on external datasets or in real-world conditions. Robust validation against diverse scenarios and comparison with existing models is essential to assess its true effectiveness and generalizability.
Conclusion
This study has demonstrated the critical importance of predicting accident severity to enhance public health and safety, given that traffic accidents are a major cause of injuries, fatalities, and property damage. The authors have proposed an ensemble model comprising the RF and SVM, trained on features extracted from a customized CNN model, for accurate accident severity prediction. Multiple experiments were conducted to compare the model's performance using original features versus CNN features. The results clearly favor the CNN features, which lead to an accuracy of 99.99%, along with consistent precision, recall, and F1 score values. A k-fold cross-validation further confirmed the model's robustness.
The implications of this research for traffic engineers and field engineers are significant. The proposed ensemble model’s high accuracy can enable more effective and proactive accident management strategies, aiding in resource allocation, optimizing traffic flow, and improving emergency response times. By predicting accident severity, this model can potentially save lives and prevent injuries, making it a valuable tool for traffic safety applications.
The proposed model has certain limitations for real-world implementation. It heavily relies on accurate and comprehensive accident data, and limited or biased datasets may affect its generalization and accuracy across different regions and road conditions. In addition, the computational resources needed for the CNN-based model could be challenging in resource-constrained or real-time scenarios. To address these challenges, further research is needed to obtain diverse accident data, handle evolving traffic patterns, and develop efficient algorithms for real-time predictions. Overcoming these limitations can make the ensemble model a powerful tool for traffic engineers and field engineers, contributing significantly to accident prevention and road safety.
Footnotes
Author Contributions
The authors confirm contribution to the paper as follows: writing (original draft): N. Abuzinadah, M. Umer, S. Tahir; methodology: N. Abuzinadah, S. Tahir; resources: N. Abuzinadah, X. Chen; funding acquisition: T. Aljrees; project administration: T. Aljrees, A.A. Eshmawi; writing (review and editing): T. Aljrees, I. Ashraf; formal analysis: X. Chen, M. Umer; data curation: X. Chen, O.I. Aboulola; conceptualization: M. Umer, O.I. Aboulola; investigation: O.I. Aboulola, K. Alnowaiser; visualization: S. Tahir, A.A. Eshmawi; software: A.A. Eshmawi, K. Alnowaiser; validation: K. Alnowaiser, I. Ashraf; supervision: I. Ashraf. Study conception and design: N. Abuzinadah, M. Umer, and S. Tahir; data collection: S. Tahir, X. Chen, T. Aljrees, A.A. Eshmawi, O.I. Aboulola, and K. Alnowaiser; analysis and interpretation of results: M. Umer, X. Chen, A.A. Eshmawi, O.I. Aboulola, K. Alnowaiser, and I. Ashraf; draft manuscript preparation: N. Abuzinadah, T. Aljrees, and I. Ashraf. All authors reviewed the results and approved the final version of the manuscript.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
