Fatal Crash Occurrence Prediction and Pattern Evaluation by Applying Machine Learning Techniques

Saumik Sakib Bin Masud1, Abid Hossain2, *, Nazifa Akter1, Hemin Mohammed3
1 Department of Civil, Environmental, and Architectural Engineering, University of Kansas, Lawrence, Kansas 66045, United States
2 Department of Civil and Environmental Engineering, Florida International University, Miami, Florida 33174, United States
3 Turners Department of Civil & Environmental Engineering, Trine University, Angola, Indiana 46703, United States

Creative Commons License
© 2024 The Author(s). Published by Bentham Open.

open-access license: This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 International Public License (CC-BY 4.0), a copy of which is available at: This license permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

* Address correspondence to this author at the Department of Civil and Environmental Engineering, Florida International University, Miami, Florida 33174, United States; E-mail:



Highway safety remains a significant issue, with road crashes being a leading cause of fatalities and injuries. While several studies have been conducted on crash severity, few have analyzed and predicted specific types of crashes, such as fatal crashes. Identifying the key factors associated with fatal crashes and predicting their occurrence can help develop effective preventative measures.


This study aimed to develop cluster analysis and machine learning (ML) models from crash data in order to extract the prominent factors behind fatal crash occurrence and to analyze the inherent patterns among the variables that contribute to fatal crashes.


Several categories of supervised ML models were implemented for fatality prediction, and their results were compared. SHAP analysis was then applied to the ML model to explore the factors contributing to fatal crashes. Additionally, the hidden patterns underlying fatal crashes were evaluated using K-means clustering, and specific fatal crash scenarios were extracted.
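As a rough illustration of the clustering step, the sketch below runs a plain K-means (Lloyd's algorithm) on a handful of synthetic, binary-coded crash records. It is not the authors' implementation: the feature names, the toy records, the choice of two clusters, and the helper names `kmeans` and `squared_distance` are all assumptions made for this example, and a real analysis would more likely rely on an established library.

```python
import random

# Hypothetical binary crash attributes (illustrative only, not the study's variables).
FEATURES = ["night", "two_lane", "head_on", "motorcycle"]

# Synthetic toy records, one tuple per crash, in the order given by FEATURES.
crashes = [
    (1, 1, 1, 0), (1, 1, 1, 0), (1, 1, 0, 0), (1, 1, 1, 0),
    (0, 0, 0, 1), (0, 0, 0, 1), (0, 1, 0, 1), (0, 0, 0, 1),
]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    # Initialise centroids from k randomly chosen records.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each record joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        # An empty cluster keeps its previous centroid.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        # Stop once the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids


labels, centroids = kmeans(crashes, k=2)

# Each centroid coordinate is the share of cluster members with that attribute,
# which can be read as a rough "scenario profile" for the cluster.
for c, centroid in enumerate(centroids):
    profile = ", ".join(f"{name}={value:.2f}" for name, value in zip(FEATURES, centroid))
    print(f"cluster {c}: {profile}")
```

Because the attributes are coded 0/1, each centroid coordinate is simply the proportion of crashes in that cluster showing the attribute, which is one way a cluster can be summarized as a representative scenario.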


The deep neural network model achieved 85% accuracy in predicting fatal crashes in Kansas. Factors such as speed limits, nighttime, darker road conditions, two-lane highways, highway interchange areas, motorcycle and tractor-trailer involvement, and head-on collisions were found to be influential. Moreover, the clusters identified specific fatal crash scenarios.


The study provides a clear picture of the important factors related to fatal crashes, which can be used to create new safety protocols and countermeasures to reduce them. The results of the cluster analysis give transportation professionals representative scenarios that can help identify potential fatal crash conditions.

Keywords: Transportation, Fatal crash, Machine learning, Clustering, Prediction, Vision Zero.