Semi and Nonparametric Conditional Probability Density, a Case Study of Pedestrian Crashes

Mahdi Rezapour1, *, Khaled Ksaibati1
1 Wyoming Technology Transfer Center, 1000 E. University Avenue, Department 3295, Laramie, WY 82071

Creative Commons License
© 2021 Rezapour and Ksaibati

open-access license: This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 International Public License (CC-BY 4.0), a copy of which is available at: This license permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

* Address correspondence to this author at the Wyoming Technology Transfer Center 1000 E University Avenue Department 3295 Laramie, WY 82071; E-mail:



Kernel-based methods have gained popularity because the distribution of a model's residuals may not follow any classical parametric distribution. These methods have been extended to estimate conditional densities, rather than conditional distributions, when the data include both discrete and continuous attributes. The approach relies on selecting an optimal smoothing parameter for each attribute. Consequently, when an explanatory variable is independent of the dependent variable, the nonparametric method effectively drops that attribute by assigning it a large smoothing parameter, which gives it a uniform distribution so that its contribution to the model's variance is minimal.
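The effect of a large smoothing parameter can be illustrated with a minimal sketch. It assumes a Gaussian product kernel and a simple kernel-weighted estimate of P(y = 1 | x); the function names, the simulated data, and the bandwidth values are hypothetical and are not taken from the study. When the bandwidth of the irrelevant predictor is very large, its kernel weight is nearly constant across observations and cancels out, so the estimate matches the one obtained with that predictor removed.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard Gaussian kernel."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def conditional_prob(x0, X, y, h):
    """Kernel-weighted estimate of P(y = 1 | x = x0) with a product kernel."""
    u = (X - x0) / h                         # scaled distances, one column per predictor
    w = np.prod(gaussian_kernel(u), axis=1)  # product kernel weight per observation
    return np.sum(w * y) / np.sum(w)

# Simulated data (assumption): one relevant predictor and one pure-noise predictor.
rng = np.random.default_rng(0)
n = 500
x_rel = rng.normal(size=n)
x_noise = rng.normal(size=n)
y = (rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-2.0 * x_rel))).astype(float)
X = np.column_stack([x_rel, x_noise])
x0 = np.array([0.5, -1.0])

# A huge bandwidth on the noise predictor makes its weights effectively uniform.
p_large_h = conditional_prob(x0, X, y, h=np.array([0.3, 1e6]))
# The same estimate with the noise predictor removed altogether.
p_dropped = conditional_prob(x0[:1], X[:, :1], y, h=np.array([0.3]))

print(p_large_h, p_dropped)
```

The two printed values agree to many decimal places, which is the sense in which an oversmoothed attribute is "dropped".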


The objective of this study was to identify factors contributing to the severity of pedestrian crashes using an unbiased method. In particular, the study evaluated the applicability of semi-parametric and nonparametric kernel-based techniques to the crash dataset by means of confusion matrices.


In this study, two kernel-based methods, one nonparametric and one semi-parametric, were implemented to model the severity of pedestrian crashes. Estimation of the semi-parametric densities is based on adaptive local smoothing and maximization of a quasi-likelihood function, which is somewhat similar to the likelihood of the binary logit model. The nonparametric method, on the other hand, selects the smoothing parameters used in estimating the conditional probability density function so as to minimize the mean integrated squared error (MISE). The performance of the models was evaluated by their predictive power. Standard logistic regression was also employed as a benchmark for comparison. Although these methods have been used in other fields, this is one of the earliest studies to apply them in the context of traffic safety.
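The quasi-likelihood idea behind a single-index model can be sketched as follows. This is only an illustration under stated assumptions: it uses a fixed bandwidth `h` rather than the adaptive local smoothing described above, a Gaussian kernel, and simulated data; the names `loo_index_regression` and `quasi_log_likelihood` are hypothetical. The link function g(v) = P(y = 1 | x'β = v) is estimated by leave-one-out kernel regression on the index, and the resulting probabilities are plugged into a Bernoulli log-likelihood of the same form as the binary logit likelihood.

```python
import numpy as np

def loo_index_regression(index, y, h):
    """Leave-one-out kernel estimate of g(v) = P(y = 1 | index = v)."""
    diff = (index[:, None] - index[None, :]) / h
    w = np.exp(-0.5 * diff**2)        # Gaussian kernel weights (constant factor cancels)
    np.fill_diagonal(w, 0.0)          # leave each observation out of its own fit
    return (w @ y) / w.sum(axis=1)

def quasi_log_likelihood(beta, X, y, h=0.3):
    """Bernoulli log-likelihood with the link estimated nonparametrically."""
    g = loo_index_regression(X @ beta, y, h)
    g = np.clip(g, 1e-6, 1.0 - 1e-6)  # guard against log(0)
    return np.sum(y * np.log(g) + (1.0 - y) * np.log(1.0 - g))

# Simulated data (assumption): the outcome depends on X only through X @ beta_true.
rng = np.random.default_rng(1)
n = 400
X = rng.normal(size=(n, 2))
beta_true = np.array([1.0, -1.0])
y = (rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-2.0 * (X @ beta_true)))).astype(float)

ll_true = quasi_log_likelihood(beta_true, X, y)
ll_wrong = quasi_log_likelihood(np.array([1.0, 1.0]), X, y)  # orthogonal direction
print(ll_true, ll_wrong)
```

In this simulated setting the quasi-likelihood is higher at the true index direction than at an uninformative one. An estimator would search over β (with a scale normalization, since the index is identified only up to scale) to maximize this quantity.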


The results highlighted that the nonparametric kernel-based method outperformed the semi-parametric (single-index) model and the standard logit model based on the confusion matrices. To gain insight into how the bandwidth selection method removes irrelevant attributes in the nonparametric approach, we added noisy predictors to the models and compared the results. The methodological approach of the models is discussed extensively in the body of this study.
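A comparison based on confusion matrices can be sketched as below. The labels, predicted probabilities, and 0.5 classification threshold are made-up illustrative values, not results from the study, and `confusion_matrix` is a hypothetical helper written for this example.

```python
import numpy as np

def confusion_matrix(y_true, y_prob, threshold=0.5):
    """2x2 confusion matrix: rows are true classes, columns are predicted classes."""
    y_true = np.asarray(y_true).astype(int)
    y_pred = (np.asarray(y_prob) >= threshold).astype(int)
    cm = np.zeros((2, 2), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

# Toy labels and predicted probabilities (assumption, for illustration only).
y_true = [0, 0, 1, 1, 1, 0]
y_prob = [0.2, 0.6, 0.8, 0.4, 0.9, 0.1]

cm = confusion_matrix(y_true, y_prob)
accuracy = np.trace(cm) / cm.sum()   # share of correctly classified observations
print(cm)
print(accuracy)
```

Computing the same matrix for each candidate model on the same observations gives a like-for-like basis for comparing their predictive performance.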


To summarize, alcohol and drug involvement, driving on a non-level grade, and poor lighting conditions are among the factors that increase the likelihood of a severe pedestrian crash. This is one of the earliest studies to implement these methods in the context of transportation problems. The nonparametric method is especially recommended in the field of traffic safety when the importance of the predictors is uncertain, because the technique automatically drops unimportant predictors.

Keywords: Pedestrian crashes, Nonparametric density estimation, Smoothing parameter, Traffic safety, Single-index model, Models.