Deep 3D Dynamic Object Detection towards Successful and Safe Navigation for Full Autonomous Driving

Patikiri Arachchige Don Shehan Nilmantha Wijesekara1, *
1 Department of Electrical and Information Engineering, Faculty of Engineering, University of Ruhuna, Galle, 80000, Southern Province, Sri Lanka

Creative Commons License
© 2022 Patikiri Arachchige Don Shehan Nilmantha Wijesekara

open-access license: This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 International Public License (CC-BY 4.0), a copy of which is available at: This license permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

* Address correspondence to this author at the Department of Electrical and Information Engineering, Faculty of Engineering, University of Ruhuna, Galle, 80000, Southern Province, Sri Lanka; E-mail:



Infractions other than collisions are also crucial in autonomous driving because they can lead to accidents. Most existing work has focused on navigation and collisions; fewer studies have addressed other infractions, such as off-road driving and failing to obey road signs. Furthermore, state-of-the-art driving models have not performed dynamic 3D object detection in the imitation learning stage, so the performance of such a model is unknown. No research has investigated the computational complexity of driving models.


The objective of this research is to study the effect of 3D dynamic object detection for autonomous driving and derive an optimized driving model with superior performance for navigation and safety benchmarks.


We propose two driving models. The first, Conditional Imitation Learning Dynamic Objects (CILDO), is an imitation learning-based model that performs dynamic object detection using image segmentation, depth prediction, and speed prediction. The second, Conditional Imitation Learning Dynamic Objects Low Infractions-Reinforcement Learning (CILDOLI-RL), optimizes this base model with an additional traffic light detection branch and deep deterministic policy gradient (DDPG)-based reinforcement learning.
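Since the second model relies on deep deterministic policy gradient reinforcement learning, the two update rules that characterize that family of methods can be sketched as follows. This is a minimal illustration only and not the author's implementation: the function names, the flat parameter lists, and the default values of `gamma` and `tau` are all assumptions chosen for readability.

```python
from typing import List


def td_target(reward: float, next_q: float, done: bool, gamma: float = 0.99) -> float:
    """Critic regression target: r + gamma * Q'(s', mu'(s')).

    next_q stands for the target critic's value of the target actor's action
    in the next state. No bootstrapping is applied when the episode has ended.
    gamma is an assumed discount factor, not a value taken from the paper.
    """
    return reward + (0.0 if done else gamma * next_q)


def soft_update(target: List[float], online: List[float], tau: float = 0.005) -> List[float]:
    """Polyak averaging of target-network parameters.

    Computes target <- tau * online + (1 - tau) * target, element by element.
    tau is an assumed smoothing rate, not a value taken from the paper.
    """
    if len(target) != len(online):
        raise ValueError("parameter lists must have the same length")
    return [tau * w + (1.0 - tau) * t for t, w in zip(target, online)]
```

In a full training loop the critic would be regressed toward `td_target`, the actor would be updated to increase the critic's value of its own actions, and `soft_update` would be applied to both target networks after each step.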


An ablation study shows that combining image segmentation and depth prediction to enable three-dimensional object vision improves navigation performance compared with making decisions from the image alone. The CILDOLI-RL model presented in this paper achieves the highest score on the newly introduced No-Other-Infraction benchmark and on the No-Crash benchmark. It achieves a moderate score on the Car Learning to Act (CARLA) benchmark in both the training town and the testing town, ensuring safe autonomous driving. The base CILDO model achieves the best navigation performance and moderate scores on the safety benchmarks under dense urban or rural traffic in both towns. Both proposed models are relatively computationally complex.
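To illustrate why pairing a segmentation mask with a depth map yields three-dimensional object information, the sketch below back-projects the pixels of one class into camera coordinates. It assumes a standard pinhole camera model with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`; the paper's networks may fuse the two predictions differently, so this should be read as one possible geometric interpretation rather than the method used in CILDO.

```python
from typing import List, Sequence, Tuple

Point3D = Tuple[float, float, float]


def backproject_class(
    seg: Sequence[Sequence[int]],
    depth: Sequence[Sequence[float]],
    class_id: int,
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point3D]:
    """Return camera-frame 3D points for every pixel labelled class_id.

    seg holds per-pixel class labels and depth holds per-pixel depth along
    the optical axis, both indexed as [row][column]. Pixels with
    non-positive depth are skipped as invalid.
    """
    points: List[Point3D] = []
    for v, (seg_row, depth_row) in enumerate(zip(seg, depth)):
        for u, (label, z) in enumerate(zip(seg_row, depth_row)):
            if label == class_id and z > 0:
                # Pinhole model: x = (u - cx) * z / fx, y = (v - cy) * z / fy.
                points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points
```

For example, with a 2x2 mask in which only the top-right pixel belongs to the class of interest and has depth 2.0, unit focal lengths, and a principal point at the origin, the function returns the single point `(2.0, 0.0, 2.0)`.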


For safety-critical driving, where both navigation performance and safety are crucial, the proposed CILDOLI-RL is the better of the two proposed models. For applications where driving safety is less of a concern, the proposed CILDO is the better model.

Keywords: Autonomous driving, CILDO, Imitation learning, Reinforcement learning, Infractions.