Instantaneous Incident Detection System Based on Analysis of Acoustic Signal from Crash and Skid in Tunnel

An acoustic signal-based tunnel accident detection system was developed in this study. In a tunnel environment, the sound diffusion effect is minimized, which makes it easier to discriminate accident sounds (crash and skid) from other noises. The system is composed of three parts: an algorithm, a field device, and a center system. To distinguish accident-related acoustic signals such as a crash or skid from the various other sounds in a tunnel, an algorithm was created that discriminates those signals from the normal signals generated by moving vehicles. The developed algorithm processes acoustic signals to filter out noise and to identify accident-related signals. The field device, installed in a tunnel, collects analog sounds, transforms them into digital signals, and transmits the digital signals to the server in the tunnel traffic management center. Lastly, in the tunnel traffic management center, the acoustic signal processing algorithm, installed in a server system, detects accidents instantaneously. Once confirmed by the system operators, information on the detected accidents is intended to be provided to following drivers as well as relevant agencies to prevent secondary accidents and to enable a prompt response. The developed system was evaluated in a real tunnel environment using traffic accident sounds acquired from real crash tests. The detection rates were 95, 91, and 80% at distances of 10, 30, and 50 m, respectively, with a detection time of less than 1.4 s. Compared to conventional detection systems using loop detectors or video images, which have a long detection time of around 1 min, the developed system can be regarded as superior in that it has an extremely short detection time, one of the most important factors for automatic incident detection systems.


Introduction
In tunnels, vehicles generally experience higher vibration while changing lanes due to high air resistance compared to other sections of the highway; this, coupled with the limited shoulder area, can cause a high rate of traffic accidents. The Mont Blanc accident in 1999, involving a truck that caught fire while colliding with other vehicles, took 39 lives. Since the tragedy, instantaneous detection of accidents in tunnels has attracted substantial interest worldwide. In Korea, more than 600 accidents occur in tunnels annually, and the rate of accidents is on the rise due to the increasing number of tunnels on the roads. The rate has increased by 31.2% in the past five years, causing great concern to road agencies (Kim and Lee, 2004). In Korea, every tunnel longer than 1 km is equipped with a tunnel traffic management system as illustrated in Fig. 1. In the system, accidents are first detected by vehicle detection systems (VDSs), followed by confirmation using closed circuit television (CCTV). Once confirmed, strategies for managing the traffic flow in a safe and efficient manner are subsequently executed using variable message signs (VMSs) and lane control systems (LCSs). Negative effects caused by accidents can be minimized when they are detected as early as possible. Conventionally, traffic accidents in tunnels have been automatically detected using incident detection algorithms whose input is traffic data from VDSs (Balke, 1993; Castro-Neto et al., 2012; Browne et al., 2005). Among the widely recognized techniques are the California, All Purpose Incident Detection (APID), and McMaster algorithms. The fundamental logic behind these algorithms is identifying abnormal traffic flow characteristics that occur in the aftermath of traffic accidents. Generally, many parameters must be predefined to identify the abnormality.
Unfortunately, these parameters are not easily calibrated and are not spatially transferable either (Abdulhai and Ritchie, 1999). Owing to this lack of universality, only 12.5% of traffic management centers claimed to have been using a fully functional incident detection algorithm (William et al., 2007).
From the mid-2000s, automated incident detection using video image sensors has been attracting interest in some advanced countries (Prevedouros et al., 2006). An automated video incident detection system can identify a wide range of incidents including crashes, stopped vehicles, pedestrians, fire, vehicles driving the wrong way, and so on (Fahrtash, 2012). It also has the ability to enable operators to conduct instantaneous verification of the type and severity of the incident. However, due to the utilization of video image sensors, the deterioration of performance under sun glare, changing illumination, and dust conditions still makes reliable and instantaneous detection of tunnel accidents a challenging task. Moreover, the limited installation height (approx. 4 m) in tunnels prevents it from detecting incidents in cases where an incident is obscured by tall vehicles.
To resolve the above-mentioned problems in the existing techniques, an acoustic signal-based tunnel accident detection system (AADS) was developed in this study. In a tunnel environment, the sound diffusion effect is minimized, so the discrimination of accident sounds (crash and skid) from others can be performed more easily than in other sections of roadway. The system is composed of three parts: the algorithm, field device, and center system. To identify accident-related acoustic signals such as a crash or skid among other sounds in the tunnel, an algorithm using nonnegative tensor factorization (NTF) and a hidden Markov model (HMM) was proposed. To collect sounds in the tunnel and to process the collected acoustic signals for transmission to a server system in the tunnel traffic management center, an aesthetically designed field device was developed. To operate the proposed algorithm, a center system with dedicated communication protocols was established.

Acoustic Signal Processing Algorithm
The proposed acoustic signal processing algorithm, as shown in Fig. 2, exploits NTF and HMM techniques to identify incident-related acoustic signals. The algorithm, initially suggested by Jeon et al. (2016), first detects multiple acoustic events by utilizing channel gains obtained from the NTF technique. Subsequently, an HMM-based likelihood ratio test is performed to verify the detected events. Since it was first introduced by Shashua and Hazan in 2005, the NTF technique has been used by researchers to discriminate sounds from different sources (FitzGerald et al., 2005). HMM, recognized as a ubiquitous tool for modeling time series events, is a technique to represent probability distributions over sequences of observations. It has been extensively used in a wide range of pattern recognition systems, including speech, biology, and computer vision (Ghahramani, 2001). The whole process of the proposed algorithm is initiated by receiving sounds from a field device that collects various sounds in the tunnel. Then, the sounds are processed by applying a short-time Fourier transform (STFT) and a Mel filter bank, composed of a series of overlapping triangular filters defined by their center frequencies, to acquire the Mel-spectral magnitude. After that, the NTF technique is applied to improve the signal-to-noise ratio (SNR) by discriminating incident-associated sounds from other sounds, and subsequently to initially detect incidents using the mean-to-max threshold on the channel gain of acoustic signals. Fig. 3 shows the noise-filtering process by the NTF algorithm on tunnel incidents (crash and skid). The improvements in SNR for sounds (30 samples) gathered at distances of 10 and 50 m from the sound generator were 14 and 17 dB on average, indicating that the proposed algorithm can be effectively applied to a tunnel environment where the principal noises are composed of echoes from the wall.
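The front end described above (STFT followed by a bank of overlapping triangular Mel filters) can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the sampling rate, FFT size, hop size, number of filters, and the common 2595·log10(1 + f/700) Mel formula are all assumptions, since the text does not state them, and the NTF stage is not reproduced here.

```python
import numpy as np

def hz_to_mel(f):
    # Common Mel-scale formula (an assumption; the paper does not specify one).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filter_bank(n_filters, n_fft, sample_rate):
    """Overlapping triangular filters, shape (n_filters, n_fft // 2 + 1)."""
    # n_filters + 2 points equally spaced on the Mel scale: edges and centers.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                             n_filters + 2)
    hz_points = mel_to_hz(mel_points)
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    bank = np.zeros((n_filters, freqs.size))
    for i in range(n_filters):
        left, center, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (freqs - left) / (center - left)
        falling = (right - freqs) / (right - center)
        # Triangle: minimum of the two slopes, clipped at zero outside the band.
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def stft_magnitude(signal, n_fft, hop):
    """Magnitude STFT with a Hann window, shape (n_frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[k * hop:k * hop + n_fft] * window
                       for k in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

def mel_spectral_magnitude(signal, sample_rate, n_fft=512, hop=256, n_filters=26):
    """Mel-spectral magnitude, shape (n_frames, n_filters)."""
    mag = stft_magnitude(signal, n_fft, hop)
    bank = mel_filter_bank(n_filters, n_fft, sample_rate)
    return mag @ bank.T

if __name__ == "__main__":
    sr = 16000  # hypothetical sampling rate
    t = np.arange(sr) / sr
    tone = np.sin(2 * np.pi * 1000.0 * t)  # 1 kHz test tone
    mel = mel_spectral_magnitude(tone, sr)
    print(mel.shape, int(np.argmax(mel.mean(axis=0))))
```

In the pipeline described above, a matrix of this kind from each microphone channel would be the nonnegative input that the NTF stage factorizes.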
The incidents initially detected by the NTF are finally verified by the HMM-based likelihood ratio test. The proposed algorithm can be regarded as superior to the existing methods (Gemmeke et al., 2013; Valenzise et al., 2007) in that it performs a secondary verification using the HMM to minimize the false alarm rate while still maintaining a reliable detection rate; this secondary verification strategy was not considered in the former studies.
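As a rough illustration of an HMM-based likelihood ratio test, the sketch below scores an observation sequence under an "incident" model and a "background" model with the forward algorithm and accepts the event when the log-likelihood ratio exceeds a threshold. The discrete two-symbol observations, the toy model parameters, and the zero threshold are hypothetical values chosen for the example; the paper's actual features, model topology, and threshold are not given in the text.

```python
import numpy as np

def log_forward(obs, log_pi, log_A, log_B):
    """Return log P(obs | model) for a discrete-observation HMM.

    log_pi: (S,) initial state log-probabilities
    log_A:  (S, S) transition log-probabilities, row = from, column = to
    log_B:  (S, K) emission log-probabilities over K symbols
    """
    alpha = log_pi + log_B[:, obs[0]]
    for o in obs[1:]:
        # Log-sum-exp over the previous state for every next state.
        alpha = np.logaddexp.reduce(alpha[:, None] + log_A, axis=0) + log_B[:, o]
    return np.logaddexp.reduce(alpha)

def likelihood_ratio_test(obs, incident_model, background_model, threshold=0.0):
    """Accept the event if log P(obs|incident) - log P(obs|background) > threshold."""
    llr = log_forward(obs, *incident_model) - log_forward(obs, *background_model)
    return llr > threshold, llr

def make_model(pi, A, B):
    return tuple(np.log(np.asarray(x, dtype=float)) for x in (pi, A, B))

# Toy models (hypothetical): symbol 1 stands for an "incident-like" frame.
INCIDENT = make_model([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.2, 0.8], [0.1, 0.9]])
BACKGROUND = make_model([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.9, 0.1], [0.8, 0.2]])

if __name__ == "__main__":
    for seq in ([1, 1, 1, 0, 1], [0, 0, 0, 0, 1]):
        accepted, llr = likelihood_ratio_test(seq, INCIDENT, BACKGROUND)
        print(seq, accepted, round(float(llr), 3))
```

Raising the threshold trades detection rate for a lower false alarm rate, which is the role the secondary verification plays in the description above.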

Field Device
The main function of the field device is to digitalize the analog input acoustic signals and to transmit the digitalized signals to a traffic management center equipped with an acoustic signal processing system. The aesthetically designed field device, as shown in Fig. 4, is composed of two microphones, a video camera, and other interfaces. The microphones, with the capabilities of a signal-to-noise ratio of 65 dB, sensitivity of -40 dB, and dynamic range of 30 to 120 dB, collect analog sounds in the tunnel. The video camera with a pan/tilt/zoom function aims to verify the accident detected by the acoustic signals. Several interfaces including Ethernet, USB, a speaker, and a temperature sensor are also equipped for various functions such as debugging, transmitting the digitalized acoustic signals, alarming verified accidents on the road ahead, and monitoring environmental conditions.
The firmware for the field device was written in C for an ARM Cortex-M7. It consists of two threads, audio and network: the audio thread receives sounds via the microphones and stores them in SDRAM, and the network thread, acting as a TCP/IP server, transmits the acoustic data stored in the SDRAM to a server in the traffic management center whenever that server requests them. Hence, the TCP/IP server remains on standby with a socket open until a client requests a connection.
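The hand-off between the two threads can be pictured with the sketch below. It is written in Python for illustration only (the actual firmware is in C), and the class name `AudioRingBuffer`, the helper `handle_request`, and the bounded-buffer policy of dropping the oldest frame when full are all assumptions, not details taken from the firmware. The socket layer is omitted; `handle_request` stands in for what the network thread would send when the center server asks for data.

```python
import threading
from collections import deque

class AudioRingBuffer:
    """Bounded frame store shared by an audio thread and a network thread."""

    def __init__(self, capacity_frames):
        # deque with maxlen silently drops the oldest frame when full
        # (an assumed policy for this sketch).
        self._frames = deque(maxlen=capacity_frames)
        self._lock = threading.Lock()

    def write(self, frame: bytes) -> None:
        """Called by the audio thread for every captured frame."""
        with self._lock:
            self._frames.append(frame)

    def read_all(self) -> list:
        """Called by the network thread; returns and clears buffered frames."""
        with self._lock:
            frames = list(self._frames)
            self._frames.clear()
        return frames

def handle_request(buffer: AudioRingBuffer) -> bytes:
    """Payload the network thread would send back for one client request."""
    return b"".join(buffer.read_all())

if __name__ == "__main__":
    buf = AudioRingBuffer(capacity_frames=2)
    for chunk in (b"a", b"b", b"c"):
        buf.write(chunk)
    print(handle_request(buf))  # oldest frame was dropped
    print(handle_request(buf))  # nothing new since the last request
```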

Center System
The center system consists of two servers: an acoustic signal analysis server and a traffic management server. The acoustic signal analysis server, on which the acoustic signal processing algorithm described above operates, receives digitalized acoustic signals framed in units of 84 ms from the field devices and detects incidents. The frame length of 84 ms was chosen because the algorithm showed the best performance with it, which also agrees with a former study (Harlow and Wang, 2002). Once an incident is detected, the information regarding the detected incident is intended to be transmitted to the traffic management server that controls all of the tunnel traffic management devices including VMSs, LCSs, CCTVs, and VDSs.
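To make the 84 ms framing concrete, the sketch below splits a sample stream into consecutive 84 ms frames. The 16 kHz sampling rate is an assumption for the example (the text does not state the rate), as is the choice of non-overlapping frames with the incomplete tail discarded.

```python
import numpy as np

def frame_signal(samples, sample_rate, frame_ms=84):
    """Split a 1-D sample array into non-overlapping frames of frame_ms each.

    Returns an array of shape (n_frames, samples_per_frame); samples that do
    not fill a complete final frame are discarded (an assumed policy).
    """
    samples_per_frame = int(round(sample_rate * frame_ms / 1000))
    n_frames = len(samples) // samples_per_frame
    usable = samples[:n_frames * samples_per_frame]
    return usable.reshape(n_frames, samples_per_frame)

if __name__ == "__main__":
    sr = 16000                      # hypothetical sampling rate
    one_second = np.zeros(sr)
    frames = frame_signal(one_second, sr)
    print(frames.shape)             # 84 ms at 16 kHz is 1344 samples per frame
```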

Experimental Setup for Evaluation
The system described above was evaluated under real tunnel conditions, as illustrated in Fig. 6. According to the Korean guidelines that stipulate the installation, operation, and maintenance of tunnel traffic management systems, accident detection devices are recommended to be installed at 100 m spacing. Since the field devices developed in this study can collect incident sounds bi-directionally, the maximum distance for the evaluation was set at 50 m. The incident sounds for the evaluation, composed of 200 crash and 37 skid sounds, were obtained from widely recognized organizations such as the Euro New Car Assessment Program and the Insurance Institute for Highway Safety. The sounds were generated using a speaker at sound pressure levels similar to those of the real sounds. According to a study (Neale et al., 2008), the sound pressure levels for vehicle crashes and skids lie in the ranges of 110-130 and 90-100 dB-SPL, respectively. As expressed in Equations 1 to 3, three broadly employed evaluation indexes for incident detection systems, the detection rate (DR), false-alarm rate (FAR), and mean time to detection (MTTD), were used for numerical evaluations.
Table 2 shows the evaluation results at each distance from the sound generator (speaker). The DRs were 95.36, 91.56, and 80.43%; the FARs were 2.63, 3.00, and 3.56%; and the MTTDs were 1.24, 1.30, and 1.39 s for the distances of 10, 30, and 50 m, respectively. Compared to FAR and MTTD, DR varied considerably with distance, decreasing as the field device becomes further from the sound generator. By contrast, no notable differences were observed for FAR and MTTD. The small variation in FAR may be highly significant because, according to a survey, the reluctance of road agencies to deploy incident detection systems is normally attributed to high FARs (Hancocks et al., 2011). Fig. 6 shows three-dimensional locations for the performances at each distance compared to the perfect score (DR of 100%, FAR of 0%, and MTTD of 0 s).
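The three evaluation indexes can be computed as in the sketch below. The definitions used here are the ones commonly found in the incident detection literature and are assumptions about the paper's Equations 1 to 3, which are not reproduced in this text: DR as detected incidents over total incidents, FAR as false alarms over the total number of algorithm decisions, and MTTD as the average delay between occurrence and detection. The FAR denominator in particular varies between studies. The input numbers in the example are made up.

```python
from statistics import mean

def detection_rate(n_detected, n_incidents):
    """DR (%) = detected incidents / total incidents * 100 (assumed definition)."""
    return 100.0 * n_detected / n_incidents

def false_alarm_rate(n_false_alarms, n_decisions):
    """FAR (%) = false alarms / total algorithm decisions * 100.

    The denominator is an assumption; some studies use total alarms instead.
    """
    return 100.0 * n_false_alarms / n_decisions

def mean_time_to_detection(occurrence_times, detection_times):
    """MTTD (s) = mean of (detection time - occurrence time) over detected incidents."""
    return mean(d - o for o, d in zip(occurrence_times, detection_times))

if __name__ == "__main__":
    # Made-up inputs for illustration only.
    print(detection_rate(95, 100))
    print(false_alarm_rate(3, 100))
    print(mean_time_to_detection([0.0, 10.0], [1.2, 11.4]))
```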

Discussion
As stated above, the results of the initial field test were considered satisfactory even when compared to the existing mature technologies. As shown in Table 3, the most encouraging advantage of the developed system over the two conventional systems is its MTTD, a factor that is crucial for incident detection systems in preventing secondary accidents (Ozbay et al., 1999). Furthermore, the developed system has merits from the perspective of operations: it can be calibrated more easily than the conventional systems, and it is not subject to tunnel-specific restraints (installation height restraint or influence of occlusion).
Table 3 notes: a. Transportation Research Record, No. 1925. b. Source: Video Incident Detection Tests in Freeway Tunnels, Transportation Research Record, No. 1959, 2006.
However, no magic bullet exists for detecting incidents in tunnels. Although the developed acoustic-based system can resolve some deficiencies in traditional systems, it cannot cover every type of accident that may occur in a tunnel. Among the 11 identified incident types in a tunnel, as listed in Table 3, it can only cover three: crash, skid, and flat tire. Hence, the developed system is recommended to be deployed in combination with the conventional systems as a complete solution for instantaneously detecting incidents in tunnels.

Conclusions and Future Studies
Thanks to advances in tunnel construction technologies such as tunnel boring machine methods, road agencies in mountainous areas increasingly opt for tunnel construction rather than damaging the environment by constructing roads directly on mountains. Recently, underground roads have been actively constructed in the Seoul metropolitan region, South Korea. Safe and efficient tunnel traffic management is therefore becoming increasingly important. One of the core elements of tunnel traffic management is the prompt detection of incidents to prevent secondary accidents and to rescue injured people as soon as possible. However, traditional technologies including VDS-based incident detection algorithms and video image-based incident detection systems have some deficiencies in terms of their instantaneous detection capability and ease of parameter calibration. To resolve these shortcomings, an incident detection system based on analysis of acoustic signals from tunnels was developed in this study.
The developed system is composed of three parts: the acoustic signal processing algorithm, field device, and center system. An algorithm based on NTF and HMM techniques to identify incident sounds (crash and skid) was proposed, and its capability to increase the SNR was verified. The improvement in SNR is essential to enhance the performance of incident detection. The aesthetically designed field device collects, digitalizes, and transmits various sounds in the tunnel. It also verifies the actual occurrence of incidents using a built-in video camera with a pan/tilt/zoom function. The center system, composed of servers, algorithms, and communication protocols, detects incidents in the tunnel and manages the traffic with an appropriate mitigation strategy. A field test of the developed system using 237 recorded incident sounds revealed an encouraging outcome with a DR of 80-95%, FAR of 2.6-3.6%, and MTTD of less than 1.4 s; the MTTD in particular compares favorably with the existing mature technologies, as discussed in the previous section.
The research project for developing this system consists of three stages and is currently at the second stage. The tasks for the third stage include verifying the performance using real incident cases in tunnels; this will consume substantial time and effort until sufficient samples are acquired to reach a statistically significant result. Also, the proposed acoustic signal processing algorithm could be further enhanced by employing a more advanced pattern-matching approach such as deep learning, but only when a substantial amount of real-world incident sounds becomes available. Other incident-related sounds from flat tires, bangs, and sirens could be considered for broadening the capability to detect incidents. Only two acoustic sensors (microphones) were used in this study, which is the normal case for an acoustic signal gathering system (Jeon and Kim, 2017). However, applying more sensors might improve the performance of the proposed system, and this is intended to be tried in the research stage that follows.