Expected Effects of In-Service Road Safety Reviews

Abstract: Despite the popularity of in-service road safety reviews as a tool for identifying safety problems on roads, very few studies have gauged their benefits. This study analysed collision data for selected in-service road safety review locations in Alberta to examine whether the reviews are associated with any reduction in collisions, and thereby to provide policy makers with some evidence on which to base future investment decisions. Our results showed that the expected reductions in collisions are highly sensitive to the evaluation methodology used.


INTRODUCTION
Road crashes are a major cause of deaths and serious injuries in many countries. Around the world, about 1.2 million people are killed each year on the roads [1]. In the United States, for example, there are more than 42,000 traffic fatalities a year and the annual social cost is estimated at over $230 billion [1]. Similarly, about 3000 road users are killed each year on Canadian roads, resulting in an estimated social cost of about $25 billion [2]. Among the Canadian provinces, Alberta has been experiencing an increase in the number of traffic collisions over the last few years, in contrast to the decreasing trend for the whole country. For example, there were 453 road deaths and 25,964 injuries in 2006 in the Province alone [3], resulting in an estimated annual social cost of over $4.7 billion [4].
In a bid to improve road safety, the Province of Alberta has set a goal to reduce the number of deaths and serious injuries by 30% (Alberta Transportation, 2006b). In addition to the education and enforcement efforts targeting speeding, impaired driving and the use of seat belts, the Alberta Traffic Safety Plan also identified improvements to the physical and operational characteristics of existing roads as an effective method of improving road safety [4]. One of the engineering strategies in the plan calls for increased use of a more proactive approach to identify highway-related crash contributing factors and to develop engineering measures to mitigate the collision risks.
To determine the magnitude of such problems and to take the necessary improvement measures, road safety audits and in-service road safety reviews are increasingly being used. These measures have gained popularity worldwide in the last decade as a means to assess the expected safety performance of newly constructed roads as well as existing roads. Amid the increasing popularity of such tools, the Transportation Association of Canada (TAC) has recently published two guides to conducting road safety audits [5] and in-service road safety reviews [6]. The Alberta Traffic Safety Plan has also recommended maintaining the government's commitment to ongoing road safety audits and in-service road safety reviews to improve road safety.
*Address correspondence to this author at the Faculty of Law and Management, La Trobe University, Melbourne, Victoria, 3086, Australia; Tel: 61-3-9479-1267; Fax: 61-3-9479-3283; E-mail: r.tay@latrobe.edu.au
Despite the increasing use of such reviews and the enormous amount of resources invested to improve safety on Albertan roads, very few studies have been conducted to examine the overall effectiveness of conducting these reviews. Though there have been some studies around the globe on the effectiveness of such reviews, these evaluations [7][8] tend to focus on the success or failure of the implemented recommendations and not on the impact of the review itself. The decision to invest in the review, however, has to be based solely on the expected benefits of conducting the review itself and not on the expected benefits of implementing the individual recommendations from the review. Hence, there is a strong need to look at the effectiveness of such reviews at the aggregate level in order to make proper use of economic resources in the future. The purpose of this paper is to examine the effectiveness of in-service road safety reviews in Alberta to gain insight into whether such reviews are, on average, expected to reduce the number of collisions on roads. In addition, this study will also examine the robustness of the result with respect to the different model specifications and evaluation methods used.

METHODOLOGY
The main approach used in this research was the three-year before-after study with comparison group analysis [9]. Collision data from 1999 to 2005 were obtained from Alberta Transportation for 22 locations (treatment sites) where in-service road safety reviews were conducted between 2000 and 2002. Most of these locations were urban and rural intersections in Alberta. In addition, crash data were also collected for 37 potential comparison sites which had similar design and traffic characteristics to the treatment sites.
The validity of a before-after study with comparison group analysis can be significantly affected by the selection of suitable comparison sites. Note that part of the process for selecting the comparison sites had to be qualitative and subjective because some of the important attributes were inherently qualitative in nature. Also, quantitative data on some attributes might not be available. Care was exercised in selecting the comparison sites, which included a detailed review of collision history, geometry, location and traffic control. The treatment sites and possible comparison sites that were considered for this research are listed in Table 1.
To complement the qualitative process, quantitative validity tests were also conducted.

Validity Tests for Comparison Sites
The validity of a comparison site for the corresponding treatment site was determined by the odds ratio test using two approaches. The first approach followed the methodology used by Hauer [9] while the second approach is the modified Allsop method [10,11]. Only those locations that passed the validity tests using both approaches were considered eligible for the comparison. The underlying theory and assumptions behind each approach are described in detail below.

The Hauer Approach
Consider the notation for the treatment and comparison groups shown in Table 2. Assuming that the collisions follow a Poisson distribution, K is an estimator of E(k) and Var(k), and L is an estimator of E(l) and Var(l).
Let $\pi$ be the expected count in the treatment group that would have occurred had the treatment not been applied, together with its corresponding random variable. Let $\kappa$ be the expected before-period count for the treatment group, and let $\mu$ and $\nu$ be the expected before-period and after-period counts for the comparison group. Then:
Ratio of the expected counts for the comparison group: $r_C = \nu / \mu$
Corresponding ratio for the treatment group: $r_T = \pi / \kappa$
As the treatment and comparison groups are assumed to have similar characteristics, $r_T = r_C$, or $\pi / \kappa = \nu / \mu$, or $\pi = r_C \kappa$ (5)
Although the treatment and comparison sites are assumed to have similar characteristics, plentiful data have shown that this assumption does not hold exactly, that is, the claim that $r_T = r_C$ is not true. It is therefore necessary to consider $r_C / r_T$ to be a random variable which takes different values on different occasions. This ratio is called the odds ratio ($\omega$).
For a comparison group to be considered legitimate, it must fulfil the requirement that the mean of the odds ratios equals 1.

$E\{\omega\} = 1$ (7)
To ascertain whether this hypothesis holds true for the treatment group and the corresponding comparison group, historical time series data from before the treatment are used as the basis for the significance test. As the expression for the odds ratio is not linear, unbiased estimators for the mean and variance have to be determined [9]. For the data obtained from each treatment group and its corresponding comparison groups, the notation shown in Table 3 is used. Considering p as the before period and p+1 as the after period, an estimator of the average odds ratio is computed for each period. The corresponding expected value and variance of the sample mean of the odds ratio (including the temporal effect over the periods and the randomness of collision counts in accordance with the Poisson distribution) can then be determined. Assuming that the mean is normally distributed around 1 and considering a 95% confidence level, the lower and upper limits for the mean odds ratio can be determined as $W \pm Z_{\alpha/2}\, s_W / \sqrt{n}$, where $W = 1$, $n = 1$ and $Z_{\alpha/2} = 1.96$ for the 95% confidence level. Hence, the comparison group is not considered to be a good candidate if $W < 1 - 1.96\, s_W$ or $W > 1 + 1.96\, s_W$.
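The validity check described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the bias-corrected odds ratio commonly attributed to Hauer [9], $o = \frac{KN/(LM)}{1 + 1/L + 1/M}$, and it assumes that $s_W$ can be approximated by the standard error of the sample mean of the period-to-period odds ratios. The function names and the example counts are hypothetical.

```python
import math
from statistics import mean, stdev


def hauer_odds_ratios(treatment, comparison):
    """Bias-corrected odds ratio for each pair of consecutive periods.

    treatment, comparison: yearly collision counts in the before period.
    Counts must be positive, otherwise the ratio is undefined.
    """
    ratios = []
    for p in range(len(treatment) - 1):
        K, L = treatment[p], treatment[p + 1]    # treatment site, periods p and p+1
        M, N = comparison[p], comparison[p + 1]  # comparison site, periods p and p+1
        raw = (K * N) / (L * M)                  # naive odds ratio r_C / r_T
        ratios.append(raw / (1 + 1 / L + 1 / M))  # approximate bias correction
    return ratios


def passes_hauer_test(treatment, comparison, z=1.96):
    """Return (passed, mean odds ratio, assumed standard error)."""
    ratios = hauer_odds_ratios(treatment, comparison)
    w_bar = mean(ratios)
    # Assumption: s_W is taken as the standard error of the sample mean.
    s_w = stdev(ratios) / math.sqrt(len(ratios))
    return abs(w_bar - 1) <= z * s_w, w_bar, s_w


# Hypothetical yearly counts for one treatment site and one candidate comparison site.
passed, w_bar, s_w = passes_hauer_test([12, 10, 14, 11], [30, 27, 33, 29])
print(passed, round(w_bar, 3), round(s_w, 3))
```

At least three periods are needed so that two or more odds ratios are available for the standard deviation.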

The Modified Allsop Approach
Similar to the Hauer approach, the odds ratio is defined between period p and period p+1. To avoid negative probability as a result of large variance, the logarithmic transformation of the odds ratio can be taken, and the observed transformed variable is defined for the transition from period p to p+1 [12]. The corresponding mean and variance can then be estimated. Assuming that the sample mean is normally distributed around ln(1) = 0, the limits for the 95% confidence level are ±1.96s
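The log-transformed test can be sketched in the same way. This is only an illustration under stated assumptions: the odds ratio for the transition from p to p+1 is taken as $KN/(LM)$, and the standard deviation in the ±1.96s limits is approximated by the standard error of the sample mean of the log odds ratios, which may differ from the variance estimator in [10-12]. The function names and counts are hypothetical.

```python
import math
from statistics import mean, stdev


def log_odds_ratios(treatment, comparison):
    """Natural log of the odds ratio for each transition p -> p+1."""
    values = []
    for p in range(len(treatment) - 1):
        K, L = treatment[p], treatment[p + 1]    # treatment site counts
        M, N = comparison[p], comparison[p + 1]  # comparison site counts
        values.append(math.log((K * N) / (L * M)))
    return values


def passes_log_odds_test(treatment, comparison, z=1.96):
    """Return (passed, mean log odds ratio, assumed standard error)."""
    y = log_odds_ratios(treatment, comparison)
    y_bar = mean(y)
    # Assumption: s is the standard error of the sample mean.
    s = stdev(y) / math.sqrt(len(y))
    return abs(y_bar) <= z * s, y_bar, s


# Hypothetical yearly counts in the before period.
passed, y_bar, s = passes_log_odds_test([12, 10, 14, 11], [30, 27, 33, 29])
print(passed, round(y_bar, 4), round(s, 4))
```

Working on the log scale makes the test symmetric: an odds ratio of 2 and an odds ratio of 0.5 lie equally far from the null value of 0.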

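Index of effectiveness: $\theta = \lambda / \pi$, where $\lambda$ is the expected number of collisions in the after period with the treatment and $\pi$ is the expected number had the treatment not been applied. It is common to estimate $\theta$ by $\hat{\theta} = \hat{\lambda} / \hat{\pi}$. However, even if $\hat{\lambda}$ and $\hat{\pi}$ are unbiased estimates of $\lambda$ and $\pi$, the ratio $\hat{\lambda} / \hat{\pi}$ is a biased estimate of $\theta$. Though the bias is often small, removing it is a worthwhile precaution [9]. An approximately unbiased estimator for the mean of the index of effectiveness is given by:
$\hat{\theta} = \dfrac{\hat{\lambda} / \hat{\pi}}{1 + \mathrm{Var}\{\hat{\pi}\} / \hat{\pi}^2}$
The results obtained from these calculations can be interpreted as "after accounting for the sundry influences which have changed from the 'before' to the 'after' period, the treatment brought about a reduction of $\hat{\delta}$ collisions in the study period" [9], where $\hat{\delta} = \hat{\pi} - \hat{\lambda}$. The standard deviation of this estimate is $\hat{\sigma}\{\hat{\delta}\}$. This amounts to a reduction of collisions by $100(1 - \hat{\theta})\%$ with a standard deviation of $\pm 100\,\hat{\sigma}\{\hat{\theta}\}\%$.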
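The bias correction discussed in this section can be illustrated with a short Python sketch. It assumes the estimators commonly given in Hauer [9]: the index is divided by $1 + \mathrm{Var}\{\hat{\pi}\}/\hat{\pi}^2$, its variance follows from a first-order approximation, and the reduction $\hat{\delta} = \hat{\pi} - \hat{\lambda}$ has variance $\mathrm{Var}\{\hat{\pi}\} + \mathrm{Var}\{\hat{\lambda}\}$. The function name and the input values are hypothetical; how $\hat{\pi}$ and its variance were obtained for each site is not reproduced here.

```python
import math


def index_of_effectiveness(lam, pi, var_lam, var_pi):
    """Return (theta, sd of theta, delta, sd of delta).

    lam: estimated after-period collisions with treatment
    pi: estimated after-period collisions had the treatment not been applied
    var_lam, var_pi: estimated variances of lam and pi
    """
    correction = 1 + var_pi / pi**2
    theta = (lam / pi) / correction  # approximately unbiased index
    var_theta = theta**2 * (var_lam / lam**2 + var_pi / pi**2) / correction**2
    delta = pi - lam                 # estimated reduction in collisions
    sd_delta = math.sqrt(var_pi + var_lam)
    return theta, math.sqrt(var_theta), delta, sd_delta


# Hypothetical site: 40 collisions observed after, 50 expected without treatment.
theta, sd_theta, delta, sd_delta = index_of_effectiveness(40, 50, 40, 60)
print(f"reduction = {100 * (1 - theta):.1f}% (sd {100 * sd_theta:.1f}%)")
print(f"delta = {delta} collisions (sd {sd_delta:.1f})")
```

A value of theta below 1 indicates fewer collisions than expected without the treatment; the naive ratio 40/50 = 0.8 is pulled slightly downward by the correction term.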

Pooled Index of Effectiveness
Drawing conclusions by interpreting the results of individual treatment sites may be misleading, as the interpretation of results might be blurred by the noise associated with small samples at individual sites [13]. This problem can be addressed by pooling the data. The mean (Equation 20) and the variance of the pooled index of effectiveness can be determined as described in [12].
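One common way to pool the data, shown here only as a sketch, is to sum the site-level estimates and their variances before forming the index. This assumes that the sites are independent and that the pooled index uses the same bias correction as the single-site index; the exact expressions used in [12] may differ. The function name and the site values are hypothetical.

```python
import math


def pooled_index(sites):
    """Pooled index of effectiveness from per-site estimates.

    sites: iterable of (lam, pi, var_lam, var_pi) tuples, one per treatment site.
    Returns (theta, sd of theta).
    """
    sites = list(sites)
    lam = sum(s[0] for s in sites)      # total after-period collisions
    pi = sum(s[1] for s in sites)       # total expected without treatment
    var_lam = sum(s[2] for s in sites)  # variances add for independent sites
    var_pi = sum(s[3] for s in sites)
    correction = 1 + var_pi / pi**2
    theta = (lam / pi) / correction
    var_theta = theta**2 * (var_lam / lam**2 + var_pi / pi**2) / correction**2
    return theta, math.sqrt(var_theta)


# Two hypothetical sites: (lam, pi, var_lam, var_pi).
theta, sd_theta = pooled_index([(40, 50, 40, 60), (10, 10, 10, 15)])
print(f"pooled reduction = {100 * (1 - theta):.1f}% (sd {100 * sd_theta:.1f}%)")
```

Pooling gives large sites more influence than small ones, which is why the pooled estimate is less noisy than an average of the individual site indices.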

Empirical Bayes Analysis
To correct for possible regression-to-the-mean bias, the Empirical Bayes (EB) method was used. Although a collision prediction model developed specifically for this study would have been preferable for the EB method, such a model could not be developed owing to the limited availability of data. However, a literature search revealed that some Alberta-based collision prediction models had been developed to predict three-year collision counts at intersections on provincial highways in Alberta and in the City of Calgary [14]. These models were used to predict three-year collision counts for the treatment sites. Eight locations in Calgary and three on provincial highways were eligible for use in these models.
The models, also called safety performance functions (SPFs), predict the expected collision frequency $\mu$ from three regression parameters ($a_0$, $a_1$, $a_2$), the major road AADT ($V_1$) and the minor road AADT ($V_2$). The regression parameters were estimated using Generalized Linear Interactive Modelling (GLIM) and are shown in Table 4 (source: Hamilton-Finn [14]). Since AADT values for the treatment sites were not available, they were estimated from the peak hour volumes.
The relative weight was determined using a relation in which Y, the number of years (3 years), is already included when calculating $\mu$. The weight was used to combine E(X), the expected collisions at similar locations, with X, the collisions at the treatment site. The collisions expected at similar locations are the three-year predicted collisions from the SPF. The standard deviation of the estimate of the expected collision frequency was also calculated. The index of effectiveness $\theta$ was then computed using the observed three-year collisions in the after period:
Relative difference in collision occurrence = $100(1 - \theta)\%$ (28)
In addition, to make the collision data comparable, all severity levels were converted to equivalent property damage only (EPDO) collisions. The conversion factors used were those suggested by PIARC [15].
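The Empirical Bayes steps described above can be sketched as follows. This is an illustration under several assumptions, none of which is confirmed by the text: the SPF is taken to have the common power form $\mu = a_0 V_1^{a_1} V_2^{a_2}$, the weight is taken as $w = 1/(1 + \mu/\phi)$ with a negative binomial dispersion parameter $\phi$, and the variance of the EB estimate is taken as $(1 - w)$ times the estimate. All function names and numerical values are hypothetical and are not the Table 4 parameters.

```python
import math


def spf_prediction(v_major, v_minor, a0, a1, a2):
    """Predicted three-year collisions from an assumed power-form SPF."""
    return a0 * v_major**a1 * v_minor**a2


def eb_estimate(observed_before, predicted, dispersion):
    """Return (weight, EB expected collisions, standard deviation)."""
    w = 1 / (1 + predicted / dispersion)              # weight on the SPF prediction
    expected = w * predicted + (1 - w) * observed_before
    sd = math.sqrt((1 - w) * expected)
    return w, expected, sd


# Hypothetical site: SPF predicts 12 collisions in three years, 20 were observed
# before the review, and 15 were observed in the three years after it.
w, expected, sd = eb_estimate(observed_before=20, predicted=12, dispersion=4)
theta = 15 / expected
print(f"w = {w:.2f}, EB estimate = {expected:.1f} (sd {sd:.2f})")
print(f"relative difference = {100 * (1 - theta):.1f}%")
```

The EB estimate (18 in this example) lies between the SPF prediction and the observed count, which is how the method discounts an unusually high before-period count that may partly reflect regression to the mean.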

RESULTS
The odds ratio test results for the successful sites are summarized in Table 5. A suitable comparison site could not be found for the Highway 1 and Highway 9 intersection. The only possible comparison site identified did not pass the Modified Allsop test and hence was excluded from the before-after collision analysis. The estimates of the indices of effectiveness of the in-service road safety reviews in reducing collisions are shown in Table 6. Since two-year before-after studies are relatively common, estimates using two years of data are also provided for comparison.
The before-after analysis results showed that there was a reduction in the number of collisions associated with the in-service road safety reviews. The result, however, depended on whether one chose to use the equivalent property damage only (EPDO) collisions or just the total number of collisions regardless of severity. As one of the purposes of this study was to compare the results obtained from different methods, the results obtained from both methods are discussed. As summarized in Table 6, for the EPDO collisions, there was a reduction in collisions at most of the sites. Assuming a normal distribution, the effectiveness of the treatment was tested at the 95% confidence level. The treatment was therefore deemed to be effective if the estimated collision reduction was more than 1.645 times its estimated standard deviation. Despite the reduction in collision numbers, the treatment at most of the locations could not be considered effective at the 95% confidence level. However, a few sites showed a significant reduction in collisions.
To avoid possible misinterpretation of results blurred by the noise associated with small samples, pooled estimations were carried out. The results showed that there was an overall reduction of 16.8% in the three-year collisions with a standard deviation of ±12%. Somewhat similar results were obtained for the individual treatment sites by using the collision figures without any conversion to EPDO values. There was an overall reduction of 4.9% with a standard deviation of ±10% in the total collisions over the three-year period which could be attributed to the in-service road safety reviews. However, some of the sites which were determined to have effective treatments were no longer effective using this criterion. One possible reason might be the higher number of injury collisions at those sites, resulting in a large difference between the before and after collision numbers in the case of EPDO values.
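As a worked illustration, the one-sided 1.645 criterion described in this section can be applied to the two pooled three-year figures reported above. Applying the site-level criterion to the pooled figures is an assumption made here for illustration, and the function name is hypothetical.

```python
def is_effective(reduction_pct, sd_pct, z=1.645):
    """One-sided check: the reduction must exceed z standard deviations."""
    return reduction_pct > z * sd_pct


# Pooled three-year figures reported in the text: (reduction %, standard deviation %).
pooled = {
    "EPDO collisions": (16.8, 12.0),
    "total collisions": (4.9, 10.0),
}

for label, (reduction, sd) in pooled.items():
    ratio = reduction / sd
    print(f"{label}: reduction/sd = {ratio:.2f}, effective = {is_effective(reduction, sd)}")
```

The ratios are 1.40 and 0.49, both below 1.645, so under this particular criterion neither pooled estimate would be classed as effective at the 95% level.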
A two-year before-after analysis was also performed, including some additional sites that had only two years of before and after collision data. The results showed an overall collision reduction of 19% with a standard deviation of 29% in the two-year after period. The majority of the treated sites experienced a reduction in collision numbers, but only 5 out of 16 sites were considered to have an effective treatment at the 95% confidence level. Compared with the results of the three-year before-after study, the reduction in collisions based on a two-year after period was slightly higher. Possible reasons are that the third year of the before period, which was left out of the two-year study, had a small number of collisions, or that the corresponding third year of the after period had a higher number of collisions as the effects of the improvements diminished.
The results of the Empirical Bayes method seemed to be somewhat different from those of the before-after study with comparison group. This result was logical because, as opposed to the before-after with comparison method, the Empirical Bayes method compared the collisions with the average collisions from a large sample of similar sites, which made the average collisions decrease in most cases, resulting in lower predicted collisions at the treated sites. The precision of the method, however, relied significantly on the selection of a proper safety performance function, which in this case was adopted from a study that was assumed to be relevant and reasonably representative.
A different prediction model for the sites on Alberta highways showed a reduction in collisions compared with similar sites in the province. The reduction in collisions was higher in the three-year before-after analysis than in the Empirical Bayes analysis. This indicated that the treated sites had a higher reduction in collisions than non-treated ones.

Additional Results
The log-ratio and chi-square tests were performed to determine whether the reductions in collision numbers were significant. The log-ratio test results showed that the reduction in collisions was significant in the case of EPDO collisions, as the ratio of -6.703 was outside the range ±1.96 at the 95% confidence level. However, the same ratio for the case of collisions without conversion was -1.626, which was within the range ±1.96, indicating that the reduction in collision numbers at the treatment sites as compared to the comparison sites was not significant at the 95% confidence level. The difference in results between the two cases indicated that there were more injury collisions at the treatment sites in the before period, which increased the EPDO value, and hence the difference between before and after collisions was much higher than for the collision counts regardless of severity.
Similarly, the chi-square test results showed that in the case of EPDO collisions, the $\chi^2$ value was 8.42, which corresponded to a p-value of 0.006. This implied that, if there were no real change, a difference this large would arise from random fluctuation with a probability of only about 0.6%. In other words, the change in collisions was significant at the 99.4% confidence level. The $\chi^2$ value in the case of collisions without conversion was 0.57, which corresponded to a significance level of 73.4%, indicating that the reduction in collisions was not significant at the 95% confidence level.

CONCLUSION
After completing the before-after study with comparison, it was concluded that the sites where in-service road safety reviews were performed had, in aggregate, experienced a reduction in collisions. Results also showed that although the outcomes in urban locations were mixed, all three rural locations had benefited from the in-service road safety reviews. This result might suggest that recommendations for safety improvements were more likely to be adopted for provincial highways. However, as the number of rural sites used in this study was quite small, care should be exercised in generalizing this result.
Both the log-ratio and chi-square tests showed consistent results, implying that the results were reliable. As these tests were non-parametric tests, they tended to have lower power than parametric tests, which require additional assumptions about the distributional properties of the variables. It was therefore very encouraging to find a statistically significant reduction at some of the locations and at the aggregate level. These results showed that conducting an in-service safety review had a positive safety effect, on average, regardless of whether the recommendations from the review were adopted. The reductions might be due simply to the Hawthorne effect (a temporary change in behaviour due to increased attention) or to more permanent changes due to the recommendations being implemented immediately at some of the locations.
With respect to the robustness of the results to methodological changes, our analysis found that the reduction was significant when considering equivalent property damage only collisions and not significant when unweighted total collision numbers were used, which implied that there had been a reduction in the severity of collisions in the after period. However, it was possible that the results might change if different weights were used to convert injury and fatal collisions to property damage only collisions. More research should therefore be conducted in the future to determine the sensitivity of the results to different weighting schemes.
The results of the before-after study were compared with the results from the Empirical Bayes method. After correcting for possible regression-to-the-mean bias, the results of the Empirical Bayes analysis showed that, unlike in the before-after method, there was an increase in collisions at most of the Calgary intersections relative to the expected collisions predicted by the safety performance function developed for Calgary. However, the safety performance function used to predict the collisions was quite rudimentary and thus might have, to some extent, contributed to the difference in the results.
More importantly, since there was a statistically significant reduction in EPDO crashes, the results might simply indicate that the modifications implemented were mainly designed to reduce the severity of crashes. However, efforts to verify which, if any, of the recommendations were actually implemented were not successful, even after contacting several of the transportation agencies involved. Nevertheless, since the objective of this study was to evaluate the effectiveness of the in-service road safety reviews per se and not the effectiveness of the modifications, this concern was beyond the scope of this study.
In conclusion, the overall effect of the treatments done after the in-service road safety reviews seems to be quite positive for road safety. Therefore, it is recommended that in-service safety reviews be conducted at any high crash locations, especially at locations with high-severity crashes. In addition, as a more proactive effort, in-service safety reviews should also be conducted regularly at samples of selected highways to improve the safety performance of the overall road system.

Table 2. Notations for Treatment and Comparison Groups

            Treatment    Comparison
Before      K            M
After       L            N

(k, l, m, n) are the corresponding random variables, and ($\kappa$, $\lambda$, $\mu$, $\nu$) are the corresponding expected values.