Abstract
The objective of this study is to understand the impact of a variety of factors on the frequency and severity of pedestrian-vehicle collisions that involve pedestrian spatial violations at mid-blocks. To that end, the historical collision records of the City of Hamilton between 2010 and 2017 were obtained, and collisions that had occurred at mid-blocks were extracted. A Bayesian structural equation modeling (SEM) framework was developed to investigate the impact of a wide range of factors on such collisions. First, a classical SEM was developed to group the different factors into sets of latent variables. Four latent variables were defined: location amenities and attractions, pedestrian/road network characteristics, exposure parameters, and location/collision-specific factors. The Bayesian SEM was then implemented to investigate the relationship between the latent variables and collisions. The results showed that the amenities and attractions of a location (e.g., parks, schools, bike-share stations, and bus stops) had the greatest influence on the frequency of collisions that involve spatial violations, followed by pedestrian network characteristics. Pedestrian network characteristics and location/collision-specific factors were found to have the greatest influence on the severity of collisions. The location of bike-share stations, pedestrian network connectivity, exposure to walkers, and the number of lanes were the four observed variables that explained the highest percentage of the variance in each latent group, respectively. The results of this study should help engineers and planners develop better design concepts to mitigate collisions that are caused by pedestrian spatial violations in urban areas.
Promoting non-motorized modes of transportation, such as walking and cycling, has become a central objective for many transportation agencies around the world. Numerous policies and design concepts have been introduced to encourage active modes of travel, aiming at promoting sustainable communities and reducing single-occupancy vehicle trips. Nevertheless, safety concerns have been one of the major roadblocks for the full use of active travel modes as key modes of travel in many North American cities. Many transportation safety professionals consider pedestrians and cyclists to be among the most vulnerable road user groups who have a higher risk of being killed or severely injured as a result of road collisions. Historical collision data clearly show that pedestrians and cyclists are overrepresented in collision fatalities and serious injuries. For example, pedestrians accounted for 17.3% of collision fatalities in 2018 in Canada, despite representing only 3.4% of persons involved in collisions ( 1 ).
While pedestrian safety has been investigated extensively in the literature, pedestrian behavior and its impact on safety have received relatively little attention. Previous studies showed that pedestrian-unfriendly design of urban road networks, a lack of effective pedestrian facilities, and inadequate education of pedestrians promote many risky pedestrian behaviors, such as spatial violations, that affect the overall level of road safety ( 2 ).
Crossing the street at undesignated locations (spatial violations) has become common in big cities. Unsurprisingly, such behavior has been shown to be a major contributor to the frequency and severity of pedestrian-vehicle collisions. Historical collision records of the City of Hamilton, Ontario, showed that about 35% of pedestrian-vehicle collisions that occurred at mid-block locations between 2010 and 2017 were mainly attributed to pedestrian violations. The data also show that 91.4% of those collisions were serious collisions that involved either pedestrian fatalities or serious injuries.
In this study, spatial violations in mid-blocks are defined in two cases: (i) pedestrian crossings outside a designated mid-block crosswalk that exists within 30 m of the crossing location; and (ii) crossings in mid-blocks with no close-by marked crosswalks where pedestrians did not yield the right-of-way to vehicles. These two cases were identified after a careful review of pedestrian crossing laws in the province of Ontario and some major cities in the province (specifically, the City of Toronto). The Highway Traffic Act of the province of Ontario, Chapter H.8, Section 144(22) states that “where portions of a roadway are marked for pedestrian use, no pedestrian shall cross the roadway except within a portion so marked” ( 3 ). The law does not stipulate how far from the nearest intersection one must be to legally cross mid-block. Some cities, such as the City of Toronto, follow police advice to generally use 30 m from the nearest intersection as a “rule of thumb” ( 4 ). This means that if a pedestrian crosses the street at unmarked crosswalks while they are within 30 m of an intersection, it is considered a legal offense. The study followed this concept and considered the crossings that occur outside the crosswalk, but within 30 m of the crosswalk to be a spatial violation. Also, the Municipal Code of the City of Toronto (Section 950-300B) states that “no person shall, except where traffic control signals are in operations, or where traffic is being controlled by a police officer, or at a pedestrian crossover, proceed so as not to yield the right-of-way to vehicles and streetcars on the roadway” ( 4 ). This means that pedestrian crossings in the mid-blocks with no close-by marked crosswalk can still be illegal if pedestrians do not yield the right-of-way to vehicles. 
Based on that concept, the study considered pedestrian mid-block crossings where no close-by marked crosswalks exist to be spatial violations if the pedestrians did not yield the right-of-way to vehicles.
Pedestrian spatial violations are typically influenced by a variety of contributing factors, such as social norms and habits, road network characteristics, traffic conditions, and built-environment characteristics, among other factors. Previous studies have investigated the impact of a wide range of factors on pedestrian spatial violations and attempted, to some extent, to assess the impact of such behavior on pedestrian-vehicle collisions. However, the impact of many factors, such as pedestrian network characteristics (network connectivity and accessibility) and location amenities, on the frequency and severity of collisions that involve pedestrian violations still requires further investigation. Moreover, the majority of previous studies assessed the impact of spatial violations on safety through advanced regression models, which consider the spatial violation decision as an independent variable that impacts collision occurrence. These models did not consider the personality traits of pedestrians while analyzing the spatial violations. Some pedestrians inherently tend to take risks while crossing a road, regardless of the road characteristics and the presence of preventive countermeasures. Thus, ignoring the impact of such traits could bias the impact of violations on collision frequency and severity. From a statistical point of view, this endogeneity-biased outcome occurs as a result of the presence of possible interrelationship between the independent variable in a model (i.e., spatial violation) and unobserved variables in the error term (i.e., the personality traits of pedestrians). Given the impact of unobserved features, spatial violations could be endogenous to the consequence of the collisions.
Bayesian Structural Equation Modeling (SEM) is one of the most common techniques capable of addressing the aforementioned endogeneity bias, by considering unobserved (latent) variables while developing a model based on the observed explanatory variables. In other words, the main role of SEM models is to define a mediating variable (i.e., a latent variable) to identify the hidden impacts of the observed variables on the dependent one ( 5 ). Given the potential endogeneity bias of pedestrian spatial violations, the Bayesian SEM is considered an appropriate statistical technique to investigate the safety consequences of pedestrian spatial violations in terms of the frequency and severity of the resulting collisions. Although several studies recommended the implementation of the classical SEM method to study road safety ( 6 , 7 ), this technique has not been used to investigate the safety consequences of pedestrian spatial violations.
The main objective of the study is to analyze pedestrian collisions at mid-block locations that are attributed to spatial violations. The goal is to understand the impact of a variety of factors on the frequency and severity of pedestrian-vehicle collisions that involve pedestrian spatial violations. To that end, historical collision records for the City of Hamilton, Ontario, between 2010 and 2017 were obtained, and pedestrian collisions that occurred at mid-block locations were extracted. A wide range of factors that may impact pedestrian collisions that involve spatial violations was obtained from various sources, including exposure parameters, location-specific characteristics (such as the number of lanes, presence of central refuge islands, and road surface condition), the amenities and attractions that exist in the vicinity of the collision location (such as parking lots, bus stops, schools, convenience stores, and parks), pedestrian network features (such as connectivity and block size), and land use at the collision location. A Bayesian SEM framework was developed to investigate the impact of the considered variables on both the frequency and the severity of collisions that involve pedestrian spatial violations at mid-blocks. The results of the model were analyzed to understand the impact of the different factors on the violation-related collisions and to identify the relative importance of each factor in influencing the frequency and the severity of collisions.
This study provides two main contributions: (i) the application of the Bayesian SEM to assess the safety consequences of pedestrian spatial violations and to identify the contributing factors that affect the frequency and severity of collisions that involve such violations; and (ii) the investigation of the impact of numerous variables on violation-related collisions that were not thoroughly considered in previous studies, such as pedestrian network connectivity and accessibility, and a variety of location amenities and attractions. The results of this study provide valuable insights into the factors that encourage pedestrians to commit spatial violations and increase the risk of collisions, along with the impact of road and pedestrian network characteristics on pedestrian behavior and safety. Such understanding assists transportation engineers and planners in developing better design concepts to mitigate the frequency and severity of collisions that are caused by pedestrian spatial violations in urban areas. The rest of the paper is organized as follows: the following section provides a summary of the literature review. The research methodology is then documented and is followed by a summary of the data collection and processing. Next, the results of the study are presented and a brief discussion of the results is provided. Finally, the last section of the paper presents the conclusions and the recommendations of the study.
Literature Review
The literature review focused on reviewing previous studies in two key areas: (i) understanding the factors contributing to pedestrian spatial violations at mid-blocks and the safety consequences of such behavior; and (ii) investigating the applications of SEM models in pedestrian safety research. The following subsections provide a summary of the findings of the literature review in these two areas.
Pedestrian Spatial Violations
Many pedestrians engage in spatial violations while crossing to save time and reduce the walking distance ( 8 ). Nevertheless, pedestrians’ decisions on whether to violate or not vary significantly depending on many factors. Waiting time to cross and the available gap between vehicles have been identified as important factors that influence pedestrians’ decisions to violate ( 9 , 10 ). Vehicle speed was also identified as another traffic-related factor that contributes to pedestrian violation in many studies ( 11 , 12 ).
Many studies also showed that road characteristics play an important role in pedestrian decisions to violate. For example, pedestrians were shown to be more eager to cross the street without the right-of-way at mid-block locations equipped with central refuge islands ( 13 , 14 ). Large block size was also found to be an important factor that increases the probability of spatial violation ( 15 ). The presence of bus stops at a location increases the frequency of spatial violations, especially when buses are waiting at the bus stop ( 16 ). Considering pedestrian traits, while some studies showed that men are more likely to engage in high-risk situations and end up in collisions that happened as a result of spatial violation ( 17 , 18 ), other studies found that gender has no significant influence on such behavior ( 19 , 20 ). Younger pedestrians were shown to be more likely to violate compared with older pedestrians in many studies ( 20 ). Previous studies also highlighted the role of habits, social norms, and past experiences in pedestrian violation behavior ( 21 , 22 ).
Moreover, spatial violations were shown to be an important contributor to the frequency and severity of pedestrian collisions. For example, Hussein et al. ( 23 ) analyzed the association between pedestrian violation and the frequency of pedestrian-vehicle conflicts at a signalized intersection in New York City. The study identified pedestrian violations as the main contributor to pedestrian-vehicle conflicts and showed that 18% of the pedestrians tend to cross the street in a non-designated space. Kim et al. ( 24 ) used the hierarchical order technique to analyze more than 137,400 pedestrian collisions in South Korea between 2011 and 2013. The results showed that spatial violations at mid-blocks and temporal violations by drivers (red light running) were the main contributing factors to severe injury collisions. Pour-Rouholamin and Zhou ( 14 ) reported that pedestrians who cross the roadway at the dedicated crosswalks are 12% less likely to be involved in a fatal collision than pedestrians who commit a violation. In another study, Ghomi and Hussein ( 25 ) applied an integrated clustering and copula-based model to investigate the impact of pedestrian violations at intersections on both the frequency and the severity of collisions in the City of Hamilton, Ontario. The study showed a strong association between pedestrian violations at intersections and the frequency and severity of pedestrian-vehicle collisions, especially at intersections that have multiple bus stops and schools in the vicinity of the intersection.
SEM Applications
The application of SEM in the transportation field is most popular in transportation planning and travel behaviors ( 26 ). Several studies showed the merits of SEM in road safety applications, especially for identifying the factors contributing to the frequency ( 7 ) and severity ( 8 ) of motor-vehicle collisions. Other studies used SEM to develop a safety risk index in urban areas ( 27 ) and evaluate unsafe behavior and driver aggression ( 28 ). SEM has also gained recent popularity in investigating pedestrian collisions. Al-Mahameed et al. ( 29 ) defined road network characteristics, exposure, and social status as the main influential latent factors on the frequency of collisions that involve pedestrians and cyclists. Sheykhfard et al. ( 30 ) demonstrated that road characteristics were the most important latent factors that affect the frequency of pedestrian collisions. Other studies developed SEM to analyze survey data to evaluate the safety perception and subjective norms of a pedestrian while crossing the streets ( 2 ). As can be seen in the literature, the application of SEM in pedestrian safety studies is still limited. There are almost no studies that applied SEM to assess the contributing factors of collisions that involve pedestrian spatial violations.
Methodology
To achieve the study objectives, historical collision records of the City of Hamilton between 2010 and 2017 were obtained. The analysis focused on pedestrian-vehicle collisions that involved pedestrian spatial violations at mid-block locations. The Bayesian SEM approach was adopted to evaluate the underlying impact of the explanatory variables on the frequency and severity of collisions that involve spatial violations. The analysis was carried out in a two-stage procedure. In the first stage, the relationship between the manifest variables and the latent ones was calibrated by developing a classical SEM. In the second stage, a Bayesian SEM was applied to investigate the impact of the latent variables on both the frequency and severity of collisions that involve pedestrian spatial violations. The study considered a wide range of explanatory variables along with two dependent variables (the frequency of spatial violations and the severity of collisions that happened because of violations). A brief description of the proposed technique is provided below.
Bayesian SEM
SEM is a multivariate statistical technique that assesses the interrelated dependency among observed variables and unobserved (latent) variables by simultaneously incorporating regression models, factor analysis, path analysis, and analysis of variance. SEM approaches consist of two main components. The first component, known as the measurement model, describes the association between the observed variables (independent variables) and the latent factors. The second component, known as the latent model, explains the relationship between endogenous and exogenous latent variables by presenting the direction and strength of the relationships between the variables ( 31 ). The latent model is developed based on Equation 1:
where X and Y are vectors of the endogenous and exogenous latent variables, respectively, a and y are the coefficients of the latent variables, and
Also, the measurement models for the exogenous and endogenous variables follow the formulas of Equations 2 and 3, respectively:
where
Figure 1 presents a simplified general structure of an SEM model.

SEM structure.
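For reference, the latent and measurement models described above are conventionally written in the SEM literature as shown below. The notation is the standard LISREL one and is an assumption here; it may differ from the symbols used in Equations 1 to 3. In this sketch, $\eta$ and $\xi$ denote the endogenous and exogenous latent variables, $y$ and $x$ their observed indicators, $B$ and $\Gamma$ the coefficient matrices of the latent model, $\Lambda_x$ and $\Lambda_y$ the factor loadings, and $\zeta$, $\delta$, and $\varepsilon$ the error terms.

```latex
\begin{align}
\eta &= B\,\eta + \Gamma\,\xi + \zeta            && \text{(latent model)} \\
x    &= \Lambda_x\,\xi + \delta                  && \text{(measurement model, exogenous)} \\
y    &= \Lambda_y\,\eta + \varepsilon            && \text{(measurement model, endogenous)}
\end{align}
```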
Generally, SEM has several benefits compared with the typical statistical models. The SEM method is capable of estimating multiple relationships among variables at the same time. This approach can evaluate the performance of the unobserved/latent variable while predicting the dependent factors based on a series of manifest variables. Moreover, SEM can estimate the error term for each of the observed variables in the measurement part of the model. Finally, the SEM is capable of overcoming the multicollinearity issue among the variables.
The maximum likelihood (ML) method is the most common estimation approach for classical SEM techniques. However, the ML estimator does not perform well in the presence of residual correlation, cross-loadings, and departures from multivariate normality, which leads to biased factor loadings. To overcome these issues, the third generation of SEM was integrated with Bayesian models. In Bayesian SEM, uncertainties are considered in the predictive model and the requirement for normal distributions is relaxed ( 32 ). Bayesian SEM employs the Gibbs sampler, a Markov chain Monte Carlo (MCMC) simulation method, to estimate the posterior distribution of the latent variables. However, the prior distribution of the variables needs to be determined first. Since insufficient information is available on the prior distribution of the variables, a normal distribution with zero mean and a very large variance (e.g., 1,000) is considered an acceptable prior distribution for the parameters.
To ensure model convergence, three independent Markov chains were run for 10,000 iterations for each parameter. The first 5,000 iterations in each chain were treated as burn-in samples and excluded from the calculations. The convergence of each parameter was checked using the potential scale reduction (PSR), which compares the between- and within-chain variation. A PSR value close to 1 typically indicates that the model has converged ( 33 ).
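The convergence check described above can be illustrated with a minimal Python sketch of the classic Gelman-Rubin form of the PSR for a single parameter. This is not the routine used in the study, whose software may apply a slightly different variant; the function name `potential_scale_reduction` is hypothetical, and the input is assumed to be equal-length chains with the burn-in samples already removed.

```python
from statistics import mean, variance


def potential_scale_reduction(chains):
    """Gelman-Rubin PSR for one parameter.

    `chains` is a list of equal-length lists of post-burn-in draws,
    one list per independent Markov chain (assumed input layout).
    """
    n = len(chains[0])                      # draws kept per chain
    chain_means = [mean(c) for c in chains]
    # W: average of the within-chain sample variances
    within = mean([variance(c) for c in chains])
    # B: between-chain variance, scaled by the chain length
    between = n * variance(chain_means)
    # Pooled estimate of the posterior variance
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5
```

With the setup reported above, each of the three chains would contribute its last 5,000 of 10,000 draws, for example `potential_scale_reduction([c[5000:] for c in chains])`. Chains that explore the same distribution give a value close to 1, while chains stuck in different regions give a value well above 1.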
Data Collection and Processing
Pedestrian-vehicle collisions that occurred at mid-block locations in the City of Hamilton, Ontario, from 2010 to 2017 represent the main source of data in this study. In total, 617 pedestrian-vehicle collisions were reported at mid-block locations (19,728 sections) in the city during the eight years considered in the analysis. The collision data set provided by the City of Hamilton specifies the exact location of the collision, which was used to determine whether the collision occurred outside a close-by marked crosswalk or not. If the collision occurred outside a marked crosswalk that exists within 30 m of the collision location, it is defined as a collision that involves pedestrian spatial violation. The collision data set also provides information on the pedestrian action before the collision, including whether or not the pedestrian yielded the right-of-way to vehicles. Collisions that involve pedestrians who did not yield the right-of-way to vehicles were also defined as collisions that involve pedestrian spatial violation. A total of 214 collisions (34.7% of total collisions) were attributed to pedestrian spatial violations at mid-blocks, resulting in 11 fatalities and 192 injuries. The spatial distribution of the collisions is presented in Figure 2.
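The two-case labeling rule can be sketched in Python as follows. This is only an illustration under stated assumptions: the record layout `MidblockCollision` and its field names are hypothetical and do not reflect the actual schema of the City of Hamilton data set, and treating a crosswalk at exactly 30 m as "within 30 m" is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional

CROSSWALK_RADIUS_M = 30.0  # distance threshold stated in the study


@dataclass
class MidblockCollision:
    # Hypothetical fields; the real data set may be organized differently.
    dist_to_nearest_crosswalk_m: Optional[float]  # None if no marked crosswalk
    occurred_in_crosswalk: bool
    pedestrian_yielded: bool


def is_spatial_violation(c: MidblockCollision) -> bool:
    """Apply the two cases used to define a spatial violation."""
    has_nearby_crosswalk = (
        c.dist_to_nearest_crosswalk_m is not None
        and c.dist_to_nearest_crosswalk_m <= CROSSWALK_RADIUS_M
    )
    if has_nearby_crosswalk:
        # Case (i): crossing outside a marked crosswalk within 30 m
        return not c.occurred_in_crosswalk
    # Case (ii): no close-by crosswalk and the pedestrian did not yield
    return not c.pedestrian_yielded
```

Applied to the full data set, a rule of this kind would yield the share of violation-related collisions reported above (214 of 617, or 34.7%).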

Spatial distribution of mid-block collisions in the study area.
Moreover, to determine the factors potentially contributing to collisions that involved spatial violations, a thorough review of the literature was first conducted to identify the factors that promote pedestrian spatial violations at mid-blocks. According to the literature, various factors were identified as contributors to the spatial violation behavior, including road user exposure, road network characteristics (mainly block size and road class), location-specific factors (such as the number of lanes and the presence of central refuge islands), built-environment factors (mainly, bus stops and schools), and land use. The literature also provided little discussion on the impact of pedestrian network characteristics (mainly directness and connectivity) on the violation behavior. Consequently, it was decided to investigate the potential impact of those pedestrian network indicators on violation-related collisions. Afterwards, a list of additional potential contributing factors was identified based on a preliminary analysis of the spatial distribution of the violation-related collisions and the correlation between the location of collisions and those factors. Those additional factors included several location amenities and attractions (namely, bike-share stations, playgrounds, parking lots, convenience stores, recreational trails, and restaurants), other location-specific factors (such as illumination, traffic composition), and the distance between collision location and the nearest intersection. Finally, the correlation between the selected factors was investigated to avoid using highly correlated factors in the developed model. Below, a brief description of the selected factors, their calculation details, and the correlation analysis that was conducted to select the final list of factors is provided.
First, the collision data set provided useful information on each collision, including weather conditions at the time of the collision, illumination, road surface condition, type of vehicles involved in the collision, number of lanes, road class, average annual daily traffic (AADT), and whether the road segment is divided or undivided. This enables the direct extraction of those factors for each collision in the data set.
Additionally, the study used seven other data sources to extract the rest of the potential contributing factors, including the Esri ArcGIS online website, the Open Street Map website, Canadian 2016 census data, the Hamilton Street Railway (HSR) transit route data set, the Hamilton School Board data set, the Hamilton Open Data website of the City of Hamilton, and the Geospatial Datacenter of McMaster University ( 34 – 38 ). ArcMap 10.7.1 was used to merge the information from the different sources, which enabled the development of the required models.
As for the pedestrian network accessibility indicators, two indicators were used: pedestrian network connectivity at the collision location, and pedestrian route directness at the collision location. Road network characteristics at the collision area included block size, road class, and the distance from the collision location to the nearest intersection. The class of the road at which the collision occurred was provided in the collision data set. To estimate the other four parameters, the transportation network of the City of Hamilton was converted to a set of nodes and links, where the links represent the road segments, and the nodes represent the intersections. The geo-coded road network of the City of Hamilton was extracted from the Open Street Map website ( 34 ). The block size was measured as the direct distance between two adjacent intersections. The length of the road between the location of the collision and the nearest intersection was considered as the distance to the nearest intersection. The ratio of intersections to the sum of intersections and dead-end streets within a radius of 400 m from the collision location was considered as an indicator of pedestrian network connectivity at the collision location. Finally, the sidewalk layer was mapped on the road network to estimate the pedestrian route directness within a radius of 400 m from the collision location. This factor indicates the degree of the sidewalk’s orientation and is calculated using Equation 4.
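The connectivity indicator defined above (intersections divided by the sum of intersections and dead-end streets within 400 m) can be sketched in Python as follows. The function name, the node tuple layout, and the use of projected coordinates in meters are assumptions made for illustration; the study itself performed this step in ArcMap.

```python
import math


def connectivity_index(collision_xy, nodes, radius_m=400.0):
    """Ratio of intersections to (intersections + dead ends) within a radius.

    `nodes` is an iterable of (x, y, kind) tuples in projected meters,
    with kind either "intersection" or "dead_end" (hypothetical layout).
    Returns None when no node falls inside the buffer.
    """
    cx, cy = collision_xy
    intersections = 0
    dead_ends = 0
    for x, y, kind in nodes:
        if math.hypot(x - cx, y - cy) <= radius_m:
            if kind == "intersection":
                intersections += 1
            elif kind == "dead_end":
                dead_ends += 1
    total = intersections + dead_ends
    return intersections / total if total else None
```

For example, a location with three intersections and one dead end inside the buffer has a connectivity index of 0.75; values closer to 1 indicate a better-connected pedestrian network.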
With regard to the land use, parcel-based land use data from the City of Hamilton was obtained from the Geospatial Datacenter of McMaster University ( 35 ) and merged with collision layer in ArcMap software. The “Intersect” function was then used to divide a mid-block segment between adjacent parcels if it crossed the boundary of the parcel. To extract the dominant land use, a 400-m buffer was generated from each collision location. The study considered three common categories of land use: residential, commercial, and institutional/office.
As for the exposure parameters, the AADT was used as a direct exposure measure for traffic. The AADT at each collision location was available in the collision data set. Unfortunately, a direct measure for pedestrian exposure at each collision location (i.e., pedestrian volume) was not available. To overcome this issue, the study used the number of walking trips as a surrogate measure of pedestrian exposure. The City of Hamilton was divided into 191 tracts, based on the 2016 Canadian census data ( 36 ), and the dominant mode of transportation was calculated for each tract. The census data layer was joined to the mid-block layer in ArcMap software to distribute the mid-blocks within the tracts. The “Intersect” function was then used to divide a mid-block segment between adjacent tracts if it crossed the boundary of the tract. Finally, the total number of walking trips that fell within a 400-m buffer generated around the center of collision location (i) was counted and considered as the total number of walking trips for collision (i). It should be noted that the 400-m buffer that was used to determine the abovementioned factors was selected based on a preliminary sensitivity analysis, in which different buffer sizes (50 m–1,000 m) were tested and the buffer that was associated with the best performance of the developed SEM was selected.
With regard to the location/collision-specific factors, six variables were considered: the number of lanes, the presence of central refuge islands, illumination at the collision location, the type of vehicle involved in a collision, the road surface conditions, and the weather conditions at the time of the collision. These six factors were provided in the collision data set and were used directly in the analysis.
Finally, the study considered the impact of a variety of amenities and attractions in the collision area, including the number of schools and bus stops within the collision location area and the presence of trails, playgrounds, parks, restaurants, parking lots, bike-share stations, and convenience stores near the collision location. The number of bus stops was extracted from the HSR data set and geo-coded in ArcMap software. A buffer with a pre-defined radius was then generated around the center of the mid-block at which the collision occurs to obtain the number of bus stops that exist within the collision area. Previous studies considered various buffer sizes when studying the impact of bus stops on pedestrian safety, ranging from 5 m to 300 m ( 25 , 39 ). Based on the results of a preliminary sensitivity analysis, a 200-m buffer showed the highest accuracy of the developed SEM model. A 200-m buffer was therefore used to estimate the number of bus stops within each collision location area.
The locations of schools in the City of Hamilton were extracted from the Open Hamilton dashboard ( 37 ). A 400-m buffer was generated from the center of the mid-block at which the collision occurred to determine the number of schools existing near the collision location. The buffer size was also selected based on a preliminary sensitivity analysis, in which different buffer sizes were tested and the buffer that was associated with the best performance of the developed SEM was selected. The number of the rest of the amenities and attractions (trails, playgrounds, parks, restaurants, parking lots, bike-share stations, and convenience stores) at each collision location was extracted from the Open Data website of the City of Hamilton and the Esri ArcGIS online website ( 37 , 38 ), using a buffer of 400 m from the center of the mid-block at which the collision occurred.
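The buffer-based counting of amenities can be illustrated with the short Python sketch below. It is a simplified stand-in for the ArcMap buffer operation, assuming point features in projected coordinates (meters); the function name, the amenity labels, and the tuple layout are hypothetical. The radii follow the values reported above: 200 m for bus stops and 400 m for schools and the other amenities.

```python
import math
from collections import Counter

# Buffer radii reported in the text; labels are hypothetical identifiers.
BUFFER_RADIUS_M = {"bus_stop": 200.0}
DEFAULT_RADIUS_M = 400.0  # schools and the remaining amenities


def count_amenities(center_xy, amenities):
    """Count amenities of each kind inside their buffer around a mid-block center.

    `amenities` is an iterable of (kind, x, y) tuples in projected meters.
    """
    cx, cy = center_xy
    counts = Counter()
    for kind, x, y in amenities:
        radius = BUFFER_RADIUS_M.get(kind, DEFAULT_RADIUS_M)
        if math.hypot(x - cx, y - cy) <= radius:
            counts[kind] += 1
    return counts
```

A `Counter` is used so that amenity types with no feature inside the buffer simply report a count of zero.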
After extracting all the factors, the Spearman correlation matrix was developed to study the potential correlation between them. The correlation between two variables was considered significant if the correlation coefficient was higher than 0.7. In this regard, a significant correlation was found between (i) weather condition and road surface and (ii) the presence of trails and the proportion of parks. Consequently, weather conditions and the number of parks were eliminated from the data set, leaving 23 factors as potential contributors to the collisions that involved pedestrian spatial violation at mid-block locations. A descriptive summary of the 23 factors is presented in Table 1.
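The correlation screening step can be sketched in plain Python as shown below. The helper names are hypothetical, the example assumes that every variable takes at least two distinct values, and comparing the absolute value of the coefficient with the 0.7 threshold is an assumption, since the text only states that the coefficient was "higher than 0.7".

```python
from math import sqrt
from statistics import mean


def _ranks(values):
    """Average ranks (1-based), with ties sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman coefficient: Pearson correlation of the rank vectors."""
    rx, ry = _ranks(x), _ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def highly_correlated_pairs(columns, threshold=0.7):
    """Return (name_a, name_b, rho) for pairs with |rho| above the threshold."""
    names = list(columns)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            rho = spearman(columns[a], columns[b])
            if abs(rho) > threshold:
                pairs.append((a, b, rho))
    return pairs
```

For each flagged pair, one of the two variables would then be dropped, as was done for weather conditions and parks in the study.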
Descriptive Summary of the Variables
Note: SD = standard deviation; AADT = average annual daily traffic.
Results and Discussion
The model development process started by forming a measurement model with six latent variables: exposure, road network characteristics, location/collision-specific factors, land use factors, pedestrian network accessibility indicators, and location amenities and attractions. The initial model was developed to evaluate the presence of causal effects among latent variables and assess the multicollinearity issue. The model demonstrated a high
To overcome these issues, the insignificant observed variables (illumination, type of vehicle, and road class) were removed from the model. Also, two latent variables (land use and pedestrian network accessibility indicators) were dropped and their significant parameters were merged with other latent variables. The modified structure of the measurement model included four latent variables: location amenities and attractions; exposure; road/pedestrian network characteristics; and location/collision-specific factors. The “location amenities and attractions” latent variable included eight observed factors (the presence of playgrounds, restaurants, bike-share stations, parking lots, trails, and convenience stores, and the number of bus stops and schools). Five variables (AADT, walkers, and commercial, residential, and institutional land uses) were grouped under the “exposure” latent variable. The “road/pedestrian network characteristics” latent variable included four observed variables (block size, pedestrian network connectivity indicators, pedestrian route directness, and the distance between the collision location and the nearest intersection). The “location/collision-specific factors” latent variable included road surface condition, number of lanes, and the presence of refuge islands at the collision location. The model connected the four exogenous latent variables with the two endogenous variables, collisions and fatal collisions that involved pedestrian spatial violations.
The proposed model with four latent variables demonstrated significant results for all input variables. The value of
Once the model structure was set, it was imposed on the Bayesian SEM to estimate the relationships between the latent variables and the endogenous variables. The Bayesian SEM was developed in RStudio using the “blavaan” statistical package. Figure 3 shows the graphical structure of the Bayesian SEM and the group of manifest variables used for each latent variable.

Graphical results of the Bayesian structural equation modeling (SEM) model.
The values on the arrows show the coefficients of the variables, while the values in parentheses indicate the squared correlation coefficient (R²), which expresses the percentage of the variance that was explained by that observed variable (at a 95% confidence level).
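As an illustration of how the values in parentheses relate to the coefficients, the sketch below squares a set of standardized loadings to obtain the share of each indicator's variance explained by its latent variable. This identity assumes standardized loadings and indicators that each load on a single latent variable. The loading values and indicator names are hypothetical and are not taken from the study's results.

```python
# Sketch: R-squared of an indicator as the square of its standardized loading.
# Assumes each indicator loads on exactly one latent variable.

def variance_explained(standardized_loading):
    """Return the proportion of indicator variance explained by the latent variable."""
    if not -1.0 <= standardized_loading <= 1.0:
        raise ValueError("a standardized loading must lie in [-1, 1]")
    return standardized_loading ** 2


# Hypothetical standardized loadings, for illustration only.
loadings = {"indicator_a": 0.80, "indicator_b": 0.50, "indicator_c": -0.30}

r_squared = {name: variance_explained(value) for name, value in loadings.items()}

# The indicator with the largest R-squared explains the most variance in its group.
top_indicator = max(r_squared, key=r_squared.get)

for name, value in r_squared.items():
    print(f"{name}: R2 = {value:.2f}")
print("highest:", top_indicator)
```

Note that the sign of a loading does not affect the variance explained, so a strongly negative indicator can rank above a weakly positive one.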
The results of the Bayesian SEM are presented in Table 2. The table presents the parameter estimates that show the impact of the observed variables on the four extracted latent variables, along with the influence of the four latent variables on the two endogenous variables (the frequency of collisions and fatal collisions that involve pedestrian spatial violations).
Result of Bayesian Structural Equation Modeling (SEM) Method
Note: PSR = potential scale reduction; SD = standard deviation; AADT = average annual daily traffic; NA = not applicable.
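For readers unfamiliar with the PSR column, the statistic is commonly computed as the Gelman-Rubin potential scale reduction factor, which compares between-chain and within-chain variance of a parameter's posterior draws. The sketch below shows the classic form of that calculation under the assumption of equal-length chains. The exact variant reported by the software used in the study may differ (for example, split-chain versions), and the chains shown are made-up numbers.

```python
# Sketch: classic Gelman-Rubin potential scale reduction for one parameter.
# Assumes two or more chains of equal length with non-zero within-chain variance.
from statistics import mean, variance


def potential_scale_reduction(chains):
    """Return the PSR for a list of equal-length chains of posterior draws."""
    m = len(chains)
    n = len(chains[0])
    if m < 2 or n < 2 or any(len(chain) != n for chain in chains):
        raise ValueError("need at least two chains of equal length >= 2")

    chain_means = [mean(chain) for chain in chains]
    grand_mean = mean(chain_means)

    # Between-chain variance (scaled by chain length).
    between = n / (m - 1) * sum((cm - grand_mean) ** 2 for cm in chain_means)
    # Average within-chain sample variance.
    within = mean([variance(chain) for chain in chains])

    # Pooled estimate of the posterior variance.
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


# Made-up draws: the first pair of chains agrees, the second pair does not.
agreeing = [[1, 2, 3, 4], [1, 2, 3, 4]]
disagreeing = [[0, 1, 0, 1], [10, 11, 10, 11]]

print(potential_scale_reduction(agreeing))
print(potential_scale_reduction(disagreeing))
```

Values close to 1 indicate that the chains have mixed; a value well above 1, as for the second pair of chains, signals that the chains are exploring different regions and that more iterations are needed.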
Based on the results, location amenities and attractions demonstrated the highest impact on pedestrian collisions that involve spatial violations. This latent variable reflects the aggregation of the amenities and attractions at a mid-block location. As the value of this variable increases (i.e., more attractions are available), more pedestrians are drawn to these facilities and the probability of spatial violation increases significantly, which in turn increases the frequency of collisions that involve pedestrian spatial violations. The coefficients of the manifest factors related to location amenities and attractions were all positive and significant at a 95% confidence level.
Furthermore, in the SEM technique, the proportion of variance explained by each observed variable is equal to the square of the correlation coefficient (R²). Based on the last column of Table 2, the presence of bike-share stations was the observed variable that explained the highest percentage of the variance for the “location amenities and attractions” latent variable. The City of Hamilton has an efficient bike-sharing system (SoBi bike-share) with stations distributed all over the city. Bike-share stations are important attractions for pedestrians, who walk to a station and then use the bike either as the main mode to reach their destination or for part of a trip that also involves another mode (mainly transit buses). It should be noted that 81% of the bike-share stations are located within 100 m of a bus stop, which makes bike-share a convenient transportation option that can be integrated with transit. As such, it is expected that many violations would occur at these locations, especially in situations such as a cyclist parking the bike and crossing the street to catch an approaching bus. Parking lots, restaurants, and trail entrances were also found to be significant variables within the “location amenities and attractions” latent variable. Based on these findings, special attention is required when selecting the location of bike-share stations, especially those close to bus stops. In addition, preventive measures are needed to reduce the frequency of pedestrian spatial violations near bus stops, bike-share stations, parking lots, restaurants, and trail entrances.
Nevertheless, a negative association was found between the frequency of fatalities and the “location amenities and attractions” latent variable. One possible reason for this finding is that amenities such as schools, playgrounds, and trail access points are usually located in reduced speed zones, where vehicle operating speeds are typically low. Consequently, the severity of collisions is expected to be lower. Unfortunately, the distribution of vehicle operating speeds at collision locations is not available to test this hypothesis. Another possible reason is that drivers usually pay more attention to violating pedestrians at locations with such amenities, which reduces the severity of potential collisions.
The “road/pedestrian network characteristics” latent variable was found to be the second most influential factor on the frequency of collisions that involve violations. Since the four variables that constitute this latent variable reflect a better level of accessibility and pedestrian convenience in the road network, pedestrian compliance is expected to increase as the value of this latent variable increases (i.e., as pedestrians experience a more pedestrian-friendly environment). Consequently, both the frequency and the severity of collisions that involve pedestrian spatial violations decrease, as can be observed from the negative sign of the latent variable coefficients. Pedestrian network connectivity was found to be the main factor associated with this latent variable, based on the percentage of the variance explained. Based on the results, locations with poor pedestrian network connectivity and large block sizes require countermeasures that mitigate pedestrian violations. When planning new developments, limiting block size, improving the connectivity of the pedestrian network, and ensuring that pedestrians can reach their desired destination in the shortest possible distance are essential measures to mitigate violations and related collisions.
Moreover, the “exposure” latent variable was found to have a negative impact on both the frequency of total collisions and the frequency of fatal collisions that involve pedestrian spatial violations at mid-block locations, although only the impact on the frequency of total collisions was statistically significant (the impact of exposure on the severity of collisions was not). This finding was expected, since higher exposure to traffic (higher AADT) discourages pedestrians from spatial violation, as was reported in many previous studies ( 9 , 10 ). In addition, higher pedestrian exposure and land uses that attract more pedestrians increase drivers’ awareness of pedestrians, which reduces the risk of collisions.
Finally, the “location/collision-specific factors” latent variable showed an indirect significant impact on both the frequency and the severity of collisions attributed to violations. Based on the definitions of the observed variables of this latent variable, the results indicate that the presence of refuge islands, dry surface conditions, and a lower number of lanes at a location increase the frequency of both total and fatal collisions that involve pedestrian spatial violations. Previous research showed that the presence of refuge islands increases the probability of spatial violation ( 13 , 14 ). Consequently, the presence of refuge islands is expected to increase the frequency of collisions that result from violations. Similarly, previous research showed that pedestrians are discouraged from violation in adverse weather conditions and as the number of lanes increases ( 25 ), which explains the higher frequency of violation-related collisions in dry weather conditions and on roads with fewer lanes.
Conclusion
In this study, a Bayesian SEM model was developed to analyze pedestrian collisions that are attributed to spatial violations at mid-blocks. Pedestrian-vehicle collisions that occurred in the City of Hamilton, Ontario from 2010 to 2017 were the main source of data for this study. The SEM model aimed to investigate the interrelationship between a variety of factors, categorized in four latent variable groups, and two endogenous dependent variables (the frequency and the severity of collisions that involve spatial violations). The four latent variable groups included: location amenities and attractions (e.g., parking lots, schools, bus stops, trails, restaurants, among others), exposure parameters, location/collision-specific factors (e.g., number of lanes and the presence of refuge islands), and pedestrian/road network characteristics, such as pedestrian network connectivity and block size.
The results showed a significant impact of the different amenities and attractions at a location on the frequency of violation-related collisions, particularly bike-share stations, trail access points, restaurants, and parking lots. More collisions were observed at locations with bike-share stations near bus stops, which highlights the importance of carefully selecting bike-share station locations and applying appropriate countermeasures at such locations to mitigate pedestrian spatial violations. Lack of pedestrian network connectivity and large block size were found to be highly correlated with the frequency and severity of pedestrian collisions that involved spatial violations at mid-blocks. Accordingly, locations with poor pedestrian network connectivity and large block sizes require countermeasures that reduce the frequency of spatial violations. Additionally, limiting block size, improving the connectivity of the pedestrian network, and ensuring that pedestrians can reach their desired destination in the shortest possible distance are essential measures to consider when planning new areas. Finally, violation-related collisions were found to be more likely at locations that have refuge islands and a low number of lanes.
Nevertheless, the study is subject to several limitations that should be addressed in future studies.
The study used an estimate of the number of walkers at collision locations as a surrogate measure of pedestrian exposure. While this is a commonly used surrogate measure for pedestrian exposure in the safety literature, more precise measures for pedestrian exposure can be explored, including collecting extra survey data or applying activity-based models to estimate the pedestrian volume at a location accurately.
The estimated number of walkers at collision locations in this study is based on Canadian census data that are only available at the tract level. Although this method has some benefits, it suffers from a major drawback: many mid-blocks located in the same tract are assigned similar numbers of walking trips regardless of other road characteristics. Consequently, the coarse-grained pedestrian volume estimates used in this study may introduce error into the parameter estimates for roadway-level variables. There is, therefore, a need to consider pedestrian volume at a fine-grained, street-by-street level in future studies.
It is essential for future studies to include the vehicle operating speed distribution in the analysis as this can provide an explanation for the impact of many factors on collision severity and enhance the accuracy of the results.
The current study used one data set from one city. Future studies should analyze more data sets from different cities to investigate the impact of cultural and behavioral differences on the results. It should be noted that the “spatial violations” referred to in this study may occur within a different legal context for pedestrian crossings than in other studies in the literature. However, investigating the nuanced differences in the legality of mid-block crossing in the various jurisdictions examined in previous studies is beyond the scope of this study.
Despite the well-established safety benefits of refuge islands ( 15 , 40 , 41 ), the results of the model suggest that the presence of refuge islands may be associated with an increase in the frequency of collisions that involve violations. This result may be an artifact of the coarse measure used to represent pedestrian exposure, but it could potentially be a result of the increase in the frequency of spatial violations at locations with refuge islands. Accordingly, it may be valuable to apply some mitigation measures at locations with refuge islands that aim to reduce the frequency of spatial violations to avoid having a high frequency of these violation-related collisions.
Finally, road network characteristics seem to have a significant impact on spatial violation behavior and the consequent safety issues. Future studies should conduct a more detailed analysis of the pedestrian network indicators and evaluate the impact of micro-scale built-environment characteristics on the results.
Footnotes
Author Contributions
The authors confirm contribution to the paper as follows: study conception and design: Ghomi and Hussein; data collection: Ghomi; analysis and interpretation of results: Ghomi; draft manuscript preparation: Ghomi and Hussein. All authors reviewed the results and approved the final version of the manuscript.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
