Abstract
Previous crime prediction research focusing on regional characteristics is lacking in terms of the examination of physical characteristics of individual crime scenes. This study, therefore, presents a street crime prediction model by analysing streetscape features within an actual field of vision for a low-rise housing area in South Korea, which serves as a gauge for potential offenders to carry out crime. First, we performed logistic regression to analyse the correlation between street crime opportunities and the elements of streets to derive an equation for predicting street crime using selected variables. Next, we created a crime prediction map based on a geographic information system that contains attribute data on these physical characteristics and presented a street crime prediction model based on the derived prediction equation. Finally, to test the prediction model, we compared actual crime data from the selected area with the results obtained from the prediction model. The test results showed that the prediction model classified 11 out of 29 actual crime spots as crime occurrence; among the 312 non-crime spots, 257 were classified as non-crime occurrence. Based on these test results, we confirm that the occurrence of street crime is affected by the physical characteristics within the actual field of vision and discuss the improvement of the prediction model.
Introduction
According to environmental criminology, crimes are influenced by the socio-economic, demographic and physical environments of the crime-prone area, and concentrated or repeated crimes are not random; rather, they occur in a pattern based on prevailing environmental characteristics (Newburn and Sparks, 2004). Therefore, crime opportunities in certain areas can be assessed and predicted by analysing the environmental characteristics that influence crimes. However, while early studies on crime prediction conducted statistical analyses (e.g. regression models) of socio-economic, demographic and physical environments at the macro scale (e.g. city level, census tract level) because collecting data and building data sets was easy (Pratt and Cullen, 2005), they failed to reflect the physical characteristics of individual crime scenes. Generally, socio-economic and demographic characteristics differ markedly between areas, but little within individual spaces in an area (Hillier, 2004). Because of this, these two sets of characteristics are typically controlled for at the micro scale, and only the physical characteristics of crime scenes are analysed (Perkins et al., 1992, 1993).
To address this shortcoming, recent statistical analyses have focused on crime data and the physical environment surrounding crime scenes at the micro level (e.g. block level, individual level). In particular, for an opportunistic crime such as theft, whether a potential offender commits a crime can be determined by the physical characteristics of the location. Such physical characteristics can have different effects depending on the type of theft (intrusion crime, non-intrusion crime), meaning that one crime type is usually selected for analysis. In studies that have specialized in burglary, for instance, the physical characteristics that influence such a crime in each building have been analysed, and the unit of crime prediction is narrowed to a building (Malleson et al., 2013).
In contrast, it is difficult to define an individual unit of analysis for street crime. Hence, research that examines street crime usually analyses how physical characteristics affect crime at the block level and assesses crime probability by street block (or block face) because the street block has a more distinct and easily defined boundary (Perkins et al., 1992, 1993). Nevertheless, street blocks differ in length, and a block may include areas that are far from an actual crime scene and not visible from it. In other words, studies of street crime continue to find it difficult to identify the physical characteristics of actual crime scenes and lack accuracy when assessing and predicting crime probability.
One clear limitation of previous studies is that data about crime and place are scarce, so our knowledge remains very limited (Weisburd, 2015). The law of crime concentration relies heavily on the same kinds of data and sampling techniques, which limits the scope of its generalizability to a larger population (Weisburd, 2015).
In addition, as the physical characteristics of the spatial environment included in the field of vision of potential criminals are the basis for judging whether to commit a crime (Takizawa, 2011, 2013), the failure to reflect micro-spatial characteristics needs to be addressed (Brantingham and Brantingham, 1999; Eck and Eck, 2012; Eck and Weisburd, 2015).
To overcome the limitations of previous studies of street crime, it is necessary to analyse smaller spaces than those attempted by previous researchers. Therefore, this study presents a street crime prediction model by examining streetscape features within an actual field of vision, which becomes a benchmark to judge whether potential offenders are inclined to commit a crime. First, logistic regression analysis is performed to identify the correlation between street crime occurrence and the elements of a street in crime scenes such as building entrances facing the street, presence of parking lots and the height of fenced walls. Following this, a street crime prediction equation is derived by using the variables that influence street crime. Next, a crime prediction map is created based on a geographic information system (GIS) that provides attribute data on physical characteristics, and a GIS-based street crime prediction model is developed from the derived street crime prediction equation. Finally, the model is tested in a study area, its accuracy is verified and its feasibility is examined.
Literature review
Literature review on the occurrence of street crime
According to the crime prevention through environmental design (CPTED) theory and the defensible space theory that describe the specific location of crime and its surroundings, the probability of crime is low in physical environments that increase the detection rate of criminal activity (natural surveillance factor) by maximizing the visibility of local residents, allowing local residents to freely use or occupy the area to enable their direct or indirect control of crime (territoriality factor), and not giving an impression of ease of crime through continuous management (image maintenance factor) (Cozens et al., 2005; Crowe, 2000). The crime factors of physical characteristics mentioned in the above study are classified into three categories: natural surveillance, territoriality and image maintenance factor (Cozens et al., 2005; Crowe, 2000).
Natural surveillance can be facilitated by the design of roadside buildings that minimize the obstacles obstructing visibility and enhance surveillance across the street (Zelinka and Brennan, 2001). For example, it is difficult for criminals to commit a crime in a street with high visibility from a first-floor window because of the possibility of increased natural surveillance (Jacobs, 1961). Verandas, balconies, etc. also increase the opportunity for residents to naturally monitor the street (Brown et al., 1998). Fences or side walls protect the internal space from intrusion, but simultaneously limit natural surveillance. Therefore, high fences or walls can sometimes make the street a blind spot (Brown et al., 1998). Other physical factors limiting natural surveillance include visual obstacles such as telephone poles, roadside trees, street parking and front yards (Greenberg et al., 1982). Windows with security bars were also shown to be a factor limiting natural surveillance (Foster et al., 2011; Perkins et al., 1993). In a study conducted in Korea, ground-floor parking has been identified as a factor affecting natural surveillance (Lee and Kim, 2014). Housing with ground-floor parking is one of the representative low-rise housing types in Korea. Ground-floor parking facing the street forms a deeply recessed space next to the street, with the pillars or walls of the parking lot blocking the view and forming spaces in which potential criminals can hide, thus limiting natural surveillance.
Territoriality implies a virtual area that local residents can freely use or occupy and over which they can exercise their rights. Such a space is likely to discourage criminal activity, because when potential criminals cross territorial boundaries and enter territorial spaces, they become aware that their conduct may be monitored or prevented (Perkins et al., 1992, 1993). Therefore, making territorial distinctions using personal property such as guide signs, warning signs, mailboxes and landscaping has the potential to suppress crime (Newman, 1972; Perkins et al., 1992, 1993). In a study conducted in Korea, territorial distinction through installation of pavement tiles was used as the primary variable indicating the residential environment's territoriality (Baek et al., 2010; Kim et al., 2011).
Finally, image maintenance of the street environment also has a close influence on crime occurrence. Physical incivilities such as damage to buildings and roads, graffiti, and piles of garbage indicate the absence of social control over crime, and are associated with high crime rates (Brown et al., 2004a, 2004b). The broken windows theory argues that if such physical incivilities are not immediately repaired, social control becomes even weaker and results in further inciting of potential criminals (Skogan, 1992).
Establishing the analysis environment
Selection of the study area
The study area is a residential urban zone that has uniform street structures, blocks and lot sizes; it is densely populated with low-rise housing, which is a representative housing type in South Korea. Thus, the study area has a representative residential street environment (Figure 1(a)). The block size and lot size of the study area are approximately 130 m × 32 m and 13 m × 16 m, respectively; one block has 20 lots on average, arranged in two rows of 10 houses. Street width is typically less than 5 m, which is rather narrow, and almost identical in all streets. Most houses in the study area are multi-family housings with fewer than four stories (Figure 1(b)). Therefore, the study area has almost identical conditions except for the streetscape components such as streets, building elements and street furniture (e.g. balconies, high solid front walls, ground floor parking) (Figure 1(c)).
General condition and street elements in the study area. (a) The study area in Seoul, Korea. (b) General condition. (c) Elements of a street.
The reasons for selecting this study area are threefold. First, it is representative of residential streets since it is a highly populated low-rise housing area, typical in South Korea. Second, the study area has similar space structures in terms of street network, topography, and block and lot size; it has no apartment complexes and all conditions except for the elements of a street are identical. Therefore, we can control for the other factors that may influence crime occurrence. Third, it has a relatively low transfer rate, and most residents belong to the middle class. In addition, this area does not house any colleges or manufacturing factories, so it has low ethnic heterogeneity. Therefore, the socio-economic and demographic characteristics that may influence crime occurrence can be controlled.
Establishing the unit of analysis
The unit of analysis in this study is a streetscape within an actual field of vision. To establish the unit of analysis, observation points were first established, after which a field of vision was created.
Establishing the observation points
The physical characteristics of a streetscape vary depending on its components (e.g. streets, buildings facing each other across the street, street furniture). To reflect these changeable physical characteristics, the observation points were established at regular intervals between the opposite sides of buildings (Figure 2). Excluding streets that deviate from regular lots and blocks such as empty lots, schools and nearby parks, 1834 observation points were identified.
Observation points in the study.
Establishing the field of vision
To establish the field of vision from these observation points, we reviewed previous studies that have acknowledged the limitation of visibility and established new units of analysis based on the field of vision. For example, Yoshida et al. (1997) analysed the correlation between pickpocketing and crime spots by assessing the level of natural surveillance around a building based on the size of the windows facing the streets, concluding that the distance that violates the privacy of a resident is approximately 30 m. Based on this finding, Takizawa (2011, 2013) established 40 m as their unit of analysis, deciding to expand this area because their study was conducted in subway station areas with wide streets (over 15 m) with a large number of pedestrians.
Building on these studies (Takizawa, 2011, 2013; Yoshida et al., 1997), the average field of view for the present study was then set on the basis of fieldwork at the site.
In our selected sites, measuring general visibility from an observation point on a narrow street (<5 m wide) showed that the field of vision that allowed us to identify the physical elements was approximately six buildings in all directions and a street radius of 20 m from the observation point. Indeed, straight line visibility surpassed 20 m; however, identifying the front and side areas of buildings over 20 m away on such a narrow street was difficult, making it challenging to determine whether an act of crime was occurring. Hence, the present study established its unit of analysis as the space consisting of the street and six buildings within a 20 m radius of the observation point (Figure 3).
Spatial scope of the unit of analysis.
Methodology
Logistic regression is multiple regression but with an outcome variable that is categorical and predictor variables that are continuous or categorical (Field, 2009). In this study, logistic regression analysis was used to establish the model that predicts the occurrence of crime in the spatial unit. Therefore, the categorical variable explaining whether crime occurs or does not occur was established as a dependent variable, and street elements observed in the scope of the spatial unit were established as the independent variables for the analysis.
Establishing the variables
Crime data
The present study obtained crime data for one year from the ‘crime occurrence compilation’ with the cooperation of A District Police Station in Seoul. The ‘crime occurrence compilation’ documents crime location and time by crime type. We limited the crime type used in the analysis to theft (an opportunistic street crime) that occurred from 12 p.m. to 6 p.m., during which visibility is ensured, the elements of a street are recognizable and the theft rate is at its highest.
A total of 295 thefts occurred in the study area from 12 p.m. to 6 p.m. over one year, after excluding 87 cases whose addresses could not be identified. The final crime data set comprised 217 cases after excluding the cases that did not occur on a typical block: 18 cases that occurred on the street by an empty lot, 19 cases that occurred on the street by a school, 13 cases that occurred on the street by nearby parks (nature) and 28 other cases that occurred on streets that did not fit the pattern of regular lots and blocks. Among these 217 cases, 11 occurred at locations where more than one crime was recorded, so the 217 reported crimes fell on 211 observation points; the remaining 1623 observation points (the total of 1834 minus these 211) had no recorded crime. The status of crime occurrence was a dependent variable divided into two categories; within SPSS, these categories were coded as Crime = 0 (no crime occurrence) and Crime = 1 (crime occurrence).
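The coding of the dependent variable can be illustrated with a minimal sketch. It assumes that each crime record has already been matched to an observation-point identifier; the identifiers, the toy data and the helper name `code_crime_variable` are hypothetical and are not taken from the study's data set.

```python
def code_crime_variable(all_point_ids, crime_point_ids):
    """Return {point_id: 0 or 1}; 1 if at least one crime was recorded there."""
    crime_points = set(crime_point_ids)  # repeated crimes at one point count once
    return {pid: int(pid in crime_points) for pid in all_point_ids}


# Hypothetical toy data: 6 observation points, 4 crime records (two at point 2).
points = [1, 2, 3, 4, 5, 6]
crimes = [2, 2, 5, 6]

coded = code_crime_variable(points, crimes)
n_crime = sum(coded.values())        # points coded Crime = 1
n_no_crime = len(coded) - n_crime    # points coded Crime = 0
```

Applied to the figures reported above, the same logic gives 211 points coded Crime = 1 and 1834 − 211 = 1623 points coded Crime = 0.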
Physical characteristics that affect street crime
In order to select the variables for physical characteristics, crime influence factors of physical characteristics that affect crime occurrence were derived based on previous domestic and international studies examined in the literature review, and the elements of street environment influencing each factor were selected as the physical characteristics of this study.
Variables for physical characteristics.
Na1, Na2, Na3, Na5, Na6, Na7, Na9, Na11 and Te1 are continuous variables that measure the number of applicable elements observed within a unit of analysis, ranging from 0 to 6. Na4, Na8, Na10, Na12, Na13, Na14, Te2, Im1 and Im2 were initially measured as continuous variables and were then reclassified into two categories because their observed range was narrow. Taking Na4 as an example, it was coded into two categories: Na4 = 0 (no house with good visibility from ground floor windows) and Na4 = 1 (house with good visibility from ground floor windows).
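The two-category recoding can be sketched as follows. The sketch assumes that a variable is coded 1 whenever at least one applicable element is observed in the unit of analysis, which matches the Na4 description; the counts and the function name `dichotomise` are hypothetical.

```python
def dichotomise(count):
    """Collapse a count within the unit of analysis to 0 (absent) or 1 (present)."""
    if count < 0:
        raise ValueError("count must be non-negative")
    return 1 if count > 0 else 0


# Hypothetical Na4 counts for five units of analysis (each between 0 and 6).
na4_counts = [0, 2, 0, 1, 6]
na4_binary = [dichotomise(c) for c in na4_counts]
```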
Next, elements of the street environment that affect each factor were also extracted based on previous studies, and some were modified to be made suitable for the actual situation of the site.
Regarding factors affecting natural surveillance, entrance facing a street, balcony, windows (including ground-floor windows with good visibility and windows with security bars), fences and walls, pilotis parking lots, front yards, visual obstacles and street parking were selected based on previous research. Among these, windows and fences/walls were subdivided into front windows, side windows, front walls and side walls, respectively. In addition to the physical characteristics examined in previous studies, stairwell windows and hiding places were selected as factors affecting natural surveillance. Because multi-family houses have a common stairwell to climb to the upper floor, stairwell windows were added as a variable in consideration of the possibility of natural surveillance in the common stairway. Since security doors were not installed in the space between houses, a space surrounded by a wall or an outer wall of 1.5 m or more served as a space for hiding from monitoring eyes, thereby restricting natural surveillance. Therefore, these spaces were considered to be a hiding space and added as a variable. Thus, a total of 14 natural surveillance factors were selected by adding the physical characteristics of natural surveillance from previous studies and some physical characteristics shown in the residential streets of the site.
Elements of the street environment affecting territoriality included guide signs, warning signs, mailboxes and landscaping in previous studies. However, as most of the houses on the site were multi-family houses rather than individual homes, many of the physical factors used in previous studies were not present. Therefore, instead of treating each factor as a separate variable, we combined these factors into one variable called ‘personalisation signs’. In addition, the pavement tiles presented in previous studies were added as a variable of territoriality. Although pavement tiles are only a ground surfacing material, it was judged that a clearly demarcated front area could affect the overall appearance of the street environment. Thus, two variables associated with territoriality were selected: installation of pavement tiles and personalisation signs (signs, mailboxes, etc.).
Variables of image maintenance factor were also extracted from variables that indicate physical incivilities such as damage to buildings and roads, graffiti and pile of garbage as presented in previous studies. Graffiti presented in previous studies was excluded because none was found on the site, and damaged appearance, including damage to buildings and roads and trash piles by the road, were selected as variables.
The final selected variables are shown in Table 1.
Analysis of factors influencing the occurrence of street crime
Descriptive statistics.
N = 1834. SD: standard deviation.
To derive an accurate regression model, analysis was conducted in the following three steps:
1. Univariate logistic regression analysis is performed between each independent variable (the variables for the physical characteristics) and the dependent variable (street crime occurrence), and insignificant variables are eliminated at a significance level of 0.1.
2. Correlation analysis is performed only on the significant independent variables, and the independent variables that may be affected by multicollinearity are eliminated.
3. Multivariate logistic regression analysis is performed on the dependent variable and the independent variables retained after the first and second steps. The weight of each variable is derived, and finally, a street crime prediction equation with these weights is formulated.
Univariate logistic regression analysis
Univariate logistic regression analysis.
Correlation analysis
By examining the correlations among the independent variables, we check for multicollinearity, which arises when selected independent variables are highly correlated with one another. When multicollinearity exists, one of the two variables must be eliminated to avoid over-estimated variance (Field, 2009). Generally, a Pearson coefficient over 0.7 indicates high correlation (Field, 2009). Among the Pearson coefficients between the independent variables, the coefficient of Na1 and Na2 is 0.802 at the significance level of 0.01, which shows very high correlation. Consequently, we eliminated Na2, which has a lower correlation with the dependent variable than Na1.
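The elimination rule described here can be sketched in plain Python. This is an illustrative sketch rather than the SPSS procedure used in the study: the toy data, the variable values and the function names are hypothetical, and only the 0.7 threshold and the rule of dropping the variable less correlated with the dependent variable come from the text.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant variable has no defined correlation; treat as 0
    return cov / (sx * sy)


def drop_collinear(features, target, threshold=0.7):
    """From each highly correlated pair, drop the variable that is less
    correlated with the target; return the names that are kept."""
    names = list(features)
    kept = list(features)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if a not in kept or b not in kept:
                continue
            if abs(pearson(features[a], features[b])) > threshold:
                weaker = min((a, b), key=lambda v: abs(pearson(features[v], target)))
                kept.remove(weaker)
    return kept


# Hypothetical toy data: Na1 and Na2 move together, Na3 does not.
features = {
    "Na1": [0, 1, 2, 3, 4, 5],
    "Na2": [0, 1, 3, 2, 5, 4],
    "Na3": [1, 0, 1, 0, 1, 0],
}
crime = [1, 1, 1, 0, 0, 0]

kept = drop_collinear(features, crime)
```

In this toy data Na2 is dropped because it is both highly correlated with Na1 and less strongly correlated with the crime variable, mirroring the decision reported above.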
Multivariate logistic regression analysis
The 12 independent variables remaining after these two elimination steps were entered, together with the dependent variable, into a multivariate logistic regression analysis.
Goodness-of-fit of the model.
Hit ratios.
Multivariate logistic regression analysis.
When the coefficient β is negative, the probability of being classified as a non-crime spot (Group 0) increases; when the coefficient is positive, the probability of being classified as a crime spot (Group 1) increases (Field, 2009). This means that as Na1, Na3, Na4 and Na6 increase, the probability of crime occurrence decreases, while as Na5, Na8, Na9, Na10, Na11, Na12, Im1 and Im2 increase, the probability of crime occurrence increases. The odds ratio indicates the proportional change in the odds: when an independent variable increases by one unit, the odds of being classified as crime occurrence rather than no crime occurrence are multiplied by this value (Field, 2009). For example, when an independent variable is continuous, as with Na1, each additional house with an entrance facing the street changes the odds of crime occurrence by a factor of 0.668. When an independent variable is categorical, as with Na4, the odds of crime occurrence on a street where a house has good visibility from the ground floor windows are 0.486 times those on a street without such a condition.
In the final analysis, most of the variables of the natural surveillance factor and image maintenance factor were identified to have a significant influence on crime occurrence with certain exceptions, whereas none of the variables of the territoriality factor was found to be statistically significant. This outcome contrasts markedly with the results of previous studies. The difference is believed to stem from a limitation in variable selection: the territoriality variables used in previous studies could not easily be found in the studied multi-family housing area, where several families live in the same building and where territoriality markers cannot be objectively identified or quantified. In addition, the streets in multi-family housing areas offer little scope for territoriality.
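The relationship between a coefficient, its odds ratio and a predicted probability can be sketched in a few lines. The sketch assumes that the reported 0.668 for Na1 is an odds ratio, Exp(β), which is one reading of the text; the baseline probability of 0.5 and the function name are hypothetical.

```python
import math


def probability_after_unit_increase(p_before, odds_ratio):
    """Probability after a one-unit increase in a predictor, given the
    probability before the increase and the predictor's odds ratio."""
    odds_before = p_before / (1.0 - p_before)
    odds_after = odds_before * odds_ratio  # the odds ratio multiplies the odds
    return odds_after / (1.0 + odds_after)


# Assumption: 0.668 is Exp(beta) for Na1, so beta is its natural logarithm.
odds_ratio_na1 = 0.668
beta_na1 = math.log(odds_ratio_na1)  # negative, so crime becomes less likely

# Hypothetical baseline: a spot with a 50% predicted probability of crime.
p_new = probability_after_unit_increase(0.5, odds_ratio_na1)
```

An odds ratio below 1 corresponds to a negative coefficient, which is why the probability falls as Na1 rises; the size of the fall in probability depends on the baseline, whereas the multiplicative change in the odds does not.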
By integrating the weights of the variables for the physical characteristics significantly related to street crime, a street crime prediction equation can be created as follows
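The fitted weights are not reproduced in this section. As a hedged sketch only, a prediction equation derived from logistic regression generally takes the form below, where the symbols are notation introduced here for illustration: β₀ is the intercept, βᵢ is the weight of the i-th significant variable, xᵢ is that variable's value within the unit of analysis, and k is the number of variables retained.

```latex
z = \beta_0 + \sum_{i=1}^{k} \beta_i x_i ,
\qquad
P(\text{Crime} = 1) = \frac{1}{1 + e^{-z}}
```

A negative βᵢ lowers the predicted probability as xᵢ increases and a positive βᵢ raises it, consistent with the interpretation of the coefficients given above.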
Street crime prediction model
Building a GIS-based street crime prediction model
In this section, based on the derived prediction model, a crime prediction map including attribute data on the physical characteristics is built using Quantum GIS 2.6.1, an open source-based GIS program. Spatial data were collected from the National Spatial Information Clearinghouse (www.nsic.go.kr), while attribute data related to the independent variables were collected from an observation study. The final database was built after linking the attribute data from the field survey to the spatial data. Figure 4 shows the process of building the GIS-based street crime prediction model from the analysis of the physical characteristics of a streetscape.
1. Establish observation points in all streets at the same interval.
2. Create a buffer with a 20 m radius. At this point, one buffer generally includes six buildings in all directions from the observation point.
3. Spatial-join the sum of the attribute data from all buildings and the street's spatial data that overlap in the buffer.
4. Obtain the probability of crime occurrence by using the derived prediction equation.
5. Select ‘graduated type’ for the attribute style of the buffer and divide the probability of crime occurrence into four classes from 0% to 100%. The redder (whiter) the class becomes, the higher (lower) the probability of crime occurrence.
Building process of the street crime prediction model.
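The buffer, spatial-join and classification steps can be approximated outside a GIS with a short plain-Python sketch. It is a stand-in for the workflow, not the Quantum GIS procedure itself: each building is reduced to a single coordinate, the intercept and weights are hypothetical values chosen for illustration (not the study's fitted weights), and the four classes are assumed to be of equal width.

```python
import math

# Hypothetical intercept and weights, for illustration only.
INTERCEPT = -0.5
WEIGHTS = {"Na1": -0.4, "Na5": 0.3}


def attributes_in_buffer(point, buildings, radius_m=20.0):
    """Sum the attribute values of all buildings within radius_m of the point."""
    totals = {name: 0 for name in WEIGHTS}
    for b in buildings:
        if math.hypot(b["x"] - point[0], b["y"] - point[1]) <= radius_m:
            for name in WEIGHTS:
                totals[name] += b.get(name, 0)
    return totals


def crime_probability(totals):
    """Apply a logistic prediction equation to the summed attributes."""
    z = INTERCEPT + sum(WEIGHTS[name] * totals[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def probability_class(p):
    """Map a probability to one of four equal classes: 0 (0-25%) to 3 (75-100%)."""
    return min(int(p * 4), 3)


# Hypothetical buildings in metres relative to one observation point at (0, 0).
buildings = [
    {"x": 5.0, "y": 0.0, "Na1": 1, "Na5": 0},
    {"x": -8.0, "y": 6.0, "Na1": 0, "Na5": 1},
    {"x": 30.0, "y": 0.0, "Na1": 1, "Na5": 1},  # beyond 20 m, so ignored
]

totals = attributes_in_buffer((0.0, 0.0), buildings)
p = crime_probability(totals)
cls = probability_class(p)
```

In a real GIS the buffer would be intersected with building footprints rather than single coordinates, so a building that only partly overlaps the buffer could still contribute its attributes.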

Prediction model simulation and results
As the next step, we selected a test site, calculated the probability of street crime occurrence by using the street crime prediction model and tested the prediction model by comparing the simulation results with actual street crime occurrence data. We selected a separate test site to avoid over-fitting the prediction model, which could be problematic if the test were performed on the same site used to build it. The test site is a neighbourhood whose environmental conditions are similar to those of the study area selected for analysing the factors that influence crime occurrence (Figure 5).
Selection of the test site.
Streets by schools, apartment complexes and small parks were excluded, establishing 341 observation points in 341 streets. Thirty cases of street crime occurred over one year in the test site (the total number of crime scenes is 29 since two cases occurred in the same location). Figure 6 illustrates the result of simulating the test site by using the prediction model created in this study.
Result of the simulation predicting street crime (red dots indicate actual crime spots).
Figure 6 illustrates the probability of a crime occurring at all locations in the test site on one map. A cut-off value of 50% was used to collapse the four classes into binary outcomes: if the probability of crime occurrence is over 50%, the location is classified as crime occurrence; if it is below 50%, it is classified as no crime occurrence.
Result of testing the prediction model and its accuracy.
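The cut-off rule and the resulting hit ratios can be sketched as follows. The counts passed to `hit_ratios` are the ones stated in the abstract (11 of 29 crime spots and 257 of 312 non-crime spots classified correctly); the function names are hypothetical, and treating a probability of exactly 50% as no crime occurrence is an assumption, since the text does not specify that case.

```python
def classify(probability, cutoff=0.5):
    """Return 1 (crime occurrence) if the probability exceeds the cut-off, else 0."""
    return 1 if probability > cutoff else 0


def hit_ratios(crime_hit, crime_total, non_crime_hit, non_crime_total):
    """Share of crime spots, non-crime spots and all spots classified correctly."""
    return {
        "crime_spots": crime_hit / crime_total,
        "non_crime_spots": non_crime_hit / non_crime_total,
        "overall": (crime_hit + non_crime_hit) / (crime_total + non_crime_total),
    }


# Counts as stated in the abstract.
ratios = hit_ratios(crime_hit=11, crime_total=29,
                    non_crime_hit=257, non_crime_total=312)
```

Reporting the hit ratio separately for crime and non-crime spots matters here because non-crime spots far outnumber crime spots, so the overall ratio is dominated by the non-crime group.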
The results of the street crime prediction model can be examined in four groups. First, on Street Nos. 3, 7, 10, 11, 12, 15, 17, 21, 26, 37, 39, 41, 42, 43, 44, 46 and 47, no crime was actually committed. The prediction model also classified these spots into the non-crime occurrence group since the crime occurrence probability from all observation points was below 50%. This finding is in alignment with the data on actual street crime indicating no crime occurrence; hence, the prediction model accurately predicts that streets with no crime occurrences have a low probability of crime occurrence.
On Street Nos. 1, 2, 4, 5, 8, 18, 19, 22, 23, 24, 31, 40, 45 and 48, at least one street crime occurred. In actual crime occurrence locations, the crime occurrence probability was assessed to be over 50% and thus classified into the street crime occurrence group. In particular, on Street Nos. 23, 24, 31 and 40, multiple crimes (over two cases) occurred at the same observation point or at a neighbouring observation point; except Street No. 31, all actual crime occurrence locations were predicted to have over a 75% probability of crime occurrence. With regard to the same or neighbouring observation point, where there is an overlapping area, the street environments are similar. The fact that crimes occur multiple times in such a space indicates that its physical characteristics make the street vulnerable to street crime occurrence. Therefore, this finding implies that the prediction model accurately predicts the location in which actual street crime occurred.
In contrast, on Street Nos. 6, 13, 14, 25, 27, 28, 32 and 38, no actual street crime was committed. However, some of the locations on these streets were assessed to have over a 50% probability of crime occurrence, and thus, they were classified into the street crime occurrence group. Although this prediction differs from the actual crime data, it implies that the probability of future street crime occurrence is high and that the street environment of such locations needs to be improved.
Finally, on Street Nos. 9, 29, 30, 31, 33, 34, 35, 36 and 39, at least one street crime occurred, but these locations were classified into the non-crime occurrence group since the crime occurrence probability did not exceed 50% at the actual crime occurrence location. Here the prediction differs from the actual crime data, implying that other factors besides those examined in this study influence street crime occurrence. Examples of such factors include the distance from the crime location to main streets, schools and parks. In this study, only physical environmental conditions within the field of visibility were analysed. However, the distance between a crime location and main streets, schools and parks can influence how accessible getaway routes are for potential offenders who are aware of their location.
Conclusion
In this study, a street crime prediction model based on the analysis of physical characteristics of street space was presented. The crime prediction model presented in this paper has two points of significance.
First, drawing on environmental criminology and its subsequent theories (CPTED theory, etc.), it confirmed that the environmental characteristics within the actual field of view affect the occurrence of street crime, or in other words, the crime decisions of criminals. In verifying the proposed prediction model, about 64% of actual crime occurrence points were classified as street crime occurrence, showing that the prediction model identifies points where street crime actually occurred.
Although this prediction model did not predict all crime occurrence spots with high accuracy, it predicts spaces with two or more cases of crime in the same or adjacent unit space with a high probability ( > 75%), showing that street space where street crime is concentrated has certain physical characteristics that make it vulnerable to crime.
Second, it showed the possibility of evaluating street crime probability at a specific spot. It pointed out the limitations of previous studies that predicted crime occurrence by street segment, performing analysis on a street block (or block-face) unit, and presented a new analytical unit based on the field of view (20 m radius). As a result, the probability of crime occurrence could be assessed at the level of an individual spot.
This prediction model is particularly useful for predicting street crime in planned urban districts where the street networks and sizes of blocks, lots and streets are similar to those in the study area examined herein and where factors besides the elements of a street can be controlled. However, the application of the model may be limited in residential areas with various space structures that are naturally formed or that have developed at different points in time. Additionally, the unit of analysis established in this study is limited in that it cannot include, as independent variables, crime-influencing elements other than the physical elements within the field of vision (such as the location of a park). Furthermore, spatial autocorrelation between the recorded crimes was overlooked. Therefore, crime-influencing independent variables in addition to the physical elements discussed in this paper should be added in a follow-up study, and spatial regression analysis should be conducted in order to control for spatial autocorrelation in the analysis process.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP) (grant number NRF-2015R1D1A1A01059718).
