Abstract
Studies of fear of crime often focus on demographic and social factors, but these can be difficult to change. Studies of visual aspects have suggested that features reflecting incivilities, such as litter, graffiti, and vandalism, increase fear of crime, but the methods used often rely on participants actively mentioning such aspects, so more subtle, less conscious aspects may be overlooked. To address these concerns, this study examined people’s eye movements while they judged scenes for safety. In total, 40 current and former university students were asked to rate images of day-time and night-time scenes of Lincoln, UK (where they studied) and Egham, UK (unfamiliar location) for safety, maintenance, and familiarity while their eye movements were recorded. Another 25 observers not from Lincoln or Egham rated the same images in an Internet survey. Ratings showed a strong association between safety and maintenance and lower safety ratings for night-time scenes for both groups, in agreement with earlier findings. Eye movements of the Lincoln participants showed increased dwell times on buildings, houses, and vehicles during safety judgements and increased dwell times on streets, pavements, and markers of incivilities for maintenance. Results confirm that maintenance plays an important role in perceptions of safety, but eye movements suggest that observers also look for indicators of current or recent presence of people.
Introduction
Fear of crime refers to the emotional response to potential violent crime and physical harm (Covington & Taylor, 1991). It has important health consequences for the individual: people experiencing higher levels of fear of crime have lower levels of mental health (Whitley & Prince, 2005), physical health (Jackson & Stafford, 2009; Ross, 1993), and a reduced quality of life (Stafford, Chandola, & Marmot, 2007). They avoid certain routes (Ravenscroft, Uzzell, & Leach, 2002), walk less (Foster, Giles-Corti, & Knuiman, 2014; Ross, 1993), and have fewer social interactions (Liska, Sanchirico, & Reed, 1988; Ross & Jang, 2000).
Studies of fear of crime often focus on social and demographic factors, such as gender (Pain, 1997; Stanko, 1993; Tomsich, Gover, & Jennings, 2011), age (Braungart, Braungart, & Hoyer, 1980; Clemente & Kleiman, 1976; LaGrange & Ferraro, 1989), geographical location (Smith, 1987; Valentine, 1989), race (Chiricos, Hogan, & Gertz, 1997; Ortega & Myles, 1987), exposure to crime (Skogan, 1987; Stafford & Galle, 1984), and media exposure (Heath & Gilbert, 1996; Romer, Jamieson, & Aday, 2003; Weitzer & Kubrin, 2004). Such factors help to identify target populations for intervention but are themselves difficult to change. Studies have therefore moved to visual factors that can be more easily addressed and have suggested that locations where a perpetrator could hide (Nasar & Fisher, 1993), darkness (Hanyu, 1997; Painter, 1996; Warr, 1990), littering and vandalism (Jackson & Gouseti, 2012; Lorenc et al., 2012), street lighting (Boomsma & Steg, 2012; Farrington & Welsh, 2007; Vrij & Winkel, 1991), and green space (Foster, Giles-Corti, & Knuiman, 2010; Maas et al., 2009) influence a person’s fear of crime.
Theories that explain these visual factors include the prospect-refuge theory (Appleton, 1975, 1984; Fisher & Nasar, 1992), which suggests that people prefer places with open views and where one can hide, and incivilities theory, which suggests that sub-criminal but antisocial activities, such as vandalism and evidence of drug use or dealing, as well as neglect and decay in the environment, play a role in creating fear of crime (LaGrange, Ferraro, & Supancic, 1992; Lorenc et al., 2012; Rohe & Burby, 1988; Wyant, 2008). In contrast to what prospect-refuge theory predicts, however, people feel safer in open spaces without refuge, because places of concealment also allow possible perpetrators to hide (Fisher & Nasar, 1992).
Studies on fear of crime tend to rely on quantitative survey data (Garofalo, 1979; LaGrange et al., 1992; McGarrell, Giacomazzi, & Thurman, 1997; Pitner, Yu, & Brown, 2012). Although such methods allow for large sample sizes, they may suffer from discrepancies between perceived and actual incivilities (Perkins, Florin, Rich, Wandersman, & Chavis, 1990) and the need to keep the survey short (Farrall, Bannister, Ditton, & Gilchrist, 1997). Furthermore, they may be less well suited to examine the visual aspects of fear of crime. Studies have therefore been extended to virtual or actual walks (Andrews & Gatersleben, 2010; Toet & van Schaik, 2012) and judgements of photographs (Austin & Sanders, 2007; Hanyu, 1997; Herzog & Kutzli, 2002) and computer-generated images (Herzog & Flynn-Smith, 2001). Although computer images and virtual walks have the advantage that visual features can be systematically varied, actual images and walks better convey the richness of the natural environment. The use of such methods avoids issues of recall, but they still rely on participants explicitly reporting their impressions and on the experimenter selecting those aspects of the scenes that they think are important for the research question (e.g., comparing photographs with and without green areas). In this study, we therefore use eye tracking to examine whether people’s eye movements may reveal aspects of scenes that contribute to fear of crime that participants would not directly consider reporting or that would not normally be detected by traditional methods.
Only a few studies have made use of eye tracking to assess visual features contributing to fear of crime. By examining heatmaps of a small sample of participants while watching various scenes, Guedes, Fernandes, and Cardoso (2014) found that observers focused on buildings under construction, people, window bars, tunnels, and ends of streets when assessing the scenes for security. Using a mobile eye tracker, Davoudian and Raynham (2012) found that participants reported feeling unsafe at night in unfamiliar neighbourhoods and safe during the day. The eye tracking data showed that observers tended to focus on the pavement, in particular, during the day.
The analysis of eye movement patterns in isolation without comparison to another visual task, however, may be problematic. Eye movements are known to be driven by two types of factors. First, there are bottom-up factors, related to the materials presented. These are often summarised by models predicting the visual salience of objects in a scene, by analysing the stimulus intensity, stimulus orientation, and colour of regions of a visual image (Itti & Koch, 2000; Itti, Koch, & Niebur, 1998), although it has been argued that such models may not predict where people look very well (Birmingham, Bischof, & Kingstone, 2009; Foulsham & Underwood, 2008) and that the tendency to look at salient objects can be overridden by the task (Henderson, Malcolm, & Schandl, 2009). Second, there are top-down factors, which include the particular task participants are conducting and expectations and goals of the observer (DeAngelus & Pelz, 2009; Yarbus, 1967). The relative importance of bottom-up and top-down factors appears to depend on the stimuli and tasks involved. For example, when judging faces, the task participants are performing often has little effect (Kwart, Foulsham, & Kingstone, 2012; Nguyen, Isaacowitz, & Rubin, 2009; Pelphrey et al., 2002). Other studies, using a painting (DeAngelus & Pelz, 2009; Yarbus, 1967) or a portrait (Tatler, Wade, Kwan, Findlay, & Velichkovsky, 2010), did find reliable effects of the observer’s task. Subsequently, studies have tried to infer the task performed from the eye movement patterns. Although one study suggested that task could not be reliably deduced from eye movement patterns (Greene, Liu, & Wolfe, 2012), others have demonstrated that eye movements predict task above chance (Borji & Itti, 2014; Haji-Abolhassani & Clark, 2014; Kanan, Bseiso, Ray, Hsiao, & Cottrell, 2015; Kanan, Ray, Bseiso, Hsiao, & Cottrell, 2014). For natural scenes, task appears to influence eye movements more strongly. For example, different regions were inspected when viewing images during visual search and memorisation, although basic eye movement parameters, such as fixation durations and saccade amplitudes, were largely unaffected (Castelhano, Mack, & Henderson, 2009). Likewise, overt following of directional cues, such as gazing or pointing individuals or arrow signs in the scenes, was reduced when participants memorised the scenes compared with when freely viewing the same scenes (Hermens & Walker, 2015).
Because of the interplay between bottom-up and top-down factors, simply asking participants to judge the safety of a scene while recording their eye movements may lead to a confound between top-down and bottom-up factors that jointly determine the objects and parts of the scene that observers look at. For example, people may look at bins in a scene because they are relevant for the task (judging how safe the scene is) or because they are salient objects in the scene. By comparing two tasks for the same scene, the visual aspects of the scene are kept constant, thereby isolating the effects of task. One aspect to consider is that repeated presentation of the scene may influence the pattern of eye movements towards that scene. Although past studies have suggested that repeated presentation of the same scene has no systematic influence on people’s eye movements (Hermens & Walker, 2015; Võ & Wolfe, 2012), we ensured that repeated presentation did not influence the average data by counterbalancing the order of the tasks across participants. The two tasks that we use in this study are judging scenes for safety (indicating how safe this scene looks with respect to crime) and for maintenance (indicating how well this scene is maintained, for example, whether the grass is kept and repairs have been made). We chose these two tasks because both require a global exploration of the scene without explicitly directing participants to specific objects (as would, for example, occur when asking participants to determine the number of cars in an image). However, because past studies have suggested that how well a scene is kept influences perceptions of safety, our comparison is expected to focus on small differences between safety and maintenance judgements.
For our main participant group, we chose students for two reasons. First, we expected them to be familiar with the university and the main student residential areas, which allowed us to vary the expected familiarity of the scenes presented. Second, because participants needed to attend the lab, recruitment of this participant group was more straightforward (they were already in the area). Furthermore, students are an interesting target group for studies on fear of crime, as reports have suggested that fear of crime in students is particularly prevalent at university campuses because of the absence of people at night and the design of the campuses (McCreedy & Dennis, 1996; Nasar & Fisher, 1992; Woolnough, 2009). To examine whether ratings of the images were unique to participants being students and from one of the two locations shown (Lincoln, UK), we collected additional safety and maintenance ratings for all of the images from a second group of participants who were from neither location, using an online survey.
We presented our participants with a large number of stimuli (a total of 80 photographs) to avoid drawing conclusions on accidental properties of the images used. Images were of four possible types, varying time of day (day or night) and expected familiarity for our main participant group (images from the campus and surrounding area in Lincoln, UK—the familiar condition, and images from a different university and surroundings, Royal Holloway University and Egham, UK—the unfamiliar condition). To assess whether these participants were actually familiar with the scenes, we also asked them to indicate (on a 3-point scale) how familiar the scenes were and—at the end of the study—to name the unfamiliar second location for them. To further examine the role of familiarity on ratings, we also asked a group of observers not from Lincoln or Egham to rate each of the images. Besides collecting eye tracking data and ratings, we also administered a general questionnaire (adopted from Office of National Statistics, 2015) to our main participant group to examine how image ratings compare to more standard fear of crime questionnaire results.
Method
Participants
The main participant group comprised 40 current and former students (26 females and 14 males, aged between 18 and 35 years, with an average of 22.5 years) of the University of Lincoln (UK). The majority of these participants identified themselves as White British, English, Scottish, Welsh, or Northern Irish (N = 32), with various other backgrounds for the remainder of the participants. Likewise, the majority of participants identified themselves as heterosexual (N = 32) and as home students (N = 35). Participants took part in the study in return for course credits (the majority of the participants) or without reimbursement. A further 25 observers (15 females, 9 males, and 1 who did not wish to say; aged between 20 and 70 years, with an average of 40.8 years), not from Lincoln or Egham, also rated each of the images using an online questionnaire. The study was approved by the local ethics committee, in agreement with the guidelines of the British Psychological Society (BPS) and the Declaration of Helsinki.
Apparatus
Participants from the main participant group viewed the images on a ViewSonic VX2268WM flat screen (set at a resolution of 1280 × 1024 pixels and a refresh rate of 60 Hz), controlled by a LanBox Lite PC running the Windows 7 operating system and software built with SR Research’s Experiment Builder. Eye movements were recorded by an EyeLink 1000 (SR Research, ON, Canada) desk-mounted eye tracker, controlled by a second LanBox Lite PC. Participants sat with their head in a head-and-chin rest at a distance of about 80 cm from the computer screen. A standard USB keyboard and optical mouse were used to record responses. Participants rating the pictures online used their own Internet devices to complete the survey, which was administered via the Qualtrics.com website.
Design
Participants from the main participant group performed three blocks of 80 trials each, rating 80 individual images for safety, maintenance, and familiarity. Participants either judged images for safety first (N = 20) or for maintenance first (N = 20) but always performed the familiarity judgements in the third block (i.e., only the safety and maintenance judgements were counterbalanced across participants). At the beginning of each block, the task was presented to the participant. Of the 80 trials per block, 20 had day-time images of their town of study (Lincoln), 20 had night-time images of their town of study (Lincoln), 20 had day-time images of a different university town (Egham), and 20 had night-time images of a different university town (Egham). After completing the rating tasks (during which eye movements were tracked), participants completed a questionnaire on the same computer (but without eye tracking). The order of the images in each block was randomised for each participant. Participants rating the pictures online only performed the safety and maintenance rating tasks, using the same 80 pictures in a random order.
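The design described above can be sketched in a few lines of Python. This is an illustration only: the experiment itself was built with SR Research software, and the image labels, the alternation of task order by participant index, and the function names below are all hypothetical.

```python
import random

# The four image types: 20 images each (location x time of day).
CONDITIONS = [
    ("Lincoln", "day"), ("Lincoln", "night"),
    ("Egham", "day"), ("Egham", "night"),
]
IMAGES_PER_CONDITION = 20


def make_image_set():
    """Return 80 hypothetical image labels, 20 per condition."""
    return [
        f"{location}_{time_of_day}_{i:02d}"
        for location, time_of_day in CONDITIONS
        for i in range(1, IMAGES_PER_CONDITION + 1)
    ]


def block_order(participant_index):
    """Counterbalance safety and maintenance; familiarity is always third.

    Alternating by participant index is an assumption made here; the text
    only states that 20 participants started with each task.
    """
    first_two = ["safety", "maintenance"]
    if participant_index % 2 == 1:
        first_two.reverse()
    return first_two + ["familiarity"]


def make_session(participant_index, seed=None):
    """Build three blocks of 80 trials, each with its own random image order."""
    rng = random.Random(seed)
    images = make_image_set()
    session = []
    for task in block_order(participant_index):
        order = images[:]
        rng.shuffle(order)
        session.append((task, order))
    return session
```

With 40 participants indexed 0 to 39, this scheme yields 20 participants who judge safety first and 20 who judge maintenance first.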
Stimuli
In total, 80 photographs (scaled down to a resolution of 960 × 540 pixels; 36 cm wide and 15 cm tall on the monitor used for eye tracking) served as the stimuli (images available from Hermens, 2017). Photographs were taken with a standard point-and-shoot camera and depicted street and campus scenes of the town of study of the main participant group (Lincoln) and another university town (Egham, where Royal Holloway University is located) taken during day and night times. Because people in images tend to strongly attract observers’ gaze (Birmingham et al., 2009; Röhrbein, Goddard, Schneider, James, & Guo, 2015), photographs of scenes containing people were avoided.
Procedure
Before taking part, participants were informed about the aim of the study and provided online or written consent. For eye tracking, a standard 9-point calibration procedure of the eye tracker was performed, involving the fixation of 10 fixation targets presented on the computer screen at the positions of a 3 × 3 grid (the first and last targets were presented in the centre). Calibration was considered acceptable when the recorded fixation locations matched the 3 × 3 grid on which the calibration targets were presented, yielding a reported accuracy of 0.25° to 0.5° (SR Research, ON, Canada). Before eye tracking, the task was signalled to the participant on the screen and participants pressed a key to start the block. Every trial in a block started with a fixation point, presented at one of four possible fixation locations outside the image region (left, right, up, down). The trial was started by the experimenter as soon as the participant fixated on this fixation target. The target image would then appear for 1,500 ms, followed by a response screen that remained until participants clicked with the mouse on a button to indicate their response. Participants rated the images for safety (one block) and maintenance (another block) on a 7-point Likert-type scale (Figure 1). For the familiarity ratings, participants were asked to indicate whether they had never seen the scene before (left button), had seen the scene but did not come there often (middle button), or knew the scene well (right button). After each block of 80 images, participants were given a short break until they indicated that they were ready for the next block. After completing all three ratings for each of the 80 images, participants completed a short questionnaire about their perceptions of crime and demographics (adopted from Office of National Statistics, 2015) on the same computer as for the image rating tasks. Upon completion, participants were verbally debriefed, received a written debrief form to take home with contact details, and were thanked and dismissed. Participants rating the pictures online received instructions in a welcome statement, after which they rated each of the images for safety and maintenance in a random order, followed by debrief information and contact details.

Figure 1. Illustration of the stimulus sequence used for eye tracking. At the start of each trial, participants were asked to fixate a fixation point, randomly placed at one of four positions outside the area where the image was going to appear (above, below, left, or right), to ensure that participants were not directed to particular aspects of the image (e.g., those in the centre of the image) by the fixation point. The experimenter monitored the participant’s eye movements and started the trial as soon as participants fixated on the fixation point. The image would then be shown for 1,500 ms, followed immediately by the rating scale. This scale stayed on the screen until participants clicked with the mouse to indicate their rating. The image was shown before the rating scale to ensure equal viewing times of the image across participants and conditions.
Data analysis
Raw eye movement data (horizontal and vertical coordinates on the screen) were automatically parsed into saccades and fixations using the EyeLink 1000 system’s parser, applying the default velocity (30°/s) and acceleration (8,000°/s²) criteria. Only fixations were analysed, as it can be assumed that visual information is extracted only during these intervals. Regions of interest (ROIs; images available from Hermens, 2017) were created for each image by using GIMP to colour the image for the corresponding areas, after which custom-built MATLAB scripts were used to superimpose fixations on these ROIs and classify the data. The main focus of this analysis will be the dwell times on the different ROIs. Although the presentation duration of each image was fixed, we chose to present dwell times as a percentage of the overall presentation duration rather than in milliseconds because this allows for easier comparison with future studies that may use different presentation durations. For one participant, the eye movement data were of poor quality due to reflections on the glasses that the participant was wearing. Eye movement data of this participant are therefore not included in the results. Further statistical analysis of the data was conducted in R.
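The classification of fixations by ROI and the conversion of dwell times to percentages can be illustrated with a minimal Python sketch. It is not the MATLAB code used in the study: the ROI map is assumed here to be a grid of text labels rather than a colour-coded image, fixation coordinates are assumed to be already expressed in image pixels, and all names are hypothetical.

```python
from collections import defaultdict

TRIAL_DURATION_MS = 1500  # image presentation duration given in the Procedure


def roi_at(roi_map, x, y):
    """Return the ROI label at pixel (x, y), or None outside the image."""
    if 0 <= y < len(roi_map) and 0 <= x < len(roi_map[0]):
        return roi_map[y][x]
    return None


def dwell_percentages(fixations, roi_map, trial_ms=TRIAL_DURATION_MS):
    """Sum fixation durations per ROI and express them as % of the trial."""
    totals = defaultdict(float)
    for x, y, duration_ms in fixations:
        label = roi_at(roi_map, int(x), int(y))
        if label is not None:
            totals[label] += duration_ms
    return {label: 100.0 * ms / trial_ms for label, ms in totals.items()}


# Toy 2 x 4 label map (rows are y, columns are x) and three fixations
# given as (x, y, duration in ms).
roi_map = [
    ["building", "building", "street", "street"],
    ["pavement", "pavement", "street", "street"],
]
fixations = [(0, 0, 300), (1, 0, 150), (3, 1, 600)]
result = dwell_percentages(fixations, roi_map)
print(result)  # {'building': 30.0, 'street': 40.0}
```

Because saccades and blinks also take up time, the percentages for one trial will generally sum to less than 100%.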
To examine the influence of familiarity and time of day on participants’ ratings, linear mixed effects analyses (with participants and images as random effects) were used to account for the variability across both participants and images. The statistical significance of interactions, main effects, and simple effects was determined by comparing the model with the effect of interest against the nested model without that effect, using a likelihood ratio test (yielding χ2 statistics).
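The logic of the likelihood ratio test can be sketched as follows. The analyses themselves were run in R; this Python fragment only shows how a χ2 statistic and its p-value follow from the log-likelihoods of two nested models that differ by a single parameter. The log-likelihood values are made up for illustration, and the helper names are hypothetical.

```python
import math


def chi2_sf_df1(statistic):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom.

    For 1 df, P(X > x) equals erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(statistic / 2.0))


def likelihood_ratio_test(loglik_full, loglik_reduced):
    """Compare nested models that differ by one parameter (1 df only)."""
    statistic = max(0.0, 2.0 * (loglik_full - loglik_reduced))
    return statistic, chi2_sf_df1(statistic)


# Hypothetical log-likelihoods for a model with and without one effect.
statistic, p_value = likelihood_ratio_test(-1000.00, -1003.58)
print(round(statistic, 2), round(p_value, 4))
```

A log-likelihood difference of 3.58 gives a statistic of 7.16 and a p-value of approximately .0075. Tests with more than one degree of freedom need the general chi-square distribution, which this sketch does not implement.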
Results
At the end of the experiment, we asked participants whether they recognised the second location where the pictures were taken. None of the participants could indicate that the pictures were from Egham. We also asked whether they had been to Egham or to Royal Holloway University, and no one indicated that they had. In the online version, we also asked participants to guess where the images were taken. A few participants (7 out of 25) correctly guessed Lincoln (again, no one identified Egham), but this could be because they knew the researchers were from Lincoln (e.g., from the information given before the survey) rather than because they actually recognised the images.
Ratings
Figure 2 provides an overview of the images that received the highest and lowest safety and maintenance ratings from the participants from Lincoln (doing the eye tracking task). Two of the images with the highest safety ratings were from the university campus (where the testing took place), suggesting that familiarity plays a role in the safety ratings. Some of the images with high safety contain green space (Foster et al., 2010; Maas et al., 2009), which could be another factor, although this also turns up in images with high maintenance ratings. Images with low safety ratings show dark areas (Hanyu, 1997; Painter, 1996; Warr, 1990) or a narrow street (Nasar & Fisher, 1993), in line with results from previous studies. Some of the images are rated highest or lowest for both safety and maintenance, suggesting that the two types of ratings are related (Jackson & Gouseti, 2012; Lorenc et al., 2012).

Figure 2. Images with the highest and lowest safety and maintenance ratings.
To more systematically investigate the effects of time of day and familiarity, Figure 3a plots the average ratings across images of the different categories (day or night, familiar or unfamiliar town). Mixed effects analyses showed a significant three-way interaction between the task (safety or maintenance), the time of day (day or night), and familiarity (Lincoln versus Egham images), χ2(1) = 27.9, p < .001.

Figure 3. (a) Average safety and maintenance ratings in the various conditions (day-time familiar, day-time unfamiliar, night-time familiar, night-time unfamiliar; with familiar = Lincoln images and unfamiliar = Egham images). (b) The association between safety and maintenance ratings. Each symbol in the data plots shows the average rating for one image. Different symbols show the different conditions. (c) Percentage of “unfamiliar,” “somewhat familiar,” and “well-known” ratings for each of the image types. (d) Average safety and maintenance ratings for previously seen (somewhat familiar and well-known) and unseen (unfamiliar) Lincoln scenes. Error bars show the standard error of the mean across participants.
To examine this three-way interaction further, the pattern of results for the two tasks was examined separately. For the safety ratings, a significant interaction was found between time of day and familiarity, χ2(1) = 31.2, p < .001. For day-time pictures, unfamiliar images were rated significantly safer than familiar images, χ2(1) = 26.9, p < .001. For night-time pictures, this effect was reversed, and familiar images were rated significantly safer than unfamiliar images, χ2(1) = 11.2, p < .001. For the familiar (Lincoln) images, day-time scenes were rated significantly safer than night-time images, χ2(1) = 209.4, p < .001. This advantage for day-time images was also found for unfamiliar (Egham) images, χ2(1) = 503.8, p < .001.
For maintenance ratings, the interaction between time of day and familiarity was also significant, χ2(1) = 26.0, p < .001. For familiar (Lincoln) images, no significant effect of time of day was found on maintenance ratings, χ2(1) = 0.35, p = .55. For unfamiliar (Egham) images, day-time images were rated significantly better maintained than night-time images, χ2(1) = 67.4, p < .001. For day-time images, the unfamiliar (Egham) images were rated significantly better maintained than the familiar (Lincoln) images, χ2(1) = 69.2, p < .001. For night-time images, there was no significant difference between the familiar and unfamiliar town, χ2(1) = 1.68, p = .19.
The pattern of results for safety and maintenance shows considerable overlap, although a few differences can be observed (e.g., no maintenance difference between day-time and night-time Lincoln images, but a significant difference in safety ratings for these two groups of images). To examine the association between safety and maintenance ratings in more detail, Figure 3b plots the safety rating for each image against the maintenance rating. Across all images, there was a significant (Pearson) correlation between safety and maintenance ratings, r = .70, p < .001. Significant correlations were also found for the different image categories: night-time unfamiliar, r = .55, p = .011; night-time familiar, r = .84, p < .001; day-time unfamiliar, r = .91, p < .001; and day-time familiar, r = .83, p < .001. The relatively low correlation for the night-time unfamiliar images may relate to the smaller range of safety and maintenance ratings found for this type of image, as seen in the data plot (Figure 3b).
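The per-image correlation analysis can be illustrated with a short Python sketch: ratings are first averaged across participants for each image, and the Pearson correlation is then computed across images. The ratings below are invented toy values, not the study’s data, and the function names are hypothetical.

```python
import math


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def image_means(ratings_by_image):
    """Average each image's ratings across participants."""
    return [mean(ratings) for ratings in ratings_by_image]


def pearson_r(xs, ys):
    """Pearson correlation between two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Toy data: four images, each rated by three participants on a 7-point scale.
safety = [[6, 7, 5], [3, 4, 2], [5, 5, 5], [2, 1, 3]]
maintenance = [[5, 6, 4], [4, 4, 4], [6, 5, 7], [2, 3, 1]]
r = pearson_r(image_means(safety), image_means(maintenance))
print(round(r, 2))
```

Computing the correlation on image means, rather than on individual ratings, matches the data plot, in which each symbol represents one image.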
For each of the images, we asked participants to indicate whether they (1) had never seen the place before, (2) had seen the place but did not come there often, or (3) came there regularly. Figure 3c shows the average frequency of responses for the four types of stimuli (Lincoln night, Lincoln day, Egham night, Egham day). The figure shows that participants, despite not knowing that the pictures in the second set were from Egham and never having been in Egham before, still sometimes indicated that they thought the scene in an Egham image was familiar. Interestingly, they also indicated for a large portion of the Lincoln images that these areas looked unfamiliar to them, even though the images were mostly from the university campus region and the so-called “West-End” area of the town, where a large portion of our participants could be expected to live. To test whether Lincoln scenes were more often familiar than Egham images and whether there were differences in familiarity between images taken during the day or night, a mixed effects logistic regression was performed, pooling data across the two “familiar” response categories (familiar and well known). A significant interaction between location (Lincoln versus Egham) and time of day was found, χ2(1) = 7.16, p = .0075. For Lincoln images, participants recognised day-time scenes more often than night-time scenes, χ2(1) = 21.3, p < .001. For Egham images, no such effect of time of day was found, χ2(1) = 0.026, p = .87. Lincoln images were significantly more often familiar to participants both during day time, χ2(1) = 408.2, p < .001, and night time, χ2(1) = 236.7, p < .001.
The fairly large proportion of the images from Lincoln unfamiliar to participants allowed for one further analysis, comparing participants who were familiar and those unfamiliar with each (Lincoln) scene, thereby controlling for maintenance effects (Figure 3d). Mixed effects analyses showed that having previously seen a Lincoln scene influenced safety ratings, χ2(1) = 6.49, p = .011, but not maintenance ratings, χ2(1) = 1.10, p = .29, suggesting that familiarity influences safety ratings, independently of perceived levels of maintenance.
To examine the role of familiarity further, another 25 participants who were not from Lincoln or Egham rated each of the images via an online questionnaire. Figure 4 shows their ratings compared with those of the participants from Lincoln, revealing a very similar pattern of results across the two participant groups. Mixed effects analyses testing the effects of participant group, time of day (day or night), and location (Egham or Lincoln) showed no three-way interaction between these factors for safety ratings, χ2(1) = 0.76, p = .38, or for maintenance ratings, χ2(1) = 0.0081, p = .93. There was a significant two-way interaction between time of day and group for safety judgements, χ2 = 31.1, p < .001, caused by a larger effect of time of day for participants from Lincoln (but both participant groups showed significantly lower safety ratings at night, p < .001). Participants not from Lincoln showed the same strong correlation between safety and maintenance ratings, r = .73, p < .001 (Figure 4c). Further strong correlations across the two groups were found for the per-image safety ratings, r = .77, p < .001 (Figure 4d), and maintenance ratings, r = .85, p < .001 (Figure 4e).

Figure 4. Rating results from observers not from Lincoln or Egham (where the photographs were taken). (a) Maintenance ratings. (b) Safety ratings. (c) Safety versus maintenance ratings. (d) Safety ratings of observers from Lincoln and those not from Lincoln compared. (e) Maintenance ratings of observers from Lincoln and those not from Lincoln compared.
Visual factors
Past studies have suggested that visual factors, such as locations where a perpetrator can hide (Nasar & Fisher, 1993), darkness (Hanyu, 1997; Painter, 1996; Warr, 1990), signs of littering and vandalism (Jackson & Gouseti, 2012; Lorenc et al., 2012), street lighting (Boomsma & Steg, 2012; Farrington & Welsh, 2007; Vrij & Winkel, 1991), and green space (Foster et al., 2010; Maas et al., 2009) can make a scene look more or less safe. The analysis of our images into ROIs gives the opportunity to test whether the mere presence of certain ROIs (e.g., green space, street lights) in an image or the size of such ROIs determines how safe a scene is rated. Independent-samples t-tests comparing images with a feature against images without this feature partly confirm previous observations and show that the presence of branches, t(42.9) = 3.27, p = .002, and green areas, t(57.1) = 3.48, p = .00095, leads to higher safety ratings. These effects are independent of increases in maintenance impressions due to the presence of such features, as the absence or presence of none of the features significantly influenced maintenance ratings. The data also show that the presence rather than the size of the area determines safety impressions, as none of the correlations between area size (when present) and safety (or maintenance) ratings was significant.
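The fractional degrees of freedom reported above are what Welch’s unequal-variance t-test produces (this is the default of R’s t.test function). Assuming, without confirmation from the text, that this variant was used, the statistic and its degrees of freedom can be computed as in the following Python sketch. The ratings are toy values and the function names are hypothetical.

```python
import math


def sample_variance(values):
    """Unbiased sample variance (divides by n - 1)."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test.

    The degrees of freedom follow the Welch-Satterthwaite approximation
    and are usually not a whole number.
    """
    na, nb = len(a), len(b)
    va = sample_variance(a) / na  # squared standard error of mean(a)
    vb = sample_variance(b) / nb  # squared standard error of mean(b)
    t = (sum(a) / na - sum(b) / nb) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Toy mean safety ratings for images with and without a given feature.
with_feature = [5.0, 6.0, 7.0]
without_feature = [1.0, 2.0, 3.0, 4.0]
t, df = welch_t(with_feature, without_feature)
print(round(t, 2), round(df, 2))
```

When both groups have the same size and variance, the degrees of freedom reduce to the familiar n1 + n2 - 2.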
Eye movements
The eye movement analysis is illustrated in Figure 5, which shows three examples of scenes, the corresponding ROI images, and the dwell times to each of the ROIs for each task. Dwell times were computed as the sum of the fixation durations to that ROI, as a percentage of the duration of the trial (to limit their range between 0% and 100% and for easier comparisons with past or future studies that may employ different trial presentation durations). To examine whether systematic patterns across images can be found in the regions fixated in the two tasks of interest (safety and maintenance), paired-samples t-tests compared dwell times for each of the regions for each image. Figure 6 shows the ROIs for which a p-value smaller than .05 was obtained (used here as an exploratory threshold to identify regions; for formal statistical testing, a Bonferroni correction would be needed). An interesting pattern emerges. When judging images for maintenance, participants tend to focus more on surfaces, such as footpaths, pavements, streets, and walls, but also on some regions that are classically thought to be important for safety, such as bins, damage to the wall (poor upkeep), puddles (disrepair), and graffiti (incivilities). When judging for safety, participants tend to focus more on buildings, shop windows, houses, windows with lights, and vehicles, which suggests that they look for signs of the presence of people.
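To make the distinction between the exploratory threshold and a corrected threshold concrete, the following Python sketch computes a paired-samples t statistic and the Bonferroni-adjusted significance level. The dwell times and the number of comparisons are invented for illustration (the text does not state how many region-by-image tests were run), and the function names are hypothetical.

```python
import math


def paired_t(xs, ys):
    """Return (t, df) for a paired-samples t-test on matched observations."""
    diffs = [x - y for x, y in zip(xs, ys)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    return mean_d / math.sqrt(var_d / n), n - 1


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level that keeps the family-wise rate at alpha."""
    return alpha / n_tests


# Toy dwell times (% of trial) on one ROI of one image for four participants,
# once in the safety task and once in the maintenance task.
safety_dwell = [10.0, 12.0, 14.0, 16.0]
maintenance_dwell = [9.0, 10.0, 13.0, 13.0]
t, df = paired_t(safety_dwell, maintenance_dwell)
print(round(t, 2), df)

# With a hypothetical 200 region-by-image comparisons, each test would need
# p < .00025 rather than p < .05.
print(bonferroni_alpha(0.05, 200))
```

The uncorrected threshold is therefore best read as a screening device for identifying candidate regions, as stated above.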

Figure 5. Examples of images, the ROIs, and the dwell times on these ROIs for the three different tasks. All heatmap images, dwell time plots, and original images can be downloaded from Hermens (2017).

Figure 6. Numbers of images in which the indicated ROIs had significantly (p-value smaller than .05) longer dwell times for (a) the maintenance rating task or (b) the safety rating task.
To analyse the dwell times across the images, the specific ROIs were grouped into broader categories, such as buildings (including houses), fences, green space, pavements (including footpaths), streets, and vehicles (cars, motorbikes, and bicycles). Figure 7a shows the dwell times for these broader categories for the three tasks. Mixed effects models show an interaction between the three tasks and region of interest, χ2(28) = 2,681, p < .001, which remains significant when only the safety and maintenance tasks are compared, χ2(14) = 617.9, p < .001. Mixed effects pairwise comparisons showed significant differences in dwell times between safety and maintenance for bins (maintenance longer), χ2(1) = 27.4, p < .001; for buildings (safety longer), χ2(1) = 140.9, p < .001; for lighting (safety longer), χ2(1) = 13.6, p = .0002; for pavements (maintenance longer), χ2(1) = 178.2, p < .001; for puddles (maintenance longer), χ2(1) = 20.6, p < .001; for signs (safety longer), χ2(1) = 14.523, p = .0001; for streets (maintenance longer), χ2(1) = 25.4, p < .001; and for vehicles (safety longer), χ2(1) = 21.8, p < .001, in line with the categories obtained when analysing individual images (Figure 6). The data also suggest that to examine whether a scene is familiar, participants focus in particular on buildings (comparison with safety: χ2(1) = 202.6, p < .001; with maintenance: χ2(1) = 670.0, p < .001) and signs (comparison with safety: χ2(1) = 54.3, p < .001; with maintenance: χ2(1) = 121.0, p < .001).
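The χ2 statistics in this section are presumably likelihood ratio tests between nested mixed effects models, although the text does not say so explicitly. Under that assumption, the sketch below converts a χ2 value and its (integer) degrees of freedom into a p-value using the standard finite-series expressions for the chi-square upper tail, with the Python standard library only. The function name `chi2_sf` is hypothetical; the example inputs are statistics reported in this article.

```python
from math import erfc, exp, pi, sqrt


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for chi-square with integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= (x / 2) / k
            total += term
        return exp(-x / 2) * total
    # Odd df: two-sided normal tail plus a finite series in odd powers of chi.
    root = sqrt(x)
    tail = erfc(root / sqrt(2))
    term, total = root, 0.0  # first series term: chi^1 / 1
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)  # next term: chi^(2r+1) / (1*3*...*(2r+1))
    return tail + 2 * exp(-x / 2) / sqrt(2 * pi) * total


# (chi-square value, df) pairs as reported in the text.
for stat, df in [(13.6, 1), (14.523, 1), (7.94, 1), (47.0, 42)]:
    print(f"chi2({df}) = {stat}: p = {chi2_sf(stat, df):.4g}")
```

For very large statistics the result underflows to 0.0, which corresponds to the p < .001 reporting convention used here.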

Figure 7. (a) Dwell times of broader categories of ROIs, pooled across images. (b) Average fixation durations for the different tasks and broader categories of interest. (c to f) Fixation durations for each of the four categories of images, shown for five ROIs that occurred in most images. The error bars show the standard error of the mean, computed after computing the average across images first.
Earlier studies have suggested a link between depth of processing and fixation durations (Henderson, Nuthmann, & Luke, 2013; Nuthmann, Smith, Engbert, & Henderson, 2010). To examine such possible depth-of-processing differences, Figure 7b shows the fixation duration for each of the categories and tasks. A mixed effects analysis shows an interaction between ROI category and task, χ2(28) = 52.1, p = .0037, but this interaction is no longer observed when only the safety and maintenance tasks are compared, χ2(14) = 19.2, p = .16. Because the familiarity task was always presented last, this could mean that the interaction reflects changing patterns in fixation durations across the experiment rather than a task effect. Main effects of task, χ2(1) = 55.2, p < .001, and ROI category, χ2(14) = 106.8, p < .001, are found for the safety and maintenance comparison, with longer fixation durations for safety than for maintenance. To examine whether fixation durations depend on the type of images used, Figure 7c to f plots fixation durations for the four image categories (restricted to frequently occurring ROIs). Mixed effects analyses testing the effect of task (safety versus maintenance) for the different types of images showed significantly longer fixations on vehicles for safety judgements of familiar night images, χ2(1) = 14.0, p = .001, and on buildings for safety judgements of unfamiliar day images, χ2(1) = 11.1, p = .001, in line with the interpretation that these regions are important for safety judgements. The other comparisons did not survive Bonferroni correction.
Questionnaire
In the questionnaire, participants fairly often reported feeling unsafe at night (42.5% a bit unsafe, 5% very unsafe), but generally safe during the day (80% very safe, 20% fairly safe) or alone at home at night (47.5% very safe, 37.5% fairly safe). A fairly large group of participants considered vandalism, graffiti, or damage (5% a very big problem, 35% a fairly big problem) or rubbish or litter (15% a very big problem, 45% a fairly big problem) a problem. Fisher exact tests showed that women felt less safe than men when walking alone after dark (p = .008), worried more than men about being raped (p = .008), and worried more than men about being pestered (p = .003), in agreement with earlier findings (Pain, 1997; Stanko, 1993; Tomsich et al., 2011). Detailed results for the questionnaire can be found in Hermens (2017).
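To illustrate the kind of Fisher exact test used for the gender comparisons, the sketch below computes a two-sided p-value for a 2×2 table from hypergeometric probabilities. It assumes that responses are collapsed into two categories (the original analysis may well have used the full response scale), and the counts are invented for illustration; they are not the study's data. The function name `fisher_exact_two_sided` is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k in the top-left cell, margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are at least as unlikely as the observed one.
    return sum(prob(k) for k in range(lo, hi + 1)
               if prob(k) <= p_obs * (1 + 1e-9))


# Hypothetical counts (NOT from the study): rows are women and men,
# columns are "feels unsafe" and "feels safe" when walking alone after dark.
p = fisher_exact_two_sided(14, 10, 3, 13)
print(f"p = {p:.3f}")
```

The small tolerance in the comparison guards against floating-point ties between tables that are exactly as probable as the observed one.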
Image ratings versus questionnaire
To examine whether ratings of images are linked to responses to general questions about fear of crime, Figure 8a compares night safety ratings for familiar (Lincoln) and unfamiliar (Egham) scenes for people who gave different responses to the question whether they felt safe walking alone at night. If ratings of images probe into the same underlying construct as general questions about fear of crime, people who feel (very) unsafe at night are expected to give lower ratings on images, particularly for night-time images of Lincoln (where they walk at night). Contrary to this prediction, there was no significant main effect of feeling unsafe walking alone at night on image ratings, χ2(3) = 0.12, p = .99, or of the location judged, Lincoln versus Egham, χ2(3) = 6.19, p = .10.
Figure 8b makes a similar comparison for reported safety during the day and safety ratings of day images. Although no interaction is found between the response and the ratings, χ2(3) = 0.23, p = .63, there is a main effect of response, χ2(1) = 7.94, p = .0048: people who feel very safe during the day rate day images (irrespective of familiarity) higher for safety.
Figures 8c and d examine whether gender and year of study had an effect on ratings. Neither factor had an effect (three-way interaction between gender, time of day, and location: χ2(2) = 1.29, p = .53; main effect of year: χ2(1) = 0.008, p = .93; for year of study, the three-way interaction model could not be fitted). Thus, although women reported feeling less safe, this did not result in lower safety ratings for the images. Likewise, residing longer around the university area did not change ratings.

Figure 8. (a) Safety ratings of night images for people responding differently to the question whether they feel safe walking alone at night. (b) Safety ratings of day images for people responding differently to the question whether they feel safe walking alone during the day. (c) Safety ratings of female and male participants. (d) Safety ratings per year of study.
Eye movements and questionnaire results
Dwell times on the various ROIs did not depend on participants’ reported level of safety at night, χ2(42) = 47.0, p = .28; the reported level of safety during the day, χ2(39) = 32.8, p = .75; participants’ gender, χ2(15) = 9.35, p = .86; or year of study, χ2(57) = 67.6, p = .16. This suggests that dwell times were influenced mainly by visual features and the task, and less so by participant features, but a lack of statistical power for this particular type of analysis cannot be ruled out.
Discussion
Studies on fear of crime have predominantly examined social and demographic factors, such as gender (Pain, 1997; Stanko, 1993; Tomsich et al., 2011), age (Braungart et al., 1980; Clemente & Kleiman, 1976; LaGrange & Ferraro, 1989), exposure to crime (Skogan, 1987; Stafford & Galle, 1984), and media exposure (Heath & Gilbert, 1996; Romer et al., 2003; Weitzer & Kubrin, 2004), but these factors are often difficult to influence to reduce fear of crime. Studies that looked into visual factors that lead to increased fear have suggested that locations where a perpetrator could hide (Nasar & Fisher, 1993), darkness (Hanyu, 1997; Painter, 1996; Warr, 1990), littering and vandalism (Jackson & Gouseti, 2012; Lorenc et al., 2012), street lighting (Boomsma & Steg, 2012; Farrington & Welsh, 2007; Vrij & Winkel, 1991), and green space (Foster et al., 2010; Maas et al., 2009) are of importance. Studies have typically relied on surveys or ratings of photographs or computer images, which may fail to reveal factors of influence that participants may not think of reporting. To examine whether further visual factors play a role in fear of crime, this study therefore examined participants’ eye movements while they judged photographs (N = 80) for safety. To ensure that identified areas were associated with the task, and did not simply attract visual attention for other reasons (e.g., high salience due to colour), eye movements were compared with those in a second task that could also be expected to require a general scanning of the visual scene (maintenance ratings). Participants also completed a general questionnaire about their fear of crime.
The results from the rating task and the questionnaire were very much in line with past findings. Ratings showed that photographs of scenes at night were rated as less safe (cf. Hanyu, 1997; Painter, 1996; Warr, 1990) and that safety ratings were strongly associated with maintenance ratings (cf. Jackson & Gouseti, 2012; Lorenc et al., 2012). The questionnaire showed that people felt less safe at night and that women felt less safe than men and were more worried about being pestered or being raped (cf. Pain, 1997; Stanko, 1993; Tomsich et al., 2011). The effects of being familiar with a scene were less clear. Photographs of the unfamiliar town (Egham) were rated higher for safety, but only during the day (when maintenance features may be better visible). The results were confounded by higher maintenance ratings for the unfamiliar (Egham) than for the familiar (Lincoln) town. Participants who were from neither town showed very similar ratings, suggesting that the higher maintenance ratings of Egham play a role in the perceived safety. There is some evidence, however, that familiarity played a role. When participants from Lincoln who did not recognise a particular scene were compared with those who did, higher safety (but not maintenance) ratings were found for participants who recognised the scene. This finding agrees with the observation that the two images with the highest safety ratings were from Lincoln’s university campus, where all participants came regularly, suggesting that familiarity with a scene may reduce fear of crime (cf. DuBow, McCabe, & Kaplan, 1979). Interestingly, ratings of the photographs and expressions of fear in the questionnaire were not always related. Only for day-time photographs did lower levels of fear correspond to higher safety ratings, and gender did not systematically influence ratings of photographs. These findings suggest that surveys may tap more into social and demographic factors, whereas ratings of photographs may tap more into visual factors.
In our study, we only compared two locations (Lincoln and Egham), and the results may therefore be specific to these two locations. Our findings, however, do agree with past studies in many respects, including the relationship between safety and maintenance ratings (Jackson & Gouseti, 2012; Lorenc et al., 2012) and lower safety ratings for night images (Hanyu, 1997; Painter, 1996; Warr, 1990). One may argue that our main participant group knew the overall reputation of one of the locations (Lincoln) and that this may have influenced the ratings (as in the halo effect, Nisbett & Wilson, 1977). However, these participants could not identify the other location (Egham) and could therefore not rely on a reputation of this location. Moreover, our second group of participants, who were not from Lincoln, rated the images in a very similar way, suggesting that overall reputation of the location was not a driving factor. We also randomised the presentation of the images for each participant, so that perceptions of a specific image were less likely to be influenced by the set of pictures in which they were presented. To avoid the results depending too strongly on the particular images used, we asked participants to rate a large number of images (80 in total), but this does not exclude that somewhat different results may be obtained with a different set of images (but this would be similar to, e.g., testing a different set of words in a psycholinguistics experiment). In future studies, it would therefore be important to replicate the present results with images from different locations to ensure that the findings do not depend on the specific image set used.
By coding the photographs for ROIs, the presence or absence of certain visual features in the scene could be linked to ratings of safety and maintenance. This analysis suggested that the presence of trees (with or without leaves) and green space improved safety (Foster et al., 2010; Maas et al., 2009), along with maintenance ratings, but no other features were identified this way. An approach in which computer-generated images (e.g., Herzog & Flynn-Smith, 2001) are used that are systematically altered for certain features may therefore provide a better method for tapping into the specific image features that influence people’s ratings than relying on ratings of a set of natural images. In our images, the area occupied by different features did not vary strongly across images (i.e., there were many images with small areas and few images with large areas), which may be an important limitation of the use of natural scenes in this context.
Eye movements revealed an interesting pattern of results that was not obvious from the questionnaire and the rating data. When judging images for safety, participants tended to fixate for longer on areas that could reveal the presence of other people, including buildings (houses, shops, commercial buildings), vehicles (mostly cars), and bright windows. These areas were also found in an analysis across all images using broader categories, and two of these areas (buildings and vehicles) had longer fixations, suggestive of more in-depth processing (Henderson et al., 2013; Nuthmann et al., 2010). We chose to use images without people, as previous studies have found that people in scenes strongly draw observers’ attention (Birmingham et al., 2009; Röhrbein et al., 2015), and an interesting area for future research would be to examine the role of the presence of people on safety judgements of scenes and the corresponding eye movement patterns. When judging scenes for maintenance, participants were most strongly drawn towards streets and pavements, where potholes may indicate how well the scene was maintained. Interestingly, other regions that are normally associated with safety (“bins,” a category that also included litter; greenery; and puddles) were also more strongly attended for maintenance than for safety, suggesting that although these may play a role in safety judgements, their influence does not reach beyond that of judging the overall maintenance of the scene.
Compared with previous studies employing eye tracking to study the visual factors that determine the perceived safety of a scene (Davoudian & Raynham, 2012; Guedes et al., 2014), this study is the first to control for bottom-up factors, such as visual salience of the scenes (Itti & Koch, 2000; Itti et al., 1998). To control for bottom-up factors, we presented the same scenes under different tasks, so that differences between the two tasks could not be due to visual aspects of the scene alone. We selected two tasks (safety and maintenance judgements) that did not explicitly direct observers’ attention to specific objects in the scene (which, for example, counting the number of cars would have done) and also avoided asking participants to look at the scenes without an instruction (where participants may start searching for clues about why they are asked to look at these specific images). Although the use of a comparison task for this purpose is a clear strength of our approach, it may simultaneously present a weakness, as the two tasks compared (safety and maintenance judgements) are inherently related (Jackson & Gouseti, 2012; Lorenc et al., 2012). Although the tasks tapped into related aspects of visual scenes, analysis of the eye movements demonstrated a clear pattern of differences. Future studies, however, could explore other comparison tasks. Ideally, such other tasks require the global scanning of each image and do not selectively draw attention to particular areas of the image (e.g., as in counting or visual search tasks). One possible task would be a memory task, in which participants are asked to decide whether a section of an image shown after the target image was part of that image (Hermens & Walker, 2015). The advantage of this task is that the presentation duration can be chosen by the experimenter (in contrast to, for example, visual search, where search ends after the participant locates the target), so that this aspect of presentation can be matched to the judgement task. Alternatively, eye movements of groups of participants (e.g., males and females) within the same task can be compared. Such comparisons did not reveal any group differences in this study, but this could relate to the sample size, which sufficed to measure within-participant differences, but may be inadequate to reveal group differences.
Our results provide recommendations beyond those already known, such as maintaining areas, improving lighting, and making people familiar with their surroundings. The main recommendation from our eye movement results is that cues indicating the presence of people (whether they are actually there or not) may reduce fear of crime. It may therefore be beneficial to leave some lights in buildings switched on or have bicycles and cars parked visibly in front of buildings.
Conclusion
Past studies have suggested that reducing signs of incivilities (e.g., litter, vandalism, broken windows), adding green space, and improving lighting can help reduce fear of crime. Our results, using eye movement data while participants rated images for safety and maintenance, suggest that adding signs indicating the presence of people may further reduce fear of crime. Our findings also suggest that eye movements can contribute to the understanding of the fear of crime, beyond what can be learned from surveys and ratings of images.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
