Abstract
The study expands empirical knowledge on nonresponse bias when estimating victimization rates by using latent class analysis (LCA). Based on information about proxy-nonrespondents (hard-to-reach respondents and soft refusals), the study identifies subgroup(s) of persons who are systematically underrepresented because of refusal and unreachability and determines whether an over- or underestimation of different offense-specific crime rates (prevalence and incidence rates) is to be expected. To this end, a broad review of the current state of research is carried out, followed by a nonresponse analysis of a large-scale victimization survey conducted in Germany (n = 35,503). The paper illustrates that a variety of factors must be considered when analyzing nonresponse in victimization surveys and that the current state of research does not allow definitive conclusions about the amount and direction of nonresponse bias. The subsequent analysis shows that LCA constitutes an excellent approach to determine nonresponse bias in surveys. In each sample, one class of persons was identified that is systematically underrepresented, both by refusal and unreachability. In this class, victimization rates of violent crime tend to be significantly higher, indicating an underestimation of crime rates.
Keywords
Introduction
Looking at the current state of research, it is common methodological knowledge that nonresponse can cause nonresponse bias if the probability of nonresponse relates to one of the variables of interest. This can be the case not only for point estimates (Groves and Peytcheva, 2008) but also for the estimation of bi- and multivariate relations (Billiet et al., 2007; Peytchev et al., 2009), although the latter is known to be much smaller (Amaya and Presser, 2017; Heggestad et al., 2015). Nevertheless, current empirical knowledge about the amount and direction of nonresponse bias for specific variables of interest is very limited, especially in criminological research. Although a variety of theoretical assumptions and evidence make nonresponse bias in victimization surveys likely, as yet, there is a lack of reliable empirical evidence on the overall level and direction of nonresponse bias when estimating victimization risks.
Consequently, neither researchers nor political decision makers can verify whether victimization rates from surveys are over- or underestimated and how large these biases may be. The political importance of victimization surveys as a crucial source of crime data, as well as the steadily rising nonresponse rates to surveys (in Europe, surveys with response rates below 50% are usual), suggest that this lack of research is unsustainable for survey methodology in general and for criminological research in particular. 1
In the context of victimization surveys, two principal effects of nonresponse when estimating victimization rates, first described by van Dijk in 1989, are commonly discussed in the research literature: a) the interest in the survey topic and the so-called ‘eager-to-tell hypothesis’, whereby victims are willing to speak about their victimization experiences (which should lead to a higher participation rate of victims and thus an overestimation of victimization risks); and b) the so-called ‘lifestyle hypothesis’, which assumes that persons who go out more frequently have greater victimization risks while being harder to reach in surveys (which should lead to a lower participation rate of victims and thus an underestimation of victimization risks). However, no empirical validation of these relationships is available. Neither van Dijk et al. (1990) nor Griggs et al. (2018) found a correlation between response and victimization rates. Schnell (2002) demonstrated a positive correlation between the number of contact attempts and the number of different victimization experiences; however, the relation was curvilinear for most offenses. Sparks et al. (1977) found that victims and non-victims do not differ in terms of their refusal frequency but rather in terms of their accessibility and lifestyles. Nevertheless, the conclusion of this research, indicating the absence of a serious nonresponse bias, is insufficient for several reasons: For an adequate analysis of nonresponse in victimization surveys, it appears necessary to differentiate between two major reasons for nonresponse: unavailability and refusal. It is commonly understood that these two factors are the result of different processes and thus related to different predictors (see also Couper and de Leeuw, 2003; Groves and Couper, 1998). Van Dijk et al. (1990) have also referenced the necessity of differentiating between reasons for nonresponse when thinking about nonresponse bias in victimization surveys.
Nevertheless, recent research has not systematically differentiated between processes relating to refusal and unavailability in victimization surveys. Currently, a variety of variables (which exceed the above-mentioned approaches) are known to influence the probability of nonresponse. Worth mentioning are sociodemographic variables, fear of crime, and socio-environmental variables. Given that these variables may also be related to the probability of victimization experiences (partly in opposite directions), a systematic review of these effects should be considered when analyzing nonresponse bias. Although the ‘lifestyle hypothesis’ clearly refers to an important relation between availability in surveys and victimization risk, it appears doubtful that this relation has the expected effect on nonresponse bias in cross-sectional surveys. While victimization experiences are asked about for the period of the previous one to five years, questions about going-out behavior are restricted to the current period. As victimization experiences reduce going-out behavior, it seems possible that in cross-sectional surveys, persons who seldom go out (and thus are less likely to be reached in surveys) show a higher level of reported victimization experiences (confirmation of this empirical relation can be found in Birkel et al., 2014). The ‘eager-to-tell hypothesis’ only applies to survey settings where the survey topic is explicitly described, or the questionnaire is presented in advance (e.g. postal surveys). Considering the fact that this is not always the case for victimization surveys, in part due to efforts to avoid selection bias, it appears doubtful that victims who are frequently traumatized—especially in the case of more severe offenses—feel encouraged to participate more frequently. This effect should be taken into account for less severe forms of victimization and survey designs that communicate the survey topic and questions in advance. 
Nonresponse analysis (for understandable reasons) frequently relies on so-called proxy information of nonrespondents, such as hard-to-reach respondents and soft refusals (‘proxy-nonrespondents’). 2
In doing so, most approaches only refer to simple group comparisons, in part by controlling for sociodemographic variations. However, proxy-nonrespondents only represent nonrespondents imperfectly (Lynn et al., 2002; Stoop, 2004; van Leest and Burhenne, 1997). It is likely that proxy-nonrespondents, to a large extent, resemble easy-to-reach or interview-willing respondents. Thus, an approach to nonresponse analysis that identifies proxy-nonrespondents who represent final nonrespondents to a greater extent seems appropriate. For similar reasons, studies investigating the correlation between response rates and overall victimization rates are also misleading. A variety of studies show that the level of bias does not automatically increase as nonresponse rates increase (Groves and Peytcheva, 2008). Further, it is well documented that, in particular, the variation of response probabilities between subgroups produces nonresponse bias (Peytcheva and Groves, 2009). Thus, looking only at the entire group of (proxy-)nonrespondents and their victimization rates may blur relevant differences. It seems both plausible and likely that the amount and direction of nonresponse bias in victimization surveys vary between types of crime. Thus, analyses based on overall victimization rates, which are common (e.g. van Dijk et al., 1990), are not expedient.
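The point that bias depends on how response probabilities vary between subgroups, rather than on the response rate as such, can be illustrated with a small calculation. The following is a minimal sketch under stated assumptions: the function name and all figures are hypothetical and are not taken from the survey analyzed here.

```python
def respondent_mean_bias(groups):
    """Bias of the respondent-based rate relative to the true population rate.

    groups: list of (population_share, response_propensity, victimization_rate).
    Returns (bias, overall_response_rate).
    """
    true_rate = sum(share * rate for share, _, rate in groups)
    responding = sum(share * prop for share, prop, _ in groups)
    observed_rate = sum(share * prop * rate for share, prop, rate in groups) / responding
    return observed_rate - true_rate, responding


# Hypothetical figures: a low-risk majority and a high-risk minority.
# Scenario 1: low response rate, but identical propensities in both groups.
equal_propensity = [(0.8, 0.2, 0.05), (0.2, 0.2, 0.15)]
# Scenario 2: higher response rate, but the high-risk group responds less often.
unequal_propensity = [(0.8, 0.6, 0.05), (0.2, 0.3, 0.15)]

bias_equal, rr_equal = respondent_mean_bias(equal_propensity)
bias_unequal, rr_unequal = respondent_mean_bias(unequal_propensity)

print(f"response rate {rr_equal:.2f}: bias {bias_equal:+.4f}")
print(f"response rate {rr_unequal:.2f}: bias {bias_unequal:+.4f}")
```

In this illustration, the scenario with the higher response rate is the one that underestimates the victimization rate, because only there do response propensities differ between the subgroups.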
Research questions
This study aims to increase empirical knowledge about nonresponse bias when calculating victimization rates by using information about proxy-nonrespondents. The analysis is based on the widespread idea that nonresponse is a stochastic phenomenon, and that people may or may not participate (Groves and Couper, 1998). This ties in with the theoretical class model for analyzing nonrespondents, which assumes that there are different groups of nonrespondents and that these groups can be found among proxy-nonrespondents (Smith, 1984; Stinchcombe et al., 1981; O’Neil, 1979).
Based on this assumption, and considering the above-mentioned weaknesses of previous research approaches, the study seeks to answer the following questions: Which variables are correlated with nonresponse in victimization surveys, because of either direct or indirect effects on victimization risks? Do proxy-measures of nonrespondents include one or more latent classes of persons that are underrepresented among ‘normal respondents’ (= persons that are easy-to-reach and interview-willing)? If yes, do these subgroups indicate an over- or underestimation of victimization rates (differentiating between offense-specific crime rates as well as prevalence and incidence rates)?
Although previous research findings do not indicate significant bias when estimating victimization rates, various conceptual considerations and methodological evidence raise doubts as to whether this is indeed the case. The underlying approach of this study is, therefore, to be understood as a type of sensitivity analysis that puts the current state of research to a test, allowing verification of the absence of relevant nonresponse bias in the estimation of victimization rates. This seems necessary in view of the criminological and criminological-political importance of victimization research, especially against the backdrop of low and decreasing response rates observed in many countries of the world.
The empirical challenge of the following analysis is to determine whether one or more homogenous latent classes of soft refusals and hard-to-reach respondents exist that are systematically underrepresented among normal respondents, that is, those who are easy-to-reach and interview-willing. Finding such latent class(es) and determining their proportion as well as the distribution of victimization rates should allow for assessing the characteristics of systematically underrepresented persons separately for refusals and unreachable persons, and for establishing what level of victimization rates (prevalence and incidence rate) those persons have to determine whether an overall over- or underestimation of victimization rates is to be expected.
To enable such an identification of possible subgroups/classes, pertinent variables linking the probability of nonresponse and victimization must be elaborated (research question 1). As refusal and unreachability are the results of different processes and, thus, related to different predictors (see Couper and de Leeuw, 2003; Groves and Couper, 1998), the state of research on each reason for nonresponse will be reviewed separately.
State of research: Refusal bias in (victimization) surveys
Survey methodology provides several theoretical approaches to explain participation in surveys, including rational choice models, social psychological theories such as social exchange theory, and leverage-salience theory, which can be considered an extension of rational choice models (see Groves and Couper, 1998; Groves et al., 2000; Stoop, 2012). Although these theories have diverse emphases, survey participation is commonly explained multi-causally by various, partially interacting factors such as survey design, interviewer characteristics, social environment and several personal or household-related variables. Upon examining in more detail those variables that have proved to affect willingness to participate in surveys as well as the probability of being a victim, the following four groups of variables are identified as meriting special attention: sociodemographic variables, the interest in and personal experience of the subject of the research question, fear of crime, and community characteristics.
Sociodemographic variables
Several sociodemographic variables have been shown to be related to survey refusal, such as age, household composition, sex, education, migration background, and working status. While younger and older persons, as well as households with children, usually exhibit a higher willingness to participate in surveys, migrants, men, unemployed persons, and persons with low formal education tend to refuse more frequently (for an overview, see e.g. Groves and Couper, 1998; Stoop, 2004). Victimization surveys regularly show a relationship between these variables and the probability of victimization, especially for single-person households, migrants, men, and unemployed persons who show higher victimization risks for property and violent crime (see e.g. Birkel et al., 2014; Kennedy and Forde, 1990; Osborn and Tseloni, 1998). Therefore, nonresponse bias on the estimation of victimization risk appears likely, although this may not be true for each crime type. Of course, it can be critically analyzed whether sociodemographic variables have direct causal effects on victimization experience or rather represent manifestations of underlying social processes. Either way, the statistical relationship between these variables represents the basis of careful conclusions, namely that survey refusal can lead to an underrepresentation of more vulnerable persons, thus causing an underestimation of victimization rates. Based on multivariate analyses, Guzy (2015) showed that lacking survey responses from single persons, migrants, men, unemployed persons and persons with low education leads to an, albeit small, underestimation of victimization rates for the crime types of theft, fraud, and violence (including robbery and assaults).
Interest in and personal experience of the subject of the research question
As mentioned above, the interest in and personal experience of the research subject is one of the most cited nonresponse effects in victimization literature (van Dijk et al., 1990). Stangeland (1996) calls this approach the eager-to-tell hypothesis; it assumes that (provided that respondents are precisely informed about the research question) victims exhibit a higher probability of participating than non-victims, leading to an overestimation of victimization rates (see also Killias, 1990). 3 This assumption is supported by a) theoretical approaches that explain participation behavior in surveys (e.g. leverage-salience theory by Groves et al., 2000), b) survey results that found participation rates to be higher among respondents for whom the survey topic is relevant (for a meta-analysis, see Aust and Schröder, 2009), and c) victimization surveys that found victims to be more willing to participate in surveys (Killias, 1990; McNutt and Lee, 2000). However, this hypothesis appears, at least in part, to contradict some victimological research that finds serious crime to be less frequently admitted to in less anonymous survey settings (Cantor and Lynch, 2000; Johnson et al., 2001; van Dijk et al., 2010), as well as psychological considerations according to which victims, especially of serious violent crime, are known to be frequently traumatized and attempt to suppress their experiences. Because of this, it is likely that the eager-to-tell hypothesis is influenced by the survey administration mode. Moreover, a variety of victimization surveys do not directly refer to the crime topic of the survey, in an attempt to avoid this effect. Thus, a positive relation between response and the victimization probability appears unlikely, especially for serious crime. Accordingly, neither van Dijk et al. (1990) nor Griggs et al. (2018) found a relation between refusal and victimization rates.
Fear of crime
Studies have shown that persons with a higher level of insecurity tend to be more likely to refuse to participate in surveys (Bethlehem and Kersten, 1985; Durrant and Steele, 2008). It is unclear whether this relation leads to an over- or underestimation of victimization rates. On the one hand, it seems plausible that persons who feel more insecure and tend to refuse exhibit higher victimization rates, taking into account that a high level of fear of crime is a reaction to victimization experiences (Dull and Wint, 1997; Ferraro, 1995; Skogan, 1987). On the other hand, it is also possible that fearful persons show a lower level of victimization risk, as fear of crime leads to more pronounced avoidance behavior (Skogan, 1987). Thus, the current state of research concerning the relation between nonresponse, fear of crime and victimization risks is inconclusive. While many studies found victims to be more fearful, a variety of surveys also found no effect (for an overview, see Tseloni and Zarafonitou, 2008), most likely because persons with a high level of fear of crime exhibit more pronounced avoidance behavior (Averdijk, 2011; Rengifo and Bolton, 2012), leading not only to a reduced willingness to participate in surveys but also to a reduced victimization risk. Thus, a systematic underrepresentation of more fearful persons in victimization surveys has the potential to cause both an over- and underestimation of victimization rates, although the latter effect is considered to be especially relevant in the case of longer reference periods (for similar conclusions see House and Wolf, 1978).
Community characteristics
A variety of contextual factors have proved to influence survey cooperation, such as living in an urban area, population density, crime rates, and a lack of social cohesion. One of the most consistently documented of these factors is living in an urban area; residents of larger cities and inner cities respond to surveys at a lower rate than persons from small towns, rural or suburban areas (for an overview, see Couper and Groves, 1996). Couper and Groves (1996) suggest that variables that are negatively associated with community ties, such as high population density, crime rates, social disorganization and crowding, constitute the causal reasons for this correlation. 4 In line with this, research results have verified that persons from socially disorganized neighborhoods show a lower readiness to participate in surveys (Groves and Couper, 1998), while they typically exhibit a higher level of crime and victimization rates (for an overview, see Sampson et al., 2002; for theoretical embedding see social disorganization theory by Shaw and McKay, 1942), leading to an underestimation of victimization rates (see Elliot and Ellingworth, 1997; Groves and Couper, 1998).
State of research: Unavailability bias in (victimization) surveys
Unavailability in surveys constitutes the second most common reason for nonresponse. In this context, availability is usually defined as a function of at-home patterns and leisure-time behavior and, thus, is typically operationalized by the number of contact attempts needed to achieve an interview (Groves and Couper, 1998). Correlates and predictors for unavailability can be found among three groups of variables: a) sociodemographic variables such as age, household composition, working status, migration background and socioeconomic status; b) lifestyle/leisure-time behavior; and c) community characteristics.
Sociodemographic variables
A variety of variables are known to be linked to availability in surveys. Nonresponse studies have found that older persons (and households with older persons), multi-person households, unemployed persons and persons without a migration background, as well as persons with low socioeconomic status (operationalized by education, income and working status), are more reachable in surveys (Freeth, 2004; Groves and Couper, 1998; Groves et al., 2009; Lynn et al., 2002). As criminological research results show that each of these variables is related to victimization risks, nonresponse bias in victimization surveys appears likely. The direction of effects is partly opposite; being a multi-person household or unemployed increases both the availability and victimization risk, and being older or a non-migrant increases availability in surveys but decreases victimization risks, especially for property crime (see e.g. Birkel et al., 2014; Kennedy and Forde, 1990; Osborn and Tseloni, 1998). Thus, predictions about the direction of bias appear impossible.
Lifestyle behavior
In the research literature, the link between nonresponse, lifestyle behavior and victimization rates is one of the most cited when discussing non-availability bias in victimization surveys. The assumption is that persons who are hard-to-reach in surveys usually represent mobile persons who go out more often in their leisure time and are more frequently affected by victimization experiences (see e.g. Stangeland, 1996; van Dijk, 1990; Young, 1988). The assumptions are supported by several victimization theories that are based on lifestyle and routine activity structures (Cohen and Felson, 1979; Hindelang et al., 1978) as well as several groups of studies that found significant correlations between mobility and victimization risks. One group of empirical studies supports routine activity theory and the correlation between going-out behavior and victimization risk (Cohen and Felson, 1979; Hindelang et al., 1978; Jensen and Brownfield, 1986; Lynch, 1987; Miethe et al., 1990), while another group of studies confirms the relationship between mobility (in the sense of relocation) and the probability of being victimized (Bidermann and Cantor, 1984; Dugan, 1999; Reiss, 1978). 5
In addition, a group of analyses investigates the relationship between general accessibility in surveys (operationalized by the number of contact attempts required to achieve an interview) and victimization risks, assuming that pronounced going-out behavior is related to lower availability in surveys and a higher victimization risk (Kury and Obergfell-Fuchs, 1998; Warr, 1994). Confirmation can be partly found in the DEFECT study by Schnell (2002) in which a positive correlation between the number of contact attempts and the number of victimization experiences was found. This relationship was linear for some offenses (burglary, auto theft) and curvilinear for others (bicycle theft, robbery, assault). Further studies that show persons who go out more frequently are harder to reach in surveys underpin the aforementioned hypothesis (Durrant and Steele, 2008; Groves and Couper, 1998), as does a study by Sparks et al. (1977), which found that victims and non-victims differed not in terms of their refusal frequency, but rather in terms of their accessibility and lifestyles.
Nevertheless, when considering the survey designs usually applied to measure victimization experiences and lifestyle behavior, an additional relationship must be taken into account. As most victimization surveys are cross-sectional surveys, victimization experiences are usually asked about for the reference period of the previous one to five years, while lifestyle behavior and going-out behavior are asked about only at the time of the survey. Thus, not only are persons who go out more frequently less likely to be reached in surveys while facing increased victimization risks, but past victimization experiences can also influence current lifestyle and going-out behavior in an attempt to avoid repeated victimization (Averdijk, 2011; Rengifo and Bolton, 2012). Consequently, in cross-sectional surveys, some persons who go out less frequently, and are thus more likely to be reached in surveys, do so because of past victimization experiences, leading to an overestimation of victimization risk (confirmation can be found in Birkel et al., 2014).
Community characteristics
The criminological research literature confirms that community characteristics are closely related to crime risk and thus the level of victimization rates. This can be explained by criminological theories, such as social disorganization theory (Shaw and McKay, 1942) and broken windows theory (Wilson and Kelling, 1982). Against this background, a variety of studies have shown that communities with high population density, mobility and rates of unemployment, along with visual incivilities, weak neighborhood ties and low formal control, commonly exhibit high crime rates, especially for violent and property crime (Sampson and Groves, 1989; Sampson et al., 1997; for an overview see Sampson et al., 2002). At the same time, methodological research also shows that persons from larger communities and socially weaker neighborhoods are harder to reach in surveys (Groves and Couper, 1998; Lynn, 1998). Thus, the relationship between community characteristics, crime risk, and unavailability suggests an underrepresentation of persons from disadvantaged and thus crime-heavy communities, resulting in an underestimation of victimization risks (probably for both property crime and violent crime).
Methodology
Data
The data for the following analysis were derived from a computer-assisted telephone-based survey about victimization experiences, reporting behavior, fear of crime and crime-related attitudes, titled the ‘German Victimization Survey’. The survey was conducted in 2012 by the Federal Criminal Police Office in close cooperation with the Max Planck Institute for Foreign and International Criminal Law. 6 The study was part of a project funded by the German Ministry of Education and Research, ‘Security, perceptions, reports, conditions and expectations – monitoring security in Germany’ (known as the Barometer of Security in Germany or BaSiD). The target population was all German-speaking residents living in Germany aged over 16 years as well as immigrants belonging to the residential population who spoke Turkish or Russian (to the extent that they lived in private households and were available by telephone). The sample contained 35,503 persons and was surveyed in the German, Russian and Turkish languages; the respondents were contacted by landline (n = 28,118) and by mobile telephone (n = 7,385). The telephone numbers were generated by random digit dialing (RDD), a method that uses active number blocks and lets the last digits vary randomly. This approach usually produces good sample coverage as all persons with a landline or mobile telephone have a chance of being selected for the sample.
The sample was drawn by a multilevel and stratified random sample procedure. To achieve an appropriate representation of persons with a migration background, a subsample was generated using a selection process based on the study of names (onomastics). For this name-based approach, in addition to the RDD sample frame, names from a telephone register were assigned probabilities of having a foreign origin and selected for an additional sample. 7
With the aim of achieving the highest possible response rate, target persons who initially refused to participate were contacted again after some time and invited a second time to participate. These nonresponse conversion interviews were carried out by specially trained interviewers and included only respondents who refused with the explanation of ‘no time’ or ‘not interested’. Of the 52,392 phone numbers that went into this post-processing, a total of 2,267 interviews (4.3%) were conducted. Target persons were also contacted without a limit on the number of contact attempts so that very hard-to-reach respondents were also surveyed.
Based on the nonresponse classification and calculation of response rates of the American Association for Public Opinion Research (AAPOR, rate 4), the overall response rate was 22%, while approximately 61% refused to answer and 17% were not-reachable (for more information, see the methodology report, Infas, 2013). This low response rate illustrates the necessity of methodological analysis for this study and the importance of nonresponse analysis in this criminological context.
Analytical strategy
Although a variety of studies have shown that both hard-to-reach respondents and soft refusals represent nonrespondents only imperfectly, a considerable number of studies have found noticeable differences between these groups and easy-to-reach and interview-willing respondents (Lynn et al., 2002; Stoop, 2004; van Leest and Burhenne, 1997). Moreover, some results indicate that sample measures get closer to the true value if the number of contact attempts increases (Billiet et al., 2007; van Leest and Burhenne, 1997). Thus, the strategy of the following analysis follows the assumption that proxy-nonrespondents can be used ‘for estimating how participants and nonparticipants differ from each other and then using this information to obtain at least rough estimates of the size and direction of nonparticipation bias in survey estimates of means and proportions’ (Lin and Schaeffer, 1995: 3). In doing so, it is assumed that soft refusals can provide important insights into the group of final refusals and hard-to-reach respondents into the group of unreachable persons. Consistent with the idea of the class model (Smith, 1984; Stinchcombe et al., 1981; O’Neil, 1979), however, it is assumed that not the entire group of proxy-nonrespondents is informative about final nonrespondents and nonresponse bias, but rather that subgroups/latent classes exist that resemble final nonrespondents, or at least systematically underrepresented persons, to a larger extent.
For the following analysis, the sample was divided into soft refusals (persons who refused to participate at the first stage of the survey but could be converted by a specially trained interviewer a week after the refusal), interview-willing respondents (persons who immediately consented to participate in the survey), hard-to-reach respondents (persons who needed seven or more contact attempts to participate) and easy-to-reach respondents (those who needed a maximum of six contact attempts to be reached for the survey). In order to identify the latent class structure of soft refusals and hard-to-reach respondents, latent class analysis (LCA) (Hagenaars and McCutcheon, 2002) is used. LCA is a multivariate technique and a subset of structural equation modeling that enables the identification of unobservable subgroups within the population under study. This method has already been used in nonresponse analysis (see Feskens et al., 2012); however, it is applied here for the first time to information about proxy-nonrespondents.
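The division of the sample can be sketched as a simple classification rule. This is an illustrative sketch only: the function and parameter names are hypothetical, reachability and cooperation are treated as two separate dimensions, and it is assumed that everyone who needed more than the easy-to-reach maximum of six contact attempts counts as hard-to-reach.

```python
def classify_respondent(contact_attempts, initially_refused, max_easy_attempts=6):
    """Assign a respondent to one group on each of the two comparison dimensions.

    contact_attempts: number of attempts needed for the realized interview.
    initially_refused: True if the interview resulted from a refusal conversion.
    max_easy_attempts: assumed cut-off; above it a respondent is hard-to-reach.
    """
    if contact_attempts > max_easy_attempts:
        reachability = "hard-to-reach"
    else:
        reachability = "easy-to-reach"
    cooperation = "soft refusal" if initially_refused else "interview-willing"
    return reachability, cooperation


# Hypothetical respondents: (contact attempts, initially refused).
for attempts, refused in [(3, False), (6, True), (9, False)]:
    print(attempts, refused, classify_respondent(attempts, refused))
```

Keeping the cut-off as a parameter makes it easy to check how sensitive the group sizes are to the chosen threshold.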
Aiming to find one or more latent class(es) of (proxy-)nonrespondents that is/are systematically underrepresented among normal respondents, the latent class structure of hard-to-reach respondents is compared with that of easy-to-reach respondents, on the one hand, and the latent class structure of soft refusals with that of interview-willing respondents on the other. If proxy-nonrespondents include information about non-ignorable nonrespondents, the sample of soft refusals and hard-to-reach respondents must include at least one latent class of persons who are underrepresented or even not identified within the sample of easy-to-reach and interview-willing respondents. If such a class is identified, an analysis of its victimization rates should allow conclusions about the direction and possibly the level of nonresponse bias.
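To make the notion of a latent class structure more concrete, the following sketch estimates a latent class model with binary indicators by expectation-maximization. It is a toy illustration under strong assumptions, not the procedure used in this study: the analysis reported here was run in Mplus with many sets of starting values, whereas this sketch uses a single random start, simulated data and hypothetical names throughout.

```python
import math
import random


def fit_lca(data, n_classes, n_iter=200, seed=0):
    """EM estimation of a latent class model with binary (0/1) indicators.

    Returns (class_weights, item_probs, log_likelihood_trace), where
    item_probs[k][j] estimates P(item j = 1 | class k).
    """
    rng = random.Random(seed)
    n_items = len(data[0])
    weights = [1.0 / n_classes] * n_classes
    # Random starting values, kept away from the boundaries 0 and 1.
    probs = [[rng.uniform(0.25, 0.75) for _ in range(n_items)]
             for _ in range(n_classes)]
    trace = []
    for _ in range(n_iter):
        # E-step: posterior class membership probabilities per respondent.
        posteriors = []
        loglik = 0.0
        for row in data:
            joint = []
            for k in range(n_classes):
                p = weights[k]
                for j, x in enumerate(row):
                    p *= probs[k][j] if x == 1 else 1.0 - probs[k][j]
                joint.append(p)
            total = sum(joint)
            loglik += math.log(total)
            posteriors.append([p / total for p in joint])
        trace.append(loglik)
        # M-step: update class proportions and conditional item probabilities.
        for k in range(n_classes):
            mass = sum(post[k] for post in posteriors)
            weights[k] = mass / len(data)
            for j in range(n_items):
                hits = sum(post[k] * row[j] for post, row in zip(posteriors, data))
                # Tiny constants keep estimates strictly inside (0, 1).
                probs[k][j] = (hits + 1e-6) / (mass + 2e-6)
    return weights, probs, trace


# Simulated data: two hypothetical classes with different response patterns.
sim_rng = random.Random(1)


def simulate(n, item_probs):
    return [[1 if sim_rng.random() < p else 0 for p in item_probs] for _ in range(n)]


toy_data = simulate(300, [0.9, 0.8, 0.1, 0.2]) + simulate(100, [0.2, 0.1, 0.9, 0.8])
weights, probs, trace = fit_lca(toy_data, n_classes=2)
print("estimated class proportions:", [round(w, 2) for w in weights])
```

Fitting such a model separately to proxy-nonrespondents and to normal respondents, and then comparing class proportions and class profiles, mirrors the comparison logic described above.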
The LCA was run with those classification variables for which the current state of research and the data used indicate a relationship between victimization risk and either availability in surveys (= number of contact attempts needed for a realized interview) or the tendency to refuse to participate in surveys. The analysis distinguishes between violent crime (robbery or assault) and property crime (theft of bicycle/car/motorbike, other personal theft, consumer fraud, abuse of payment card, internet offenses) as well as between prevalence and incidence rates.
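The distinction between prevalence and incidence rates can be illustrated with a short calculation. The sketch below assumes the common definitions (prevalence as the share of persons victimized at least once in the reference period, incidence as the number of incidents per 100 persons); the function names and the example counts are hypothetical.

```python
def prevalence_rate(incident_counts):
    """Share of persons with at least one victimization in the reference period."""
    victims = sum(1 for n in incident_counts if n > 0)
    return victims / len(incident_counts)


def incidence_rate(incident_counts, per=100):
    """Number of reported incidents per `per` persons."""
    return per * sum(incident_counts) / len(incident_counts)


# Hypothetical number of incidents reported by ten respondents.
counts = [0, 0, 1, 0, 3, 0, 0, 2, 0, 0]

print(prevalence_rate(counts))  # share of victims
print(incidence_rate(counts))   # incidents per 100 persons
```

Because repeat victims enter the incidence rate with every incident, the two measures can react differently to the underrepresentation of a small, heavily victimized group.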
For analysis, the statistical package Mplus was used (Muthén and Muthén, 1998–2017). The models were run with 1,000 starting values in the first step and 500 starting values in the second step by using 500 iterations. The bootstrap method, which is recommended for model selection (see McLachlan and Peel, 2000), was run with a sample of 500, increasing the starting value sets to 100 in the first and 50 in the second step for each case. Based on the pertinent research literature (Hagenaars and McCutcheon, 2002; Nylund et al., 2007), the following criteria were used to select the appropriate number of latent classes in each analyzed sample: a) the BIC was minimized; b) the parametric bootstrap method described by McLachlan and Peel (2000) was significant; c) the number of latent classes allowed substantive interpretation; d) the entropy value and average latent class probabilities for most likely latent class membership were at least close to 0.8; and e) the number of boundary estimates was small. Furthermore, the additional condition was set that f) at least half of the starting values converged. 8
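Criteria a) and d) can be sketched numerically. The example below assumes the standard definition of the BIC and the commonly used relative entropy measure based on posterior class probabilities. The log-likelihoods, parameter counts and sample size are hypothetical and do not reproduce the Mplus output of this study.

```python
import math


def bic(log_likelihood, n_parameters, n_observations):
    """Bayesian information criterion: smaller values indicate a better trade-off."""
    return -2.0 * log_likelihood + n_parameters * math.log(n_observations)


def relative_entropy(posteriors):
    """Relative entropy in [0, 1]; 1 means perfectly separated classes.

    posteriors: one row per respondent with the posterior class probabilities.
    """
    n = len(posteriors)
    k = len(posteriors[0])
    uncertainty = -sum(p * math.log(p) for row in posteriors for p in row if p > 0)
    return 1.0 - uncertainty / (n * math.log(k))


# Hypothetical fits: number of classes -> (log-likelihood, free parameters).
fits = {2: (-5200.0, 21), 3: (-5100.0, 32), 4: (-5050.0, 43), 5: (-5040.0, 54)}
n_obs = 2000

bic_by_classes = {k: bic(ll, p, n_obs) for k, (ll, p) in fits.items()}
best = min(bic_by_classes, key=bic_by_classes.get)
print(best)  # 4: the four-class solution has the smallest BIC in this example

# Hypothetical posterior probabilities for three respondents and two classes.
print(round(relative_entropy([[0.95, 0.05], [0.10, 0.90], [0.99, 0.01]]), 2))
```

In practice these two quantities are only part of the decision; the bootstrap test, interpretability, boundary estimates and convergence listed above still have to be checked for each candidate model.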
As a result, for both the sample of soft refusals and hard-to-reach respondents, a four-class model fits the data best, while the samples of interview-willing and easy-to-reach respondents are best represented by a three-class model (for model information, see Appendix A).
Results
Soft refusals versus interview-willing respondents
Tables 1 and 3 present the characteristics of each identified latent class based on the classification variables and the aforementioned methodology. In order to answer the underlying research question, each latent class is assessed with respect to its victimization rates.
Starting with the sample of soft refusals, LCA identified four groups of respondents: a) employees; b) pensioners; c) students and unemployed persons; and d) typical nonrespondents (see Table 1). A comparison with the latent class structure of the interview-willing respondents reveals that the first two classes are largely comparable, at least in terms of the classification variables, the class proportions and the victimization rates. While the first class, with a proportion of approximately 47–50%, primarily represents multi-person households, employees, middle-aged persons and persons with low fear of crime, the second class, with a proportion of 29–31%, consists mostly of older persons (pensioners) and unemployed persons with a lower level of education.
Table 1. Latent class structure of soft refusals and interview-willing respondents (in %).
Table 2. Average marginal effects (AME) on victimization rates for class four of soft refusals (typical nonrespondents).
***p ≤ 0.001; **p ≤ 0.01; *p ≤ 0.05; (*) p ≤ 0.10
Table 3. Latent class structure of hard-to-reach and easy-to-reach respondents (in %).
Considering in greater detail the victimization rates of classes one and two (see Figure 1), one can see that victimization rates (prevalence and incidence rates) are rather similar between soft refusals and interview-willing respondents for each type of crime and crime rate under study. Interestingly, victimization rates of reluctant employees and pensioners tend to be slightly lower than those of their interview-willing counterparts. This holds especially true for the 1-year property crime incidence rate of employees (6.9 versus 10.1 cases per 100 inhabitants) and for the 5-year property crime prevalence rate of pensioners (20.2% versus 24.8%). Multivariate analysis results show that these differences, though small, cannot be explained by a different sociodemographic distribution between soft refusals and interview-willing respondents (see Appendix B).

Figure 1. Victimization rates of soft refusals and interview-willing respondents.
More noticeable differences between soft refusals and interview-willing respondents arise in class three of the LCA. Although the third class, with a proportion between 10% (soft refusals) and 21% (interview-willing respondents), appears structurally similar in the two samples at first glance, the interview-willing students and unemployed persons more frequently include multi-person households, unemployed persons, migrants and persons up to 34 years of age. At the same time, class three of soft refusals contains a larger proportion of highly educated persons and an overrepresentation of men. Presumably, the overrepresentation of men and persons with a higher education level also explains why the proportion of persons with a high level of fear of crime is lower in the sample of soft refusals. In both the sample of soft refusals and that of interview-willing respondents, victimization rates in class three are the highest and much higher than in the other classes. This applies in particular to the sample of soft refusals and to violent crime (5-year prevalence rate: 27.8% versus 20.5%; 1-year incidence rate: 31.8 versus 15.1 cases per 100 inhabitants), but also holds for the 5-year property crime prevalence rate (55% versus 48.5%).
Testing whether the differences in victimization rates between soft refusals and interview-willing students and unemployed persons are caused by a different sociodemographic distribution of the two subsamples (composition effect) shows, however, that the higher victimization rates of soft refusals in class three can be fully explained by the sociodemographic composition of this subsample (see Appendix B). This is empirically relevant, as these differences can be corrected by weighting for these variables, so that no nonresponse bias is likely.
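The logic of a composition effect can be illustrated with a small sketch based on direct standardization. This is a simplification and not the multivariate procedure reported in Appendix B. All strata, counts and reference shares are hypothetical. In the example, both groups have identical victimization rates within each age stratum, so the difference in crude rates stems purely from the different age mix and vanishes once both groups are weighted to the same reference distribution.

```python
def crude_rate(strata):
    """Overall victimization rate in %.

    strata: dict mapping stratum -> (number of persons, number of victims).
    """
    persons = sum(n for n, _ in strata.values())
    victims = sum(v for _, v in strata.values())
    return 100.0 * victims / persons


def standardized_rate(strata, reference_shares):
    """Rate in % after weighting stratum-specific rates to the reference shares."""
    return 100.0 * sum(reference_shares[s] * v / n for s, (n, v) in strata.items())


# Hypothetical subsamples with the same stratum-specific rates (30% and 10%).
soft_refusals = {"under_35": (600, 180), "35_plus": (400, 40)}
interview_willing = {"under_35": (300, 90), "35_plus": (700, 70)}
reference = {"under_35": 0.3, "35_plus": 0.7}  # assumed target distribution

print(crude_rate(soft_refusals), crude_rate(interview_willing))  # 22.0 16.0
print(round(standardized_rate(soft_refusals, reference), 1),
      round(standardized_rate(interview_willing, reference), 1))  # 16.0 16.0
```

If a difference persisted after such an adjustment, weighting on these variables alone would not remove it, which is the situation described below for class four.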
Particularly interesting results emerge when looking at class four of the soft refusals, which does not have a comparable counterpart in the sample of interview-willing respondents. This class of persons represents especially multi-person households, women and persons with a migration background, low level of community ties and a high level of fear of crime. 9 Taking into consideration the empirical knowledge regarding the characteristics of refusals, this class—with a significant size of 11.6%—appears to exhibit known refusal characteristics to a large extent (thus this class was called ‘typical nonrespondents’). Looking at its victimization rates, one can see that the rates with levels between 12.4% and 47.7% are considerably above average for each type of crime and for both types of crime rates (prevalence and incidence rates). Although victimization rates are not as high as in class three, they reach levels that are up to twice as high as those of the whole sample, particularly for violent crime.
Table 2 presents the results of a hierarchical multivariate modeling strategy. The first model (column 1) shows the effect of class four (typical nonrespondents) on the victimization rate in the whole sample of soft refusals, and thus represents the differences in victimization rates between class four and the rest of the sample. The second model (column 2) controls for further variables such as household composition, working status, migration background, age, community size, sex and education. The results show that the higher victimization rates of violent crime do not disappear after controlling for these variables. Although the effect of being a member of class four of soft refusals is small, the differences in victimization rates remain and cannot be explained by the sociodemographic characteristics of soft refusals. Thus, class four of soft refusals indicates a systematic underestimation of violent crime rates. In contrast, the differences for property crime are either not significant or lose their significance after controlling for sociodemographic variables; here, nonresponse bias is unlikely.
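How an average marginal effect for a class-membership indicator is obtained can be sketched as follows. The example assumes a logistic regression for a binary victimization outcome. The text does not state the model type, so this is an assumption, and the coefficients and data are hypothetical rather than estimates from the study. The AME is the mean change in predicted probability when the indicator is switched from 0 to 1 while all other covariates stay at their observed values.

```python
import math


def predicted_probability(intercept, coefficients, covariates):
    """Logistic model: predicted probability of victimization for one respondent."""
    linear = intercept + sum(b * x for b, x in zip(coefficients, covariates))
    return 1.0 / (1.0 + math.exp(-linear))


def average_marginal_effect(intercept, coefficients, rows, index):
    """Mean change in predicted probability when the 0/1 covariate at
    position `index` is switched from 0 to 1, other covariates as observed."""
    total = 0.0
    for row in rows:
        as_member = list(row)
        as_member[index] = 1
        as_non_member = list(row)
        as_non_member[index] = 0
        total += (predicted_probability(intercept, coefficients, as_member)
                  - predicted_probability(intercept, coefficients, as_non_member))
    return total / len(rows)


# Hypothetical coefficients and data; columns: [class_four, young, male].
intercept = -2.5
coefficients = [0.6, 0.8, 0.3]
rows = [[1, 1, 0], [0, 0, 1], [0, 1, 1], [1, 0, 0], [0, 0, 0]]

ame = average_marginal_effect(intercept, coefficients, rows, index=0)
print(round(ame, 3))
```

Comparing the AME from a model without controls (column 1) with the AME from a model with controls (column 2) shows how much of the raw difference is attributable to sociodemographic composition.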
Hard-to-reach versus easy-to-reach respondents
Table 3, as an analogue to the previous analysis, shows that LCA identified four classes among hard-to-reach respondents, while for the sample of easy-to-reach respondents only three classes were found. Again, the first three classes appear rather similar (in both of the samples, one class each of employees, pensioners and students was identified) while class four of the hard-to-reach respondents does not have a comparable counterpart.
Starting with the first class of employees (which can be found in both samples with a proportion of approximately 50%–52%), both hard-to-reach and easy-to-reach respondents show an overrepresentation of multi-person households, employed persons, persons between 35 and 64 years of age and those with an income over EUR3,000 per month. The overrepresentation is higher in the sample of hard-to-reach respondents, however. The second class of pensioners exhibits a similar distribution between both samples, with the exception that the proportion of older persons up to 64 years of age is remarkably higher among the hard-to-reach respondents (87.2% versus 74.8%). Nevertheless, class two is much smaller among hard-to-reach respondents than among easy-to-reach respondents (19.8% versus 33.3%).
Figure 2 visualizes the victimization rates of hard-to-reach and easy-to-reach respondents by class. The victimization rates of the first class (employees) are rather similar between hard-to-reach and easy-to-reach respondents. A significant difference (only at the 10% level) is observed solely for the 1-year incidence rate of violent crime. However, this difference is very small and can be explained by compositional effects derived from sociodemographic variables (see Appendix B).

Figure 2. Victimization rates of hard-to-reach and easy-to-reach respondents.
The rates of pensioners attract slightly more attention, as both the prevalence and the incidence rates of violent crime are statistically significantly lower among hard-to-reach pensioners than among easy-to-reach pensioners. This result seems especially interesting as these hard-to-reach, but less victimized, pensioners are characterized by a tendency to go out more frequently (see above, literature review on ‘Lifestyle behavior’). In this class, being hard-to-reach is linked to more pronounced outgoing behavior but to lower crime risk. Controlling again for compositional differences between hard-to-reach and easy-to-reach respondents, multivariate analysis shows that only the difference in the 5-year prevalence rate of violent crime remains significant. Thus, as in the case of refusals, victimization rates of violent crime tend to be slightly overestimated due to unreachability among pensioners.
Further discrepancies can also be found when looking at the latent class of students. Although in both samples unemployed persons, migrants, persons up to 34 years of age and those with a higher tendency to go out in the evening are overrepresented, the level of overrepresentation is particularly high among hard-to-reach respondents. In addition, the distribution of income appears especially remarkable, as the proportion of persons with an income above EUR 2,000 per month is above average among hard-to-reach respondents, while it is below average among easy-to-reach respondents. Thus, although these groups appear structurally similar, they differ with respect to their sociodemographic variables (especially income distribution). This may explain the significantly higher 1-year incidence rate of property crime among hard-to-reach respondents compared to easy-to-reach respondents (38.8 versus 19.8 cases per 100 inhabitants), which at first glance suggests an underestimation of crime rates due to unreachability in this class. Multivariate analysis controlling for the sociodemographic composition confirms the compositional explanation; the differences in crime rates disappear after controlling for sociodemographic variables.
Class four of the hard-to-reach respondents, again, does not have a comparable counterpart among the easy-to-reach respondents. This group is characterized by an overrepresentation of younger persons, migrants, persons with a low income and those from neighborhoods with a low level of community ties. Again, these are characteristics known to be related to unavailability in surveys. In this class, which represents 17% of all hard-to-reach respondents and thus constitutes a considerable proportion of people, victimization rates are clearly above average for each type of crime and rate, although they do not reach the level of class three of students and unemployed. The considerably higher crime rates hold especially true for violent crime; victimization rates (5-year prevalence 41.5%, 1-year incidence 12.9 cases per 100 persons) are considerably higher than the first two classes (where prevalence rates vary between 2.4% and 7.4% and the incidence rates between 0.5 and 4.8 cases per 100 persons).
Controlling again for compositional effects (here, the differences between class four and the rest of the sample of hard-to-reach respondents were controlled for using sociodemographic variables), the results in Table 4 show that the higher victimization rates for violent crime, as well as the higher 5-year property crime prevalence rate, remain stable after statistically controlling for sociodemographic variables. The effect of being part of the class of typical nonrespondents is very small but stable; thus, an underestimation of victimization rates is to be expected.
Table 4. Average marginal effects (AME) on victimization rates for hard-to-reach respondents.
***p ≤ 0.001; **p ≤ 0.01; *p ≤ 0.05; (*) p ≤ 0.10
Summary and discussion
Based on a review of available literature, the paper provides a unique overview of the current state of research relating to nonresponse when estimating victimization rates. In doing so, a variety of relevant variables and processes were identified that need to be considered when studying nonresponse bias in victimization surveys. This overview demonstrated that, based on available evidence, reliable predictions about the overall direction or even approximate amount of nonresponse bias are neither feasible nor suitable when estimating victimization rates. For several reasons, recent research approaches analyzing nonresponse bias in victimization surveys were assessed as insufficient. Both content-related results and methodological knowledge about nonresponse analysis (in victimization surveys), especially when using information about proxy-nonrespondents, support an approach based on the idea that nonresponse is a stochastic phenomenon and that distinguishing different groups of nonrespondents might allow the identification of systematically underrepresented persons and thus conclusions about the level and direction of nonresponse bias. Latent class analysis was used for the underlying nonresponse analyses. The aim was to identify the latent class structure of proxy-nonrespondents, namely soft refusals and hard-to-reach respondents, and to compare its structure, as well as victimization risks, with those of interview-willing and easy-to-reach respondents. Interestingly, for both samples of proxy-nonrespondents, latent class analysis identified four latent classes, while for the samples of interview-willing and easy-to-reach respondents, only three classes were found. Thus, for the sample of soft refusals as well as that of hard-to-reach respondents, one class of respondents was found that is systematically underrepresented among easy-to-reach or interview-willing respondents.
While the first three classes among hard-to-reach respondents and soft refusals are structurally similar compared to normal respondents, at least in the sense of a general characterization based on their classification variables, only a more detailed determination shows that each class differs in terms of at least one classification variable. In most cases, the overrepresentation is higher among soft refusals and hard-to-reach respondents. The victimization rates appear rather similar, especially for the class of employees and pensioners, which constitute a relevant proportion in each sample of between 72% and 79%. Interestingly, victimization rates tend to lie slightly below the rates of interview-willing respondents. Significant differences, especially after controlling for sociodemographic compositional effects, could only be observed for pensioners and violent crime. Here victimization rates (prevalence and incidence) are slightly lower among refusals and hard-to-reach respondents, indicating that nonresponse—independent of the reason—leads to an overestimation of violent crime rates among pensioners. For class three (students and unemployed), differences in victimization rates between proxy-nonrespondents and normal respondents are observed to be much more pronounced. In this case, victimization rates tend to be systematically higher among soft refusals and hard-to-reach respondents. These differences completely disappear when controlling for sociodemographic variables, indicating a so-called missing-at-random (MAR) dropout that can be fully accounted for by variables available in the sample and thus be corrected by classical weighting (for analogue results see Carkin and Tracy, 2015).
In contrast, both samples of proxy-nonrespondents (soft refusals and hard-to-reach respondents) contain one latent class of persons that does not have a comparable counterpart among normal respondents. Here, victimization rates are clearly above average, and are especially high for violent crime. Further multivariate analysis, comparing class four with the rest of the sample, shows that these differences in violent crime victimization rates cannot be explained by the sociodemographic composition of both samples. This indicates that the significantly higher victimization rates of violent crime represent real differences and that victimization rates are systematically underestimated by refusal as well as unreachability. This seems to apply in particular to incidence rates, where differences are especially high and robust. For these cases, classical weighting, referring to sociodemographic (marginal) distribution, cannot solve the problem. This seems unsurprising, as LCA could not identify this class of persons among interview-willing and easy-to-reach respondents. Obviously, this class represents characteristics that are not well represented among normal respondents. As weighting can only give greater weight to persons who are represented in the sample, approaches must be elaborated that consider this systematically underrepresented class of persons among proxy-nonrespondents in a special manner. To do so, however, the crucial question of what proportion of the final nonrespondents belongs to this systematically underrepresented class must be answered. In the samples of soft refusals and hard-to-reach respondents, the class of typical nonrespondents accounts for between 11% and 17%. It seems likely that the proportion is considerably higher among final refusals; the underlying results, however, do not allow assumptions about the proportion at this point. Academic considerations and hypotheses on this matter, as well as questions about the use of this information for correction procedures, should be the basis for further research.
The overall analysis indicates that a relevant part of nonrespondents resembles interview-willing and easy-to-reach respondents to a large extent and that for those persons, only very small nonresponse bias in victimization rates can be expected. For a relevant part of respondents, differences in victimization rates are the result of a missing-at-random process that can be explained, and thus controlled for, using sociodemographic variables. Interestingly, the group of reluctant and hard-to-reach pensioners in particular has shown significant and stable differences in victimization rates that could not be explained by compositional effects, but suggest a systematic overestimation of violent crime rates by nonresponse. Despite this, the data of proxy-nonrespondents point to one class of systematically underrepresented nonrespondents that shows significantly higher violent crime rates, indicating, in turn, an underestimation of victimization rates. Although the underlying results suggest that this effect is not that great, it is likely that the differences are higher among final nonrespondents. It is important to keep in mind that those proxy-nonrespondents (even the smaller class of typical nonrespondents) might still differ compared to final nonrespondents. Nevertheless, a bias in the opposite direction seems unlikely; if anything, a stronger effect appears plausible.
It is noteworthy that the expected nonresponse bias turns out to be very similar for both kinds of nonresponse. Although the dropout process is rather different for refusal and unreachability, the structure of bias for victimization rates proved to be comparable. However, it is important to differentiate by the type of crime, as nonresponse seems to particularly bias violent crime rates. Not only can a systematic underestimation of victimization rates be observed, but also, depending on the group under study, an overestimation. Moreover, the distinction between prevalence and incidence rates is shown to be informative, as nonresponse tends to influence incidence rates to a greater extent.
The underlying analysis also leads to important methodological inferences about nonresponse analysis based on proxy-nonrespondents, an approach which is widespread in survey research. First, it is shown that although hard-to-reach respondents and soft refusals largely resemble normal respondents at first glance, they also provide important information about systematically underrepresented persons that might allow conclusions about nonresponse bias. Thus, making further efforts to reach and survey soft refusals and hard-to-reach respondents is highly advisable to improve sample representation and the quality of the survey data. The identification of different classes/groups of (non-)respondents allows not only a determination of relevant key variables but also an assessment of persons who need special attention to reduce future nonresponse, for example, by tailored nonresponse reduction strategies.
Finally, some methodological limitations must be pointed out. The study was carried out based on a telephone-based sample. This not only has consequences for the sample distribution but also for the nonresponse process. Whether or not the results also apply for other data collection methods must be tested in future research. LCA was performed with variables available in the dataset used. Therefore, some important neighborhood variables that are known to be related to both victimization and nonresponse risk could not be taken into account as they were only available for smaller subsamples. Further research should replicate the underlying approach with different classification variables and samples.
Acknowledgement
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The ‘German Victimization Survey 2012’ was part of the project ‘Security, perceptions, reports, conditions and expectations - monitoring security in Germany’, funded by the German Ministry of Education and Research.
Notes
Appendix
Appendix B. Average marginal effects (AME) of latent classes on victimization rates. 10

| AME of soft refusal (= 1) | Violent crime 5-yr prevalence (1) | Violent crime 5-yr prevalence (2) | Property crime 5-yr prevalence (1) | Property crime 5-yr prevalence (2) | Violent crime 1-yr incidence (1) | Violent crime 1-yr incidence (2) | Property crime 1-yr incidence (1) | Property crime 1-yr incidence (2) |
|---|---|---|---|---|---|---|---|---|
| Class 1: Employees | ns | ns | ns | ns | -0.020(*) | ns | ns | ns |
| Class 2: Pensioners | -0.021* | -0.015(*) | ns | ns | -0.020(*) | ns | ns | ns |
| Class 3: Students and unemployed | ns | ns | ns | ns | ns | -0.058(*) | 0.063(*) | ns |
***p ≤ 0.001; **p ≤ 0.01; *p ≤ 0.05; (*) p ≤ 0.10
