Abstract
Studies of the effect of immigration on homicide in U.S. cities have reported mostly null or negative results. These studies suffer from a failure to weight by population size and the lack of a credible identification strategy. Using data from the Census and the Uniform Crime Reports, 146 U.S. cities in the year 2000 are analyzed using weighted instrumental variables (IV) regressions to overcome these limitations. Estimates are insignificant, and none suggest a substantial negative effect of immigration on homicide, a finding that is replicated with 1990 data. Model comparisons indicate that conventional specifications exaggerate the beneficial effect of immigration somewhat.
The view that immigration is a cause of crime is prominent in U.S. public opinion (see Martinez & Lee, 2000; Simon & Sikich, 2007) and discourse (see Martinez, 2006; Wadsworth, 2010) and has been used as an argument in favor of limiting immigration (Brimelow, 1995). Proponents of this view can find support in social scientific theory (see Martinez & Lee, 2000). Two classic theories were specifically developed to explain a positive link between immigration and crime. Social disorganization theory (Shaw & McKay, 1931, 1974) holds that immigration increases crime rates by limiting communities’ ability to exert social control. Culture conflict theory (Sellin, 1938; Wirth, 1931) posits that the cultural heterogeneity resulting from immigration leads to a lack of clarity of moral standards in a community, thereby increasing crime rates. Other approaches, not specifically developed to address the immigration–crime nexus, also lend themselves to the prediction of a positive effect. Rational choice (Becker, 1968) and anomie perspectives (Merton, 1968) can be used to argue immigration leads to crime because immigrants tend to be economically disadvantaged (Martinez & Lee, 2000; Merton, 1968; Wadsworth, 2010) and may displace natives in the labor market, who then might turn to crime (Sutherland, 1934). Putnam’s work on social capital also suggests more crime as a consequence of immigration. His theory states that decreases in social capital will typically bring about increases in crime (Putnam, 2000), while his empirical work suggests immigration decreases social capital (Putnam, 2007).
Recent years have seen an increase in macro-level studies relating immigration to crime measures. There is widespread agreement that this literature shows no positive effect of immigration on crime (Akins, Rumbaut, & Stansfield, 2009; Desmond & Kubrin, 2009; Martinez, 2006; Wadsworth, 2010), and some go further to argue it indicates a crime-reducing effect (Stowell, Messner, McGeever, & Raffalovich, 2009; Wang, 2012). Research on homicide specifically is consistent with this conclusion. Studies of the nexus between immigration and total homicide rates have found significant negative associations (MacDonald, Hipp, & Gill, 2013; Martinez, Stowell, & Lee, 2010), null effects (Akins, Rumbaut, & Stansfield, 2009; Feldmeyer & Steffensmeier, 2009; Graif & Sampson, 2009; Martinez, 2000; Olson, Laurikkala, Huff-Corzine, & Corzine, 2009; Stowell et al., 2009; Vélez, 2009), and mixed results depending on which component of immigration is taken as the independent variable or which subsample is used (Martinez, Stowell, & Cancino, 2008; Reid, Weiss, Adelman, & Jaret, 2005; Stowell & Martinez, 2009; Wadsworth, 2010).
As these findings emerged, researchers reexamined the prediction of a positive link, often tweaking specific assumptions of existing approaches. For example, the immigration revitalization perspective (Lee & Martinez, 2002; Lee, Martinez, & Rosenfeld, 2001; Martinez et al., 2010) retains the view that disorganization leads to crime, but argues that current immigration tends to organize rather than disorganize communities. Rational choice reasoning is consistent with the prediction that immigration reduces crime, as the threat of deportation is an additional incentive for immigrants to not engage in crime (Feldmeyer & Steffensmeier, 2009).
However, the conclusion that immigration does not increase crime may be premature, as the research it is based on exhibits methodological limitations. Most importantly, it is doubtful whether extant studies are able to identify the causal effect of immigration on the outcome. The present article contributes to the literature by presenting analyses that aim to identify the effect of immigration on homicide rates using an instrumental variables strategy with city-level data. Previous research has used levels of aggregation ranging from standard metropolitan statistical areas to neighborhoods (see Stowell et al., 2009). Authors who analyze city data argue that larger aggregates may be too heterogeneous to meaningfully relate structural characteristics to homicide (Feldmeyer & Steffensmeier, 2009; Wadsworth, 2010). Perhaps due to the influence of Shaw and McKay, most studies have been cast at an even lower level, neighborhoods. These units of analysis, however, may be too small. Any compositional hypothesis that posits immigrants’ differential involvement in homicide suggests a level of analysis larger than the neighborhood, as immigrants may offend or be victimized away from their neighborhood. Some contextual explanations also work better for larger units. Immigrants may displace natives in the job market (which is not confined to one’s neighborhood), leading to native involvement in crime (Sutherland, 1934). Concerning the social influences highlighted by culture conflict and social disorganization theories, it is certainly true that people are influenced by their neighbors. However, the influence of these neighbors relative to the impact of people who live elsewhere seems likely to have diminished since the days of Sellin and Shaw and McKay, as improvements in transport and telecommunications have made it easier to congregate with residents of other neighborhoods or communicate with people not present. 1 Cities also have two practical advantages over neighborhoods. First, they provide adequate counts of homicides (Feldmeyer & Steffensmeier, 2009); second, city crime rates are easily available for places throughout the United States, which aids generalizability. 2 In sum, cities as levels of analysis may provide the best ratio of advantages to disadvantages available for the question at hand.
The remainder of this article is structured as follows: Section 2 reviews and critiques the literature on the influence of immigration on city homicide rates; Section 3 discusses the methods and data used; Section 4 presents the results; and Section 5 contains a summary and discussion of the findings.
Extant Studies on Immigration and Homicide in U.S. Cities
Review of Extant Studies
Three previous studies have estimated the effect of immigration on cities’ total homicide rates. Martinez (2000) relates a proxy for recent Hispanic immigration to 1980 homicide rates controlling for location in a Southwestern state and demographic, economic, and educational variables. He finds no significant effect of Hispanic immigration on homicide.
Feldmeyer and Steffensmeier (2009) study the link between the rate of recent immigrants and 1999-2001 homicide rates in California cities. Controlling for measures of ethnic/racial heterogeneity, residential mobility, other demographic, economic, and educational variables, as well as police presence, they find a small and insignificant association between the two measures.
Wadsworth (2010) relates changes in logged homicide rates between the 1989-1991 and 1999-2001 periods to 1990-2000 change measures of immigration and covariates. His models simultaneously include the rate of immigrants, recent immigrants, and Hispanics, as well as residential stability and other demographic, geographic, economic, and educational controls. Wadsworth finds a significant negative coefficient for the recent immigrants measure, while the rate of all immigrants shows an insignificant positive association with the outcome. In contrast, cross-sectional models for 1990 and 2000 produce insignificant negative coefficients for the measure of recent immigrants, while the rate of all immigrants exhibits a positive association with the logged homicide rate, which is significant in the 1990 data. In the author’s view, the lack of agreement between models shows that “the cross-sectional relationship between immigration and violent crime is spurious” (p. 548).
Taken together, these studies appear to confirm the conclusion that there is no clear positive, and perhaps a negative, effect of immigration on homicide. However, such a conclusion should not be drawn with any confidence. This is due to the methodological limitations of these studies, which I discuss next.
Limitations of Extant Studies
The studies discussed above share two limitations. First, all use data from units that differ in population size, but none presents results weighted by this variable. 3 As a result, the impact of a homicide event on the effect estimate varies inversely with the population size of the city in which the homicide took place. In contrast, weighting by city size leads to an estimate in which each homicide counts equally. As a consequence, such estimates refer to the average effect in the population as a whole (Kneip & Bauer, 2009). Although unweighted analyses are defensible if the objective is to investigate what distinguishes high- from low-crime cities, studies that aim for statements about the urban United States as a whole must weight cities by population size.
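The difference between the two weighting schemes can be illustrated with a simple average of city homicide rates. The sketch below uses hypothetical toy figures, not data from any of the studies discussed, and it concerns means rather than regression coefficients; it is meant only to show that weighting by population makes each homicide count equally.

```python
# Hypothetical toy data: (population, homicides) for three cities of different size.
cities = [
    (1_000_000, 100),  # large city, 10 homicides per 100,000
    (200_000, 10),     # mid-sized city, 5 per 100,000
    (100_000, 30),     # small city, 30 per 100,000
]

def rate_per_100k(population, homicides):
    """Homicides per 100,000 population."""
    return homicides / population * 100_000

rates = [rate_per_100k(p, h) for p, h in cities]
populations = [p for p, _ in cities]

# Unweighted mean: every city counts once, so a homicide in a small city
# moves the average more than a homicide in a large city.
unweighted_mean = sum(rates) / len(rates)

# Population-weighted mean: every resident, and hence every homicide, counts equally.
weighted_mean = sum(p * r for p, r in zip(populations, rates)) / sum(populations)

# The weighted mean coincides with the rate for all three cities pooled together.
pooled_rate = rate_per_100k(sum(populations), sum(h for _, h in cities))
```

In this toy example the small high-rate city pulls the unweighted mean well above the pooled rate, whereas the weighted mean reproduces the pooled rate.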
Second, it seems doubtful whether previous studies have identified the causal effect of immigration on homicide. To do so, the regressions in that literature would have to meet the conditional independence assumption. This assumption is met if, conditional on any variables that might be controlled for, there are no reasons other than a causal influence of the independent on the dependent variable for the two to be correlated (Angrist & Pischke, 2009). It has long been recognized in the literature on ecological influences on offending that this assumption will be violated if individuals who differ in their criminal propensities also differ in their propensities to select into different areas (Taft, 1933). In the present context, the assumption would be violated, for example, if immigrants tend to select into cities that are low in homicide. This might be because homicide rates influence immigration or because immigration and homicide are both linked to third factors (e.g., local economic conditions). Multivariate regression is a tool for trying to ensure conditional independence, as it allows the researcher to control for variables that may influence the outcome and be correlated with the predictor of interest (Angrist & Pischke, 2009).
However, the inclusion of control variables has a potential downside. If the values for the controls are themselves influenced by the predictor of interest, their inclusion biases the effect estimates, unless they are not correlated with the outcome, conditional on other covariates (Angrist & Pischke, 2009). This problem is known as bad control (Angrist & Pischke, 2009) or overcontrol. It is likely to be a severe problem for all studies discussed above. This is most evident in Wadsworth’s model, which tries to simultaneously estimate the effects of the presence of all immigrants and the presence of recent immigrants. The rate of recent immigrants necessarily influences the rate of all immigrants, unless exactly one foreign-born longtime resident moves out for every recent immigrant that comes to a city. Conversely, the recent immigrants variable controls for a portion of the effect that the rate of all immigrants may have. The percent Hispanic (controlled for by Martinez, 2000, and Wadsworth, 2010) and racial/ethnic heterogeneity (Feldmeyer & Steffensmeier, 2009) also seem very likely to be influenced by immigration. Sutherland (1934) hypothesizes that immigration fosters crime by increasing urbanization, which calls into question the inclusion of population size in all three articles’ models. Social disorganization theory (Shaw & McKay, 1931, 1974) sees residential mobility as a main channel through which immigration increases crime; hence, controlling for mobility (Feldmeyer & Steffensmeier, 2009; Wadsworth, 2010) seems inappropriate. Sampson (2008) argues that immigrants cause economic uplift of communities and that adjusting for economic variables will hence cause overcontrol, but such measures are included in all three studies.
In many cases, the authors discuss some of these variables as mediators between immigration and crime, but this does not translate into the estimation of models that exclude these measures. For example, Wadsworth (2010) points out that economic models suggest “a relationship between immigration and crime” because “individuals in weaker labor market positions have [. . .] less to lose by participating in crime,” (p. 535) but controls in all specifications for economic disadvantage and a variable that “represents employment opportunities for low-skilled persons” (p. 540).
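The overcontrol problem can be made concrete with a small simulation. The sketch below is a stylized illustration under assumed values (a predictor that affects the outcome only through a mediator, with a true total effect of 1.0); the variable names and parameters are hypothetical and do not correspond to any of the models discussed above.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def cov(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def slope_simple(y, d):
    """OLS slope from regressing y on d (with intercept)."""
    return cov(d, y) / cov(d, d)

def slope_with_control(y, d, m):
    """OLS coefficient on d from regressing y on d and m (with intercept)."""
    s_dd, s_mm, s_dm = cov(d, d), cov(m, m), cov(d, m)
    s_dy, s_my = cov(d, y), cov(m, y)
    return (s_dy * s_mm - s_my * s_dm) / (s_dd * s_mm - s_dm ** 2)

random.seed(0)
n = 5000
d = [random.gauss(0, 1) for _ in range(n)]       # predictor of interest
m = [0.5 * di + random.gauss(0, 1) for di in d]  # mediator, influenced by d
y = [2.0 * mi + random.gauss(0, 1) for mi in m]  # outcome, affected by d only via m

# Without the mediator, the slope is close to the true total effect (0.5 * 2.0 = 1.0).
total_effect = slope_simple(y, d)

# With the mediator as a control, the slope is close to zero:
# the effect that runs through m has been controlled away.
overcontrolled = slope_with_control(y, d, m)
```

In this setup the controlled regression is not wrong about the direct effect, but it would be misleading if the quantity of interest is the total effect, as it is in the present article.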
In sum, previous estimates of the effect of immigration on homicide exhibit two limitations. First, they give the same weight to cities that differ in population size; second, they almost certainly violate the conditional independence assumption by controlling for variables that are themselves influenced by immigration. Estimating weighted regressions easily solves the first problem. The second issue is much thornier. A researcher concerned with correct identification in a standard regression model will often find herself in a situation in which she wants to simultaneously control and not control for a certain variable (Angrist & Pischke, 2009). For example, it has been noted that economic disadvantage may be influenced by immigration, but it is clear that it is also subject to other influences, and the correlation between economic disadvantage and homicide rates is well established. Whereas the inclusion of this variable may cause overcontrol, its exclusion seems likely to cause undercontrol. The same argument can be made with respect to other measures. 4 The instrumental variables approach promises a solution to this dilemma. It will be described in the next section.
Analytical Strategy and Data
Analytical Strategy
The present study departs from previous research in two important ways. First, all analyses are run with weights for population size. Second, it attempts to solve the dilemma of controlling for too many or too few measures by using instrumental variables.
Using weights proportional to the population size gives each homicide the same influence on the estimates of the relationship between the homicide rate and the presence of immigrants. It also obviates the need to average homicide rates over several years, a practice that introduces a temporal mismatch between the dependent and explanatory variables. The motivation for the use of averaged rates is that a specific year’s homicide rate may be influenced by city–year idiosyncrasies not explicable by covariates. This problem is particularly pronounced with small units of analysis. Giving smaller weights to smaller units mitigates this problem.
The aim of the instrumental variables (IV) technique is to identify the influence of a variable, D, on an outcome, Y, by ensuring that the correlation between D and Y is not contaminated by other influences on Y that are also correlated with D. To this end, a variable is sought that influences D but has no influence on Y except via D. This is the instrumental variable (or instrument) Z (Bollen, 2012). The IV analysis consists of two regressions. In the first stage, D is regressed on Z. In the second stage, the regression of interest, Y is regressed not on D, but on fitted values of D, as predicted by Z. Hence, the second stage does not estimate how Y is influenced by D per se, but rather how Y is influenced by the portion of the variance in D that is explained by Z. D is called the endogenous regressor.
Two crucial requirements must be fulfilled so that the second stage’s coefficient for D has a causal interpretation (Bollen, 2012). First, relevance refers to the requirement that Z must predict D. The better Z predicts D, the stronger the first stage is said to be. The weaker the first stage is, the more the coefficient of interest estimated in the second stage regression will be biased toward the estimate obtained by a standard regression (Angrist & Pischke, 2009). Second, instrument exogeneity requires that there is no connection between Z and Y other than through D (Wooldridge, 2002).
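The two-stage procedure can be sketched in a few lines. The example below is a minimal, unweighted illustration on simulated data with one instrument and no controls. The data-generating values (a true effect of 0.5 and an unobserved confounder `u`) are assumptions chosen for illustration, and by construction Z is both relevant and exogenous.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def cov(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def ols(y, x):
    """Return (intercept, slope) from regressing y on x."""
    slope = cov(x, y) / cov(x, x)
    return mean(y) - slope * mean(x), slope

random.seed(1)
n = 5000
true_effect = 0.5
z = [random.gauss(0, 1) for _ in range(n)]  # instrument Z
u = [random.gauss(0, 1) for _ in range(n)]  # unobserved influence on both D and Y
d = [zi + ui + random.gauss(0, 1) for zi, ui in zip(z, u)]  # endogenous regressor D
y = [true_effect * di + 2.0 * ui + random.gauss(0, 1) for di, ui in zip(d, u)]

# Standard regression of Y on D: contaminated by u, so it overstates the effect.
_, b_ols = ols(y, d)

# First stage: regress D on Z and keep the fitted values.
a1, b1 = ols(d, z)
d_hat = [a1 + b1 * zi for zi in z]

# Second stage: regress Y on the fitted values of D.
_, b_2sls = ols(y, d_hat)
```

The second-stage slope is the point estimate only. Standard errors taken from a manually run second stage are not valid, which is one reason dedicated 2SLS routines are normally used in practice.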
As mentioned, by omitting variables from a regression that may cause overcontrol, one will in many cases create an undercontrol problem. The concept of instrument exogeneity helps understand why the IV approach promises a solution to this dilemma. If it is true that the only connection between Z and Y is through D (i.e., that Z is exogenous), then the predicted values of D will not be correlated with omitted controls, and the estimate obtained will hence be unbiased (Wooldridge, 2002). Hence, rationales for the inclusion of controls change when moving from standard to IV regressions. The typical rationale given for the inclusion of control variables in standard regressions is that variables need to be included if they may also influence (or be correlated with unmeasured influences on) the outcome. As was pointed out, this is not sufficient even in a standard regression framework—rather, one would have to argue that the inclusion of the control variable removes more bias (due to undercontrol) than it introduces (due to overcontrol). The use of the IV technique further tilts the balance against the use of controls. Three rationales for their inclusion exist. With one exception (the third reason below), their role is to ensure that the IV strategy’s assumptions are met.
First, concerning relevance, the issue in question is the strength of the instrument conditional on control variables (the number of which may be zero; Bollen, 2012). One may hence include controls if they strengthen the relationship between the instrument and the endogenous regressor. One covariate is chosen on the basis of this rationale in the present study.
Second, it may be that the instrument is exogenous only after the inclusion of control variables (Angrist & Pischke, 2009). If so, these must be included. A number of controls are chosen in the present article for this reason.
A third reason has nothing to do with the use of the IV technique specifically. If a variable is uncorrelated with the endogenous regressor, it may be included in the model. It cannot bias the effect estimate, but it may reduce its standard errors (Angrist & Pischke, 2009). This rationale plays no further role in this article.
This exhaustive list implies that it is invalid to critique an IV specification on the basis that it omits predictors of the outcome. Rather, one would have to argue that the omission of variables leads to a violation of the assumptions underlying the IV strategy. Consequently, a number of known predictors of crime are deliberately excluded from the preferred models in this article. For example, the poverty rate is often found to be a good predictor of the homicide rate (e.g., Martinez, 2000). Nonetheless, no such measure is included in this article’s main models because (a) its inclusion is not necessary to meet the IV strategy’s assumptions and (b) it may cause overcontrol. If the first of these two statements is true, then it is invalid to argue that the model is misspecified due to the omission of this predictor.
Two instruments are used separately in the analyses below. The choice of the first instrument is based on the fact that rates of immigrants are higher in cities located in states that share a border with Mexico. I use a dummy variable that indicates a city’s location in such a state. Initial tests showed that this variable is not a strong predictor of the endogenous regressor. On reflection, this is unsurprising, given that immigrants tend to settle in large cities. This suggests improving instrument strength by holding city size constant. However, controlling for concurrent city size seems likely to induce overcontrol. A city’s population size in 1920 is hence included in all regressions to increase instrument relevance.
All other controls included in the main models are used in an attempt to ensure instrument exogeneity. Location in a border state is correlated with the presence of Asians and Hispanics. Hispanics are often overrepresented in crime statistics (Jones-Webb & Wall, 2008; Steffensmeier, Feldmeyer, Harris, & Ulmer, 2011), and the proportion of the population that is Hispanic has been linked to higher homicide rates (Nelsen, Corzine, & Huff-Corzine, 1994); the converse is true of Asians (Jones-Webb & Wall, 2008; McNulty & Bellair, 2003). Hence, it seems likely that the percentages of the population that are native Asians and Hispanics would influence homicide rates, and measures of these quantities are controlled for to block the causal paths from the instrument to the outcome. I also include dummy variables for eight of the nine census divisions to control for other possible influences on homicide that are correlated with location in a border state. In another specification, these are replaced with dummies for states. These variables cannot cause overcontrol.
The choice of the second instrument follows previous practice (Card, 1990; MacDonald et al., 2013). I use lagged rates of immigrants to predict concurrent values. Students of migration have demonstrated a process of “cumulative causation” (Massey, 1990) in which the presence of immigrants attracts further immigration (Zavodny, 1999), especially when past and prospective immigrants share their world region of birth (Jaeger, 2008), ethnicity (Bartel, 1989), or membership in a network (Winters, de Janvry, & Sadoulet, 2001). Again, population size in 1920 is used as a control to improve instrument strength, and geographic measures are included to enhance the likelihood that the instrument is exogenous.
The case for inclusion is less obvious with respect to the measures of Asian and Hispanic natives. Ellis and Goodwin-White (2006) find that the higher the concentration of immigrants in an area, the lower the likelihood that native Asians and Hispanics move away. Because past immigrant concentration (measured by the instrument) may influence homicide rates through this channel, it seems prudent to control for these variables to block this path from the instrument to the outcome. However, these authors’ finding also suggests that concurrent immigration concentration influences the concurrent rates of native Asians and Hispanics. Controlling for concurrent proportions of Asians and Hispanics would then induce overcontrol, as the aim is to estimate the total effect of the presence of immigrants, including the portion that may be mediated via this channel. This leaves us with the same dilemma of controlling for too many or too few variables that the IV approach was supposed to solve. Hence, with this instrument only, results are shown both with and without controls for rates of Asian and Hispanic natives.
Because MacDonald et al. (2013) present the only IV analysis of immigration and crime to date, it is useful to outline the differences between their empirical strategy and the one used here. These authors estimate the effect of levels of a proxy for the presence of Hispanic immigrants, measured in 2000, on 2000-2006 changes in total crime, violent crime, and homicide in a sample of neighborhoods. The proxy for Hispanic immigrant rates measured in 1990 is used as an instrument for the same variable measured in 2000. The authors report significant negative estimates that do not change considerably when units are weighted by population size. The article leaves unclear why one should expect levels of rates of immigrants to influence subsequent changes in crime. In addition, the authors control for 2000 levels of poverty, residential stability, population density, and measures of Black residents and young males. If one accepts a possible influence of levels of immigrants on changes in crime, it seems likely that much of this influence is mediated via these controls. Their statistical model is hence subject to the same overcontrol critique that was presented above. In contrast, the present article estimates the effects of levels of rates of immigrants on levels of city homicide rates. Statistical models are constructed to minimize the threat of overcontrol while including covariates necessary to meet the requirements of an IV strategy.
Data
I estimate the effect of the presence of immigrants on homicide rates in a sample of U.S. cities in the year 2000, the last year for which data on the presence of immigrants are available for cities. Cases are included in the sample if (a) their population is at least 100,000, (b) the population estimates in the census data and the Uniform Crime Reports (UCR) are in close agreement, 5 and (c) homicide data are available. The sample size is 146. Unless stated otherwise, all regressions are run using weights that are calculated by dividing a city’s population size by the sample mean of this variable. The procedure is two-stage least squares (2SLS).
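The weighting scheme described above can be sketched as follows. The population figures are hypothetical, and the weighted IV function covers only the simplest case (one instrument, no controls), so it is an illustration of the idea rather than a reproduction of the models estimated in this article.

```python
def normalized_weights(populations):
    """Divide each city's population by the sample mean population."""
    mean_pop = sum(populations) / len(populations)
    return [p / mean_pop for p in populations]

def wmean(x, w):
    """Weighted mean of x."""
    return sum(wi * xi for wi, xi in zip(w, x)) / sum(w)

def wcov(x, y, w):
    """Weighted covariance of x and y."""
    mx, my = wmean(x, w), wmean(y, w)
    return sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y)) / sum(w)

def weighted_iv_slope(y, d, z, w):
    """Weighted just-identified IV estimate (one instrument, no controls):
    weighted cov(Z, Y) divided by weighted cov(Z, D)."""
    return wcov(z, y, w) / wcov(z, d, w)

# Hypothetical city populations, not data from the article.
populations = [100_000, 250_000, 400_000, 1_250_000]
weights = normalized_weights(populations)
```

Because the weights are divided by the sample mean, they sum to the number of cities. Rescaling all weights by a constant leaves the point estimate unchanged.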
The dependent variable, homicides per 100,000 population, is calculated on the basis of homicide counts and population data from the UCR. The natural logs rather than the original rates are used, as regression specification error tests (RESETs) indicate functional form misspecification in the latter but not the former case. 6 The endogenous regressor is the percentage of the population that is foreign-born. The same variable for 1990 is included as an instrument (lag instrument, henceforth), as is a dummy for location in a border state (border instrument, henceforth). Dummies for U.S. census divisions and states are included as controls, as are census estimates of a city’s population in 1920, taken from Gibson and Jung (2005) and U.S. Department of Commerce, Bureau of the Census (1931). In two cases, this value is missing, indicating that the city was not an incorporated place in 1930 (U.S. Department of Commerce, Bureau of the Census, 1931); in these cases, the value was set to zero. Further data are used for additional analyses and are described in those sections.
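For concreteness, the construction of the dependent variable can be sketched as below. The function name is hypothetical. How cities with zero homicides would be handled is not stated in the text, so the sketch simply rejects such input.

```python
import math

def log_homicide_rate(homicides, population):
    """Natural log of homicides per 100,000 population."""
    if homicides <= 0 or population <= 0:
        # The log is undefined for a zero count; the treatment of such cases
        # is an open assumption here, so the sketch refuses them.
        raise ValueError("log rate is undefined for non-positive counts")
    return math.log(homicides / population * 100_000)
```

For example, a hypothetical city with 50 homicides and 500,000 residents has a rate of 10 per 100,000 and hence a logged rate of ln(10).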
Results
Main Results
IV regression results are displayed in Table 1. Point estimates and lower and upper bounds of 95% confidence intervals (CI) are shown for the constant, % immigrants, and % native Hispanics and Asians. Although it is customary to display the point estimates along with standard errors, the combination of the IV strategy with a relatively small sample will lead to imprecision in the estimates, and the method of display used here facilitates the appreciation of the degree of imprecision. Significant point estimates are italicized. The inclusion of other controls is indicated at the bottom of the table. The model using the border instrument controls for the presence of native Hispanics and Asians, while models using the lag instrument are calculated both without these controls (Models 1a and 2a) and with them (Models 1b and 2b). Model 2 cannot be calculated for the border instrument due to perfect collinearity with the state dummies. For each regression, a first-stage F statistic is displayed that tests for instrument strength. Stock and Yogo (2005) present a table of critical values that helps decide when an instrument should be considered weak. I selected a maximal bias of the 2SLS estimator relative to ordinary least squares (OLS) of 0.05. The corresponding critical value for the first stage F is 24.09. Estimates are calculated only if this value is exceeded. No results from overidentification tests (such as the Sargan test) are reported. Although I use two different instruments, each specific regression includes only one of them. The models are hence just-identified, in which case, overidentification tests cannot be computed (Bollen, 2012).
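The first-stage check can be illustrated as follows. The sketch computes the F statistic for a single instrument in an unweighted first stage without controls, which is a simplification of the weighted models with controls used in this article, and compares it with the critical value quoted above. The simulated data are hypothetical.

```python
import random

CRITICAL_VALUE = 24.09  # first-stage F threshold quoted in the text

def first_stage_f(d, z):
    """F statistic for z in a regression of d on z (intercept, no controls).
    With a single instrument this equals the squared t statistic."""
    n = len(d)
    mz, md = sum(z) / n, sum(d) / n
    s_zz = sum((zi - mz) ** 2 for zi in z)
    s_dd = sum((di - md) ** 2 for di in d)
    s_zd = sum((zi - mz) * (di - md) for zi, di in zip(z, d))
    r2 = s_zd ** 2 / (s_zz * s_dd)
    return r2 / (1.0 - r2) * (n - 2)

random.seed(2)
n = 146  # sample size reported in the text
z = [random.gauss(0, 1) for _ in range(n)]
strong_d = [1.0 * zi + random.gauss(0, 1) for zi in z]  # instrument predicts D well
weak_d = [0.05 * zi + random.gauss(0, 1) for zi in z]   # instrument barely predicts D

strong_enough = first_stage_f(strong_d, z) > CRITICAL_VALUE
weak_passes = first_stage_f(weak_d, z) > CRITICAL_VALUE
```

Under the rule described above, a second stage would be estimated only for the specification whose first-stage F exceeds the threshold.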
Table 1. Weighted Two-Stage Least Squares Regression of the Natural Log of 2000 Homicide Rates on the Presence of Immigrants and Control Variables (N = 146).
Note. All first and second stage models are weighted by a variable that is calculated by dividing the city population size by the average population size in the unweighted sample. Coefficients displayed are point estimates (“point”), lower bounds of 95% confidence intervals (“lower”), and upper bounds (“upper”). Significant coefficients are italicized. Empty cells denote coefficient was not estimated.
Table 1 presents the results. The first stage of the only model using the border instrument is just strong enough. The point estimate for the target variable is positive and of quite noteworthy size, but it falls just short of statistical significance. The estimates using the lag instrument hover around zero and are all insignificant. Model 2b is the preferred specification for this instrument, as it combines a strong first stage with extensive controls. It suggests a small negative effect of the target variable on homicide rates. Taken together, the results give no clear indication that there is a substantial effect of the presence of immigrants on homicide rates. No support is found for the view that the presence of immigrants reduces homicide rates to a noteworthy degree.
Results Concerning Method Effects
This section compares the preferred estimates from the previous section to estimates derived using more conventional methods. Panel A of Table 2 compares the previously presented results of Models 1b (using the border instrument) and 2b (using the lag instrument) to coefficients obtained when 2SLS is replaced with OLS, data are not weighted, or both. For readability, only point estimates for the target variable are displayed, but coefficients are italicized if significantly different from zero and set in bold if significantly different from the coefficient from the weighted IV model using the border instrument. (No coefficient is significantly different from the estimate from the preferred lag model.)
Table 2. Method Effects in the Estimation of the Effect of the Presence of Immigrants on Logged 2000 Homicide Rates (N = 146).
Note. 2SLS = two-stage least squares; OLS = ordinary least squares.
Weighted models are weighted by a variable that is calculated by dividing the city population size by the average population size in the unweighted sample. Significant coefficients are italicized. Coefficients for % immigrants are in
Results are displayed in Table 2. An unweighted 2SLS version of Model 1b is not estimated due to a weak first stage. In contrast to the IV model, the OLS estimates are negative, with a stronger association in the unweighted estimate, although all three models’ CIs overlap. Turning to Model 2b, a clear pattern emerges. Replacing IV with OLS makes almost no difference, but failure to weight moves the coefficients toward more negative values. Again, these differences are not significant.
Although these results are informative, a more meaningful comparison can be made when results from the preferred specifications are compared with the kind of analysis that is found in the previous literature. To create a “typical” analysis, I ran an unweighted OLS regression. A control for a construct is included if at least two of the three studies reviewed above use a measure of the construct. This regression hence includes controls for divorce (percentage of the population that is divorced), education (percentage of the population over 24 with a high school or equivalent degree), Hispanics as a percentage of the population, population size, the percentage of the population below the poverty line, residential instability (the percentage of the population over 4 who lived in a different household 5 years before), unemployment (the percentage of males over 15 who are unemployed or not in the labor force), and young males (males aged 15-24 as a percentage of the population), but none of the controls used in Models 1 and 2. 7 All variables are taken from the 2000 census. I also followed the data transformation decisions taken in all three articles. First, if a variable exhibited extreme skew, its natural log is used. The population size and Hispanic measures were hence log-transformed. Second, the conceptually related measures of education, unemployment, and poverty were examined using principal components analysis. They all loaded strongly on a common factor and were hence combined into a single index of structural disadvantage by averaging their z-scores. For comparison purposes, Panel B of Table 2 again displays the coefficient for the immigrants measure from Models 1b (border instrument) and 2b (lag instrument). Model 3 mimics the typical model from the extant literature. The coefficient for the target variable is negative and significantly different both from zero and the estimate from Model 1b. 
Models 4 through 6 each omit one of the three measures that show the strongest bivariate correlations with the target variable. When % Hispanic or ln population size is omitted, the target variable’s coefficient becomes insignificant, which is not the case when % divorced is dropped. Model 7 omits all three of these variables, which causes only a small change in the estimate compared with Models 4 and 6. Additionally dropping the three remaining controls causes no noteworthy change in this coefficient.
In sum, reestimating the preferred models without weights and/or instruments leads to appreciably more negative coefficients for the variable of interest, with a failure to weight showing a stronger effect than a failure to instrument. The differences are not statistically significant. A conventional unweighted OLS model using the same data set and a set of standard controls yields a significant negative effect estimate, in contrast to IV models. The controls for percent Hispanic and population size are particularly influential in this respect.
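The logic of the four-way comparison (weighted versus unweighted, IV versus OLS) can be illustrated with a small simulation. The sketch below uses synthetic data with an unobserved confounder, not the city data analyzed here. All names (`wls`, `target_coefficient`, the simulated variables) are hypothetical. It returns point estimates only, because standard errors taken from a manual second stage would be incorrect.

```python
import numpy as np

def wls(X, y, w):
    """Weighted least squares coefficients, minimizing sum(w * residual**2)."""
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return coef

def target_coefficient(y, x, controls, w, z=None):
    """Coefficient on x from weighted OLS, or just-identified 2SLS if z is given."""
    const = np.ones(len(y))
    if z is not None:
        # First stage: project x on the instrument and the exogenous controls.
        Z = np.column_stack([const, controls, z])
        x = Z @ wls(Z, x, w)
    X = np.column_stack([const, controls, x])
    return wls(X, y, w)[-1]

# Synthetic data (hypothetical): u is an unobserved confounder.
rng = np.random.default_rng(42)
n = 5000
z = rng.normal(size=n)           # instrument: shifts x, no direct effect on y
u = rng.normal(size=n)           # confounder: raises x, lowers y
c = rng.normal(size=(n, 1))      # one observed control
x = z + u + 0.5 * c[:, 0] + rng.normal(size=n)
y = 0.5 * x - u + 0.3 * c[:, 0] + rng.normal(size=n)  # true effect of x is 0.5
pop = rng.uniform(1.0, 10.0, size=n)                  # stand-in for population weights
ones = np.ones(n)

estimates = {
    "weighted IV": target_coefficient(y, x, c, pop, z),
    "unweighted IV": target_coefficient(y, x, c, ones, z),
    "weighted OLS": target_coefficient(y, x, c, pop),
    "unweighted OLS": target_coefficient(y, x, c, ones),
}
for name, b in estimates.items():
    print(f"{name:>15}: {b: .3f}")
```

In these synthetic data the effect is the same for every unit, so weighting changes the estimates only through sampling noise, while the confounder pushes both OLS estimates below the true value. With real cities, weighted and unweighted estimates can also differ because the relationship varies with city size.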
Additional Results
This section investigates what differences may be observed when changes are made to the data used. In all cases reported below, Model 1b could not be estimated due to weak first stages. All results are hence based on Specification 2b, which instruments current rates of immigrants with lagged rates. New weights were calculated and used wherever appropriate.
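One common way to gauge first-stage strength is the F statistic for excluding the instrument from the first-stage regression. The sketch below computes a classical (non-robust) version for a single instrument with weights. It is an illustration on synthetic data. The criterion actually applied here is not stated, and the frequently cited rule of thumb that values below roughly 10 signal a weak instrument is mentioned only as background. Function and variable names are hypothetical.

```python
import numpy as np

def weighted_ssr(X, y, w):
    """Weighted sum of squared residuals from a weighted least-squares fit."""
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    resid = y - X @ coef
    return float(np.sum(w * resid**2))

def first_stage_f(x_endog, z, controls, w):
    """Classical F statistic for excluding the single instrument z."""
    const = np.ones(len(x_endog))
    X_restricted = np.column_stack([const, controls])   # controls only
    X_full = np.column_stack([const, controls, z])      # controls + instrument
    ssr_r = weighted_ssr(X_restricted, x_endog, w)
    ssr_u = weighted_ssr(X_full, x_endog, w)
    df_resid = len(x_endog) - X_full.shape[1]
    # One restriction, so the numerator is not divided further.
    return (ssr_r - ssr_u) / (ssr_u / df_resid)

# Synthetic example (hypothetical data, 146 units).
rng = np.random.default_rng(1)
n = 146
controls = rng.normal(size=(n, 2))
pop = rng.uniform(1.0, 10.0, size=n)     # stand-in for population weights
z_strong = rng.normal(size=n)            # instrument that drives x
z_weak = rng.normal(size=n)              # instrument unrelated to x
x = 0.8 * z_strong + controls @ np.array([0.3, -0.2]) + rng.normal(size=n)

f_strong = first_stage_f(x, z_strong, controls, pop)
f_weak = first_stage_f(x, z_weak, controls, pop)
print(f"F (relevant instrument):   {f_strong:.1f}")
print(f"F (irrelevant instrument): {f_weak:.1f}")
```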
Latin American immigrants
The results displayed above using the border instrument narrowly missed significance, and it seems likely that significant results would be obtained if the endogenous regressor were changed to create a tighter fit between it and the instrument. Hence, the proportion of the population that is immigrants from Latin America (not Hispanic immigrants) is used as the new endogenous regressor. Theory also suggests a focus on this group, as Latin American immigrants tend to be low skilled and may hence be more inclined to commit crimes or induce offending in those with whom they compete in the job market.
To estimate a model isolating the effect of the presence of immigrants from Latin America, a number of changes are made. The percentage of all, rather than native, Asians is used as a control. The instrument % immigrants in 1990 is replaced with a proxy for the 1990 rate of Latin American immigrants. As data on immigrants by place of birth are not available from the 1990 census, all values for this variable had to be imputed in a two-step procedure. First, using 2000 data, % Latin American immigrants was regressed on % Hispanics, % immigrants, and their interaction term. Second, the resulting regression equation (which explains 95% of the variance) was used to estimate the rate of Latin American immigrants in 1990 with 1990 data for % Hispanics and % immigrants. The IV regression indicates that the influence of the presence of Latin American immigrants on homicide rates is very small (B = 0.001, 95% CI [−0.066, 0.068]). A model isolating the effect of Mexican immigrants even more specifically (with variables constructed analogously) yields a similar result (B = −0.017, 95% CI [−0.109, 0.075]).
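The two-step imputation can be sketched as follows: fit the regression with an interaction term on 2000 data, then apply the fitted equation to 1990 values. The data below are synthetic and the function names are hypothetical. The sketch illustrates the procedure as described and is not the code used for the reported estimates.

```python
import numpy as np

def design(pct_hispanic, pct_immigrants):
    """Design matrix: intercept, both main effects, and their product."""
    h = np.asarray(pct_hispanic, dtype=float)
    m = np.asarray(pct_immigrants, dtype=float)
    return np.column_stack([np.ones_like(h), h, m, h * m])

def fit_imputation(pct_latam, pct_hispanic, pct_immigrants):
    """Step 1: regress % Latin American immigrants on the predictors (2000 data)."""
    X = design(pct_hispanic, pct_immigrants)
    y = np.asarray(pct_latam, dtype=float)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    centered = y - y.mean()
    r2 = 1.0 - (resid @ resid) / (centered @ centered)
    return coef, r2

def impute(coef, pct_hispanic, pct_immigrants):
    """Step 2: apply the fitted equation to another year's predictors."""
    return design(pct_hispanic, pct_immigrants) @ coef

# Synthetic stand-ins for 146 cities (made-up numbers).
rng = np.random.default_rng(0)
hisp_2000 = rng.uniform(1, 60, size=146)
imm_2000 = rng.uniform(1, 40, size=146)
latam_2000 = (0.5 + 0.1 * hisp_2000 + 0.2 * imm_2000
              + 0.005 * hisp_2000 * imm_2000 + rng.normal(0, 0.5, size=146))

coef, r2 = fit_imputation(latam_2000, hisp_2000, imm_2000)

# Made-up 1990 predictor values; real ones would come from the 1990 census.
hisp_1990 = 0.8 * hisp_2000
imm_1990 = 0.8 * imm_2000
latam_1990_hat = impute(coef, hisp_1990, imm_1990)
```

The approach rests on the assumption that the relationship estimated in 2000 also held in 1990. A high share of explained variance in 2000 does not by itself guarantee this.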
Small and large cities; high- and low-immigration cities
It may be that some cities are better able to absorb immigrants than others due to their size or previous rates of immigrant presence. To explore the issue of size, the sample was divided into two halves on the basis of population size. The coefficient for larger cities is almost exactly zero (B = −0.002, 95% CI [−0.034, 0.030]), whereas the regression for the smaller cities yields an insignificant negative estimate (B = −0.053, 95% CI [−0.265, 0.159]). Cities that were high versus low in the presence of immigrants in 1990 are also compared. For high-immigration cities, the effect is small (B = 0.010, 95% CI [−0.050, 0.070]), whereas for low-immigration cities, it is noteworthy, but insignificant (B = −0.123, 95% CI [−0.512, 0.266]).
1990 data
Another question is whether the main findings can be replicated using 1990 data. The attempt to construct analogues for 2000 models was hindered by data limitations. Specifically, values for Asian and Hispanic natives had to be estimated in a two-step procedure. First, a variable was imputed to estimate the percentage of the population that is Asian immigrants. This was done in a manner analogous to the estimation of Hispanic immigrants above. The regression for immigrant Asians explains 98% of the variance, and the resulting formula was used to estimate the percentage of Asian immigrants in 1990. Second, to obtain an estimate of the percentage of native Asians and Hispanics, the estimates for immigrant Asians and Hispanics were subtracted from the percentage of (all) Asians and Hispanics given in the 1990 census. The instrument is the percentage of the population that is foreign-born in 1980, which comes from the census and was taken from U.S. Department of Commerce, Bureau of the Census (1983). The estimate obtained is B = 0.027 (95% CI [−0.033, 0.087]). Although this may seem to contrast with the negative estimate obtained with the 2000 data, both coefficients are small and insignificant. Taken together, the estimates from this model for 1990 and 2000 suggest that the influence of the presence of immigrants on homicide rates was close to zero in both years.
Summary and Discussion
Classical theories of crime, as well as current public opinion and discourse, suggest that the presence of immigrants increases homicide rates. Previous research did not confirm this and tended to point toward a negative effect. I have argued that previous results should not be taken at face value, as they are based on unweighted samples, and there are concerns regarding the identification strategies used. IV regressions with a weighted sample of cities in the year 2000 yield no evidence of a noteworthy negative effect. The use of the border instrument leads to a substantial positive, but insignificant, estimate, while the use of the lag instrument suggests the effect is close to zero. The latter finding is replicated with 1990 data. Together, these results are consistent with the conclusion that there is no positive effect of immigration on homicide. No support is found for the view that the presence of immigrants reduces crime to a noteworthy extent. Confident statements about a strong positive or negative effect of immigration on city crime rates would thus seem unwarranted.
Although the CIs of these estimates all include zero, one might still ask why the point estimates are so different. A simple answer is that IV regressions measure the effect of the target variable as influenced by the instrument. Hence, regressions using the border instrument estimate the effect of the presence of immigrants to the extent that it is influenced by location in a border state. The lag instrument leads to estimates of immigrants’ effect on homicide insofar as their presence is influenced by the previous presence of immigrants. As these two quantities are not the same, we may expect different estimates. One might suspect that the border instrument picks up an effect of Latin American or Mexican immigrants specifically, whose presence may cause more crime than that of other immigrants. However, models aiming to isolate the effect of these groups do not support this view.
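For a single instrument and, for simplicity, no controls, this point can be stated compactly. The notation is introduced here for illustration only: Y is the homicide rate, X the immigrant share, Z the instrument, and the subscript w marks population-weighted sample moments.

```latex
\hat{\beta}_{IV}
  = \frac{\widehat{\operatorname{Cov}}_w(Z, Y)}{\widehat{\operatorname{Cov}}_w(Z, X)},
\qquad
\hat{\beta}_{OLS}
  = \frac{\widehat{\operatorname{Cov}}_w(X, Y)}{\widehat{\operatorname{Var}}_w(X)}
```

Only the variation in X that moves with Z enters the IV ratio, whereas OLS uses all variation in X. The border and lag instruments therefore draw on different sources of variation in immigrant presence, and their estimates need not coincide.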
Another possible explanation has to do with the fact that the credibility of any IV analysis hinges on its identification assumptions. These would be violated if there were an influence of an instrument on the dependent variable that is not accounted for by the controls. Specific channels through which such an influence might operate are hard to think of in the case of the border instrument. In the models using the lag instrument, a violation would occur if high rates of immigrants hampered the conventional socialization of children, increasing the likelihood that these people commit homicide 10 years later. More generally, it is conceivable that a high rate of immigrants disrupts mechanisms of social control and sets cities on a pathway of high crime. However, while some research tentatively suggests such an effect (Putnam, 2007), others stress findings showing how immigration revitalizes neighborhoods (Lee & Martinez, 2002; Lee et al., 2001; Sampson, 2008), and definitive statements on this question are unwarranted given the limited scope of the literature. The reason for the disagreement between the results ultimately remains unclear, but it should be kept in mind that the estimates are consistent with each other in the sense that their CIs overlap and that none of them suggests a noteworthy negative effect of the presence of immigrants.
If we are willing to put some faith in the weighted IV estimates, then it is useful to compare them with coefficients resulting from different procedures. A first set of comparisons suggests that a failure to weight by city size biases the estimates downward to a noteworthy degree, whereas the results comparing IV with OLS specifications are less consistent. The largest difference between preferred and alternative specifications was obtained when replacing the former with a stylized “typical” specification. It thus appears that standard estimates in the extant literature somewhat exaggerate the extent to which the presence of immigrants reduces homicide rates. The variables % Hispanic and population size seem particularly problematic in this respect. Although regressions not controlling for these variables could be criticized for undercontrol, this article shows that this dilemma can be circumvented by controlling for the presence of native Hispanics and historical population sizes, irrespective of whether an IV strategy is used.
Such solutions are useful, as the IV approach will hardly replace standard regression in the study of homicide rates. One reason is that satisfactory instruments are often unavailable. Another is that standard techniques have a clear advantage. Whereas IV estimates refer to the effect of the endogenous regressor as influenced by the instrument, standard techniques can use the target variable “as is.” Resulting estimates are credible to the extent that reasons for a correlation between the target and dependent variables other than a causal effect have been eliminated. Awareness of this problem is more important than the use of any specific technique. If findings across techniques are similar, this increases confidence in the results.
The weighting procedure used in this article was motivated by the desire to make statements about the urban United States, which is unwarranted when unweighted estimates are used. However, these results need not generalize to the nation as a whole. Social scientists are wary of the ecological fallacy, pointing out that inferring individual-level relationships from aggregate data may lead astray, but inferences from smaller to larger units are also risky (Sampson & Messner, 1991). Specifically, it is conceivable that immigration has no net effect on homicide at the city level but a substantial influence on national rates nonetheless. This pattern could emerge if immigration had homicide-inducing effects, but these were offset at the city level by immigration causing particularly crime-prone residents to leave the city. Then we would see no effect on city rates despite a positive effect of immigration on national homicide rates (similarly, Sailer, 2006). Hence, it seems unwarranted to use results from urban samples to argue that immigration caused national rates to decrease. Conclusions about the effects of immigration at the national level require studies cast at this level. Because the policy relevance would be high, it would be particularly important that researchers attempting such analyses do so mindful of the challenges encountered in the causal analysis of observational data.
Footnotes
Acknowledgements
I am also grateful to Kenneth Bollen, Charis Kubrin, Ramiro Martinez, Edward Shihadeh, Jacob Stowell, and Tim Wadsworth for providing copies of research papers and to Editor Wendy Regoeczi, two anonymous referees, Steven Messner, Robert Putnam, Robert Sampson, Christopher Winship, and especially Jörn-Steffen Pischke for helpful comments on earlier drafts. All remaining errors are mine.
Author’s Note
The research presented herein is loosely based on my dissertation, supervised by Thomas Ohlemacher at the University of Hildesheim. Data used in this study were collected by the U.S. Census Bureau and the FBI, and downloaded from the web pages of the U.S. Census Bureau (http://factfinder2.census.gov/faces/nav/jsf/pages/index.xhtml), the National Historical Geographic Information System (https://www.nhgis.org), and the Inter-university Consortium for Political and Social Research.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
