Abstract
Using crime data for the 48 conterminous US states and the distribution dynamics approach, this paper detects two distinct phases in the evolution of the property crime distribution: a period of strong convergence (1971–1980) is followed by a tendency towards divergence and bimodality (1981–2010). Moreover, the analysis reveals that differences in income per capita and police can explain the emergence of a bimodal shape in the distribution of property crime: in fact, after conditioning on these variables, the bimodality completely disappears. This empirical evidence is consistent with the predictions of a two-region model that stresses the importance of income inequality in determining the dynamics of the property crime distribution.
Introduction
A vast literature has tried to understand the causes and the dynamics of crime, and important contributions can be found in different social science disciplines: from economics (Becker, 1968) to sociology (Cohen and Felson, 1979), from politics (Smith, 1997) to law (Marvell and Moody, 2001). A wide range of determinants has been considered to explain crime, including unemployment, income inequality, police officers, the legalisation of abortion and many others: see, inter alia, Baltagi (2006), Cornwell and Trumbull (1994), Donohue and Levitt (2001), Gould et al. (2002), Greenberg (2001), Kapuscinski et al. (1998), Levitt (2002), Machin and Meghir (2004), Marvell and Moody (1996), Paternoster and Bushway (2001), Raphael and Winter-Ebmer (2001), Rosenfeld and Fornango (2007).
Moreover, the empirical crime literature has focused its attention on the dynamics of the aggregate level of crime in the USA (see, for example, Cook and Cook, 2011). In particular, many papers have tried to explain the rise in crime rates in the 1970s and 1980s and their decline in the 1990s. Reviewing the evidence from these numerous papers, Levitt (2004) identifies four factors that explain the decrease in crime that started in the 1990s: the increasing number of police, the skyrocketing number of prisoners, the ebbing of the crack epidemic and the legalisation of abortion in the 1970s. Shoesmith (2010) also seeks to explain the dynamics of crime, using an error correction model and four factors to account for both the rise and fall of crime rates: arrest rates, income per capita, the proportion of justice resources devoted to drug crime and alcohol consumption.
The aim of this paper is different because it does not focus on the aggregate value of crime but on the analysis of the entire distribution of crime rates for the US states and on its evolution over time. Other papers have tried to address this issue using standard approaches, such as the beta-convergence and sigma-convergence concepts: for instance, Cook and Winfield (2013), employing growth regressions and measures of cross-sectional variation, provide evidence of convergence for the US states, whereas, using a different probabilistic approach, Cook and Watson (2013) find different results depending on the period analysed and on the crime typology considered.
However, it has been recognised that a negative relationship between initial values and growth rates, which is the essence of the beta-convergence analysis (Barro, 1991; Barro and Sala-i-Martin, 1991, 1992; Baumol, 1986), is a necessary but not a sufficient condition for convergence (Quah, 1993a): in fact, the regression approach, by focusing on the behaviour of a representative unit towards its own steady state, is completely silent on what happens to the entire cross-sectional distribution. Moreover, the sigma-convergence analysis (Sala-i-Martin, 1996) is also affected by significant drawbacks because, as argued by Quah (1996a), a constant variability is compatible with very different dynamics, from criss-crossing and leap-frogging to persistent inequality. Distinguishing between these completely different patterns is crucial and is possible only by analysing the entire cross-sectional distribution.
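The point made by Quah (1996a) about sigma-convergence can be illustrated with a toy example. The following is a minimal sketch, not part of the original analysis: the five values are hypothetical, and `sigma` and `rank_changes` are illustrative helper names. It shows that persistent inequality and complete leap-frogging can leave the cross-sectional standard deviation unchanged.

```python
import numpy as np

def sigma(values):
    """Cross-sectional standard deviation, the usual sigma-convergence statistic."""
    return float(np.std(values))

def rank_changes(initial, final):
    """Number of units whose rank in the cross-section differs between two dates."""
    r0 = np.argsort(np.argsort(initial))
    r1 = np.argsort(np.argsort(final))
    return int(np.sum(r0 != r1))

# Hypothetical crime rates for five units (illustrative numbers only).
initial = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
persistent = initial.copy()      # every unit stays where it started
leapfrog = initial[::-1].copy()  # the ordering is completely reversed

# Both final cross-sections have the same dispersion as the initial one...
same_dispersion = np.isclose(sigma(persistent), sigma(leapfrog))
# ...but the intra-distributional dynamics are entirely different.
moves_persistent = rank_changes(initial, persistent)
moves_leapfrog = rank_changes(initial, leapfrog)
```

A dispersion statistic alone cannot tell the two cases apart, whereas an object that tracks where each unit ends up, such as a transition kernel, can.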
In order to overcome these criticisms affecting the existing literature about convergence analysis of crime, the present paper adopts an alternative methodology, i.e. the distribution dynamics approach (Quah, 1993a, 1993b, 1996a, 1996b, 1997), which allows the study of the entire distribution of crime rates, both in terms of shape and of intra-distributional dynamics.
In particular, this paper considers the dynamics of property crime rates for 48 conterminous US states during the period 1971–2010. Results indicate different phases for this typology of crime: in fact, a period of strong convergence (1971–1980) is followed by a tendency towards divergence and bimodality (1981–2010).
Furthermore, the analysis reveals that differences in the levels of income per capita, whose divergence starting from the 1980s has been documented by Gerolimetto and Magrini (2017) in a distribution dynamics setting, and in the numbers of state police employees, which also exhibit a divergent pattern starting from the 1980s, can explain the emergence of a bimodal shape in the distribution of property crime rates: in fact, after conditioning on these two variables, the bimodality completely disappears.
The paper is organised as follows: the next section presents a two-region model which investigates the relationship between crime and other socio-economic variables from a theoretical point of view; the section entitled ‘Methodology’ introduces the adopted empirical methodology, whereas the ‘Data’ section describes the data employed; finally, we present the empirical results obtained with the distribution dynamics technique and explore the relationship between crime and some of its determinants.
A two-region model of crime dynamics
This section presents a theoretical framework in which it is possible to explore the dynamics of crime over time and across different spatial units.
In particular, in this model there are two regions or states, indicated by
where
On the other hand, the future value of income depends on the current level of crime, because the presence of high crime rates reduces the resources available in each state by discouraging investments:
in which
The two states decide, through taxation, the share of resources
Consequently, the expenditure in crime repression improves the future level of income by reducing crime and its negative effect on the level of resources, but it is a costly investment because it requires an increase of the tax rate.
It is assumed that politicians choose in each period the level of taxation by maximising the probability of being re-elected
where
The first order conditions of the problem can be rewritten in the following way:
in which the left-hand side represents the marginal cost of taxation, whereas the right-hand side is the marginal benefit. It is worth noting that taxation has a positive benefit only if the crime sensitivity to police, measured by
The optimal level of taxation equalises marginal benefits and marginal costs:
This expression shows that the optimal tax is higher when: the weight associated to taxation, measured by
This model allows us to analyse the effects of an exogenous income shock and its propagation to the other variables of the system. In particular, the case of an increase in income inequality is considered because this scenario is related to the increasing inequality in terms of income per capita observed across the US states, starting from the 1980s. Consequently, it is assumed in the main calibration that state
Figure 1 reports the dynamics implied by the model for the main variables in the different scenarios considered. In particular, the columns correspond to the small, medium and big shock scenarios, respectively. The first row shows the timing and magnitude of the shock; the second and third rows present the dynamics over time of income and crime, respectively, in the two regions, compared with the average of the two states in each period (the time series of the tax rate and police follow the same pattern as income and, for this reason, are omitted); finally, the last row depicts the trajectory of the aggregate value of crime, which is the sum of the crime levels in the two regions. In every scenario the mechanism behind the model is the same: a higher (lower) level of income leads to a decline (rise) of crime through both the opportunity cost and the police channel.

Figure 1. Time series of the main variables of the model in the presence of different income shocks.
In the case of a small shock affecting the income of the richer region, this state does not have enough resources to reduce its crime level, which continues to grow and to drain wealth from the state, and this leads to a divergent crime pattern. Moreover, the aggregate level of crime exhibits an explosive trend because of the strong growth of crime in both regions.
On the other hand, the second column shows that a bigger income shock provides to state
Finally, with an even bigger shock, the dynamics of the system changes completely: in this scenario the richest region is able to reduce its crime level compared with the poorest region. Benefiting from this crime reduction, income in region
If the values assigned to the weight of taxation
The three scenarios considered reveal that, from a theoretical point of view, a greater income inequality, such as the one experienced in the USA starting from the 1980s, can have completely different consequences on the dynamics of crime and on its distribution, depending on the size of the shocks that drive this phenomenon. The following empirical analysis will reveal which theoretical scenario best approximates the US experience.
Methodology
As mentioned in the introductory section, there are two approaches to the analysis of convergence: the regression method, complemented by the study of the cross-sectional variability, and the distribution dynamics approach. In this paper the latter approach is chosen because it allows the study of the entire cross-sectional distribution of a given variable, both in terms of external shape and intra-distributional dynamics, using stochastic kernels to describe its evolution over time.
Let
in which
The output of this type of analysis is represented by a set of figures: (1) a three-dimensional plot of the estimated stochastic kernel (the Technical appendix sketches the estimation procedure employed); (2) the corresponding contour plot, with contours at the 90%, 50% and 10% level; (3) the Highest Density Regions (HDR) plot, proposed by Hyndman (1996), in which the vertical strips represent conditional densities given specific values for the initial year and, for each strip, darker to lighter areas display the 10%, 50% and 90% highest density regions; (4) a plot comparing the initial year distribution with the final year one and the ergodic.
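To make these objects concrete, the following is a minimal sketch of how a stochastic kernel and the implied ergodic distribution might be approximated on a grid. It is not the estimator used in this paper (which is sketched in the Technical appendix): it assumes a plain product Gaussian kernel with a single fixed bandwidth `h`, and the data are synthetic values generated only for illustration. All function and variable names are hypothetical.

```python
import numpy as np

def gaussian(u):
    """Standard normal density."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def stochastic_kernel(x0, x1, grid, h):
    """Approximate the conditional density f(y | x) on grid x grid.

    Row i holds the density of the final-year value y given that the
    initial-year value equals grid[i]. Assumes an evenly spaced grid.
    """
    wx = gaussian((grid[:, None] - x0[None, :]) / h)      # (G, n) weights in x
    ky = gaussian((grid[:, None] - x1[None, :]) / h) / h  # (G, n) kernels in y
    joint = wx @ ky.T                                     # (G, G), proportional to f(x, y)
    step = grid[1] - grid[0]
    # Normalise each row so that it integrates to one over the grid.
    return joint / (joint.sum(axis=1, keepdims=True) * step)

def ergodic(kernel, step, iters=500):
    """Iterate the discretised kernel to approximate its stationary density."""
    P = kernel * step                          # transition matrix, rows sum to one
    dist = np.full(P.shape[0], 1.0 / P.shape[0])
    for _ in range(iters):
        dist = dist @ P
    return dist / step                         # convert probabilities back to a density

# Synthetic data (assumption): 48 units, values centred on one, mild convergence.
rng = np.random.default_rng(0)
x0 = rng.normal(1.0, 0.3, size=48)
x1 = 1.0 + 0.5 * (x0 - 1.0) + rng.normal(0.0, 0.1, size=48)

grid = np.linspace(0.0, 2.0, 81)
step = grid[1] - grid[0]
kernel = stochastic_kernel(x0, x1, grid, h=0.15)
erg = ergodic(kernel, step)
```

The matrix `kernel` is what the three-dimensional and contour plots display, while `erg` plays the role of the ergodic density compared with the initial and final distributions. A fixed bandwidth is the simplest choice; an adaptive bandwidth would change the estimates but not the logic.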
Then, convergence is analysed by looking at the three-dimensional shape of the stochastic kernel and at the corresponding contour and HDR plots or by comparing the initial distribution with the final one and the ergodic. A probability mass located along the main diagonal in the contour and HDR plots indicates a persistence feature of the studied phenomenon, because the elements in the cross-sectional distribution remain where they started. On the other hand, a convergence process is highlighted by a probability mass concentrated around the mean value at time
Data
The crime data used in this paper are from the Federal Bureau of Investigation (FBI)’s Uniform Crime Reports (UCR), 1 which include crime reports submitted voluntarily either directly by local, state, federal or tribal law enforcement agencies or through centralised state agencies across the country. Data are freely available online for each year, starting from 1960, and for the 50 US states, plus the District of Columbia. Crimes are classified according to the following categories: burglary, larceny-theft, motor-vehicle theft, murder, rape, robbery, assault. The first three typologies are grouped in the property crime category while the others are classified as violent crimes. The attention of this paper is devoted to the aggregate category of property crime, which is measured by a standard index, the property crime rate, defined as the number of reported property crimes committed per 100,000 inhabitants.
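The definition of the property crime rate can be written as a short function. This is only an illustrative sketch: the function name and the numbers in the example are hypothetical and are not taken from the UCR data.

```python
def property_crime_rate(burglary, larceny_theft, motor_vehicle_theft, population):
    """Reported property crimes per 100,000 inhabitants."""
    if population <= 0:
        raise ValueError("population must be positive")
    total = burglary + larceny_theft + motor_vehicle_theft
    return 100_000 * total / population

# Hypothetical state with 250,000 inhabitants and 5,000 reported property crimes.
example_rate = property_crime_rate(1200, 3000, 800, 250_000)
```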
Figure 2 plots the time series of the property crime rate in the USA. This picture shows a very well-known pattern in the crime literature: the aggregate level of property crime increases up to the beginning of the 1990s and then falls. It is worth stressing that this trend seems consistent with the predictions of the model in the third theoretical scenario of Figure 1.

Figure 2. Time series of the property crime rate in the USA from 1971 to 2010.
The UCR data have many advantages: they cover a long period of time with a stable methodology, allowing a meaningful trend analysis; they are the only source of geographically disaggregated crime data available for the USA; they offer good coverage in terms of crime typologies and in terms of geographic locations considered. On the other hand, the UCR programme has some limitations: it covers only crime reported to the police, and many crimes have low reporting rates; furthermore, since reporting is voluntary, some enforcement agencies may not report information or the information may be incomplete. In the presence of incomplete data, the FBI uses specific protocols to impute the missing values: the imputation is based on crime rates of agencies considered similar according to population size, type of agencies (city, rural and state, suburban counties) and geographic location. 2
In order to bring the model to the data, measures of income and police are needed. The relationship between property crime and income will be explored in the following analysis using data on real per capita personal income: the personal per capita income net of current transfer receipts comes from the Bureau of Economic Analysis, 3 while the Consumer Price Index (CPI) used to deflate income is from the Bureau of Labor Statistics. 4 On the other hand, the level of crime repression activities in each state is measured by the number of state police employees per 100,000 inhabitants, as recorded in the UCR data. 5 Furthermore, our analysis is restricted to the 40-year period from 1971 to 2010 6 and to the 48 continental and conterminous US states. 7
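Deflating nominal income by the CPI can be sketched as follows. The base-year convention and all numbers are assumptions made for illustration; the paper does not state which base year is used.

```python
def real_income(nominal_income, cpi, base_cpi=100.0):
    """Express nominal per capita income in base-period prices.

    `base_cpi` is the index value of the (assumed) reference period.
    """
    if cpi <= 0:
        raise ValueError("cpi must be positive")
    return nominal_income * base_cpi / cpi

# Hypothetical figures: 30,000 in current dollars when the CPI stands at 150.
example_real = real_income(30_000, cpi=150.0)
```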
Distribution dynamics of property crime
Using the methodology described in section ‘Methodology’, the dynamics over time of the distribution of property crime is analysed.
Figure 3 shows the results for the overall period, from 1971 to 2010. The first three graphs are related to the shape of the estimated stochastic kernel. In particular, if we look at the contour levels and at the 45 degree line, it is possible to analyse the intra-distributional dynamics between the two considered periods: observations with a high property crime rate in 1971 are likely to have a lower crime rate in 2010; states with a low rate in the initial year tend to present a higher crime rate in the final year; moreover, there is a group of states located around the mean in the initial year experiencing the highest crime rates in the final period. This situation results in a final and ergodic distribution affected by bimodality, as displayed in the fourth graph. 8 The states that are responsible for the first mode of the final density are those located in the northwest part of the USA while the second mode can be attributed to the states in the southeast.

Figure 3. Distribution dynamics analysis of property crime rates for the period 1971–2010 using the 48 continental US states.
However, behind these dynamics, two distinct phases are identifiable. In fact, Figure 4 depicts a period of strong convergence from 1971 to 1980, made evident by a sharply concentrated unimodal ergodic distribution and by a noticeable clockwise rotation of the estimated probability mass: this rotation represents evidence of convergence because it implies that states with a low crime rate at the beginning of the period considered present higher rates at the end; and the opposite holds for states with high crime rates. Conversely, Figure 5 shows a phase of divergence, from 1981 to 2010, in which the distribution of property crime rates shows a clear tendency towards bimodality, made transparent by the comparison of the initial distribution with the final one and the ergodic.

Figure 4. Distribution dynamics analysis of property crime rates for the period 1971–1980 using the 48 continental US states.

Figure 5. Distribution dynamics analysis of property crime rates for the period 1981–2010 using the 48 continental US states.
Robustness checks
These findings seem to suggest a dynamic more similar to the third scenario predicted by the model. In order to test the robustness of the obtained results and to address the concern that a particular city or county is driving the whole of the distribution (e.g. the city of New York in the state of New York), the analysis is replicated considering the nine regions in which the US territory is divided according to the definition of the United States Census Bureau.
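One way to obtain regional rates from state data is to sum crimes and population within each region before computing the rate. The sketch below assumes this population-weighted aggregation, which the text does not spell out; the state and region labels and all figures are hypothetical placeholders rather than the actual Census Bureau mapping.

```python
from collections import defaultdict

def regional_rates(crimes, population, region_of):
    """Crimes per 100,000 inhabitants for each region, pooling its states."""
    total_crimes = defaultdict(float)
    total_population = defaultdict(float)
    for state, region in region_of.items():
        total_crimes[region] += crimes[state]
        total_population[region] += population[state]
    return {r: 100_000 * total_crimes[r] / total_population[r] for r in total_crimes}

# Hypothetical inputs: three states mapped into two regions.
crimes = {"S1": 4000, "S2": 1000, "S3": 3000}
population = {"S1": 100_000, "S2": 100_000, "S3": 50_000}
region_of = {"S1": "R1", "S2": "R1", "S3": "R2"}

rates = regional_rates(crimes, population, region_of)
```

Pooling in this way dilutes the influence of a single high-crime city or county, which is the concern the regional exercise is meant to address.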
Figure 6 presents the distribution dynamics analysis for these regions in the period 1971–1980: convergence is highlighted by a clockwise rotation of the probability mass and by the transition from an initial distribution characterised by high variability to a more concentrated, although bimodal, final distribution. Further evidence of convergence is provided by the ergodic distribution, which exhibits lower variability than the final distribution and in which any evidence of bimodality disappears. Figure 7 replicates this exercise for the period 1981–2010: the last picture in this figure reveals the emergence of a final and ergodic distribution far less concentrated than the initial one. With only nine regions it is not possible to capture the bimodal shape observed in Figure 5; however, the tendency towards divergence is clear.

Figure 6. Distribution dynamics analysis of property crime rates for the period 1971–1980 using the nine US regions.

Figure 7. Distribution dynamics analysis of property crime rates for the period 1981–2010 using the nine US regions.
The analysis performed on more aggregated regional divisions confirms the robustness of the results regarding the existence of two phases in the evolution of the distribution of property crime, i.e. a period of convergence followed by divergent dynamics. This evidence is again consistent with the third scenario predicted by the model, characterised by the presence of strong income shocks.
Conditional distribution dynamics
According to the theoretical framework of section ‘A two-region model of crime dynamics’, the divergence of property crime rates is driven by income shocks that affect crime through the channels of opportunity cost and police. Moreover, in the same period in which a tendency towards bimodality has been detected for property crime rates, i.e. the years from 1981 to 2010, Gerolimetto and Magrini (2017) find an analogous divergent pattern using real per capita personal income for the same 48 conterminous US states. By applying the distribution dynamics technique, the same divergent tendency can be observed in the number of state police employees, starting from the 1980s. 9 These considerations motivate a closer empirical examination of the relationship between income, police and crime.
In order to understand if income and police can explain the bimodality of property crime, the conditional distribution of the variable of interest must be considered. Since both income and police are likely to be endogenous variables, 10 the method proposed by Quah (1996b) is considered in the following analysis.
The first step of the procedure consists in estimating a growth regression, in which the growth rate of the dependent variable (property crime) is regressed on a set of potentially endogenous variables (income and police), and then taking the fitted values for subsequent analysis: in order to correct for the possible endogenous nature of these variables and, in particular, the presence of feedback effects, the current values, the lags and the leads are all included in the regression. The inclusion of leads might not completely solve the endogeneity problem, owing to the many possible channels of reverse causality between dependent and independent variables: this means that the following results should be interpreted with caution and on a descriptive basis.
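The structure of this first step can be sketched as an ordinary least squares regression on current, lagged and leading values of the conditioning variables. The sketch below is a simplified illustration under several assumptions: a single unit's time series rather than the panel of states, one lag and one lead (`k=1`), and synthetic data. The exact specification used in the paper is given in the Technical appendix, and all names here are hypothetical.

```python
import numpy as np

def lead_lag_design(x, k=1):
    """Columns: lags k..1, the current value, leads 1..k of a (T,) series.

    The returned array has shape (T - 2k, 2k + 1) and is aligned with
    periods k, ..., T - k - 1.
    """
    T = len(x)
    return np.column_stack([x[k + s : T - k + s] for s in range(-k, k + 1)])

def fitted_growth(crime_growth, income, police, k=1):
    """OLS of crime growth on a constant and the lead/lag blocks.

    Returns the fitted values and the residuals for the usable periods.
    """
    T = len(crime_growth)
    X = np.column_stack([
        np.ones(T - 2 * k),
        lead_lag_design(income, k),
        lead_lag_design(police, k),
    ])
    y = crime_growth[k : T - k]
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta
    return fitted, y - fitted

# Synthetic series for one hypothetical state over 40 years.
rng = np.random.default_rng(1)
income = np.cumsum(rng.normal(0.02, 0.05, size=40))
police = np.cumsum(rng.normal(0.01, 0.05, size=40))
crime_growth = rng.normal(0.0, 0.1, size=40)

fitted, resid = fitted_growth(crime_growth, income, police, k=1)
```

Including leads and lags costs `2k` observations at the ends of the sample, so only 38 of the 40 periods are used here.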
In the second step of the procedure, the fitted values are used to estimate the residual component of crime that is not explained by the accumulation of the conditioning variables (details are in the Technical appendix). Figure 8 is obtained by applying the distribution dynamics analysis to this unexplained component of property crime, for the period 1971–2010. 11

Figure 8. Distribution dynamics analysis of property crime rates for the period 1971–2010 conditioning on income and police.
The conditional convergence is made evident by the noticeable clockwise rotation of the estimated probability mass. Moreover, any sign of bimodality has completely disappeared from the ergodic distribution, which is also more concentrated than the initial and final ones. Thus, the divergence in the number of state police employees and the increasing income inequality, both of which started in the 1980s, explain well the tendency towards a bimodal shape of the property crime distribution.
Concluding remarks
The importance of analysing the convergence of crime rates, as stated by Cook and Winfield (2013), is related to the existence of a possible national crime trend and whether there are movements towards it. Moreover, the analysis of convergence helps in choosing between competing sociological theories, such as the modernisation and the conflict theories (LaFree, 2005): in fact, according to the modernisation view, crime rates should converge given the spread of developments and advances across regions, whereas the conflict theory predicts their divergence, arguing that these developments have uneven speed in the different regions.
The proposed theoretical model predicts the emergence of two distinct phases in the dynamics of the property crime distribution when spatial units are affected by strong income shocks generating greater inequality: a period of convergence followed by the divergence of crime rates. This scenario fits the situation of the US states, because economic disparities were exacerbated in those regions, starting from the 1980s.
This theoretical prediction is confirmed by the descriptive analysis presented in this paper. In fact, using the distribution dynamics methodology, two different patterns are identified during the period 1971–2010 for the property crime distribution: a phase of strong convergence (1971–1980), followed by a period of divergence and a tendency towards bimodality (1981–2010). These two distinct phases, to the best of our knowledge, have not been highlighted by the existing literature.
Moreover, the contemporaneous divergence of personal per capita income and of state police rates can account for the observed dynamics of property crime rates: in fact, conditioning on income per capita and state police, the distribution of property crime does not exhibit a bimodal shape, thus indicating the presence of conditional convergence.
An important policy implication can be derived from the analysis. The theoretical framework suggests that significant income disparities are translated into different concentrations of crime through the opportunity cost and the police channels. Moreover, the model predicts that poor states have fewer resources to fight crime and, consequently, exhibit higher crime rates. Since the presence of crime discourages investments and lowers income, these states are trapped in a vicious circle. Therefore, mitigating the effects of inequality with cross-state compensations, in terms of financial and police resources, may help to avoid both the concentration of crime activities in specific regions and the emergence of self-reinforcing gaps between poor and rich states.
Footnotes
Technical appendix
Acknowledgements
I am grateful to Stefano Magrini, Paolo Pellizzari and three anonymous referees for their extremely useful comments and suggestions. I also thank Margherita Gerolimetto and Stefano Magrini for the code used to implement the kernel density estimator.
Funding
The author received no financial support for the research, authorship, and/or publication of this article.
