Abstract
The intentional killing of one human being by another is considered the worst of crimes. Therefore, homicide prevention is a major concern for policy makers in both developing and developed countries. We propose regression models for the homicide rates in Brazil, with response distributions chosen to respect the restriction of these rates to the unit interval. We adopt the beta and simplex regression models with systematic components for the mean and dispersion parameters to explain the homicide rates in the 27 state capitals of Brazil from the following explanatory variables: time, Gini coefficient, municipal human development index (MHDI), illiteracy and poverty rates. We employ standard likelihood techniques, perform influence and residual analyses and calculate goodness-of-fit statistics to select the best regression to explain the homicide rates in these capitals. We perform the computations in R. The main results suggest the following: the mean homicide rate increases over time; there is a negative correlation between the MHDI and the homicide rate; poverty has a small negative effect on the mean homicide rate in the beta regression. The Gini coefficient and the illiteracy and poverty rates explain the dispersion of the homicide rates.
Introduction
In this paper, we employ both beta and simplex regressions to explain the variability of the homicide rates in all Brazilian state capitals using data sets from three separate years (1991, 2000 and 2010). Homicide, defined here as an unlawful death deliberately inflicted on one person by another, is considered the worst of crimes. According to the United Nations Office on Drugs and Crime, half of all homicides occur in countries that account for 11% of the global population. There is significant variation across and within nations. For instance, in 2012, the global homicide average was estimated at 6.2 per 100,000 inhabitants, thus representing up to 437,000 violent deaths. Some regions, such as Central America and Southern Africa, have rates close to 25 cases per 100,000 population, while Southern and Western Europe have averages up to six times lower than the global mean. The Americas (16.3) and Africa (12.5) are significantly more violent than Europe (3) and Asia (2.9). In addition to this spatial variation, homicide rates also follow different trends over time. For example, Eisner (2003) provided long-term homicide rates across Western Europe and showed that murder rates decreased dramatically over time. Conversely, official data from England and Wales (1967–2017) show a more complex pattern, with a positive linear trend from 1960 to 1990 and an exponential increase from 1999 to 2003. After a ten-year negative trend, homicides started to increase again in 2015.
On theoretical grounds, scholars from different disciplines try to identify which variables explain the variation in homicides. To the best of our knowledge, Kronh (1976) and McDonald (1976) provided pioneering cross-national comparisons of property crime based on police statistics. In particular, early academic works have systematically found a positive association between socioeconomic variables and homicide rates (Loftin & Hill, 1974; Parker & Smith, 1979; Parker & Loftin, 1983). Blau and Blau (1982) and Messner (1982) reported null results regarding the poverty-homicide link. Braithwaite and Braithwaite (1980) used data from 31 countries over a 20-year period (1955–1974) to estimate the relationship between crime and income inequality. They reported that most of the variables have a strong and statistically significant correlation with homicides. The authors performed several tests and evaluated these countries using the following variables: political freedom, ethnic fractionalization, gross domestic product (GDP) per capita and intersectoral income inequality. Williams (1984) employed a nonlinear model to examine the relationship between poverty levels and homicide rates for 125 statistical metropolitan areas in the United States. Gartner (1990) estimated a time-series cross-section model using sex- and age-specific victimization data from 18 countries (1950–1980). According to Nivette (2011), a series of theoretical approaches has been used in cross-sectional studies of the predictors of crime. Applying meta-analysis, the author identified 10 different perspectives, and most of the works are based on demographic explanatory variables.
Messner (1982) identified two measures of deprivation: absolute and relative. The former is defined as the proportion of a population below a fixed measure of well-being. The latter represents the relative dispersion of income within a specific population. According to Roberts and Willits (2015), under the relative deprivation perspective “individuals evaluate their socio-economic standing in relation to others, and develop frustration, hostility, and resentment when they realize that they have fewer resources than others”. More recently, Kelly (2000) used data from the 1991 FBI Uniform Crime Reports to find that inequality has a strong impact on violent crimes, but no effect on property offenses. He also reported that poverty and police activity explain the variation of violent crime, but have little correlation with property incidents. Roberts and Willits (2015) argued that different measures of inequality may lead to mixed findings on the income inequality-homicide link. Based on data from 208 U.S. cities, they reported that all measures of inequality are positively correlated with homicide rates. Thus, their proposed mechanism linking income inequality and crime relies on the concept of deprivation.
Among developing countries, Brazil has two notable features: extreme income inequality and high crime rates, particularly homicides. According to official estimates, Brazil was responsible for 10% of world homicides in 2015, which corresponds to almost 60,000 violent deaths; see:
In applied fields, outcomes in the form of rates or proportions often arise. Few models are suitable for fitting such data. In this paper, we apply the beta and simplex regressions with varying dispersion to predict homicide rates in Brazil. The beta regression is commonly used to model continuous variables restricted to the interval (0, 1).
Outcomes of continuous proportions or rates arise in many applied areas including social and political sciences. Regression models for proportional data have been in widespread use in the last fifteen years or so. These regressions employ a continuous distribution for the response in the interval (0, 1).
The paper is organized as follows. In Section 2, we provide a summary of the beta and simplex regressions. We employ the likelihood function to measure the adequacy of the fitted regressions to the homicide rates. The parameter estimates are the values that maximize the log-likelihood. Simple computer programs are available to fit the most common regression models. We also present some diagnostic measures and residuals. In Section 3, we give details of the homicide data collected in the state capitals of Brazil and discuss the fit of the two regressions to these data. Finally, some concluding remarks are offered in Section 4.1.
The beta and simplex regression background
In this section, we present background on (i) the beta regression, (ii) the simplex regression and (iii) diagnostic and residual analysis.
The beta regression
An important class of problems involves data in the form of rates and proportions, such as mortality rates, infection rates of diseases, etc. The beta distribution is useful for modeling random experiments that produce results in the interval (0, 1).
where
We adopt the beta density implemented in the Generalized Additive Model for Location, Scale and Shape (GAMLSS) software with mean parameter
where
Here, the mean and the variance of
Let
where
Consider a sample
where
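As an illustration of the beta model used in this subsection, the following minimal Python sketch evaluates the log-density and log-likelihood of a beta distribution indexed by a mean parameter and a dispersion parameter. It assumes that the GAMLSS beta density referred to above is the `BE` family of the gamlss.dist package, in which the classical shape parameters are a = mu(1 - sigma^2)/sigma^2 and b = (1 - mu)(1 - sigma^2)/sigma^2, so that the mean is mu and the variance is sigma^2 mu (1 - mu). The function names are hypothetical, and the code is not the authors' R implementation.

```python
import math


def beta_shapes(mu, sigma):
    """Map (mu, sigma) to the classical beta shape parameters (a, b).

    Assumes the gamlss.dist BE parameterization: 0 < mu < 1, 0 < sigma < 1.
    """
    if not (0.0 < mu < 1.0 and 0.0 < sigma < 1.0):
        raise ValueError("mu and sigma must lie strictly inside (0, 1)")
    k = (1.0 - sigma ** 2) / sigma ** 2
    return mu * k, (1.0 - mu) * k


def beta_logpdf(y, mu, sigma):
    """Log-density of a response y in (0, 1) with mean mu and dispersion sigma."""
    a, b = beta_shapes(mu, sigma)
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return log_norm + (a - 1.0) * math.log(y) + (b - 1.0) * math.log1p(-y)


def beta_loglik(ys, mus, sigmas):
    """Log-likelihood of independent responses with observation-specific parameters."""
    return sum(beta_logpdf(y, m, s) for y, m, s in zip(ys, mus, sigmas))


def beta_mean_var(mu, sigma):
    """Mean and variance computed from the shape parameters (a check on the mapping)."""
    a, b = beta_shapes(mu, sigma)
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1.0))
    return mean, var
```

In a regression setting, `mus` and `sigmas` would be obtained from the two systematic components through their link functions, and the log-likelihood would be maximized over the regression coefficients.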
The simplex dispersion (SD) density (Barndorff-Nielsen & Jørgensen, 1991) for a univariate continuous random variable
where
is the unit deviance.
Let
We denote a random variable
Let
Let
The maximization of Eqs (3) and (6) can be performed by the RS algorithm. See details in Rigby and Stasinopoulos (2005) and Stasinopoulos and Rigby (2007). All computational procedures for the simplex regression are done in GAMLSS software in R.
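For readers who want to experiment with the simplex model, the sketch below evaluates the unit deviance and the log-density in plain Python. It assumes the standard form of the simplex density of Barndorff-Nielsen and Jørgensen (1991), f(y) = [2 pi sigma^2 {y(1 - y)}^3]^(-1/2) exp{-d(y; mu)/(2 sigma^2)} with unit deviance d(y; mu) = (y - mu)^2 / [y(1 - y) mu^2 (1 - mu)^2]. The argument names `mu` and `sigma` are notation chosen here, and the code is not the RS algorithm or the GAMLSS routine used in the paper.

```python
import math


def unit_deviance(y, mu):
    """Unit deviance d(y; mu) of the simplex distribution for y, mu in (0, 1)."""
    return (y - mu) ** 2 / (y * (1.0 - y) * mu ** 2 * (1.0 - mu) ** 2)


def simplex_logpdf(y, mu, sigma):
    """Log-density of the simplex distribution with mean mu and dispersion sigma > 0."""
    log_scale = math.log(2.0 * math.pi) + 2.0 * math.log(sigma) \
        + 3.0 * math.log(y * (1.0 - y))
    return -0.5 * log_scale - unit_deviance(y, mu) / (2.0 * sigma ** 2)


def simplex_loglik(ys, mus, sigmas):
    """Log-likelihood of independent responses with observation-specific parameters."""
    return sum(simplex_logpdf(y, m, s) for y, m, s in zip(ys, mus, sigmas))
```

As with the beta case, a fitting routine would maximize `simplex_loglik` over the coefficients of the mean and dispersion predictors.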
For the beta and simplex regressions defined before, the log-likelihood function of
Another popular measure of the difference between
In order to study departures from the distribution assumption as well as the presence of outlying observations, we consider the normalized randomized quantile residuals (Dunn & Smyth, 1996). These residuals are determined by
We also use worm plots (WP) of the residuals (Buuren & Fredriks, 2001) as a tool to check the quality of the fit. The general idea of these plots is to identify regions (intervals) of an explanatory variable within which the model does not fit the data adequately. Model inadequacy is indicated when many observations lie outside the point-wise 95% confidence bands or when the points follow a systematic shape. For example, a vertical shift, a slope, a parabola or an S shape in the WP indicates a misfit in the mean, variance, skewness and excess kurtosis of the residuals, respectively.
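The two diagnostics described above can be sketched in a few lines of Python. For a continuous response, the quantile residual of Dunn and Smyth (1996) is the standard normal quantile of the fitted cumulative distribution function evaluated at the observation (the randomization step is only needed for discrete responses). A worm plot is a detrended normal Q-Q plot of these residuals. The sketch assumes the fitted CDF values have already been computed and uses the plotting positions (i - 0.5)/n, which is one common convention and not necessarily the one used by the authors' software. The function names are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and standard deviation 1


def quantile_residuals(cdf_values):
    """Quantile residuals from fitted CDF values u_i = F(y_i), each strictly in (0, 1)."""
    return [_STD_NORMAL.inv_cdf(u) for u in cdf_values]


def worm_plot_points(residuals):
    """Coordinates of a worm plot (detrended normal Q-Q plot).

    Returns (theoretical quantile, ordered residual minus theoretical quantile)
    pairs. Assumes plotting positions (i - 0.5) / n.
    """
    n = len(residuals)
    points = []
    for i, r in enumerate(sorted(residuals), start=1):
        q = _STD_NORMAL.inv_cdf((i - 0.5) / n)
        points.append((q, r - q))
    return points
```

Under a well-fitting model the second coordinate should scatter around zero without a systematic shape, which is the pattern the worm plot is designed to reveal.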
We build envelopes to enable better interpretation of the normal probability plot of the residuals. These envelopes are simulated confidence bands described by Atkinson (1985) that contain the residuals such that if the model is well-fitted, the majority of points will be randomly distributed within these bands.
Data
We consider the data for the 27 Brazilian state capitals in three separate years (1991, 2000 and 2010). The dependent variable is the homicide rate measured by the number of homicides in the city normalized by its total population. All variables were collected at the city level using public secondary data from DATASUS (
Brazil is divided into five regions that comprise 27 states. The Southeast and South are the most developed regions. The North and Northeast are the poorest parts of the country and concentrate 36% of the total population. About half of the people who live in these two regions earn less than 125 US dollars a month. The curves of homicide rates in the cities follow very different patterns. In the 2000–2010 period, Maceió more than doubled its homicide rate, whereas São Paulo showed a sharp decline. Over the full 20 years, João Pessoa and Salvador more than tripled their homicide rates. In the South capitals, there is no decrease in the homicide rates. In the North, Northeast and South, the average homicide rates grow steadily over the whole period.
Since regression models dealing with rates and proportions are typically heteroscedastic, the beta regression is more appropriate than the logistic transformation. Another important feature of the beta regression is that its parameters can be interpreted in terms of the mean response when the logit link is employed (Ferrari & Cribari-Neto, 2004).
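The interpretation under the logit link can be made concrete with a short Python sketch. With a logit link, the mean response is the inverse logit of the linear predictor, so a one-unit increase in a covariate multiplies the odds mu/(1 - mu) by the exponential of its coefficient. The coefficient and covariate values used in the usage note are hypothetical and serve only as an illustration. They are not estimates from the paper.

```python
import math


def inv_logit(eta):
    """Inverse logit link: maps a linear predictor to a mean in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))


def mean_response(coefs, covariates):
    """Mean response under a logit link for one observation.

    coefs and covariates are equal-length sequences; include a leading 1.0
    in covariates when the first coefficient is an intercept.
    """
    eta = sum(b * x for b, x in zip(coefs, covariates))
    return inv_logit(eta)


def odds(mu):
    """Odds mu / (1 - mu) associated with a mean response mu."""
    return mu / (1.0 - mu)
```

For example, with the hypothetical coefficients `[-1.0, 0.5]`, the ratio `odds(mean_response([-1.0, 0.5], [1.0, 3.0])) / odds(mean_response([-1.0, 0.5], [1.0, 2.0]))` equals `math.exp(0.5)` up to rounding, whatever the baseline covariate value.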
The data for this research are available at
Table 1 gives a descriptive report of all variables in this study.
Summary of the variables
The application aims to identify the explanatory variables that explain the homicide rates in the state capitals of Brazil. To do this, we collect the following variables from the separate years 1991, 2000 and 2010 (for
time GINI MHDI illiteracy poverty
Figure 1 displays the homicide rate behavior in the capitals of the Brazilian states for the three years. We also give the boxplot of
(a) Histogram of homicide rate data. (b) Boxplot of homicide rate data (observations of the response variable).
The systematic components for both beta and simplex regressions are defined by
and
where
Table 2 lists the MLEs, their standard errors (SEs) and
Results from the fitted complete beta and simplex regressions
Goodness-of-fit measures for the homicide rate data
The numbers in Table 2 reveal that some explanatory variables are non-significant at the 5% level. Thus, the non-significant covariates were eliminated from the regressions based on the
Results from the fitted reduced beta and simplex regressions
Goodness-of-fit measures considering only significant covariates
We now provide some conclusions about the fitted reduced beta regression which includes in the mean systematic component an intercept, time (
We calculate the case-deletion measures
Index plots for the reduced beta regression: (a) 
Residual analysis for the reduced beta regression considering only significant covariates in 
The plot of the quantile residuals versus the order of the observations for the fitted reduced beta regression is displayed in Fig. 3a. In this plot, the residuals appear randomly distributed. As a measure of fit quality, Fig. 3b displays the normal probability plot with the simulated envelope for the fitted reduced beta regression. These plots indicate that there is no evidence of serious departures from the model assumptions of the fitted reduced beta regression. Thus, there is clear evidence that the beta regression is better suited to the current data than the simplex regression. We display the worm plot for the fitted reduced beta regression in Fig. 3c. Based on this plot, we conclude that the beta regression with systematic components for
More findings can follow from the beta regression estimates in Table 4.
Time is significant, i.e., the mean homicide rate tends to increase over time. These results are consistent with the aggregate analysis by federation unit, which indicates a positive trend of homicidal violence in Brazil (1996–2017). Despite recent reductions (2018 and 2019), official data have not yet been released, making it difficult to assess the consistency of the negative fluctuation over the past two years; see

The MHDI is also a significant covariate, and the average homicide rate tends to increase when this index decreases. In fact, the negative sign of the estimate

Poverty is significant, and the mean homicide rate tends to decrease slightly when the poverty proportion increases. This finding is curious given the lower correlation between MHDI and poverty. Another interesting issue is the independent effect of poverty, even when controlling for the MHDI. The higher the poverty, the lower the mean homicide rate. These results corroborate Messner’s (1982) proposition about the need to review the causal link between poverty and violence.
In the beta regression model, we emphasize that the homicide rate variability increases when the dispersion parameter
As time passes, the estimated dispersion parameter of the homicide rate decreases, holding the other variables fixed.

The GINI coefficient is significant for the dispersion of the homicide rate, i.e., this dispersion is expected to increase when the GINI increases. This finding is consistent with the literature that examines the relationship between income inequality and homicidal violence (Kelly, 2000; Neumayer, 2005). These results have direct implications for the formulation and implementation of public policies specially designed to reduce income asymmetry.

The illiteracy rate is significant for the dispersion parameter. The positive sign of

Finally, the poverty proportion is also significant for the dispersion of the homicide rate. This dispersion tends to decrease for capitals with greater variability of wealth.
The primary purpose of this paper is to quantify the effects of some explanatory variables that influence homicide rates in Brazil using the most important regression models for proportional data. We construct beta and simplex regressions with two systematic components, for the mean and dispersion parameters, to model the homicide rates in the Brazilian capitals. The response variable is the homicide rate, measured by the number of homicides in the city normalized by its total population, with data from 1991, 2000 and 2010. We consider the following explanatory variables: time, the GINI coefficient as a measure of income inequality, the municipal human development index (MHDI), and the illiteracy and poverty rates. We adopt standard methods based on likelihood theory to estimate the model parameters and select the best regression. We examine the effects of the explanatory variables on the mean and dispersion of the homicide rates. The analysis shows that the beta regression provides a better fit to these data and yields more substantive conclusions about the explanatory variables that influence homicide rates in Brazil. Our results can be useful for future research on homicide prevention and can enhance our understanding of which factors influence the murder rate in extremely violent countries.
Acknowledgments
We are very grateful to a referee and Co-Editor-in-Chief for helpful comments that considerably improved the paper. We gratefully acknowledge financial support from CAPES and CNPq.
