Abstract
A major development in criminology in recent years has been the efforts by the World Health Organization (WHO) to provide reasonably reliable estimates of homicide rates for a large number of nations. In some instances, these estimates entail adjustments of the records on homicide from vital statistics or criminal justice sources submitted by participating nations. These adjustments are designed to deal with underreporting and detected anomalies. In other instances, the estimates are generated by regression modeling. The purpose of this research note is to raise awareness among the community of homicide researchers of the nature of the WHO homicide estimates and to offer caution about their appropriate use for cross-national research.
Introduction
A long-standing problem confronting researchers interested in analyzing cross-national variation in homicide rates has been the limited availability of data. Many nations lack the requisite administrative agencies to compile reliable homicide statistics, and as a result, researchers have had little choice but to restrict their analyses to those nations for which data are available rather than to study the theoretical populations of interest. Moreover, as LaFree (1999) explained almost 20 years ago, reliance on such “availability samples” (p. 135) rather than genuine probability samples results in two interrelated problems (see also Koeppel, Rhineberger-Dunn, & Mack, 2015; Nivette, 2011). First, such samples can by no means be regarded as representative. As might be expected, homicide data have been more readily available for the more developed nations, that is, nations with well-established statistical recording agencies. Second, the numbers of nations included in the “availability samples” employed in comparative homicide studies have been rather small relative to the total population of nations. Small samples can be problematic because analyses based upon them are highly susceptible to the impact of outliers (LaFree, 1999, p. 135).
The primary data source that has in practice determined the availability samples used in the cross-national homicide research has changed over time. The early studies typically relied on data from the International Criminal Police Organization (INTERPOL), but subsequently, the publications of the World Health Organization (WHO) emerged as the generally preferred source. Specifically, researchers have relied mainly on the WHO Mortality Database. This database derives from the health/vital statistics reports from participating nations on the specific causes of death, including homicide. Homicide is defined for this purpose as “the killing of a person by another with intent to cause death or serious injury” (WHO, Indicator and Measurement Registry, 2017). These WHO homicide data are now generally regarded as being of higher quality for comparative research than are other data sources (Koeppel, Rhineberger-Dunn, & Mack, 2015, p. 51; LaFree, 1999, p. 133; Levchak, 2016, p. 8; Messner, Pearson-Nelson, Raffalovich, & Miner, 2011, p. 67; Messner, Raffalovich, & Shrock, 2002, p. 383; Messner, Raffalovich, & Sutton, 2010, p. 511). However, depending on the specific year, death registration data that are taken directly from the public health records can only be provided for approximately 70 countries. This is only slightly over a third of the total number of nations (U.S. Department of State, Bureau of Intelligence and Research, 2017).
Homicide researchers have accordingly been eager to locate new homicide data sources for their analyses, especially data sources that expand and diversify the coverage of nations. A major development with particular relevance to this quest is the ambitious effort by the WHO to generate Global Health Estimates (GHE) to facilitate cross-national comparisons. These estimates are available for much larger samples of nations than are the cause of death reports contained in the WHO Mortality Database. WHO disseminates homicide estimates via their GHE. In addition, the United Nations Office on Drugs and Crime (UNODC) has incorporated the WHO estimates for some countries in their Global Studies on Homicide (GSH; UNODC, 2011, 2013, see 2013, p. 110). Studies are beginning to appear in the literature that draw upon the newly generated WHO data with estimated measures of homicide.
The purpose of our research note is to raise awareness of the nature of the WHO homicide estimates and to highlight appropriate and inappropriate uses of these data. We begin by describing the distinctive purposes for the development of homicide estimates by WHO and by explicating their estimation procedures. We then document the growing use of homicide estimates in the literature and explain why the use of the data based on estimates is potentially problematic. Finally, we offer some concluding thoughts about the value of the WHO homicide estimates for cross-national research.
The Rationale for WHO Homicide Estimates and the Estimation Procedures
It is important at the outset to place the WHO homicide estimates within context. As noted above, the homicide data with estimates are in essence a by-product of a larger effort. The analysts at WHO have been concerned primarily with the more general issue of the relative importance of different health problems for societies across the globe. To facilitate meaningful comparisons, they have devoted a good deal of effort to assessing the quality of the vital statistics data that are supplied by individual nations, and they have developed procedures to adjust the data that were submitted when suspect, sometimes incorporating information from other data sources. In addition, they have implemented modeling procedures to generate estimates of the respective causes of death when data are lacking in part or in whole, thereby expanding the pool of nations for which inferences about the relative importance of different health concerns might be made.
The WHO homicide estimates are based on a rather complex process that combines vital statistics data from the WHO Mortality Database and criminal justice data from the UNODC (WHO, Global Status Report on Violence Prevention, 2014, p. 62). As noted above, criminological researchers have generally accepted the vital statistics data on homicide as the “gold standard” for cross-national research. However, researchers at WHO have discovered that the reported numbers of homicide in this source, as well as in the criminal justice data, are in some cases suspect. This has prompted them to generate their estimates of homicides.
To explain the procedures, it is necessary to introduce some important conceptual distinctions. There are two basic “modes” of estimation used by WHO. One mode is grounded in data on homicides that come from the vital registration systems and/or criminal justice sources for a particular nation. The WHO (ibid., p. 63) researchers label the resulting estimates the directly estimated homicide rates. For some nations, no reasonably reliable data are available from either of these two main data sources on homicides. In these instances, where the data are missing, the WHO researchers have relied on regression models to predict homicides from a set of covariates. These are referred to as model-based homicide rates.
An additional important conceptual distinction pertains to three types of health statistics: reported homicide deaths, adjusted homicide deaths, and comparable homicide estimates (ibid., p. 62). The reported homicide deaths are the “raw” data on homicides that come directly from the vital registration statistics and/or the criminal justice statistics of various countries. The WHO researchers have developed procedures to correct these raw data for underreporting and misclassification. In some instances, the WHO analysts determine after quality controls that only a specified proportion of all deaths are recorded. The counts for all causes of death are accordingly adjusted upward. In other instances, vital statistics data include a proportion of deaths that are classified as deaths due to injuries for which the intent is unknown. These deaths can be redistributed pro rata across causes, including homicides, to yield estimates that are likely to be more accurate than those originally reported in the vital registration systems.
The criminal justice data on homicide do not contain information analogous to “undetermined cause of death” that could be used to adjust for misclassification. Nevertheless, the WHO researchers have derived a rough estimate of underreporting in criminal justice data by comparing these data with vital statistics when high-quality reporting systems are in place. They have concluded that, although there is variability, “criminal justice data may typically underreport homicides by 15%” (WHO, Global Status Report on Violence Prevention, 2014, p. 63). The application of these types of procedures designed to correct for underreporting and/or possible misclassification generates the adjusted homicide deaths.
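The two vital registration adjustments described above reduce to simple arithmetic, which the following Python sketch illustrates. It is not WHO's implementation: the function names, the example counts, and the 80% completeness figure are hypothetical, and the order in which the two corrections are applied is an assumption that the text does not specify.

```python
def redistribute_undetermined(deaths_by_cause, undetermined):
    """Spread injury deaths of undetermined intent pro rata across known causes."""
    known_total = sum(deaths_by_cause.values())
    if known_total <= 0:
        raise ValueError("need at least one death with a known cause")
    return {
        cause: count + undetermined * count / known_total
        for cause, count in deaths_by_cause.items()
    }


def adjust_for_completeness(deaths, completeness):
    """Scale a count upward when only a share of all deaths is registered."""
    if not 0 < completeness <= 1:
        raise ValueError("completeness must be in (0, 1]")
    return deaths / completeness


# Hypothetical counts for one country-year (illustrative, not real data).
reported = {"homicide": 200, "suicide": 300, "unintentional": 500}

# Step 1 (assumed order): 100 deaths of undetermined intent are redistributed;
# homicides make up 20% of known-intent deaths, so they receive 20 of them.
redistributed = redistribute_undetermined(reported, undetermined=100)

# Step 2: only 80% of deaths are assumed to be registered.
adjusted_homicides = adjust_for_completeness(
    redistributed["homicide"], completeness=0.8
)

print(round(redistributed["homicide"], 1), round(adjusted_homicides, 1))  # 220.0 275.0
```

In this toy case the reported count of 200 homicides becomes an adjusted count of 275, which shows how far an adjusted figure can depart from the raw record.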
For nations that have “raw” data that are judged to be reasonably reliable from both the vital registration system and from criminal justice sources, the selection of the specific estimate of homicide deaths is determined by the application of two decision rules. (a) When the homicide deaths from the criminal justice data are significantly higher than those from the adjusted vital registration data, the criminal justice figure is selected. This is based on the assumption that overreporting of homicides is less likely than is underreporting. (b) When there is no significant difference between the criminal justice and adjusted vital registration counts of homicide deaths, or if the count is higher for the adjusted vital registration count, the vital registration data serve as the final estimates.
A third decision rule pertains to countries that have reasonably reliable criminal justice data for an extended period of time (at least 8 years) but lack acceptable vital statistics data. For these nations, the criminal justice homicide count is adjusted upward by 15% to yield the homicide estimate.
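The three decision rules can be summarized as a short selection routine. The Python sketch below is a reading of the rules as described here, not WHO's code. In particular, the text does not say how a “significant” difference is determined, so the `rel_threshold` parameter (a 10% relative gap) is a purely hypothetical stand-in, as are the function name and the returned labels.

```python
def select_homicide_estimate(adjusted_vr=None, cj=None, cj_years=0,
                             rel_threshold=0.10):
    """Return (estimate, basis) following the three decision rules.

    adjusted_vr   -- adjusted vital registration count, or None if unreliable
    cj            -- criminal justice count, or None if unreliable
    cj_years      -- years of reasonably reliable criminal justice data
    rel_threshold -- HYPOTHETICAL cutoff for "significantly higher"
    """
    if adjusted_vr is not None and cj is not None:
        # Rule (a): criminal justice count clearly exceeds the adjusted
        # vital registration count, so the criminal justice figure is used.
        if cj > adjusted_vr * (1 + rel_threshold):
            return cj, "criminal justice"
        # Rule (b): no clear difference, or vital registration is higher.
        return adjusted_vr, "adjusted vital registration"
    if adjusted_vr is None and cj is not None and cj_years >= 8:
        # Third rule: inflate the criminal justice count by 15%.
        return cj * 1.15, "adjusted criminal justice"
    # Anything else falls outside the three rules described in the text.
    return None, "not directly estimated"


print(select_homicide_estimate(adjusted_vr=1000, cj=1300))  # (1300, 'criminal justice')
print(select_homicide_estimate(adjusted_vr=1000, cj=900))   # (1000, 'adjusted vital registration')
print(select_homicide_estimate(cj=400, cj_years=10)[1])     # adjusted criminal justice
```

Nations that return no estimate here are the ones for which WHO falls back on the regression-based procedure.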
The application of these decision rules results in the three categories of nations listed in the top panel of Table 1, that is, the panel for the “directly estimated homicide rates.” The first two categories include nations with high-quality vital registration and criminal justice data. The uppermost category includes nations for which the adjusted vital statistics data serve as the basis for the homicide estimate, whereas the second category is comprised of nations for which the criminal justice figures were selected rather than the vital statistics data, applying the decision rules enumerated above. The third category includes nations lacking reliable vital statistics data. For these nations, the homicide estimates are based solely on data from criminal justice sources.
Table 1. Country Listing by Modes of Estimation and Homicide Data Sources.
Note. The table has been adapted and altered from Table 8 of the Global Status Report on Violence Prevention 2014 (WHO, Violence and Injury Prevention, 2014, p. 66).
A fair number of countries lack quality data on homicide from either vital registration systems or criminal justice sources, or have very limited data from these sources. For these nations, homicides cannot be estimated directly. To facilitate truly global comparisons of the importance of homicide relative to other causes of death, the WHO researchers have generated estimates that are based on regression models. These procedures were implemented by means of successive testing of different models whose predictions were averaged to get the final estimates. Six predictor variables passed the validation process into the final models: the Gender Inequality Index, alcohol consumption patterns, the percentage of people residing in urban areas, the male proportion of the population aged 15 to 30, the infant mortality rate, and the religious fractionalization measure. Through these procedures, countries that have yet to develop acceptable quality systems for collecting homicide data can be assigned homicide estimates based on the regression modeling. WHO refers to the final set of both the directly estimated and model-based-estimated homicide rates as comparable homicide estimates. The WHO researchers caution, however, that the model-based estimates are “more appropriately interpreted as guides to priority setting and understanding the likely homicide burden within a country, as opposed to evidence of the effectiveness of national policies on homicide” (WHO, Global Status Report on Violence Prevention, 2014, p. 62). More generally, the WHO researchers advise that the quality of the estimates in the top two categories is reasonably high, while the quality of the estimates in the third category is somewhat more tenuous but perhaps acceptable to use with caution (personal communication, November 7, 2016).
The Recent Use of WHO Estimated Homicide Data in the Cross-National Research and Associated Problems
Starting in 2004, WHO has released homicide estimates every 4 years. Hence, homicide data based on estimates have been published for the years 2004, 2008, and 2012. These figures would seem at first glance to provide researchers with the welcome opportunity to expand greatly the coverage of nations for analyses of homicide rates. Indeed, studies have been published in recent years reporting that the analyses are based on samples of nations much larger than the samples available in the WHO Mortality Database. The extent to which the WHO homicide estimates have been used in these studies is not always clear, especially if UNODC data are the cited source. The UNODC data combine data from their surveys of criminal justice agencies with data from other sources, including the WHO estimates (UNODC, 2013, p. 110). However, datasets that have been made available through the GHE are definitely based on homicide estimates. The use of homicide estimates can also be suspected (if not definitely confirmed) when the number of observed nations considerably exceeds 70. In contrast, it is reasonable to infer that the homicide counts are based directly on reports from vital statistics systems if the cited source is the WHO Mortality Database.
We have searched the homicide literature for studies that have evidently incorporated the WHO homicide estimates for large cross-national samples and that have reported regression results pertaining to the social structural correlates of homicide rates. For these studies, we have also identified any covariates in the regression models that “overlap” with those used by WHO to generate the regression-based estimates of homicide. The selection of these studies involved a three-step process. First, we applied the rule to filter the studies by sample sizes larger than approximately 70. We then proceeded to investigate those remaining studies. Prior to WHO becoming the most commonly used data source, INTERPOL had good standing in the community of homicide researchers (LaFree, 1999, p. 126ff.). Before INTERPOL ceased the collection of homicide statistics, the data allowed for relatively large samples exceeding 70 nations (Koeppel, Rhineberger-Dunn, & Mack, 2015, p. 51ff.). Thus, in a second step, those studies that used INTERPOL data were excluded. The concluding step was an in-depth examination of the methodologies of the remaining publications to identify the exact data source. This was not always entirely straightforward because of imprecise documentation or dead links. However, once the source, either WHO, UN, or both, was identified, the methodological and/or statistical annexes were accessed to get information on the use of estimated homicide rates. In cases of uncertainty about the correct source, we decided against a listing in our compilation in this research note. Thus, we tried to identify as many of the publications using homicide estimates as possible. The results of our search are reported in Table 2. Although the presented list might not be exhaustive, it should serve the purpose of illustrating the increasing incorporation of WHO homicide estimates in the cross-national literature and the associated methodological issues.
Table 2. Recent Cross-National Homicide Studies Based on Expanded Samples.
Note. UNODC = United Nations Office on Drugs and Crime; WLS = weighted least squares; IMR = infant mortality rate; WHO = World Health Organization; GHE = Global Health Estimates; OLS = ordinary least squares; EIMR = excess IMR.
In case a study evaluated more than one dependent variable, only the homicide rate is reported. Date refers to year of publication.
Sample size used in final statistical analyses unless otherwise reported.
Only those variables used by WHO for estimation are reported.
The first three columns of Table 2 report, respectively, the author’s(s’) name(s) and publication date of the study, the source(s) of the homicide data, and the sample size for the main analyses. The fourth column reports the statistical model, and the final column indicates the inclusion of any predictor variable that overlaps with predictors used in the generation of WHO’s comparable homicide estimates via the regression modeling. All studies are from relatively recent years, with publication dates ranging from 2009 to 2016.
Potentially, several problems can arise with the use of the homicide estimates in these studies. One such problem pertains to the common practice of computing multiyear averages of homicide rates. The rationale for this procedure is to minimize the instability in measured homicides caused by random fluctuations over different years. This is certainly a defensible procedure in principle, but it can be problematic when using the WHO homicide estimates. Every iteration featured methodological changes in the estimations, and as a result, the respective estimates as reported in different publications cannot be regarded as strictly comparable across years. In personal correspondence (November 7, 2016), WHO researchers have explicitly advised against combining such datasets for calculating averages, as well as for conducting longitudinal analyses with these data.
A second potential problem entails model specification. As depicted in column 5 of Table 2, all but one of the studies include among the predictor variables one or more of the very same variables that were used to estimate the homicide rates under investigation. This entails a degree of mathematical confounding of the explanandum with the explanans. In the one study that does not include an “overlapping” predictor, the regression model incorporates what is probably the most frequently analyzed predictor of homicide rates, the Gini coefficient. Past research has documented moderately strong positive correlations between the Gini coefficient and the infant mortality rate, which is used in the homicide estimation (Jacobs & Richardson, 2008, p. 37; Paré, 2006, p. 49; Paré & Felson, 2014, p. 445; Pridemore, 2008, p. 154). Thus, in this analysis as well as in the analyses with directly overlapping predictor variables, part of any relationship between the dependent variable and the predictor variables is likely to be an artifact of the measurement of homicides.
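The circularity at issue can be made concrete with a deliberately stripped-down example. In the Python sketch below, a homicide “estimate” is generated for six hypothetical nations from the infant mortality rate alone, using an arbitrary linear formula; regressing that estimate on the infant mortality rate then recovers the formula exactly. The coefficients, the single-predictor form, and the data values are all assumptions made for illustration. The actual WHO procedure uses six predictors and averages across models, so in practice the artifact would be partial rather than total.

```python
def ols_slope_and_r2(x, y):
    """Bivariate OLS: return the slope and R-squared of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx, sxy ** 2 / (sxx * syy)


# HYPOTHETICAL estimation formula standing in for a model-based procedure.
INTERCEPT, SLOPE = 2.0, 0.25

# Toy infant mortality rates for six nations with no recorded homicide data.
imr = [5.0, 12.0, 20.0, 35.0, 50.0, 64.0]

# The "homicide rates" for these nations are predictions, not records.
model_based_rate = [INTERCEPT + SLOPE * value for value in imr]

# A researcher who now regresses the estimates on the same covariate
# simply rediscovers the estimation formula.
slope, r_squared = ols_slope_and_r2(imr, model_based_rate)
print(round(slope, 6), round(r_squared, 6))  # 0.25 1.0
```

The fitted slope and the perfect fit reveal nothing about homicide in these nations; they reflect only how the dependent variable was constructed.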
Finally, a more general criticism can be raised about using the WHO model-based estimates in criminological inquiry. As noted above, these estimates are not derived directly from any data on homicides that have actually been recorded by agencies in the respective nations. The utility of such predictions for purposes of gauging the level of homicides in a given nation and for explaining variation in homicides across nations can thus be challenged on purely face validity grounds.
Conclusion
In their recent review of the cross-national research on homicide, Koeppel et al. (2015) took stock of progress in the field that has followed LaFree’s (1999) literature review published almost 20 years ago. These authors highlighted two particularly positive developments that characterize more recent research: (a) the shift away from reliance on INTERPOL data toward the use of homicide data from the WHO, and (b) the increases in sample size from very small samples that were occasionally used in the early studies. We concur that these have been positive developments, but we have pointed out that the WHO estimates incorporated in other sources must be used judiciously. Our view is that the model-based estimates of homicide are not appropriate for the most common objective of cross-national research on homicide—identifying the features of social structure that help account for variation in homicide rates across nations—because the very measurement of homicide is predicated upon statistical models of the impact of such factors on homicide rates. Moreover, these model-based homicide estimates are not based on any national data on homicides in many countries.
An important issue to be addressed in future research is to assess the substantive implications of the use of the model-based estimates in prior cross-national homicide studies, such as those enumerated in Table 2. Several key questions arise. To what extent do the coefficients for social structural predictor variables differ across the subsamples of nations with homicide data based on records from the vital statistics and/or criminal justice agencies in comparison with nations with homicide data generated from statistical models? If the coefficients are similar, does this indicate that actual causal processes are essentially invariant across the nations represented in the subsamples, or might this simply be a methodological artifact of the measurement of homicide, suggesting that the effective sample sizes for analyses do not in fact extend much beyond that attainable with data from the WHO Mortality Database? If the coefficients do differ across subsamples, how can the impact of differential measurement be separated from that of the predictor variables? Further unanswered questions arise concerning the appropriate uses of the model-based estimates. We noted above that WHO officials urge caution when working with such estimates, observing that the figures are best regarded “. . . as guides to priority setting and understanding the likely homicide burden within a country . . .” (WHO, Global Status Report on Violence Prevention, 2014, p. 62). It would be useful for members of the public health community and homicide researchers to explicate more fully the ways in which data on homicide that are not derived directly from national records of homicide can nevertheless be fruitfully incorporated in analyses of mortality patterns.
Given these unresolved issues, what guidelines should cross-national homicide researchers follow at present? One strategy is to continue to use the WHO Mortality Database. The homicide data in this source derive from the vital statistics systems of the reporting nations, and there is no confounding of the measurement of homicide with potential predictors of homicide. In addition, the Mortality Database provides homicide counts disaggregated by age and sex, as well as information on mechanism (e.g., firearms). These data can also be accessed readily from electronic sources. However, a plausible case can be made for preferring the directly estimated homicide rates by WHO instead of the figures in the WHO Mortality Database. As explained above, these figures have been adjusted for incompleteness and anomalies after careful scrutiny by the WHO analysts. The legitimate use of these data, however, presupposes that researchers verify that only nations with directly estimated homicide data are included in the samples (vital registration data, criminal justice data, and possibly adjusted criminal justice data, in Table 1).
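In practice, this verification amounts to recording the basis of each nation's homicide figure and filtering on it before analysis. The following Python sketch shows one way to do so. The record layout, the nation labels, the rates, and the source labels are hypothetical placeholders; the categories simply mirror the three directly estimated groups mentioned above.

```python
# Source categories treated as "directly estimated" (labels are assumptions).
DIRECT_SOURCES = (
    "vital registration",
    "criminal justice",
    "adjusted criminal justice",
)


def directly_estimated_only(records, include_adjusted_cj=True):
    """Keep nations whose homicide figure rests on recorded homicide data."""
    allowed = set(DIRECT_SOURCES)
    if not include_adjusted_cj:
        # Stricter option: drop the more tenuous third category as well.
        allowed.discard("adjusted criminal justice")
    return [record for record in records if record["source"] in allowed]


# Placeholder records (not real nations or rates).
sample = [
    {"nation": "Nation A", "source": "vital registration", "rate": 1.2},
    {"nation": "Nation B", "source": "criminal justice", "rate": 7.5},
    {"nation": "Nation C", "source": "adjusted criminal justice", "rate": 4.1},
    {"nation": "Nation D", "source": "model-based", "rate": 9.8},
]

analysis_sample = directly_estimated_only(sample)
strict_sample = directly_estimated_only(sample, include_adjusted_cj=False)
print([r["nation"] for r in analysis_sample])  # ['Nation A', 'Nation B', 'Nation C']
print([r["nation"] for r in strict_sample])    # ['Nation A', 'Nation B']
```

Keeping the source label attached to each record also makes it straightforward to report which data source underlies each nation in the final sample.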
In addition, we strongly encourage cross-national homicide researchers to make explicit the data source that has been used for each of the specific nations included in the samples if multiple sources have been used. This is likely to become increasingly important as additional initiatives to stimulate the compilation of homicide estimates come on line, such as the United Nations project on Sustainable Development Goals. The ultimate goal of course is to encourage the development of accurate and reliable recording systems of homicides in as many nations as possible. In the meantime, and with appropriate care, cross-national homicide researchers can benefit from the quality controls being applied by WHO, even though samples are likely to continue to be limited and not very representative for the near future.
Acknowledgements
The authors would like to thank Daniel Hogan from the World Health Organization for his very helpful commitment.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
