Estimating one-sided-killings from a robust measurement model of human rights

Abstract

Counting repressive events is difficult because state leaders have an incentive to conceal actions of their subordinates and destroy evidence of abuse. In this article, we extend existing latent variable modeling techniques in the study of repression to account for the uncertainty inherent in count data generated for this type of difficult-to-observe event. We demonstrate the utility of the model by focusing on a dataset that defines ‘one-sided-killing’ as government-caused deaths of non-combatants. In addition to generating more precise estimates of latent repression levels, the model also estimates the probability that a state engaged in one-sided-killing and the predictive distribution of deaths for each country-year in the dataset. These new event-based, count estimates will be useful for researchers interested in this type of data but skeptical of the comparability of such events across countries and over time. Our modeling framework also provides a principled method for inferring unobserved count variables based on conceptually related categorical information.

Keywords

event count, human rights, measurement model, one-sided killing, repression

Introduction

Recording repressive events is integral to the scientific study of peace and conflict. Doing so accurately, however, is complicated by the fact that state leaders often have strong incentives to conceal these events from the international community and destroy evidence associated with abuse. Even when monitors, activists, and journalists have complete access, resource constraints may limit their ability to observe or record state violence. The lack of access, and constraints on monitoring resources, combine to potentially bias counts of repressive events (Brysk, 1994; Davenport & Ball, 2002).

Researchers recognize that differences in information sources may lead to divergent inferences and have spent considerable time seeking to resolve these problems by integrating data derived from multiple sources (e.g. Hendrix & Salehyan, 2015; Krüger et al., 2013). These approaches provide a promising means of validating inferences from the study of repressive events, but they are seldom applied to the time-series cross-sectional analyses that are central to much of the empirical human rights literature. Remaining concerns over the validity of repressive events count data contributed to a movement away from these data in human rights research (Poe, 2004). Yet standards-based indicators are subject to other forms of bias (Fariss, 2014, 2019) and are less well suited to precisely track the patterns of repressive events. Much could be learned with counts of these events but only if the political and operational processes that make obtaining such data problematic are conceptually identified and empirically addressed.

We take up this challenge by developing and validating a solution to the problem of bias in count data within the context of estimating one-sided government killings (Eck & Hultman, 2007; Pettersson, Högbladh & Öberg, 2019). We demonstrate how latent variable modeling techniques can be used to triangulate information from a variety of data sources to improve and expand upon existing estimates of whether and how many individuals were killed by their governments. Our approach builds on existing latent variable models of human rights, which assume that a government’s underlying level of repression is not observed directly, but can be estimated based on observable pieces of information captured through a wide variety of human rights monitoring sources. Specifically, we leverage a useful property of latent variable models: the ability to generate predictive distributions for all input variables, regardless of whether they are observed or missing in a particular country-year. The result is the creation of new estimates of one-sided-killing that account for and reduce bias stemming from resource constraints and the incentive to conceal repression.

We use the Eck & Hultman (2007) data from the Uppsala Conflict Data Program (UCDP) as the benchmark for our model. These data are central to conflict research and have been deployed in hundreds of published articles.¹ We nevertheless endeavor to make several improvements. Eck & Hultman (2007) code one-sided-killings as absent in country-years where there is no direct and reliable evidence of at least 25 individual deaths. The risk is that these data undercount repressive events. We empirically assess this possibility by using information conveyed in other human rights indicators to identify instances where killings were likely to have occurred, but for which insufficient documentary evidence exists. Determining which regimes are within this category is itself of substantive interest to scholars interested in concealed or otherwise unobserved abuse. In addition, while existing data begin in 1989, our approach is able to generate predicted counts extending back as far as 1946, allowing researchers to expand the temporal range of empirical tests.² This is particularly useful for testing theories of slow-moving or system-level changes in repression. Finally, we improve estimates of uncertainty around these events by moving beyond the provision of ‘low’, ‘best’, and ‘high’ estimates and instead generate full probability distributions for the number of individuals killed in any particular country-year.

Our latent variable model allows us to expand the empirical testing ground for the scientific study of political violence by providing a principled means of combining biased, incomplete, or otherwise imperfect pieces of information. This approach is critical for scholars because of the increasing volume and breadth of information about repression. The transformation of information from analog to digital is allowing scholars of peace and conflict studies to generate counts of important events like killings but also other forms of political events and patterns of communication in and around conflict (e.g. Steinert-Threlkeld, 2017).

Beyond these substantive contributions, we also innovate methodologically by extending latent variable modeling techniques to account for zero-inflated count processes. Zero-inflation occurs in count processes when zeros are observed (perhaps excessively) for one of two reasons: (1) the event being counted did not, and could not have occurred; or (2) the event being counted was not observed but could have happened. Within our context, this corresponds to instances where: (a) no killings are observed because none occurred; and (b) killings occurred, but were not recorded due to reporting biases driven by a regime’s attempt to conceal these events or because of capacity limitations on the monitoring organizations that collect information about repression. Though used within the context of human rights, we expect these techniques will be useful for a wider body of research linking political event-counts to unobserved concepts of interest.

Below, we introduce our modeling innovations in further detail and validate our estimates of one-sided killing. In so doing, we highlight several instances where we believe one-sided-killings occurred, despite their absence from existing data sources. Leveraging the expanded temporal domain of our data, we track a reduction in one-sided-killing over time, corroborating evidence of more widespread reductions in other forms of political violence (Goldstein, 2011; Lacina, Gleditsch & Russett, 2006). We close with a discussion of the promise of latent variable models generally, with suggestions for addressing the limitations of our model and improving the measurement of repressive events.

Existing measurement models of repression

All latent variable models share the assumption that an underlying concept of interest cannot be observed directly but can be approximated through observable manifestations. For latent variable models of human rights, the underlying concept is broadly defined as respect for physical integrity rights, which comprises a unidimensional spectrum ranging from complete and widespread abuse of these rights at one extreme, to complete respect of the rights at the other (Eck & Fariss, 2018; Schnakenberg & Fariss, 2014; Fariss, 2014, 2019). Where a state sits along this spectrum is determined by data collected from a variety of sources on different types of abuse and repression: the observed manifestations, which are broadly classified as either ‘events-based’ indicators or ‘standards-based’ indicators.

Events-based indicators capture extreme rights violations, identifying the scope and scale of repressive events. Typically, these indicators are collected and periodically updated through the continual evaluation of primary and secondary sources. By contrast, standards-based indicators capture both less extreme abuse and widespread abuses. These categorical variables rely on contemporaneous human rights reports. Though we focus on reducing bias in event-counts, both types of data sources are subject to a variety of biases. A latent variable approach lessens these biases by combining and aggregating information through a principled and transparent measurement model.

The Online appendix contains descriptions of the specific datasets and the sources for the standards-based indicators and event-based indicators that we use for our updated model. The standards-based indicators are almost all derived from Amnesty International and US State Department reports (Hathaway, 2002). The event-based indicators are: massive repressive events (Harff & Gurr, 1988); genocide and politicide (Harff, 2003); genocide and democide (Rummel, 1994; Wayman & Tago, 2010); one-sided government killing (Eck & Hultman, 2007); and political executions (Taylor & Jodice, 1983). Fariss (2014, 2019) treats all of these variables as dichotomous indicators that identify whether each type of event occurred. The definitions of genocide, politicide, and massive repression variables capture human rights violations at the extreme end of the repression spectrum. The measurement of one-sided government killing captures instances in which more than 25 individuals (non-combatants) are killed, but excludes extra-judicial killings that occur inside a prison and combatant deaths that occur during civil conflicts (Eck & Hultman, 2007). Extra-judicial killing more generally is captured by both the political execution data (Taylor & Jodice, 1983) and several of the variables derived from the human rights reports described above.

The models constructed by Schnakenberg & Fariss (2014) and Fariss (2014) are outlined in Table I. These models assume there is an unobserved latent trait, $θ_{i t}$ – the level of respect for physical integrity rights – from which we observe manifest indicators $y_{i t j}$ . The human rights variables are indexed with i, t and j, where $i = 1,\dots, N$ indexes cross-sectional units (countries), $t = 1, \dots, T$ indexes time periods (years), and $j = 1, \dots, J$ indexes indicators. We also use $k_{j}$ to indicate the values that the manifest indicator j can take on. In the original models, the observed variables are either ordinal or binary, such that the binary indicators take on $K_{j} = 2$ values while the ordinal indicators take on $K_{j} > 2$ values.

For each physical integrity item, the model estimates an ‘item discrimination’ parameter $β_{j}$ and a set of $K_{j} - 1$ ‘item difficulty cut-points’ ${(α_{j k})}_{k = 1}^{K_{j}}$ . These parameters connect the observed indicator to the latent variable and are analogous to a slope and intercept term in a logistic regression or the slope and cut-points in an ordered logistic regression. The likelihood function for this model is in Table I, with $F (\cdot)$ denoting the logistic cumulative distribution function. The likelihood is akin to a logistic regression, but with multiple outcome variables for each observation.
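The link between the latent trait and an ordered indicator can be sketched in a few lines of Python. This is only an illustration of the ordered-logit form $F(α_{j k} - θ_{i t} β_{j}) - F(α_{j, k - 1} - θ_{i t} β_{j})$ described above, not the authors' estimation code; the function names and all numeric values are hypothetical.

```python
import math


def logistic_cdf(x):
    """Logistic cumulative distribution function F(x)."""
    return 1.0 / (1.0 + math.exp(-x))


def category_probabilities(theta, beta, cutpoints):
    """Return P(y = k) for k = 1..K, given K - 1 increasing cut-points.

    Each probability is F(alpha_k - theta * beta) - F(alpha_{k-1} - theta * beta),
    with the conventions F(alpha_0 - ...) = 0 and F(alpha_K - ...) = 1.
    """
    cdf = [0.0] + [logistic_cdf(a - theta * beta) for a in cutpoints] + [1.0]
    return [cdf[k + 1] - cdf[k] for k in range(len(cdf) - 1)]


# Hypothetical values, chosen only for illustration.
probs_binary = category_probabilities(theta=0.5, beta=1.2, cutpoints=[0.0])
probs_ordinal = category_probabilities(theta=-1.0, beta=1.2, cutpoints=[-1.0, 0.5])
```

A binary item is the special case with a single cut-point, which is why the same expression covers both indicator types.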

Fariss (2014) extends this model by allowing the difficulty cut-points for some of the items to vary over time, changing ${(α_{j k})}_{k = 1}^{K_{j}}$ to ${(α_{t j k})}_{k = 1}^{K_{j}} .$ ³ Note the t subscript indicating that the cut-points for the standards based variables are estimated for each year of data.⁴ This parameterization is used for each of the standards-based indicators. This is done to account for the possibility that over time human rights monitoring agencies have applied increasingly stringent assessments of state behavior (Fariss, 2019). Put differently, this model accommodates the possibility that states have been subject to a changing standard of accountability regarding repressive behavior.

The event-based indicators retain the constant item difficulty cut-point parameterization: ${(α_{j k})}_{k = 1}^{K_{j}}$. The standard of accountability likely affects the documentation used to code event-based variables as well. However, unlike the CIRI, PTS, Hathaway, and ITT data projects, the event-based variables are not direct categorizations of documents but rather are binary indicators that are coded 1 if sufficient documentary information exists in the historical record to support such a categorization. For the standards-based variables, the documents are directly categorized. Because the documents are never updated or revised, the standards-based variables are rarely updated. For the event-based variables, documentary evidence is taken from multiple sources to look for evidence that a particular type of repressive event occurred. If new documentary evidence emerges about a specific type of repressive event, the categorized value for the country-year is updated. The event-based categorization process is therefore able to address variation in the underlying documentation processes that generate information because these variables are each based on a set of different documents and are updated periodically. The standards-based coding process cannot directly account for this variation (Fariss, 2019).

Table I.

Existing latent variable models of repression

Model and description	Parameter group	Parameter	Prior distribution
Schnakenberg & Fariss (2014): dynamic ordinal IRT model	Latent variable	Country-year latent variable	$θ_{i 1} \sim N(0, 1)$, $θ_{i t} \sim N(θ_{i, t - 1}, σ)$
	Latent variable	Innovation parameter	$σ \sim U(0, 1)$
	Categorical indicators	Slope	$β_{j} \sim \mathrm{Gamma}(4, 3)$
	Categorical indicators	Cut-points	$α_{j k} \sim N(0, 4)$
Fariss (2014): dynamic ordinal IRT model with changing standard of accountability	Latent variable	Country-year latent variable	$θ_{i 1} \sim N(0, 1)$, $θ_{i t} \sim N(θ_{i, t - 1}, σ)$
	Latent variable	Innovation parameter	$σ \sim U(0, 1)$
	Categorical indicators	Slope	$β_{j} \sim \mathrm{Gamma}(4, 3)$
	Categorical indicators	Cut-points (event-based indicators)	$α_{j k} \sim N(0, 4)$
	Categorical indicators	Cut-points (standards-based indicators)	$α_{1 j k} \sim N(0, 4)$, $α_{t j k} \sim N(α_{t - 1, j k}, 4)$

Likelihood function, Schnakenberg & Fariss (2014):

$ℒ = \prod_{i = 1}^{N} \prod_{t = 1}^{T} \prod_{j = 1}^{J} \underbrace{\left[F(α_{j, y_{i t j}} - θ_{i t} β_{j}) - F(α_{j, y_{i t j} - 1} - θ_{i t} β_{j})\right]}_{\text{Ordinal indicators}}$

Likelihood function, Fariss (2014):

$ℒ = \prod_{i = 1}^{N} \prod_{t = 1}^{T} \prod_{j = 1}^{J} \underbrace{\left[F(α_{t j, y_{i t j}} - θ_{i t} β_{j}) - F(α_{t j, y_{i t j} - 1} - θ_{i t} β_{j})\right]^{v_{j}}}_{\text{Ordinal items (standards-based indicators)}} \times \underbrace{\left[F(α_{j, y_{i t j}} - θ_{i t} β_{j}) - F(α_{j, y_{i t j} - 1} - θ_{i t} β_{j})\right]^{(1 - v_{j})}}_{\text{Ordinal items (events-based indicators)}}$

A latent variable model for binary, ordered, and zero-inflated count processes

We can extend this latent variable model to take advantage of the event-counts from some of the event-based data sources that have been coarsened to binary indicators in existing models. This extension allows for the incorporation of more information into latent variable estimates.

The new model requires us to assume a parametric form for event-counts that fits the underlying data generating process. Specifically, we need to account for the fact that killings may be unobserved either because they did not occur, or because they did, but evidence of the event was unobserved. This second category likely features cases where killings were concealed due to information scarcity, potentially driven by deliberate concealment. We therefore use a zero-inflated, negative binomial probability distribution to link the latent repression variable with the event-count data.

\begin{aligned}
ℒ(β, α, θ, r \mid y) = {} & \left(p^{*} + (1 - p^{*}) \left[\left(\frac{r}{\exp(α + θ_{i t} β) + r}\right)^{r}\right]\right)^{y_{i t} = 0} \\
& + \left((1 - p^{*}) \left[\frac{Γ(r + y_{i t})}{Γ(r)\, y_{i t}!} \left(\frac{r}{\exp(α + θ_{i t} β) + r}\right)^{r} \left(\frac{\exp(α + θ_{i t} β)}{\exp(α + θ_{i t} β) + r}\right)^{y_{i t}}\right]\right)^{y_{i t} > 0} \qquad (1)
\end{aligned}

As with the above models, we again assume that the observed indicator (here the amount of one-sided-killing) is a function of the underlying latent trait. We parameterize this with α, $β$ , r, and $p^{*}$ . The α and $β$ have a similar interpretation as in other latent models, and we unpack the other parameters below. Observations enter into different portions of the likelihood function depending on whether a zero was observed or not. The first line of Equation 1 accounts for instances where no deaths were counted (i.e. $y_{i t} = 0$ ). In this instance the zero count could be structural – that is, a zero was recorded because no killings occurred – with a probability of $p^{*}$ or it might result from a situation where killings may have occurred, but none were observed. The second case has a probability of $(1 - p^{*})$ multiplied by the probability of the negative binomial producing a zero count. The second line is for cases where there are non-zero counts (i.e. $y_{i t} > 0$ ), and so is the probability of $(1 - p^{*})$ multiplied by the probability of observing a non-zero count from the negative-binomial distribution. $p^{*}$ is parametrized as:

p^{*} = F (α^{*} - θ_{i t} β^{*})

where $F (\cdot)$ again denotes the logistic cumulative distribution function.

The negative binomial likelihood also incorporates a rate parameter, r. This accounts for the degree of ‘over-dispersion’ in the count data by allowing the variance to exceed the mean. The variance is equal to $μ + \frac{μ^{2}}{r}$ where $μ$ is the expected count value of the negative binomial component, equal to $\exp(α + β θ_{i t})$ . r is assumed to be strictly greater than 0, and as it grows large the $\frac{μ^{2}}{r}$ term vanishes and the negative binomial distribution converges to the Poisson distribution.
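As a rough sketch of how a single country-year enters the zero-inflated negative binomial likelihood, the following Python function evaluates the log of the expression above. It takes the mean of the count component to be $\exp(α + θ_{i t} β)$, as in the likelihood, and $p^{*} = F(α^{*} - θ_{i t} β^{*})$. The function names and every parameter value here are hypothetical; in particular, the signs and magnitudes are arbitrary and are not estimates from the model.

```python
import math


def logistic_cdf(x):
    """Logistic cumulative distribution function F(x)."""
    return 1.0 / (1.0 + math.exp(-x))


def nb_log_pmf(y, mu, r):
    """Log negative binomial probability of count y with mean mu and parameter r."""
    return (math.lgamma(r + y) - math.lgamma(r) - math.lgamma(y + 1)
            + r * math.log(r / (mu + r))
            + y * math.log(mu / (mu + r)))


def zinb_log_pmf(y, theta, alpha, beta, alpha_star, beta_star, r):
    """Log zero-inflated negative binomial probability of count y."""
    p_star = logistic_cdf(alpha_star - theta * beta_star)  # structural-zero probability
    mu = math.exp(alpha + theta * beta)                     # mean of the count component
    if y == 0:
        # A zero is either structural or a zero draw from the count component.
        return math.log(p_star + (1.0 - p_star) * math.exp(nb_log_pmf(0, mu, r)))
    # A positive count can only come from the count component.
    return math.log(1.0 - p_star) + nb_log_pmf(y, mu, r)


# Hypothetical parameter values, for illustration only.
params = dict(theta=-1.0, alpha=2.0, beta=-1.5, alpha_star=0.5, beta_star=-1.0, r=0.5)
log_lik_zero = zinb_log_pmf(0, **params)
log_lik_positive = zinb_log_pmf(40, **params)
```

Working on the log scale with `math.lgamma` avoids overflow in the gamma functions when counts are large, which matters for fatality data with a long right tail.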

Extending the latent variable model of repression

Next we must integrate this negative binomial framework into the broader model of human rights so that UCDP Eck & Hultman (2007) data can be incorporated. Though our framework could be used to accommodate all count data, we chose to use the UCDP data as our primary data source for two reasons. First, these data have been widely used, widely scrutinized, and have relatively expansive spatial and temporal coverage. Second, as we discuss in more detail below, this is one of the only data sources that has categorical estimates of uncertainty, providing researchers with ‘low’, ‘best’, and ‘high’ fatality estimates.⁵

To integrate count data, we construct a model with the following likelihood function:

\begin{aligned}
ℒ = \prod_{i = 1}^{N} \prod_{t = 1}^{T} \prod_{j = 1}^{J} & \underbrace{\left[F(α_{t j, y_{i t j}} - θ_{i t} β_{j}) - F(α_{t j, y_{i t j} - 1} - θ_{i t} β_{j})\right]^{v_{j} (1 - c_{j})}}_{\text{Ordinal items (changing standard)}} \\
\times & \underbrace{\left[F(α_{j, y_{i t j}} - θ_{i t} β_{j}) - F(α_{j, y_{i t j} - 1} - θ_{i t} β_{j})\right]^{(1 - v_{j}) (1 - c_{j})}}_{\text{Ordinal items (constant standard)}} \\
\times & \underbrace{\left[\left(p_{i t}^{*} + (1 - p_{i t}^{*}) \left[\left(\frac{r}{\exp(α_{J} + θ_{i t} β_{J}) + r}\right)^{r}\right]\right)^{y_{i t j} = 0} + \left((1 - p_{i t}^{*}) \left[\frac{Γ(r + y_{i t j})}{Γ(r)\, y_{i t j}!} \left(\frac{r}{\exp(α_{J} + θ_{i t} β_{J}) + r}\right)^{r} \left(\frac{\exp(α_{J} + θ_{i t} β_{J})}{\exp(α_{J} + θ_{i t} β_{J}) + r}\right)^{y_{i t j}}\right]\right)^{y_{i t j} > 0}\right]^{(1 - v_{j}) c_{j}}}_{\text{Count indicators (constant standard)}}
\end{aligned}

where v_j and c_j are indicator variables that determine which portion of the likelihood function a particular manifest variable should be passed through. For standards-based indicators $v_{j} = 1$ and $c_{j} = 0$ ; for events-based indicators $v_{j} = 0$ and $c_{j} = 0$ ; and for the UCDP count data $v_{j} = 0$ and $c_{j} = 1$ .
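The way the exponents act as switches can be checked with a short sketch. For each of the three indicator types, exactly one exponent equals 1 and the others equal 0, so only one bracketed term contributes to the likelihood. The dictionary keys and function name below are hypothetical labels, not names used by the authors.

```python
def component_exponents(v_j, c_j):
    """Exponents applied to the three bracketed terms of the likelihood."""
    return {
        "ordinal_changing_standard": v_j * (1 - c_j),
        "ordinal_constant_standard": (1 - v_j) * (1 - c_j),
        "count_constant_standard": (1 - v_j) * c_j,
    }


# (v_j, c_j) for each indicator type, as stated in the text.
INDICATOR_TYPES = {
    "standards_based": (1, 0),
    "events_based": (0, 0),
    "ucdp_count": (0, 1),
}

# For each indicator type, list the likelihood terms that are switched on.
active_terms = {
    name: [term for term, exponent in component_exponents(v, c).items() if exponent == 1]
    for name, (v, c) in INDICATOR_TYPES.items()
}
```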

When constructing the model, one important choice was how to treat the ‘low’, ‘best’, and ‘high’ variables. One option would have been to treat these as three independent indicators, and assign each their own difficulty and discrimination parameters. That is, we would assume that they are conditionally independent and only a function of the latent variable $θ_{i t}$ . A useful analogy for this would be three different coders, one liberal (high), one conservative (low), and one moderate (best). We did not adopt this approach for two reasons. First, treating the three estimates as independent of one another ignores their interdependence and instead assumes that each reflects a distinct manifestation of the latent trait. Second, as we detail below, this would inhibit our ability to generate a single predicted distribution of one-sided-killing for all country-years.

We therefore used an alternative, and potentially more realistic model parameterization that considers these values as the result of one coder or set of coders deliberately attempting to generate estimates of an unknown, true count of one-sided-killing, $y_{i t}^{*}$ . Because this quantity is not observed, coders provide an estimate of this quantity itself, $y_{i t\text{-best}}$ , and two additional estimates $y_{i t\text{-low}}$ and $y_{i t\text{-high}}$ to produce a simple distribution around this mean to reflect uncertainty in the estimate of $y_{i t}^{*}$ . In other words, this approach removes the assumption that the low, best, and high estimates are observed independently and instead assumes that the variation across these three indicators reflects measurement error around the unobserved, true number of killings.

This assumption is reflected in the notation for count-indicators, where single $α_{J}$ , $β_{J}$ , $α^{*}$ , $β^{*}$ , and r parameter values are estimated for all three one-sided government killing outcomes: $\{\text{best}, \text{low}, \text{high}\}$ . The subscript on these item-specific parameters is J to denote that these parameters are assumed to be the last value in the j vector of α and $β$ parameters and therefore the same for each of the three government killing count variables.

Along with generating improved estimates of the latent trait, this model specification generates two additional substantive quantities of interest. First, as detailed above, the model can be used to estimate $p_{i t}^{*}$ , which captures uncertainty related to whether it was possible to observe one-sided-killing in a given country-year. Second, to generate country-year predictive distributions for one-sided-killings, we leverage a useful property of latent variable models – that estimates of the latent trait ( $θ_{i t}$ ) and item-specific parameters can be used to produce predictions for each manifest indicator $y_{i t j}$ . For the event-counts, the expected value of one-sided-killing is:

E(y_{i t}) = (1 - p_{i t}^{*}) \exp(α_{J} + β_{J} θ_{i t})
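The expected-value expression can be verified numerically: summing $y$ times the zero-inflated negative binomial probability over a long range of counts should reproduce $(1 - p_{i t}^{*}) \exp(α_{J} + β_{J} θ_{i t})$. The sketch below does this for one set of hypothetical parameter values; none of the numbers are model estimates, and the function names are illustrative.

```python
import math


def nb_pmf(y, mu, r):
    """Negative binomial probability of count y with mean mu and parameter r."""
    log_p = (math.lgamma(r + y) - math.lgamma(r) - math.lgamma(y + 1)
             + r * math.log(r / (mu + r))
             + y * math.log(mu / (mu + r)))
    return math.exp(log_p)


def zinb_pmf(y, mu, r, p_star):
    """Zero-inflated negative binomial probability of count y."""
    count_part = (1.0 - p_star) * nb_pmf(y, mu, r)
    return p_star + count_part if y == 0 else count_part


def expected_count(p_star, alpha, beta, theta):
    """Closed-form expectation: (1 - p*) * exp(alpha + beta * theta)."""
    return (1.0 - p_star) * math.exp(alpha + beta * theta)


# Hypothetical values, for illustration only.
alpha, beta, theta, r, p_star = 1.0, -1.0, -1.0, 2.0, 0.3
mu = math.exp(alpha + beta * theta)

closed_form = expected_count(p_star, alpha, beta, theta)
# Truncated sum over counts; the tail beyond 2,000 is negligible for these values.
numerical = sum(y * zinb_pmf(y, mu, r, p_star) for y in range(2000))
```

In a full Bayesian workflow the same calculation would be repeated for every posterior draw of the parameters, which yields a distribution of expected counts and not just a single number.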

Often, these predictions are used as a form of model-checking (Gelman & Hill, 2007). Yet, relaxing the assumption that the manifest indicators are measured without error, these posterior predictions are also a useful means of approximating uncertainty around the indicators themselves. A desirable feature of this modeling framework is that predictions for $y_{i t j}$ can be generated regardless of whether this indicator was observed within a particular year. We therefore generate predictive distributions for one-sided-killing both for years where UCDP did not find reliable documentary evidence of one-sided-killing resulting in at least 25 fatalities and for years that are outside the temporal domain of the UCDP dataset (1946–88). With regard to the first set of cases, this allows researchers to weaken the assumption that zero killings took place for country-years not included in the UCDP data. Thus, we can identify countries where killings were not observed, but were probabilistically likely to have occurred based on the high levels of other repression variables.

Uncertainty around the number of killings is also quantifiable because the prior distribution of each of the model parameters and the latent variable allows for the approximation of the posterior distribution of each country-year distribution of one-sided government killing counts. Country-year heterogeneity is driven by uncertainty in $θ_{i t}$ , which captures the latent degree of repression in a country-year and is a function of variation between the ‘low’, ‘best’, and ‘high’ estimates and the other manifest indicators, and by uncertainty in $p_{i t}^{*}$ . Conversely, when the human rights indicators all point in a similar direction and there is less variation in the ‘low’, ‘best’, and ‘high’ indicators, we expect more precise estimates of one-sided-killing. While this modeling structure offers meaningful extensions to conventional techniques, broader challenges to estimating count data nevertheless remain. Most importantly, the number of primary sources available for each country varies and the quality and reliability of the information contained in each document vary as well. The model parameterizes each of these variables, which will eventually allow researchers to make probabilistic statements about the relative quality of the information used in the estimation itself. We leave this task for future research.

Priors and estimation

The parameters for the binary and ordered data are given the same distributions as in Fariss (2014) (Table I) with one exception. Recent work has applied robust-modeling techniques as a means of improving latent variable model estimates (Reuning, Kenwick & Fariss, 2019). Specifically, the conventional assignment of a standard normal prior to the latent trait is substituted with a Student’s t-distribution using the following prior specification on the latent trait and innovation parameter:

θ_{i 1} \sim T_{1000}(0, 1) \quad \forall i \in [1, N]

θ_{i t} \sim T_{4}(θ_{i (t - 1)}, σ) \quad \forall i \in [1, N] \text{ and } \forall t \in [2, T]

σ \sim N(0, 3) \, I(σ > 0)

The wider tails of the Student’s t-distribution allow estimates of the latent trait to experience sudden changes within a given time series. This is a desirable modeling innovation as repression levels may change rapidly because of regime change, military coups, or the onset of rebellion.
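To see what this prior implies, one can simulate a single country's latent path from it. The sketch below draws the first year from a t-distribution with 1,000 degrees of freedom and each later year from a t-distribution with 4 degrees of freedom centered on the previous year's value. It assumes that $σ$ acts as a scale parameter and fixes it at an arbitrary value; both choices are assumptions made for illustration, as are the function names.

```python
import math
import random


def student_t_draw(rng, df, loc=0.0, scale=1.0):
    """Draw from a location-scale Student's t as normal / sqrt(chi-square / df)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-square with df degrees of freedom
    return loc + scale * z / math.sqrt(chi2 / df)


def simulate_latent_path(rng, n_years, sigma, df_initial=1000, df_innovation=4):
    """Simulate theta_1..theta_T for one country under the robust random-walk prior."""
    theta = [student_t_draw(rng, df_initial)]
    for _ in range(1, n_years):
        theta.append(student_t_draw(rng, df_innovation, loc=theta[-1], scale=sigma))
    return theta


rng = random.Random(2020)  # fixed seed so the illustration is reproducible
path = simulate_latent_path(rng, n_years=70, sigma=0.3)  # sigma = 0.3 is arbitrary
largest_jump = max(abs(b - a) for a, b in zip(path, path[1:]))
```

With only 4 degrees of freedom, occasional year-to-year jumps of several multiples of `sigma` are far more likely than under a normal innovation, which is the behavior the robust specification is meant to permit.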

We now need to assign priors to the parameters for our count data. $α_{J}$ , $β_{J}$ , $α^{*}$ and $β^{*}$ are given the same priors as the $α_{j}$ and $β_{j}$ priors for the other binary manifest indicators. The rate parameter is given the following prior:

r \sim \mathrm{Gamma}(1, 0.5)

Results and validation

Validating estimates of respect for human rights and one-sided-killings is difficult because we cannot observe the ‘true’ values with complete certainty (Adcock & Collier, 2001). In our application, we are particularly concerned with whether our model fits the data reasonably well and whether the estimates of one-sided-killing produced are valid.⁶

We assess the validity of our model using several types of criterion-related validity checks, which validate a measure based on its relationship with existing measures and by determining whether each behaves in theoretically plausible ways (Trochim & Donnelly, 2008: 59). First, we compare the model’s estimates to previous latent estimates of repression, which is one type of convergent validation check. For convergent validity, the latent variable estimates should closely relate to other measures that are known to be valid measures of the concept of interest. Second, we conduct a posterior predictive check of how well our model’s estimates correspond to the data used to generate the model (Gelman & Hill, 2007). Specifically, we examine the correlation between model predictions of one-sided-killing and the original UCDP variables. A strong correlation would be evidence that our model fits the data relatively well, which itself functions as a form of predictive validity check (Trochim & Donnelly, 2008: 57). Third, we take advantage of the expanded temporal scope of our data to track changes in one-sided-killing over time. As a convergent validity check, we determine whether the predicted number of one-sided-killings corroborates recent findings of a decline in other forms of political violence such as fatalities during war (Goldstein, 2011; Lacina, Gleditsch & Russett, 2006). Fourth, we assess the predicted count variables through the examination of the Democratic Republic of Congo and the ten cases most likely to have experienced one-sided-killing that are not covered by UCDP. Such an assessment of deviant cases is a concurrent validity check that can use a case with an unexpected value, a positive count that is missing from the UCDP dataset, to understand if the estimates from our model are in line with qualitative information about that case (Seawright, 2016; Trochim & Donnelly, 2008).⁷

Figure 1.

Comparing estimates of latent respect for physical integrity rights

Latent variable estimates of the respect for physical integrity rights

Figure 1 displays the mean estimates of the latent trait ( $θ_{i t}$ ), respect for physical integrity rights, across different model specifications. Comparisons are made between the model including counts of killings and those originally produced by Fariss (2014). The addition of count-based data into the model produces more variation between the latent traits, as is reflected in the dispersion of estimates along the diagonal line, which would otherwise indicate perfect agreement in model estimates. Substantively, these patterns suggest that introducing count-based indicators uncovers more granular estimates for respect for physical integrity rights across country-years when observing one-sided-killings.

Figure 2.

Predicted counts of one-sided-killing along values of the latent trait

Model predictions of one-sided-killing

Figure 2 displays predictions across values of $θ_{i t}$ along with the observed UCDP ‘low’, ‘best’, and ‘high’ counts. The red line in the main figure corresponds to the mean posterior prediction of one-sided-killings given that killing is observed, while the shaded region corresponds to the 95% credible interval around the prediction. The curved red line in the bottom section of the panels is the probability of observing killings given the estimate of $θ_{i t}$ . These lines are the same across the different figures as we only estimate one set of parameters for the three different UCDP counts.

One-sided-killings reduce to zero at approximately the mean value of latent repression estimates. The UCDP dataset begins recording frequent observed instances of one-sided-killing at approximately one standard deviation (–1.0) below the mean value of the ‘true’ level of repression. The magnitude of the predictions increases as the latent variable decreases. Though only Rwanda (1994) nears the maximum observed value, the model makes predictions that accord with earlier episodes of domestic political violence that occurred prior to 1989 when the coverage of the UCDP conflict dataset begins.

Figure 3.

Predictions of one-sided-killing for country-years omitted from the UCDP data

There are several observations where the UCDP data do not identify one-sided-killings – reflected as zeros for the low count in these country-years – but our model generates non-zero predictions. Figure 3 reports predictions for this subset of observations. For most country-years, the model produces predictions tightly clustered around 0, which is consistent with the decision to exclude them from the UCDP data. For observations that are otherwise low on the latent trait, however, the model predicts non-zero values for one-sided-killings.

Figure 4.

Predicted one-sided-killing for worst country-years with no reported killings in UCDP data

Next, we focus on the cases where UCDP reports no one-sided-killing while our model predicts high levels of one-sided-killing. Figure 4 shows the top ten country-year units for unobserved one-sided-killings. Afghanistan in 1989 is the most extreme case: we estimate a median of 1,098 killings, with a 33% to 66% range of 649 to 1,778. In 1989 the USSR withdrew from Afghanistan, starting a civil war that would last until 1992. Iraq in 1989 is next, with a median estimate of 705 and a range of 470 to 1,024. This was the last year of the Anfal genocide, in which the Iraqi government systematically massacred Kurdish Iraqis. Although the bulk of the killings occurred just prior to this date, documentary evidence suggests active repression campaigns persisted, and our estimates suggest that one-sided-killings remained probable. Eight of the ten cases we identify fall between 1989 and 1991, and almost all of these were countries caught between the United States and the Soviet Union at the end of the Cold War. The only cases outside this period are Sudan in 2009 and 2010, for which we have median one-sided-killing estimates of 190 and 179, respectively. Reports from Human Rights Watch mention civilian bombing in Sudan during these years, in addition to reports of 200 individuals who were ‘disappeared’ between 2008 and 2009.
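The summaries reported for these cases (a median and a 33% to 66% range per country-year, ranked among units with no reported killings) can be sketched as follows. This is a minimal illustration under stated assumptions: posterior predictive draws are held in a plain dictionary keyed by (country, year), the quantile uses simple linear interpolation, and the function names and toy inputs are hypothetical rather than taken from the replication code.

```python
def quantile(sorted_values, q):
    """Linear-interpolation quantile of an already sorted list (0 <= q <= 1)."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac

def summarize_draws(draws, lower=0.33, upper=0.66):
    """Return (lower quantile, median, upper quantile) of one unit's draws."""
    ordered = sorted(draws)
    return quantile(ordered, lower), quantile(ordered, 0.5), quantile(ordered, upper)

def top_unreported(draws_by_unit, reported_low, k=10):
    """Rank units whose reported low count is zero by predicted median."""
    medians = [
        (unit, summarize_draws(draws)[1])
        for unit, draws in draws_by_unit.items()
        if reported_low.get(unit, 0) == 0
    ]
    return sorted(medians, key=lambda pair: pair[1], reverse=True)[:k]

# Toy inputs; units and values are made up for illustration only.
toy_draws = {("X", 1989): [100, 200, 300], ("Y", 1990): [1, 2, 3], ("Z", 1991): [500, 600, 700]}
toy_reported = {("X", 1989): 0, ("Y", 1990): 0, ("Z", 1991): 40}
print(top_unreported(toy_draws, toy_reported, k=2))
```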

Figure 5 displays the Spearman correlations between the model’s predictions of one-sided-killing and the original UCDP ‘low’, ‘best’, and ‘high’ counts. The median Spearman rank correlation for the low estimates is 0.69, while for the best estimates it is 0.80 and for the high it is 0.82. The correlations along with the other examinations of the predictions of one-sided-killing tell us that the model does a good job of fitting the observed and unobserved data.
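One way to obtain a distribution of rank correlations like the one summarized above is to compute the Spearman correlation separately for each posterior draw of the predictions and then take the median across draws. The sketch below assumes that procedure (the text does not spell it out), uses only the Python standard library, and relies on hypothetical function names and toy data.

```python
import statistics

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # ranks are 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = statistics.fmean(rx), statistics.fmean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

def median_spearman(draws, observed):
    """Median across posterior draws of the correlation with observed counts."""
    return statistics.median(spearman(draw, observed) for draw in draws)

# Toy example: three posterior draws for five country-years (made-up values).
observed = [0, 0, 30, 120, 800]
draws = [[0, 2, 25, 150, 600], [1, 0, 40, 90, 900], [0, 5, 10, 200, 500]]
print(median_spearman(draws, observed))
```

Working with ranks rather than raw counts keeps the comparison from being dominated by a handful of extreme country-years.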

Changes in government killing over time

Figure 5.

Correlations between model predictions of one-sided-killings with UCDP low, best, and high estimates

Because of the temporal coverage of the other human rights variables, our model produces estimated counts of one-sided-killings beginning in 1946, allowing us to conduct a convergent validity test by examining whether our measure can corroborate existing findings that the level of violence has declined over time. Figure 6 displays the total number of one-sided-killings each year. Each annual count is created by taking draws from the posterior of each country’s predicted one-sided-killing for a given year and then summing across all countries. Readers should keep in mind that our predictions of one-sided-killing are based primarily on our country-year estimates of the latent respect for human rights ($θ_{i t}$), which is in turn informed by the available manifest indicators. From 1946 to 1988, the UCDP data are missing and our predictions are therefore based on information conveyed in the categorical human rights indicators; this loss of data results in an increase in predicted uncertainty during that time.⁸
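The construction of the yearly totals (summing each country's predicted count within a year, draw by draw) can be sketched as below. This is a minimal illustration rather than the replication code: it assumes that draw s comes from the same posterior sample for every country-year, and the function names, dictionary layout, and toy values are hypothetical.

```python
import statistics

def yearly_totals(draws_by_unit):
    """Sum predicted counts across countries within each year, draw by draw.

    draws_by_unit maps (country, year) -> list of posterior predictive draws.
    Returns {year: [total for draw 0, total for draw 1, ...]}.
    """
    totals = {}
    for (country, year), draws in draws_by_unit.items():
        if year not in totals:
            totals[year] = [0] * len(draws)
        if len(draws) != len(totals[year]):
            raise ValueError("every unit needs the same number of draws")
        totals[year] = [t + d for t, d in zip(totals[year], draws)]
    return totals

def summarize(totals):
    """Per year: (smallest total, median total, largest total) across draws."""
    return {year: (min(v), statistics.median(v), max(v)) for year, v in totals.items()}

# Toy example with two made-up countries and three posterior draws each.
toy = {("A", 1950): [1, 2, 3], ("B", 1950): [10, 0, 5], ("A", 1951): [0, 0, 1]}
print(summarize(yearly_totals(toy)))
```

Summing within each draw before summarizing carries the country-level uncertainty through to the yearly total, which is why the intervals widen in years with less information.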

The model suggests that the total number of one-sided-killings was relatively low starting in 1946 before increasing in the mid-1950s. This increase is driven in part by the independence of states like Sudan, which had violent entries into the international system. Estimates remain high throughout the Cold War; more than a million one-sided-killings occurred each year. The number dropped into the high thousands during the 1990s (other than during the Rwandan genocide in 1994) and most recently to just below 1,000 (these deaths do not include extra-judicial killings that occur in custody). These estimates corroborate the results from other studies that find a decline in fatalities during war (Goldstein, 2011; Lacina, Gleditsch & Russett, 2006), a decline in the level of violence more generally (Pinker, 2011), and improvements in respect for human rights (Fariss, 2014, 2019). All of these authors point out that the decline in violence has not been steady – a fact thrown into stark relief as recent conflicts in Ukraine, Venezuela, and Syria presage heightened violence.

Country example: The Democratic Republic of Congo (1993–96)

Documenting repressive events in any country is difficult because of limited resources and limited access to areas in which repressive acts take place (e.g. Brysk, 1994). This is especially true in places such as the Democratic Republic of Congo, which, over the last two decades, has experienced two large-scale internationalized conflicts with armed participants from multiple countries, as well as internecine violence between armed groups of militia with even more varied affiliations than the state-sponsored combatants (Schatzberg, 2012). Acquiring information in such an environment is, not surprisingly, challenging (Sundaram, 2014).

Figure 6.

Model based estimates of the yearly number of one-sided government killings beginning in 1946 and ending in 2017

Though the best open-source information – much of which is provided by journalists and monitors on the ground – is used by the UCDP coders, the estimates given are still just that: estimates. Figure 7 displays distributions of the number of one-sided-killings for the Democratic Republic of Congo (1993–1996). Each plot contains the simulated distribution of potential values, the median prediction from our model, and the original UCDP ‘low’, ‘best’, and ‘high’ counts. 1994 is notable because it is the only year for which UCDP was unable to uncover direct evidence of 25 killings or more. Nevertheless, an absence of killing is unlikely, given qualitative evidence, the values on the other observed repression variables, and the UCDP estimates from 1993 and 1995.

According to the US State Department Human Rights reports, in the Democratic Republic of Congo, then Zaire, ‘[p]rovincial officials continued to incite ethnic strife leading to massive displacement and deaths in Shaba, although on a smaller scale than the unprecedented violence in 1993’. The report goes on to provide more detail about the scale of this type of repressive event, stating (1) that the ‘undisciplined security forces committed numerous extrajudicial killings’; (2) that ‘[h]uman rights observers, the press and eyewitnesses reported several dozen such fatal altercations, many committed by uniformed personnel’; and (3) that ‘[i]t is highly likely that additional incidents went unreported, especially in Zaire’s remote interior’ (see more details about the information for this case in the Online appendix).

Figure 7.

Country-year distribution of the number of one-sided government killings

Though such events were less frequent than in prior years, security forces engaged in the extrajudicial killing of civilians in 1994. Because the information environment was poor, too little reliable information was available for this case to enter the UCDP database. Our latent variable model produces a distribution of potential estimates for this case, with a median estimate of 61 deaths.

Conclusion

The measurement model and validation tests presented in this article contribute to a growing research area on using measurement models like this to improve the validity of the variables used to study peace and conflict (e.g. Anders, 2020; Barnum & Lo, 2020; Clay et al., 2020; Cordell et al., 2020; Huddleston, 2020; Krüger & Nordås, 2020; Marquardt, 2020; Meserve & Pemstein, 2020; Montal, Potz-Nielsen & Sumner, 2020; Terechshenko, 2020). Our research makes important improvements in the measurement and understanding of repressive events by linking together count data and categorical variables of repression. In particular, our modeling strategy leverages disagreements in event-counts within data sources, which allows us to generate country-year distributions of estimates of one-sided-killings as part of the latent variable model of repression. Disagreements between event-count estimates exist because of the reporting incentives of monitoring organizations and a lack of transparency or resources with which to completely observe all repressive events. This framework allows us to bring together different sources of information about repression and assess how well each piece of information works together and then, based on assumptions about the way the information was produced, modify, validate, and update the model. We conclude by noting three remaining threats to measurement validity. For each point, the expanding set of human rights data and increasing adoption of new measurement modeling techniques promise to yield additional insights about counting repression.

First, additional data collection for cases where UCDP does not report reliable counts above 25 deaths would allow us to further refine the latent variable model estimates. There is a general challenge in the study of repressive practices because scholars and activists tend to focus monitoring capacity and attention on the most violent cases (Brysk, 1994; Eck & Fariss, 2018). Additional knowledge of the cases not likely to have involved one-sided-killing would improve the performance of the model. Relatedly, our model assumes that the standard of accountability for UCDP one-sided-killing data does not change. The UCDP data, like other event-based variables, are created using many sources and updated over time. These features of the coding process help to account for bias from particular sources, which makes the event-based variables suitable to act as a baseline for comparison with the standards-based variables that do not share this feature.

Second, and building on the first point, scholars are beginning to acknowledge and quantify disagreements between different sources of information. Such efforts should assuage concerns about models that use event-counts from disparate sources of information. Recent research has exploited multiple systems evaluation and capture-recapture models as a promising means of leveraging disagreements between sources to produce more accurate accounts of repressive events (e.g. Hendrix & Salehyan, 2015; Krüger et al., 2013). These analyses are limited to a smaller number of spatial and temporal units. As source-specific information becomes increasingly available, our latent variable measurement strategy provides a principled, model-based approach for incorporating information from new count-based estimation procedures. Linking information from multiple-systems estimation and latent variable models is an important area of new methodological research that directly confronts this challenge.

Third, researchers and activists want to make inferences about more than just country-year units. New data collection efforts are beginning to acknowledge and understand the roles of different state actors who commit human rights violations and the different groups that are targeted. To date, the ITT data project (Conrad, Haglund & Moore, 2013) and the UCDP data project (Eck & Hultman, 2007; Pettersson, Högbladh & Öberg, 2019) are the only data efforts that systematically collect repression data about targets, agents, or non-state actors for all states. Other event-based data collection efforts exist and are also beginning to provide some of this information for specific regions (e.g. Salehyan et al., 2012). The models presented in this article are capable of systematically linking diverse sources of information and multiple levels of information in one model (e.g. country-year, country-year-actors, country-year-victims, country-year-regions). Each of these measurement challenges represents opportunities for new theorizing, new data collection, and new measurement modeling.

Footnotes

Replication data

The estimates from this article, along with the code necessary to implement the models in STAN and R, are publicly available at a dataverse archive: https://doi.org/10.7910/DVN/7C7KPU. The Online appendix can be found at .

Acknowledgements

We thank James Lo, Zhanna Terechshenko, and three anonymous reviewers for many helpful comments and suggestions.

Funding

Fariss acknowledges research support from the SSK (Social Science Korea) Human Rights Forum, the Ministry of Education of the Republic of Korea, and the National Research Foundation of Korea (NRF-2016S1A3A2925085).

ORCID iD

Christopher J Fariss

Notes

References

Adcock, Robert & David Collier (2001) Measurement validity: A shared standard for qualitative and quantitative research. American Political Science Review 95(3): 529–546.

Anders, Therese (2020) Territorial control in civil wars: Theory and measurement using machine learning. Journal of Peace Research 57(6): 701–714.

Barnum, Miriam & James Lo (2020) Is the NPT unraveling? Evidence from text analysis of review conference statements. Journal of Peace Research 57(6): 740–751.

Brysk, Alison (1994) The politics of measurement: The contested count of the disappeared in Argentina. Human Rights Quarterly 16(4): 676–692.

Clay, K Chad; Ryan Bakker, Anne-Marie Brook, Daniel W Hill Jr & Amanda Murdie (2020) Using practitioner surveys to measure human rights: The Human Rights Measurement Initiative’s civil and political rights metrics. Journal of Peace Research 57(6): 715–727.

Conrad, Courtenay R; Jillienne Haglund & Will H Moore (2013) Disaggregating torture allegations: Introducing the ill-treatment and torture (ITT) country-year data. International Studies Perspectives 14(2): 199–220.

Cordell, Rebecca; Kristian Skrede Gleditsch, Florian G Kern & Laura Saavedra-Lux (2020) Measuring institutional variation across American Indian constitutions using automated content analysis. Journal of Peace Research 57(6): 777–788.

Davenport, Christian & Patrick Ball (2002) Views to a kill: Exploring the implications of source selection in the case of Guatemalan state terror, 1977–1995. Journal of Conflict Resolution 46(3): 427–450.

Eck, Kristine & Christopher J Fariss (2018) Ill treatment and torture in Sweden: A critique of cross-case comparisons. Human Rights Quarterly 40(3): 591–604.

Eck, Kristine & Lisa Hultman (2007) Violence against civilians in war. Journal of Peace Research 44(2): 233–246.

Fariss, Christopher J (2014) Respect for human rights has improved over time: Modeling the changing standard of accountability in human rights documents. American Political Science Review 108(2): 297–318.

Fariss, Christopher J (2019) Yes, human rights practices are improving over time. American Political Science Review 113(3): 868–881.

Fariss, Christopher J & Geoff Dancy (2017) Measuring the impact of human rights: Conceptual and methodological debates. Annual Review of Law and Social Science 13: 273–294.

Gelman, Andrew & Jennifer Hill (2007) Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge: Cambridge University Press.

Goldstein, Joshua S (2011) Winning the War on War: The Decline of Armed Conflict Worldwide. New York: Dutton.

Harff, Barbara (2003) No lessons learned from the Holocaust? Assessing risks of genocide and political mass murder since 1955. American Political Science Review 97(1): 57–73.

Harff, Barbara & Ted R Gurr (1988) Toward empirical theory of genocides and politicides: Identification and measurement of cases since 1945. International Studies Quarterly 32(3): 359–371.

Hathaway, Oona A (2002) Do human rights treaties make a difference? Yale Law Journal 111(8): 1935–2042.

Hendrix, Cullen S & Idean Salehyan (2015) No news is good news? Mark and recapture for event data when reporting probabilities are less than one. International Interactions 41(2): 392–406.

Huddleston, R Joseph (2020) Continuous recognition: A latent variable approach to measuring international sovereignty of self-determination movements. Journal of Peace Research 57(6): 789–800.

Jackman, Simon (2008) Measurement. In: Janet M Box-Steffensmeier, Henry E Brady & David Collier (eds) The Oxford Handbook of Political Methodology. Oxford University Press, 119–152.

Kenwick, Michael R (2020) Self-reinforcing civilian control: A measurement-based analysis of civil-military relations. International Studies Quarterly 64(1): 71–84.

Krüger, Jule & Ragnhild Nordås (2020) A latent variable approach to measuring wartime sexual violence. Journal of Peace Research 57(6): 728–739.

Krüger, Jule; Patrick Ball, Megan E Price & Amelia Hoover Green (2013) It doesn’t add up: Methodological and policy implications of conflicting casualty data. In: Taylor Seybolt (ed.) Counting Civilian Casualties: An Introduction to Recording and Estimating Nonmilitary Deaths in Conflict. Oxford: Oxford University Press, 247–264.

Lacina, Bethany; Nils Petter Gleditsch & Bruce M Russett (2006) The declining risk of death in battle. International Studies Quarterly 50(3): 673–680.

Marquardt, Kyle L (2020) How and how much does expert error matter? Implications for quantitative peace research. Journal of Peace Research 57(6): 692–700.

Meserve, Stephen A & Daniel Pemstein (2020) Terrorism and internet censorship. Journal of Peace Research 57(6): 752–763.

Montal, Florencia; Carly Potz-Nielsen & Jane Lawrence Sumner (2020) What states want: Estimating ideal points from international investment treaty content. Journal of Peace Research 57(6): 679–691.

Pettersson, Therese; Stina Högbladh & Magnus Öberg (2019) Organized violence, 1989–2018 and peace agreements. Journal of Peace Research 56(4): 589–603.

Pinker, Steven (2011) The Better Angels of Our Nature: Why Violence Has Declined. New York: Viking.

Poe, Steven C (2004) The decision to repress: An integrative theoretical approach to the research on human rights and repression. In: Sabine C Carey & Steven C Poe (eds) Understanding Human Rights Violations: New Systematic Studies. Aldershot: Ashgate, 16–42.

Reuning, Kevin; Michael R Kenwick & Christopher J Fariss (2019) Exploring the dynamics of latent variable models. Political Analysis 27(4): 503–517.

Rummel, Rudolph J (1994) Power, genocide and mass murder. Journal of Peace Research 31(1): 1–10.

Salehyan, Idean; Cullen S Hendrix, Jesse Hamner, Christina Case, Christopher Linebarger, Emily Stull & Jennifer Williams (2012) Social conflict in Africa: A new database. International Interactions 38(4): 503–511.

Schatzberg, Michael G (2012) The structural roots of the DRC’s current disasters: Deep dilemmas. African Studies Review 55(1): 117–121.

Schnakenberg, Keith E & Christopher J Fariss (2014) Dynamic patterns of human rights practices. Political Science Research and Methods 2(1): 1–31.

Seawright, Jason (2016) The case for selecting cases that are deviant or extreme on the independent variable. Sociological Methods & Research 45(3): 493–525.

Steinert-Threlkeld, Zachary C (2017) Spontaneous collective action. American Political Science Review 111(2): 379–403.

Sundaram, Anjan (2014) Stringer: A Reporter’s Journey in the Congo. New York: Doubleday.

Taylor, Charles Lewis & David A Jodice (1983) World Handbook of Political and Social Indicators, 3rd edition. Volume 2, Political Protest and Government Change. New Haven, CT: Yale University Press.

Terechshenko, Zhanna (2020) Hot under the collar: A latent measure of interstate hostility. Journal of Peace Research 57(6): 764–776.

Trochim, William MK & James P Donnelly (2008) Research Methods Knowledge Base, 3rd edition. Mason, OH: Atomic Dog.

Wayman, Frank W & Atsushi Tago (2010) Explaining the onset of mass killing, 1949–87. Journal of Peace Research 47(1): 3–13.