Abstract
This article introduces the Ethnic One-Sided Violence dataset (EOSV) that provides information on the ethnic identity of civilian victims of direct and deliberate killings by state and non-state actors from 1989 to 2013. The EOSV dataset disaggregates the civilian victims in the one-sided violence dataset from the Uppsala Conflict Data Program by identifying which ethnic group they belong to, using the list of politically relevant ethnic groups from the Ethnic Power Relations data. By providing information on the ethnic targets of violence, EOSV enables researchers to explore new questions about the logic and dynamics of violence against civilians.
Introduction
Campaigns of armed violence against civilians by state forces and non-state armed groups often cause severe harm to civilian populations. Case-based evidence suggests that ethnicity is a salient dimension when accounting for patterns of civilian abuse by state and non-state actors alike (cf. Prunier, 2005, 2008; Straus, 2015; Weidmann, 2011). Indeed, a cursory review of contemporary conflicts, in places such as Sudan, Afghanistan, Myanmar, Iraq and Mali, suggests that ethnic markers serve as primary criteria by which civilians are targeted. Yet a major obstacle to a more systematic and comparative investigation of the ethnic dimension of civilian victimization has been the lack of data across cases and over time. Whereas information on perpetrators of civilian abuse is available in a global time series dataset, there are no comparable data on the ethnic identity of civilian victims. As a result, we still have a very limited understanding of when and why ethnic one-sided violence occurs, and why some groups of civilians are targeted while others are not.
The lack of data on civilian victims’ identities represents a significant hurdle for the current advancement of conflict research. Over the past decade, the analysis of a number of recent datasets has facilitated a more nuanced and comprehensive understanding of the role of ethnic mobilization and ethnic exclusion for the outbreak and duration of armed conflict pitting armed non-state actors against state governments (e.g. Cederman et al., 2010; Wucherpfennig et al., 2012). Meanwhile, we lack comparative data to address how ethnicity shapes the strategies and dynamics of violence against civilian populations—violence that occurs both within and outside the context of armed conflict (Gohdes, 2017).
To address this gap, this article presents a new dataset on the ethnic affiliation of victims of direct and deliberate killing by state and non-state actors. The Ethnic One-Sided Violence dataset (EOSV) is the first dataset with global coverage that codes the ethnic identity of victims of one-sided violence. It covers the period 1989–2013. Specifically, the EOSV dataset disaggregates the civilian victim category in the Uppsala Conflict Data Program (UCDP) One-Sided Violence (OSV) dataset (Allansson et al., 2017; Eck and Hultman, 2007) by specifying the victims’ ethnic affiliation. Based on a careful reading of the original records of the OSV events and a number of secondary sources, the EOSV dataset records—for each perpetrator-year in the OSV dataset—the ethnic identity of all of the victim groups that can be identified among the civilians killed. Each ethnic victim group is coded using the identifiers in the Ethnic Power Relations (EPR) dataset that provides records of all politically relevant ethnic groups worldwide (Cederman et al., 2010; Vogt et al., 2015). In addition to the ethnic identity of the victims of one-sided violence, the EOSV data identify spells of deliberate ethnic targeting, i.e. one-sided violence in which victims were predominantly targeted on the basis of their ethnic identity.
The consistency with UCDP and EPR data allows the researcher to explore various links between ethnic power relations, perpetrator characteristics and the victims of one-sided violence. The EOSV dataset thus holds the potential for significantly enhancing our understanding of the conditions under which ethnic targeting occurs, and which population groups are at particular risk. This is knowledge with high policy relevance. Ethnic violence against civilians is associated with tremendous direct and immediate physical harm for those affected. At the societal level, collective targeting along ethnic group lines is also likely to polarize societies, complicate the prospects for reaching a peaceful settlement to armed conflict and undermine the conditions for a durable peace. Our new data provide a firmer empirical base to probe the causes and consequences of ethnic one-sided violence, a critical step toward identifying policies that can prevent its occurrence.
We begin this article by reviewing recent scholarship on violence against civilians to motivate our effort to collect a new dataset of ethnic one-sided violence. We then proceed to describe our conceptual approach and data collection procedures. After presenting some descriptive trends over time and across cases in the EOSV data, we proceed to discuss the compatibility of our data with other sources and present an empirical illustration of how our data can be used. The final section points to some potentially fruitful new avenues for research that are opened up by this new data source.
Motivation for a new dataset
The question of why state and non-state actors intentionally kill unarmed civilians has received vast scholarly attention over the past decade (e.g. Balcells, 2010; Kalyvas, 2006; Stanton, 2016; Ulfelder and Valentino, 2008). This literature relies on a number of diverse sources, including micro-level surveys, carefully collected qualitative evidence, and cross-sectional data. The systematic study of violence against civilians has also been spurred by the more recent availability of large-N, comparative data on its occurrence. Much recent work relies on the UCDP One-Sided Violence (OSV) dataset to understand the conditions under which civilians are more likely to be subject to violent campaigns. This dataset includes all direct and deliberate killings of civilians by state and non-state actors that claim at least 25 fatalities per calendar year, with global coverage from 1989 onwards. 1 The OSV data represent an important complement to other existing data that either deal only with mass killings or genocidal violence, such as data collected by Barbara Harff and Ted Gurr, in part together with the State Failure Task Force (e.g. Harff, 2003; Krain, 1997; Ulfelder and Valentino, 2008; Valentino et al., 2004), or are not restricted to direct and deliberate killings of non-combatants as a separate phenomenon, such as the Armed Conflict Location and Event Dataset (Raleigh et al., 2010). 2 Whereas datasets that cover individual conflicts, such as Greece, Bosnia, or Colombia, typically offer a higher level of detail (cf. Kalyvas, 2006; Restrepo et al., 2006; Weidmann, 2011), the global and long-term coverage of the OSV dataset makes it uniquely positioned to highlight general patterns in civilian targeting.
One shortcoming of the original OSV dataset, which it shares with most other large-N data on civilian killings, is the lack of information on who the civilians are. Generally, whereas information on perpetrators is available on a global scale, existing data do not offer comparable information on the victims’ collective identities. We are aware of only two datasets that record violence against civilians across a larger number of cases and also systematically record victim attributes, namely the Global Terrorism Database (National Consortium for the Study of Terrorism and Responses to Terrorism (START), 2016) and the Konstanz One-Sided Violence Event Dataset (Schneider and Bussmann, 2013). In its global time series data on terrorist attacks, the Global Terrorism Database codes target attributes, including for example victim nationality. Yet the data only cover non-state actors and are limited to acts of terrorism (which corresponds to only one subset of the one-sided violence data). 3 The Konstanz One-Sided Violence Event Dataset offers detailed information on acts of one-sided violence that claim at least one fatality, including, for a subset of the cases, the ethnicity of the victims. Yet, whereas the data are rich in detail about the events, the dataset only includes selected time periods for 17 non-randomly selected armed conflicts. Hence, despite the existence of these two valuable datasets, there is a need for global time series data on the ethnic dimension of civilian victimization. Studies of individual conflicts or regions have corroborated theories that make predictions about violence along ethnic lines in conflict situations (e.g. Kaufman, 1996, 2006; Melander, 2009; Posen, 1993; Weidmann, 2011). Yet comparative analysis of these claims has been hampered by the lack of global, time series data on the identity of civilian victims.
Existing studies have drawn inferences regarding ethnic targeting based on national level ethnic configurations (Kim, 2010; Querido, 2009), whether the conflict is fought along identity lines (Valentino et al., 2004; Valentino and Croco, 2006), or information about ethnic settlement patterns combined with the location of wartime civilian abuse (Fjelde and Hultman, 2014; Sullivan, 2012). Yet, since ethnicity and ethnic targeting are inferred rather than ascertained from data, these approaches are suboptimal for identifying the conditions under which ethnicity becomes a salient dimension of civilian killings. Sometimes this inference is also, at least partly, based on plausible explanatory variables, such as the ethnic geography of a polity. In sum, we need data on the ethnic identity of victims of violence to assess theoretical claims about ethnic tensions and civilian victimization.
The ethnic one-sided violence dataset
We introduce a new global dataset on the ethnic identity of civilian victims of one-sided violence. The EOSV is the first global dataset that records the ethnic identity of the victims of deliberate lethal violence against non-combatants by state and non-state actors.
The EOSV dataset is the result of an extensive data collection effort to identify the ethnic identity of the civilian victims recorded in the OSV dataset from the UCDP (Allansson et al., 2017; Eck and Hultman, 2007). The OSV dataset records all instances where an organized armed actor, either a state or a non-state armed group, has been responsible for directly and deliberately killing at least 25 civilians (unarmed non-combatants) in one calendar year (Eck and Hultman, 2007). Whereas the OSV dataset is released in a geo-referenced event-based format, which provides information about the number, date and location of civilians killed in deliberate and direct attacks, there is no information about the characteristics of the civilian victims.
For each perpetrator-year in the original OSV dataset, the EOSV dataset unpacks the civilian victim category—and records each specific ethnic group that can be identified among the victims. Through a comprehensive coding effort, the EOSV data collection back-tracks the documentation underlying the original coding of the OSV dataset by re-reading all individual media reports, non-governmental organization (NGO) reports, and other case-specific documentation provided in case studies and the coding documentation that support the original coding of the event. 4 In addition to these, we also rely on other relevant sources, notably Amnesty International, Human Rights Watch, and the UN, as well as local NGOs, truth commissions and other country-specific sources. 5 Below we describe the data collection procedures in more detail. 6
To code the ethnic identity of civilian victims, we start with a candidate list of ethnic groups: the EPR dataset (Cederman et al., 2010; Vogt et al., 2015). The EPR dataset provides a list of all politically relevant ethnic groups worldwide. It defines ethnicity as a subjectively experienced sense of commonality based on a belief in common ancestry and shared culture (Weber, 1978). An ethnic group is considered politically relevant if at least one political organization has claimed to represent its interests at the national level, or if its members are subjected to state-led political discrimination (Vogt et al., 2015: 1329). Relying on the EPR dataset for our candidate list ensures that we code ethnic group affiliation at a level of granularity where the ethnic classification is politically salient. 7 The EPR also provides users of the EOSV dataset with a number of additional variables to classify ethnic group attributes (some of which we discuss below in relation to our empirical application using the data). The EPR dataset itself does not, however, provide any information about the occurrence of violence directed against particular groups.
Assessing the ethnic identity of civilian victims from the source documentation presents challenges. As a general rule, we only assign ethnic group identifiers if the ethnic identity of civilian victims is explicitly mentioned in at least one of the sources that describes the event. This implies, for example, that the geographically concentrated occurrence of one-sided violence in areas primarily inhabited by a specific ethnic group is not sufficient to assign fatalities to this specific group. Instead, we corroborate the ethnic character of violence through independent sources that explicitly refer to the ethnic identity of civilian victims. As an exception to this rule, we infer the ethnic identity based on implicit information where the context in which one-sided violence occurs strongly implies a particular ethnic identity. An example is the Kurds in Turkey where victims of government violence are often not explicitly referred to as Kurds in the sources, but where geographical information combined with known patterns of targeting, corroborated by other sources such as human rights reports or case studies, clearly points to the ethnic identity of the victims. 8
Note that the coding of ethnic victim groups reflects whether victims could be identified as belonging to a particular group. Thus, it does not require ethnic groups to be disproportionately targeted in relation to the overall number of victims, nor does it require a stated intent on the part of the perpetrators to subject a particular group to targeting. The coding of ethnic violence requires only that a particular ethnic group can be identified among the victims. However, for all instances of violence against an identified ethnic group, we also code a separate variable denoting whether members of ethnic victim groups were subject to deliberate ethnic targeting. The variable is coded based on available information on intentional ethnic profiling, by first making an assessment of whether one-sided violence was based on collective targeting (profiling based on potential victims’ alleged membership in particular groups, ethnic and non-ethnic), as opposed to either selective individualized targeting (e.g. based on behavioral criteria such as participation in a protest) or targeting that was more or less arbitrary in the selection of victims (e.g. indiscriminate killings of by-standers). On the distinction between selective, collective, and indiscriminate targeting, see Gutiérrez Sanín and Wood (2017) and Steele (2009).
Where case-based reports suggest collective targeting, we look for evidence that the targeting was ethnic. This procedure focuses on, for example, explicit announcements by armed group leaders of an intention to target members of specific ethnic groups, or other evidence from independent sources, such as Human Rights Watch, that civilian victims were screened for ethnic “markers” before being killed. Such screening is visible, for example, in the targeting of ethnic Hazaras in Afghanistan in 1998–2001 by the Taliban regime. During these years, Taliban forces repeatedly searched villages for members of the Hazara ethnic group, who were then singled out and killed. In contrast, although we have identified members of the Temne and Limba ethnic groups as victims of violence conducted by the Revolutionary United Front in Sierra Leone in the 1990s, neither group was targeted because of its ethnic identity. Instead, evidence suggests indiscriminate violence. We have therefore coded the former as a case of deliberate targeting and the latter as a case of no deliberate targeting, both with low uncertainty (see below for uncertainty coding).
Since explicit announcements by perpetrators are relatively rare, the coding is mostly based on contextual evidence and assessments by independent sources, for example Human Rights Watch, Amnesty International and various UN reports. An example is the targeting by the government of Sudan and Janjaweed of the Fur, Zaghawa and Masalit ethnic groups in and around Darfur since 2003. The specific patterns of killings and the general context strongly suggest consistent intentional targeting although explicit evidence is insufficient for some years. To ensure some certainty about the pattern, the ethnic targeting variable is only coded positive if there is evidence that at least 50% of the victims belonging to a given ethnic group were killed based on intentional ethnic targeting during that year. Note that where enough information is available, we determine the ethnic affiliation of the victims and the type of targeting on the event level (i.e. for every single event) before aggregating to the actor-year level.
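The aggregation from events to actor-years described above can be sketched in a few lines of Python. This is only an illustration of the 50% rule under simplifying assumptions: the record layout, the actor and group names, and the fatality figures are all hypothetical, and the actual coding rests on qualitative assessment of sources rather than on a mechanical computation.

```python
from collections import defaultdict

# Hypothetical event records (not EOSV data):
# (actor, year, group, fatalities, intentionally_targeted)
events = [
    ("Actor A", 2000, "Group X", 30, True),
    ("Actor A", 2000, "Group X", 10, False),
    ("Actor A", 2000, "Group Y", 5, True),
    ("Actor A", 2000, "Group Y", 20, False),
]


def aggregate_targeting(events, threshold=0.5):
    """Aggregate event-level codes to the actor-year-group level.

    Returns {(actor, year, group): bool}. The value is True when the share
    of a group's victims killed through intentional ethnic targeting in
    that year is at least `threshold` (50% in the rule described above).
    """
    totals = defaultdict(int)
    targeted = defaultdict(int)
    for actor, year, group, fatalities, is_targeted in events:
        key = (actor, year, group)
        totals[key] += fatalities
        if is_targeted:
            targeted[key] += fatalities
    return {
        key: targeted[key] / totals[key] >= threshold
        for key in totals
        if totals[key] > 0
    }


coded = aggregate_targeting(events)
# Group X: 30 of 40 victims targeted (75%), so the variable is positive.
# Group Y: 5 of 25 victims targeted (20%), so it is not.
```

The threshold is exposed as a parameter only to make the rule explicit; the text specifies 50%.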
We always specify the level of uncertainty that goes with our coding when we assess whether there was deliberate ethnic targeting. Higher uncertainty levels are coded when we lack information, for example, where we have contextual evidence that strongly implies that ethnic targeting took place, but our sources are not specific enough in their details to determine this with a very high level of confidence and we cannot completely rule out that the targeting was based on behavioral criteria or indiscriminate in nature. We also report higher uncertainty if we have explicit information of ethnic targeting, but this information cannot conclusively be linked to the particular events in the one-sided violence recorded in the UCDP dataset (e.g. if the sources in question do not provide the temporal and geographic locations to ascertain this linkage). High uncertainty may also reflect sources disagreeing on the nature of the events. In addition to assessing uncertainty based on the content and details provided, we also consider the credibility of the source when we assign uncertainty. If several independent sources (such as Amnesty International and Human Rights Watch) report evidence to support that deliberate ethnic targeting did occur, statements by the perpetrators denying systematic ethnic one-sided violence do not affect the coding decisions for the intention variable, nor the uncertainty estimates. More information and several examples to illustrate our coding rules are available from the codebook that accompanies the dataset. 9 The choice between focusing on the incidence of ethnic violence or the more restrictive coding of ethnic targeting, as well as subsetting data based on uncertainty scores, lies with the user of the data and should ultimately be guided by the research question and the arguments to be tested.
The resulting dataset contains one row per perpetrator-year. A visual, simplified presentation of the data structure is shown in Table 1. Each row maintains the original information contained in the UCDP OSV dataset, which includes the perpetrator and the number of civilian fatalities (shaded cells in Table 1). In separate columns, the EOSV dataset adds the newly collected data, with information on the name and EPR identifier of each ethnic group recorded among the civilian victims (non-shaded cells in Table 1). The number of groups identified in a particular perpetrator-year is often larger than 1 and goes up to 7. For each of the ethnic groups identified among the victims, we also record—in separate columns—whether there is evidence of deliberate ethnic targeting of this group (see discussion above). For the ethnic victim groups, we code whether the information the coding was based on was explicit or implicit. For the ethnic targeting variable, we code the level of uncertainty with which the targeting pattern could be established.
Table 1. Data structure.
Table 2 presents two lines from our new dataset for the Sudan example discussed above. From the OSV dataset we can glean that the Sudanese government killed 1776 civilians in 2003. Based on our data collection, and thus contained in our dataset, we can indicate that among the targeted civilians were members of the Masalit, Fur, and Zaghawa ethnic groups. In all three cases the government targeted these victims because of their ethnicity. In the same year the Janjaweed killed 1064 civilians according to the OSV data. Our data collection shows that these civilians belonged to the Zaghawas, Masalits, and Furs, and that these were all intentionally targeted by this non-state actor. 10 For each of the ethnic groups mentioned we also retain the EPR group identifier, where appropriate.
Table 2. Data structure illustrated with data from Sudan.
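To make the row layout concrete, the two Sudan rows described above can be written out and reshaped so that each victim group gets its own record. This is a minimal sketch: the column names (`group_1`, `targeting_1`, and so on) are hypothetical stand-ins and not the variable names in the released dataset, and the EPR identifiers are omitted because they are not reported in the text.

```python
# Wide format: one row per perpetrator-year, with numbered group columns.
# Column names are hypothetical; the values are those reported in the text.
wide_rows = [
    {"actor": "Government of Sudan", "year": 2003, "fatalities": 1776,
     "group_1": "Masalit", "targeting_1": True,
     "group_2": "Fur", "targeting_2": True,
     "group_3": "Zaghawa", "targeting_3": True},
    {"actor": "Janjaweed", "year": 2003, "fatalities": 1064,
     "group_1": "Zaghawa", "targeting_1": True,
     "group_2": "Masalit", "targeting_2": True,
     "group_3": "Fur", "targeting_3": True},
]

MAX_GROUPS = 7  # the text reports up to seven groups per perpetrator-year


def to_long(rows, max_groups=MAX_GROUPS):
    """Reshape to one record per perpetrator-year-group."""
    long_rows = []
    for row in rows:
        for i in range(1, max_groups + 1):
            group = row.get(f"group_{i}")
            if group is None:
                continue  # this perpetrator-year has fewer than i groups
            long_rows.append({
                "actor": row["actor"],
                "year": row["year"],
                "group": group,
                "targeting": row.get(f"targeting_{i}"),
            })
    return long_rows


long_rows = to_long(wide_rows)
for record in long_rows:
    print(record)
```

The fatality count is deliberately left out of the long records, since it refers to the perpetrator-year as a whole and cannot be attributed to individual groups.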
Data on violence against civilians are most likely to be under-reported and to suffer from selection bias (e.g. Gohdes and Price, 2013; Price and Ball, 2015). This potential bias applies to the reporting regarding ethnicity as well, where characteristics of the conflict environment, e.g. the lethality of the violence and victim or perpetrator characteristics, shape the information available. This bias may be particularly pronounced, for example, where a dominant ethnic majority perpetrates violence against a minority (Davenport, 2007; Gohdes, 2017). Acknowledging this uncertainty, we do not use counts of fatalities, but code dichotomous variables of whether the parties relied on violence against members of particular ethnic groups, and whether such violence was based on intentional ethnic targeting. However, there is still a risk that ethnic violence in cases of high state control over media or low international media attention goes unreported, and that ethnicity may sometimes not be reported for similar reasons. We attempt to mitigate this by using a combination of sources, i.e. news sources, international government organizations and NGO reports, and other local and regional sources. Finally, although our final data include records of ethnic victim categories and intentional targeting across a number of different contexts and actors, it is likely that ethnicity is more consistently reported (and thus coded in our data) in contexts where ethnicity is already politically salient. All of these inherent limitations should be reflected in the conclusions drawn from any analysis of the data.
EOSV descriptive trends and patterns
In this section we summarize and describe some patterns in the EOSV data. We begin by summarizing the data by perpetrator. UCDP’s original OSV dataset for the period 1989–2013 contains a total of 236 unique perpetrators of one-sided violence. Among these, EOSV identifies 152 actors (40 state actors and 112 non-state actors) that engage in one-sided violence against ethnic groups we identified. In total, there are 806 observations of active perpetrator-years in the OSV dataset; a majority of these are recorded in the EOSV dataset with information about ethnic victim groups. More specifically, 65% (525 observations) of the total 806 observations in the original OSV data are coded with ethnic victim groups. For the state actors the share is 70% and for the non-state actors the share is 63%. Table 3 summarizes the number of observations recorded with ethnic victim groups in our dataset per region. 11
Table 3. Regional summaries of EOSV actor-years.
Note: Entries give the total number of actor-years recorded with ethnic one-sided violence; the share of the total number of one-sided violence actor-years (not reported) is in parentheses.
As seen in Table 3, ethnic victim groups can be identified in one-sided violence across all regions of the world. America stands out with the lowest share of ethnic violence (48%), whereas the Middle East has the highest share (88%). Yet, since the absolute prevalence of civilian targeting is so much higher in both Africa and Asia, these latter two are the regions with the highest number of active perpetrators in absolute terms.
Focusing specifically on deliberate ethnic targeting, we identify 110 perpetrators. This corresponds to 72% of the total number of perpetrators of ethnic violence. The share of state and non-state actors engaging in deliberate ethnic targeting is approximately the same as for any ethnic violence (corresponding to 29 state actors and 81 non-state actors in total). Several of these engage in deliberate ethnic targeting across multiple years.
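The shares reported in this section follow directly from the counts given in the text, as the short check below shows. Only figures stated above are used; the helper `pct` is introduced here purely for illustration.

```python
# Counts as reported in the text.
osv_actors = 236                 # unique perpetrators in OSV, 1989-2013
eosv_state, eosv_non_state = 40, 112
eosv_actors = eosv_state + eosv_non_state          # 152
osv_actor_years, eosv_actor_years = 806, 525
targeting_state, targeting_non_state = 29, 81
targeting_actors = targeting_state + targeting_non_state  # 110


def pct(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


print(pct(eosv_actor_years, osv_actor_years))  # 65: actor-years with ethnic victim groups
print(pct(targeting_actors, eosv_actors))      # 72: perpetrators of ethnic violence that also target deliberately
print(pct(eosv_state, eosv_actors))            # 26: state share among perpetrators of ethnic violence
print(pct(targeting_state, targeting_actors))  # 26: state share among deliberate targeters
```

The last two lines show why the state and non-state shares are described as approximately the same across the two categories: states account for about 26% of actors in both.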
The EOSV dataset enables an analysis of the extent to which ethnic violence differs from one-sided violence in general. Figure 1 shows the global summary of the number of actors over time, comparing the number of EOSV actors (separating between ethnic violence and deliberate, i.e. intentional, targeting) with the total number of actors recorded in the original OSV dataset. The trends are quite similar for actors in both the ethnic and non-ethnic one-sided violence category, with a downward trend from the early 2000s and a slight upward trend after 2010. In relative terms, however, actors engaging in ethnic OSV (including deliberate targeting of ethnic victims) make up a smaller share of all actors engaged in OSV in the 2000s. In the 1990s, by contrast, actors engaged in ethnic OSV formed a much larger share of all perpetrators of OSV.

Figure 1. Temporal trend of the number of actors.
Table 4 compares the distribution of state and non-state actors more directly across our three categories of violence: one-sided violence without an ethnic victim group, 12 ethnic violence without deliberate ethnic targeting, 13 and deliberate ethnic targeting. 14 The differences between state and non-state actors are quite small. However, states engage more frequently in intentional ethnic targeting compared with non-state actors—potentially a consequential difference. While we do not record the number of fatalities per type of ethnic violence, we can note that the average annual total fatality count is about 60% higher for actors that engage in ethnic targeting compared with the average in the full OSV dataset. If actors that engage in intentional collective targeting of ethnic groups kill more civilians than others, scholars need to better understand the dynamics of ethnic violence. The EOSV dataset provides an opportunity in that direction.
Table 4. Type of violence comparing states and non-state actors.
Note: Absolute numbers refer to actor-year observations.
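The three categories compared above can be derived from the group-level codes of each actor-year. The sketch below assumes, for illustration only, that an actor-year falls into the deliberate targeting category as soon as at least one of its victim groups is coded as deliberately targeted; the exact definitions are given in the notes to the table and the codebook. The function name and input layout are hypothetical.

```python
from collections import Counter


def classify_actor_year(groups):
    """Classify one actor-year.

    `groups` is a list of (group_name, deliberately_targeted) pairs,
    empty when no ethnic victim group could be identified.
    Assumption: one targeted group is enough for the third category.
    """
    if not groups:
        return "one-sided violence without ethnic victim group"
    if any(targeted for _, targeted in groups):
        return "deliberate ethnic targeting"
    return "ethnic violence without deliberate targeting"


# Examples taken from cases discussed in the text.
sudan_government_2003 = [("Masalit", True), ("Fur", True), ("Zaghawa", True)]
ruf_sierra_leone = [("Temne", False), ("Limba", False)]
no_identified_group = []

labels = [classify_actor_year(g) for g in
          (sudan_government_2003, ruf_sierra_leone, no_identified_group)]
print(Counter(labels))
```

Tallying these labels separately for state and non-state actors would produce a cross-tabulation with the same layout as the one summarized above.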
A main advantage of the EOSV dataset is that it allows an analysis of target groups. First, we can account for the fact that perpetrators often direct violence against more than one ethnic group. Of the actors that engage in ethnic violence, states target on average 1.8 ethnic groups (with a maximum number of seven groups by the government of Myanmar), while non-state actors target on average 1.6 groups (with a maximum of five groups by the Islamic State, Syrian insurgents, the National Patriotic Front of Liberia, and the Janjaweed).
Second, since the dataset links each identified victim group to the EPR dataset, it is also possible to take the ethnic group as the analytical starting point. We have identified a total of 161 unique ethnic victim groups. Of these groups, 103 (64%) are the victims of intentional ethnic targeting by at least one actor. Since the data are time-varying, we can also observe how some groups are victims of ethnic violence over a longer period, but only deliberately collectively targeted in some of those years. Moreover, while the majority of these are exposed to violence by only one perpetrator, 35 EPR groups (22% of the total number) have been exposed to violence by both state and non-state actors. These include the groups in Darfur (Masalit, Fur, Zaghawa) mentioned earlier. The Shia Arabs and Kurds in Iraq are other examples that illustrate both the fact that groups can be targeted by several actors and that the nature of violence may change over time. Shia Arabs in southern Iraq and Kurds in northern Iraq were both targeted by the government of Iraq after their failed uprisings against the Saddam Hussein regime in March 1991. Intentional ethnic targeting was thus coded for both ethnic groups in 1991–1993. There were also state killings of Kurds in 1990, and of Shia Arabs in 1999, but neither of these is coded as deliberate ethnic targeting since both took place in the context of the suppression of protests in which other civilians also participated and were killed. More recently, the Islamic State targeted both these ethnic groups in Iraq in the period from 2004 onwards. This illustrates the complexity of violence against civilians that, in the absence of our dataset, could not have been detected or explored in a cross-national and quantitative way.
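Taking the ethnic group as the unit of analysis amounts to regrouping perpetrator-year-group records by victim group. The following sketch uses a handful of records that restate examples from the text; the record layout is hypothetical, and groups are keyed by name only for readability. In practice the EPR group identifier should be used as the key, since group names can recur across countries.

```python
from collections import defaultdict

# (actor, actor_type, year, group, deliberate_targeting)
records = [
    ("Government of Sudan", "state", 2003, "Fur", True),
    ("Janjaweed", "non-state", 2003, "Fur", True),
    ("Government of Iraq", "state", 1990, "Kurds", False),
    ("Government of Iraq", "state", 1991, "Kurds", True),
    ("Government of Iraq", "state", 1991, "Shia Arabs", True),
    ("Government of Iraq", "state", 1999, "Shia Arabs", False),
]


def summarize_groups(records):
    """Return (groups hit by both actor types,
               groups deliberately targeted in only some of their years)."""
    actor_types = defaultdict(set)
    targeted_in_year = defaultdict(dict)
    for _actor, actor_type, year, group, targeted in records:
        actor_types[group].add(actor_type)
        # A group-year counts as targeted if any actor targeted it that year.
        previous = targeted_in_year[group].get(year, False)
        targeted_in_year[group][year] = previous or targeted
    both_types = {g for g, types in actor_types.items()
                  if types == {"state", "non-state"}}
    only_some_years = {g for g, years in targeted_in_year.items()
                       if any(years.values()) and not all(years.values())}
    return both_types, only_some_years


both_types, only_some_years = summarize_groups(records)
print(sorted(both_types))
print(sorted(only_some_years))
```

With the six records above, the Fur appear as exposed to both state and non-state perpetrators, while Kurds and Shia Arabs appear as groups that experienced violence in several years but deliberate targeting in only some of them.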
EOSV applications
In this section we highlight ways in which the new data on the ethnic identity of victims of one-sided violence can be utilized. We do so, first, by discussing the compatibility of the EOSV dataset with some of the existing data sources frequently used in the study of political violence. Second, we replicate a study on civilian targeting in civil war that emphasizes the ethnic ties between warring actors and civilian constituencies. This replication relies on our new data, which enable us to test the study’s argument directly, rather than relying on spatial proxies.
The EOSV dataset can be used to answer questions about the ethnic dimensions of political violence in conjunction with several existing data sources, including those that fall within the broader set of UCDP and EPR affiliated datasets. Because the EOSV is coded based on actor-year entries in the one-sided violence dataset from the UCDP, scholars can use the EOSV dataset to empirically distinguish between perpetrator-years where we see one-sided violence with an ethnic dimension and perpetrator-years of one-sided violence where the ethnic dimension is not prevalent. The EOSV dataset supports the analysis of the determinants of civilian targeting while enabling researchers to focus exclusively on subsets of the data that correspond better to the scope conditions of their theoretical arguments (i.e. ethnic or non-ethnic violence specifically).
Researchers can also investigate whether ethnic and non-ethnic violence have the same determinants. The compatibility with the UCDP armed conflict datasets, specifically the actor identification codes, furthermore allows examination of the overlap between perpetrators of ethnic one-sided violence and civil war actors. This makes it possible to study variation in the occurrence of ethnic violence against civilians across different conflict contexts, for example in order to understand how the prevalence of deliberate ethnic targeting of civilians may vary across ethnic conflicts.
The family of EPR datasets provides information about ethnic groups’ access to state power, their settlement patterns, links to rebel organizations in the UCDP data, their trans-border ethnic kin relations, and their intra-ethnic cleavages (Vogt et al., 2015). The explicit linkage of ethnic victim groups to EPR groups in the EOSV dataset yields information about the political power status of the groups that become targets of one-sided violence. It also yields information on their cross-border linkages to other groups alongside other characteristics, such as their settlement patterns and ethnic/religious composition, which may explain their exposure to deliberate targeting. Moreover, the ACD2EPR dataset matches ethno-political groups in EPR to the rebel organizations in the UCDP intrastate armed conflict dataset (Wucherpfennig et al., 2011). The EOSV dataset complements this information by recording the ethnic affiliation of civilians that are victims of violence by these groups. When combined, scholars may thus study the relationship between ethnic group victimization and ethnic group participation in other forms of political violence, such as whether the group is a base for rebel organizations’ recruitment and support. This would for example allow researchers to study whether ethnic recruitment into armed groups makes ethnic group members more at risk of ethnic targeting, and whether ethnic victimization follows rebel group formation along ethnic lines. As such, data on the ethnic affiliation of civilian victims may provide a critical piece of information for research on how various forms of political violence—as documented by UCDP datasets—are interlinked.
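Because EPR describes groups over periods of years while EOSV records are annual, linking the two requires matching each victim-group year to the EPR period that contains it. The sketch below illustrates one way to do so in plain Python. Every identifier, field name and value in it is invented for illustration: the `epr_group_id`, `from`, `to` and `status` fields are assumptions about the layout, not the documented variable names of either dataset.

```python
# Hypothetical EPR-style rows: one row per group and period.
epr_periods = [
    {"epr_group_id": 1001, "from": 1989, "to": 1999, "status": "DISCRIMINATED"},
    {"epr_group_id": 1001, "from": 2000, "to": 2013, "status": "POWERLESS"},
]

# Hypothetical EOSV-style rows: one row per perpetrator-year-group.
eosv_rows = [
    {"actor": "Actor A", "year": 1995, "epr_group_id": 1001, "targeting": True},
    {"actor": "Actor B", "year": 2005, "epr_group_id": 1001, "targeting": False},
    {"actor": "Actor B", "year": 2005, "epr_group_id": 2002, "targeting": False},
]


def attach_status(eosv_rows, epr_periods):
    """Left-join the power status valid in the year of the violence.

    Rows without a matching EPR period keep status None, so that
    unmatched victim groups remain visible instead of being dropped.
    """
    merged = []
    for row in eosv_rows:
        status = None
        for period in epr_periods:
            same_group = period["epr_group_id"] == row["epr_group_id"]
            in_period = period["from"] <= row["year"] <= period["to"]
            if same_group and in_period:
                status = period["status"]
                break
        merged.append({**row, "status": status})
    return merged


merged = attach_status(eosv_rows, epr_periods)
for row in merged:
    print(row["actor"], row["year"], row["epr_group_id"], row["status"])
```

Keeping unmatched rows with a missing status, rather than discarding them, makes it easy to audit how many victim-group years fall outside the EPR coverage before running any analysis.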
To illustrate one potential application of our data—while also advancing an empirical debate about how ethnicity shapes wartime dynamics—we proceed by replicating a study of how ethnic ties influence patterns of civilian victimization by rebel groups and government actors in the context of civil war. In a study of one-sided violence in Sub-Saharan Africa, Fjelde and Hultman (2014) (henceforth F & H) argue that, in contexts where ethnicity is politically salient, warring actors often rely on ethnic affiliation to collectively identify groups of suspected enemy supporters, as individual wartime affiliations will often not be known. Since the combatants depend on their civilian constituencies for support, collective targeting of the enemy’s co-ethnics may be used to weaken the enemy’s capacity. Owing to the lack of data on the ethnic identity of victims of civilian targeting, F & H rely on subnational data on the geographical location of ethnic groups’ settlements as a proxy and find evidence that one-sided violence events are more likely to occur in the settlement areas of ethnic groups affiliated with the adversary (i.e. rebel-perpetrated violence in areas inhabited by government constituencies, and government-perpetrated violence in areas inhabited by rebel constituencies). The authors use, as a unit of analysis, grid cells per year and rely on geographic data to proxy the ethnic composition and the number of victims owing to one-sided violence by the government or a non-state actor. However, as recognized by the authors, the ethnic group settlement areas provide very rough approximations of the underlying link between victim populations and ethnicity, especially because most locations are ethnically heterogeneous. Several theories of ethnic violence even suggest that conflict parties may have strategic incentives to forcibly displace civilians that are not part of the ethnically dominant population through excessive violence (e.g. Posen, 1993). 
Hence, it is not possible to infer ethnic affiliation from the location of violence with a high degree of certainty. By recording the ethnic affiliation of victims of civilian abuse, the EOSV dataset allows us to assess this argument more directly.
We replicate the analysis from F & H’s study drawing on the new data on ethnic victim groups provided by the EOSV dataset. Since the argument suggests that the risk of civilian targeting will vary across groups, depending on their ties to warring actors, our unit of analysis is the ethnic group per year. We include all politically relevant ethno-political groups identified in the EPR dataset annually for the 1989–2013 period. 15 Since the argument focuses on how ethnic ties to warring actors shape incentives for civilian targeting, we limit our analysis to countries with an intrastate armed conflict that meets the 25 battle-deaths criterion of the UCDP (Pettersson and Wallensteen, 2015). Our dependent variable differs somewhat from F & H’s study. They draw on the fatality estimates from the original OSV dataset and are therefore able to model the impact of ethnic constituency links on both the overall risk of observing any civilian targeting, and how such ties influence the severity of such violence in a particular grid cell in a given year. Both of these are estimated simultaneously in a zero-inflated negative binomial model. Since precise fatality counts by victim group are not available in the EOSV data, we only consider whether or not specific ethnic groups are subject to ethnic violence or ethnic targeting. Consequently, our analyses are akin to a replication of the part of the zero-inflated negative binomial regression that models grid cells in which one-sided violence fails to occur. Relying on data from the EOSV dataset, we code our dependent variable as a dummy indicator marking those ethnic groups that are subject to any one-sided violence by rebel groups or government actors respectively. Thus, if in the F & H study we find a positive effect for a variable explaining the absence of one-sided violence, we should find a negative coefficient for the same variable in our model explaining the presence of one-sided violence.
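The sign reversal can be made explicit with a generic statement of the two models. The notation below is an assumption introduced for illustration and is not taken from either study: $Y_{it}$ is the fatality count in grid cell $i$ and year $t$, $\pi_{it}$ the probability of a structural zero, and $D_{gt}$ the dummy for ethnic group $g$ in year $t$.

```latex
% Zero-inflated negative binomial model (generic textbook form)
\begin{align}
\Pr(Y_{it}=0) &= \pi_{it} + (1-\pi_{it})\, f_{\mathrm{NB}}(0 \mid \mu_{it}, \alpha), \\
\Pr(Y_{it}=y) &= (1-\pi_{it})\, f_{\mathrm{NB}}(y \mid \mu_{it}, \alpha), \qquad y = 1, 2, \dots \\
\operatorname{logit}(\pi_{it}) &= \mathbf{z}_{it}^{\top}\boldsymbol{\gamma}, \qquad
\log \mu_{it} = \mathbf{x}_{it}^{\top}\boldsymbol{\beta}.
\end{align}
% Binary (logit) model for the presence of one-sided violence
\begin{equation}
\Pr(D_{gt}=1) = \frac{1}{1+\exp\!\left(-\mathbf{x}_{gt}^{\top}\boldsymbol{\delta}\right)}.
\end{equation}
```

In this form, a positive element of $\boldsymbol{\gamma}$ raises the probability that no violence occurs, so the matching element of $\boldsymbol{\delta}$ is expected to be negative. The correspondence is only approximate, because the count component can also produce zeros and the units of analysis differ (grid cells versus ethnic groups).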
The theoretical argument holds that perpetrators use ethnic ties between civilian constituencies and warring actors to target specific victim groups during armed conflict. To operationalize the independent variable, we follow F & H and code the Ethnic constituency of the government as ethnic groups that according to the Ethnic Power Relations dataset have either dominant or monopoly status in the control of executive power in the country (Vogt et al., 2015). To identify the Ethnic constituency of the rebel group, we use the ACD2EPR dataset that codes whether the rebel groups involved in armed intrastate conflict reported by the UCDP have ties to the particular ethnic group through recruiting significant parts of their fighting cadres from the group and/or making political claims on behalf of that particular group (Wucherpfennig et al., 2012).
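The two coding rules above can be expressed as a short sketch. The status labels are assumed spellings of the EPR power-status categories, and the two boolean inputs for the rebel tie are hypothetical field names standing in for the recruitment and claim information in ACD2EPR.

```python
# Assumed spellings of the two EPR categories that define the
# government's ethnic constituency in this operationalization.
GOVERNMENT_STATUSES = {"DOMINANT", "MONOPOLY"}

def is_government_constituency(epr_status: str) -> bool:
    """True if the group holds dominant or monopoly executive power."""
    return epr_status.strip().upper() in GOVERNMENT_STATUSES

def is_rebel_constituency(recruits_from_group: bool,
                          claims_for_group: bool) -> bool:
    """True if a rebel group recruits from the group and/or makes
    political claims on its behalf (either tie is sufficient)."""
    return recruits_from_group or claims_for_group

# Illustrative inputs only.
print(is_government_constituency("Monopoly"))        # True
print(is_government_constituency("JUNIOR PARTNER"))  # False
print(is_rebel_constituency(False, True))            # True
```

Groups that share power as senior or junior partners are deliberately left out of the government constituency here, which mirrors the narrow definition used in the text.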
The results are reported in Table 5, Models 1–4. Models 1 and 2 show how ethnic ties to the warring adversary influence the risk that ethnic group members are subject to one-sided violence (disregarding the intentionality criterion) by rebel and government actors respectively. Models 3 and 4 run the same analysis, but focus only on cases where the EOSV data report evidence of deliberate ethnic targeting of that particular group. Since our dependent variable is a dichotomous indicator, we estimate logit models, with robust standard errors clustered at the group level. As in the original study, we control for potential confounding variables, including the size of the population and economic development, OSV by the adversary at t − 1, the strength of the rebel group and previous history of ethnic targeting (using a decay function since the last observed incidence of ethnic targeting against that group). The spatial controls in the original subnational application are not part of the current analysis. 16
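The control for previous ethnic targeting can be sketched as follows. The text states only that a decay function is used, so the exponential form and the half-life of two years are assumptions chosen for illustration, and the function name is hypothetical.

```python
from typing import List, Optional

def targeting_decay(targeted: List[bool], half_life: float = 2.0) -> List[float]:
    """For each year, return 2 ** (-(years since last targeting) / half_life).

    Only earlier years count as history; the value is 0.0 for a group
    that has not yet been targeted. The functional form is an assumption.
    """
    values: List[float] = []
    last_hit: Optional[int] = None
    for t, hit in enumerate(targeted):
        if last_hit is None:
            values.append(0.0)
        else:
            values.append(2.0 ** (-(t - last_hit) / half_life))
        if hit:
            last_hit = t  # becomes history from the following year onward
    return values

# One group's yearly targeting record (invented for illustration).
history = [False, True, False, False, True, False]
decay = targeting_decay(history)
print([round(v, 3) for v in decay])
```

Using only prior years keeps the control from mechanically including the current year's outcome.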
Table 5. Analyzing ethnic targeting patterns.
Standard errors in parentheses.
*p < 0.05; **p < 0.01; ***p < 0.001.
In Table 5, Model 1, we examine how group status as government ethnic constituency influences the risk that the ethnic group will be subject to direct and deliberate killing by rebel actors. In Model 1, the coefficient for Government ethnic constituency is positive and statistically significant at the 0.05 level. This corroborates the argument that ethnic groups with identity ties to the government are more likely to become victims of direct and deliberate attacks by the rebel groups during armed conflict. The estimated effect is also non-negligible: ethnic groups with no ethnic ties to the government have a 2.2% annual predicted risk of being subject to violence by rebel groups, whereas for groups with such ties, the risk is 4.1%. 17 In the Online Appendix (Table 7) we show that this effect only materializes for African countries (as also shown by F & H). For non-African countries (not covered by F & H) we find no significant effect of rebel groups targeting civilians from the government’s ethnic group. Consequently, our EOSV data, with a broader coverage, allow us to pinpoint regional differences in the rebels’ strategies to target particular civilians.
In Model 2, we look at the risk that the ethnic group is victimized by government actors. The coefficient denoting whether the group represents a rebel group’s ethnic constituency is positive and significant at the 0.01 level, suggesting that governments have strategic incentives to target civilians whose ethnic identity ties them to non-state actors that challenge the state with arms. The estimated effect is substantive: an ethnic group with ties to a rebel constituency has a 4.7% annual predicted risk of being subject to ethnic violence, compared with a 1.15% risk for a group without such ties. This effect, as we show in the Online Appendix (Table 7), holds for both African and non-African countries.
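To put the predicted risks reported for Models 1 and 2 on a common scale, the following sketch computes the absolute difference, the risk ratio and the odds ratio for each pair. It uses only the four percentages quoted in the text; the function name is hypothetical, and the odds ratio is a descriptive summary of the two predictions rather than the coefficient from Table 5.

```python
import math

def compare_risks(p_without: float, p_with: float) -> dict:
    """Summarize the gap between two predicted annual risks."""
    def logit(p: float) -> float:
        return math.log(p / (1.0 - p))
    return {
        "difference_pp": 100.0 * (p_with - p_without),  # percentage points
        "risk_ratio": p_with / p_without,
        "odds_ratio": math.exp(logit(p_with) - logit(p_without)),
    }

# Model 1: rebel violence against groups without/with ties to the government.
rebel_violence = compare_risks(0.022, 0.041)
# Model 2: government violence against groups without/with ties to rebels.
government_violence = compare_risks(0.0115, 0.047)

for label, result in [("Model 1", rebel_violence),
                      ("Model 2", government_violence)]:
    print(label, {k: round(v, 2) for k, v in result.items()})
```

On these figures the risk roughly doubles for government constituencies (a ratio of about 1.9) and roughly quadruples for rebel constituencies (about 4.1), although both baseline risks are low in absolute terms.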
In Table 5, Models 3 and 4, we run the same models, but with the dependent variable coded as 1 only if the EOSV dataset has recorded evidence that the ethnic violence is (at least in some part) a consequence of deliberate ethnic screening on the side of the perpetrator. In other words, in addition to the identification of an ethnic victim group, we also require that our dataset records deliberate ethnic targeting. The results are similar, except that the government constituency (at least in this narrower definition, which only includes dominant or monopoly groups) is no longer a significant predictor of ethnic targeting by rebel groups. 18 Arguably, government ethnic constituencies may feature disproportionately among ethnic victim groups of rebel violence not owing to strategic collective targeting, but because of other determinants such as the proximity of settlement patterns to capital cities or economically affluent areas. The coefficient for Rebels’ ethnic constituency remains positive and statistically significant, even with this stricter definition of ethnic targeting. Hence, ethnic ties to rebel groups are important for understanding why some groups are targeted disproportionately by the government during civil wars, in both African and non-African cases (see Table 8 in the Online Appendix). 19
These results are largely consistent with the findings in the F & H study. As noted above, F & H rely on fatality estimates and therefore estimate zero-inflated negative binomial models, which comprise two dependent variables: the first corresponds to the probability of a non-occurrence and the second is a count of the number of people killed. In line with our results, they find that the link between government constituency and civilian targeting is the least robust. In their original application, government constituency is not a significant predictor of whether a one-sided violence event will occur or not—it is only a significant predictor of the number of fatalities, given that an event has occurred. Our analyses, however, show that ethnic violence by rebels does primarily focus on civilians of the government’s ethnic group, but only in the African context. When considering intentional targeting, however, we fail to find an effect regardless of the regional context.
Rebel constituency, on the other hand, is a significant predictor of an event, as well as of the number of fatalities from OSV, in the F & H study. While we cannot make a distinction between occurrence and fatalities, we find that rebel constituency remains a significant predictor of both ethnic violence and ethnic targeting, both in the African context covered by F & H and in the remaining countries. Beyond replicating the main results of F & H, our analysis also shows that the patterns largely hold in a more geographically inclusive sample that extends beyond the African context. Only with respect to the rebels’ strategy of targeting the civilian constituents of the government do we find that the results do not extend to non-African countries.
Compared with F & H’s original study, these findings provide much more direct support for the argument that ethnic ties between warring actors and civilian populations shape strategic incentives for armed actors to direct violence against civilians along ethnic lines. In the absence of a detailed coding of the target populations’ ethnicity in existing datasets, previous research has been forced to establish this linkage indirectly through the spatial correlation between aggregate ethnic settlement patterns and the incidence of one-sided violence.
Conclusion
The EOSV dataset constitutes a first effort to systematically collect data on the ethnic identity of victims of one-sided violence on a global scale across a longer time period. This represents an important contribution to conflict research as it opens up the possibility of systematically exploring a range of new research questions. Our new data resource allows researchers to analyze whether ethnic and non-ethnic violence against civilians have the same causes. The new data can be used to examine specific manifestations of one-sided violence by identifying potentially diverging causal pathways between the two. Systematic data on whether armed actors pursue ethnic claims and recruit members from specific ethnic groups have allowed researchers to address the question of whether ethnic and non-ethnic conflicts have different causes (cf. Sambanis, 2001). Hitherto, no similar dataset has allowed researchers to identify the ethnic dimension of violence against civilian populations. As a result, existing studies have evaluated theories that make predictions regarding specific types of violence using data that do not distinguish, for example, between collective violence along ethnic lines and more indiscriminate violence. Because the EOSV data are derived from the broader OSV dataset, users are able not only to systematically evaluate the difference between ethnic and non-ethnic targeting, but also to identify appropriate empirical subsets for a more precise evaluation of their theories.
Our data may furthermore help to advance the empirical rigor with which conflict scholars approach the distinction between ethnic conflict and patterns of violence within ethnic conflicts. Indeed, one shortcoming of the existing literature has been the tendency to conflate the occurrence of ethnic conflict with the occurrence of violence against civilians along ethnic lines. In some cases, researchers have grouped together processes that are characterized by different modes of organization and dynamics. Others have imposed ethnic dimensions where there are none. Some actors mobilize around ethnic cleavages without targeting civilians along these lines. Even where an overarching macro cleavage prompts the mobilization of armed actors along particular identity lines, violent targeting of civilian constituencies may not follow this cleavage (cf. Kalyvas, 2003: 475). This is ultimately an empirical question, and there is a need for a more “disaggregated analysis of the heterogenous phenomena we too casually lump together as ethnic violence” (Brubaker and Laitin, 1998: 423). The EOSV dataset allows scholars to focus on ethnic violence against civilians, for example to shed light on the variation in ethnic targeting of civilians across ethnic conflicts.
Supplemental Material
cmp836256_onlineAppendix – Supplemental material for Introducing the Ethnic One-Sided Violence dataset
Supplemental material, cmp836256_onlineAppendix for Introducing the Ethnic One-Sided Violence dataset by Hanne Fjelde, Lisa Hultman, Livia Schubiger, Lars-Erik Cederman, Simon Hug and Margareta Sollenberg in Conflict Management and Peace Science
Supplemental Material
CMPS_replication – Supplemental material for Introducing the Ethnic One-Sided Violence dataset
Supplemental material, CMPS_replication for Introducing the Ethnic One-Sided Violence dataset by Hanne Fjelde, Lisa Hultman, Livia Schubiger, Lars-Erik Cederman, Simon Hug and Margareta Sollenberg in Conflict Management and Peace Science
Supplemental Material
EOSV_june2019 – Supplemental material for Introducing the Ethnic One-Sided Violence dataset
Supplemental material, EOSV_june2019 for Introducing the Ethnic One-Sided Violence dataset by Hanne Fjelde, Lisa Hultman, Livia Schubiger, Lars-Erik Cederman, Simon Hug and Margareta Sollenberg in Conflict Management and Peace Science
Footnotes
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was funded by the Swedish Research Council, grant no. E0136501 and the Swiss Network for International Studies.