Abstract
A growing number of datasets collect information reflecting the behavior and characteristics of contentious and violent organizations. Most of these datasets are not arranged so that they can be easily combined for analytical purposes. In particular, there is a lack of common identifiers for organizations. This creates a great deal of duplication of effort as well as many potential stumbling blocks for those who would like to combine information from different datasets for new analyses or research questions. We describe an existing effort to overcome these challenges and link organizations across datasets, the Terrorist Organization (TORG) crosswalk. We further propose a Shared Contentious Actor Registry (SCAR) to support a more comprehensive and concerted effort to meet the challenges in linking data.
Introduction
Recent research has highlighted the importance of agency and choices in contentious politics (e.g. Cederman & Gleditsch, 2009). The growing body of quantitative data reflecting organizations involved in contentious behavior such as protest, insurgencies, and terrorism opens new avenues for research on agency in conflict. Researchers have examined conditions that make non-state actors more likely to combine conventional military attacks and terrorist attacks (Findley & Young, 2012), or transitions from nonviolent protest to use of violence, and vice versa (e.g. Lichbach, 1987; Moore, 1998; Regan & Norton, 2005). Yet, despite the theoretical emphasis on agency, many empirical analyses remain limited to aggregate country-level measures that disregard the specific organizations. For example, the Democratic League of Kosovo headed a nonviolent campaign against Serb-dominated rule, while the Kosovo Liberation Army chose a violent strategy. The two organizations had an antagonistic relationship, and it would be misleading to conflate the two or treat this as a case of an actor shifting strategies. We posit that the lack of attention to specific actors follows from the practical problems in identifying and integrating data on organizational behavior found in disparate datasets. We provide an overview of these challenges, and demonstrate how a current set of organizational connectors – the Terrorist Organization (TORG) crosswalk – can help overcome them. We then outline how the TORG can provide a starting point for a more comprehensive Shared Contentious Actor Registry (SCAR).
Linking actors in contentious politics data
We first review different structures in existing contentious politics data. The first category identifies conflict periods, with start and end dates, and participant information. The UCDP/PRIO Armed Conflict Dataset identifies post-1945 conflicts with at least 25 battle-related deaths per calendar year (see Gleditsch et al., 2002; Themnér & Wallensteen, 2014), while the UCDP Actor Dataset (UCDP, 2014) identifies actors across all the UCDP conflict datasets. The Nonviolent and Violent Campaigns and Outcomes (NAVCO) data track nonviolent campaigns since 1900 in which non-state actors make maximalist claims on the state (Chenoweth & Lewis, 2013). A second category reports individual violent events. The Global Terrorism Database (GTD) tracks terrorist attacks by date and location since 1970, the groups associated with each attack, the targets, and related event data (LaFree & Dugan, 2007). Datasets such as the Social Conflict in Africa Database (SCAD) track an even broader range of events, including protests, riots, strikes, intercommunal conflict, and government violence against civilians (Salehyan et al., 2012). Finally, a third category focuses on the attributes of conflicts and contentious groups, including background on the actors involved, conflict termination, and links to ethnic groups (see Cunningham, Gleditsch & Salehyan, 2009; Kreutz, 2010; Wucherpfennig et al., 2012). The Big Allied and Dangerous (BAAD) dataset provides data on violent organizations, including links to other groups and states (Asal & Rethemeyer, 2008). Similarly, Minorities at Risk Organizational Behavior (MAROB) provides background on ethno-political organizations in the Middle East (Asal, Pate & Wilkenfeld, 2008).
Periodic, event, and attribute data can all help support testing a wide range of theories on contentious politics, but answers to new questions often require data from multiple sources. For example, assessing the overlap between conventional civil war and terrorist attacks needs to go beyond whether events overlap in space and time and also consider whether the same actors carry out the events (e.g. Findley & Young, 2012). Likewise, studying transitions from violent to nonviolent actions (and vice versa) should distinguish between cases where the same actors make different choices over time and cases where transitions at the conflict level reflect the emergence of new organizations engaging in different tactics (e.g. Moore, 1998). Such analyses are often very difficult to carry out because significant effort is required to align disparate data formats and structures.
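To make the actor-level comparison concrete, the following is a minimal sketch of linking event records to conflict-period records by actor rather than only by place and time. It assumes that both sources have already been mapped to a shared actor identifier, which is what a crosswalk would supply; the tuple layouts, the function name, and the example IDs are hypothetical and do not reflect the format of any dataset named above.

```python
def events_during_conflict(events, periods):
    """Return the events whose actor was in an active conflict period that year.

    events:  iterable of (actor_id, year) tuples, e.g. from an event dataset
    periods: iterable of (actor_id, start_year, end_year) tuples, e.g. from a
             conflict-period dataset; both year bounds are inclusive
    """
    # Group conflict periods by the shared actor identifier.
    by_actor = {}
    for actor_id, start, end in periods:
        by_actor.setdefault(actor_id, []).append((start, end))

    # Keep an event only if the same actor has a period covering that year.
    return [
        (actor_id, year)
        for actor_id, year in events
        if any(start <= year <= end for start, end in by_actor.get(actor_id, []))
    ]


# Hypothetical records: actor 1 fights a conflict in 1998-2000; actor 2 does not.
EXAMPLE_EVENTS = [(1, 1999), (1, 2005), (2, 1999)]
EXAMPLE_PERIODS = [(1, 1998, 2000)]
MATCHED = events_during_conflict(EXAMPLE_EVENTS, EXAMPLE_PERIODS)
```

In this sketch the 1999 event by actor 2 is excluded even though it coincides in time with actor 1's conflict, which is the distinction between temporal overlap and same-actor overlap drawn above.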
Working with country-level data is relatively easy, given work on standardized lists of countries and identifying codes (see Gleditsch & Ward, 2001; Russett, Singer & Small, 1968), but we lack an analogous system for organizations. This creates many practical problems (and thus potential for error) when attempting to assess theories of conflict focusing on agency or organizations. Of course, there may be many differences in what each project considers the ‘relevant’ actors given its theoretical focus, but even when project scopes overlap, determining alignment can be challenging. It is common for groups to change ‘official’ names or adopt nicknames (or be given them by outsiders), and both journalistic and scholarly sources often use different names to refer to the ‘same’ group. This can be as prosaic as the varied use of full names, abbreviations or acronyms, but is compounded by enormous diversities in language, translation, and transliteration. Shifting factions, partnerships, and mergers are also common. Some sources identify organizations with multiple factions as umbrella groups, while others consider some subgroups to be distinct (e.g. political and military wings). Different groups often have the same (or quite similar) names, and sources do not always clearly distinguish ambiguous cases. Finally, reporting on group activities and interactions is often inconsistent. Events are sometimes misattributed, and it is not uncommon for journalists and scholars to use ‘older’ names to refer to successor groups. In these ways and more, accurately identifying specific organizations can prove exceptionally challenging.
Terrorist organizations (TORG) crosswalk
Table I. Differences in inclusion and naming criteria, coverage
Numerical id: ISO a-3 code of active locales (years of activity). Values of ‘−99’ signify that a given organizational name is not tracked by the associated dataset (e.g. EX-FAR, Interhamwe, and PALIR are not included in the GTD). As a crosswalk connecting constituent datasets, TORG has no independent data on active locales or years of activity.
Table II. Differences in inclusion and naming criteria, coverage
Numerical id: ISO a-3 code of active locales (years of activity). Values of ‘−99’ signify that a given organizational name is not tracked by the associated dataset (e.g. ISI is not included in TOPs). As a crosswalk connecting constituent datasets, TORG has no independent data on active locales or years of activity.
A related challenge stems from the way datasets refer to distinct organizations. GTD and START’s Profiles of Incidents Involving CBRN by Non-State Actors (POICN) database each publish a single name for each group, whereas TOPs, UCDP/PRIO, and MAROB include multiple aliases associated with a primary group name. A prime example is the Iraqi militant group known variously as ‘Tawhid and Jihad’, ‘Al-Qaeda in Iraq’, and the ‘Islamic State of Iraq and the Levant’, all identified as aliases of a single organization by UCDP/PRIO (see Table II). Similarly, TOPs lists ‘Tawhid and Jihad’ and ‘Al-Qaeda in Iraq’ as two (of eight total) alternative names. However, as an event-centric dataset, GTD identifies each group separately (temporal overlap across group names is likely an artifact of journalistic reporting on attack attribution). Finally, MAROB does not track any groups using any of these names. Overall, we have discovered nearly 40 associated name variants. Unfortunately, naming differences across datasets are very common – approximately 40% of TORG groups have multiple names associated with them (although this includes minor differences in spelling and punctuation).
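The alias problem can be illustrated with a minimal sketch of a name-to-identifier lookup. This is not the TORG implementation: the shared ID `1001`, the row layout, and the `normalize`, `build_index`, and `lookup` helpers are all hypothetical. Only the three group names and the use of −99 for names a dataset does not track are taken from the text and table notes.

```python
import unicodedata

NOT_TRACKED = -99  # sentinel the table notes use for "name not tracked"


def normalize(name: str) -> str:
    """Fold case, accents, and punctuation so minor spelling variants collide."""
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    cleaned = "".join(ch if ch.isalnum() else " " for ch in ascii_only.lower())
    return " ".join(cleaned.split())


# Hypothetical crosswalk rows: (published name, source dataset, shared ID).
# The shared ID 1001 is invented for illustration only.
ROWS = [
    ("Tawhid and Jihad", "GTD", 1001),
    ("Al-Qaeda in Iraq", "GTD", 1001),
    ("Islamic State of Iraq and the Levant", "UCDP/PRIO", 1001),
]


def build_index(rows):
    """Map every normalized alias to the shared organization ID."""
    return {normalize(name): shared_id for name, _source, shared_id in rows}


def lookup(index, name):
    """Return the shared ID for a name, or NOT_TRACKED if no alias matches."""
    return index.get(normalize(name), NOT_TRACKED)


INDEX = build_index(ROWS)
```

Here `lookup(INDEX, "AL QAEDA IN IRAQ")` and `lookup(INDEX, "Tawhid and Jihad")` resolve to the same hypothetical ID, while an unlisted name returns −99. Normalization of this kind only absorbs differences in case, accents, and punctuation; aliases that differ substantively still have to be entered as separate rows by a coder.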
Table III. Records and overlap between datasets in TORG (v. 2014 2.0)
The current version of TORG (v. 2014 2.0) identifies 2,750 distinct organizations and 6,125 names. Since many other data sources use actor codes from one of the core sources, the crosswalk will also link to other data sources through other data connectors such as the ACD2EPR data, which link organizations in UCDP/PRIO to the ethnic groups in the Ethnic Power Relations data through explicit claims or recruitment (see Wucherpfennig et al., 2012). Table III indicates the number of unique organizations in the constituent datasets as well as overlap across datasets. It is clear that many organizations (204) engage in both terrorist activities and conventional armed conflict, as classified by GTD and UCDP/PRIO, respectively. Just as relevant are the non-overlapping records. In particular, we can see that ‘organizations in conventional civil war’ are a relatively small proportion of all the groups in TORG (204/2,750 = 7%), consistent with the idea that only stronger groups have realistic prospects to engage in conventional warfare. Conversely, a large proportion of groups involved in armed conflicts do not engage in terrorism or indirect targeting, including revolutionary groups and drug cartels (137/341 = 40%). Yet, the fact that most UCDP/PRIO groups also appear in GTD (204/341 = 60%) suggests that terrorism (at least as defined by GTD) is common in most civil wars.
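The proportions in this paragraph can be recomputed from the three counts given in the text. The sketch below reads 341 as the number of UCDP/PRIO organizations in TORG, which the ratios imply but the text does not state outright; the variable names are illustrative.

```python
# Counts taken from the text (TORG v. 2014 2.0).
TOTAL_TORG = 2750    # distinct organizations in TORG
UCDP_GROUPS = 341    # assumed: UCDP/PRIO organizations in TORG
IN_BOTH = 204        # organizations appearing in both GTD and UCDP/PRIO

# Armed-conflict groups with no GTD record.
UCDP_ONLY = UCDP_GROUPS - IN_BOTH

# Shares, expressed as fractions of the relevant total.
SHARE_OVERLAP_OF_TORG = IN_BOTH / TOTAL_TORG      # overlap as a share of all TORG groups
SHARE_UCDP_IN_GTD = IN_BOTH / UCDP_GROUPS         # armed-conflict groups also in GTD
SHARE_UCDP_NOT_IN_GTD = UCDP_ONLY / UCDP_GROUPS   # armed-conflict groups absent from GTD

for label, share in [
    ("overlap / all TORG groups", SHARE_OVERLAP_OF_TORG),
    ("UCDP/PRIO groups also in GTD", SHARE_UCDP_IN_GTD),
    ("UCDP/PRIO groups not in GTD", SHARE_UCDP_NOT_IN_GTD),
]:
    print(f"{label}: {share:.0%}")
```

Under that reading, 137 armed-conflict groups have no GTD record, and the two UCDP/PRIO shares are complements of each other.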
Beyond TORG: A Shared Contentious Actor Registry
Although the TORG crosswalk can support many research questions, its focus is limited to terrorist organizations and groups engaging in violent conflict leading to 25 or more deaths per year. As such, it is not well suited to inform research on conflict beyond violence, such as when actors refrain from violence, or substitution between violent and nonviolent tactics. To broaden the focus, we propose a more comprehensive Shared Contentious Actor Registry (SCAR) to identify non-state organizations in both violent and nonviolent contentious politics. Just as the TORG data allow studying the relationship between terrorism and civil war, a broader framework of organizational identifiers would aid the study of contentious organizations and the integration of data beyond violent conflict. We hope to achieve this through an advisory board, tasked with creating a comprehensive online registry of organizational names and associated reference identifiers for existing relevant data sources.
Many of the problems of classifying groups arise from changing interactions over time, from ‘rebranding’ efforts to factional splits and the formation of stable coalitions. Properly accounting for these will require explicit protocols for attributing characteristics, events, and relationships to factions and splinter groups over time. Since umbrella groups may be significant in planning and resources (leaving ‘implementation’ to member groups) there may be value in identifying these separately. Accurate accounting of variations in intra- and intergroup relationships will require an efficient means of capturing longitudinal relational data, capable of supporting automated matching with external datasets. Ultimately, we hope to support queries for individual groups, identifiers, locales, and date ranges via a web-based interface.
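One possible way to represent such longitudinal relational data is as dated links between organizations that can be queried by group and date range. The following is a minimal sketch under that assumption, not a proposed SCAR schema: the `Relation` record, its field names, the relation kinds, and the example groups are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Relation:
    """A dated link between two organizations (all field names hypothetical)."""
    source: str                 # e.g. the faction or successor group
    target: str                 # e.g. the umbrella or parent group
    kind: str                   # e.g. "faction_of", "splinter_of", "renamed_from"
    start: int                  # first year the relation holds
    end: Optional[int] = None   # last year it holds; None means ongoing


def relations_in(relations: List[Relation], group: str,
                 first_year: int, last_year: int) -> List[Relation]:
    """Relations involving `group` that overlap the inclusive year range."""
    return [
        r for r in relations
        if group in (r.source, r.target)
        and r.start <= last_year
        and (r.end is None or r.end >= first_year)
    ]


# Invented example: a wing belongs to an umbrella group, later splits off.
EXAMPLE_RELATIONS = [
    Relation("Group A Military Wing", "Group A", "faction_of", 1990, 1997),
    Relation("Group A Military Wing", "Group A", "splinter_of", 1998),
    Relation("Group B", "Group C", "renamed_from", 2001, 2001),
]
```

Storing start and end years on each link, rather than on the organization record, keeps a split or rename from overwriting the earlier relationship, so a query for 1995 and a query for 2005 can return different answers for the same pair of groups.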
Conclusion
The growing number of datasets on contentious political organizations has the potential to transform the study of political conflict, but this is hampered by the lack of a common organizational framework. Cooperative efforts that help users combine and take advantage of existing coding efforts can help prevent duplication of effort and facilitate new opportunities for research. We offer the TORG crosswalk both as a proof of concept and as a starting point for a more ambitious and inclusive ‘Rosetta Stone’ of contentious politics.
Acknowledgements
We thank Idean Salehyan as well as the anonymous reviewers for their helpful suggestions. Gleditsch is grateful for support from the Research Council of Norway (213535/F10) and the European Research Council (313373). Support for this research was also provided by the Science and Technology Directorate of the US Department of Homeland Security (grant number 2008ST061ST0004) through the National Consortium for the Study of Terrorism and Responses to Terrorism (START). The authors are listed alphabetically; equal authorship implied.
