Abstract
While several existing datasets can help to address pressing questions on nonviolent resistance, data collection on nonviolent conflict involves several distinct challenges, including (1) conceptual distinctions between the absence of violence, non-violent behavior, and nonviolent direct action; (2) a systematic violence bias in mainstream news reports; and (3) incentives to misrepresent. As a way forward, we advocate (1) collecting data at multiple temporal and purposive units; (2) diversifying source materials; and (3) coding ambiguity as a meaningful substantive variable.
Introduction
There is increasing interest in using quantitative methods to meaningfully test hypotheses related to nonviolent methods of contention (e.g. Chenoweth & Stephan, 2011; Cunningham, 2013). While several existing datasets can help to address pressing questions (Asal, Pate & Wilkenfeld, 2008; Chenoweth & Lewis, 2013; Hendrix & Salehyan, 2013), data collection on nonviolent conflict involves several distinct challenges that can significantly impact the reliability of findings. Without reliable data, scholars are unable to fully understand the risks of nonviolent action in repressive contexts, how nonviolent campaigns unfold, how civilians use nonviolent action to protect themselves in armed conflict, or how such campaigns can play a role in initiating civil war termination. In this note we briefly discuss these challenges and offer a framework for performing high-quality data collection on strategic nonviolent conflict.
Challenge 1: Conceptualizing forms of action
The first challenge is distinguishing and defining nonviolent action. Traditionally, conflict datasets have only collected data on violent events, assuming that an absence of violence is equivalent to a lack of conflict. Other datasets examine non-violent behavior, such as protest events (e.g. Hendrix & Salehyan, 2013), or the activities of non-violent organizations (e.g. Asal, Pate & Wilkenfeld, 2008). In these datasets ‘non-violent’ is used simply as a negative term indicating that the behavior is not violent. However, the lack of violence in such events may be incidental. Far fewer datasets examine nonviolent action as a distinct category of behavior: civilian-led action in which unarmed persons confront opponents using coordinated, purposive sequences of nonviolent methods. As Sharp (1973: 65) notes, nonviolent action’s first key characteristic is a decision to address a conflict through active struggle, rather than passive submission. There are thus important distinctions not only between violence and nonviolence, but also between non-violent inaction and nonviolent contentious action.
These conceptual distinctions are important. Research suggests that coordination, organization, and tactical sequencing of nonviolent actions have important effects on how campaigns unfold, are affected by repression, and ultimately succeed or fail (Schock, 2005; Chenoweth & Stephan, 2011). It is therefore crucial to distinguish between the simple lack of violence, spontaneous protests that happen to be non-violent, and strategic nonviolent action. Scholars often conflate the latter two categories, with most events datasets identifying discrete protest incidents without considering the strategic linkages between protests and other influential but less visible tactics such as boycotts or stay-aways (Schock, 2005).
Challenge 2: The violence bias
The second major challenge emerges from the source material. Most protest events datasets rely on mainstream news sources (e.g. Herkenrath & Knoll, 2011), yet many studies show that these sources tend to overreport violent events (Barranco & Wisler, 1999; Earl et al., 2004). Scott (2001) also shows evidence of a saturation effect, where more terrorist attacks in one day result in less print space for other stories. As a result, newspapers systematically underreport nonviolent activity – especially dispersed tactics such as boycotts. Even reported nonviolent action is often relegated to brief mentions at the end of articles emphasizing violence.
Moreover, journalists may have different personal standards for which behaviors constitute violence. Observers may interpret and report highly disruptive incidents or extralegal behavior as violent. Reporters may also describe purely nonviolent events as ‘riots’ in headlines, particularly in older news sources (Banks, Overstreet & Muller, 2004). Many protests involve occasional spontaneous violence that is incidental to the overall event, while some overwhelmingly nonviolent campaigns have more routinized violent flanks. Thus there are often gray areas between ‘nonviolent’ and ‘violent’ behavior. Yet when violence occurs, standard media practices often make it unclear whether activists or opponents initiated the violence. This is especially problematic because subjective interpretations are likely non-random, resulting instead from the particular biases of different news bureaus (Barranco & Wisler, 1999).
Challenge 3: Incentives to misrepresent
Source material on nonviolent conflict can be subject to deliberate misreporting by conflict actors. Highly conflicting reports of protests are the norm, particularly with regard to (1) how many people participate (a central indicator of nonviolent campaign viability), (2) the use of violence by state actors or violent flanks, and (3) people wounded or killed, both in terms of numbers and roles in the conflict.
Professional incentives lead mainstream journalists to rely on police or state elite sources, which routinely downplay protest participation, minimize or deny repression, and emphasize violent elements of otherwise peaceful events. Activists, on the other hand, have a converse reliability problem, with strong incentives to overcount participation, understate violence perpetrated by movement participants, and overreport incidents of brutality by security forces.
The Syrian civil war illustrates the theoretical and practical stakes of these three challenges. Status quo collection methods would lead us to believe that nonviolent activism in Syria disappeared as civil war escalated, which is empirically false (Benedict, 2013). Yet under conditions of repression, high fatality rates, and active armed conflict, scholars have historically failed to systematically document observed nonviolent actions. As a result, existing scholarship has little to say about whether and how civilians can use nonviolent action to alter the course of civil conflict.
Suggested practices for data collection on nonviolent action
We advance a three-part framework for researchers to address the three challenges. First, we suggest collecting data at multiple levels of aggregation; second, we argue for more creative use of source materials; and third, we argue that researchers should code competing accounts of events to generate an ambiguity value for each observation.
Recommendation 1: Multilevel data collection
A key way to overcome concerns about both conceptual clarity and source reliability is to collect data at multiple levels of aggregation. With existing datasets we can observe nonviolent action at the level of incident or various temporal aggregations of incidents (e.g. Hendrix & Salehyan, 2013), campaign (e.g. Chenoweth, 2011), or campaign-year (e.g. Chenoweth & Lewis, 2013). Temporal aggregation is especially helpful in linking campaign events that might otherwise get lost in event data. For example, nonviolent action regularly declines during winter months when weather conditions prohibit marches and demonstrations. Piecing together these ebbs and flows is critical in determining continuities between events that otherwise might not be obvious (Schweingruber & McPhail, 1999).
Table I. Sample categories of nonviolent action
Creating civil resistance category maps may be a useful way to represent the above-mentioned typologies. Table I represents four different categories of nonviolent incidents based on their temporal, purposive, and spatial dimensions. Here one can distinguish ‘episodes’ of contention, which may be improvisational and disorganized, from ‘campaigns’ of nonviolent action, which are more deliberate, organized, and coordinated.
Researchers can further classify types of incidents according to the specific goals of the action (e.g. reformist, territorial, economic, identity politics, etc.), or by levels of participation, whether by threshold, range, or estimation (e.g. 25+; 1,000–10,000; ‘millions’).
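To make the multilevel idea concrete, the following is a minimal sketch of how incident-level records might be rolled up into campaign-year summaries. The record fields, the tactic labels, and the function name `aggregate_by_campaign_year` are illustrative assumptions rather than the schema of any existing dataset; participation is stored as a lower and an optional upper bound so that thresholds (25+), ranges (1,000–10,000), and open-ended estimates can share one representation.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Incident:
    """One incident-level observation (all field names are hypothetical)."""
    campaign: str                            # campaign label assigned by the coder
    day: date                                # date of the incident
    tactic: str                              # e.g. "march", "boycott", "stay-away"
    participants_low: Optional[int] = None   # threshold or lower bound, if reported
    participants_high: Optional[int] = None  # upper bound; None if open-ended


def aggregate_by_campaign_year(incidents):
    """Roll incident records up to one summary per (campaign, year)."""
    summary = defaultdict(
        lambda: {"incidents": 0, "tactics": set(), "months_active": set()}
    )
    for inc in incidents:
        row = summary[(inc.campaign, inc.day.year)]
        row["incidents"] += 1
        row["tactics"].add(inc.tactic)
        # Recording active months exposes seasonal lulls within a campaign.
        row["months_active"].add(inc.day.month)
    return dict(summary)
```

Keeping the incident records alongside the aggregated summaries lets an analyst move between levels: a gap in `months_active` can be read as a seasonal lull within a continuing campaign rather than as the campaign's end.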
Recommendation 2: Making use of self-reports and other sources
Triangulating multiple sources can help overcome underreporting biases and may allow researchers to better distinguish spontaneous non-violent behavior from organized civil resistance. Other conflict datasets are increasingly turning to activist self-reports, ethnographic research, eyewitness accounts, surveys, and interviews to confront underreporting (Loyle, Sullivan & Davenport, 2014; Schrodt, 2012). Researchers can leverage these direct sources against violence-centric media reports and state-supplied information. Since, as mentioned above, the reporting biases of activists and eyewitnesses contrast with those of mainstream media and official state reports, including these sources can reduce the absolute amount of bias, offset its direction, and limit the extent to which it is systematic. It could also prove helpful to investigate declassified documents, archives, and memoirs. This practice is especially beneficial if coders clearly identify the source of the content and allow end-users to filter out certain source types (Schrodt, 2012).
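As an illustration of source tagging, the sketch below attaches a source type to every coded report and lets an end-user exclude the types they do not wish to rely on. The `Report` fields, the source-type labels, and the `filter_reports` helper are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    """One coded account of an event (field names are assumptions)."""
    event_id: str
    source_type: str   # e.g. "state", "mainstream_media", "activist", "eyewitness"
    participants: int  # participation figure given by this source


def filter_reports(reports, exclude_source_types=()):
    """Return the reports whose source type is not in the excluded set."""
    excluded = set(exclude_source_types)
    return [r for r in reports if r.source_type not in excluded]
```

A user skeptical of official figures could call `filter_reports(reports, exclude_source_types=["state"])`, while another user could keep every account and compare them directly.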
Recommendation 3: Taking incentives to misrepresent seriously
Since conflicting accounts occur routinely, we suggest coding competing accounts and including ambiguity itself as a variable. Many datasets include reliability scores for different variables – scores that users often interpret as indicators of missing information, ‘best guess’ estimates, competing accounts, or vague source material. We suggest instead creating an ambiguity range score, which would measure the distance between different official, eyewitness, or journalist accounts. See Table II for an example.
Table II. Sample ambiguity coding of reported protest with lethal ‘clashes’
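One simple way to operationalize such a score, offered here only as a sketch, is to take the gap between the highest and lowest value reported across competing accounts. Defining the distance as maximum minus minimum is our assumption for illustration; the function name `ambiguity_range` and the source labels are hypothetical, and any figures passed to it in practice would come from the coded accounts themselves.

```python
def ambiguity_range(estimates):
    """Distance between the highest and lowest reported values.

    `estimates` maps a source label (e.g. "official", "eyewitness",
    "journalist") to that source's reported figure, or None if the
    source gave no figure. Returns None when fewer than two accounts
    report a value, since no distance can then be computed.
    """
    values = [v for v in estimates.values() if v is not None]
    if len(values) < 2:
        return None
    return max(values) - min(values)
```

A score of zero indicates that all reporting sources agree, while a large score flags an observation whose participation or casualty figures are contested and may warrant separate treatment in analysis.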
Conclusion
The framework we suggest departs from many current practices in that it requires a degree of interpretive expertise that makes automated coding impractical. While automated coding provides advantages of speed and consistency (Schrodt, 2012), it cannot yet deal with the level of complexity involved in identifying the differences between the absence of violence, non-violent behavior, and civil resistance campaigns. Human researchers, on the other hand, have the ability to confront this complexity. Researchers must make decisions about how to adjudicate gray areas, develop clear codebooks, and train human coders so that judgment calls are minimal. Although collecting detailed data on nonviolent action is time-intensive, the theoretical and empirical contribution to the field will be considerable.
Acknowledgements
The authors thank the editors and anonymous reviewers for helpful comments. Remaining errors are our own. Equal authorship is implied.
