Collecting data on nonviolent action

Abstract

While several existing datasets can help to address pressing questions on nonviolent resistance, data collection on nonviolent conflict involves several distinct challenges, including (1) conceptual distinctions between the absence of violence, non-violent behavior, and nonviolent direct action; (2) a systematic violence bias in mainstream news reports; and (3) incentives to misrepresent. As a way forward, we advocate (1) collecting data at multiple temporal and purposive units; (2) diversifying source materials; and (3) coding ambiguity as a meaningful substantive variable.

Keywords

civil resistance, dissent, nonviolence, nonviolent action, protest, repression

Introduction

There is increasing interest in using quantitative methods to test hypotheses about nonviolent methods of contention (e.g. Chenoweth & Stephan, 2011; Cunningham, 2013). While several existing datasets can help to address pressing questions (Asal, Pate & Wilkenfeld, 2008; Chenoweth & Lewis, 2013; Hendrix & Salehyan, 2013), data collection on nonviolent conflict involves several distinct challenges that can significantly affect the reliability of findings. Without reliable data, scholars cannot fully understand the risks of nonviolent action in repressive contexts, how nonviolent campaigns unfold, how civilians use nonviolent action to protect themselves in armed conflict, or how such campaigns can help bring civil wars to an end. In this note we briefly discuss these challenges and offer a framework for high-quality data collection on strategic nonviolent conflict.

Challenge 1: Conceptualizing forms of action

The first challenge is distinguishing and defining nonviolent action. Traditionally, conflict datasets have collected data only on violent events, assuming that an absence of violence is equivalent to a lack of conflict. Other datasets examine non-violent behavior, such as protest events (e.g. Hendrix & Salehyan, 2013), or the activities of non-violent organizations (e.g. Asal, Pate & Wilkenfeld, 2008). In these datasets ‘non-violent’ is used simply as a negative term indicating that the behavior is not violent. However, the lack of violence in such events may be incidental. Far fewer datasets examine nonviolent action as a distinct category of behavior: civilian-led action in which unarmed persons confront opponents using coordinated, purposive sequences of nonviolent methods.¹ As Sharp (1973: 65) notes, nonviolent action’s first key characteristic is a decision to address a conflict through active struggle, rather than passive submission. There are thus important distinctions not only between violence and nonviolence, but also between non-violent inaction and nonviolent contentious action.

These conceptual distinctions are important. Research suggests that coordination, organization, and tactical sequencing of nonviolent actions have important effects on how campaigns unfold, are affected by repression, and ultimately succeed or fail (Schock, 2005; Chenoweth & Stephan, 2011). It is therefore crucial to distinguish between the simple lack of violence, spontaneous protests that happen to be non-violent, and strategic nonviolent action. Scholars often conflate the latter two categories, with most events datasets identifying discrete protest incidents without considering the strategic linkages between protests and other influential but less visible tactics such as boycotts or stay-aways (Schock, 2005).

Challenge 2: The violence bias

The second major challenge emerges from the source material. Most protest events datasets rely on mainstream news sources (e.g. Herkenrath & Knoll, 2011), yet many studies show that these sources tend to overreport violent events (Barranco & Wisler, 1999; Earl et al., 2004). Scott (2001) also shows evidence of a saturation effect, where more terrorist attacks in one day result in less print space for other stories. As a result, newspapers systematically underreport nonviolent activity – especially dispersed tactics such as boycotts. Even reported nonviolent action is often relegated to brief mentions at the end of articles emphasizing violence.

Moreover, journalists may have different personal standards for which behaviors constitute violence. Observers may interpret and report highly disruptive incidents or extralegal behavior as violent. Reporters may also describe purely nonviolent events as ‘riots’ in headlines, particularly in older news sources (Banks, Overstreet & Muller, 2004). Many protests involve occasional spontaneous violence that is incidental to the overall event, while some overwhelmingly nonviolent campaigns have more routinized violent flanks. Thus there are often gray areas between ‘nonviolent’ and ‘violent’ behavior. Even when violence clearly occurs, standard media practices often leave it unclear whether activists or opponents initiated it. This is especially problematic because subjective interpretations are likely non-random, resulting instead from the particular biases of different news bureaus (Barranco & Wisler, 1999).

Challenge 3: Incentives to misrepresent

Source material on nonviolent conflict can be subject to deliberate misreporting by conflict actors. Highly conflicting reports of protests are the norm, particularly with regard to (1) how many people participate (a central indicator of nonviolent campaign viability), (2) the use of violence by state actors or violent flanks, and (3) people wounded or killed, both in terms of numbers and roles in the conflict.

Professional incentives lead mainstream journalists to rely on police or state elite sources, which routinely downplay protest participation, minimize or deny repression, and emphasize violent elements of otherwise peaceful events. Activists, on the other hand, have a converse reliability problem, with strong incentives to overcount participation, understate violence perpetrated by movement participants, and overreport incidents of brutality by security forces.

The Syrian civil war illustrates the theoretical and practical stakes of these three challenges. Status quo collection methods would lead us to believe that nonviolent activism in Syria disappeared as civil war escalated, which is empirically false (Benedict, 2013). Yet under conditions of repression, high fatality rates, and active armed conflict, scholars have historically failed to systematically document observed nonviolent actions. As a result, existing scholarship has little to say about whether and how civilians can use nonviolent action to alter the course of civil conflict.

Suggested practices for data collection on nonviolent action

We advance a three-part framework for researchers to address the three challenges. First, we suggest collecting data at multiple levels of aggregation; second, we argue for more creative use of source materials; and third, we argue that researchers should code competing accounts of events to generate an ambiguity value for each observation.

Recommendation 1: Multilevel data collection

A key way to overcome concerns about both conceptual clarity and source reliability is to collect data at multiple levels of aggregation. With existing datasets we can observe nonviolent action at the level of incident or various temporal aggregations of incidents (e.g. Hendrix & Salehyan, 2013), campaign (e.g. Chenoweth, 2011), or campaign-year (e.g. Chenoweth & Lewis, 2013). Temporal aggregation is especially helpful in linking campaign events that might otherwise get lost in event data. For example, nonviolent action regularly declines during winter months when weather conditions prohibit marches and demonstrations. Piecing together these ebbs and flows is critical in determining continuities between events that otherwise might not be obvious (Schweingruber & McPhail, 1999).

Importantly, however, researchers cannot simply aggregate incident-level data and assume that higher frequency of incidents indicates that a major campaign is ongoing. For example, from 2008 to 2010, the Social Conflict in Africa Database (SCAD) identifies 27 ‘social conflict’ events in Madagascar (where massive antigovernment protests led to the downfall of the government) but 77 events in South Africa (where no major campaign was ongoing).² Researchers must instead ascertain whether different episodes are purposively linked. This requires identifying the goals pursued and actors involved in each action. Similarity among organizational and actor profiles across events often indicates high levels of coordination. A model in this regard is the Social Conflict in Africa Database, where each event record codes the actors involved, the targets, the issue at stake, and the degree to which the action is organized or spontaneous (Hendrix & Salehyan, 2013). This practice, which we hope will become standard, allows users to aggregate events based on both purpose and actors to detect coherent campaigns.

Table I. Sample categories of nonviolent action

Event type	Temporal unit	Purposive unit	Spatial unit	Example
Improvised event	Short (e.g. 1 day)	Improvised	Dispersed	Day of rage
Improvised episode	Medium (e.g. 1 week)	Improvised	Concentrated	Tulip Revolution
Coordinated event	Short (e.g. 1 day)	Coordinated	Varies	Million Woman March
Coordinated campaign	Varies (e.g. 1 week–years)	Coordinated	Concentrated	Bulldozer Revolution

Creating civil resistance category maps may be a useful way to represent the above-mentioned typologies. Table I represents four different categories of nonviolent incidents based on their temporal, purposive, and spatial dimensions. Here one can distinguish ‘episodes’ of contention, which may be improvisational and disorganized, from ‘campaigns’ of nonviolent action, which are more deliberate, organized, and coordinated.
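As a rough illustration of how the Table I typology might be applied to coded records, the sketch below maps an action onto the four categories using only its temporal and purposive dimensions. The record layout (`Action`, `duration_days`, `coordinated`) and the one-day cut-off are our assumptions for illustration; Table I gives '1 day' only as an example, and the spatial dimension is left out here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """One observed nonviolent action (hypothetical record layout)."""
    duration_days: int   # temporal unit
    coordinated: bool    # purposive unit: coordinated vs improvised


def classify(action, short_max_days=1):
    """Map an action onto the four Table I categories.

    short_max_days is an assumed cut-off between 'short' actions and
    longer ones; a real codebook would need to justify its own threshold.
    """
    is_short = action.duration_days <= short_max_days
    if action.coordinated:
        return "coordinated event" if is_short else "coordinated campaign"
    return "improvised event" if is_short else "improvised episode"


# Illustrative records only, not drawn from any dataset.
one_day_march = Action(duration_days=1, coordinated=True)
week_of_unrest = Action(duration_days=7, coordinated=False)
```

In practice the purposive dimension is the hard part to code, since it depends on the interpretive judgments discussed in this note rather than on a single recorded flag.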

Researchers can further classify types of incidents according to the specific goals of the action (e.g. reformist, territorial, economic, identity politics, etc.), or by levels of participation, whether by threshold, range, or estimation (e.g. 25+; 1,000–10,000; ‘millions’).
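The purposive linking described in this recommendation can be sketched in a few lines. The example below is a minimal illustration, not a description of how SCAD or any other dataset is built: the field names (`issue`, `actors`, `organized`), the greedy linking rule (same issue plus at least one shared actor), and the `min_events` threshold are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Event:
    """One coded incident (hypothetical fields, not SCAD variable names)."""
    day: date
    issue: str
    actors: frozenset
    organized: bool


@dataclass
class Group:
    """A set of events judged to be purposively linked."""
    issue: str
    actors: set = field(default_factory=set)
    events: list = field(default_factory=list)


def link_events(events):
    """Greedily link events sharing an issue and at least one actor."""
    groups = []
    for ev in sorted(events, key=lambda e: e.day):
        for g in groups:
            if g.issue == ev.issue and g.actors & ev.actors:
                break          # reuse the first matching group
        else:
            g = Group(issue=ev.issue)   # no match: start a new group
            groups.append(g)
        g.actors |= ev.actors
        g.events.append(ev)
    return groups


def candidate_campaigns(events, min_events=3):
    """Keep groups with at least min_events organized, linked events."""
    return [g for g in link_events(events)
            if sum(e.organized for e in g.events) >= min_events]


# Illustrative records only: seven incidents, of which three are linked.
events = [
    Event(date(2020, 1, 5), "election fraud", frozenset({"union A", "student league"}), True),
    Event(date(2020, 1, 7), "fuel prices", frozenset({"taxi drivers"}), False),
    Event(date(2020, 1, 12), "election fraud", frozenset({"student league"}), True),
    Event(date(2020, 1, 20), "land dispute", frozenset({"village X"}), False),
    Event(date(2020, 2, 1), "election fraud", frozenset({"union A", "church network"}), True),
    Event(date(2020, 2, 3), "fuel prices", frozenset({"market traders"}), False),
    Event(date(2020, 2, 10), "wages", frozenset({"teachers"}), True),
]
campaigns = candidate_campaigns(events)
```

In this toy example the raw count of seven incidents says little by itself; only the three 'election fraud' events share actors and purpose, so they form the single candidate campaign. A greedy rule of this kind is order-dependent and would need human review before any group is treated as a campaign.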

Recommendation 2: Making use of self-reports and other sources

Triangulating multiple sources can help overcome underreporting biases and may allow researchers to better distinguish spontaneous non-violent behavior from organized civil resistance. Other conflict datasets are increasingly turning to activist self-reports, ethnographic research, eyewitness accounts, surveys, and interviews to confront underreporting (Loyle, Sullivan & Davenport, 2014; Schrodt, 2012). Researchers can leverage these direct sources against violence-centric media reports and state-supplied information.³ Since, as mentioned above, the reporting biases of activists and eyewitnesses contrast with those of mainstream media and official state reports, the inclusion of these sources can ameliorate the absolute amount and direction of bias and the extent to which it is systematic. It could also prove helpful to investigate declassified documents, archives, and memoirs.⁴ If coders clearly identify the source of the content and allow for end-users to filter out certain source types, then this practice is especially beneficial (Schrodt, 2012).⁵
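One way to make source identification usable by end-users is to tag every report with a source type and provide a simple filter. The sketch below assumes a hypothetical record layout (`Report`, `SourceType`) and uses the participant figures from the example in Table II; it is meant only to show the idea of filtering by source type, not a prescribed schema.

```python
from dataclasses import dataclass
from enum import Enum


class SourceType(Enum):
    """Assumed source categories; a real codebook may need more."""
    STATE_MEDIA = "state media"
    INDEPENDENT_MEDIA = "independent media"
    ACTIVIST = "activist self-report"
    EYEWITNESS = "eyewitness"
    ARCHIVE = "archive"


@dataclass(frozen=True)
class Report:
    """One source's account of one event (hypothetical layout)."""
    event_id: str
    source_type: SourceType
    participants: int


def filter_reports(reports, exclude=frozenset()):
    """Return the reports whose source type is not in `exclude`."""
    return [r for r in reports if r.source_type not in exclude]


# Participant estimates taken from the Table II example.
reports = [
    Report("protest-1", SourceType.STATE_MEDIA, 10_000),
    Report("protest-1", SourceType.INDEPENDENT_MEDIA, 40_000),
    Report("protest-1", SourceType.ACTIVIST, 40_000),
]
without_state = filter_reports(reports, exclude={SourceType.STATE_MEDIA})
```

Keeping every account as its own row, rather than collapsing accounts into one coded value, is what lets users rerun an analysis with and without a given source type.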

Recommendation 3: Taking incentives to misrepresent seriously

Since conflicting accounts occur routinely, we suggest coding competing accounts and including ambiguity itself as a variable. Many datasets include reliability scores for different variables, which users often interpret as indicators of missing information, ‘best guess’ estimates, competing accounts, or vague source material. We suggest instead creating an ambiguity range score, which would measure the distance between different official, eyewitness, or journalist accounts. See Table II for an example.

Table II. Sample ambiguity coding of reported protest with lethal ‘clashes’, by source

Variable	State media	Independent media	Activists
Number of participants	10,000	40,000	40,000
Fatalities	2	5	10
Injuries	10	15	20
Directionality of violence	Unidirectional – by activists (–1)	Bilateral – both state and activists (0)	Unidirectional – by state (+1)

Researchers could then use the distance between estimates, the ‘ambiguity range’, as a metric for both the reliability of reported values of each variable and the level of disagreement between official reports and self-reports. Researchers should interpret the ambiguity range proportionally rather than absolutely. For instance, the discrepancy between 10,000 and 40,000 differs meaningfully from that between 1,000,000 and 1,030,000, though the absolute discrepancy in both cases is 30,000. The degree of relative ambiguity may be a useful metric for how contested the political narrative is. Incidents where the ambiguity range is minimal, for example, would indicate far higher agreement between official reports and self-reports. Thus, above and beyond the value of such scores as reliability indicators, ambiguity scores may be valuable as explanatory variables.
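The difference between absolute and proportional readings of the ambiguity range can be made concrete with a short calculation. The sketch below computes both for the two participation examples in the text. Dividing the range by the highest estimate is one possible normalization that we assume here for illustration; other denominators (the mean or the lowest estimate, for instance) would be equally defensible.

```python
def ambiguity_range(estimates):
    """Absolute distance between the highest and lowest reported values."""
    values = list(estimates.values())
    return max(values) - min(values)


def relative_ambiguity(estimates):
    """Range as a share of the highest estimate (an assumed normalization)."""
    high = max(estimates.values())
    if high == 0:
        return 0.0  # all sources report zero: no disagreement
    return ambiguity_range(estimates) / high


# Participant estimates from Table II.
small_protest = {"state media": 10_000, "independent media": 40_000, "activists": 40_000}
# The second example from the text; the source labels are illustrative.
large_protest = {"state media": 1_000_000, "activists": 1_030_000}

print(ambiguity_range(small_protest), relative_ambiguity(small_protest))  # 30000 0.75
print(ambiguity_range(large_protest), relative_ambiguity(large_protest))  # 30000, about 0.029
```

Both events have the same absolute range of 30,000 participants, but the relative score is 0.75 for the first and roughly 0.03 for the second, which is the distinction that makes the score usable as a measure of how contested an account is.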

Conclusion

The framework we suggest departs from many current practices in that it requires a degree of interpretive expertise that makes automated coding impractical. While automated coding provides advantages of speed and consistency (Schrodt, 2012), it cannot yet deal with the level of complexity involved in identifying the differences between the absence of violence, non-violent behavior, and civil resistance campaigns. Human researchers, on the other hand, have the ability to confront this complexity. Researchers must make decisions about how to adjudicate gray areas, develop clear codebooks, and train human coders so that judgment calls are minimal. Although collecting detailed data on nonviolent action is time-intensive, the theoretical and empirical contribution to the field will be considerable.

Footnotes

Acknowledgements

The authors thank the editors and anonymous reviewers for helpful comments. Remaining errors are our own. Equal authorship is implied.

Notes

References

Asal, Victor; Amy Pate & Jonathan Wilkenfeld (2008) Minorities at Risk Organizational Behavior Data and Codebook Version 9/2008 (http://www.cidcm.umd.edu/mar/data.asp).

Banks, Arthur; William Overstreet & Thomas Muller (2004) Political Handbook of the World 2000–2002. Washington, DC: CQ.

Barranco, José & Dominique Wisler (1999) Validity and systematicity of newspaper data in event analysis. European Sociological Review 15(3): 301–322.

Benedict, Kristian (2013) A map of non-violent activism in Syria. Amnesty International UK (http://www.amnesty.org.uk/blogs/campaigns/map-non-violent-activism-syria, accessed 5 March 2014).

Chenoweth, Erica (2011) The Nonviolent and Violent Campaigns and Outcomes (NAVCO) dataset, version 1.1 (http://www.du.edu/korbel/sie/research/chenow_navco_data.html).

Chenoweth, Erica & Orion A Lewis (2013) Unpacking nonviolent campaigns: Introducing the NAVCO 2.0 data. Journal of Peace Research 50(3): 415–423.

Chenoweth, Erica & Maria J Stephan (2011) Why Civil Resistance Works: The Strategic Logic of Nonviolent Conflict. New York: Columbia University Press.

Cunningham, Kathleen Gallagher (2013) Understanding strategic choice: The determinants of civil war and nonviolent campaign in self-determination disputes. Journal of Peace Research 50(3): 291–304.

Earl, Jennifer; Andrew Martin, John D McCarthy & Sarah A Soule (2004) The use of newspaper data in the study of collective action. Annual Review of Sociology 30: 65–80.

Hendrix, Cullen S & Idean Salehyan (2013) Social Conflict in Africa Database (SCAD) (www.scaddata.org, accessed 30 October 2013).

Herkenrath, Mark & Alex Knoll (2011) Protest events in international press coverage: An empirical critique of cross-national conflict databases. International Journal of Comparative Sociology 52(3): 163–180.

Loyle, Cyanne; Christopher Sullivan & Christian Davenport (2014) The Northern Ireland Research Initiative: Data on the Troubles from 1968 to 1998. Conflict Management and Peace Science 31(1): 94–106.

Schock, Kurt (2005) Unarmed Insurrections: People Power Movements in Nondemocracies. Minneapolis, MN: University of Minnesota Press.

Schrodt, Philip (2012) Precedents, progress, and prospects in political event data. International Interactions 38(4): 546–569.

Schweingruber, David & Clark McPhail (1999) A method for systematically observing and recording collective action. Sociological Methods and Research 27(4): 451–498.

Scott, John L (2001) Media congestion limits terrorism. Defence and Peace Economics 12(1): 215–227.

Sharp, Gene (1973) The Politics of Nonviolent Action. Boston, MA: Porter Sargent.