Making Surveys Work Better: Experiments in Public Opinion Research

Abstract

Introduction

Modern political science finds its origins in the legal tradition of analyzing the design of constitutions and in the sociological focus on elite power and group interests. But, thanks to the behavioural revolution of the 1950s and 1960s, political scientists ‘democratized’ their research and shifted their analytical gaze to the individual level. Understanding politics required studying the citizens whose quotidian activities and attitudes shaped the larger body politic. Such interest in ‘normal’ people was not original to the behaviouralists, of course. Anthropologists and other scholars for whom fieldwork was a primary research method had long studied non-elites. But what was genuinely ‘revolutionary’ about the behavioural revolution was the scale of its ambition.

The principal weapon in the behaviouralist arsenal was the mass public survey. Clearer understanding of the statistical theory underpinning probability sampling, and technological advances such as random-digit-dialing and greater computing power, allowed political scientists to generate previously inconceivable large random samples of the citizenry, the analyses of which painted richer, more nuanced pictures of how people regarded and interacted with the political sphere (Groves, 2011). Because these samples were drawn randomly, analysts could generalize their findings from the samples to the larger national population. Overnight, the field of election prediction was born, countless election-night television specials were spawned, and many a psephologist’s career was made possible.¹

A half-century after mass survey research became possible, it is now commonplace. Compared to previous generations of scholars, today’s researchers are faced with an embarrassment of riches. We have national election studies from countries all over the globe, cross-national surveys such as the World Values Survey and its global and regional Barometer counterparts, and innumerable other large and small surveys designed to study a wide variety of social and political issues ranging from vote intention to foreign policy attitudes to preferences over myriad policy domains. Indeed it is hardly an exaggeration to suggest that the survey is the workhorse of modern empirical political science, with one analysis estimating that surveys are ‘used in about a quarter of all articles and about half of all quantitative articles published in major political science journals’ (King et al., 2001, fn 1, cited in King et al., 2004, p. 191).

The Indian political science profession was quick to appreciate the value of mass public opinion surveys. The efforts of the pioneers at the Centre for the Study of Developing Societies (CSDS) who spearheaded the early election studies have been continued by Lokniti today. As a result, political scientists working on India have access to pre- and/or post-election surveys spanning five decades of national and state elections, which together represent a treasure trove of data unavailable to scholars working on any other developing nation.

The achievements of survey-based research for illuminating politics in India and elsewhere are indisputable, but all tools, even those as venerable as surveys, must innovate or risk becoming obsolete. Today’s political scientists are increasingly interested in questions of causation, and seek to move past the mere correlations possible with observational (i.e., non-experimental) data. Consequently, interest in field experimental techniques continues to rise, and thanks to the efforts of development research organizations such as the Poverty Action Lab (J-PAL) and Innovations for Poverty Action (IPA), a train journey from Bombay to Delhi passes through a thousand villages, and twice as many ongoing field experiments. Such efforts are undeniably interesting; perhaps some might even prove useful for identifying better policy design. Yet, even as they promise the researcher a glimpse of the holy grail of causality, they cede an important advantage of the survey: its generalizability.

I do not intend the preceding discussion to be a criticism of any research technique. All methods have strengths and weaknesses, and the prudent scholar utilizes multiple methods to triangulate her way closer to the truth. Rather, I offer this brief overview as a preamble to considering how we might improve our surveys to take advantage of experimental methods while preserving the external validity that is the hallmark of a well-realized survey sample. In the next section, I discuss an exciting methodological innovation in public opinion research, the survey experiment, and illustrate its utility through recent applications. The conclusion summarizes the key virtues of this method.

The Promise of Survey-based Experiments

Although it is an oversimplification, a useful pedagogical distinction is that experiments provide researchers with internal validity while surveys generate external validity. In plainer language, this means that experiments, by randomly assigning participants to treatment and control conditions, can more confidently attribute variation in the outcome variable to whether or not a participant received the treatment. Surveys cannot; analysts ‘can control for’ those items that were included in the instrument but can do nothing about plausible confounders that were excluded. Yet surveys, because their samples are randomly drawn, can generalize to the population of interest. Survey-based experiments seek to marry these strengths, and represent a crucial innovation to public opinion research that has been largely ignored in India.²

In a survey experiment a researcher ‘splits’ the sample into control and treatment groups. The only constraint on how many treatment conditions one has is the sample size required to provide adequate statistical power. For illustrative purposes consider the case with a single treatment. Because assignment to the treatment group is random (i.e., it is uncorrelated, in expectation, with any observable or unobservable characteristic of the respondent), we can attribute any systematic difference in responses between the two groups to the treatment. If so inclined, we might be so bold as to claim that the treatment ‘causes’ the difference.
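The logic of the split-sample design can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any particular study: the function names, the simulated 1-to-5 style outcome, and the true effect of 0.5 are all hypothetical, and the estimator is the simple difference in group means with an unpooled standard error.

```python
import random
import statistics
from math import sqrt


def assign_conditions(respondent_ids, conditions=("control", "treatment"), seed=0):
    """Randomly split respondents into (near-)equally sized conditions."""
    rng = random.Random(seed)
    ids = list(respondent_ids)
    rng.shuffle(ids)  # random order, so assignment is unrelated to respondent traits
    return {rid: conditions[i % len(conditions)] for i, rid in enumerate(ids)}


def difference_in_means(outcomes, assignment):
    """Return (estimate, standard error) for treatment mean minus control mean."""
    treated = [y for rid, y in outcomes.items() if assignment[rid] == "treatment"]
    control = [y for rid, y in outcomes.items() if assignment[rid] == "control"]
    estimate = statistics.mean(treated) - statistics.mean(control)
    se = sqrt(statistics.variance(treated) / len(treated)
              + statistics.variance(control) / len(control))
    return estimate, se


def simulate_outcomes(assignment, true_effect=0.5, seed=1):
    """Hypothetical survey responses: noisy baseline plus an effect if treated."""
    rng = random.Random(seed)
    return {rid: rng.gauss(3.0, 1.0) + (true_effect if cond == "treatment" else 0.0)
            for rid, cond in assignment.items()}


if __name__ == "__main__":
    assignment = assign_conditions(range(2000))
    outcomes = simulate_outcomes(assignment)
    estimate, se = difference_in_means(outcomes, assignment)
    print(f"estimated effect = {estimate:.2f} (SE {se:.2f})")
```

With more than one treatment, the same sample is divided among more conditions, so each group shrinks and the standard error of each comparison grows; this is the sense in which statistical power limits the number of treatment conditions.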

Scholars have used such techniques both to improve survey questions and to test theoretical hypotheses. In a cutting-edge study, Hanmer, Banks and White (2014) design an experiment to see if differences in question wording can reduce over-reporting of voter turnout on surveys. As anyone who has worked with India’s National Election Study (NES) knows, the share of respondents reporting having voted in the previous election exceeds the validated turnout reported by the Election Commission. Such over-reporting is found across countries, and is usually attributed to poor respondent recall as well as to social desirability effects, since voting is normatively desirable in a democracy. Hanmer and colleagues show that changing the wording of the turnout question can attenuate this bias. They test two treatments. In the first, the respondent is told that the researcher will check whether or not the respondent actually voted using official records. In the second, the question simply acknowledges that some people claim to vote even if they have not. These treatments invoke the norm against lying by implying that the researcher can know the truth. Since norms against being thought a liar are stronger in all societies than the norm of being thought a voter, respondents are more truthful in the treatment conditions than when asked the standard ‘Were you able to vote?’ question. Given how central this turnout question is to the analysis of Indian elections, similar experiments should be conducted here.

Eliciting truthful answers motivates Daniel Corstange’s (2009) work on the so-called ‘list experiment’. List experiments are typically applied to topics plagued by strong social desirability bias such as racism, sexism or other forms of bigotry. Corstange (2012) applies the technique to study vote trafficking in Lebanon. Simply asking respondents if they had ‘sold’ their vote would be unlikely to yield honest answers given strong norms (and laws) against such behaviour. So Corstange conducts an experiment on a survey sample. The control group is shown a list of uncontroversial statements (‘you read newspaper coverage of the campaign regularly’; ‘you read the candidates’ campaign platforms thoroughly’ and so on), while the experimental group sees the same list with one addition (‘someone offered you or a relative personal services, a job, or something similar’). Respondents are asked how many items on the list they agreed with, but not which ones. Since the respondent has the cover of not having to identify the controversial item, they can be more truthful. Because the two groups differ only in the added item, the difference between their average counts estimates the share of respondents to whom the sensitive item applies. Corstange’s analysis finds that a majority of Lebanese voters in the 2009 election sold their votes! Such behaviour does not appear to have a sectarian or partisan basis, however, with members of all groups showing equal propensities to do so.
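The basic list-experiment estimator is a difference in mean item counts, and can be sketched as follows. This is a simplified illustration, not Corstange’s LISTIT model, which is a more elaborate regression-based approach; the function names, the three baseline items, and the prevalence value of 0.55 are hypothetical choices made only so the simulation has something to recover.

```python
import random
import statistics
from math import sqrt


def list_experiment_estimate(control_counts, treatment_counts):
    """Estimate the share endorsing the sensitive item.

    Each input is a list of item counts reported by respondents. The control
    list omits the sensitive item; the treatment list includes it.
    """
    estimate = statistics.mean(treatment_counts) - statistics.mean(control_counts)
    se = sqrt(statistics.variance(treatment_counts) / len(treatment_counts)
              + statistics.variance(control_counts) / len(control_counts))
    return estimate, se


def simulate_counts(n, include_sensitive, rng,
                    n_baseline_items=3, p_baseline=0.5, p_sensitive=0.55):
    """Hypothetical respondents: each reports only a total count, never which items."""
    counts = []
    for _ in range(n):
        count = sum(rng.random() < p_baseline for _ in range(n_baseline_items))
        if include_sensitive and rng.random() < p_sensitive:
            count += 1  # the sensitive item adds at most one to the total
        counts.append(count)
    return counts


if __name__ == "__main__":
    rng = random.Random(2024)
    control = simulate_counts(5000, include_sensitive=False, rng=rng)
    treatment = simulate_counts(5000, include_sensitive=True, rng=rng)
    estimate, se = list_experiment_estimate(control, treatment)
    print(f"estimated prevalence = {estimate:.2f} (SE {se:.2f})")
```

The price of the privacy protection is precision: the variation in the baseline items adds noise to the estimate, so list experiments generally need larger samples than a direct question would.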

A final example considers the difficulty of ensuring that respondents understand the questions and answer categories on surveys similarly enough to ensure the comparability of their responses. This involves more than translating the questions into the different languages and dialects of the target population. Imagine asking citizens if they are satisfied with the public services provided by their government. It is not inconceivable that a citizen of Denmark might evince lower satisfaction than a citizen of India, not because of any objective disparity but due to higher expectations (this is a variation of the so-called Easterlin Paradox). King et al. (2004) describe a method of ‘anchoring vignettes’ designed to enhance comparability of survey responses. A variation is to present respondents with a specific scenario to ensure comparability of information and benchmarks. Emily Beaulieu (2014) does so to understand perceptions of electoral fraud in the United States. Respondents are asked to read a newspaper article about an election and then asked their opinion about whether fraud was likely. The treatment group sees the same piece with key features altered. Beaulieu shows that standard questions about fraud perceptions are unreliable since the answers depend on one’s partisan identity and that of the accused party. Electoral fraud, it appears, lies in the eyes of the beholder, and we use different lenses to assess the behaviour of the party we support and those we oppose.

Enhancing Survey Research in India

Mass public opinion surveys play a critical role in the study of democratic politics. They allow us to form an accurate picture of the national policy mood and to represent the diversity of views held by the population. But to do their jobs best, surveys must be enhanced with the latest methodological innovations to which we have access. Survey-based experiments represent such an enhancement, allowing us to improve the wording of our questions, to elicit true attitudes on sensitive topics, and to test hypotheses about how citizens process the same information differently.

All these facets of survey experiments recommend them especially well to the Indian milieu. In a society as diverse as India’s, where elections are contested on contentious topics such as corruption, communalism, and criminality, and where citizens’ attitudes towards their neighbours are shaped by delicate topics such as gender, caste, and religion, enhancing our survey instruments with experimental techniques holds tremendous promise for the next generation of public opinion scholarship. The future is here; the methods are readily available; let’s experiment.

Footnotes

Acknowledgements

I thank Sunshine Hillygus for her help. All errors are my own.

Notes

References

Beaulieu, Emily A. (2014). From voter ID to party ID: How political parties affect perceptions of electoral fraud in the United States. Manuscript, University of Kentucky, Lexington, Kentucky, USA.

Corstange, Daniel (2009). Sensitive questions, truthful answers? Modeling the list experiment with LISTIT. Political Analysis, 17(1), 45–63.

Corstange, Daniel (2012). Vote trafficking in Lebanon. International Journal of Middle East Studies, 44(3), 483–505.

Druckman, James, Green, Donald, Kuklinski, James, & Lupia, Arthur (2006). The growth and development of experimental research in political science. American Political Science Review, 100(4), 627–635.

Groves, Robert M. (2011). Three eras of survey research. Public Opinion Quarterly, 75(5), 861–871.

Hanmer, Michael J., Banks, Antoine J., & White, Ismail K. (2014). Experiments to reduce the over-reporting of voting: A pipeline to the truth. Political Analysis, 22(1), 130–141.

Hillygus, D. Sunshine (2011). The evolution of election polling in the United States. Public Opinion Quarterly, 75(5), 962–981.

Kinder, Donald R., & Palfrey, Thomas R. (1993). On behalf of an experimental political science. In Donald R. Kinder & Thomas R. Palfrey (Eds.), Experimental foundations of political science. Ann Arbor, MI: University of Michigan Press, pp. 1–40.

King, Gary, Honaker, James, Joseph, Anne, & Scheve, Kenneth (2001). Analyzing incomplete political science data: An alternative algorithm for multiple imputation. American Political Science Review, 95(1), 49–69.

King, Gary, Murray, Christopher J.L., Salomon, Joshua A., & Tandon, Ajay (2004). Enhancing the validity and cross-cultural comparability of measurement in survey research. American Political Science Review, 98(1), 191–207.

Sniderman, Paul M., & Grob, Douglas B. (1996). Innovations in experimental design in attitude surveys. Annual Review of Sociology, 22, 377–399.