Abstract
Scholarship on agenda setting and electoral politics often suffers from lack of access to data on deliberative processes, a shortcoming that often leads to inflexible theoretical outcomes. In this article, I assess the viability of a new data source, Google web search data, for use in social science research. Such data, when collected and applied appropriately, offer a unique lens through which patterns of information seeking can be observed. I examine data on the final phase of the 2012 US presidential election and present findings along two lines. First, sensitivity tests demonstrate that web search information can proxy for public opinion. Second, hypothesis testing finds that candidate presence on the campaign trail, although generally correlated with increased interest in political issues, is subject to media- and event-specific modifying effects. These findings underline this article’s principal contention that web search data are a viable resource for scholars of political communication.
Introduction
How do presidents, candidates, and their campaigns affect the behavior of the mass public? In recent years, considerable scholarship has extended the long-running debate (Finkel, 1993; Gelman and King, 1993) on the degree to which highly visible campaign activities translate into real changes in group voter conduct, both at the polls and in civil society (Flowers et al., 2003; Freedman et al., 2004; Lau and Schlesinger, 2005). From advertisement and campaign spending to the tone of candidate rhetoric, this work has broached a rich array of topics and has significantly expanded the literature on electoral processes and the campaign trail. Notable areas of research, particularly those looking at commercial promotion (Huber and Arceneaux, 2007; Krupnikov, 2011; Spiliotes and Vavreck, 2002) and organizational finance (Benoit and Marsh, 2008; Hillygus and Jackman, 2003), have helped solidify core assumptions about campaign operation and contributed to an understanding of how candidate actions come to impact individual and group behaviors.
And yet, despite significant work done on the ways in which individuals and broader demographic subsets of the population are both persuaded and mobilized during campaigns (Holbrook and McClurg, 2005), strikingly few studies of campaign effects on active individual and group behaviors exist. For various reasons, notably the limited availability of alternatives, the bulk of research performed on such issues makes use of data from polling services, survey projects, and other sources that focus on the outcome of voter stimulation rather than the process of individual action and reaction. Although mostly freely available, such data have numerous disadvantages. Respondents often fall prey to framing effects (Druckman, 2001) or know little about the subject in question (Berinsky, 2004). Moreover, data gathered as processes are unfolding, such as polls performed on election day, are notoriously prone to ecological biases such as group conformity, and do not capture deliberation as a process distinct from perceived changes to voter behavior. The resultant dynamics bias data products and place an inappropriate emphasis on the need for data collectors to time queries perfectly, with the consequence for scholarship being an overweighted focus on the study of structural aspects of campaign behaviors.
In this article, I attempt to answer the question of how campaign visits during the presidential election season can influence the information seeking behaviors of the mass public via use of a relatively new source of information on demographic trends—web search data. Through analysis of data tied to particular campaign trail locations and dates from the 2012 Presidential election in the United States, I test and examine the ways in which presidential candidate visits affect the propensity of the mass public to actively seek out information on subjects related to national elections, politicians, and platform issues. In addition to making substantive headway on the topic of campaign effects and active deliberative behaviors, this pilot investigation serves as a feasibility test for the use of such data in social science research by testing for data sensitivity—that is, the degree to which results can be said to proxy for public opinion—and demonstrating a modified coding regime that rectifies surface issues stemming from ecological inputs.
When framed more closely in the context of studies of campaign actions, agenda setting, and organizational planning, measurements of information seeking reveal much about the ways in which persuasion and mobilization occur at the micro-level. Information seeking is costly. Individuals must allot valuable time and energy—essentially scarce resources—to any attempt to develop knowledge of an election-related subject. The result is that the choice to search for specific search terms says much about the value which individuals place upon particular issues in the presence of campaign-induced and environmental factors. Degrees of variance over time are, thus, an apt barometer for gauging the interactive significance of candidate and campaign inputs. Measuring information seeking behavior also offers theoretical perspective on the role of campaigns—and particularly of the executive branch—in contributing to the broader functionality of democratic processes (Elster, 1998; Fishkin, 1995). The President and challengers to the position play a crucial role in offering information and perspective as one of several prominent countervailing institutions of democracy. Web searches, regardless of the resultant preference, are a form of societal mobilization and of the flow of intellectual “goods” in a democracy and can, thus, indicate the degree to which campaigns function as a component part of the democratic system.
Beyond the intrinsic value drawn from the results presented in later sections, however, it is important to note from the start that this article’s primary contention is ultimately one of utility. Search engine data, effectively harnessed in recent predictive studies of both epidemiological and socioeconomic phenomena (Breyer et al., 2011; Ginsberg et al., 2009; Mohebbi et al., 2011; Pelc, 2011, 2013; Utych and Kam, 2013), represent a valuable approach for social scientists seeking to understand the cumulative activity (as distinct from the eventual preferences) of members of target populations. Measurement of behavior, rather than preferences, brings several significant advantages to analyses of campaign and broader electoral processes in particular. Notably, and considering the fact that many of the field’s shared theoretical assumptions are premised on the notion that differences in campaign or candidate behavior are made significant by the inclinations and actions of the individual citizen, measures of changing behavior allow for insight into both the effectiveness of candidate messages and the interaction between campaign activities and environmental circumstances. In contrast with other outcomes-focused data sources, variance in web search data patterns along these lines is able to roll back the “black box” area between input and outcome (Goldthorpe, 1997) by detailing the propensity of the mass public to seek out information irrespective of potential environmental, intellectual, or study-specific framing effects. Moreover, the flexibility afforded by this and related approaches to tailor data collection processes across a wide range of lexical possibilities delivers a significantly improved capacity to determine the relative importance of different triggers.
Theory: campaigns and information seeking
Studies of candidates, campaigns, and elections have developed along two distinct lines of inquiry in the literature on American political contests in recent years. In one vein, a large number of scholars have placed focus on the ways in which candidate messages, delivered through a plethora of direct and media-based channels, act to persuade undecided parts of the voting population during elections (Goldstein et al., 2002; Johnston et al., 2004; Shaw, 1999). Although common methodologies involve outcomes-based measures of preference in the context of factor-specific variations, studies of persuasion overwhelmingly focus on the position and reactions of either individuals or demographic groupings. In contrast, a growing subset of interested researchers has sought to understand the impact of campaign actions and candidate behaviors on the degree to which elements of the electorate mobilize. Distinct in conceptual emphasis, examinations of mobilization highlight the role that marshaling structures and forces (such as emotive advertisements or partisan and grass roots support groups) play in affecting the manner in which individual voters receive and respond to information (Holbrook and McClurg, 2005; McClurg, 2004). Both bodies of work, of course, intersect with broader questions of the agenda-setting research program at various points in assessing the relative impact of structures and messages on both individuals and organized sectors of the electorate. In this section, I briefly describe work at the intersection of directed messaging and campaign determinants of voter behavior so as to highlight the value of an information seeking angle of approach and to inform the assumptions of later sections of the study.
Studies of persuasion and mobilization
Studies of political persuasion and mobilization have, in the past 20 years, diverged significantly from the basic assumption that campaigns only serve to “activate” the latent core voter support base for each party. Contemporary examinations have predominantly come to focus on research questions that fall broadly within the bounds of three main research programs. One group of analyses has studied the direct effects of candidate and party efforts on the conduct of the mass public across a range of indicators and outcomes. These studies have emphasized the role that environmental factors, from ethnicity to socioeconomic status, play in crafting the frame through which the individual views and generates a response to campaign messages. Perhaps most directly relevant to the question of how campaign activities influence the information seeking behavior of voter groups, a subset of this group of analyses has attempted to measure the extent to which realistic information seeking—that is, where circumstances do not permit a review of the total volume of knowledge available on campaigns or candidates and where voters must choose how to best allot their time—produces “correct” voting routines (in which individuals revisit prior answers after a full post-event review of relevant facts) in a given population (Bartels, 1996; Dahl, 1989; Lau and Redlawsk, 1997, 2006). This work introduces and supports the thesis that individual-level information seeking tends to produce “correct” outcomes that may not necessarily fall in line with the predictions of broad-scoped voter profiles (of ideology, party affiliation, etc.). Although the remaining sections of this article do not reflect a further test of such work, this “correct” voting thesis does underwrite the notion that information seeking behaviors are telling of the nuance of electoral outcomes. 
Moreover, it lends weight to the idea that a fuller understanding of how campaign inputs catalyze information seeking represents a promising avenue for future research.
Although less directly tied to this essay’s research question, two other groups of studies have examined the manner in which advertisements, endorsements, and the efforts of issue-bound civil associations impact upon voter behaviors in both electoral and political support settings. In one vein, there is significant evidence to suggest that campaign advertisements disseminate and furnish voters with persuasive information appropriate to prompt mobilization (Alvarez, 2001; Gelman and King, 1993; Wlezien and Erikson, 2002), while in another, electoral success has been shown as strongly correlated with the successful mobilization of requisite partisan groups through campaign messaging, effective financial underwriting (Benoit and Marsh, 2008), and grass roots operations (Campbell, 2000). It is worth noting that, in both cases, mobilization among targeted or partisan groups in a population is most commonly correlated with broader mobilization (Holbrook and McClurg, 2005; Shaw, 1999). This suggests that popular mustering might occur reactively in the context of mass changes in voter interest rather than as a catalyzing agent of political galvanization and reinforces the importance of a process-oriented research program on the political determinants of information seeking.
The bottom line is that the field’s main lines of inquiry each emphasize the interaction between the individual and a candidate’s campaign at two distinct points in time. First, a message is delivered or a specific organizational and financial dynamic is set in place. Then, after appropriate deliberation and, in the case of partisan or other mobilization efforts, implementation, candidates and individuals once again interact through the “hard” communication of preferences in surveys, polls, votes, and other recordings of opinion. Process-based studies of information seeking propensities offer a means for understanding the nuance of interacting input effects and eventual impacts on voter outcomes. However, as discussed above, surveying and operationalizing the discursive processes that occur between these interactions is difficult, and researchers often face significant challenges in the form of potential intervening biases. In the remainder of this article, I offer first steps on how this gap in study prospects might be filled.
Information seeking and the web
Use of web search data allows for analysis of patterns of individual behaviors, both at the individual and group levels, rather than just preferences stated at static points in time. To be sure, the scope of application for web search data is yet limited—such information has only become widely available in the past few years and only covers the brief decade since today’s major search companies (namely, Google and Yahoo) started operation. However, this new source of data inputs has already seen promising use in predictive and analytic studies in both the medical and economic fields. Mohebbi et al. (2011), for instance, demonstrated in a landmark study the efficacy of applying data gathered from the Google Insights repository to predicting the outbreak of flu in communities around the United States.
Web search data have also been used in both economic and political economic studies to predict consumer and labor force interest in distinct issues that link to broader trends. Some economists have used information from online sources to show that upticks in interest on unemployment benefits and processes parallel near-term shifts in real unemployment in communities. Pelc (2013) demonstrates that the mass public reacts to impending legal action taken against the United States through the World Trade Organization (WTO), particularly in those states where unemployment levels and industry interest in the specific nature of potential sanctions are high, by searching for information online.
The use of web search data as an indicator of individual propensities to behave in a specific way is perhaps most important because it represents the active and dynamic outcome of an individual’s decision to pursue one set of inquiries over another. Searching for information is a costly activity: consumers and voters alike must choose not only which issues are of most interest and merit closest scrutiny but must also determine from where information will be gathered. Such data, once coded and appropriately treated so as to test for variations in campaign input mechanisms, have the potential to allow social scientists to derive insight beyond the scope of prior studies on political information seeking (Downs, 1957; Fiske and Taylor, 1991; Lau and Redlawsk, 1997, 2006) by tying discrete event data and environmental knowledge (the role of media organizations, etc.) with real representations of deliberative process. In short, the combination of search-related data and data on both processional and structural campaign activities can provide insight on the field’s most basic questions: do individuals care about campaign or broader electoral political issues? If so, what inputs make them care and what actions does caring lead to?
Finally, it is certainly worth noting, as mentioned above, that measurement of information seeking has significance to the study of democratic functionality. Beyond the ability to study particular relationships between identifiable political inputs and civil societal reactions, the broader notion that researchers might observe extensive variation in behavior suggests an enhanced capacity to control for environmental factors in hypothesis testing. A theoretical examination of the centrality of media messages in determining the particular shape of popular mobilization, to pick just one example, is likely to produce more accurate results in controlling for patterns of inquiry among the population. To what degree do media messages really produce certain outcomes, if individuals tend to seek out information across a broader—or just as constrained—range of topics?
Expectations
The literatures on campaign operations aimed at persuading and mobilizing both core and periphery voter bases suggest a number of hypotheses about reactionary individual and group behaviors in locales visited by candidates. In this section, I lay out these basic expectations and note the salience of additionally framing such hypotheses, given the focus on individual behavior between points of “hard” preference setting in the context of the literature on democratic debate and deliberation.
The focus in this study is on voter behavior on the campaign trail of the 2012 Presidential election—that is, in locations where a presidential candidate appears for a campaign-related event (excluding fundraisers). 1 Data were pulled from across the 3-month period between 1 August, which roughly represents the completion of the primary season and covers both party national conventions, and election day on 2 November. The campaign trail was chosen for several distinct reasons. First, candidate appearances represent a “hard” test of the assumptions of the literature on electoral behavior (Pelc, 2013). There is intuitive logic in thinking that additional interest in election topics might be found in places where candidates visit. After all, local media coverage is likely to intensify and the candidate’s message is invariably tailored to the nuanced interests of a particular region. Second, focus on the campaign trail allows us to separate out the impact of the shifting national political and media environment, as comparison with national trends provides a firm baseline from which to root assumptions. Finally, this focus removes the potential impact of extraneous influences on campaign operations, such as Vice Presidential candidate visits, the activities of the national party body, or the rhetoric of correlated interest groups, making it easier to minimize the quasi-independent nature of candidate message as an independent variable.
Use of the campaign trail also makes particular sense when considered in the context of the literature on democratic debate that significantly underwrites scholarship on campaign activities and both group and individual behavioral responses. Across the various arenas of public policymaking, societal mobilization is assumed to occur as the result of distinct elite and organizational inputs at all levels of the system. The focus here on the behavior of populations on the campaign trail thus has particular relevance to scholarship on the operation of the marketplace of ideas, as the executive branch represents a unique, significant voice and source of information for debates on national policy issues.
With this combined reasoning in mind, the literature on campaign activities and democratic discourse suggests several simple hypotheses. First, at the base level, I adopt the assumption that local variation in interest in the period around candidate visits will significantly outpace national interest in the same issues. Although this study is certainly interested in the first “hardpoint” of interaction during the electoral process, namely, the input of agenda-directing information and subsequent information seeking at the micro-level, this assumption better mirrors a basic supposition of the literature on agenda setting that increased exposure to policy agendas through intermediary channels better directs discursive activities because of the provision of narrow topics for attention (McCombs, 2005). Thus, H1 is as follows:
H1. Time periods and locations directly tied to candidate visits are likely to be associated with greater positive variation in interest (as denoted by search volume) than is the national average.
Although debated by many (Atkin and Heald, 1976; Brians and Wattenberg, 1996), increased advertisement and promulgation of a particular platform should, according to a number of scholars in the field (Alvarez, 2001; Gelman and King, 1993; Wlezien and Erikson, 2002), feed voter interest in election issues and candidates. Perhaps more importantly, the degree to which media attention indexes to political agendas and realities—such as the historically close nature of voting outcomes in a district—is often taken as a measure of how likely subsets of the overall population are to respond to campaign and related inputs (Kiousis, 2004; McCombs, 2005). In sum, conventional wisdom dictates that individuals resident in swing states are more likely to exhibit deliberative interest during electoral seasons. Thus, H2 is as follows:
H2. Swing state populations, as high-content and politically sensitive environments, should exhibit a relatively greater amount of interest in election issues than is the national average. 2
Finally, while candidate presence is expected to universally lead to increased interest across time and space relative to the national average, a common refrain in recent developments in the literature on the marketplace of ideas and the role of the president suggests that variation is unlikely to be equal across candidates (Mayhew, 2008). In presidential campaigns, an information advantage—garnered through long-term access to intelligence and a logical association with international leadership—is expected to lend an incumbent candidate a distinct edge when it comes to both debating and developing platform positions for foreign and international economic policy (Kaufmann, 2004). Since this study’s focus is on the direct role of candidates and incumbents in campaigning, a hypothesis on the nature of such a particular campaign function seems appropriate. Thus, H3 is as follows:
H3. Although increased variation is expected across the board in campaign trail data versus national averages, greater variance in interest on foreign policy issues is likely to be associated with incumbent (over challenger) visits.
Together, these hypotheses represent just some of the assumptions and questions laid out in the literature on campaign behaviors and democratic debate. In this initial use of web search data, these expectations are deployed to prompt thought on several basic questions on the role of Presidential candidates in directly encouraging information seeking and the utility of campaign information operations in encouraging the same.
Research design
In this section, I detail the data and coding regime adopted in this study.
Data
In the remainder of this article, I test expectations about population behavior during elections over time. Specifically, I focus my analysis on the 2012 Presidential election campaign trail in line with the basic assumption—drawn from both the literatures on campaign efficacy and the role of the President as an important voice in the marketplace of ideas—that the purpose of executive branch political visits (both incumbent and challenger) during election season is to generate support and debate through issue persuasion and mobilization techniques.
Distinct from prior studies of political and economic phenomena that cite search volume as the principal unit of analysis (Breyer et al., 2011; Ginsberg et al., 2009; Mohebbi et al., 2011; Pelc, 2013), I instead turn to a modified measure of recorded interest over time—week-on-week variation in the reported likelihood that random users in a particular locale will search for candidate- and issue-related terminology. Calculating and analyzing variance is useful in that reported values provide an indication of change as a result of new and geographically specific inputs to community debate of political issues—in this case candidate visits to different locales. Week-on-week variation also minimizes the impact of ecological externalities—vastly different search volumes recorded across areas possessed of different population levels and network connectivity, for example—and provides outputs apt for analyzing the degree to which campaign activities seem to induce information seeking and more broadly set agendas. Data trends are less prone to distortion from locational outliers. Moreover, as the focus of campaign trail analysis is the candidate and not those environs involved, results better portray the over-time effects of local versus national campaign inputs.
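As a rough illustration of the measure described above, the following sketch computes week-on-week variation from a weekly interest index. The function name, the use of a simple first difference, and the sample values are assumptions made for illustration only; the exact formula applied in this study is not specified here.

```python
def week_on_week_variation(index_values):
    """Return the change in a weekly interest index from each week to the next."""
    return [curr - prev for prev, curr in zip(index_values, index_values[1:])]


# Hypothetical weekly index values for two locales with very different
# baseline levels of search interest (illustrative numbers, not study data)
large_city = [40, 44, 52, 50]
small_town = [10, 14, 22, 20]

city_changes = week_on_week_variation(large_city)
town_changes = week_on_week_variation(small_town)
```

In this toy case both locales yield the same sequence of changes even though their baseline levels differ, which is the sense in which a variation-based measure can reduce the influence of differences in local search volume.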
Data are pulled from Google Trends. 3 Searches using Google account for more than 85% of all web searches per year and represent the bulk of all online access to Internet search data by the 90+ million individual Internet users in the United States. The information provided takes the form of a scaled index, where search volume is reported as an aggregate measure of the probability of searches for particular terms in specific locations. These data are produced through a process in which Google samples web searches within specific chronological and spatial parameters and compares results to overall search volumes for the same information in the matching period. This outputs a value of likelihood that a random individual will search for the particular lexical information over a given time frame. Although it is not a direct measure of individual behaviors, this value essentially quantifies the propensities of the mass public as derived from individual-level inputs. Beyond this normalization of the data, the outputs are further scaled to account for volume shifts and reported on a scale from 0 to 100 that is continually reset on the basis of new search maximums.
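The logic of such a scaled index can be sketched in simplified form. The code below is not Google's procedure, which relies on sampling and is not fully public; it assumes hypothetical counts of term searches and total searches per week, converts them to shares, and rescales so that the highest share equals 100.

```python
def scaled_interest_index(term_counts, total_counts):
    """Scale each period's search share so that the peak period equals 100."""
    # Share of all sampled searches that match the term in each period
    shares = [term / total for term, total in zip(term_counts, total_counts)]
    peak = max(shares)
    if peak == 0:
        return [0 for _ in shares]
    return [round(100 * share / peak) for share in shares]


# Hypothetical sampled counts for three weeks in one locale
term_counts = [50, 120, 200]
total_counts = [10000, 12000, 10000]
index_before = scaled_interest_index(term_counts, total_counts)

# A fourth week with a new maximum share resets the whole scale
index_after = scaled_interest_index(term_counts + [250], total_counts + [10000])
```

With these illustrative inputs the first three weeks are reported as 25, 50, and 100, and after the new maximum arrives the same weeks are reported as 20, 40, and 80. This is why values pulled at different times are not directly comparable without re-extraction.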
In this project, I am interested in terminology related to candidates and election issues. In addition to searches for “Obama,” “Romney,” and “Election,” I am interested in 10 search terms commonly cited in popular media representations of important issues, terms similarly cited as both popular and significant to both campaigns by The Washington Post, Politico, and CNN: “Taxes,” “Healthcare,” “Jobs,” “Economy,” “Immigration,” “Foreign Policy,” “Education,” “Social Security,” “Gun Control,” and “Gay Marriage.” Rather than the simple extraction of static and isolated search instances, however, data pulled on these specific terms represent a dynamic relative measure of interest in the term across all possible searches. Searches can be as simple as “Obama” or “Romney 2012,” or as complex as “Obamacare” and “Romney campaign trip foreign policy gaffe.”
For each search term, I coded variation in interest—essentially variation on the variance in expected interest of random users in specific locales and time periods—on a week-to-week basis. Dummy variables representing candidate (what is the balance of candidate search interest?) and issue (what is the balance of issue search interest?) values were generated along two lines—weighted and unweighted. Both the candidate and issue variables were weighted evenly, and then the issue variable was reweighted along the percentage lines of Gallup’s “Most Important Problem,” a monthly survey product that reports what a representative sample of the population thinks of as the most pressing concern of the nation. Although variation was minimal, I found during tests that the weighted dummy variable produced the most accurate results and is thus reported below.
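One way to picture the difference between the evenly weighted and reweighted issue variables is the sketch below. It assumes the composite is a weighted average of per-term variation, which is an interpretation rather than a documented formula, and the weights shown are hypothetical placeholders rather than Gallup's published figures.

```python
def composite_issue_variation(term_variation, weights=None):
    """Combine per-term variation into one value using optional weights."""
    terms = sorted(term_variation)
    if weights is None:
        # Even weighting: every term counts equally
        weights = {term: 1.0 for term in terms}
    total_weight = sum(weights[term] for term in terms)
    weighted_sum = sum(term_variation[term] * weights[term] for term in terms)
    return weighted_sum / total_weight


# Hypothetical week-on-week variation for three issue terms
variation = {"Economy": 6.0, "Jobs": 4.0, "Immigration": -2.0}

# Hypothetical survey shares standing in for "Most Important Problem" percentages
mip_weights = {"Economy": 50, "Jobs": 30, "Immigration": 20}

unweighted = composite_issue_variation(variation)
weighted = composite_issue_variation(variation, mip_weights)
```

Under these made-up inputs the reweighted composite is pulled toward the terms that survey respondents rank as most pressing, so it comes out higher than the evenly weighted value.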
It is important to note again, of course, that the data presented do not represent the entirety of Google search volume on the stated terms. The sample data size here is 2132 discrete observations based on random sampling of local search populations of 5213 instances in each case. Moreover, it is important to note the dynamic nature of these data. Data are rescaled on an ongoing basis, with new maximums continually resetting the 0–100 scale. This is, of course, useful for long-term analyses.
The impact of candidate presence on public behavior
Issue salience and representativeness in web search data research
Before moving forward with any correlative tests on the data, I address the problem of issue salience and data representativeness. While Internet data have clear advantages in terms of measuring behaviors over survey, poll, and other outcome-oriented sources of information, there is the question of whether Internet searches by individuals (as represented by web data that pertain to a geographic grouping) can be said to be representative of broader searches for information by non-Internet using individuals. It is not outlandish to suggest that someone likely to seek information online may be naturally more sensitive to political discourse or media coverage of campaign events. Thus, can search data really proxy for generalized public opinion?
In this section, I briefly engage the topic of issue salience by addressing the study in the context of three validity tests proposed by Mellon (2013). Mellon suggests that search terminology should be assessed in terms of an issue’s face validity (whether or not a term intuitively appears as a plausible measure), content validity (is the term in question relevant to the broader field being studied?), and, most importantly, criterion validity (does the term in question relate well to an existing measure of validity?).
This study’s use of candidate names and issue terminology pulled from and corroborated by several large media sources suggests few problems in terms of assessing face and content validity. Terms like “Obama” and “Romney” clearly relate to the candidates in question, and the very nature of election processes that see candidates identify with policy positions intuitively suggests that such names will act as an effective and plausible measure of correlative position. Moreover, as befits any research program premised on the notion that agenda setting occurs via the translation of pre-set agendas, the issue terms chosen reflect the framing and phrasing choices of the mainstream media. Information on issues related to Medicare, Obamacare, and more, for example, is commonly tagged under the moniker of “social security,” and candidate campaigns typically engage in debates or in other forums within the context of these broad issue headers.
Comparison with existing measures of validity, however, is by far the most important factor in demonstrating the capacity of the data in question to act as a representative proxy for broader public interest. Mellon, utilizing Gallup’s “Most Important Problem” (MIP) dataset, achieves this comparison through the application of a standard ordinary least squares (OLS) regression that treats issue terms from the search data in question as independent variables and the related terminology from Gallup’s results as dependent variables. He finds significant evidence (R² between .491 and .754) that the Google data he tests can be predicted by broad existing measures of validity.
Using 2012 MIP data, I reproduce Mellon’s criterion validity test by matching search term data to the measures utilized by Gallup (“taxes” has no close match and is thus excluded from the remainder of the study) and running an OLS regression that treats search terminology as independent variables and the survey’s own labels as dependent variables. Six of the nine search indices were found to significantly predict use of MIP labels (R² between .578 and .823), with the remaining three showing strong, if not statistically significant, correlation (Table 1).
Table 1. Results of OLS test for sensitivity of search data.
OLS: ordinary least squares; IV: independent variable; DV: dependent variable.
p < .01.
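The criterion validity test described above can be illustrated with a minimal sketch of a single-predictor OLS fit. The function name `ols_fit`, the variable names, and all data values are hypothetical and invented for illustration. They are not the study's data, and the sketch does not reproduce the significance testing reported in Table 1.

```python
from statistics import mean


def ols_fit(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (intercept, slope, r_squared).
    """
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("x and y must be equal-length series with at least 3 points")
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    if sxx == 0 or ss_tot == 0:
        raise ValueError("both series must vary")
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # R^2 is the share of variance in y explained by the fitted line.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, 1 - ss_res / ss_tot


# Hypothetical monthly series: a search index for one issue term (IV)
# and the share of survey respondents naming the matched MIP label (DV).
search_index = [42, 47, 51, 49, 58, 63, 60, 66, 71, 69, 75, 80]
mip_share = [21, 23, 26, 24, 28, 31, 30, 32, 36, 33, 37, 40]

intercept, slope, r_squared = ols_fit(search_index, mip_share)
print(f"MIP share = {intercept:.2f} + {slope:.3f} * search index (R^2 = {r_squared:.3f})")
```

With one predictor, R² equals the squared Pearson correlation between the two series, which is why a high R² here is read as evidence that the search index tracks the survey measure.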
These results support the viability of Google search data as a proxy for assessing broader trends in public opinion and public political behavior, and underline this article’s principal contention that web search data can feasibly be employed in social science research studies to represent aggregate demographic preferences. It is worth noting, of course, that researchers may find significant challenges in correlating datasets based on diverse lexical parameters with accepted measures of such aggregate preference. Nevertheless, such tests of sensitivity are feasible and may pose only a limited challenge for studies of agenda setting and campaign organization, where examination of the effects of relatively static informational platforms is the norm.
Study results and analysis
I begin examination of the data by looking at the overall variance in random user interest in different search terms over time, both on the campaign trail and nationwide. Again, it is important to note that this is not a direct measure of individual behavior, but rather a read of behavior within a certain area that is informed by individual inputs. The underlying assumption is that time periods and locations directly linked to the campaign trail are more likely to be associated with increased variation in searches, and therefore with changes in the degree to which the public at large is interested in different issues and candidates. If this assumption holds, we should see a sustained and positive differentiation in search volume variation across the entire post-convention campaign stretch. Figure 1 displays both national and campaign trail-specific trends in search variation alongside data on the nationwide volume of major news media focus on both candidates. Rather than simply representing volume, trend lines denote the degree to which the likelihood of a random user search (essentially a group measure of aggregated individual propensity) for candidate-related information falls or rises relative to average search terms on a week-to-week basis.

Figure 1. Information seeking and media coverage in the 2012 US Presidential election.
Taken at face value, the data seem to corroborate the expectation that greater positive variation in interest is associated with the campaign trail than is seen nationwide. With few exceptions, members of the public in locations where candidates visited were far more likely on average to search for information on that candidate than were random Internet users in the United States writ large. Even in cases where search interest fell in time with candidate visits, variation in campaign trail locations nevertheless dipped at a lower rate than it did for the whole population, further suggesting that candidate visits do have a positive impact on public interest via direct campaign activities. Such a result is highly consistent with the expectation of the literature on the efficacy of the marketplace of ideas and of the role of the president as a countervailing institution of the system in providing information to spur debate and to inform voters. Furthermore, the data seem to support the assumption—central to research programs on agenda setting and the role of the media in political processes—that both particular agenda-setting events and media coverage thereof play a significant part in producing direct interest in those areas visited by candidates and targeted by campaign efforts. Surging media coverage of candidates in time with party national conventions, for example, raises the probability of random user search both generally across the country and significantly on the campaign trail. While this might seem obvious, as candidate presence for each national convention is typically the subject of great media attention, such a major event clearly has a duration effect for stops on the campaign trail—the surge of interest demonstrated by this measure of user search probability outlasts conventional news cycle coverage.
Beyond this interesting result, however, several points bear consideration and additional analysis. When reviewed in the context of both the events and coverage of the 2012 election season, the data offer a variable and nuanced read of interaction between candidates, the media (broadly constituted by major news and wire volume), and information seekers. Spiking interest in campaign trail search trends does not uniformly match with similar (if reduced) results at the national level. Although interest does appear event-driven, input type appears to have a meaningful impact on the degree to which campaign measures produce distinctly local outcomes. National presidential debates, for example, fail to provoke a significant differential result despite increased media coverage of both candidates immediately following such events. This suggests that candidate presence is not itself a universal driver of the propensity to seek information, but rather is subject to the modifying impact of discrete intervening factors. Events or occurrences with a distinct focus on the “national stage” contribute less to a campaign’s capacity to provoke an information-seeking response, although major gaffes like the Romney “47%” comment seem to refocus attention on the campaign. Likewise, certain landmark policy events, such as the attack on the consulate in Benghazi, seem to support the supposition of H3 that challengers face a disadvantage on certain issues: the likelihood of user search along the campaign trail failed to outpace the national probability during that period, despite increased coverage of Romney’s statements on the matter.
I now turn to the assertion that high-content environments alter the discursive dynamic for individual members of the public at large by injecting significantly more information than is common and, therefore, should be correlated with much higher information-seeking behavior. This hypothesis follows, of course, from the assumptions of the literature on the impact that advertisements, effective partisan mobilization efforts, and effectual financial disposition should have on voter behaviors and preferences. I take two approaches to testing this notion here. First, I present the results of correlation between dummy variables indicating which candidate actually visits, weighted to represent the distribution in variance of random user searches for candidates and campaign issues. Both the candidate identification and candidate search variables are coded such that “Obama” is matched with “0” results and “Romney” with “1” results. As can be seen in Table 2, candidate searches are relatively even regardless of which candidate shows up. This suggests that individuals, as aggregated by this group measure, are as likely to search for the opposition as they are to search for the candidate present on the campaign trail. More importantly, however, the results suggest that “Romney” searchers were far less likely to seek information on specific campaign issues throughout the period, even though candidate presence does broadly correlate with increases in the propensity to search for such topics. Considering the distinct advantage that the Obama campaign had in terms of money to spend on advertisement in the early weeks of the 2012 election, this result is revealing and suggests that greater ability to convey messages via commercial or media methods is associated with greater search behavior.
Table 2. Results of correlation test for weighted variables.
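As a rough illustration of the first approach, the sketch below correlates a 0/1 candidate dummy with a continuous measure of change in issue searches. It is a simplified, unweighted version: the weighting step described above is not specified in enough detail to reproduce, and the function name `pearson`, the variable names, and every value are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("both series must vary")
    return cov / sqrt(var_x * var_y)


# Coding follows the text: 0 = Obama visit, 1 = Romney visit.
# One entry per campaign stop; the values are invented for illustration.
visiting_candidate = [0, 0, 1, 1, 0, 1, 0, 1]
issue_search_change = [4.1, 3.6, 1.2, 0.8, 3.9, 1.5, 4.4, 0.9]

r = pearson(visiting_candidate, issue_search_change)
print(f"candidate dummy vs. issue search change: r = {r:.3f}")
```

When one variable is a 0/1 dummy, the Pearson coefficient is the point-biserial correlation. With this coding, a negative value means that issue searches rose less at stops coded 1 than at stops coded 0.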
Second, I drop the candidate dummy variables and present the results of correlation between search variation values for specific issue areas and variables coded (either 0 or 1) to represent the political status of the state being visited by the candidate: blue-leaning, red-leaning, or swing state. These results suggest that random Internet users in swing states, as captured by this metric, are significantly more likely to seek information on campaign issues than are those in traditionally red or blue states. There are, of course, exceptions to this, but the two principal outliers are in issue areas that held unique positions in the 2012 election. Healthcare, while arguably not a headline issue for much of the campaign period, was at the heart of a bitter philosophical divide between the parties. This matches the correlation seen here between random user interest in red and blue states. The other outlier, foreign policy, is discussed below (Table 3).
Table 3. Results of multiple correlation test for sensitivity of search data.
An interesting takeaway from this result is that interest in election issues seems more strongly correlated with the exposure of the public at large in a given area to a high-content environment (in this case, living in a swing state) than with traditional partisan inclinations. With few exceptions, random users in red and blue states seem to respond less strongly to broad issue areas, including those emphasized by candidates in order to rally what is seen to be a core voter base, than do random users in swing states. The state typology result also suggests that foreign policy issues occupy an exceptional role in terms of candidate rhetoric and individual receptivity. In line with the assertion that foreign policy is unlikely to feature as a major pillar of campaigns due to the disadvantage in experience and perception held by a presidential challenger, the correlative results above indicate that the mass public in swing states is far less likely to search for foreign policy matters than are those in relatively more stable states where such issues are not a potentially volatile feature of a campaign’s issue “offensive.” This result also aligns with the finding above that events that focus on the “national stage,” such as presidential debates, have a minimal effect on local user interest and lead to an increase in media coverage accompanied by a low aggregate rise in interest nationwide.
This result is further borne out in simple correlative comparisons of the collected data. Individuals searching for information on Barack Obama were more likely to also search for foreign policy-related issues than were those searching for information on Mitt Romney. This supports the hypothesis that foreign policy interest is far less likely to be associated with the challenging candidate than it is with the intelligence- and experience-advantaged incumbent, and it is even surprising in the wake of prominent campaign talking points like the attack on the American consulate in Benghazi or a series of gaffes made over the state of the nation’s naval forces. That said, it is important to note that this interesting finding is likely prone to modifying effects not controlled for within the scope of this study, such as the administration-centric nature of information release on the Benghazi crisis during the course of the election season.
Conclusion
The recent and increasing availability of web search data has significant implications for social science research. While observation of inputs and measurement of resultant conditions tell us much about the viability of approaches to theoretical modeling, obstacles to the examination of deliberative processes—often referred to as the “black box” of social science research—have consistently complicated the ability of scholars to fully test certain hypotheses. My objective in this article has been to initially demonstrate, through the supposition and testing of simple hypotheses, the viability of web search data—in terms of both representativeness and application—as a potential source for expanding the scholarly capacity to study deliberation.
In general terms, the data and analysis presented here provide support for several broad hypotheses. First, the direct involvement of an elite voice such as the President of the United States or an electoral challenger does seem to stimulate the propensities of geographically defined population groupings to seek out information on policy issues related to political platforms. This is not a direct commentary on the behavior of individuals, of course; the web search metric instead offers aggregate information about a population as informed by individual samples. However, the result does suggest a change in voter propensities that offers a useful starting point for future research. In addition, although interest appears dependent on input typology and does vary relative to media coverage of candidate activities, candidate presence seems to bolster the deliberative impacts of major campaign events by way of a generated duration effect. The relative strength of these trends in high-content regions of the country (electoral swing states in this case) also suggests that the public at large is broadly influenced by factors relating to increased political activity or, as may be the case, by the knowledge that votes in contested areas seem to carry greater weight in terms of outcomes than do those in politically “stable” regions. Again, of course, the web search data inform a group metric based on individual behaviors, and future research must determine whether this trend is applicable at the individual level in such populations. Finally, the data presented here also provide basic support for the idea that the role of the elite involved in stimulating democratic debate through campaign activities does matter. Although further work is needed on the subject, it seems fair to posit that informational and experiential advantages held by an incumbent on issues related to foreign policy, defense, and international trade can influence determinations of what campaign issues are to be pursued or engaged by candidates.
This article’s overall narrative and simple tests also suggest directions for future research along mechanical lines. In particular, while a key advantage of web search data is their relative freedom from the biases of outcomes-focused methodologies, there is significant promise in applying existing poll, survey, or other data to the corroboration of Internet-focused studies in the future. Such data can help validate new methods of inquiry and strengthen the broader understanding of the interaction between individuals, organizations, and structures. The opportunities for research on agenda setting and campaign efficacy are among many potential paths for developing studies of behavior and interaction in human politics. Furthermore, this study demonstrates the efficacy of investigating “hard” representations of political activities and platforms, inclusive of activities as distinct as elite or interest group rhetoric and the active operation of such organizations at the community level, through the use of search data in both temporal and spatial terms.
Footnotes
Funding
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
