Abstract
This article uses Twitter data and machine-learning methods to analyze the causal impact of the Supreme Court’s legalization of same-sex marriage at the federal level in the United States on political sentiment and discourse toward gay rights. In relying on social media text data, this project constructs a large data set of expressed political opinions in the short time frame before and after the Obergefell v. Hodges decision. Due to the variation in state laws regarding the legality of same-sex marriage prior to the Supreme Court’s decision, I use a difference-in-difference estimator to show that, in those states where the Court’s ruling produced a policy change, there was relatively more negative movement in public opinion toward same-sex marriage and gay rights issues as compared with other states. This confirms previous studies that show Supreme Court decisions polarize public opinion in the short term, extends previous results by demonstrating opinion becomes relatively more negative in states where policy is overturned, and demonstrates how to use social media data to engage in causal analyses.
Introduction
Researchers are divided over how the Supreme Court impacts American public opinion. One group of scholars argues that the public moves with the Justices’ rulings, with the Court garnering consensus through strength of argument and the legitimacy of the courts (Lerner, 1967). Another camp argues that the public becomes further polarized after the Court rules on a divisive issue, with those inclined to agree with the Justices becoming more adamant in their support and those predisposed to disagreement becoming further entrenched in their opposition (Franklin & Kosaki, 1989). While these studies consider the ways in which public opinion changes temporally in the wake of a Supreme Court decision, few analyze the state-by-state reactions to federal rulings, which is critical if a decision aligns with one state’s existing legal framework but overturns another’s. This article extends previous research into Supreme Court rulings and public opinion by incorporating a state-level analysis in studying the Obergefell v. Hodges decision and the federal legalization of same-sex marriage in the United States.
In June of 2015 in a 5-4 ruling, the Supreme Court held that the right to marry was a “fundamental right” under the Due Process Clause of the Fourteenth Amendment, instantaneously overturning same-sex marriage bans in 13 states. 1 This decision represents the most recent in a long line of monumental cases in which the Court made a ruling on a divisive social issue. Exploiting this variation in state laws regarding the legality of same-sex marriage, I use a difference-in-difference estimator to identify the causal impact of a policy change on the expression of sentiment toward same-sex marriage. 2 I find this impact to be negative, indicating a less positive response by those in the affected states, even when controlling for potentially relevant demographic variables and party identification.
While many studies use polling or survey data to measure the shift in public opinion before and after landmark decisions (e.g., Christenson & Glick, 2015; Franklin & Kosaki, 1989; Hanley et al., 2012; Johnson & Martin, 1998), this article uniquely investigates these issues by using a subset of Twitter messages regarding same-sex marriage and gay rights issues. By implementing machine-learning methodologies to extract measures of sentiment from a large collection of tweets before and after the Supreme Court decision, I analyze a finely grained data set that allows for new insights into the short-term dynamics between Supreme Court decisions and public opinion. Studying this relationship is critical to discovering whether or not the Judiciary is able to guide public opinion in the direction of their opinion, or rather acts as a catalyst to further divide the public. The fine-grained temporal nature of social media data further allows me to examine the impact of the court’s decision in the immediate weeks before and after the decision, examining the short-term reactions in these states to the court decision.
The article proceeds as follows. First, I outline the relevant literature discussing the Supreme Court’s impact on public opinion. Then, I describe the methodology I employ in this article, including details on how I collect and analyze Twitter data. I then look at the aggregate change in opinion following the Court’s ruling before turning my analysis to a detailed investigation of the impact of policy change at the state level. In this section, I conduct a difference-in-difference analysis, finding that the Supreme Court’s decision engendered an increased negative reaction in those states where the ruling represented a change in state policy.
The Supreme Court and Public Opinion
What constitutes the proper role of the Supreme Court in the United States is a long-standing question. In the opinion of Alexander Hamilton and the Federalists, the “independence of the judges may be an essential safeguard against the effects of occasional ill humors in the society” protecting against the “serious oppressions” of minority parties (Hamilton et al., 1787–1788/2009, pp. 395–396). However, to counter the Anti-Federalists’ arguments that an independent judiciary could wholly override the democratic process (Storing, 1981), Hamilton further emphasized that the judiciary would be the “weakest” of the three branches of the Federal Government, without the “force nor will” to enforce its judgments (p. 392).
Because the Supreme Court lacks the ability to enforce its rulings, a number of scholars have argued that public opinion constrains it (Hall, 2014). However, the Supreme Court’s record demonstrates a number of instances where the Court’s rulings went against popular opinion, leading others to conclude the institution is counter-majoritarian in nature (Mishler & Sheehan, 1993). When decisions run counter to a majoritarian preference, scholars have argued the Court consciously recognized its role as “Republican Schoolmaster,” using its judicial power to educate citizens and guide public opinion (Lerner, 1967).
Behind these arguments is the notion that, viewed as a popular and revered institution, the Supreme Court is able to directly influence public opinion in the direction of their decisions (Casey, 1974; Dahl, 1957; Gibson & Caldeira, 2009; Mondak, 1994). 3 The theory that the court lends legitimacy to their rulings in a way that moves public opinion in the direction of their decisions is termed the Positive Response Hypothesis (Franklin & Kosaki, 1989).
While there is a great deal of support for the Positive Response Hypothesis in experimental work (Bartels & Mutz, 2009; Clawson et al., 2001; Hoekstra, 2003; Mondak, 1994), the theory does a poor job explaining empirical findings in a number of observational studies (Franklin & Kosaki, 1989; Nicholson & Hansford, 2014). Roe v. Wade represents a particularly important case study that refutes the Positive Response Hypothesis, as public opinion data show that before and after the ruling, aggregate support for abortion remained unchanged.
To address this empirical discrepancy, there have been a number of alternative theories describing how the public will respond to Supreme Court decisions. The Structural Response Hypothesis posits that, even if Supreme Court decisions fail to move aggregate public opinion in one direction, court decisions can still alter the “structure of opinion,” that is, the amount to which different groups “support and oppose a position and how intensely” (Franklin & Kosaki, 1989, p. 753). Thus, a Supreme Court decision might cause ex-ante supporters of a position to become more favorable, while simultaneously causing ex-ante opponents to become more negative. In the aggregate, this would appear as no movement in overall public opinion, although in actuality the court was responsible for further polarizing public opinion.
Another alternative is the backlash model, predicting that Supreme Court rulings that change policy will move aggregate public opinion away from the Justices’ decision (Haider-Markel, 2007, 2010). In this model, Supreme Court decisions act as focusing events that lead to a “large, negative, and enduring shift in opinion against a policy or group” (Bishin et al., 2016, p. 626). We observe this backlash most acutely in the short term, and it can eventually lead to long-term aggregate support for the Justices’ decision (Ura, 2014). As I will detail below, I explicitly test these competing theories of how the public might respond to major Supreme Court decisions in a causal framework using social media data.
Heterogeneous State Reactions
While there is extensive research and debate analyzing the impact of Supreme Court decisions on aggregate public opinion, much of the previous work does not explicitly test whether a shift in opinion is the same in the group of states where a ruling leads to a policy change. Under the Supremacy Clause of the United States Constitution (U.S. Constitution, Article VI, §2), such a change occurs whenever state and local policies contradict a federal decision. My work addresses this gap in the literature by considering the consequences of Supreme Court decisions that nullify some state policies while leaving the policies of other states unchanged. 4
The reason most earlier work does not consider the state-level reactions to Supreme Court rulings conditional on the state’s existing legal framework is likely limitations in available data; with few comparable state-by-state surveys, researchers often rely on national survey data (e.g., Franklin & Kosaki, 1989; Johnson & Martin, 1998; Marshall, 1989). However, given citizens in different parts of the country experience different policy consequences as a result of Supreme Court decisions, it seems natural to assume that different groups of states might have divergent reactions to the Justices’ rulings.
To hypothesize how public opinion will move in states where the Supreme Court overturns policy, I consider the literature on public opinion toward Federalism. Survey data over the course of many years demonstrate that citizens consistently view their state governments more favorably than the Federal government (Kincaid & Cole, 2000, 2008, 2011). These “attitudes are sensitive to respondents’ affiliation with the party in power nationally” (p. 66), with members outside the standing President’s party more likely to believe the federal government has too much power (Kincaid & Cole, 2011). These opinions also vary by region, with citizens in southern states more likely to believe their state/province is not “treated with the respect it deserves in the federal system of government” (Kincaid & Cole, 2008, p. 479). Given that the public tends to view state governments more favorably than the federal government, these studies suggest that when a Supreme Court decision goes against state-level policy, public opinion is prone to move away from the Justices’ decision, a hypothesis I am able to test with my research design.
Court Rulings and Opinion Toward Gay Rights
Prior to Obergefell v. Hodges, the Supreme Court ruled on a number of cases concerned with gay rights. While scholars analyzed the public response to these earlier cases, the empirical evidence across studies is mixed. Analyzing four separate gay rights cases, Stoutenborough et al. (2006) find public support moved in the direction of the court decision in one case, against the court decision in another, and remained unchanged for the remaining two cases. 5 More recently, research analyzing the public’s reaction to prominent Supreme Court cases expanding gay rights, including Obergefell v. Hodges, found little evidence that liberal decisions lead to a backlash against gay rights (Bishin et al., 2016; A. R. Flores & Barclay, 2016; Kazyak & Stange, 2018). 6 A. R. Flores and Barclay (2016) further find that residents of states where the Court introduced same-sex marriage policy showed the greatest reduction in anti-gay attitudes. One potential reason these studies find little evidence of backlash is that they utilize survey data, which often lag behind the date of a Supreme Court decision. This work may therefore miss an initial, short-term backlash (Ura, 2014).
Testing Hypotheses About the Public Response to Obergefell v. Hodges
Reviewing previous research allows me to come up with a number of predictions concerning the public’s response to the Supreme Court’s Obergefell v. Hodges ruling. Given the empirical support for the Structural Response Hypothesis, I predict that, in the aggregate, the Supreme Court will polarize public opinion. In addition, given the literature on public attitudes toward Federalism, I believe in those states where the Supreme Court’s decision resulted in a change in policy, there will be a more negative reaction toward the ruling as compared with other states in the short term.
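This allows me to develop two testable hypotheses:
H1: In the aggregate, the Supreme Court’s decision in Obergefell v. Hodges polarizes public opinion, as predicted by the Structural Response Hypothesis.
H2: In states where the decision resulted in a change in policy, the short-term reaction toward the ruling is more negative than in other states.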
Twitter Data and Sentiment Scoring
Although I address the oft-discussed question of how Supreme Court decisions impact public opinion, I do so with a different methodology compared with past studies. Rather than relying on survey data, this article utilizes machine-learning sentiment analysis methodologies to obtain a measure of public opinion from Twitter messages. This section briefly describes how I obtain and process this social media data and the strategies I used to quantify sentiment from raw text.
Using Twitter Data to Study Opinion
While survey data are far and away the most popular source of data in studying public opinion, they are nearly impossible to collect for my present research question. First, to measure changes in public opinion before and after major Supreme Court cases, one needs to run comparable polls immediately before and after the Justices reach their decision, a “limiting factor for all studies of Supreme Court influence on public opinion” (Brickman & Peterson, 2006, p. 98). Second, studying the heterogeneous impact of a Supreme Court decision requires strong state samples, with many national surveys failing to report state-by-state results, given the margin-of-error for smaller states can be problematic for inference (Silver, 2016). 7 While researchers have developed several techniques to estimate state samples from national survey data, including disaggregation (Erikson et al., 1993) and multilevel regression and poststratification (MRP; Lax & Phillips, 2009a, 2009b), the additional need to find comparable national surveys immediately before and after a ruling makes it difficult to use these techniques to study the short-term reactions to Supreme Court decisions.
Collecting messages on a site like Twitter is a potential way to circumvent these issues. Users send tweets in real time, allowing for much more fine-grained estimates of public opinion in comparison with monthly (or even weekly) polls. Twitter data are also “always-on,” making it possible to continuously collect information without needing to specify where and when to conduct a particular survey (Salganik, 2018, p. 21), allowing a researcher to study a wide range of unexpected events that might alter public sentiment and discourse. Although an imperfect substitute for well-collected polling data, many researchers have demonstrated how Twitter data can provide a strong signal of public opinion (e.g., Beauchamp, 2017; McKelvey et al., 2014; O’Connor et al., 2010).
Of course, one of the major weaknesses of Twitter data is the fact that the population of American Twitter users is not a representative sample of the adult population in the United States. Research into the demographic makeup of Twitter users shows that populous American counties tend to be overrepresented (Mislove et al., 2011), users are more likely to be younger and richer (Barberá & Rivero, 2015), and, overall, there is a liberal and pro-Democratic bias compared with the country as a whole (Mitchell & Hitlin, 2013). While a nonrepresentative population is not unique to social media data (nonresponse rates in polls produce a similarly difficult-to-correct bias; e.g., Desilver & Keeter, 2015; Groves & Peytcheva, 2008; Massey & Tourangeau, 2013), it does make it difficult to claim high external validity in any study relying on Twitter data. That said, studying the population on Twitter does allow for research designs that maintain strong internal validity, allowing us to consider the comparative statistics beyond the overall level of estimated effects. Given the lack of alternative polling data differentiated by state over the short time frame around the Obergefell v. Hodges decision, Twitter data represent the best alternative to conduct a causal analysis. 8
Gathering Twitter Data
To utilize Twitter data to study changes in opinion concerning same-sex marriage, it is first necessary to filter through the vast quantity of Twitter data and obtain only the subset of messages where users discuss topics relating to gay marriage and rights. I accomplish this by utilizing the Twitter Streaming API, a tool that pulls any tweet that fits certain criteria in real time. 9 To obtain all relevant tweets, I tracked the following set of words: gay marriage, gay marriages, same-sex marriage, same-sex marriages, same sex marriage, same sex marriages, same-sex union, same-sex unions, same sex union, same sex unions, marriage equality, equal marriage. 10 I pulled tweets containing one of these keywords from the Twitter Streaming API and placed them into a MySQL database with a Python script. This monitor ran from May 27, 2015, to August 24, 2015, collecting 5,996,741 total tweets.
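The keyword filter described above can be illustrated with a small sketch. This is not the collection script used in the article and it does not call the Twitter Streaming API; it only shows, under the assumption of simple case-insensitive substring matching, how a tweet's text could be checked against the tracked phrases. The function name `mentions_tracked_phrase` is hypothetical.

```python
import re

# The phrases listed in the text; hyphenated and spaced variants are tracked separately there.
TRACKED_PHRASES = [
    "gay marriage", "gay marriages",
    "same-sex marriage", "same-sex marriages",
    "same sex marriage", "same sex marriages",
    "same-sex union", "same-sex unions",
    "same sex union", "same sex unions",
    "marriage equality", "equal marriage",
]

# One case-insensitive alternation over all phrases. Substring matching is an
# assumption made for this sketch; the API's own matching rules may differ.
PATTERN = re.compile(
    "|".join(re.escape(phrase) for phrase in TRACKED_PHRASES),
    re.IGNORECASE,
)


def mentions_tracked_phrase(text: str) -> bool:
    """Return True if the text contains at least one tracked phrase (hypothetical helper)."""
    return PATTERN.search(text) is not None
```

Because only presence is tested, the order of the phrases in the alternation does not matter here.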
For each tweet collected, several other pieces of relevant metadata were captured, including the time stamp a message was sent and the user’s number of followers. When available, I also collect user profile data, such as the user’s full name and location. 11
As the goal of this project is to analyze sentiment within the United States, I focus on the subset of tweets with location data that can be mapped to a specific U.S. state. I rely on self-reported locations to map users into U.S. states. Specifically, I employ a large series of regular expressions with state names and the most populous American cities to map self-reported location data into a standardized state-coding scheme. In total, I mapped 1,028,151 messages to a specific state. 12
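A minimal sketch of this kind of regular-expression mapping follows. The article's full pattern list is not given, so the three states, the city names, and the function `map_location_to_state` are illustrative assumptions rather than the coding scheme actually used.

```python
import re
from typing import Optional

# Illustrative subset only: state names plus a few populous cities per state.
STATE_PATTERNS = [
    ("TX", re.compile(r"\b(texas|houston|dallas|austin)\b", re.IGNORECASE)),
    ("NY", re.compile(r"\b(new york|nyc|brooklyn)\b", re.IGNORECASE)),
    ("CA", re.compile(r"\b(california|los angeles|san francisco)\b", re.IGNORECASE)),
]

# Upper-case two-letter codes after a comma, as in "Austin, TX".
ABBREVIATION = re.compile(r",\s*([A-Z]{2})\b")
KNOWN_ABBREVIATIONS = {code for code, _ in STATE_PATTERNS}


def map_location_to_state(location: str) -> Optional[str]:
    """Map a self-reported location string to a state code, or None if no rule matches."""
    match = ABBREVIATION.search(location)
    if match and match.group(1) in KNOWN_ABBREVIATIONS:
        return match.group(1)
    for code, pattern in STATE_PATTERNS:
        if pattern.search(location):
            return code
    return None
```

Locations that match no rule return `None`, which corresponds to messages that cannot be mapped to a specific state.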
In addition to analyzing location data, I examine the subset of users that choose to share their full name to predict demographic characteristics. Specifically, I use the gender package (Mullen, 2015) to link first names to gender and the wru package (Khanna & Imai, 2015) to link surnames with race. While it is impossible to perfectly predict gender and race based on names, these packages are commonly employed in the literature, utilizing census data to predict these variables. In total, 481,487 messages were from users with names that could be linked to race and gender.
Finally, I classify a subset of users as either Republicans or Democrats. These labels come from data collected by Pablo Barberá (2013). Very briefly, Barberá’s work takes advantage of follower networks to predict the likelihood an individual is a Republican or Democrat, with the estimation strategy relying on the logic that a Republican is more likely to follow other Republicans and Democrats are more likely to follow other Democrats. I was able to merge 184,042 messages in my own Twitter data with users in Barberá’s data, creating a subset of accounts with estimated party labels.
Sentiment Scoring
After collecting a large set of Twitter data, I preprocess the raw text data in a way that makes it possible to utilize various supervised sentiment scoring algorithms to measure sentiment. 13 Supervised training methods require a training set, a collection of messages annotated with true labels. As the goal of this project is to classify tweets based on sentiment, this involves building a training set of tweets labeled as positive or negative.
I use the crowdsourcing platform Mechanical Turk to obtain a set of hand-annotated tweets to build a binary sentiment classifier. Each tweet was labeled by three human coders, with the final label being the majority category. 14 In total, I collect a set of 626 negative and 1,778 positive unique tweets. 15 To create a training set most representative of the corpus, I gather labels for the most-retweeted messages in my data collection period, with the 626 unique negative messages representing 44,031 total tweets, and the 1,778 unique positive messages representing 161,525 total tweets.
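The majority-vote rule for three coders can be sketched as follows. The function name `majority_label` and the string labels are assumptions for illustration; with an odd number of coders and two categories, a strict majority always exists.

```python
from collections import Counter


def majority_label(labels):
    """Return the label chosen by most coders.

    Assumes binary labels and an odd number of coders, so ties cannot occur.
    """
    if len(labels) % 2 == 0:
        raise ValueError("an odd number of coders is required to avoid ties")
    if len(set(labels)) > 2:
        raise ValueError("this sketch assumes at most two distinct labels")
    label, _count = Counter(labels).most_common(1)[0]
    return label
```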
With this training set, I test a number of supervised classifiers, settling on random forest as the algorithm that leads to the best results (more details on this procedure can be found in the Supplemental Material). Leaving out 10% of the training data for a test set, my final model specification has 81.74% accuracy and a Cohen’s kappa coefficient of 0.40. Of the 1,028,151 tweets I map to a U.S. state, I classify 182,031 as negative and 846,120 as positive. On acquiring this well-performing estimate of sentiment in a carefully selected subset of tweets, I use these data as a measure of sentiment toward gay rights issues before and after the Supreme Court’s federal legalization of same-sex marriage.
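Accuracy alone can be flattering when classes are imbalanced (about 74% of the unique labeled tweets are positive), which is why Cohen's kappa is reported alongside it. The sketch below computes both from the four cells of a binary confusion matrix. The counts in the test are made up for illustration and are not the article's test-set results.

```python
def accuracy_and_kappa(tp, fn, fp, tn):
    """Return (accuracy, Cohen's kappa) for a binary confusion matrix.

    tp, fn: positive items predicted positive / negative
    fp, tn: negative items predicted positive / negative
    """
    n = tp + fn + fp + tn
    if n == 0:
        raise ValueError("confusion matrix is empty")
    observed = (tp + tn) / n
    # Agreement expected by chance, from the marginal totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa
```

Kappa rescales accuracy so that 0 corresponds to chance-level agreement given the class proportions and 1 to perfect agreement.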
Aggregate Shift in Public Opinion
I begin by testing the Structural Response Hypothesis by replicating the analysis outlined in Franklin and Kosaki (1989). This model takes the form:
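The probit specification can be sketched in latent-variable form as follows. Only the interaction coefficients β2k are named in the text, so the symbols α, β0, and β1k, and the name Y* for the latent index, are assumptions made here for illustration.

```latex
Y_i^{*} = \alpha + \beta_{0}\,\mathrm{After}_i
        + \sum_{k=1}^{K} \beta_{1k}\, X_{ik}
        + \sum_{k=1}^{K} \beta_{2k}\,\bigl(\mathrm{After}_i \times X_{ik}\bigr)
        + \varepsilon_i,
\qquad
Y_i = \mathbf{1}\!\left[\,Y_i^{*} > 0\,\right]
```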
where i indexes messages, Y is a “positive” or “negative” classifier, “After” is an indicator variable specifying whether the message was from before or after the Supreme Court ruling, X represents covariates, and ε represents unobservables. To measure Y, I use the random forest classifier described in the previous section to label each message in my data set as positive or negative, replacing random forest scores with hand-labeled Mechanical Turk results when available (the hand-annotated labels are closer to the ground truth). I code positive messages as one and negative messages as zero. I estimate the above equation with a probit model.
To test the Structural Response Hypothesis, I run two models: a constrained model in which β2k is set to zero for all K covariates, and an unconstrained model where these values are allowed to vary. If I reject the constrained model in favor of the unconstrained model, it demonstrates that the Supreme Court decision alters the structure of opinion. I run two pairs of models: a pair that only includes demographic variables and a pair that includes demographic variables and party labels. Table 1 contains the results of these tests. 16
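One common way to compare nested models of this kind is a likelihood-ratio test. The article reports chi-squared values but does not name the test, so treating the comparison as a likelihood-ratio test is an assumption of this sketch, as is the function name.

```python
def likelihood_ratio_statistic(loglik_constrained, loglik_unconstrained):
    """Return 2 * (log-likelihood of unconstrained model - log-likelihood of constrained model).

    Under the null that the constraints hold, the statistic is asymptotically
    chi-squared with degrees of freedom equal to the number of constraints
    (here, the K interaction coefficients set to zero).
    """
    return 2.0 * (loglik_unconstrained - loglik_constrained)
```

A large statistic relative to the chi-squared distribution with K degrees of freedom leads to rejecting the constrained model.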
Table 1. Structural Response Hypothesis results.
*p < .1. **p < .05. ***p < .01.
In both pairs of models, I reject the constrained model in favor of the unconstrained model at high levels of significance, which confirms my first hypothesis (H1). Of note is the fact that this level of significance is much higher when including party fixed effects, as demonstrated by the much larger chi-squared value across models three and four. This pattern demonstrates that the polarizing impact of Obergefell v. Hodges was especially pronounced across party lines.
Overall, these results provide further evidence for the Structural Response Hypothesis, demonstrating that the Supreme Court polarizes aggregate public opinion. Confirming the core result of Franklin and Kosaki (1989) represents a good initial validation of the accuracy of my sentiment classifier.
Impact of Policy Change
To test how the Supreme Court’s ruling in Obergefell v. Hodges may have affected the expression of sentiment toward gay marriage for citizens in regions where the Supreme Court overturned state-level policy, I use a difference-in-difference estimator to identify a treatment effect. The difference-in-difference estimator works by differencing across the treated and untreated observations, as well as across time. This effectively differences out both time-invariant unobservables that differ between the two groups and time-varying unobservables common to both groups, allowing for a causal interpretation of the difference-in-difference coefficient.
However, this estimation technique is only useful if what occurred in the untreated set is a reasonable counter-factual for what might have happened in the treated set. Thus, in this setting, I assume the treated states would have had a similar reaction as their untreated counterparts if the Supreme Court decision did not lead to a top-down shift in state policy. Importantly, the level of sentiment can still differ greatly between the two sets of states: only the general time trend must be the same, an assumption explored below.
When these assumptions hold, there is no need for a difference-in-difference estimation to include other covariates. However, as the parallel trends assumption is very difficult to test, I include a number of covariates that could reasonably explain the heterogeneous response to the Supreme Court ruling across each set of states.
The difference-in-difference regression takes the form:
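A standard linear specification consistent with the definitions that follow is sketched here. The text names only β3, the coefficient on the interaction; the labels β0, β1, β2, and γ are assumptions made for illustration.

```latex
Y_i = \beta_0 + \beta_1\, D_i + \beta_2\, \mathrm{After}_i
    + \beta_3\,\bigl(D_i \times \mathrm{After}_i\bigr)
    + X_i^{\top}\gamma + \varepsilon_i
```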
where i indexes messages, Y is a “positive” or “negative” classifier (defined in the way described in the previous section), D is a treatment indicator that takes the value of 1 if the user sent the tweet from a state where the Supreme Court decision led to a change in state policy, and “After” is an indicator variable that takes on the value 1 if the user sent the tweet after the Supreme Court’s decision on June 26, 2015. X represents a set of potential control variables and ε represents unobservables. I run this regression with a linear probability model, as the assumptions of the difference-in-difference estimator require linearity to interpret the results causally.
The coefficient of interest in the above equation is β3, which corresponds with the average change in the expression of positive sentiment in the treatment group before and after the Supreme Court decision, minus the change in sentiment over the same period of time in the untreated group. This difference-in-difference represents the change in sentiment caused by the treatment, in this case the change in sentiment that results from the Supreme Court overturning state-level policy.
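Without controls, this coefficient equals a simple difference of four group means, which the following sketch computes directly. The function name and the toy data in the test are illustrative assumptions, not the article's data.

```python
def did_estimate(rows):
    """Difference-in-difference of mean outcomes.

    rows: iterable of (treated, after, y) tuples, with treated and after in {0, 1}
    and y the 0/1 sentiment label. Returns
    (treated after - treated before) - (untreated after - untreated before).
    """
    sums = {}
    counts = {}
    for treated, after, y in rows:
        key = (treated, after)
        sums[key] = sums.get(key, 0.0) + y
        counts[key] = counts.get(key, 0) + 1

    def mean(treated, after):
        key = (treated, after)
        if key not in counts:
            raise ValueError(f"no observations for treated={treated}, after={after}")
        return sums[key] / counts[key]

    return (mean(1, 1) - mean(1, 0)) - (mean(0, 1) - mean(0, 0))
```

A negative value means the share of positive messages fell more (or rose less) in treated states than in untreated states over the same period.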
In total, I consider five models and present the results in Table 2. The first model is the baseline difference-in-difference model, with no added controls. The second model removes all tweets sent on June 26, the day of the Supreme Court decision, to look beyond the impact of individuals who tweeted only on June 26 and at no other point in the data set. Model 3 includes gender and race fixed effects, Model 4 includes partisan labels, and Model 5 includes both.
Table 2. Difference-in-difference analysis results.
*p < .1. **p < .05. ***p < .01.
In Table 2, I find a negative and statistically significant Treated × After coefficient across the first four model specifications, providing strong empirical evidence that the Supreme Court’s decision led to a more negative reaction in those states where the decision led to a policy change. Thus, I find evidence for my second hypothesis (H2): the Supreme Court’s decision caused short-term backlash against gay marriage and gay rights in those affected states.
In Model 5, I find a null result. While this model includes the most covariates, I severely restrict the data set by only considering users with names that could be linked to race and gender and could be matched to party labels. In total, this limits me to less than 10% of the original data, and losing statistical power is one reason I may fail to recover the effect.
Overall, the highly significant and negative Treated × After coefficient across the first four models demonstrates the relative backlash in the treated states after the Court decision. Thus, these models provide support that, when the Supreme Court overturns state policy, there is less relative support for the decision in affected states. While this may seem to contradict earlier results that demonstrate a positive response to the Justices’ decision, this is most likely due to the very short time frame of my current study. Previous studies (A. R. Flores & Barclay, 2016; Kazyak & Stange, 2018) analyzing the public’s reaction to Obergefell v. Hodges use survey data that lag behind and ahead of a Supreme Court case, perhaps failing to identify a short-term backlash.
Turning to each model in detail, Models 1 and 2 represent baseline models, with and without tweets from the day of the Supreme Court decision. Looking at the After coefficient across Models 1 and 2, I find that, once the strong positive response immediately following the Justices’ decision on June 26 is set aside, there appears to be an overall backlash in the short term. The Treated coefficient is also negative in all models that do not include party labels. In Model 4, I find that including party labels leads to a statistically significant and positive Treated coefficient, demonstrating that when controlling for partisanship, there were overall more positive messages in these states. Even in this model specification, however, Treated × After remains negative and significant, demonstrating that there was a relatively less positive response to the court ruling in the treated states. The large, negative, and highly significant Republican coefficients in Models 4 and 5 are not surprising, as conservative groups (consisting of mostly Republicans) consistently respond negatively to policies that advance a gay rights agenda.
Parallel Trends Assumption
While these results do not definitively prove causality, they demonstrate that the Supreme Court overturning state policy is correlated with less positive sentiment toward gay marriage and gay rights issues. If the untreated states are a good counter-factual to the treated states, this correlation can be interpreted causally.
This requires me to consider the untreated states as a good counter-factual to what might have occurred in the treated states. Unfortunately, this assumption is impossible to test. That said, if I can demonstrate that the treated and untreated states had a parallel trend in expressed sentiment prior to June 26, I can argue that the untreated set is a good control group for the treated set. 17 To explore this parallel trend assumption, I graph the differences in the daily mean sentiment score for treated and untreated states over time. As these daily sentiment scores are volatile, I chart the 7-day moving average (and LOESS curve in blue) to better visualize the data. This visualization is found in Figure 1.

Figure 1. Time trends in sentiment across treated and untreated states.
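The smoothing behind a chart of this kind can be sketched as a moving average over the daily gap in mean sentiment. Whether the article's window is trailing or centered is not stated; a trailing window is assumed here, and both function names are hypothetical.

```python
def daily_gap(treated_means, untreated_means):
    """Day-by-day difference in mean sentiment: treated minus untreated."""
    if len(treated_means) != len(untreated_means):
        raise ValueError("both series must cover the same days")
    return [t - u for t, u in zip(treated_means, untreated_means)]


def moving_average(values, window=7):
    """Trailing moving average; the first value is reported once a full window is available."""
    if window <= 0:
        raise ValueError("window must be positive")
    averages = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]  # drop the day that left the window
        if i >= window - 1:
            averages.append(running / window)
    return averages
```

Under the parallel trends assumption, the smoothed gap should be roughly flat before June 26, even if it is not zero.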
In Figure 1, I find that overall the parallel trends assumption seems to hold, as both the treated and untreated states have a similar overall trend in expressed sentiment prior to June 26. For the most part, treated states have lower sentiment scores than their untreated counterparts, although there are periods of time where the scores overlap. After the court decision, there was a general widening in the gap between sentiment scores across the two sets of states, a gap driving the difference-in-difference results. This gap is especially pronounced from roughly July 1 to July 20. Exploring messages from these days might elucidate why there was an increase in negative sentiment during this period, although this investigation is beyond the scope of this article. Near the end of the period of analysis, the treated and untreated states once again begin to converge, indicating a possible mean-reversion. This again demonstrates that my finding could point toward short-term backlash, with an eventual positive response later in the time trend.
Conclusion
In this project, I bring a new perspective to the long-standing debate on how the Supreme Court impacts public opinion. In the landmark case Obergefell v. Hodges, the Supreme Court definitively ruled that same-sex marriage was a “fundamental right,” conferring the right to marry for same-sex couples across the United States. As same-sex marriage is a divisive social issue, previous studies theorize this Supreme Court decision would cause opinion to be further polarized across the American public.
However, few earlier studies explicitly consider Supreme Court decisions’ heterogeneous impact on different groups of states. With varying preexisting legal conditions, a court ruling might overturn certain state policies while leaving other policies unchanged. Such was the case in Obergefell v. Hodges, with only 13 of the 50 states having policy overturned in the wake of the Justices’ decision. I study this event in a causal inference framework with a difference-in-difference estimator, finding that overturning state-level policy led to a relatively more negative reaction toward the decision by citizens in those affected states.
I engage in this analysis with a novel data set: rather than conducting my study with public opinion polling data, I utilize machine-learning methods to classify a large set of political tweets as positive or negative with a high degree of accuracy. These data allow me to track expressed sentiment toward gay rights issues in a short time frame, making it possible to detect shifts in sentiment immediately before and after the Supreme Court’s decision. While social media data have their own set of potential issues, relying on Twitter allows me to construct a large data set with coverage across the entire United States over the short period of time before and after the Court ruling, a necessary precondition for conducting a difference-in-difference analysis and for analyzing the effect of the Supreme Court case in a causal framework.
This work represents a theoretical and methodological contribution to the literature on the Supreme Court’s impact on public opinion. On the theory side, my work demonstrates that analyzing national survey data without considering state samples is insufficient for understanding the impact of Supreme Court decisions on public sentiment when those decisions have varied regional consequences. This work suggests that, when the Supreme Court overturns state policy, it leads to a relative short-term backlash against the Justices’ decision. Future work should extend the data collection period to discover how this trend changes in the months and years after a decision is reached, as well as look at new court cases in different issue areas to establish this as a general finding.
On the methodological side, I demonstrate that combining sentiment analysis techniques with social media data grants a new perspective in analyzing public opinion. These data allow me to isolate the state-by-state reactions immediately following and preceding the Obergefell v. Hodges Supreme Court ruling, making it possible to analyze reactions to policy change in a causal inference framework, a unique contribution to this literature. In the future, gaining a better understanding of the demographic population on Twitter and improving the machine-learning classification techniques will improve this methodology, allowing researchers to better understand and correct the bias in analyzing a nonrepresentative subpopulation.
Supplemental Material
Supplemental material, 05_online_appendices for Policy Change and Public Opinion: Measuring Shifting Political Sentiment With Social Media Data by Nicholas Joseph Adams-Cohen in American Politics Research
Footnotes
Author’s note
Nicholas Joseph Adams-Cohen is also affiliated with the Immigration Policy Lab, Stanford University, Encina Hall, 616 Serra Street, Stanford, CA 94305, USA.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Supplemental Material
Supplemental material for this article is available online.
1.
These 13 states are Arkansas, Georgia, Kentucky, Louisiana, Mississippi, Missouri, Montana, Nebraska, North Dakota, Ohio, South Dakota, Tennessee, and Texas.
2.
Sentiment, broadly defined, is an expression of an individual’s “opinions, sentiments, evaluations, appraisals, attitudes, and emotions” towards a particular event, topic, or object (Liu, 2012). Public opinion refers to a citizen’s feelings regarding an important political issue (Norrander & Wilcox, 2001). As the terms are closely linked, political sentiment and public opinion are used interchangeably in this work.
3.
While the popularity of the Supreme Court ebbs and flows over time (Caldeira, 1986), it is often shown to be perceived more favorably than either the legislative or executive branch (Cox, 1976; Marshall, 1989, pp. 139-141).
4.
It is important to note that a number of the thirty-seven states that legalized same-sex marriage prior to the Obergefell v. Hodges decision did so only as the result of a state or district court ruling. While it is possible to assume citizens in these states would have the same reaction as the citizens in the thirteen states where Obergefell v. Hodges led to a policy change, the backlash model theorizes citizens with direct exposure to focusing events are more likely to have a negative reaction to a policy (Hopkins, 2010). Therefore, even within this group of states, it is plausible that citizens in states where Obergefell v. Hodges led to a policy change would have an increased negative reaction towards the ruling.
5.
The four case studies were Bowers v. Hardwick (1986), Romer v. Evans (1996), Boy Scouts of America v. Dale (2000), and Lawrence v. Texas (2003).
6.
These cases include United States v. Windsor (2013), which invalidated sections of the Defense of Marriage Act, and Hollingsworth v. Perry (2013), which effectively legalized same-sex marriage in California.
7.
A good example of this limitation can be found when observing the Pew Research Center’s (2016) report on changing attitudes towards gay marriage. While age, religion, party identification, race, and gender are among the reported covariates, state data are not provided.
8.
Flores (2017) uses this same research design with Twitter data to study whether anti-immigrant laws lead to a backlash against immigration-related issues.
9.
A potential issue in utilizing data from Twitter’s Streaming API is that it does not provide access to the full universe of messages. However, as there is no systematic pattern to which data are unavailable from the API, the bias this introduces is small when collecting a large dataset.
10.
While I specified these keywords to follow a single issue over time, in relying on a static list of keywords, I risk missing important phrases that developed dynamically during the data collection period (King, Lam, & Roberts, 2017). However, one advantage of using a static list of terms is that I use the same criteria to select tweets during the entirety of my data collection period.
11.
A primary concern when collecting Twitter data is the potential incidence of ‘bot’ accounts, automated programs that perform a variety of actions on Twitter including sending messages, following other users, or retweeting messages (Jajodia, Wang, Gianvecchio, & Chu, 2012). While potentially problematic, an examination of the users in my data reveals little evidence that a large number of users are bots (see the online appendix, Checking for Bot Accounts, for details). Based on this, I do not believe bots heavily bias my results.
12.
The most accurate form of location information in Twitter data is the GPS coordinates (geotags) users can elect to post with their tweets (Steinert-Threlkeld, 2018, p. 7). However, as only 2-3% of all tweets contain geotags (Leetaru, Wang, Cao, Padmanabhan, & Shook, 2013), this would not provide enough geolocated tweets for me to engage in my analysis. In order to get a sense of how well my location coding scheme performs, I do analyze the subset of messages (1,990 in total) that are geotagged. For these messages, my mapping algorithm correctly classifies the user’s state 91.1% of the time, providing a good robustness check for the mapping algorithm.
14.
If all three Mechanical Turkers chose different categories, I dropped the data point. Only 5% of the tweets had no majority category, demonstrating high inter-coder reliability.
15.
16.
All regression tables are made with stargazer for R (Hlavac, 2015).
17.
Author Biography
References
