Abstract
In an age of big data, this study explores (a) how much certain online information is shared by media users and (b) what sentiments social media users predominantly express on Twitter. Quantitative findings indicate that after the 2011 nuclear disaster at Fukushima, Japan, nuclear energy–related tweets that linked to outside information far outnumbered tweets containing no external link. Results also indicate that the predominant tone in these tweets was one of pessimism about nuclear energy.
Many scholars have been interested in the effects of the Internet and social change (e.g., Bimber, 2003; Norris, 2001). The Internet has indeed changed “the rules for traditional media outlets” (Hindman, 2009, p. 89). In recent years, our paradigm of media message production and distribution has undergone a transformation. Interactive media permit nearly every individual to participate in a conversation, giving people a chance to influence if not set the public agenda (Benkler, 2006), as well as to discuss issues that come up in the news (Hardy & Scheufele, 2005). The ease of information distribution using social media means that any individual can participate in the process of disseminating information. Many people today are involved in social information networks; they easily engage with online content, sharing information with others or endorsing it. People use social media to seek and share information (Marwick & Boyd, 2011) and to offer their opinions on certain issues (Kushin & Yamamoto, 2010). In doing so, individuals can play significant roles in forming public opinion.
In contemporary societies, social media users can undoubtedly engage with a diversity of issues. Interestingly, the New York Times reported that the article most shared by Facebook users in the United States in 2011 was their article “Satellite Photos of Japan, Before and After the Quake and Tsunami” (Facebook, 2011). The significance of this article attaining such a rank is that it dealt with an issue with which many people were unfamiliar. While Americans may have known little about the science of nuclear catastrophes, they were still engaged in distributing information about the issue online.
Online media users can easily engage with societal issues by linking to external news stories in their posts (Szabo & Huberman, 2010). Considering the notion of collective power or collective intelligence, it is important to explore how information on a given topic is shared by online media users. In the era of big data, the large amount of public discourse data available across web-based media platforms allows us to gauge public opinion expression almost in real time. Public attitudes toward certain issues can be brought to light by analyzing the terms media users use in online messages; this can be accomplished through automated, algorithm-based sentiment analysis (Ceron, Curini, Iacus, & Porro, 2014).
It is worth remarking that Japan’s nuclear power plant disaster in March 2011 made nuclear energy a hot-button issue once again. In the disaster’s aftermath, media coverage, both Japanese and foreign, was massive, provoking widespread discussion of the accident itself and of radiation contamination (Lazic, 2013). Using nuclear energy as our focal issue, this study explores the role of Twitter as an emerging channel of news distribution in the changing information environment.
The Rise of Twitter as a News Distribution Channel
Twitter, a widely used channel for online communication, is a microblogging website that allows users a limit of 140 characters per post, or “tweet.” Twitter allows users to engage in public discourses on certain issues through its website and its app for mobile devices. Based on a report published by Twitter (2015), about 316 million active unique users throughout the world send more than 500 million tweets per day. The service has facilitated online users in their participation in collective action, dissemination of emerging news, and exchange of opinions (Runge et al., 2013).
Research on Twitter as an information source is growing. Considering the role of Twitter as a critical hub for (a) connecting anonymous Twitter users, including their followers, with others (Runge et al., 2013) and (b) linking users to online content from third-party sites (Szomszor, Kostkova, & De Quincey, 2012), researchers have suggested that Twitter could be used as an information source for emergency communication (Mills, Chen, Lee, & Rao, 2009). In fact, Twitter users generated a large number of tweets during such emergency events as (a) the 2009 police officer shootings in Washington (Heverin & Zach, 2010), (b) the 2011 “Arab Spring” uprisings (Lotan, Graeff, Ananny, Gaffney, & Pearce, 2011), (c) the Red River Valley Flood in 2009 (Palen, Starbird, Vieweg, & Hughes, 2010), and (d) the Boston Marathon Explosions in 2013 (Cassy, Chunara, Mandl, & Brownstein, 2013).
Because Twitter significantly influences information diffusion, researchers have become interested in whether it is a social network or a news media platform (e.g., Kwak, Lee, Park, & Moon, 2010). Gruzd and Roy (2014) found that people on Twitter are more likely to connect to others to seek and share information than they are to seek social relationship bonding. In addition, Runge et al. (2013) stated that topics that are covered in tweets are a more important determinant than source-cue when Twitter users decide whether to follow certain posts. Thus, the present study holds that Twitter has indeed played an important role as a news distribution channel in public discourses.
With the significant rise of social media as emerging news delivery channels (Ju, Jeong, & Chyi, 2014; Papacharissi & de Fatima Oliveira, 2012), researchers have grown interested in the agenda-setting function of interactive media. For instance, Sayre, Bode, Shah, Wilcox, and Shah (2010), who applied agenda-setting theory to social media settings, argued that digital media sometimes follow the mainstream media agenda. This would seem to suggest that the agenda-setting power of traditional media persists. Yet on occasion social media influence the agenda of mainstream media. Despite the dynamic nature of the relationship between old and new media, it is still worth considering the potential influence of agenda setting in social media settings. Given the opportunity to interact with information about emergency events on Twitter, and on the basis of (a) the first level of agenda setting (i.e., the function of the media in influencing the salience of issues on the public agenda) and (b) inter-media agenda setting (i.e., the sharing of media agendas among different media channels), this study posits a significant spike in content discussing nuclear energy in social media following the Fukushima nuclear disaster.
This study is also concerned with the potential effect of media tone on public attitudes toward a controversial issue. It is difficult to predict the dominant sentiment regarding nuclear energy on social media because the production of radioactive waste as an outcome of generating nuclear energy has made nuclear power a controversial issue. In spite of the damage to public perception of nuclear power caused by the recent nuclear disaster in Japan, public opinion related to the technology is still ambivalent. In fact, a 2015 report showed that about half of Americans support increased use of nuclear energy, whereas 43% of Americans are against it (Gallup, 2015). Public controversy over nuclear energy has potentially been exacerbated by how the media have covered the issue. When it comes to public awareness of scientific and technological issues, mass media can influence how individuals form judgments about these issues (Scheufele & Lewenstein, 2005). Accordingly, media coverage of nuclear energy has been associated with how publics form attitudes toward the technology (Yeo et al., 2014). However, it is also worth noting that in dealing with a controversial issue, people are more likely to seek social cues to confirm the general opinions of others on the issue because they do not want to counter others’ normative expectations (Spartz, Su, Brossard, Griffin, & Dunwoody, 2015). Indeed, people tend to heed crisis communication via social media more than they do crisis communication via traditional media (Schultz, Utz, & Göritz, 2011; Utz, Schultz, & Glocka, 2013). It is thus worth exploring what sentiment Twitter users predominantly express regarding nuclear energy, given that this could provide a social cue for gauging public attitudes on the controversial issue.
Research Questions
On Twitter, users can engage in news sharing in different ways. They can decide whether to insert web links to external sites in their tweets. If they choose to do so, Twitter users can include a hyperlink in their tweets to link their social network connections to online information from external sources such as online news articles (Szomszor et al., 2012). By providing outside information (i.e., inserting an external web link) in their tweets, Twitter users can direct their followers to further information that supports their views (Himelboim, McCreery, & Smith, 2013; Tanaka, Sakamoto, & Honda, 2014). Also, readers of tweets containing external links have more chances, if they click on the link, to reach further information that might be interesting or important (Hughes & Palen, 2009). Interestingly, social media users evaluate tweets differently if they include an external link (Alonso, Carson, Gerster, Ji, & Nabar, 2010), tending to perceive such a tweet as more interesting than one without a link. Hence, we have separated tweets into those with external links and those without links, and pose the following research questions:
Methods
To gauge (a) the volume and proportion and (b) the sentiment of the tweets shared by Twitter users regarding nuclear energy–related content, we collected tweets related to nuclear energy that were shared over a 36-month period (October 1, 2010, through September 30, 2013). We chose this time frame because the March 2011 disaster was likely to affect both the volume and the sentiment of nuclear energy–related tweets.
For this study, we collected and analyzed big data using an automated nonparametric content-analysis software, Forsight, developed by Crimson Hexagon. The social media monitoring and analysis company Crimson Hexagon uses a “trained” algorithm to track linguistic patterns of specific concepts (for this study, sentiment toward nuclear energy as optimistic, neutral, or pessimistic) identified by human coders (Hopkins & King, 2010; see also methods papers, for example, Pang & Lee, 2008; Su et al., 2013, for more details about the use of sentiment analysis tools). There are several benefits associated with using this software for online content analysis. As Crimson Hexagon’s monitor captures the complete posts of all publicly posted tweets, there is no concern about a sampling margin of error. In addition, the categorization performed by the algorithm has shown satisfactory reliability, with a ±3% margin of error (see http://www.crimsonhexagon.com/wp-content/uploads/2012/06/CH_e-Book_Four_Cs_of_Social_Media.pdf). Hence, researchers in the computer sciences have used sentiment analysis (using algorithms to automatically track linguistic patterns) to track public discourses in the online sphere (Bansal, Cardie, & Lee, 2008; Chmiel et al., 2011).
To conduct the content analysis, a total of 10 trained coders were involved in the process of training the software. In detail, all human coders worked in rotating groups of three to train the program and manually code example posts in each category. To improve the reliability of the coding process, a single consensus codebook was provided to all the interrelated teams, and all coders shared their questions or discussed any ambiguous tweets at a weekly meeting. Also, to enhance the program’s validity and reliability, our coding groups reviewed each classified post and ruled out any controversial tweets.
In the process of content analysis, our coding team determined a set of keywords related to nuclear energy and combined them using Boolean search logic. This way, the content-analysis software could identify nuclear-related tweets for analysis. In our study, the close technological and perceptual correspondence between nuclear energy and nuclear weapons was reflected in the list of keywords. This study considered the extended influence of perceived fears toward radiation from nuclear weapons on sentiments regarding nuclear energy (Kasperson, Berk, Pijawka, Sharaf, & Wood, 1980; Lifton, 1973).
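As a rough illustration of how a Boolean keyword query can select candidate tweets, the following Python sketch matches a tweet when it contains at least one inclusion term and no exclusion phrase. The term lists (`ANY_OF`, `NONE_OF`) and function names are hypothetical and chosen only for illustration; they are not the study's actual search string, and this is not how the commercial software implements its query engine.

```python
import re

# Hypothetical keyword groups for illustration; NOT the study's search string.
ANY_OF = ["nuclear", "fukushima", "radiation"]
NONE_OF = ["nuclear family"]


def matches_query(text, any_of=ANY_OF, none_of=NONE_OF):
    """Return True if text has at least one any_of term and no none_of phrase."""
    lowered = text.lower()
    # Boolean NOT: reject the tweet if any excluded phrase appears.
    if any(phrase in lowered for phrase in none_of):
        return False
    # Boolean OR: accept if any inclusion term appears as a whole word.
    return any(
        re.search(r"\b" + re.escape(term) + r"\b", lowered) for term in any_of
    )


def count_mentions(tweets):
    """Count matching tweets; a tweet with several keywords counts once."""
    return sum(1 for tweet in tweets if matches_query(tweet))
```

Because each tweet is tested once against the whole query, a tweet containing several keywords still contributes a single mention to the count.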
After collecting a census of all nuclear energy–related tweets based on these keywords, the program randomly extracted and displayed a set of sample tweets from all the available posts within the time frame specified. A total of 248 sample tweets were classified into specific categories by human coders in the monitor-training process (the term “monitor” is used to describe each query related to a set of analyses; see Pew Research Center, 2015; Su et al., 2013). During training, to ensure that the categories remained mutually exclusive, human coders did not classify tweets that fit into more than one category, such as tweets expressing both optimistic and pessimistic opinions on nuclear energy. See Table 1 for samples of tweets coded.
Table 1. Examples of Tweets (Mentions) Coded According to the Conceptualization of Interest.
We refer to a mention as any tweet that matches the set of keywords and date range for analysis. Mentions represent the size of relevant and irrelevant discourse about the issue because any tweet that includes multiple keywords is still considered as a single mention.
For this study, we created two different monitors (Crimson Hexagon uses “monitor” to refer to assessing the volume or sentiment in online discourses) to train the software to recognize nuclear energy–related tweets. To gauge the proportion of relevant content on Twitter, the monitors collected tweets on nuclear energy, including all those that did and did not include a link to an external site. To assess how the tweets framed nuclear power generation, a monitor was created that categorized tweets under each “with link” and “without link” group as optimistic, neutral, or pessimistic. Human coders trained the program by classifying an initial set of tweets into each category. Optimistic refers to language indicating positive attitudes about the future of nuclear energy or beneficial outcomes associated with the technology. Pessimistic refers to language indicating a gloomy picture or outlook of nuclear power, or a tendency to see the negative aspects of the energy source. Neutral refers to language that contained no overt judgment, indicating neither a positive nor a negative outcome associated with the use of nuclear energy. (See Table 1 for examples of tweets fitting into each sentiment category.) In the process of coding, the study used an automated, linguistic pattern–based sentiment analysis approach combining machine-learning algorithms, computational linguistics, and human input processing (e.g., using sets of words with known positive, neutral, and negative meanings that exist in English). What could be confusing, however, were certain terms derived from a combination of opposite words (e.g., “not bad”). These could be confusing because they contain a negative word while conveying a positive sentiment. In such cases, during the monitor-training process by human coders, the algorithm was trained to consider the context in which such terms were used instead of examining terms in isolation.
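To make the “not bad” problem concrete, the sketch below shows a deliberately simple lexicon-based classifier that flips the polarity of a sentiment word when it directly follows a negator. It is a toy illustration only, not the trained algorithm used in this study; the word lists (`POSITIVE`, `NEGATIVE`, `NEGATORS`) and the `classify` function are assumptions made for the example.

```python
# Hypothetical mini-lexicons for illustration only.
POSITIVE = {"good", "safe", "clean", "beneficial"}
NEGATIVE = {"bad", "dangerous", "disaster", "unsafe"}
NEGATORS = {"not", "no", "never"}


def classify(text):
    """Label text as optimistic, pessimistic, or neutral using word polarity.

    A sentiment word that immediately follows a negator has its polarity
    flipped, so "not bad" counts as positive rather than negative.
    """
    tokens = [tok.strip(".,!?\"'").lower() for tok in text.split()]
    score = 0
    negate = False
    for tok in tokens:
        if tok in NEGATORS:
            negate = True
            continue
        # +1 for a positive word, -1 for a negative word, 0 otherwise.
        polarity = (tok in POSITIVE) - (tok in NEGATIVE)
        if polarity:
            score += -polarity if negate else polarity
        # Negation only applies to the word right after the negator.
        negate = False
    if score > 0:
        return "optimistic"
    if score < 0:
        return "pessimistic"
    return "neutral"
```

For example, `classify("Nuclear power is not bad")` returns `"optimistic"`, whereas examining “bad” in isolation would have pushed the tweet toward the pessimistic category. The one-word negation window is a simplification; a phrase such as “not very bad” would still be misread, which is one reason context-aware training by human coders matters.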
During the process of training the monitors, tweets that were off topic or failed to fit into any of the specific categories were labeled “off topic” and subsequently coded as such by the monitor.
Based on previous validity and reliability tests of the program, it was considered sufficiently trained once the human coders had coded at least 20 example posts in each category (Pew Research Center, 2015; Runge et al., 2013; Su et al., 2013). To “train” the program for this study, the “initial set” included more than 240 tweets. Next, we ran the monitors for the specified time period to collect all relevant English-language tweets using the keyword search string and to classify all tweets into the established categories. Tweets not fitting into any of the presented categories were eliminated from the final analyses.
Over the 36-month period chosen for analysis (October 1, 2010, to September 30, 2013), the software identified a total of 29,034,859 English-language tweets on Twitter that contained nuclear-related content (as defined by the keyword search string). Of these roughly 29 million tweets, however, 10,992,166 were identified as “off topic” based on the software training outlined. Hence, in the final analysis, 18,042,693 tweets were used to gauge sentiment toward nuclear energy.
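The corpus arithmetic reported above can be checked in a few lines of Python. The variable names are ours, introduced only for this check; the counts are the ones reported in the text.

```python
# Counts as reported in the text.
total_matched = 29_034_859  # tweets matching the keyword search string
off_topic = 10_992_166      # tweets the trained monitor labeled "off topic"

# Tweets retained for the sentiment analysis.
analyzed = total_matched - off_topic

# Share of keyword-matched tweets that were discarded as off topic.
off_topic_share = off_topic / total_matched

print(f"analyzed tweets: {analyzed:,}")
print(f"off-topic share: {off_topic_share:.1%}")
```

The difference reproduces the 18,042,693 tweets used in the final analysis, and it shows that a little over a third of the keyword-matched tweets were set aside as off topic.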
Results
To answer

Figure 1. A comparison of the volume of tweets on nuclear energy that contained an external link and those that did not, between October 2010 and December 2013.
Our data show that, for the first 2 days after the accident, the number of tweets about nuclear energy without external links (308,260) was about 1.4 times the number of tweets that contained links (223,260). From the third day on, however, this relationship reversed (see Figure 2).
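The comparison for the first 2 days can be reproduced from the two reported counts. The helper `link_shares` below is a hypothetical name introduced for this sketch; it simply converts the two counts into proportions of the combined total.

```python
def link_shares(with_link, without_link):
    """Return the proportions of tweets with and without an external link."""
    total = with_link + without_link
    return with_link / total, without_link / total


# Counts reported for the first 2 days after the accident.
with_share, without_share = link_shares(with_link=223_260, without_link=308_260)

# How many times larger the no-link group was than the link group.
ratio = 308_260 / 223_260

print(f"with link: {with_share:.0%}, without link: {without_share:.0%}")
print(f"without/with ratio: {ratio:.2f}")
```

Expressed this way, tweets without links made up roughly 58% of the nuclear energy tweets in those 2 days, compared with roughly 42% for tweets with links.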

Figure 2. A comparison of the (a) volume and (b) proportion of tweets on nuclear energy that contained an external link and those that did not, from March 5, 2011 to March 31, 2011.
During the 6 months after the nuclear accident, the proportion of tweets about nuclear energy linking to an external site was more than double that of tweets not linking to an external site. For the 6 months after the Fukushima Daiichi nuclear disaster, on average, tweets with an external link comprised 69% of the tweets about nuclear power, and tweets without a link comprised 31% of the tweets. However, as time passed, the proportion of tweets with an external link became almost equivalent to the proportion of tweets without a link (see Figure 3).

Figure 3. The proportion of tweets on nuclear energy that contained an external link and those that did not, from March 2011 to February 2012.
This study also examines how critical events may have triggered volume spikes in tweets with hyperlinks and tweets without hyperlinks. As shown in Figure 4, nuclear energy–related tweets increased around nuclear-related events (e.g., IAEA releases report that Iran has been developing nuclear weapon; Fukushima nuclear plant opens to journalists in November 2011; leaking steam seen at damaged Fukushima nuclear plant in July 2013). Although we did not test the causal effect of nuclear-related events on increased tweet volume, the gap between the proportion of tweets containing a link and the proportion of tweets without one seemed to widen when nuclear-related events were reported.

Figure 4. A comparison of the volume of tweets on nuclear energy that contained a link and those that did not, (left) from October to April 2012, (right) from May to September 2013.
The next research question (

Figure 5. Trend of the sentiment expressed in nuclear energy–related tweets without an external link compared with those with an external link between October 2010 and December 2013.
Discussion
In this study, by conducting a computer-aided, human-based content analysis, we empirically demonstrated that online media users tend to write more posts about a particular issue just after a pertinent major event occurs, though the tendency fades as time passes. In this case, up until the first quarter of 2012, nuclear energy–related tweets that linked to outside information (by inserting external links in tweets) far outnumbered those not containing an external link. This study also investigated publicly expressed opinion toward a controversial technology, nuclear energy. In terms of the sentiment expressed about nuclear energy in the world of Twitter, results indicate that pessimism about nuclear energy was the predominant tone in tweets about the controversial technology over the time period sampled.
Before we discuss our findings in more detail, we should acknowledge some limitations of this study. First, as we used a computer-aided analysis with predetermined categories (i.e., optimistic, pessimistic, and neutral), the tweets analyzed in our sample were limited to the categories of sentiment we selected. That is, as the content analysis was run only within the pre-designated categories, we may have missed more revealing insights that would have emerged had the sample not been limited to these categories. Moreover, because we coded sentiment based on the linguistic patterns of positive or negative words, we can draw only general descriptive conclusions about opinion toward the controversial issue. Also, we are unable to report calculated inter-coder reliability, which is usually provided in a content analysis. However, because all the tweets were classified into their respective categories based on the agreement of all the coders, and the software was trained to follow the rules of linguistic patterns for each category, this study assumes higher confidence in the reliability of the content-analysis process. (See Su et al., 2013, for additional discussion about automated nonparametric content analysis.)
There is also the concern of “off-topic” (i.e., irrelevant) content. At issue is whether all off-topic content was actually caught and whether relevant content was incorrectly deemed off-topic by the automated monitor. To overcome this potential drawback, during training, human coders tried to find about 30 unique examples of posts identified as “off-topic” within the keywords. In the process of training, after those examples of irrelevant tweets had been found, the remaining tweets repeated similar patterns over and over. Because we used a set of Boolean keyword systems in our analysis, we believe that this study has filtered most of the potentially irrelevant topics in the process of data collection.
Our current data do not identify the author of each tweet, although this ability would be necessary to answer underlying questions about the influence of traditional media and content sharing by opinion leaders, or at least those with a large reach within the network. Researchers have suggested that an increase in the number of tweets containing external links in social media posts indicates an increased response to news and other online sources (Szomszor et al., 2012). If we could present what proportion of links were to traditional media articles, it might provide empirical evidence that social media, in the wake of a crisis, mainly follow the agenda set by traditional media. However, this study analyzes only the information directly shown on Twitter, as it stands; the content of the link was not considered. Furthermore, we did not categorize pictures as tweets containing a link or no link because this study encompasses only text-based content analysis. Considering the importance of the link contents, future studies should examine the sources and content of the links included in tweets. Last, although our research questions are developed from the comparison of tweets with links and tweets without links, we acknowledge the blurred boundary between these two groups of Twitter users (i.e., authors of tweets with links vs. authors of tweets without links) because people play the role of information mediator, creator, or both when they write tweets with supportive links.
Despite these limitations, it is worth reporting our conclusions based on the enumeration-oriented sentiment analysis of nuclear energy–related tweets over the time period studied. First, this study prompts the question of why tweeting shifts from mostly original “broadcast” tweets before major events to sharing, such as linking, after major events. One answer could be that nuclear energy became a hot-button issue after the Fukushima disaster. There would have been little mass media coverage before the disaster and a lot afterward, so users had more opportunity to include external links about nuclear energy in their tweets. Users may also have shared links to gather more information about this controversial technology as they decided where they stood on the issue. Holton, Baek, Coddington, and Yaschur (2014) support this assumption: they found that people posted hyperlinks in their tweets expecting some sort of compensation, such as getting more similar information from others.
We believe that this study offers significant insights into the role social media play in the information-dissemination process. The volume of tweets about nuclear energy dramatically increased just after the Fukushima accident and gradually decreased with time. However, after quite some time had passed, the volume of tweets about nuclear energy increased again. (There were approximately 730,000 tweets in the third quarter of 2011 and approximately 2,100,000 tweets about nuclear energy in the second quarter of 2013.) Based on our analysis, we suggest that critical events play a role in volume spikes of tweets. Our data show that the volume of nuclear energy–related tweets with external links greatly increased in November 2011, when the IAEA released its report that Iran had been developing a nuclear weapon and the Fukushima nuclear plant was opened to journalists. We should acknowledge, however, that this study considered only the dates of several nuclear-related events and did not provide data on the volume of media coverage of these events. Hence, if a future study could present the correlation between the volume of mainstream media coverage and the volume of tweets about the issues, it would strengthen our statements.
Concerning the role of social media as a space for public discourse about societal issues, with this study we aim to understand how two types of tweets take different views toward nuclear energy. In line with previous studies (e.g., Alonso et al., 2010; Szomszor et al., 2012), we have separated tweets into those with external links and those without. After the Fukushima accident, the predominant sentiment expressed about nuclear energy was pessimism, across all the tweets over the entire time-sampled period. Our data show, however, that the proportions of optimistic and neutral opinions differed somewhat between these two types. For tweets containing external links, sentiments were primarily either pessimistic (52% of opinion) or optimistic (30%) toward nuclear energy. For tweets without an external link, optimistic opinions toward nuclear energy made up only 8% of all opinions, while the rest were pessimistic (51% of sentiment) and neutral (41%). Given these findings, it seems that when people express a more slanted opinion (either positive or negative) on the issue, they tend to include an external link in their tweets. Such a finding is unsurprising. Previous studies have found that polarized Twitter users are more likely to link supportive information sources in their posts (e.g., Pew Research Center, 2014). Due to its 140-character limit, Twitter does not allow for in-depth discussions in a single post. Many Twitter users thus link to external sources as a way of giving depth to their interest in specific news and issues (De Maeyer, 2012; Hsu & Park, 2011) and as a way to expand the dialogue about those issues (Gruzd & Roy, 2014).
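The contrast between the two tweet types can be summarized by adding the optimistic and pessimistic percentages into a single “slanted” share. The short Python sketch below does this with the percentages reported above. The neutral share for tweets with links is not reported in the text; the sketch derives it under the assumption that the three categories sum to 100%, and the `slanted_share` helper is a hypothetical name introduced for illustration.

```python
# Percentages reported in the text for tweets with an external link.
with_link = {"pessimistic": 52, "optimistic": 30}
# ASSUMPTION: the three categories sum to 100%, so neutral is the remainder.
with_link["neutral"] = 100 - sum(with_link.values())

# Percentages reported in the text for tweets without an external link.
without_link = {"pessimistic": 51, "optimistic": 8, "neutral": 41}


def slanted_share(dist):
    """Percentage of tweets expressing a positive or negative opinion."""
    return dist["pessimistic"] + dist["optimistic"]


print("with link:", slanted_share(with_link), "% slanted")
print("without link:", slanted_share(without_link), "% slanted")
```

Under that assumption, about 82% of tweets with links expressed a slanted opinion, compared with 59% of tweets without links, while the pessimistic shares alone (52% vs. 51%) are nearly identical. The difference between the two groups therefore comes almost entirely from optimistic versus neutral tweets.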
In addition to the theoretical implications of this study, we should highlight its methodological contribution regarding computer-aided content analysis. In human-coded content analysis, researchers usually report findings from a representative sample of documents because of practical limitations, including human coders’ time and resources. A complete enumeration-oriented, computer-aided content analysis can add to both validity and reliability, and it allowed this study to extend the range of content analyzed for the issue. Specifically, a systematically programmed computer-aided content-analytic technique allows us to conduct analyses on the complete posts of all available documents pulled. This can be done using the keywords provided by the researcher on a given online platform (Twitter for this study) without needing to be concerned about a sampling margin of error. In addition, it enables researchers to analyze online content in real time after applying their coding rules to the particular coding query (Hopkins & King, 2010).
Footnotes
Authors’ Note
Any opinions, findings, and conclusions or recommendations expressed in this article are those of the authors and do not necessarily reflect the views of the National Science Foundation.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This material is supported by grants from the National Science Foundation to the Center for Nanotechnology in Society at Arizona State University (Grant No. SES-0937591) and the University of Wisconsin–Madison Nanoscale Science and Engineering Center in Templated Synthesis and Assembly at the Nanoscale (Grant No. SES-DMR-0832760).
