Abstract
We investigated the online diffusion of information, the spread of fear, and the perceived risk of infection with Middle East Respiratory Syndrome (MERS) as cases spread rapidly and dozens of fatalities occurred in South Korea in May–June of 2015. This study retrieved 8,671,695 MERS-related online documents posted from May 20 to June 18, 2015, on 171 Korean online channels and analyzed them by using multilevel models and association analysis data mining with the Apriori algorithm. We used R software (version 3.2.1) for the association analysis data mining and visualization. Buzz with negative emotions (i.e., anxiety or fear) was more prevalent in online discussion boards, Twitter, and online cafes than in news sites and blogs. News buzz (b = 0.21, p < 0.001), but not rumor buzz (b = 0.06, p = 0.308), was associated with positive MERS emotions (i.e., being calm or composed). The mention of eating immunity-boosting food in the news was associated with a 94 percent chance of a positive MERS emotion, which is 4.75 times the overall chance of a positive emotion across all documents (support of 0.001, confidence of 0.94, and lift of 4.75). The same precautionary messages elicited opposite emotional reactions depending on the channel through which they were communicated. In the face of a novel and highly contagious disease such as MERS, the government must deploy a response system that provides and disseminates reliable information and inhibits the online diffusion of false information.
Introduction
Although the majority of MERS cases were reported in April 2012 in Saudi Arabia, the disease spread to other parts of the world. 5 From April 2012 to July 2015, there were 1,392 reported MERS cases and 538 deaths worldwide. 6 The first confirmed case of MERS in South Korea was reported on May 20, 2015. By July 4, 2015, there had been a total of 186 confirmed cases and 36 deaths. Fears among Koreans spread rapidly after cases of secondary infections were reported, and the number of cases jumped within a short period of time. 7
Mao 8 proposed an agent-based model that explains the diffusion of, and the interactions among, (a) infectious diseases, (b) information regarding the disease, and (c) human preventive behaviors against the disease. Mao's model 8 postulates that the three processes diffuse simultaneously through social networks. This implies that information diffusion 9 can accelerate depending on how quickly informed individuals adopt preventive behaviors as well as on the diffusion of the disease itself. In addition, if a society is interwoven with highly accessible and mobile social network services (SNS), such information diffusion will be even swifter. According to the Social Amplification Theory, 10 people tend to spread perceived risk in the form of messages via news media or their informal networks, and such messages undergo transformation and are either amplified or weakened by various factors at each point of risk perception. 11 Once such perceived risk is officially acknowledged by others or by authorities, ripple effects can kick in and the information can spread widely, often with distortion or exaggeration, 12 which, in turn, can fuel fears. In such a situation, the typical “fight or flight” 13 response may not adequately describe a fear reaction unless one believes that he or she is immune to the infection or that the infected area is quite distant and localized. When an infectious disease such as MERS is perceived as fatal and extremely contagious, and there is virtually no means of complete escape from exposure to the disease, people may assess the risk as substantial, feel fear of infection, and exhibit various defense mechanisms. 14 On the Internet, emotions on MERS can be expressed or new information can be created and diffused as a defense mechanism. 14,15 Such information, once created, spreads stigmatization in wave-like effects. The more provocative the rumor, the better it is able to elicit user responses and spread fear and mistrust of the authorities, amplifying the losses to the individual and society as a whole. 16
To date, there have been no studies on the diffusion of information on infectious diseases or perceived risk of infection via social media. In this study, we collected buzz (individual documents) that mentioned MERS from Korean online news sites, blogs, online cafés (online communities), Twitter, and discussion boards to analyze the diffusion of information, spread of fear, and perceived infection risks as expressed online. Our goal was to derive lessons and implications that could help guide a response by the government or public authorities to a future crisis (such as the outbreak of an infectious disease), a response that takes into account the online dimension of our current society. The hypotheses of this study were as follows: (a) Tweeted buzz as well as online news and rumors on MERS affect MERS-related emotions; (b) tweeted buzz is more associated with negative than positive MERS emotions; and (c) interactions exist between hour-level tweeted preventive measures and day-level MERS news and rumor buzz in online channels.
Materials and Methods
Research design
This study uses social big data to analyze the factors behind information diffusion regarding MERS and perceived risks of infection. We used multilevel models and a data mining method (i.e., association analysis) to build a predictive model for MERS information diffusion and perceived infection risk in South Korea. The dependent variable is a binary MERS-related emotion over time: anxiety or fear (hereafter, negative emotion) or being calm or composed (hereafter, positive emotion). The Level-1 characteristics are hour-specific buzz on preventive measures (i.e., eating immunity-boosting food, complying with recommended precautionary measures, refraining from going outside, washing hands frequently with soap, and wearing protective masks) as mentioned in tweets. A binary variable was created for each of these Level-1 characteristics with a code of “1” in the presence of mention of the preventive measure and a code of “0” in the absence of such mention. These hour-level tweets were conceptualized as influenced by and nested under day-level news and rumors. Hence, the Level-2 characteristics are day-level MERS news buzz and rumor buzz as found in all online channels. Again, a binary variable was created for each of these Level-2 characteristics with a code of “1” in the presence of buzz and a code of “0” in the absence of such buzz.
In outbreaks of highly infectious diseases, the government typically controls resource mobilization, provision of treatment, isolation of patients, surveillance of infections, financial support, and the like; the general public can do little except seek out information about the infectious disease and about how to protect themselves from being infected. Because the dependent variable is a binary MERS-related emotion of the general public as expressed in tweets, mentions of preventive measures and news and rumor buzz served as the independent variables of this study.
The data utilized in this study encompass text-based web documents (buzz) collected from 171 Korean online channels (149 online news sites, 15 discussion boards, 5 channels including Twitter and blogs, and 2 communities). The search keywords included MERS and its synonyms such as MERS virus, Middle East Respiratory Syndrome, and MERS coronavirus. Stop-words included “Mercedes Benz,” because the first three characters of “Mercedes” are spelled exactly the same as MERS in Korean. The covered time period was 30 days from May 20, 2015 (the day the first case of MERS was reported) to June 18, 2015. A leading Korean telecommunications company was contracted to conduct data collection and treatment. The telecommunications company employed a crawler to collect documents from the selected channels (news sites, blogs, online communities, discussion boards, and tweets); categorized and cleaned unstructured data through text mining; and converted the categorized unstructured data into binary representation (with a code of “1” in the presence of a relevant keyword with positive emotion and a code of “0” for negative emotion) in preparation for statistical analysis. The “21st Century Sejong Project” corpus developed by the National Institute of Korean Language was used as the lexicon for text mining. Hourly buzz was collected from all channels. The total number of retrieved online documents related to MERS during the 30-day period was 8,671,695, which were broken down into 78,884 blogs (0.9 percent), 187,641 online community texts (2.2 percent), 7,672,083 Twitter texts (88.5 percent), 451,615 discussion board texts (5.2 percent), and 281,472 news site texts (3.2 percent).
Data analysis
The frequency and percentage of the binary dependent variable (positive or negative MERS-related emotion) and independent variables are summarized in Table 1. Normality test results for the news buzz (mean: 7,780, SD: 10,721, skewness: 1.21, kurtosis: −0.25) were in support of normality, whereas those for the rumor buzz (mean: 4,415, SD: 6,284, skewness: 2.79, kurtosis: 9.19) indicated violation of the normality assumption. 17 Subsequently, the normality test was re-run after a square-root transformation and a logarithmic transformation were applied. Because the latter transformation yielded results in better agreement with the normality assumption, the log-transformed variable was used in the analysis.
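The transformation comparison described above can be illustrated with a minimal sketch. It assumes moment-based skewness and excess kurtosis; the counts are synthetic values invented for illustration (they are not the study's rumor buzz data), and the function names are hypothetical.

```python
import math

def skewness(xs):
    """Moment-based sample skewness: m3 / m2^(3/2)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

def excess_kurtosis(xs):
    """Moment-based excess kurtosis: m4 / m2^2 - 3 (0 for a normal distribution)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / m2 ** 2 - 3.0

# Synthetic right-skewed daily counts (illustrative only, not study data).
counts = [120, 150, 180, 200, 260, 300, 420, 650, 1200, 4800, 15000, 32000]

raw_skew = skewness(counts)
sqrt_skew = skewness([math.sqrt(c) for c in counts])
log_skew = skewness([math.log(c) for c in counts])

for label, value in [("raw", raw_skew), ("sqrt", sqrt_skew), ("log", log_skew)]:
    print(f"{label:>4}: skewness = {value:.2f}")
```

For a strongly right-skewed series such as this one, the logarithm pulls in the long right tail more than the square root does, which is the usual reason a log transformation is preferred for count-like buzz volumes.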
Table 1 notes: Includes multiple responses. MERS, Middle East Respiratory Syndrome.
Rumor items listed in Table 1:
A general hospital has been shut down due to MERS cases.
Eating chicken helps prevent MERS.
Taking fever medicine will help cure MERS.
One MERS patient (Patient No. 35) has infected 1,500 people.
Application of Vaseline under the nose helps prevent MERS.
Fake lists of hospitals where MERS was diagnosed and fake lists of confirmed MERS patients and deaths.
Tamiflu cures MERS.
To construct an efficient predictive model regarding MERS information diffusion and perceived infection risk, we used the association analysis data mining method, 18 which requires no particular statistical assumptions. The Apriori algorithm proposed by Agrawal and Srikant 18 was used for the association analysis. The Apriori algorithm discovers associations between two or more words included in an online document or transaction. Association rules are evaluated by “support” (the proportion of documents that contain all the words in a rule, used to remove rules that appear infrequently) and “confidence” (the proportion of documents containing word X that also contain word Y, used to gauge the strength of the association between words). 19 In the context of the directional association rule (X→Y), “lift” is the ratio of the confidence of the rule to the overall proportion of documents containing Y; it indicates how much more often Y appears when X is present than would be expected if X and Y were independent. Association analysis involves the generation of frequent item sets that satisfy a minimal support criterion as defined by the researcher. Of the rules derived from these item sets, those that satisfy a minimal confidence criterion and a lift of at least 1 are selected. 20 For this study, the minimum criteria we set are support >0.001 and confidence >0.1. 21
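As a minimal sketch of the three rule metrics, the following computes support, confidence, and lift for a single rule X→Y, taking lift in its standard form (confidence divided by the overall proportion of documents containing Y). It is not the Apriori item set search itself and not the R workflow used in this study; the function name `rule_metrics` and the toy documents are hypothetical.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    transactions: list of sets of items (one set per document).
    """
    x = frozenset(antecedent)
    y = frozenset(consequent)
    n = len(transactions)
    n_x = sum(1 for t in transactions if x <= t)         # documents containing X
    n_y = sum(1 for t in transactions if y <= t)         # documents containing Y
    n_xy = sum(1 for t in transactions if (x | y) <= t)  # documents containing X and Y
    if n_x == 0 or n_y == 0:
        raise ValueError("antecedent and consequent must each occur at least once")
    support = n_xy / n
    confidence = n_xy / n_x
    lift = confidence / (n_y / n)
    return support, confidence, lift

# Toy documents (hypothetical, for illustration only).
docs = [
    {"news", "EIBF", "positive"},
    {"news", "EIBF", "positive"},
    {"news", "positive"},
    {"twitter", "WPM", "negative"},
    {"twitter", "WPM", "negative"},
    {"twitter", "negative"},
    {"twitter", "negative"},
    {"twitter", "EIBF", "negative"},
    {"news", "negative"},
    {"twitter", "WPM", "positive"},
]

support, confidence, lift = rule_metrics(docs, {"news", "EIBF"}, {"positive"})
print(support, confidence, lift)
```

In the toy data, 2 of 10 documents contain all three items, both documents with the antecedent are positive, and 4 of 10 documents are positive overall, so the rule has a lift above 1.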
We performed a multilevel analysis to estimate the effect of day-level factors (news and rumors) mentioned in all channels on the MERS emotions as expressed in online documents. That is, the daily number of mentions in news and rumors served as the Level-2 unit of analysis and the hourly number of specific buzz on preventive measures (i.e., eating immunity-boosting food, complying with recommended precautionary measures, refraining from going outside, washing hands frequently with soap, and wearing protective masks) served as the Level-1 unit of analysis. Both Level-1 and Level-2 variables were uncentered. Parameter estimation was performed through restricted maximum likelihood, which controls for the loss of degrees of freedom in the fixed effect when calculating the estimated variance of the random effect. 20 Robust standard errors were applied during the final estimation of the fixed effect. We used HLM 7.0 software for the multilevel analysis and R software (version 3.2.1) for the association analysis data mining and visualization.
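The two-level structure described above can be written out as a random-intercept logistic model. This is a sketch in our own notation, which the analysis software may label differently: $Y_{ij}$ is the binary emotion (1 = positive) for hour $i$ within day $j$, $X_{kij}$ are the five preventive-measure indicators, and the symbols $\gamma$, $u_{0j}$, and $\tau_{00}$ are assumed names for the fixed effects, the day-level random effect, and its variance.

```latex
\begin{aligned}
\text{Level 1 (hour } i \text{ within day } j\text{):}\quad
  & \operatorname{logit} P(Y_{ij} = 1) = \beta_{0j} + \sum_{k=1}^{5} \beta_{k} X_{kij} \\
\text{Level 2 (day } j\text{):}\quad
  & \beta_{0j} = \gamma_{00} + \gamma_{01}\,\mathrm{News}_{j} + \gamma_{02}\,\mathrm{Rumor}_{j} + u_{0j},
  \qquad u_{0j} \sim N(0, \tau_{00})
\end{aligned}
```

Under this notation, a null model keeps only $\gamma_{00}$ and $u_{0j}$, and cross-level interactions would be obtained by letting a slope $\beta_{k}$ depend on $\mathrm{News}_{j}$.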
Results
As summarized in Table 1, of the 8,671,695 online documents collected from the 171 channels, the most commonly mentioned preventive measures were “complying with recommended precautionary measures” (36 percent) and “wearing protective masks” (36 percent). The most commonly mentioned MERS-related rumors included fake lists of hospitals where MERS was diagnosed and fake lists of confirmed MERS patients and deaths (73 percent), a false rumor that a general hospital had been shut down due to MERS cases (10 percent), and a rumor that application of Vaseline under the nose helps prevent MERS (9 percent). As summarized in Table 2, buzz with negative emotions was more prevalent in online discussion boards, Twitter, and online cafes (online communities), whereas buzz with positive emotions was more prevalent in news sites and blogs.
Positive emotion refers to being calm or composed, and negative emotion refers to anxiety or fear.
As shown in Appendix Table A1, all the preventive measures (i.e., eating immunity-boosting food, complying with recommended precautionary measures, refraining from going outside, washing hands frequently with soap, and wearing protective masks) positively influenced MERS emotions, so that more frequent mentions in a post were associated with less negative emotions such as anxiety or fear. Of the channel-related factors, only Twitter and discussion boards were found to negatively affect emotions. Online documents diffused through Twitter amplified negative emotions (anxiety or fear), whereas those diffused through news sites boosted positive emotions.
As summarized in Table 3, the null model (Model 1) where no explanatory variables were entered produced an intraclass correlation coefficient (ICC) of 0.084 [0.3/(0.3 + 3.29)], which means that the daily variance of MERS-related mentions accounted for ∼8.4 percent of the total hourly variance of MERS-related emotions. An ICC value in excess of 0.05 typically justifies the use of a multilevel model. 22 The Level-1 predictors model (Model 2) was used to determine whether hour-level mentions of preventive measures had an influence on MERS-related emotions while allowing day-level variance (i.e., random effect). In Model 3, day-level mentions of MERS in news and rumors were added to Model 2. News buzz (b = 0.21, p < 0.001), but not rumor buzz (b = 0.06, p = 0.308), was associated with positive MERS emotions. In Model 4, interactions between significant Level-2 (i.e., news buzz) and Level-1 predictors were examined. None of the interactions were significant, signifying mutual independence between tweeted preventive measures and MERS-related information mentioned in news sites.
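The ICC arithmetic above can be reproduced with a short sketch. It assumes that the 3.29 in the denominator is the variance of the standard logistic distribution, π²/3, which is the usual Level-1 residual variance for a multilevel logistic model; the function name `latent_icc` is hypothetical.

```python
import math

def latent_icc(tau00):
    """ICC for a two-level logistic model on the latent-variable scale.

    tau00: Level-2 (day-level) intercept variance.
    The Level-1 variance is fixed at pi^2 / 3 (about 3.29), an assumption
    matching the 3.29 reported in the text.
    """
    level1_variance = math.pi ** 2 / 3
    return tau00 / (tau00 + level1_variance)

icc = latent_icc(0.3)
print(f"ICC = {icc:.3f}")
```

With a day-level variance of 0.3, this gives roughly 8.4 percent of the total variance at the day level, in line with the value reported for the null model.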
***p < 0.001, **p < 0.01, *p < 0.05.
Coef., logit coefficient; ICC, intraclass correlation coefficient; OR, odds ratio.
As summarized in Table 4, the association rule with the highest confidence linked the mention of eating immunity-boosting food in the news to positive emotion. This association showed a support of 0.001, confidence of 0.94, and lift of 4.75, indicating that the mention of eating immunity-boosting food in the news was associated with a 94 percent chance of a positive MERS emotion and that this chance was 4.75 times the overall chance of a positive emotion across all documents. The mention of complying with recommended precautionary measures and wearing protective masks was associated with an 87 percent chance of a negative MERS emotion, which was 1.08 times the overall chance of a negative emotion. This was corroborated by a parallel coordinates plot of association rules as shown in Appendix Figure A1. The mentions of “complying with recommended precautionary measures” and “wearing protective masks” were linked to negative emotion on the right-hand side when going through “Twitter.” By contrast, the same mentions were linked to positive emotion when going through “News.”
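As a rough consistency check on the rule metrics reported above, and taking lift in its standard form (confidence divided by the overall frequency of the consequent), dividing confidence by lift recovers the implied overall share of each emotion. The function name `implied_baseline` is hypothetical; the inputs are the rounded values reported in the text, so the outputs are approximate.

```python
def implied_baseline(confidence, lift):
    """Overall frequency of the consequent implied by lift = confidence / baseline."""
    return confidence / lift

# Rule: eating immunity-boosting food in the news -> positive emotion
positive_share = implied_baseline(confidence=0.94, lift=4.75)

# Rule: precautionary measures and protective masks -> negative emotion
negative_share = implied_baseline(confidence=0.87, lift=1.08)

print(f"implied share of positive emotion: {positive_share:.3f}")
print(f"implied share of negative emotion: {negative_share:.3f}")
```

The two implied shares come to about 20 percent positive and about 80 percent negative, which agrees with the 80 percent of posts carrying negative emotion noted in the Discussion.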
Although 54 rules with confidence in excess of 0.01 were identified, only 29 are reported here due to space constraints.
CRPM, complying with recommended precautionary measures; EIBF, eating immunity-boosting food; WHSF, washing hands with soap frequently; WPM, wearing protective masks.
Discussion
Fear or panic is common in victims of both terrorism and highly infectious diseases. Although such fear is what terrorists intend to generate, it typically arises without intention in the case of highly infectious diseases. In terrorism, people fear being injured or killed, because they could be attacked without warning and without reason. Likewise, in an outbreak, people fear falling ill or dying, because they could be infected with a highly contagious disease with a high mortality rate. However, an outbreak of a highly infectious disease usually affects a much larger population than a terrorist attack. Thus, in addition to responding to the potential mortalities and other losses that such an infectious disease incurs, a great deal of attention and effort should be devoted to countering and reducing the fear and panic that people encounter due to the outbreak. In particular, given the ubiquitous use of the Internet and the exploding capacity of SNS to diffuse information in real time, such fear could spread to millions of people in a few minutes along with the information on the outbreak. Because of people's keen concern, demand for information about the outbreak tends to escalate and, as a result, so does the supply of false information and rumors. This study contributes to the literature as one of the first attempts to investigate the diffusion of information on a highly infectious disease (i.e., MERS), spread of fear, and perceived risk of infection via social media by analyzing 8,671,695 online documents mentioning MERS during the 1 month after the outbreak of MERS in South Korea in May 2015.
Of the examined online documents, negative emotion accounted for 80 percent of all posts throughout all channels. Negative emotion was the most prevalent in Twitter, whereas posts with positive emotion were the most prevalent in news sites. This suggests that the diffusion of unsubstantiated information on diseases might be more prevalent in Twitter than in other social media. Such misinformation diffusion through SNS channels may bring about confusion and social chaos and endanger the government's effective response to the disease outbreak, which can exacerbate fears in the face of an outbreak of a highly contagious disease such as MERS or Ebola. 11 Meanwhile, channels such as online news sites featuring substantiated MERS-related documents prepared by experts were found to have the potential to create a positive environment that is conducive to the government's effective response to the disease outbreak. Therefore, it is imperative for each nation to develop and put into place a system in which correct information on the disease at hand and appropriate precautionary measures that people should take are developed and diffused rapidly through SNS in case of an outbreak of a highly infectious disease. In addition, there should be a system in place in which misinformation on the infectious disease is constantly monitored and countered in a swift and authoritative manner.
The results of the present study partially supported the hypothesis that buzz on both preventive measures as mentioned in tweets and MERS news and rumors in online channels influenced MERS emotions. More frequent online mentions of preventive measures were found to reduce negative emotions such as anxiety or fear, indicating that the diffusion of information on how to protect oneself from contracting an infectious disease helps relieve anxiety or fear rather than elevate such emotions. This result contrasts with previous findings in the literature. During the 2003 severe acute respiratory syndrome (SARS) epidemic, although repeated messages about how low the likelihood of contracting SARS was inundated the airwaves, people's anxiety or fear did not subside. Rather, people substantially overestimated their risk of contracting SARS. 23 In their study, Young and Oppenheimer confirmed 24 that people overestimate their risk of contracting such an infectious disease when they are presented with semantic descriptors such as low or infinitesimal risk for infection. People react to such descriptors with alarm rather than with relief. 24 However, the finding of the present study indicates that when the public are presented with concrete measures on how to protect themselves from contracting an infectious disease rather than authoritative statements that the risk of infection is very low, their perceived infection risk may decrease. Interestingly, as for a possible influence of MERS news and rumors on MERS emotions, only news buzz, not rumor buzz, significantly influenced emotions, and its influence was positive. This may be interpreted in light of the trustworthiness of the diffused message. Compared with news buzz, far more rumor buzz was false. The fact that the Korean public were not significantly influenced by rumors suggests that many people questioned the trustworthiness of the rumors. By contrast, people collectively trusted information diffused by online news, and such information helped relieve their anxiety and fear. Taken together, this emphasizes the importance of developing and disseminating concrete precautionary measures that people should take and providing substantiated information on the disease via news channels rather than issuing authoritative statements with semantic descriptors.
The result in the parallel coordinates plot of association rules deserves mention. The mentions of “complying with recommended precautionary measures” and “wearing protective masks” in Twitter were linked to negative MERS emotion, whereas the same mentions were linked to positive MERS emotion when they appeared in news sites. The same messages thus elicited opposite emotional reactions depending on the channel through which they were communicated. This appears to be related to the perceived trustworthiness of the channel. It suggests that news channels might be a better option than Twitter or other SNS channels for disseminating correct information on the infectious disease at hand and appropriate precautionary measures that people should take. At the same time, among SNS channels, Twitter would be the better choice as the main online outlet for monitoring and countering misinformation on the infectious disease.
Human Participant Protection
The Institutional Review Board at the authors' institution approved the study protocol.
Footnotes
Author Disclosure Statement
No competing financial interests exist.
Appendix
| Variable | | b | S.E. | OR | 95% CI |
|---|---|---|---|---|---|
| Preventive measures | Eating immunity-boosting food | 1.025 *** | 0.009 | 2.79 | 2.74–2.84 |
| | Complying with recommended precautionary measures | 0.774 *** | 0.005 | 2.17 | 2.15–2.19 |
| | Refraining from going outside | 2.308 *** | 0.024 | 10.05 | 9.59–10.54 |
| | Washing hands with soap frequently | 1.117 *** | 0.006 | 3.06 | 3.02–3.09 |
| | Wearing protective masks | 0.094 *** | 0.005 | 1.10 | 1.09–1.11 |
| Types of online channel | Blogs | 1.537 *** | 0.008 | 4.65 | 4.58–4.72 |
| | Online cafe | 0.321 *** | 0.006 | 1.38 | 1.36–1.39 |
| | Twitter | −0.874 *** | 0.002 | 0.42 | 0.42–0.42 |
| | Online board | −0.267 *** | 0.004 | 0.77 | 0.76–0.77 |
| | News | 2.135 *** | 0.004 | 8.45 | 8.38–8.53 |
The reference category for the binary outcome variable is negative emotion. The b coefficients are standardized logit coefficients.
***p < 0.001.
CI, confidence interval; OR, odds ratio.
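The odds ratios and confidence intervals in the appendix table follow from the logit coefficients. As a minimal sketch, assuming a Wald-type 95 percent interval (b ± 1.96 × S.E., exponentiated), the row for “refraining from going outside” can be reproduced as follows; the function name `odds_ratio_ci` is hypothetical, and the interval method is our assumption rather than a stated detail of the analysis.

```python
import math

def odds_ratio_ci(b, se, z=1.96):
    """Return (OR, lower, upper) from a logit coefficient and its standard error.

    Assumes a Wald-type interval: exp(b - z*se) to exp(b + z*se).
    """
    return math.exp(b), math.exp(b - z * se), math.exp(b + z * se)

# Coefficient and standard error for "refraining from going outside".
odds_ratio, lower, upper = odds_ratio_ci(b=2.308, se=0.024)
print(f"OR = {odds_ratio:.2f}, 95% CI = {lower:.2f} to {upper:.2f}")
```

Because the interval is symmetric on the log-odds scale, the odds ratio always lies between the two bounds, which gives a quick way to spot transcription errors in a reported interval.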
