Abstract
With the rise of online journalism, embedded multimedia stories have become more popular. Yet, little is known about the cognitive and affective effects this journalistic format may have on the audience. This experimental study compares the effects of embedded multimedia, traditional multimedia, and text-only formats on readers’ knowledge gain, emotional reactions, and narrative transportation. Overall, the effects are substantially less pronounced than expected. The audience’s emotional reactions and narrative transportation do not depend on modality, whereas knowledge gain is slightly decreased by multimodality. The theoretical, practical, and methodological implications of these limited effects are discussed.
The digital shift has given rise to new opportunities in journalism practice, making it possible to combine a variety of media modalities in journalistic products, such as, but not limited to, text, audio, video, photographs, and other visuals (Jacobson, 2012). The term multimedia journalism aptly describes this development (Harper, 2005). A groundbreaking publication, which combined different modalities into a single news story, was Snowfall: The Avalanche at Tunnel Creek, published on the front page of the NYTimes.com in 2012 (www.nytimes.com/projects/2012/snow-fall/#/?part=tunnel-creek). The publication embedded various media elements in the textual story line, thus creating a seamless and compelling multimedia reading experience. The story’s success, both in viewership and awards, has encouraged other organizations to create similar multimedia publications. Snowfall was lauded as having redefined journalistic practices to the extent that “to snowfall” a story became a verb for creating an embedded multimedia journalistic piece (Pompeo, 2013; Rue, 2013).
The use of embedded multimedia journalism has increased during the past few years, and projects that incorporate multimedia have become an important means for newspapers to differentiate themselves in an increasingly competitive media landscape (Steensen, 2010). This growth is astonishing given that multimedia productions are demanding in terms of time and resources, and that this form of online journalism does not fit well with the immediacy and timeliness that dominate online news production (Steensen, 2010).
Opinions vary greatly as to whether this format is merely a way for newspapers to hold their readers’ attention through sheer bombast, or an innovation that enriches online reporting (Dowling & Vogan, 2014; Malik, 2013; Rieder, 2013; Rue, 2013; Sonderman, 2012; Washeck, 2013). The increasing popularity of embedded multimedia journalism and the resources that media organizations feed into its creation are, in part, encouraged by the presumption that this new format is, in some important ways, a better way to utilize multimedia in online reporting (Fisher, 2015). The stylistic and presentational features are quite different from more traditional types of online news, as we outline below. Little, however, is known about how the audience perceives embedded multimedia stories or about their possible effects on the audience.
This gap is addressed by investigating whether the effects exerted by embedded multimedia journalism differ from those of print-only and traditional multimedia reporting. Does embedded multimedia journalism affect people cognitively or emotionally in different ways than the more traditional journalistic formats do? This question is important for both theory and practice. Embedded multimedia reporting is still in its infancy, and, as it emerges on the media landscape, it is worthwhile to analyze its effects systematically, thereby contributing to theoretical advancements regarding how and why this format differs from more traditional formats. Also, the potential effectiveness of embedded multimedia reporting may influence journalists and media organizations to put more or less effort and resources into the creation of such stories. This study aims to advance media effects research and to bridge communication, media psychology, and journalism theory by systematically studying cognitive and affective effects of embedded multimedia journalism.
Multimedia Journalism
Increasingly, media and journalism studies focus on multimedia and online journalism (e.g., Deuze, 2004; Jacobson, 2010, 2012; Kiuttu, 2013; Larrondo Ureta, 2011; Steensen, 2009, 2010). Work in this area mostly examines how news is produced and presented by describing the multimedia and online journalism landscape or by analyzing the content of websites (Steensen, 2010). In this work, multimedia journalism is defined as a “presentation of a news story package on a website using two or more media formats, such as (but not limited to) spoken and written word, music, moving and still images, graphic animations, including interactive and hyper-textual elements” (Deuze, 2004, p. 140). Although both online journalism and multimedia journalism share existence on the web, online journalism is not driven by multimodality per se (Deuze, 2004). Because an online news story with text and a photo is generally not considered to be multimedia (Steensen, 2010), in this study, multimedia journalism refers to stories in which more than two media modes are utilized, such as text, images, and video.
Multimedia journalism can take at least two forms. First, traditional multimedia journalism refers to stories that utilize the Christmas tree format, where “multimedia elements like videos, photo slideshows, maps and graphics are add-ons, placed to the side of the main text story like ornaments hung on a tree” (Grabowicz, Hernandez, & Rue, 2014, para. 1). In this case, multimedia elements are used as extensions of the written word, not as primary storytelling techniques (e.g., Deuze, 2004; Jacobson, 2010; Kiuttu, 2013; Steensen, 2009). Second, multimedia journalism can take the form of embedded multimedia journalism. In this case, the main story is usually text-based and is told in a linear fashion. However, compared with the traditional multimedia layout, the multimedia elements in an embedded layout are “integrated into the main story so they’re viewed at appropriate points in the narrative” (Grabowicz et al., 2014, para. 1). The layout emphasizes narrative flow, resulting in “a more seamless transition between text and video or graphics and back to text, with the multimedia a part of the narrative, rather than separated out” (Grabowicz et al., 2014, para. 3).
In this study, these two multimedia formats are juxtaposed against a story format in which text is the only modality used. The latter format is not prevalent in online newspapers today, as typically, articles include photos, graphics, and/or links; however, text-only long-form journalism is a (re)emerging form on online platforms such as longform.org and byliner.com and thus constitutes part of a trend in online journalism that is much in contrast to the developments in multimedia journalism.
Many journalists and media organizations, such as the New York Times, the Guardian, NPR Online, or Spiegel Online, have invested substantial time and financial resources in developing embedded multimedia stories, investment that is propelled by the belief that audiences crave higher value journalism and storytelling, but perhaps also by the presumed effects that the embedded multimedia stories have on the understanding, emotions, or engagement of the users (Fisher, 2015; Greenfield, 2012; Hernandez & Rue, 2015; Jula, 2014). However, as aforementioned, studies on embedded multimedia journalism are relatively scant. Even more limited is the evidence on its potential effects. Theoretically, the diverse presentational features in text-only stories, traditional multimedia, and embedded multimedia should affect the audience differently.
Research that compares print and visual media can inform our expectations of the different effects exerted by these formats, and we outline the various theoretical approaches and empirical findings below. It must be noted that these findings may not fully apply to embedded multimedia. First, unlike prior studies that focus on the presentation of multiple modalities in a simultaneous fashion (e.g., Wojcieszak, 2009), research on embedded multimedia journalism needs to account for the fact that the audience attends to the modalities separately, such as when a user reads a text and later watches an accompanying video. Second, past research often addresses competition between media modalities (e.g., with regard to learning effects, see Sundar, 2000). In embedded multimedia journalism, in contrast, these features are complementary and blended together seamlessly. Unlike text-only or video-only formats, multimedia journalism features various modalities in a single story and offers users a unique ability to switch back and forth between these modalities.
Nevertheless, existing communication research allows us to advance expectations concerning the effects exerted by the three different formats. Following several studies (LaMarre & Landreville, 2009; Sundar, 2000), we focus on cognitive (i.e., knowledge gain) and emotional (i.e., affective reactions to content) effects of text-only, traditional multimedia, and embedded multimedia journalism. Furthermore, given its special applicability to embedded multimedia journalism, we also attend to narrative transportation (Green & Brock, 2000), both as directly resulting from exposure to the different media formats and mediating the tested effects.
Recall and Knowledge Gain
Knowledge gain is among the most extensively studied media effects, given its socio-political importance (e.g., Delli Carpini, 2004). Research on specific media (e.g., radio vs. television vs. newspapers; for example, Eveland & Scheufele, 2000; Kwak, 1999) or genres (e.g., hard news vs. late-night comedy vs. documentary vs. fiction film; see Baek & Wojcieszak, 2009; Baum, 2003; Moy, Xenos, & Hussain, 2013; Prior, 2005) typically examines which of these enhance citizens’ political understanding. Such studies show, by and large, that although newspaper readership is more strongly related to political knowledge than television viewership, specific content or programs on television also promote knowledge acquisition (Graber, 1996). Here, we focus on how different types of journalistic formats contribute to audiences’ knowledge.
One would expect multimodality, and in particular embedded multimodality, to increase users’ knowledge. Memory for visuals is better than memory for words, otherwise known as the pictorial superiority effect (Levie, 1987; Moore, Burton, & Myers, 2004), and the human brain absorbs larger amounts of information when the messages are visual or audiovisual, compared with solely being read (Graber, 1996). This is because images help us decode text and attract our attention to information (Parkinson, 2012). Similarly, realism can enhance learning. When audiences “perceive the content of a stimulus as factual it leads them to process the information more deeply, which in turn leads to better memory and more extensive learning of that content” (Pouliot & Cowen, 2007, p. 244). Arguably, a presentation that includes visual or audiovisual material contributes to perceived realism. Combining pictures with words in a multimedia article may make messages more memorable than text only, as well as stimulate individual “capacity to take in, comprehend, and more efficiently synthesize large amounts of new information” (Parkinson, 2012, para. 28). These processes may increase the likelihood of the audience being able to remember the information given in a story (Graber, 1996; Levie & Lentz, 1982).
Several theoretical frameworks support this expectation. The cue summation model (Severin, 1967), often used in journalism studies, predicts that learning increases with an increasing number of available cues or stimuli. It is argued that “when textual information is presented along with images it provides additional learning cues, particularly at the time of retrieval from memory” (Sundar, 2000, p. 482). Similarly, the dual coding framework, which has a long history in psychology, assumes that there are two cognitive sub-systems, one specialized in processing verbal stimuli, and the other specialized in nonverbal or image stimuli; these systems operate independently in encoding information into memory (Paivio, 1991; Sundar, 2000). Accordingly, information delivered via several different modalities could be better stored and more easily recalled. In fact, high visual–verbal redundancy in television coverage produces greater recall (Son, Reese, & Davie, 1987), and viewers’ memory of television stories with semantic overlap is considerably better than recall of the same printed stories (Walma van der Molen & Klijn, 2004).
However, messages delivered via several modalities can require more cognitive abilities and may be too demanding for the information processing system. The Limited-Capacity Information Processing (Lang, 2000) and the Multiple Resource (Wickens, 2002) models, for example, argue “that media messages, delivered simultaneously in a number of modalities, are cognitively complex and serve to overload the processing system” (Sundar, 2000, p. 482). Similarly, additional cues provided by the use of two or more information channels simultaneously within a single channel “may be distracting or even evoke responses in opposition to the desired types of learning” (Dwyer, 1978, pp. 29-30). The recent work on task switching also shows that people who use various media devices simultaneously often switch between media without conscious awareness (Brasel & Gips, 2011). More germane to our focus on multimedia content, presenting several integrated modalities (video, text, etc.), switching between content on single devices displaying multiple types of content is very frequent and generates high arousal (Yeykelis, Cummings, & Reeves, 2014). This task switching may deplete attention, diminishing knowledge gain.
These theoretical perspectives posit that the added resources utilized for encoding different modalities may compromise information processing and storage. A number of studies support this conclusion: Adding extra modalities to text degrades memory for content, and, in line with the arguments above, possible explanations for this effect include interference, distraction, overstimulation, cognitive overload, fatigue, or task switching (Sundar, 2000).
An important distinction to be made is between simultaneously or consecutively administered stimuli (Moore et al., 2004; Sundar, 2000). Traditional multimedia journalism entails simultaneous stimuli presentation. In embedded multimedia journalism, however, the narrative flow layout limits such presentations, and stimuli are attended to separately. Concerning simultaneous presentation of visual and text stimuli, it is likely that the visual attracts higher attention (Segel & Heer, 2010). Thus, if an image is placed in the middle, or right next to the textual element (as is often the case in the Christmas tree layout), the visual likely attracts attention away from the text. Thereby, the reading process may be disrupted, which ultimately might compromise recall and knowledge (see Segel & Heer, 2010). By contrast, a narrative flow layout in embedded multimedia journalism, where each media mode has its fixed place in the story line, may decrease such interference or distraction, thereby facilitating recall and leading to greater information gain.
Emotions
The role of emotions has gained considerable attention in media effects research, with studies often focusing on discrete emotions evoked by specific media content (e.g., Brader, 2005; Nabi, 1999; Valentino, Banks, Hutchings, & Davis, 2009; Weber, 2013). Here, we examine a slightly different issue, namely, whether the same story delivered via various formats affects the intensity of the evoked emotions. Due to the nature of our experimental materials—a story about child abuse (see details below)—we focus exclusively on negative emotions. We test anger and anxiety, two emotions that have been frequently studied in the context of news media (Marcus & MacKuen, 1993; Nabi, 2003; Valentino, Brader, Groenendyk, Gregorowicz, & Hutchings, 2011).
Audiovisual and visual messages typically heighten emotional responses (Detenber, Simons, & Bennett, 1998; Graber, 1996; Parkinson, 2012; Salomon, 1984; Schill, 2012). This is because visuals tap into “reservoirs” of collectively held knowledge and cultural associations (Schill, 2012) and engage the reader’s imagination by stimulating those areas of the brain that are not activated by text alone (Bobrow & Norman, 1975). Some argue that the holistic processing of visuals leads to unconscious emotional responses (Barry, 1997). Visuals can quickly communicate various emotions and activate powerful emotional responses from viewers (Hill, 2004; Schill, 2012). Studies exploring the differences between the two multimedia formats and the text-only story suggest that nonverbal messages convey 93% of emotional meaning (see Schill, 2012), and that moving images, relative to still images, increase viewers’ emotional arousal (Detenber et al., 1998).
Furthermore, research in both psychology (e.g., Frijda, Kuipers, & ter Schure, 1989) and political science (Valentino et al., 2009) indicates that some negative emotions increase attention, interest, and learning. First, although anger encourages careful attention to one’s surroundings (Marcus, Neuman, & MacKuen, 2000; Nabi, 1999), it also is related to immediate action that makes careful considerations difficult. Anger is characterized by lower cognitive effort and more superficial processing, leading people to make inferences based on accessible scripts (Tiedens, 2001), rely on shortcuts, and attend to peripheral cues (Bodenhausen, Sheppard, & Kramer, 1994). As a result, anger does not encourage people to gather and integrate new information and, ultimately, does little to enhance knowledge gain (Valentino, Hutchings, Banks, & Davis, 2008). In turn, anxiety encourages thoughtfulness (Berenbaum, Fujita, & Pfenning, 1995), deeper information processing (Bless, Mackie, & Schwarz, 1992), as well as greater vigilance and careful analysis (see Huddy, Feldman, & Cassese, 2007). As such, anxiety can “stimulate political interest, enhance the quality of information seeking in the political arena, and boost learning” (Valentino et al., 2008, p. 249; see also Hutchings, Valentino, Philpot, & White, 2006; Valentino et al., 2009), and function as a mediating variable, as outlined below.
Narrative Transportation
Last, we test narrative transportation as an outcome of exposure and as a facilitator of cognitive and emotional effects. Transportation refers to immersion into a narrative world and occurs when media consumers “lose track of time, fail to observe events going on around them, and feel they are completely immersed in the world of the narrative” (Green, Brock, & Kaufman, 2004, p. 247). Transportation theory emerged in the context of longer, often fictional, narratives, such as novels, television shows, or full-feature movies, and suggests that such content engages the viewers, transports them into the narrative, and generates story-consistent effects on attitudes or behavioral intentions (e.g., Slater, Rouner, & Long, 2006). The concept of transportation has also been applied to different formats, such as short messages or health advertising (Durkin & Wakefield, 2008; Wojcieszak & Kim, 2015), and the related concept of empathetic engagement has been applied to news formats (Oliver, Dillard, Bae, & Tamul, 2012). Embedded multimedia articles are typically quite long, contain multiple elements that guide users through a coherent story, and engage users with visuals, videos, and graphics. Therefore, transportation is directly relevant to embedded multimedia journalism, which may transport viewers into the story to a greater extent than the other journalistic formats.
Research on the effects of visuals versus text gives some indications of the differences between the multimedia stories and the text-only story. On one hand, the imaginative investment needed for processing textual material might increase transportation, in that a written text “allows a reader to participate more fully in creating a mental image of the story” (Green et al., 2008, p. 517). On the other hand, vivid images can transport the audience to a different time or place by increasing the ease or fluency with which people enter the narrative world (Green et al., 2008). For instance, audiovisuals may provide an even richer impression of characters, may make it easier to identify with people and situations, and may give “the viewer a sense of participating in an event or, at least, witnessing it personally” (Graber, 1996, p. 86). A study that compared a novel and a fictional movie found, on average, no differences in transportation between the reading and watching groups (Green et al., 2008). Yet, it may be easier for the audience to be transported into a story containing auditory and visual elements than through text alone.
In addition to being directly affected by media formats, narrative transportation facilitates other effects. Transportation may increase knowledge relevant to the story line (Green et al., 2004; LaMarre & Landreville, 2009; Murphy, Frank, Moran, & Patnoe-Woodley, 2011). That is, as people are transported into embedded multimedia content, they better remember the story details and are able to recall them with greater ease, because these individual details are linked, temporally and logically, with impactful story themes. Furthermore, transportation may mediate the effects of embedded multimedia on emotional reactions, as transported viewers are said to lose awareness of their surroundings, which leads to heightened emotions (Green & Brock, 2000). In other words, as people enter the “story world,” they will more vicariously experience various emotions generated by the content.
Research Question and Hypotheses
Drawing on the literature discussed above, we propose a research question and a number of hypotheses. Concerning learning, the literature presents competing theoretical models regarding the effects of multimodality (cue summation theory and dual coding framework, vs. the limited capacity model and multiple resource theory). However, the literature leans on single-channel theories that may not apply to embedded multimedia. We thus pose the research question:
Our first theoretical prediction focuses on the differences between the two multimedia formats. Because embedded multimedia formats are characterized by a narrative flow layout, we expect this format to more effectively increase knowledge gain compared with traditional multimedia, which typically add on various audio and/or visual components.
Turning to emotions, the literature suggests that audiovisual and visual messages typically heighten emotional responses, and that anxiety, in particular, encourages deeper information processing. Based on this research, we expect the following:
Looking at transportation, the literature indicates that it may be easier for the audience to be transported into a story containing auditory and visual elements, which leads us to expect the following:
Moreover, due to the emphasis of narrative flow in embedded multimedia stories, which is closely related to transportation (Green & Brock, 2000; Green et al., 2004), we also predict the following:
And last, through heightened transportation, we expect two mediating effects to occur:
Method
To assess these direct and indirect effects, we conducted a posttest-only experiment. The experiment was administered by Research Now, a survey and market research organization. The online panels of Research Now are research-only and the organization adheres to ESOMAR standards. The U.K. panel consists of approximately 325,000 members who were recruited through a multitude of recruitment campaigns. In each survey invite, panelists are informed about the survey topic and, in exchange for participation, are rewarded with an incentive, reflecting the length of survey. Incentives are awarded only once the survey has been completed.
For our study, Research Now recruited a sample of 258 participants from their U.K. panel. We set quotas on age, education, and gender to generate a sample that approximates the population distribution in the United Kingdom on these socio-demographics (see Table B1 in Appendix B).
Participants were randomly assigned to one of three versions of an online news story, each with identical content, but differing in modality and layout. A first check of the data revealed that a considerable number of respondents spent too little time on the study to have fully engaged with our stimuli. We excluded these outliers, resulting in a final sample of 173 participants (see below for more detail). Of those, due to a technical glitch, slightly more people were assigned to the text-only version (n = 69) compared with the traditional multimedia story (n = 55) and the embedded multimedia story (n = 49). Comparing the three experimental groups on key characteristics revealed no significant differences on gender, χ²(1, 2) = 3.72, p = .16; age, F(2, 170) = 1.69, p = .19; education, χ²(1, 4) = 3.22, p = .52; online news consumption (see below), F(2, 170) = 0.17, p = .84; offline print news consumption, F(2, 170) = 1.83, p = .16; or the time it took to complete the study, F(2, 170) = 1.31, p = .27, suggesting successful randomization. After exposure to the news story, participants completed a questionnaire that assessed information recall, emotional reactions, as well as transportation.
Stimulus Material
The stimuli were constructed for use in this experiment and were based, with permission from the newspaper, on Betrayed—Janne’s Story, an embedded multimedia story published by the Norwegian newspaper Bergens Tidende (http://multimedia.bt.no/janne/). Janne’s Story was one of Norway’s most-read stories in 2013 with more than 1,000,000 views and was awarded two of the most prestigious journalistic distinctions in Norway from the Norwegian Union of Journalists and the Norwegian Foundation for Investigative Journalism, respectively. The feature tells the story of Janne, a woman suffering severe health problems as the result of prolonged sexual abuse and negligence throughout her childhood. The stimuli used in the experiment were shortened versions of the original story. General facts and statistics about sexual child abuse were added for the purpose of the experiment. The story was translated from Norwegian into English, and English subtitles were added to the video material.
The story was chosen for several reasons: To assess knowledge gain, it was advantageous to choose a relatively uncommon topic such as child abuse, due to the expected lack of extensive prior knowledge about the issue. However, even if there was some prior knowledge about this topic, we assume that our randomization procedure assured that such prior knowledge would have been evenly spread across the three experimental conditions. Furthermore, transportation is more likely with high-quality narratives, whereas poor narrative quality may disrupt engagement (Bilandzic & Busselle, 2008; LaMarre & Landreville, 2009). Relying on Janne’s Story assured that our participants were exposed to a high-quality embedded multimedia publication and increased the external validity of our experiment. Also, the fact that the story included images of Janne from her childhood and a video of her state today increases perceived reality.
Experimental Treatment Conditions
All versions of the stimulus were made to look similar to optimize treatment equivalence and to better attribute the effects to the modality itself. The only differences were in the modality and the layout (for screen shots of the stimuli, see Appendix A). In the first condition (referred to as embedded multimedia), all modalities (text, video, and pictures) were present, and the layout embedded the multimedia modes, resulting in a linear narrative flow. As the readers scrolled down, they were presented with one modality at a time, each telling a unique part of the story, and the general facts about sexual abuse were embedded in the story line. In the second condition (referred to as traditional multimedia), the same modalities were present, but instead of an embedded layout, the Christmas tree layout was used. The text was placed in the center, the pictures were included in a carousel at the top, and the videos were placed at the bottom. The facts on sexual abuse were included in a text box to the side of the text. Both the embedded and the traditional multimedia versions contained 929 words, eight pictures with captions (three pictures also had a quote in the corner of the image), one audio clip (0:45 min), and one video (1:42 min). With average reading speed being between 250 and 300 words per minute (e.g., Ziefle, 1998; Trauzettel-Klosinski & Dietz, 2012) and online readers likely to read up to 370 words per minute (Noyes & Garland, 2008), the average time to fully consume the multimedia stimuli (including video/audio segments) was estimated at 5½ min. The third version (referred to as text-only) utilized the same layout as the embedded multimedia version (emphasizing linear, narrative flow). The additional facts were embedded in the story line, and the verbal elements in the videos were transcribed to plain text. The pictures and the captions were not transcribed, but the three picture quotes were added to the text. 
The text version contained 1,069 words and the average time to read the entire piece was about 3½ min.
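The time estimates above follow from simple arithmetic on word counts and clip lengths. The following is a minimal sketch of that calculation, assuming playback time is simply added to reading time; the function name and the set of reading speeds tried are illustrative choices, not taken from the study materials.

```python
def consumption_minutes(words, words_per_minute, media_seconds=0):
    """Estimated minutes to read `words` and play `media_seconds` of audio/video."""
    return words / words_per_minute + media_seconds / 60

# Multimedia versions: 929 words, one 0:45 audio clip, one 1:42 video.
MEDIA_SECONDS = 45 + 102

# Reading speeds cited in the text: 250 to 300 wpm, up to 370 wpm online.
for wpm in (250, 300, 370):
    multimedia = consumption_minutes(929, wpm, MEDIA_SECONDS)
    text_only = consumption_minutes(1069, wpm)
    print(f"{wpm} wpm: multimedia {multimedia:.1f} min, text-only {text_only:.1f} min")
```

At 300 words per minute this gives roughly 5.5 min for the multimedia versions and 3.6 min for the text-only version, in line with the estimates reported above.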
Our online experiment does not allow control for how extensively participants engaged with the stimuli, and, unfortunately, the survey company failed to deliver time stamps for how long participants exposed themselves to the stimuli. We, however, devised a measure of how long participants took to complete the entire survey, including stimuli exposure and the 28 posttest questions. We argue that participants who took less than 6 min for the entire survey—even if they are quick readers—were very unlikely to have attended to the whole story, or watched or listened to all of the multimedia features in our multimedia conditions. On the other end of the spectrum, we argue that 40 min to complete the entire survey is an appropriate cutoff point, to increase the likelihood that respondents answered questions with the stimulus material still fresh in mind.
We thus excluded participants who took less than 6 or more than 40 min to complete the entire study (both stimulus exposure and survey completion). To ensure we did not alter the outcomes with these cutoffs, we re-ran our analysis with alternative cutoff points, such as a minimum of 5 or 7 or a maximum of 60 or 120 min. This did not affect our results. Importantly, those who were excluded do not differ significantly from the remaining sample in terms of socio-demographics, and they were evenly distributed across the three conditions. In fact, on average, the embedded multimedia condition group took slightly longer to complete the entire survey (13.07 min), followed by the text-only group (12.57 min) and the traditional multimedia condition (11.18 min).
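The exclusion rule and the sensitivity check can be expressed as a simple filter on total completion time. This is a minimal sketch, not the procedure actually used: the completion times are hypothetical, the function name is illustrative, and treating the boundary values (exactly 6 or 40 min) as retained is an assumption based on the wording "less than" and "more than".

```python
MIN_MINUTES = 6
MAX_MINUTES = 40

def keep_participant(total_minutes, lower=MIN_MINUTES, upper=MAX_MINUTES):
    """True if total completion time lies in the inclusive [lower, upper] window."""
    return lower <= total_minutes <= upper

# Hypothetical completion times in minutes (not the study's data).
completion_times = [3.5, 5.4, 6.0, 12.6, 40.0, 52.0, 75.2]

# Main analysis window: 6 to 40 min.
retained = [t for t in completion_times if keep_participant(t)]

# One possible alternative window for the sensitivity check (5 to 60 min);
# pairing these two particular cutoffs is an assumption made for illustration.
retained_alt = [t for t in completion_times if keep_participant(t, lower=5, upper=60)]

print(retained)      # [6.0, 12.6, 40.0]
print(retained_alt)  # [5.4, 6.0, 12.6, 40.0, 52.0]
```

Re-running the analysis on each retained set and comparing the results is what the robustness check amounts to.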
Dependent Measures
We assessed participants’ knowledge gain by asking six questions related to sexual abuse in general, and to information about the story itself (e.g., “Why does Janne enjoy living at the nursing home?” “What is the most common relationship young female victims of sexual abuse have with their perpetrator?” or “How many girls approximately are sexually abused worldwide each year?”). Correct answers were assigned the value 1 and incorrect answers received a 0. The additive knowledge index ranged from 0 (no correct answer) to 6 (all answers correct; M = 2.20, SD = 1.82, Cronbach’s α = .73).
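The additive index and its reliability coefficient can be computed as sketched below. This is an illustration under stated assumptions: the response matrix is hypothetical, the function names are invented for the example, and Cronbach's α is computed with the standard formula using sample variances, which may differ in detail from the software the authors used.

```python
from statistics import variance

def knowledge_index(item_scores):
    """Additive index: number of correct answers (each item scored 0 or 1)."""
    return sum(item_scores)

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of per-participant item-score lists.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(responses[0])
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical answers from five participants to six items (not the study's data).
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

indices = [knowledge_index(row) for row in responses]
alpha = cronbach_alpha(responses)
print(indices)  # [6, 4, 2, 1, 0]
```

Each index value falls between 0 and 6, matching the range of the scale described above.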
Furthermore, participants were asked to rate, on a 7-point scale from none to a great deal, to what extent Janne’s story made them feel five negative emotions (disgusted, guilty, angry, afraid, and/or anxious). Factor analysis revealed two underlying dimensions: anxious (i.e., afraid, anxious, scared; M = 2.99, SD = 1.61; Cronbach’s α = .92) and angry (i.e., disgusted, angry; M = 4.42, SD = 1.72; Cronbach’s α = .87), which we use in our analyses.
To measure transportation, we included five items from the narrative transportation scale, previously shown to be valid and reliable (Green et al., 2004). Participants were asked how strongly they agreed or disagreed on a 7-point scale with statements such as, “I was mentally involved in the story line while reading it,” “While I was reading the article, I could easily picture the events in it taking place,” “The story affected me emotionally,” “I found myself thinking of ways the story could have turned out,” and “I wanted to learn how the article ended.” These items were averaged (M = 5.21, SD = 0.87, Cronbach’s α = .86).
Given the topic of the story and the potentially different experiences participants may have had with multimedia reporting, we controlled for age (M = 45 years), gender (60% female), education (45% of the sample had 4 years of college or more), familiarity with online news reading (three items ranging from 1 = low to 7 = high; M = 3.78, SD = 1.72, Cronbach’s α = .76), and the time participants spent completing the study as a proxy for their engagement with the stimuli (M = 12.26 min, SD = 6.31 min).
Results
To address our hypotheses and research question, we conducted a series of ANCOVA models that include participants’ age, gender, education, and online news exposure as covariates.
The first model tested the effect of our stimuli on knowledge gain and found a marginally significant overall effect at the 90% confidence level, F(2, 165) = 2.46, p = .08, η² = .03.
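The relation between the F statistic and the effect size can be sketched as follows. This is a minimal illustration that assumes the reported effect size is a partial eta squared and that the three-condition factor has two numerator degrees of freedom; both are assumptions of the sketch rather than statements from the analysis itself.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# F and the error df are taken from the text; df_effect = 2 is an assumption
# (three conditions minus one).
eta_sq = partial_eta_squared(2.46, df_effect=2, df_error=165)
print(round(eta_sq, 2))  # prints 0.03
```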
However, we find no support for the expectation that the embedded multimedia story leads to higher knowledge levels than the traditional multimedia story (
Our expectations regarding the effects of embedded multimedia, traditional multimedia, and text-only stories on emotional responses (
Furthermore, we expected that multimodality would lead to higher transportation than the text-only story (
Our theoretical framework led to hypotheses on mediation effects, in that knowledge may be enhanced indirectly, through the effect that embedded multimedia has on heightened emotional responses (
Discussion
Each new technology and each new media form generate debates about their potential revolutionary power (see Steensen, 2010). In a similar vein, the publication of Snow Fall led to a discussion on whether “the mainstream media is [are] about to forgo words and pictures for a whole lot more” (Greenfield, 2012, para. 1). During the past years, online newspapers and magazines have invested in utilizing new coding options and new design tools on an unprecedented scale (see Grabowicz et al., 2014). With multimedia modes integrated into the story line, some observers have expected embedded multimedia publications to alter traditional reading experiences. This study set out to ask whether embedded multimedia journalism generates different effects than other online journalistic formats. The study was designed as a first step toward offering insights into the potential differential cognitive and affective effects of embedded multimedia, traditional multimedia, and text-only journalism.
Overall, we find very few differences between the three story modes. Notably, and against our expectations, we find that those participants who read the text-only version learned slightly more about the story topic than those exposed to parallel content in a multimedia format, in particular, the traditional multimedia story. Rather than facilitating knowledge gain, the added multimodality made readers remember less. This is consistent with theories arguing that a combination of modalities within a story adds cognitive complexity, requires more resources to encode the different modalities, and may ultimately overload the processing system (e.g., Lang, 2000). In contrast, simple text, without accompanying graphics and videos, may be the easiest format to process, thereby leaving more “cognitive room” for information storage and recall (Sundar, 2000). That the traditional multimedia story produced the lowest knowledge gain, significantly lower than the text-only version, might indicate that the Christmas tree layout is the most confusing to readers, and that the multimedia features, presented simultaneously, may be distracting (e.g., Dwyer, 1978). Likewise, the multimedia format may require users to “switch” from one modality to another, and this switching may strain the cognitive system, potentially compromising learning and information recall. Whether our results were caused by cognitive overload, however, remains speculation, as we did not measure resource allocation, attention, focus, or other potential mechanisms that could explain the effects. The fact that embedded multimedia ranked somewhere in between the two other formats may indicate that the narrative flow layout led to less distraction than the traditional multimedia story’s Christmas tree layout.
When it comes to emotional effects, the differences between the stories were not significant, although those exposed to the embedded multimedia story scored higher on both emotions than the participants in the other conditions. Also, although we proposed that transportation could explain such effects, multimodality did not facilitate transportation, and the indirect effect was also non-significant. These non-significant differences may be due to limited empathy or identification with the story protagonist, an issue we discuss below. Future research should attend to how, and through which mechanisms, various reporting formats affect different discrete emotions and their intensity.
This largely non-significant pattern of effects is informative. Given that the content of the three stories was identical across the conditions, the effects, or the lack thereof, should be attributable to the added multimedia features and the ways in which these features were presented. However, before concluding that media organizations will be better off spending their resources on traditional articles than on creating embedded multimedia, more research is necessary on the effects of multimodality and, in particular, embedded multimodality. Our study has limitations.
First and most importantly, because the experiment was conducted online, we had no control over whether the participants in fact read or viewed each component of the stimuli. If the participants attended to certain components more closely, we do not know which ones. Excluding those participants who spent little time on the study provides some assurance that the final sample engaged with the stimuli. Nevertheless, it is possible that participants’ attention and engagement were insufficient to generate the effects. Moreover, we cannot exclude the possibility that the text-only version yielded higher knowledge gains because participants in the multimedia versions did not attend to the video or audio segments. To examine this issue, it would be beneficial to include measures that can determine whether, and which, components of the stimuli are viewed. We encourage future research in the area to test these issues in laboratory settings, where researchers have greater control over participants’ exposure, and to use thought-listing tasks, eye-tracking techniques, and physiological measures to assess whether and which layout features exert cognitive, emotional, and physiological effects when people view multimedia enhancements.
It needs to be noted, however, that low control over exposure and insufficient attention to stimuli are perennial concerns of online experiments, because respondents complete the studies in their free time and in the comfort of their own homes. The upside of using an online experiment for this type of research is that the treatments are administered in a close-to-real-life situation. That also means that it may be typical for readers of multimedia stories to not watch or listen to all video or audio segments. Thus, for some participants, such cursory exposure may be considered “naturalistic,” possibly adding external validity to our results. In that sense, news outlets can provide all the multimedia enhancements they wish, but if readers ignore them, then these enhancements are of little value.
Second, the null effects on emotions and transportation could be related to the fact that the story reported in the stimuli took place in Norway, whereas the participants were from the United Kingdom. It is plausible that they did not identify with the protagonist as much as they would have if she were British, and did not see the story as realistic for their own country. At the same time, using this multimedia story ensured that our stimuli were of high quality, and testing a similar story from the United Kingdom was not an option, given that participants may have been familiar with it already.
It must also be kept in mind that the stimuli focused on an idiosyncratic topic, child abuse and its consequences, and it is unclear whether the results generalize to other issues. In fact, this topic does not share many commonalities with other topics addressed by news media (e.g., the economy, governmental policies). Perhaps the cognitive effects would have been more pronounced if the materials had dealt with a more theoretical and convoluted issue (e.g., the economic crisis). In contrast, emotional reactions could have been stronger had we used an appealing, heart-warming story. Future research should more systematically attend to whether and how differences in topic matter for the effects exerted by multimedia reporting in general and specifically vis-à-vis other journalistic formats. It is possible that the subject matter should dictate how a story is presented. Perhaps some matters are best told using text only, others are best suited for audio and/or video formats, and yet others benefit from a multimedia approach. It is an important challenge for media organizations and communication research to determine which topics are best suited for which format.
With these limitations in mind, we conclude that embedded multimedia, traditional online multimedia, and text-based stories may not substantially differ in the effects they generate, and that media consumers may learn more from “good old” text articles. After all, embedded multimedia journalism is still a new format, and some of the null findings may be due to audiences still becoming acquainted with how to process this format. Although modality did not prove to have pronounced effects, it may affect whether or not audience members choose to attend to a news story in the first place. Finding that multimedia elements encourage the selection of and attention to stories would have important implications in the current fragmented media environment, and future studies should address this issue. Ours was the first study to systematically venture into the cognitive and emotional consequences of various journalistic forms. Therefore, further research is needed to build a more complete body of knowledge on the effects of emerging and traditional journalistic formats on various attitudinal, cognitive, and behavioral outcomes across various topics and issues.
Appendix B
Summary of Mean Comparisons.
| | Knowledge | Transportation | Anger | Anxiety |
|---|---|---|---|---|
| Text-only story (N = 69) | 2.58a (2.55)a | 5.14 (5.15) | 4.40 (4.37) | 2.84 (2.87) |
| Traditional multimedia story (N = 55) | 1.74b (1.87)b | 5.25 (5.26) | 4.15 (4.25) | 2.90 (2.80) |
| Embedded multimedia story (N = 49) | 2.16 (2.06) | 5.24 (5.22) | 4.75 (4.68) | 3.30 (3.34) |
| ANOVA (ANCOVA) F value | 3.33 (2.46) | 0.27 (0.26) | 1.53 (0.84) | 1.27 (1.72) |
Note. Cell entries are means and, in parentheses, the estimated marginal means from the ANCOVA models including gender, age, education, online news exposure, and time spent on the survey as covariates. Different a, b subscripts denote significant mean differences within columns at p < .1 (two-tailed).
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.