Abstract
Story Appraisal Theory posits that reduced memory representations of stories, or story kernels, are appraised in a three-dimensional story appraisal space. Stories deemed to have a point (pointedness), to be plausible (plausibility), and to be generalizable to society (probative value) are more likely to provoke implications than stories found wanting on one or more of these appraisal parameters. Story kernel–prompted implications, in turn, produce attitudinal and behavioral effects. Stories may have implications for the self, others (family and friends), and society. Four experiments found general support for the proposition that favorable appraisals promote implication generation. Experiments 2 to 4 revealed that implications partially mediate between the story appraisal dimensions and estimates of behavior change in response to the stories.
Unlike persuasive messages that present arguments and evidence advocating for a particular point of view, stories and narratives involve characters and events that are arranged in a causal structure that eventuates in an outcome. The persuasive intent of messages that rely on arguments and evidence is usually transparent to audiences, but the precise intent of stories may be substantially more opaque. Far from seeking to alter attitudes and behavior, narratives may be entertaining in their own right and may provoke a multiplicity of attributions about their authors’ communicative intent. Authors may seek to alter story consumers’ beliefs and actions; however, because good stories have entertainment value and may not be ostensibly aimed at achieving persuasive goals, they can be effective vehicles for doing so.
That narratives may sometimes be more persuasive than messages presenting arguments and evidence has become an increasingly accepted tenet among persuasion researchers (Brock, Strange, & Green, 2002; Dillard, 2010; Slater, 2002; Strange, 2002; Strange & Leung, 1999). Meta-analyses suggest that individuals exposed to narratives show the predicted attitude changes relative to those not exposed to them (Braddock & Dillard, 2012). Cultivation proponents contend that television’s persuasive power derives from its role as a storytelling medium (Gerbner & Gross, 1976), and media information campaigns employing stories are commonplace (Singhal & Rogers, 1999). Exemplification theory explains the inordinate impact of case reports in news stories (Zillmann, 2002; Zillmann & Brosius, 2000). Some have proposed that jurors arrive at verdicts by using narrative structures to process trial-related information (Pennington & Hastie, 1991, 1992). This optimistic view of narratives’ persuasive power is tempered by findings indicating that statistical evidence may sometimes be more influential than narrative evidence (Han & Fink, 2012). Regardless of the relative persuasive efficacy of narratives and statistical evidence, stories vary in their persuasive impact.
A number of models have been proposed to account for stories’ persuasive efficacy. The Transportation Imagery Model (TIM) posits that when story consumers are transported to the “story world” and experience belief-relevant imagery, narratives will influence them (Green, 2006; Green & Brock, 2000, 2002). Others have argued that narratives have impact because their consumers tend to generalize from the phenomena depicted in them, making them seem highly prevalent (Strange, 2002). For example, consistent with the availability heuristic, those exposed to a story about a specific homeless person will be more likely to provide inflated judgments of homelessness’ prevalence (Tversky & Kahneman, 1973). Identification with story characters has also been proposed to mediate narrative effects (De Graaf, Hoeken, Sanders, & Beentjes, 2012). While identification may be influential, the construct remains ill-defined (Cohen, 2001; Coplan, 2004). Some have argued that transportation and identification overlap in that both involve absorption with story characters (Sestir & Green, 2010). However, identification based on perceived similarity, emulation of story characters, and parasocial interaction has been purported to transcend transportation (Moyer-Gusé, 2008). While transportation, generalization, and identification may foster narrative impact, these constructs have not been integral to a coherent narrative impact theory. Story Appraisal Theory (SAT) represents an alternative to this single-variable, ad hoc approach to explaining narrative impact (Berger & Ha, 2011), although a few communication researchers have proposed comprehensive frameworks dealing with narrative comprehension and engagement (Busselle & Bilandzic, 2008).
The idea that narratives may have substantial impact within a variety of communication contexts can be subsumed under broader processing frameworks. Numerous theories and models have been advanced to elucidate the processes subserving story and text comprehension, from story grammars and story schemata (Mandler, 1984; Mandler & Johnson, 1977; Rumelhart, 1977), scripts (Bower, Black, & Turner, 1979; Schank & Abelson, 1977) to story memory (Schank & Abelson, 1995; Schank & Berman, 2002), text bases and situation models (van Dijk & Kintsch, 1983; Zwaan, 1999; Zwaan & Radvansky, 1998), the landscape model (van den Broek, Young, Tzeng, & Linderholm, 1999), and others (D. R. Roskos-Ewoldsen & Roskos-Ewoldsen, 2010; B. Roskos-Ewoldsen, Roskos-Ewoldsen, Yang, & Lee, 2007; Tapiero, 2007). These frameworks seek to explain how stories, text, and discourse are represented in memory and how they are recalled. Difficulties in developing coherent situation and character models have been postulated to undermine transportation and identification processes, thus attenuating narrative impact (Busselle & Bilandzic, 2008).
Some theorists have contended that the ubiquity of storytelling in everyday social commerce has led to the development of a “story memory” dedicated to processing stories and story-like experience (Schank, 1990; Schank & Abelson, 1995; Schank & Berman, 2002). Others have asserted that story comprehension theories should specify how (a) story meaning is represented in processors’ minds, (b) these representations are generated during comprehension, and (c) representations are used to accomplish such tasks as retrieving the story from memory (Graesser, Olde, & Klettke, 2002). As crucial as these memory processes are for determining story comprehension, these models are mute about the implications comprehended stories have for their processors. Story processors may comprehend story events, their causal structure, and story characters’ goals, but discerning implications that flow from stories, so comprehended, is another matter. SAT accords this implication generation process a significant role in producing narrative effects (Berger & Ha, 2011).
SAT postulates that stories are represented in memory, in gist-like form, as story kernels. These kernels retain information about important story characters and events, and the timeline and causal structure of the story but omit details not critical to the story’s plot. Substantial research supports the proposition that stories are represented in memory in reduced form (Bartlett, 1932; Black & Bower, 1980; Mandler, 1984; Mandler & Johnson, 1977; Trabasso & van den Broek, 1985; van den Broek, 1997). Story kernels are appraised for their implications. If implications are generated, they can prompt affective, cognitive, and behavioral responses. Attenuated implication production will provoke limited effects. The idea that implication generation is integral to appraisal processes is explicitly embodied within some appraisal theories of emotion (Scherer, 2009). In such models, implications appraisals are posited to follow primary sensory-motor appraisals. SAT postulates that story processors can appraise stories and generate implications consciously and deliberately or pre-consciously (Bargh, 1997; Bargh & Chartrand, 2000); moreover, pre-consciously generated implications that cannot be verbalized may exert effects on attitudes and action.
Implications are inferences based upon the characters and events depicted in the story kernel. Thus, an individual exposed to a news story about a local burglary could infer that the community needs more police or that a house alarm system should be purchased. Of course, implications could be generated in response to a non-narrative, statistical depiction of the burglary problem and to other seemingly singular events, such as the testing of a nuclear weapon by a heretofore non-nuclear country. In the latter case, the nuclear test might be the concluding event of a long narrative about how the country developed the weapon. In the former case, individuals exposed to the burglary statistic might be prompted to mentally simulate prototypical, story-like burglary scenarios that prompt implications. Because narratives link multiple events in a coherent manner, they may afford more implication generation opportunities than statistical depictions or depictions of ostensibly singular events that are part of a larger, albeit implicit, narrative. This more catholic view of narrative’s relationship to human communication is reflected in Fisher’s (1987) narrative paradigm.
Implications may be germane to the story processor (self), to those whom the story processor knows such as friends (others), or to larger social entities (society). Given this tripartite distinction, we would caution that conflating the constructs of implications and personal relevance might create unnecessary conceptual muddle. Specifically, personal relevance is a global construct that focuses generally on the self, while our conceptualization of implications includes not only the self but also others and larger social entities. Global characterizations of relevance and measures that reflect this generality, for example, asking participants to rate, on a 7-point scale, the relevance of a virus warning to them (Bordia, Difonzo, Haines, & Chaseling, 2005) or asking participants to rate the personal importance of a videotaped warning (Claypool, Mackie, Garcia-Marques, McIntosh, & Udall, 2004), contrast with the more particularistic conceptualization of implications. Because implications include others and society, story processors could generate numerous implications in these two categories but few that are self-relevant; thus, a lifelong non-smoker exposed to a story about a heavy smoker dying of lung cancer might generate few self implications but many for others (relatives/friends who smoke) and for society (increased health care costs). Implications germane to others and society might serve to prompt individual action, even in the absence of self-relevant implications. Implications vary in their importance; consequently, a highly important implication might affect an outcome more than several less important implications.
Because implications are generated through associative processes that include experientially acquired knowledge, individuals with similar understandings of story events could generate very different implications from similar story kernels. Although story kernels can be “re-comprehended,” as when individuals attempt to “get their stories straight,” kernels may be reappraised either individually or socially for their implications without altering the kernel. Story kernel reappraisal is akin to “second guessing” (Hewes, 1995) and has been the focus of investigations in political (Hwang, Gotlieb, Nah, & McLeod, 2007) and health communication contexts (Dunlop, Kashima, & Wakefield, 2011). The idea that there may be substantial variability in the numbers and kinds of implications processors garner from similar story kernels is compatible with the message interpretation model (Edwards, 1998; Edwards & Bello, 2001). This model proposes that message processors’ interpretations mediate between input messages and their effects. Story-provoked implications are part of such interpretations.
SAT postulates that the implication generation process is set in motion by the outcome of appraisal processes occurring in a three-dimensional story appraisal space. Story kernels are assayed along the dimensions of pointedness, plausibility, and probative value. Pointedness indexes the degree to which story processors think the story kernel makes a point or points. In contrast to texts that are coherent merely because they portray a series of related events, compelling narratives make a point or multiple points. Stories and the events and actions that take place within them are frequently understood by reference to the point or points they make (Schank et al., 1982; Wilensky, 1982, 1983). Although seemingly similar, the constructs of story pointedness and perceived persuasive intent can be differentiated. Narratives that entail no particular persuasive intent can have a point or points. Protagonists in children’s stories may ultimately conquer villains of different ilks, and thus make a point, but, at the same time, not necessarily subserve any explicit or implicit suasive goals. Stories conveyed to provoke humor may require that listeners/readers apprehend the story’s point to experience the humorous response; however, unless one invokes a very broad notion of “persuasive intent,” even those who apprehend the story’s point and find it humorous may not perceive the story to entail persuasive ends. Indeed, such stories might involve perceived intent to entertain. Finally, even stories devised by sources to achieve narrow suasive goals may prompt their consumers to perceive unintended points.
Plausibility refers to story kernels’ believability and the degree to which story events could actually happen to people. Research reported under the aegis of “perceived reality” has concerned the determinants of the degree to which media consumers judge media content to be “real” (Potter, 1988). Media stories featuring typical elements are judged to be more real than those containing atypical elements (Shapiro & Chock, 2003); moreover, when evaluating media stories in terms of the self, story information consistent with story consumers’ spontaneous attributions of causality prompts increases in stories’ perceived realism (Shapiro, Barriga, & Beren, 2010). While studies such as these provide insights into the antecedents of perceived realism judgments of media stories, SAT focuses on the influence of the perceived plausibility of narratives writ large, especially with respect to their capacity to provoke story-related implications and effects. In addition to its narrow focus on media texts, one difficulty in drawing parallels between the perceived realism literature and the present theory is the conceptual and measurement confusion surrounding the perceived realism construct (Hall, 2003, 2009). These problems notwithstanding, some evidence suggests that plausibility may be a primary dimension on which perceived realism judgments are based (Hall, 2003). Although plausibility may seem to be related to the fiction–non-fiction distinction, fictional stories involve such themes as personal loyalty, sacrifice, and betrayal, all of which are plentiful in the non-fiction world. Because fictional story plots may resemble events that individuals have experienced, such stories seem plausible.
Finally, probative value concerns the degree to which story processors believe the story kernel to be diagnostic of the prevalence of the phenomena depicted in the story (Berger, 2007; Berger & Lee, 2011); that is, how reasonable it is to generalize from the story to conditions in the physical or social world. Probative value judgments grant the plausibility of the events portrayed in the story but ask whether it is reasonable to use the story as a basis for generalizing to conditions in the physical or social world in terms of their prevalence. Pointless stories are likely to provoke “so what?” responses and implausible stories are likely to be dismissed as “far-fetched.” Stories lacking probative value tend to be seen as “isolated instances,” and thus, tend to provoke few implications.
Some theorists have proposed criteria for “good stories” that resemble two of the three appraisal dimensions. Such criteria include coherence and fidelity (Fisher, 1987), external and narrative realism (Busselle & Bilandzic, 2008), and coherence and plausibility (Beach, 2010). However, none of these formulations deals directly with the issue of stories’ pointedness. Stories can be internally coherent in that they depict comprehensible event sequences, and these event sequences can comport with those that might occur in the “real world.” At the same time, however, the otherwise internally coherent and externally realistic story could be relatively pointless, as when a story depicts characters enacting highly routine behavior sequences in a plausible, coherent, and realistic manner, but with no discernable point (cf. Schank et al., 1982; Wilensky, 1982, 1983).
Some have argued that story consumers’ default processing mode is to accept stories as realistic (Busselle & Bilandzic, 2008). SAT is consistent with this postulate, but stories found wanting on one or more of the three appraisal space parameters are less likely to provoke implications and effects. Moreover, although SAT does not stipulate a specific canonical order in which appraisal space dimensions are activated, it seems possible that story events encountered as the story unfolds would be more likely to prompt plausibility assessments, with bizarre events in particular undermining plausibility. Such events could be represented in story kernels and influence plausibility judgments made at the story’s end. Sometimes, the unusual events in such stories become the central reason for telling the story; for example, when people say “Let me tell you about something bizarre that happened to me today.” Although pointedness and probative value judgments might be held in abeyance until the story’s end, it seems possible that the pointlessness of a story could become apparent to processors well before its conclusion; the notion of the long, drawn-out, boring story being consistent with this possibility. Sometimes stories may explicitly make reference to their point and may provide cues to the prevalence of the phenomenon they depict; however, consumers are frequently left to base these judgments on inferences rather than information explicitly provided in the story.
The present article presents four experiments designed to evaluate the SAT-derived model (see Figure 1). Story reappraisal processes were not investigated in the experiments. Experiment 1 examined the relationships between the story appraisal dimensions and implications, indexed by open-ended measures. Experiment 2 employed structured implications measures and estimates of behavior change enabling assessments of implications as a mediator of appraisals and effects. Experiment 3 examined the same relationships utilizing both open-ended and structured implications measures, and Experiment 4 employed a larger story sample.

Figure 1. A model of Story Appraisal Theory.
Experiment 1
Experiment 1 tested the following hypotheses:
Although the stories’ probative value was not manipulated, a probative value measure was included to test the following hypothesis:
The experiment was conducted at two different universities but employed the same stories and measures.
Method
Participants
The East Coast sample consisted of 131 undergraduate students, 51% women, whose ages ranged from 18 to 29 years (M = 20.08 years, SD = 1.50 years). The West Coast sample included 103 undergraduates, 65% women, whose ages ranged from 19 to 27 years (M = 21.35 years, SD = 1.37 years). All participants were enrolled in communication classes and received extra credit.
Procedure
As was the case in all four experiments, participants completed questionnaires individually and anonymously, and instructions indicated that they would be reading and responding to a story that ostensibly appeared in the local news media. Two of the stories, one titled “On the Way to Class” and the other titled “After Class,” described a student engaging in a series of everyday actions. Other stories depicted a computer theft at the student union and a campus mugging in either plausible or less plausible versions. The crime stories were modeled after ones that appeared in back issues of campus newspapers. Each story was 160 words long. Participants were randomly assigned to one of the six story conditions.
The mundane stories described the actions of a student either going to class in the morning or returning from an early evening class. In the morning version, on the way to a 9:00 a.m. class, a student stopped at the student union, placed a backpack on a table, purchased coffee, read a newspaper for a few minutes, and then continued on to class. In the evening version, the student left class to find a virtually empty campus and went to the student union, purchased pizza for dinner, checked for email, and then went home. The first few sentences of these two stories served as the introduction to the crime stories. In the morning story’s case, while the student was standing in the coffee line, a thief came to the table where the backpack had been placed, opened it, and stole an expensive new computer containing homework-related files that had not been backed up. In the evening version, as the student walked toward the union, an attacker appeared and demanded money. In addition to inflicting physical harm, the mugger fled with cash and a debit card, which was subsequently used to withdraw money from the student’s account. The student’s injuries were treated at the campus health center. Story events were identical for both universities, but local landmark names differed.
Pretests revealed that the mundane stories were judged to be more pointless than the crime stories. The plausibility ratings of the crime stories were fairly high; hence, less plausible crime story versions were developed. This goal was accomplished by introducing a mysterious event into the depicted crime sequence. Thus, just before the theft or mugging took place, a bright light appeared and a deep voice commanded “Do not steal!” The bright light then vanished but no one was there. In spite of this intervening event, one that we labeled “divine intervention,” the story ended in the same way as the more plausible crime story versions. Pretests confirmed that divine intervention story versions were judged to be less plausible than versions not containing the event. 1 After reading the story, participants rated the story’s plausibility, probative value, and the degree to which the story had a moral and a point. They listed, in three open-ended items, any story-relevant implications for themselves, other people (friends and family), and the campus, respectively. Instructions indicated that the just-read story might or might not have implications for the reader’s personal life, the lives of others they know, and the campus as a whole; that is, the story may or may not affect what the reader or others believe or the way the reader or others might act in the future, or both. They then completed demographic items.
Results
Index construction
Plausibility (M = 7.20, SD = 2.34, α = .89) was indexed by four items, responded to on 11-point scales, that asked participants to indicate the degree to which the news story they read was believable, plausible, true to life, and reflective of events that could happen to students on campus. 2 Pointedness (M = 5.53, SD = 2.79, α = .79) was indexed by two 11-point items that asked participants to rate the degree to which the story had a moral and a point. Probative value (M = 4.99, SD = 1.86, α = .88) was assessed by five 11-point items that asked participants how reasonable it is to generalize from the story to the broader situation on campus with respect to the depicted crime; how indicative, accurate, and reliable the story was of the crime’s prevalence on campus; and how much one should rely on the story as an indicator of the extensiveness of the crime on campus. Probative value items for the mundane stories asked the degree to which the story was an accurate, reliable, and so forth, indicator of students’ behavior. Probative value was positively correlated with plausibility (r = .46, p < .01) and pointedness (r = .14, p < .05), but pointedness and plausibility were not significantly correlated (r = −.07). 3
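As an illustration of how an internal-consistency coefficient such as the α values reported for these indices can be computed, the following is a minimal sketch of Cronbach's α in Python. The function name `cronbach_alpha` and the ratings are hypothetical and are not the study's data; the sketch assumes complete responses (no missing values) and uses sample variances.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(rows[0])  # number of items in the index
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings: five respondents by four 11-point items
# (illustrative only, not the data reported in the article).
ratings = [
    [8, 7, 8, 9],
    [5, 6, 5, 5],
    [9, 9, 8, 9],
    [3, 4, 4, 3],
    [7, 6, 7, 8],
]

alpha = cronbach_alpha(ratings)
print(round(alpha, 2))  # about 0.97 for these made-up ratings
```

Items that move together across respondents inflate the variance of the summed score relative to the summed item variances, which is what pushes α toward 1.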
Implications coding
Because participants listed implications, there was no need to unitize them prior to coding them into categories. Coders counted the number of self (M = 1.84, SD = 1.39), other (M = 1.70, SD = 1.28), and campus (M = 1.79, SD = 1.23) implications. These three measures were significantly intercorrelated (rs = .52-.63, p < .01) and were summed to form a total implications index (M = 5.30, SD = 3.31, α = .80). Because the pointless stories prompted participants to generate both fewer and highly idiosyncratic implications, only implications listed for the crime stories were content coded. Two coders independently classified the implications generated by a randomly selected sub-sample of 48 protocols (24 protocols per university) into the three implications categories shown in Table 1.
Table 1. Exemplars and Kappa Reliabilities for Implications Categories for Experiment 1.
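For readers unfamiliar with kappa as an index of intercoder agreement, the following is a minimal sketch of Cohen's kappa for two coders in Python. The function name `cohens_kappa`, the category labels, and the codes are hypothetical illustrations, not the coders' actual classifications; the sketch assumes both coders classified the same units in the same order.

```python
from collections import Counter


def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa: chance-corrected agreement between two coders."""
    n = len(codes_a)
    # Observed proportion of units on which the coders agree.
    p_observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Agreement expected by chance from each coder's marginal proportions.
    counts_a, counts_b = Counter(codes_a), Counter(codes_b)
    categories = set(counts_a) | set(counts_b)
    p_expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical codes for ten implications (illustrative only).
coder_a = ["precaution", "precaution", "safety", "safety", "security",
           "security", "precaution", "safety", "security", "precaution"]
coder_b = ["precaution", "precaution", "safety", "security", "security",
           "security", "precaution", "safety", "security", "safety"]

kappa = cohens_kappa(coder_a, coder_b)
print(round(kappa, 2))  # about 0.70: 80% raw agreement, 33% expected by chance
```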
Manipulation checks
As intended, the mundane stories (M = 3.11, SD = 1.89) were judged to be significantly less pointed than were the combined crime stories (M = 6.53, SD = 2.52), t(231) = 10.52, p < .0001, η² = .33. The plausible crime story versions (M = 7.23, SD = 2.08) were judged to be significantly more plausible than the less plausible crime story versions (M = 5.85, SD = 2.13), t(231) = 4.44, p < .0001, η² = .08. Mundane stories were judged to be quite plausible (M = 8.89, SD = 1.73). An ANOVA of the probative value measure with the story conditions as the independent variable was significant, F(2, 231) = 10.61, p < .0001,
Tests of hypotheses
To test Hypothesis 1, the mundane story condition was contrasted with the average of the two crime story conditions. As Figure 2 indicates, this contrast yielded support for Hypothesis 1. Mundane story readers generated significantly fewer (M = 3.59, SD = 3.18) implications than did crime story readers (M = 6.18, SD = 3.05), t(231) = 6.01, p < .0001, η² = .13. Contrary to Hypothesis 2, plausible crime story version readers generated the same number of implications (M = 6.19, SD = 2.99) as readers of the less plausible crime story versions (M = 6.19, SD = 3.13). No significant sex differences emerged in this analysis.

Figure 2. Mean total implications by story conditions for Experiment 1.
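The effect sizes accompanying the contrast tests can be related to the test statistics by a standard conversion. The sketch below assumes, as an illustration rather than as the authors' documented procedure, that the effect size for a single-degree-of-freedom contrast is obtained as t² / (t² + df). The function name `eta_squared_from_t` is hypothetical. Because the inputs are the rounded t values printed in the text, recomputed values may differ from the reported ones in the second decimal place.

```python
def eta_squared_from_t(t, df_error):
    """Effect size for a 1-df contrast: t^2 / (t^2 + error df)."""
    t_squared = t * t
    return t_squared / (t_squared + df_error)


# Contrast of mundane versus crime stories on total implications,
# using the rounded statistic reported in the text: t(231) = 6.01.
print(round(eta_squared_from_t(6.01, 231), 4))  # 0.1352
```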
We computed a series of hierarchical regressions in which sex was entered as a dummy variable in the first step and the three appraisal space measures in the second step. In none of these analyses was sex significant. 4
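The increment test used in a hierarchical regression of this kind can be sketched from the R² values of the two steps. The function name `f_change` is hypothetical, and the R² values in the example are made up for illustration; they are not results from the experiment. Only the sample size (131 + 103 = 234 participants) and the predictor counts (sex in step 1, three appraisal measures added in step 2) come from the text.

```python
def f_change(r2_reduced, r2_full, n, k_full, k_added):
    """F test for the R-squared increment when k_added predictors are entered.

    Returns (F, df1, df2), where df1 = k_added and df2 = n - k_full - 1.
    """
    df1 = k_added
    df2 = n - k_full - 1
    f_value = ((r2_full - r2_reduced) / df1) / ((1 - r2_full) / df2)
    return f_value, df1, df2


# n = 234 participants; step 1 has 1 predictor (sex), step 2 adds 3.
# The R-squared values below are HYPOTHETICAL placeholders.
f_value, df1, df2 = f_change(r2_reduced=0.01, r2_full=0.16,
                             n=234, k_full=4, k_added=3)
print(df1, df2)           # 3 229
print(round(f_value, 2))  # 13.63 for these made-up R-squared values
```

With four predictors in the full model and 234 participants, the error degrees of freedom work out to 229.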
When total implications for all three story conditions were regressed on the three appraisal space dimensions in the second step, there was a significant increment in accounted for variance, ΔF(3, 229) = 13.49, p < .0001,
Hierarchical regression analyses employing the same predictors were conducted on the precautions, safety, and security implications. As Figure 3 shows, the frequency with which three types of implications were generated differed significantly, F(2, 314) = 110.51, p < .0001,

Figure 3. Mean crime story implications for Experiment 1.
Discussion
Experiment 1 results lend credence to SAT’s proposition that stories judged to have a point provoke more implications than do stories that are deemed to be relatively pointless. The fact that the mundane stories prompted significantly fewer implications than did the more pointed crime stories, even though the mundane stories were judged to be more plausible and probative than the crime stories, suggests the importance of story pointedness in promoting implication production. Although the manipulated plausibility of crime stories did not prompt differential implication production as predicted by SAT, consistent with SAT, plausibility was positively and significantly associated with the production of precaution implications. Although the plausibility manipulation check for the crime stories was significant, the mean for the less plausible crime stories (5.85) was close to the theoretical midpoint of the plausibility measure (6.00), while the plausible story versions’ mean (7.23) was considerably above the midpoint. Thus, the less plausible crime stories may not have been sufficiently implausible to yield the predicted effects on implication production, an issue addressed in Experiment 2.
Experiment 2
The principal goal of Experiment 2 was to manipulate story plausibility such that it would better represent the plausibility continuum. This aim was accomplished by using the plausible and divine intervention conditions of Experiment 1 and adding a third condition that was less plausible than the divine intervention condition. Three additional measurement changes were made in Experiment 2. First, the story moral item was dropped from the pointedness index and replaced by two new items. This change was made because the two pointedness items used in Experiment 1 formed an index with somewhat less than desirable reliability. Second, instead of free listing story implications, Experiment 2 participants indicated the degree to which the story had implications for themselves, others, and the campus on three structured items. This approach was used as a potential way of detecting the influence of pre-conscious implications that could not be articulated in response to open-ended questions. Finally, participants estimated the degree to which the story would make them more likely to change their behavior. This measure enabled a test of the mediating effects of implications.
Method
Participants
A total of 58 undergraduate students, 59% women, enrolled in communication classes at a large West Coast university participated for a small amount of extra credit. Participants’ ages ranged from 19 to 26 years (M = 21.58 years, SD = 1.43 years).
Procedure
Three versions of the computer theft story were used in the present experiment. The plausible and divine intervention versions of the story were highly similar to those used in Experiment 1. A third, least plausible story version included both the divine intervention event and the fact that the thief stole only two #2 pencils rather than the expensive computer from the unattended backpack. The thief sent a note to the victim apologizing for taking the pencils. After reading the stories, participants made a series of judgments and completed demographic questions. Each story was 171 words in length.
Results
Index construction
Plausibility (M = 6.04, SD = 3.08, α = .95) and probative value (M = 4.58, SD = 2.03, α = .90) were indexed by the same items as those used in Experiment 1. In addition to judging the degree to which the story had a point, participants indicated the degree to which the story had a “take away message” and was “meaningful.” These additional items markedly improved the pointedness index’s reliability (M = 6.23, SD = 2.38, α = .90). 5 Probative value was again positively and significantly correlated with both plausibility (r = .49, p < .01) and pointedness (r = .26, p < .05), as were plausibility and pointedness (r = .43, p < .01). On three 11-point items, participants rated (none to many) the degree to which the story had implications for themselves, friends, and the campus (M = 6.26, SD = 2.72, α = .87). Participants indicated the degree to which the story would make them more careful, alert, and likely to change their behavior (M = 5.68, SD = 2.83, α = .90).
Manipulation check
The judged plausibility of the three story versions differed significantly, as intended, F(2, 55) = 20.50, p < .0001,
Tests of hypotheses
Figure 4 displays the estimated implications for the three story conditions. A contrast analysis revealed that the difference between the two extreme conditions was significant in the predicted direction, t(55) = 6.21, p < .0001, one-tailed, η² = .41; moreover, the three means differed significantly from each other according to a Student-Newman-Keuls (SNK) test. Hypotheses 1 through 3 were evaluated by a regression analysis in which the three story appraisal dimensions were entered as independent variables. Estimated implications was the dependent variable. Entry of the three appraisal dimensions produced a significant model, F(3, 54) = 20.48, p < .0001,

Figure 4. Estimated implications by story version for Experiment 2.
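The effect size reported for the contrast between the two extreme conditions is consistent with the common conversion from a t statistic, t² / (t² + df). The short check below assumes that conversion was used; the article does not state how the value was obtained, and the function name is hypothetical.

```python
def eta_squared_from_t(t, df):
    """Effect size for a single-df contrast: t^2 / (t^2 + df)."""
    return t ** 2 / (t ** 2 + df)

# Values taken from the reported contrast, t(55) = 6.21
print(round(eta_squared_from_t(6.21, 55), 2))  # 0.41
```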
SAT posits implications as a mediator between the three story appraisal space dimensions and effects (see Figure 1). This role was assessed by computing simple mediation models (Hayes, 2009, 2012) employing bootstrapping procedures (5,000 samples) and bias-corrected 95% confidence intervals (CIs) for the three appraisal space dimensions, with estimated implications as a mediator between the dimensions and estimated ΔBehavior measures. Three analyses each employed one of the appraisal dimensions as the independent variable and the other two dimensions as covariates. In all three analyses, the indirect effect (SE in parentheses) of implications was significant, plausibility, .11 (.08), 95% CI = [.0072, .3176]; pointedness, .17 (.11), 95% CI = [.0346, .4650]; and probative value, .12 (.07), 95% CI = [.0157, .3014]. 6 In addition, an appraisal composite was computed by trichotomizing and summing dimension scores for each participant. Scores ranged from 1 (low on all three) to 7 (high on all three) (M = 4.01, SD = 1.88). Figure 5 shows the significant implications indirect effect for the composite measure.

Figure 5. Estimated implications mediation analysis for Experiment 2.
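The mediation analyses rest on a bootstrapped indirect effect: the product of the path from an appraisal dimension to implications (a) and the path from implications to estimated ΔBehavior with the appraisal dimension controlled (b). The sketch below illustrates that logic with ordinary least squares in NumPy and a bias-corrected interval. It is not the Hayes macro the authors cite; the function names, argument names, and default seed are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def indirect_effect(x, m, y, covs):
    """a*b indirect effect of x on y through m, adjusting for covariates."""
    ones = np.ones(len(x))
    # a path: mediator regressed on predictor and covariates
    a = np.linalg.lstsq(np.column_stack([ones, x, covs]), m, rcond=None)[0][1]
    # b path: outcome regressed on mediator, predictor, and covariates
    b = np.linalg.lstsq(np.column_stack([ones, m, x, covs]), y, rcond=None)[0][1]
    return a * b

def bc_bootstrap_ci(x, m, y, covs, n_boot=5000, level=0.95, seed=0):
    """Point estimate, bootstrap SE, and bias-corrected CI for a*b."""
    rng = np.random.default_rng(seed)
    n = len(x)
    est = indirect_effect(x, m, y, covs)
    boots = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)   # resample cases with replacement
        boots[i] = indirect_effect(x[idx], m[idx], y[idx], covs[idx])
    nd = NormalDist()
    # Bias correction: share of bootstrap estimates below the point estimate
    prop = float(np.mean(boots < est))
    prop = min(max(prop, 1 / (n_boot + 1)), n_boot / (n_boot + 1))
    z0 = nd.inv_cdf(prop)
    alpha = 1 - level
    p_lo = nd.cdf(2 * z0 + nd.inv_cdf(alpha / 2))
    p_hi = nd.cdf(2 * z0 + nd.inv_cdf(1 - alpha / 2))
    lo, hi = np.quantile(boots, [p_lo, p_hi])
    return est, boots.std(ddof=1), (lo, hi)
```

Here `x`, `m`, and `y` are one-dimensional arrays (for example, pointedness, structured implications, and estimated ΔBehavior) and `covs` is a two-column array holding the other two appraisal dimensions. An indirect effect is treated as significant when the interval excludes zero.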
Discussion
In contrast to Experiment 1, Experiment 2 revealed that the three story appraisal dimensions are positively related to story implication generation. Moreover, the mediation analyses support SAT’s prediction that implications generated in response to stories play a mediating role between appraisal space dimensions and story impact. The fact that implications were assessed by structured measures in Experiment 2 raises the possibility that the Experiment 2 results may have arisen from common methods bias. This explanation for the enhanced relationships between the appraisal space dimensions and estimated implications observed in Experiment 2 was rendered less plausible by the failure to find a coherent unifactor solution that adequately fit items employed in the measures used in the analyses. 7 Experiment 3 directly compared open- and close-ended implications measures to clarify this question.
Experiment 3
Experiment 3 sought to determine the degree of concordance between open- and close-ended implication measures and to determine how the two measurement approaches would fare when employed as mediators between the story appraisals and an effect measure.
Method
Participants
A total of 165 undergraduate students, 69% women, enrolled in communication classes at a large, West Coast university participated for extra credit. Participants’ ages ranged from 18 to 30 years (M = 21.44 years, SD = 1.61 years).
Procedure
The three versions of the computer theft story used in Experiment 2 were employed. The open-ended and structured implication items were counterbalanced across the story versions. 8 Participants were randomly assigned to the six conditions. After reading the story, participants completed the implications items and items assessing pointedness, plausibility, probative value, estimated ΔBehavior, as well as demographic items.
Results
Index construction
The plausibility (M = 6.27, SD = 2.87, α = .95), pointedness (M = 7.14, SD = 2.39, α = .89), and probative value (M = 4.53, SD = 2.27, α = .88) measures used the same items as those in Experiment 2. Probative value was positively correlated with both plausibility (r = .39, p < .01) and pointedness (r = .33, p < .01), as were plausibility and pointedness (r = .50, p < .01). The structured implications (M = 6.56, SD = 2.29, α = .84), estimated ΔBehavior (M = 6.46, SD = 2.73, α = .92), and appraisal space composite (M = 4.07, SD = 1.84) measures consisted of the same items used in Experiment 2.
Open-ended implications coding
Two judges independently coded implications into precautions, safety, and security categories using a randomly selected sample of 35 protocols. The category (κ) reliabilities were .94, .89, and .91, respectively. Total implications ranged from 0 to 12 (M = 3.96, SD = 3.11). Figure 6 displays the frequencies of the three implication types. A within-participants ANOVA of the three implications categories was significant, F(2, 328) = 40.99, p < .0001,

Figure 6. Mean computer theft story implications for Experiment 3.
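The intercoder reliabilities reported for the three implication categories are Cohen's κ values, which adjust raw agreement between the two judges for agreement expected by chance. A minimal sketch follows, assuming each judge's decisions for one category are stored as parallel lists of labels (for example, 1 = category present, 0 = absent); the coding layout and the function name are assumptions, not details given in the article.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' labels on the same set of units."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("Both raters must code the same non-empty set of units.")
    n = len(rater_a)
    # Observed proportion of units on which the raters agree
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(counts_a[label] * counts_b[label] for label in counts_a) / n ** 2
    if p_e == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_o - p_e) / (1 - p_e)

# Raters agree on 3 of 4 units; chance agreement is .50
print(cohens_kappa([1, 1, 0, 0], [1, 0, 0, 0]))  # 0.5
```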
Manipulation checks
A 3 (story conditions) × 2 (implications measures order) ANOVA of the plausibility measure yielded only a significant story condition main effect, F(2, 159) = 40.08, p < .0001,
Tests of hypotheses
To determine the concordance between structured and open-ended implications measures, the total number of open-ended implications and the numbers of the three specific types of implications (precautions, safety, and security) were correlated with the structured implications index. Structured implications were positively related to the open-ended total (r = .14, p < .07) and to the number of precautions implications (r = .22, p < .01). These correlations did not differ significantly by presentation order. No other correlations approached significant levels.
Mediation analyses in which structured implications, total open-ended implications, precaution, safety, and security implications served as mediators between the appraisal composite and estimated ΔBehavior measures were computed. 10 Presentation order failed to produce any significant moderating effects. Precautions, safety, and security implications evinced non-significant indirect effects; however, as shown in Figure 7, the structured and total open-ended implications measures each yielded significant indirect effects. Additional analyses employing each appraisal space dimension as an independent variable and the other two dimensions as covariates revealed no significant indirect effects for open-ended implications. By contrast, significant indirect effects were found for pointedness, .05 (.02), 95% CI = [.0073, .1311], and probative value, .04 (.03), 95% CI = [.0015, .1363], but not plausibility, .02 (.01), 95% CI = [−.0064, .0619], when the structured implications measure served as the mediator.

Figure 7. Structured implications and open-ended implications as mediators for Experiment 3.
Discussion
Although the total number of open-ended implications and the structured implications indexes were positively but not significantly related, responses to the structured implications items were driven to some extent by the degree to which individuals generated precaution implications, the most prevalent open-ended implication. The structured implications measure showed a stronger indirect effect as a mediator between the story appraisal dimensions and estimated ΔBehavior, both for the composite appraisal space measure and for two of the three individual appraisal space dimensions. This difference may have arisen because the structured measure may reflect more than consciously generated implications. The fact that the total number of open-ended implications exerted a weak but discernable indirect effect between the appraisal space composite and estimated ΔBehavior suggests that story appraisals may have motivational properties that prompt the production of all types of implications.
Although Experiments 2 and 3 provided evidence supporting the proposition that implications partially mediate between story space appraisals and estimated behavioral effects, the fact that the same story versions were employed in both experiments raises the specter of the language-as-fixed-effects fallacy (Jackson, 1992). These significant indirect effects could be limited to the computer theft story and its different versions. Experiment 4 addressed this issue.
Experiment 4
Experiment 4 sought to determine whether the significant mediation effects observed in Experiments 2 and 3 would generalize across other story topics; thus, stories dealing with computer theft, identity theft, and the flu were included. The three versions of the computer theft story used in Experiments 2 and 3 were again employed. The identity theft and flu stories were each written in three versions to represent three levels of plausibility.
Method
Participants
A total of 279 undergraduate students, 72% women, enrolled in communication classes at a large, West Coast university participated for a small amount of extra credit. Participants’ ages ranged from 18 to 33 years (M = 21.43 years, SD = 1.80 years).
Procedure
In the identity theft story, a student discovered that someone had made US$700 of illegal debit card charges on his or her account. In the less plausible version, the student dreamt that the thief sent an apology note for the theft, but the student never received a note. In the least plausible version, the day after the dream, the theft victim received an apology note but no money from the thief. The flu story described severe flu symptoms a student experienced and their negative impact on the student’s test grades. In the less plausible version, the student dreamt that immune system killer T cells attacked and killed the flu virus; however, the next morning the flu symptoms remained unabated. In the least plausible version, after having the dream, the next morning the flu symptoms miraculously vanished. In all flu story versions the student’s test grades suffered. Each story was 171 words long. Participants were randomly assigned to the nine story conditions. Participants completed structured implications, pointedness, plausibility, probative value, estimated behavior change, and demographic items.
Results
Index construction
The same items used in Experiments 2 and 3 were employed to measure the following variables: plausibility (M = 7.06, SD = 2.63, α = .94), pointedness (M = 6.66, SD = 2.36, α = .84), probative value (M = 5.11, SD = 2.10, α = .89), structured implications (M = 6.35, SD = 2.33, α = .83), and estimated ΔBehavior (M = 5.37, SD = 2.80, α = .93). The appraisal space composite (M = 3.75, SD = 1.74) was computed in the same way. Probative value was positively and significantly correlated with both plausibility (r = .44, p < .01) and pointedness (r = .48, p < .01), as were plausibility and pointedness (r = .41, p < .01).
Manipulation check
A 3 (story conditions) × 3 (plausibility conditions) ANOVA of the plausibility measure yielded a significant plausibility conditions main effect, F(2, 270) = 57.38, p < .0001,
Tests of hypotheses
Two 3 × 3 ANOVAs of the implications and estimated ΔBehavior measures both yielded significant main effects for the story plausibility conditions, F(2, 270) = 10.35, p < .0001,
We aggregated the stories and conducted analyses to determine whether implications would again show evidence of mediating the relationship between story appraisal space and estimated ΔBehavior. The indirect effect of implications was significant for the appraisal space composite, replicating the results of Experiments 2 and 3 (see Figure 8). Analyses of each appraisal dimension as an independent variable and the other two dimensions as covariates showed significant indirect effects for implications for plausibility, .02 (.01), 95% CI = [.0013, .0557] and pointedness, .04 (.02), 95% CI = [−.0013, .1084], but not probative value, .02 (.01), 95% CI = [−.0007, .0578].

Figure 8. Estimated implications mediation analysis for Experiment 4.
Combined analyses
Given that the same measures of the three appraisal space dimensions, structured implications, and estimates of behavior change were used in Experiments 2 to 4, the data for these measures were aggregated (N = 502) and used to compute mediation models, all of which employed implications as the mediator. The indirect effect of implications was significant in the case of the appraisal space composite, .12 (.04), 95% CI = [.0443, .2044]; plausibility, .02 (.01), 95% CI = [.0043, .0488]; pointedness, .04 (.02), 95% CI = [.0090, .0942]; and probative value, .03 (.01), 95% CI = [.0040, .0606].
Discussion
The SAT-predicted indirect effect for implications was again significant in Experiment 4, using a more heterogeneous set of stories. This was the case when the composite appraisal space measure as well as the plausibility and pointedness measures served as independent variables. The implications indirect effect fell just short of significant levels when probative value was the independent variable. However, in view of the combined analysis results, probative value’s failure to provoke a significant implications indirect effect may have arisen from sampling error, as also may be the case for plausibility’s failure to do so in Experiment 3.
General Discussion
The four experiments provide general support for SAT’s contention that stories lacking a point, plausibility, and probative value tend to prompt fewer implications than stories scoring higher on these three appraisal parameters. However, as Experiment 1 showed, there are conditions under which stories found wanting on even one of these dimensions may provoke fewer implications. Specifically, stories that are plausible and probative but pointless may prompt fewer implications. In this regard, a reader of one of the mundane stories included in Experiment 1 indicated that he “kept waiting for something to happen” in the narrative. Events literally did happen in these stories, but they were commonplace and predictable. Although story appraisals involve the three dimensions, of the three, pointedness may be more influential than the other two for implication production. In the mediation analyses of the individual appraisal space dimensions, the indirect effect of implications was consistently higher when pointedness served as the independent variable.
While the four experiments focused on story-provoked implications and their potential to mediate narrative impact, the evidence adduced in Experiments 2 to 4 also supports the proposition that resultant appraisals from story appraisal space can exert a direct effect on story impact, pending the identification of other mediators (Bullock, Green, & Ha, 2010; Bullock & Ha, 2011). Although SAT proposes that pre-consciously generated implications may drive narrative impact, it is also possible that appraisals might provoke emotional responses to stories that, in turn, affect narrative impact. Such intuitive processes need further attention (Berger, 2007). The postulated three-dimensional story appraisal space might be viewed as a filter that determines whether stories warrant further processing. Stories lacking a point, plausibility, and probative value may be dismissed, thus preserving scarce cognitive resources. Is it possible to be transported or to identify with story characters if a story is a bad one? This “gatekeeping” function might extend to pre-conscious processing.
The themes of the open-ended implications generated in Experiments 1 and 3 suggest that implications may be viewed as precursors to behavioral intentions (Ajzen, 1991; Fishbein & Ajzen, 1975). For example, purchasing property insurance or backing up one’s computer in response to the computer theft story seem akin to abstract behavioral intentions that could be particularized by more specific behavioral intentions such as calling an insurance agent on a specific day in order to buy property insurance or backing up one’s computer on a specific schedule. Although this hierarchical relationship between implications and behavioral intentions seems plausible, more work is necessary to establish linkages between the two.
The fact that precaution implications correlated positively but weakly with the structured implications measure in Experiment 3 suggests that generalized, structured estimates of implications can reflect, somewhat, the content of prevalent types of implications. However, because this correlation was not robust, questions remain about the processes tapped by generalized, structured measures. One possibility is that the generalized, structured estimates represent overall implication generation activity. In contrast, open-ended measures may pick up only the most prominent implications available to respondents. This potential limitation might be overcome in future studies by interviewing individuals in depth to see whether implication production might be enhanced. Prompting individuals for additional implications in an iterative manner might increase the numbers of implications they generate. Implication production could also reflect story complexity such that complex stories might provoke more implications. The stories employed in the present experiments were relatively simple and may not have served to activate large numbers of implications.
The stories included in the experiments involved crime and health-related threats and, thus, consistently provoked precaution, safety, and security implications. Nevertheless, individuals could generate implications that discount threats presented in stories and, thus, attenuate their impact. In addition, stories concerning less threatening domains might prompt story consumers to generate more heterogeneous sets of implications, including countervailing implications that dampen narrative impact. Future research should examine these possibilities.
As the SAT model suggests, story kernels may be reappraised for their potential implications. Reconsidering kernels may activate new implications and provoke delayed effects. Story kernels involving similar stories may be represented in memory in abstract form prompting individuals presented with what is ostensibly a “new” story to observe that they have “heard this story before.” These abstract representations may have associated implications that are automatically activated when the abstract kernels are activated. With respect to the stories employed in the present experiments, such a generalized story kernel might subsume a variety of “cautionary tales.” 13
That narratives can effect significant changes in audiences’ attitudes and actions is a claim that has achieved the status of a truism among narrative impact researchers. However, most would agree that bad stories are less likely to be impactful than good ones. SAT provides a framework for understanding how story kernels are appraised and how these appraisals prompt implications, as well as other mediating processes. These mediating processes, in turn, determine narrative impact. This episode of the SAT narrative concludes with the hope that the reader has found it to be sufficiently plausible, pointed, and probative to provoke numerous implications for future narrative impact theorizing and research.
Footnotes
Acknowledgements
The authors owe a deep debt of gratitude to the reviewers and the editor for their insightful and cogent comments on an earlier version of the article. Their questions and suggestions were highly instrumental in the clarification of various theoretical arguments and methodological issues. Of course, the authors bear responsibility for any remaining oversights.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
