Abstract
Young adults received information regarding the platforms of two candidates for mayor of a troubled city. Half constructed a dialogue between advocates of the candidates, and the other half wrote an essay evaluating the candidates’ merits. Both groups then wrote a script for a TV spot favoring their preferred candidate. Results supported our hypothesis that the dialogic task would lead to deeper, more comprehensive processing of the two positions, and hence a richer representation of them. The TV scripts of the dialogue group included more references to city problems, candidates’ proposed actions, and links between them, as well as more criticisms of proposed actions and integrative judgments extending across multiple problems or proposed actions. Assessment of levels of epistemological understanding administered to the two groups after the writing tasks revealed that the dialogic group exhibited a lesser frequency of the absolutist position that knowledge consists of facts knowable with certainty. The potential of imagined interaction as a substitute for actual social exchange is considered.
Central among the 2010 Common Core State Standards (CCSS) is proficiency in nonnarrative reading and expository argumentive writing. Similarly, the new Next Generation Science Standards (NGSS) identify argumentation as key among the process skills these standards feature (NGSS Lead States, 2013), as well as central to achieving mature epistemological understanding of the nature of science (Sandoval, 2014). However, not articulated in the CCSS or in the NGSS is how these proficiencies are achieved. The writing component is a particular challenge. Students of all ages find expository writing challenging and perform poorly in assessments (Graham, McKeown, Kiuhara, & Harris, 2012), and it is unclear how young writers asked to support claims with logical reasoning and relevant evidence, as the CCSS (2010) stipulate, can best learn to do so.
The dialogic approach to achieving this goal, reported on by Kuhn and Crowell (2011), rests on the view that dialogue supports the development of written argument by giving it a purpose. There is now someone to communicate to—the missing interlocutor (Graff, 2003)—and a purpose for communicating. This dialogic approach to promoting individual argumentive competence rests on a tradition that emphasizes the close connection between social and individual cognition. This tradition can be considered to go back as far even as Socrates, but certainly goes back to Vygotsky (1937/1987) and Mead (1934/1967), who described thought as a “conversation with the generalized other,” and, later, Billig (1987), who emphasized the connection between dialogic argument and the interiorized individual argument he claimed occurs in thought. Walton (2014) attributed to Grice (1975) the introduction of dialogic theory to modern analytical philosophy and attributed the further development of this theory to van Eemeren and his colleagues (e.g., van Eemeren & Grootendorst, 1992), who emphasized the need to evaluate arguments within their conversational context. According to Grice, an argument should be evaluated on the basis of its collaborative value as a contribution to dialogue.
Contemporary empirical studies have supported Graff’s (2003) claim of the benefits of dialogue, demonstrating that scrutiny of an opposing view and use of evidence are greater in dialogic contexts than in individual written argument (Kuhn & Moore, 2015; Macagno, 2016). Kuhn and Crowell (2011) found that middle schoolers participating in a multiyear intervention centered on peer dialogues wrote superior essays compared with a control group that participated in a nondialogic intervention of equal scope focused on whole-class discussion and essay writing.
In the study reported here, we asked whether a parallel benefit might appear in the context of a much more limited intervention, namely, a 1-hr individual session. Instead of engaging in actual dialogue, participants were asked to construct a written hypothetical dialogue between two expert arguers regarding which of two mayoral candidates who held contrasting positions was the better candidate. A comparison nondialogic group wrote an essay evaluating the merits of the two candidates. In addition to comparing characteristics of the dialogues and essays, we compared performance on a second, nondialogic writing task on the same topic. Our hypothesis was that the dialogic task would lead to a deeper, more comprehensive processing of the two positions, and hence a richer representation of them, that would be manifested not only in the constructed dialogue itself but also in the second, nondialogic writing task. Finally, a measure of epistemological understanding (asking the respondent to account for discrepant accounts of an event; Barzilai & Weinstock, 2015) was administered in order for us to examine possible effects of the dialogue task on this important competency.
Method
Participants
Sixty undergraduate students were recruited through notices posted around the campus of a 4-year college in a large city in the northeastern United States. Their mean age was 23.4 years (SD = 6.3 years, range = 17–55 years). Fifteen percent were freshmen, 12% sophomores, 30% juniors, and 42% seniors. They reported more than 20 different majors or prospective majors. Eighty-three percent were female and 17% male; the high percentage of females reflects the gender distribution of the entire population of the college (73% female). Fifty percent of the sample self-identified as Black or African American, 38% as Hispanic or Latino, 5% as White, and 2% as “other”; 3% did not report their race or ethnicity. The students were compensated for their participation with a $10 Dunkin’ Donuts gift card.
Procedure
The activity was conducted individually in a quiet room. Students were allotted as much time as needed, which ranged from 45 to 60 min. A random-number generator was used to assign 30 students to the experimental group and 30 to the comparison group. Upon a student’s arrival, the activity was explained, and the student gave consent. Following completion of the task, the student answered a brief demographic questionnaire.
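The source states only that a random-number generator produced two groups of 30; it does not describe the procedure. One common way to obtain equal-sized groups is to shuffle the participant list and split it in half. The following is a minimal sketch under that assumption; the participant IDs (1 to 60), the seed, and the function name are hypothetical and are not taken from the study.

```python
import random


def assign_groups(participant_ids, seed=None):
    """Randomly split participants into two equal-sized groups.

    Illustrative only: the shuffle-and-split procedure is an assumption,
    not the procedure reported in the study.
    """
    ids = list(participant_ids)
    if len(ids) % 2 != 0:
        raise ValueError("need an even number of participants")
    rng = random.Random(seed)  # seeded generator for a reproducible split
    rng.shuffle(ids)
    half = len(ids) // 2
    return {"dialogue": sorted(ids[:half]), "essay": sorted(ids[half:])}


# Hypothetical IDs 1..60 and an arbitrary seed
groups = assign_groups(range(1, 61), seed=2024)
print(len(groups["dialogue"]), len(groups["essay"]))
```

Shuffling and splitting guarantees the 30/30 balance, which independent coin flips for each participant would not.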
Experimental group’s task
Students in the experimental group worked first on a dialogic writing task (Kuhn, Zillmer, Crowell, & Zavala, 2013). They were asked to construct a dialogue between two expert arguers (TV commentators Chuck and Doug) regarding who was the better candidate (Ana Cruz or Maria Diaz) for mayor of a troubled city. A list of major problems in the city was presented, as were lists of the actions each candidate proposed to take if elected. (See Table 1 for the instructions and the information provided.)
Table 1. Dialogic Task Presented to Participants in the Experimental Group
Comparison group’s task
Students in the comparison (essay) group received a sheet containing the same information, but different instructions. In place of the section introducing Chuck and Doug and the dialogic frame, their materials had the following instruction:
Write an argumentive essay in which you consider the merits of each of these candidates for mayor.
(Also, the word essay replaced the word script in the final sentence of the instructions.)
TV-script task (all participants)
Next, participants from both groups were asked to write a TV script:
You’ve been asked to appear on a TV show and make a case for either Cruz or Diaz as the best candidate. Who would you choose and what would you say in your 2-minute talk? Write a short script for yourself.
Epistemological assessment (all participants)
Following the writing tasks, all participants were presented with the Livia problem (Barzilai & Weinstock, 2015; Kuhn, Pennington, & Leadbeater, 1983) as an assessment of their epistemological thinking. They were asked to follow along on a written copy as the experimenter read aloud two accounts of the fictitious Fifth Livian War between North and South Livia. One account was from a North Livian historian, and the other was from a South Livian historian. The participants then answered questions regarding their understanding of the discrepancies between the accounts. The questions were read aloud, each student answered orally, and the answers were audio-recorded.
Results
We first applied a common coding scheme to the dialogues written by the dialogue-group participants and the essays written by the essay-group participants. Our intention was to identify ways in which the dialogues and essays differed, to better understand any subsequent differences between the groups in the TV scripts they wrote. We then turned to the TV scripts themselves and the assessment of epistemological thinking.
Differences between the dialogues and essays
For the purpose of identifying differences between the dialogues and essays, we used the coding scheme in Table 2 (from Kuhn et al., 2013). First, the constructed dialogues and essays were segmented into idea units, each of which expressed a single idea or assertion. Two coders independently segmented six dialogues and eight essays to establish interrater reliability. The percentage agreement for identifying segments was 100%. Each idea unit in those dialogues and essays was then classified according to the coding scheme in Table 2. Percentage agreement between the same two coders was 91%. Disagreements were resolved through discussion. The remaining dialogues and essays were coded by the first author.
Table 2. Coding Categories and Illustrations of Statements Appearing in Participants’ Constructed Dialogues and Essays
In the case of a constructed dialogue, the arguer making the statement was either Chuck or Doug. In the case of an essay, the arguer was the participant writing the essay.
The mean number of idea units differed significantly between the dialogues and essays, t(58) = −2.72, p = .009, d = 0.70. The dialogue group produced more idea units (M = 9.33, SD = 4.49) than the essay group (M = 6.67, SD = 2.96). (As normality was violated for the number of idea units, a square-root transformation was applied; results were similar for the nontransformed data.)
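The reported effect size can be checked against the untransformed means and standard deviations. The sketch below assumes the conventional pooled-SD definition of Cohen's d for two groups of equal size; the source does not state which formula was used, and the function name is illustrative. The t statistic is not recomputed, because it was obtained from square-root-transformed data that are not available here.

```python
import math


def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Cohen's d for two equal-sized groups, using the pooled SD.

    With equal group sizes the pooled variance is the plain average
    of the two group variances.
    """
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd


# Idea units: dialogue group vs. essay group (values from the text)
d_idea_units = cohens_d(9.33, 4.49, 6.67, 2.96)
print(f"d = {d_idea_units:.2f}")
```

Under this definition the summary statistics yield d of about 0.70, which agrees with the reported value.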
A multinomial logistic regression, via SAS/Nonlinear Mixed (MLE), was conducted, with the frequencies of the category types in Table 2 as dependent variables and group (dialogue vs. essay) as an independent variable. Unsubstantiated statements were set as the reference category. The dependent variables were nested within each participant; therefore, a mixed-effects random-intercept logistic regression was fitted.
A significant group effect was found for two categories, simple comparisons and integrative comparisons. The mean number of simple comparative statements was 0.20 (SD = 0.55) for the essays of the comparison group and 1.30 (SD = 1.58) for the dialogues of the experimental group. Similarly, the mean number of integrative statements was 0.40 (SD = 0.68) for the essays of the comparison group and 1.67 (SD = 1.63) for the dialogues of the experimental group. Relative to the odds of making an unsubstantiated statement, the odds of making a simple comparative statement were 5.17 times higher for the dialogue group than for the essay group, and the odds of making an integrative comparative statement were 3.31 times higher for the dialogue group than for the essay group. Table 3 summarizes the modeling results for each category type.
Table 3. Results of the Multinomial Logistic Regression Examining the Content Differences Between the Dialogues and Essays
Note: OR = odds ratio.
Furthermore, the essay group was less likely than the dialogue group to produce any comparative statements: Four (of 30) essay-group participants and 17 (of 30) dialogue-group participants produced simple comparative statements, and this difference was significant, χ²(1, N = 60) = 12.38, p < .001, φ = .45. Similarly, 9 (of 30) essay-group participants included at least one integrative comparative statement in their essays, whereas 25 (of 30) dialogue-group participants included at least one integrative comparative statement in their dialogues. This difference was also significant, χ²(1, N = 60) = 17.38, p < .001, φ = .54.
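The two chi-square statistics above can be reproduced from the reported counts. The sketch below assumes a Pearson chi-square without continuity correction on a 2 × 2 table, with φ computed as the square root of χ²/N; the function and variable names are illustrative.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) and phi for [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = numerator / denominator
    phi = math.sqrt(chi2 / n)  # magnitude only; the sign depends on coding
    return chi2, phi


# Rows: essay group, dialogue group.
# Columns: produced at least one such statement, produced none.
simple = chi_square_2x2(4, 26, 17, 13)
integrative = chi_square_2x2(9, 21, 25, 5)

for label, (chi2, phi) in [("simple", simple), ("integrative", integrative)]:
    print(f"{label}: chi2 = {chi2:.2f}, phi = {phi:.2f}")
```

Under these assumptions the counts give χ² = 12.38 (φ = .45) and χ² = 17.38 (φ = .54), matching the reported values.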
Differences between the TV scripts of the two groups
The TV scripts were also segmented into idea units. Although the dialogue group produced slightly more idea units in their TV scripts (M = 3.17, SD = 1.97) than the essay group did (M = 2.93, SD = 1.80), this difference was not significant, t(58) = −0.48, p = .633, d = 0.13. We next asked whether there was evidence that the content of the TV scripts differed between the two groups, and in particular, whether there was evidence of the hypothesized richer representation of the two positions on the part of the dialogue group. To answer this question, we conducted a multinomial logistic regression on the number of references to one of the stated city problems, the number of references to a candidate’s proposed actions, the number of statements linking a problem and an action addressing it, the number of critical statements regarding a candidate’s position, and the number of comparisons of the two candidates. In these analyses, unsubstantiated statements were set as the reference category. Table 4 summarizes the modeling results.
Table 4. Results of the Multinomial Logistic Regression Examining the TV Scripts’ Content
Note: OR = odds ratio.
In addition to referring to fewer problems (M = 1.23, SD = 1.43, vs. M = 2.07, SD = 1.76) and fewer actions addressing them (M = 2.27, SD = 1.86, vs. M = 3.53, SD = 2.47), students in the essay group were more likely than those in the dialogue group to make claims unsubstantiated by any form of support. Sixty percent of the students in the essay group, but only 20% of those in the dialogue group, made at least one unsubstantiated claim, χ²(1, N = 60) = 10.00, p = .002, φ = −.41. The mean number of unsubstantiated statements was 1.07 (SD = 1.20) in the essay group, compared with 0.20 (SD = 0.41) in the dialogue group. Although the majority of the students in both groups made reference to at least some problems and some proposed actions, and a majority of the students in the dialogue group made links between them and made comparisons across candidates, only half of the students in the essay group ever referred to a link between a problem and action to address it, and fewer than half ever made a comparison across candidates.
The odds of referring to a problem were 27.55 times higher for the dialogue group compared with the essay group, and the odds of referring to a candidate’s proposed action were 25.76 times higher for the dialogue group. The odds of making a link between a problem and an action addressing it were 23.88 times higher for the dialogue group. The odds of noting a negative attribute of one of the candidates’ positions were 39.81 times higher for the dialogue group, and the odds of comparing the two candidates were 30.72 times higher for the dialogue group. (See Table 5 for examples of scripts written by students in the two groups.)
Table 5. Examples of TV Scripts of Students in the Essay and Dialogue Groups
Group differences in epistemological thinking
Responses to the Livia problem were coded on 22 dimensions (Leadbeater & Kuhn, 1989; Weinstock & Cronin, 2003) in order to assign each participant a level of epistemological understanding. Table 6 characterizes the six levels of this scheme in terms of 3 major dimensions: the nature of the accounts, why they differ, and how claims are justified. Levels 0 through 2 are regarded as predominantly absolutist, Level 3 as multiplist, and Levels 4 and 5 as evaluativist. A participant was assigned a level from 0 to 5 on each of the 22 dimensions, and then a dominant level—absolutist, multiplist, or evaluativist—was assigned.
Table 6. Characteristics of the Six Levels of Epistemological Understanding
Concurrent construct validity of this coding scheme was established by Kuhn, Cheney, and Weinstock (2000), in a study in which participants assigned to one of the three main epistemological levels using a different instrument were independently assigned to the same epistemological level using their responses to the Livia problem. To establish interrater reliability in the present study, two researchers independently coded 20% of the data. The percentage agreement was 83% across the 22 individual dimensions and 96% for the overall level assigned. All disagreements were resolved through discussion.
Figure 1 shows the percentages of participants assigned to each level as their predominant epistemological level. Eight participants from the essay group were categorized as absolutist, 9 as multiplist, and 13 as evaluativist. Zero participants from the dialogue group were categorized as absolutist, 12 as multiplist, and 18 as evaluativist. A 2 (group) × 3 (epistemological level) Fisher exact test revealed that epistemological levels differed significantly by group (p = .007).

Figure 1. Percentage of participants in each group who were assigned to each epistemological level.
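The 2 × 3 exact test on the epistemological-level counts can be sketched directly from the reported frequencies. The source says only that a Fisher exact test was used. The version below assumes the Freeman-Halton extension to 2 × 3 tables: with both margins fixed, it sums the probabilities of all tables that are no more probable than the observed one. The function name is illustrative, and this is not necessarily the software routine the authors ran.

```python
from math import comb


def fisher_exact_2x3(row1, row2):
    """Two-sided exact p value for a 2 x 3 table with fixed margins.

    Each candidate table is weighted by a product of binomial
    coefficients (its multivariate hypergeometric numerator); integer
    arithmetic keeps the "no more probable than observed" comparison exact.
    """
    cols = [x + y for x, y in zip(row1, row2)]
    n1 = sum(row1)

    def weight(a, b, c):
        return comb(cols[0], a) * comb(cols[1], b) * comb(cols[2], c)

    observed = weight(*row1)
    extreme = 0
    for a in range(min(cols[0], n1) + 1):
        for b in range(min(cols[1], n1 - a) + 1):
            c = n1 - a - b  # third cell is fixed by the row total
            if 0 <= c <= cols[2]:
                w = weight(a, b, c)
                if w <= observed:
                    extreme += w
    # comb(N, n1) is the sum of the weights over all admissible tables
    return extreme / comb(sum(cols), n1)


# Counts from the text: absolutist, multiplist, evaluativist
p_value = fisher_exact_2x3([8, 9, 13], [0, 12, 18])
print(f"p = {p_value:.3f}")
```

If the authors used the same ordering criterion, the result should fall close to the reported p = .007; a small discrepancy would point to a different definition of "as extreme" rather than to an error in the counts.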
Discussion
In the present work we extended the claim that dialogue has beneficial effects to a context in which dialogue was only hypothetical and constructed by a single individual. We begin with discussion of effects of the dialogue-construction task on understanding and then turn to behavior.
A mature level of epistemological understanding provides the foundation for serious discourse (Greene, Sandoval, & Braten, 2016; Kuhn et al., 2013; Moshman, 2015). If knowledge consists of claims not open to question—either facts that can simply be “looked up” or opinions that must be accepted as the unquestioned possessions of their holders (the stances reflected in less mature epistemological positions)—discourse serves little purpose. Nonetheless, an adult who remains largely at an absolutist level of epistemological understanding does not conceive of the world in a way identical to that of the 8-year-old absolutist, who understands all knowledge as matters of certain fact ascertainable from direct observation or authoritative sources. Adult absolutists have become aware that disagreement is commonplace and not always easily resolvable. The concept of a rigid stage progression in epistemological understanding has given way to one in which adults, certainly, if not children and adolescents as well, hold a loosely connected network of epistemological ideas that span adjacent levels (Hammer & Elby, 2003). As a result, a more useful assessment scheme may be one that categorizes individuals with respect to the degree to which their own set of epistemological ideas conforms to different epistemological levels (Barzilai & Weinstock, 2015).
If so, it is reasonable to suppose that the context or experience that is prominent for an individual at a given point in time may influence the epistemological ideas that the individual endorses at that time. This is the model that we believe applies to our data on young adults’ epistemological thinking immediately after they have constructed an adversarial dialogue, as opposed to after they have written an essay. We did not assess epistemological understanding prior to this activity and hence, of course, cannot say with certainty that the absence of individuals classified as absolutists in our dialogue group reflected a change away from absolutism that some would have exhibited prior to engaging in the dialogue task. Nor can we say that such a change, if it did occur, would be permanent. Rather, our interpretation is that performing the dialogue-construction task heightened awareness that conflicting claims with reasonable support can exist and that a straightforward resolution may not be apparent. Consistent with this view is other recent evidence that individuals may show a shift away from an absolutist position when they are “primed” by an experience that highlights multiple perspectives (Fisher, Knobe, Strickland, & Keil, 2016; Kienhues, Stadtler, & Bromme, 2011).
Another recent study provides evidence that dialogue can have beneficial cognitive effects even though it is not participated in directly and is only observed. Chi, Kang, and Yaghmourian (2016) reported a parallel effect when participants watched a video that was dialogic. A video of a tutee struggling to correct misconceptions while interacting with a tutor, Chi et al. found, was more conducive to learning than was a video of the tutor alone.
In the present study, we hypothesize, constructing a dialogue required the writer to repeatedly shift back and forth between one perspective and the other, generating credible arguments to support each, as well as counterarguments to weaken them, and rebuttals aimed at restoring their strength. As a result, the writer formed a richer representation, not only of each position and the evidence bearing on it, but also of the positions in relation to one another.
These results have both theoretical and practical implications. Theoretically, they support the close link between interpersonal and intrapersonal forms of thought, and practically, they support the claim that one can serve as a bridge to the other. Close analysis of middle-school students’ essays as they engaged deeply with a series of topics over 2 years (Kuhn, Hemberger, & Khait, 2016a, 2016b) showed that the dialogic structure of their argumentation with peers made its way into their essays. “Others might say . . . ,” a phrase rooted in the real-life other that dialogue provides, appeared increasingly often in their essays. Dialogue makes an opposing position and its accompanying arguments clear and vivid enough for a student to represent them in an essay and address them, but at least as important is gaining recognition of the relevance of doing so.
Role taking is a powerful mechanism that has the potential to substitute for actual social exchange. The present results suggest that it can be productive even in virtual, imagined form. One might think of imagined dialogue as an intermediate process between actual social interaction and passive observation of other people’s interaction, which research shows may be ineffective if the observer does not act on these observations in some way (Jewett & Kuhn, 2016; Muldner, Lam, & Chi, 2014). We, of course, cannot rule out additional, noncognitive differences between the two conditions in this study. In particular, the conditions might have differed in motivation and engagement to the extent that participants found the dialogue task novel and hence more engaging than the essay task. Engagement, however, is part of what is thought to make dialogue effective, and hence its potential contribution to the cognitive outcomes we have identified need not be excluded; it would nonetheless be desirable to further elucidate this contribution.
Although there now exists a good deal of evidence for the positive effects of peer interaction on intellectual development (Resnick, Asterhan, & Clarke, 2015), such interaction may not always be feasible, and in school settings especially, teachers may be disinclined to incorporate interactive activities into their lessons. We do not mean to suggest that the dialogue-construction activity we examined is a satisfactory substitute, or that everything possible should not be done to encourage and support authentic classroom discourse. The virtual form of interaction we examined here may be a less desirable but still productive substitute. In an era in which positions on an issue too often lack the deep analysis to support them, this may be worth knowing.
Footnotes
Acknowledgements
We thank Bryan Keller for providing his statistical expertise.
Action Editor
Gretchen B. Chapman served as action editor for this article.
Declaration of Conflicting Interests
The authors declared that they had no conflicts of interest with respect to their authorship or the publication of this article.
