Abstract
Many researchers claim that facilitation is a determining factor, if not a necessary condition, for successful deliberative discussion, but little research has applied randomized experimental designs to test this claim empirically. This article analyzes the effect of professionally facilitated versus non-facilitated discussions in a real-life context on participants’ attitudes and the perceived quality of group deliberation, controlling for various individual- and group-level variables. We conducted 26 deliberative discussions with 226 teachers from 13 primary schools on the topic of school discipline measures. We assessed the teachers’ post-discussion perceptions of the quality of the group deliberation and their attitudes toward school discipline measures pre- and post-discussion. The results show that facilitation significantly influences attitude change and the perceived quality of the group deliberation. Quality of deliberation is also influenced by the heterogeneity of restorative attitudes in discussion groups, whereas attitude change is also determined to a large extent by pre-discussion attitudes.
One can hardly find a reference in the field of deliberative democracy that does not at least indirectly acknowledge the need for a structured communicative process and thus some sort of moderation or facilitation. Communicative exchanges are at the center of deliberative democracy, which is understood as a talk-centric form of democracy that focuses on the communicative processes of opinion and will formation (Delli Carpini, Cook, & Jacobs, 2004). Many authors (e.g., Blumler & Coleman, 2001; Landwehr, 2014) warn about unregulated speech and stress the need to organize communication in larger groups to improve the decision-making process.
The field of deliberation research is replete with definitions of deliberative criteria or qualities of communication and methods for measuring them. Although there is no unanimous agreement in the literature regarding the deliberative standards of discussion (Burkhalter, Gastil, & Kelshaw, 2002; Delli Carpini et al., 2004; O’Doherty, 2013), the majority of studies are at least implicitly based on abstract deliberative ideals, such as equality, fairness, rationality, and inclusion that are operationalized in various forms in the context of deliberative discussions (Mansbridge, Hartz-Karp, Amengual, & Gastil, 2006). According to Burkhalter et al. (2002), small-group face-to-face deliberative discussions combine careful problem analysis and an egalitarian process, where participants have adequate speaking opportunities and can engage in dialogue that bridges emerging divergent ways of speaking and knowing. Based on this definition, Gastil and Black (2008) state that deliberation means carefully examining a problem to arrive at a well-reasoned solution.
Accordingly, operationalizations of deliberation not only reflect the ideals of deliberative democracy, but, more importantly, also make visible the importance of the deliberative process, which is not self-generating (Levine, Fung, & Gastil, 2005) but requires substantial intervention to create the conditions for effective, cooperative, and collaborative discussions. Interventions that set the ground rules for discussions in line with deliberative prerequisites are known as facilitation (Carcasson & Sprain, 2016; Dillard, 2013; Moore, 2012). It is thus facilitation “that turns everyday political talk into rigorous deliberative exchanges” (Dillard, 2013, p. 218). It can be claimed that facilitation is inherent to deliberation because every practice of face-to-face deliberative discussion includes facilitation (Ryfe, 2006). According to Black, Burkhalter, Gastil, and Stromer-Galley (2011), to meet the expectation of deliberative theory, discussions must include a trained facilitator who helps to orient the group around deliberative norms.
The common presumption of deliberation studies seems to be that groups without a facilitator cannot achieve a sufficiently high quality of discussion or reach effective solutions to the problems under discussion. It is therefore surprising that this presumption has not been systematically tested empirically. Little scholarly attention has been paid to facilitation and its effects in the field of deliberation research. Scholars (e.g., Abdel-Monem, Bingham, Marincic, & Tomkins, 2010; Karpowitz & Mendelberg, 2011; Landwehr, 2014; Setälä & Herne, 2014) recognize the lack of studies testing the effects of facilitation and call for experiments that manipulate facilitation, along with research designs that include control groups without professional facilitation, which would be required for a comprehensive assessment of the effects of facilitation. There is a clear need for randomized experimental research designs with sufficient control on both individual and group levels (Karpowitz & Mendelberg, 2011).
The purpose of this study is to empirically investigate the impact of professional facilitation on the deliberative qualities of small-group discussions and attitude changes among participants. This investigation implements a methodologically rigorous experimental design with random group assignment and control for individual- and group-level variables. Controlling for group-level characteristics and their effects on deliberative functioning and attitude change seems to be a very important but often neglected aspect of such research (Gastil, Black, & Moscovitz, 2008).
Facilitation in Deliberative Discussions: Role, Tensions, and Forms
Details about what facilitation is, how it is perceived, which roles facilitators perform, and how they fulfill them in practice are often missing in studies about deliberation (Wright, 2006). Recognizing the importance of addressing this gap, Landwehr (2014) distinguishes four ideal types of intermediation, namely, the chair, the moderator, the mediator, and the facilitator. The chair is in charge of applying the procedural rules and leading discursive interactions; the moderator evaluates contributions, rationalizes communication, and controls the participants’ emotions; and the mediator aggregates opinions, seeks solutions, and summarizes results. However, “the most appropriate role for an intermediary in deliberation is that of a facilitator” (Landwehr, 2014, p. 87), who has the task of helping the group achieve its goals and ensuring that all voices, arguments, and points of view are heard (Landwehr, 2014). Other authors (Epstein & Leshed, 2016; Ryfe, 2006) describe facilitation mostly as a process that enables and guides participants to engage meaningfully and effectively in deliberative discussions. More specifically, the role of facilitators is to set up rules for discussion, assure meaningful exchanges among participants, strive for the equality of participants, their voices, arguments, and internal deliberative quality, offer a balanced summary of the discussion, and help in obtaining results (Moore, 2012; Park, 2012; Trénel, 2009).
These defining features reveal that there are inherent tensions in the process of facilitation that cannot easily be resolved (Thompson & Hoggett, 2001). One central issue that must be recognized and acknowledged is that the facilitator necessarily adopts the position of leader of the deliberative discussion (Moore, 2012). Although the ideals of deliberation presume the neutrality of facilitators, several authors show that facilitators are not (or even should not be) completely neutral participants in such discussions (Aakhus, 2001; Dillard, 2013; Moore, 2012; Spada & Vreeland, 2013; Sprain & Carcasson, 2013). In choosing how to organize and guide the process of facilitation, the facilitator risks overly influencing or poorly structuring the deliberative discussion. Another tension derived from the theoretical foundations of deliberation is the question of whether deliberative discussion must enforce consensus (Karpowitz & Mansbridge, 2005). Although people in the deliberation process frequently change their views and “come to understand one another’s needs, values, and beliefs better, they rarely reach complete agreement” (Levine et al., 2005, p. 3). Analyses of deliberative discussions show that facilitators’ insistence on consensus can cause anger and frustration among participants if their views are overlooked, potentially leading to pseudoconsensus and dissatisfaction with the discussion (Karpowitz & Mansbridge, 2005). For analytical purposes, it may be fruitful to separate consensus from the other dimensions of deliberation and treat it as an outcome of deliberation instead. Another broad set of concerns stems from early critics of deliberative democracy (Sanders, 1997; Young, 2000), who argued that following methods of rational and logical argumentation can exclude and marginalize disadvantaged groups (Mansbridge et al., 2006).
Several studies recognize that professional facilitators can address these tensions via their expertise and a well-prepared plan, thus neutralizing them and playing a constructive role (Epstein & Leshed, 2016; Fulwider, 2005; Quick & Sandfort, 2014; Ryfe, 2006). More specifically, Dillard (2013) shows that skilled facilitators are able to handle elements of discussion that hinder deliberation like issue polarization, adversarial framing, speaker domination, and socioeconomic inequality. Active and more interventionist forms of facilitation can ensure that interaction, engagement, mitigation of dominant voices, collaboration, and mutual understanding are more likely to occur (Carcasson & Sprain, 2016; Smith, 2009). Nevertheless, the exact form of facilitation involves several factors; in other words, facilitation is not a uniform, single method (Dillard, 2013, p. 231), because the facilitator’s role “is (or should be) dependent on the aims and context of the discussion forum” (Wright, 2006, p. 551). Therefore, facilitation designs and practices vary among deliberative practices.
Facilitation’s Effects on the Quality of Group Deliberation and Attitude Change
Although the positive influence of facilitation on the qualities of group deliberation and attitude change seems almost self-evident, researchers rarely discuss the specific characteristics and mechanisms of facilitation that bring about positive outcomes. One stream of argumentation builds on the assumption that discussions about controversial issues are essentially uncomfortable or even threatening and consequently require some sort of rules of engagement (Schudson, 1997). If such rules are absent, discussants will more likely resort to silence or antagonism and/or stick to their own attitudes rather than show initiative for cooperative discussion, respect, and empathy toward others (Fulwider, 2005). As Fishkin (1995) claims, it cannot be expected in natural settings that all arguments will be heard and that everybody will be equally willing to join in the discussion. For those reasons, facilitators’ interventions are needed to ensure a deliberative process. Another stream of argumentation builds on the assumption that people have essentially limited cognitive and psychological competences for activities like perspective taking, respect, intelligible utterances, and coordinating perspectives (Reykowski, 2006). In this context, the facilitator promotes the establishment of democratic norms that reduce power asymmetries and other differences in personal characteristics that might impact the quality of deliberation. With neutral, professional stances, facilitators provide a platform for fairer and more intelligible discussions (Levine et al., 2005).
A few recent studies provide mostly qualitative perspectives on facilitation in online deliberation (e.g., Epstein & Leshed, 2016; Trénel, 2009; Wright, 2006). Regarding face-to-face deliberative discussions, scholars have examined how the views (Spada & Vreeland, 2013), roles (Farrar, Green, Green, Nickerson, & Shewfelt, 2009), discursive strategies (Dillard, 2013), and expertise levels (Park, 2012) of facilitators influence group norm development, the ways in which participants contribute to discussions, and participants’ attitudes, and have even challenged facilitators’ presumptions, such as the idea that communication process and content are separate (Aakhus, 2001). A few experiments (e.g., Fulwider, 2005; Reykowski, 2006; Trénel, 2009) have manipulated different styles of facilitation, from more interventionist styles to those with little or no intermediation. Fulwider (2005) examines whether the presence of a facilitator influences deliberative quality, knowledge increase, attitude change, and deliberative outcomes in general, as compared with discussions without a facilitator. Surprisingly, the results of that study showed that a facilitator’s presence does not significantly affect participants’ ratings of deliberative quality or knowledge increases, although the facilitator’s presence does make attitude changes more likely. Fulwider (2005) concludes that facilitation has an impact but that future research must illuminate how facilitation affects changes in attitude. Similarly, Reykowski (2006) employs various facilitation styles to test whether facilitation influences the quality of group deliberation and opinion change. The results showed that differences in facilitation “were not potent enough to produce real difference in participants’ behaviour,” but in groups where the facilitator was most active in implementing the deliberative norms, opinion changes were more common (Reykowski, 2006, p. 344). He concluded that facilitation has a substantial effect and may be a decisive factor in shaping deliberative functioning, although further experiments are needed. Because these two studies did not demonstrate the importance of facilitation for deliberative discussions as conclusively as one would expect, it is all the more important to analyze the effects of facilitation versus non-facilitation rather than the effects of variations in facilitation. Thus, it seems that there are enough theoretical and empirical reasons to expect differences between facilitated and non-facilitated groups in terms of quality of deliberation and attitude change. Our hypotheses are as follows:
Additional Factors Influencing Quality of Group Deliberation and Attitude Change
Besides the influence of facilitation, other factors that pertain to individual- and group-level characteristics need to be considered in explaining attitude change and quality of deliberation. The literature suggests that, besides typical sociodemographic factors (e.g., gender and age), analyses of group processes need to consider contextual factors that refer to the pre-discussion attitudinal characteristics of individuals and groups, more specifically participants’ and groups’ initial attitudes and the heterogeneity of group attitudes. Karpowitz and Mendelberg (2007) point out that there is still little research on group-level effects in the deliberation literature, in contrast to the elaborate research evidence in disciplines such as social psychology. In our study, the factors identified above are likewise not the analytic focus; they serve as controls.
Initial attitudes might play an important role in deliberation. Extensive research (e.g., Gastil et al., 2008; Gastil & Dillard, 1999) indicates that participants’ ideological bias negatively affects the likelihood of attitude change. When deliberating, conservative and liberal participants typically move away from one another in terms of their attitudes, with the former more strongly supporting conservative beliefs and more evidently rejecting liberal ones, and vice versa. Wojcieszak (2012) draws a similar conclusion, stressing that firm attitudes are particularly difficult to alter and that this affects the ways in which people process messages. Moreover, she contends that attitude strength is a multidimensional construct with many components, for instance, intensity, certainty, importance, and extremity, which have varying effects on deliberation. Attitude extremity and intensity are particularly emotionally based, which appears to hinder a reconsideration of biases (Wojcieszak, 2012).
Regarding the impact of attitudes at the group level, research on small groups has shown that discussion tends to move collective opinion in the direction of the preexisting views of the majority (Moscovici & Zavalloni, 1969; Myers & Lamm, 1976). This happens because (a) discussants with rarer views want to conform to the mainstream to be accepted or liked by the group (Nolan, Schultz, Cialdini, Goldstein, & Griskevicius, 2008), (b) they do not want to publicly voice an opinion that differs from that of the (perceived) majority (Myers, 1978), or (c) the majority, having greater power, argues more convincingly and therefore persuades others (Delli Carpini et al., 2004).
Another potentially relevant factor regarding initial attitudes pertains to the heterogeneity/homogeneity of attitudes in the discussion groups. Although the evidence is indirect and comes from research on online discussions, it might be expected that a homogeneous group would be likely to increase the quality of a discussion. Namely, sense of community, which pertains (also) to similarity of beliefs, norms, and values, has been demonstrated to be a predictor of communication quality (Petrič, 2013). This is somewhat paradoxical because, on one hand, the variability of opinions is a precondition of deliberation, but, on the other hand, such variability may induce conflicts, especially in the absence of facilitation. The rare studies of group effects on deliberation (Farrar et al., 2009; Luskin, Fishkin, & Hahn, 2007) show that, in terms of attitudes, participants modify their preferences to conform to the preference of their group. Those results contrast with the psychology literature that analyzes group polarization effects (see Sunstein, 1999). It should be pointed out that these studies involved structured discussions led by facilitators. In these cases, the use of facilitators and the heterogeneity of participants seem to encourage open-mindedness and decrease social pressures (Farrar et al., 2009).
Considering the rather controversial effects of various contextual variables on attitude change and quality of deliberation, we propose the following exploratory research questions:
Method
Case Background
The issue for deliberation—school discipline measures in Slovenian schools—is relevant to many Slovenians because there were almost 170,700 students in compulsory primary education in the 2014-2015 school year. These students were taught by 15,994 teachers in 452 primary schools. After adding in their parents or guardians, this number accounts for a large share of Slovenia’s population of 2 million. The issue is particularly pertinent for teachers because they must intervene daily to ensure appropriate behavior among students.
Schools remain relatively independent in terms of the selection and application of school discipline measures. Specifically, since 2009, they have autonomously defined their own comprehensive systems of educational activity. The theoretical considerations influencing the imposition of school discipline measures mainly stem from the retributive theory of punishment and, more recently, the restorative justice paradigm (Kroflič, Klarič, Štirn Janota, & Stolnik, 2011). Retributive justice focuses on establishing clear rules and sanctioning violations, whereas the basic principle of restorative justice, as a disciplinary measure, is to seek to restore all those affected to the state they had been in before the violation. Thus, those who are directly involved in restorative justice should be able to fully participate in finding suitable solutions.
This issue lends itself to deliberation for several reasons. First, in the seven years following the individualization of school discipline measures in primary schools (2009), this change has not yet been systematically evaluated. Second, the issue of school discipline measures creates a divide among members of the professional community, as well as teachers and other school actors. Some proponents of strict discipline and clear boundaries call for retributive action and do not see any value in restorative measures. Third, this highly controversial topic is often subject to public scrutiny, especially in extraordinary cases covered by the media (e.g., publicized examples of school violence). Fourth, large differences exist among schools regarding the concept of alternative discipline measures, particularly their application; this is evident in the educational plans of 13 schools from our sample and was also established during the deliberations. Fifth, we notified the participating schools’ management and other educational decision makers (e.g., the Institute for Education) of the teachers’ recommendations. However, they would not commit themselves to directly implementing the deliberation outcomes. In this sense, our experiment can be regarded as prompting “slightly consequential deliberation” (Fulwider, 2005, p. 6). The teachers’ recommendations were also forwarded to the principals of their schools. The principals whose schools were selected for participation were all interested in the deliberation outcomes regarding their schools. The participating teachers were thus aware that their recommendations would count at the micro level (in their school) and at the macro level (at the state level). Throughout the research, we cooperated with the Slovenian Association of Educationalists and the National Education Institute of Slovenia. These institutions prepare various programs to support teachers and provide them with additional education and training. 
A specially organized symposium was held with both institutions’ and the participating schools’ representatives (together more than 40 participants) to jointly ponder the outcomes.
Experimental Design and Procedures
This study adopts a randomized controlled experimental design in which, in addition to the experimental intervention, several individual- and group-level control variables are included in the explanatory model. In a real-life context, we test the effect of professionally facilitated versus non-facilitated discussions on the perceived quality of group deliberation and attitude change.
We designed and conducted 26 discussions among teachers on the topic of school discipline measures. These discussions lasted 1.5 hours each. Two discussions were conducted simultaneously at each sampled school. The teachers were randomly assigned to one of the two groups, with seven to 12 teachers in each group. The experimental groups were facilitated by a professional from the Slovenian Association of Facilitators (a chapter of the International Association of Facilitators) using a highly structured and well-prepared plan. The facilitator guided the teachers’ discussion in terms of content and procedure to ensure that they respected deliberative criteria such as equal participation in the discussion, argumentation, respect, readiness for change, and honesty. The control groups followed only one deliberative criterion, namely that teachers stay on the topic. Because this criterion is also a prerequisite for other forms of communication intermediation and a basic procedural task for other types of intermediaries, such as chairs, moderators, and mediators (see Landwehr, 2014), keeping the discussion on topic cannot define even basic, weak, or minimal facilitation. These discussions were not manipulated regarding facilitation style or intensity, from weak and non-interventionist to strong and active. Rather, the goal was to establish an environment for a control group that would closely mirror non-facilitated, real-world discussions. The control groups were accompanied by a research team member who posed only leading questions.
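The random division of one school's teachers into two parallel groups can be illustrated with a minimal Python sketch. It is an illustration only, not the procedure the research team actually used: the function name `assign_groups`, the seed, the teacher IDs, and the rule that the first group receives the extra teacher when the number is odd are all assumptions.

```python
import random


def assign_groups(teacher_ids, seed=None):
    """Randomly split one school's teachers into two discussion groups.

    Hypothetical helper: the article states only that teachers were randomly
    divided between a facilitated and a non-facilitated group.
    """
    rng = random.Random(seed)        # a fixed seed makes the split reproducible
    shuffled = list(teacher_ids)
    rng.shuffle(shuffled)            # every ordering is equally likely
    half = (len(shuffled) + 1) // 2  # assumption: first group gets the extra teacher
    return {
        "facilitated": shuffled[:half],
        "non_facilitated": shuffled[half:],
    }


# Invented example: a school with 17 participating teachers.
groups = assign_groups(range(1, 18), seed=2015)
print(len(groups["facilitated"]), len(groups["non_facilitated"]))
```

Shuffling the full list once and cutting it in half keeps the two groups nearly equal in size, which matches the reported range of seven to 12 teachers per group.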
The facilitator did not enforce a consensus, because practice shows that an insistence on consensus can create frustration or even anger when participants’ arguments are overlooked, leading to a forced consensus or pseudoconsensus (Karpowitz & Mansbridge, 2005). Facilitated groups were instead invited to seek common ground (Chambers, 2003; O’Doherty, 2013), carefully weigh each recommendation, and adapt it if necessary.
Our study design has advantages in comparison with the designs commonly used to study deliberative discussions. The participants did not need to be specifically informed in advance regarding the subject of deliberation because they tackle the topic daily. It is reasonable to assume that they were motivated to engage in the discussions because they were substantially concerned with the issue at hand. In a previous survey of Slovenian teachers, their responses demonstrated that they felt they were not properly equipped to impose school discipline measures (Peček, Vončina, & Kroflič, 2009). All subject teachers from each individual primary school participated (except those on sick leave and those with prior engagements). Discussions were held in schools, in the participants’ natural surroundings. Also, the participant groups were real-life collectives of individuals with shared responsibilities (not artificial groups assembled for a short time).
Before the discussions, anonymity was assured, as the aim was not to judge teachers’ professional expertise. In this way, we intended to establish that teachers could speak honestly based on their real-life experiences. The professional facilitator for the experimental group and the research team member for the control group were unbiased because they were not experts from the pedagogical field or financially supported by any actor or group that might have an interest in influencing the results.
Prior to and after the discussions, the participants filled out a questionnaire. The questions related to school discipline measures were repeated before and after the deliberation. Before the discussions, the participants answered a set of sociodemographic questions; after the discussions, they answered a set of questions about the discussions and three additional sets of psychological dimensions, which are not the focus of this article. Upon arrival, participants were provided with random numbers, allowing us to link questionnaire responses completed before the deliberation with those completed afterward and also to ensure anonymity.
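The anonymous linking of the two questionnaires can be sketched as a simple join on the random number each participant received. The function name `link_responses` and all values below are invented for illustration; the article does not describe the data handling beyond the use of random numbers.

```python
def link_responses(pre, post):
    """Pair pre- and post-discussion answers that share an anonymous code.

    Both arguments map a participant's random number to a response.
    Participants missing from either wave are dropped.
    """
    shared_codes = sorted(pre.keys() & post.keys())
    return {code: (pre[code], post[code]) for code in shared_codes}


# Invented example: ratings of one discipline measure before and after.
pre_ratings = {17: 4, 52: 7, 88: 3}
post_ratings = {17: 6, 52: 7}
linked = link_responses(pre_ratings, post_ratings)
print(linked)
```

Because only the random number connects the two waves, no name or other identifying detail is needed to compute individual change.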
Participants
The participants in the study were teachers (N = 226) who worked in the second stage of primary education (Grades 6-9). We selected 13 schools by taking several criteria into account. We made sure to include (a) schools from all 12 Slovenian regions; (b) schools from larger and smaller cities, suburban areas, and the countryside; (c) schools of different sizes; and (d) private schools and/or schools with different educational theories.
After selecting the sample of primary schools, we contacted the principals and invited their schools to participate. Three of the initially selected schools’ principals refused to participate, so we contacted other comparable schools in the same region. The principals obtained the consent of their teachers who were informed about the topic and format via a leaflet. In the individual schools, all teachers in Grades 6 to 9 participated (except for sick or otherwise justifiably absent teachers). Discussions were held in December 2015 and January 2016 in the participating schools.
The final sample consisted of 226 teachers; 78.9% were women, which can be expected given the gender structure of the population of primary school teachers, which has recently been found to be 79.4% female for the second stage of primary education (Statistical Office of the Republic of Slovenia, 2016). The age of the respondents ranged from 26 to 64 years, with an average age of 45 (SD = 9.76). In the sample, the largest share of study participants had been teaching between 20 and 30 years. The teachers worked at schools with various numbers of students, ranging from 230 to 837 (M = 541.58, SD = 173.87). Teachers in rural schools accounted for 15.5% of the sample, teachers from small cities for 25.7%, teachers from medium-sized cities for 28.3%, and teachers from schools in Slovenia’s first or second largest cities for 30.5%.
Measures
Perceived quality of group deliberation was measured on the basis of a comprehensive approach that integrates various conceptual perspectives (e.g., Bächtiger, Shikano, Pedrini, & Ryser, 2009; Dahlberg, 2004; Gastil & Black, 2008; Steiner, 2012). We intended to measure a variety of deliberation qualities that are commonly reported by researchers: equal participation in the discussion, argumentation, analytical approach, readiness for change, honesty, and respect, with the last of these being divided into recognition and appraisal respect (Darwall, 1977). The quality of deliberation was measured via participants’ overall perceptions of the group discussions. Although content-based measurement instruments are more commonly used in assessing group deliberations (e.g., Steenbergen, Bächtiger, Spörndli, & Steiner, 2003; Steiner, 2012), this type of survey-based operationalization is necessary for our explanatory model, in which we attempt to predict individuals’ attitudes. Moreover, this method seems to be the most valid way of grasping phenomena such as showing respect, carefully listening to others, being aware of one’s own presumptions and prejudices, and taking on the perspectives of others (Dahlberg, 2004; Gastil & Black, 2008). Such a method is similar to that of Gastil et al. (2008), in which active members of an online group were asked to evaluate the communication among those involved in group discussions. Moreover, the construct and criterion validity of similar measures have already been empirically demonstrated in various fields (Petrič, 2013).
In our study, we adopted items from various scales that address deliberative criteria (Fulwider, 2005; Gastil et al., 2008; Graham & Witschge, 2006; Halvorsen, 2001; Nabatchi, 2007; Petrič, 2013; Reykowski, 2006; Steenbergen et al., 2003), whereas some items were originally developed based on the conceptual definitions of the deliberative criteria. The initial item pool was evaluated for face and content validity by members of the research team, and item selection was performed so as to arrive at a minimum of three items per deliberative criterion. Each item was measured on a Likert-type scale ranging from 1 = completely disagree to 5 = completely agree.
Altogether, we anticipated seven dimensions of quality of deliberation. Factor analyses (oblimin rotation) were conducted to assess measurement quality. After the exclusion of several items, we arrived at seven scales with acceptable reliabilities. The scale for equal participation contained three items (example of a reverse-coded item: “Most of the time, only a few people were speaking”), and it demonstrated satisfactory reliability (α = .78). The scale for measuring argumentation in the discussions consisted of four items (example item: “I could say that the discussion was an exchange of high-quality arguments”) and demonstrated satisfactory reliability (α = .73). The measurement instrument for analyticity (example item: “Many solutions for the discussed problem appeared during the discussion”) contained three items and was marginally reliable (α = .68). Respect was measured with two scales, both of which contained five items: one for recognition respect (example item: “Most of the time, the speakers listened carefully to what others had to say”), which demonstrated satisfactory reliability (α = .78), and one for appraisal respect (example item: “The speakers perceived one another as competent discussants”), which had good reliability (α = .87). Readiness for change was measured with three items (example item: “During the discussion, the speakers started to demonstrate changed views on the issue”), and this scale had marginal reliability (α = .63). The perceived honesty of the participants during the discussions (example item: “I am sure that the speakers mostly said what they really meant”) was measured with five items, and this scale demonstrated good reliability (α = .80).
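The reliabilities reported above are Cronbach’s alpha coefficients. For readers less familiar with the statistic, the following minimal sketch computes it with the standard formula, alpha = k / (k - 1) × (1 - sum of item variances / variance of the summed scale), after reverse coding a negatively worded item on the 1 to 5 response format. The function names and the response data are invented for illustration; they do not reproduce the study’s data or software.

```python
from statistics import variance


def reverse_code(score, low=1, high=5):
    """Flip a negatively worded item so that high values mean more agreement."""
    return low + high - score


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])                                   # number of items
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])   # variance of scale sums
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Invented responses to a three-item scale; the first item is reverse coded.
raw = [[2, 4, 5], [1, 5, 4], [4, 2, 3], [3, 3, 3], [5, 1, 2]]
coded = [[reverse_code(r[0]), r[1], r[2]] for r in raw]
print(round(cronbach_alpha(coded), 2))
```

The ratio of variances is the same whether sample or population variances are used, so the choice of `statistics.variance` does not affect the result.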
We also adopted a second-order factor analysis approach (Thompson, 2004), in which the seven dimensions were entered as indicators of overall group deliberation quality. The factor analysis demonstrated a single factor for which all dimensions had factor weights higher than .4. Cronbach’s alpha of the total deliberation scale was .79. Consequently, a total perceived deliberation score was computed as the average of all seven dimensions.
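Assuming an unweighted mean of the seven scale scores (the equal weighting and the symbols are assumptions introduced here for illustration), the total perceived deliberation score of participant i can be written as:

```latex
\[
  D_i^{\text{total}} = \frac{1}{7} \sum_{k=1}^{7} D_{ik}
\]
```

Here D_{ik} denotes participant i's score on dimension k (equal participation, argumentation, analyticity, recognition respect, appraisal respect, readiness for change, and honesty), each itself the mean of that scale's items on the 1 to 5 response format.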
In the empirical analyses where quality of deliberation figures as an independent variable, we used the seven separate dimensions to gain detailed insight into where differences between facilitated and non-facilitated groups occur and which dimensions of deliberation have an impact on attitude change. Where quality of deliberation figures as a dependent variable, we used the total index of deliberation to reduce complexity (otherwise we would have had to run seven explanatory models, one for each dimension of quality of deliberation).
Changes in attitudes toward restorative and retributive disciplinary measures were measured using a vignette describing a case based on a real educational situation: Three boys from seventh grade left the school building. Martin and Luka rode their bicycles. Luka pushed Martin off his bicycle while he was riding it in the school yard. Martin fell off the bicycle and became angry. He grabbed the chain lock, started spinning it around his head, and ran after Luka. Tim also started to chase Luka, caught him, and held him so that Martin could reach him with his chain. When Martin approached, he hit Luka with the chain, but Luka managed to break away and run off. A teacher observed the entire scene through a window.
Before and after deliberation, the teachers assessed the appropriateness of 15 measures taken from a set of the most common retributive and restorative school discipline measures. The appropriateness of each individual measure was assessed based on participants’ responses to attitude statements, which were measured on a scale from 1 to 10.
An exploratory factor analysis (oblimin rotation) was conducted on the set of items before and after the discussion. In both cases, two factors were extracted that accounted for 28.8% of the variability regarding the disciplinary measures before the discussion and 35.8% of such variability after the discussion. The obtained factors clearly correspond to two types of attitudes: attitudes toward restorative disciplinary measures (example item: “Conversation with students so that they reflect on the consequences of their behavior”) and attitudes toward retributive disciplinary measures (example item: “Prohibition from participating in a study excursion or receiving any other benefits”). Based on the obtained factors, we created a restorative index and a retributive index and confirmed the reliability of each with Cronbach’s alpha. The alpha value for the restorative index was .80 before the discussion and .86 after it; the alpha value for the retributive index was .74 before the discussion and .81 after it, all indicating satisfactory reliability. Each index is an average of individual indicators and has two values, one for the pre-deliberation replies and another for the post-deliberation replies. Consequently, two variables were computed to represent the differences between pre- and post-discussion replies and thus changes in attitude toward disciplinary measures: change in attitudes toward retributive disciplinary measures and change in attitudes toward restorative disciplinary measures.
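The reliability check and the change scores described above can be sketched in a few lines. This is an illustration only: the ratings are hypothetical, the function names `cronbach_alpha` and `index_score` are introduced here, and the change score is assumed to be the post-discussion index minus the pre-discussion index.

```python
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])  # number of items in the scale
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the summed scale score.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def index_score(row):
    """Index as the unweighted mean of one respondent's item scores."""
    return mean(row)


# Hypothetical ratings (1-10) of three restorative measures by four teachers.
pre = [[7, 8, 6], [5, 5, 4], [9, 9, 8], [6, 7, 7]]
post = [[8, 8, 7], [6, 6, 5], [9, 10, 9], [7, 7, 7]]

alpha_pre = cronbach_alpha(pre)
# Assumed definition of change: post-discussion index minus pre-discussion index.
changes = [index_score(b) - index_score(a) for a, b in zip(pre, post)]
```

With this sign convention, a positive change score means that a teacher rated the measures as more appropriate after the discussion than before it.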
The initial attitudes were measured on an individual level, whereas group attitudes were computed as mean values of individual attitudes. Thus, we arrived at group index of restorative attitudes and group index of retributive attitudes. The lowest theoretical score of the index was 1 and the maximum 10.
Another set of variables pertains to group homogeneity/heterogeneity of attitudes, which were computed as an average of the absolute differences between participants’ attitudes. The lowest theoretical score of the index was 1 and the maximum 5.
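The group-level variables can be illustrated as follows. This is a sketch under one plausible reading of the description, namely that heterogeneity is the mean absolute difference over all pairs of group members; the function names and the scores are hypothetical.

```python
from itertools import combinations
from statistics import mean


def group_index(attitudes):
    """Group-level attitude: mean of the members' individual index scores."""
    return mean(attitudes)


def heterogeneity(attitudes):
    """Mean absolute difference over all pairs of group members (assumed reading)."""
    diffs = [abs(a - b) for a, b in combinations(attitudes, 2)]
    return mean(diffs)


# Hypothetical pre-discussion restorative index scores for one discussion group.
group = [8.0, 6.5, 9.0, 7.5]

group_mean = group_index(group)
group_heterogeneity = heterogeneity(group)
```

Both values describe the group as a whole, so in an individual-level analysis each member of the group would carry the same pair of values.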
Results
Testing H1a
Regarding deliberation quality, participants evaluated the discussions as relatively good (see Table 1). The mean value of the total index of deliberation quality for all individuals was 3.96 (SD = 0.45), and the highest-rated criterion was appraisal respect (M = 4.25, SD = 0.48). In contrast, the lowest value of all the deliberative dimensions was assigned to the readiness for change criterion (M = 3.04, SD = 0.62). On the basis of independent-samples t-tests, several statistically significant differences can be observed between the facilitated and non-facilitated groups. First, the total index of deliberative quality was significantly (p < .0001) higher for individuals in the facilitated groups (M = 4.07, SD = 0.40) than for those in the non-facilitated groups (M = 3.85, SD = 0.47). Teachers participating in the facilitated deliberative discussions, as compared with those in the non-facilitated ones, evaluated their discussions as better in terms of equal participation, recognition respect, and readiness for change. No significant differences were observed in terms of argumentation, analyticity, appraisal respect, and honesty.
Comparison of Quality of Deliberation Dimensions Among Individuals in Facilitated Versus Non-Facilitated Groups (t-tests).
Note. Significance was tested at α = .05.
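To give a sense of the size of the difference in the total index, the reported means and standard deviations can be converted into a standardized mean difference. This is a rough sketch, not a result from the study: it assumes equal group sizes, because the group sizes are not stated here, and the function name is introduced for illustration.

```python
from math import sqrt


def cohens_d_equal_n(m1, sd1, m2, sd2):
    """Standardized mean difference, pooling the SDs under equal group sizes."""
    pooled_sd = sqrt((sd1 ** 2 + sd2 ** 2) / 2)
    return (m1 - m2) / pooled_sd


# Reported total-index values: facilitated vs. non-facilitated individuals.
d = cohens_d_equal_n(4.07, 0.40, 3.85, 0.47)
print(round(d, 2))
```

Under the equal-size assumption, the difference amounts to about half a pooled standard deviation.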
Testing H2a
The participants in the discussions reported mostly positive attitudes toward both types of disciplinary measures, yet they preferred restorative disciplinary measures over retributive measures (see Table 2). In Table 2, we also report the results of paired-samples t-tests analyzing the difference in attitudes before and after the discussions. For all groups taken together, we noted changes only in attitudes toward restorative measures (Mdifference = .20, p = .07), indicating that participants had more positive attitudes toward restorative measures after the discussions. The change in restorative attitudes was stronger when we limited our analysis to the facilitated groups (Mdifference = .36, p = .0001), whereas this change in the non-facilitated groups was non-significant. Interestingly, the changes in attitudes toward retributive measures were significant for both the facilitated (Mdifference = .23, p = .044) and the non-facilitated groups (Mdifference = .45, p = .0002), but the direction of change was different for these two types of groups.
Comparison of Attitudes: Facilitated Versus Non-Facilitated Individuals (t-tests).
Note. Significance was tested at α = .05.
Analysis of RQ1
Table 3 presents the means of heterogeneity of attitudes toward retributive and restorative disciplinary measures and also differences between facilitated and non-facilitated groups before discussion. Mean heterogeneity of attitudes toward restorative disciplinary measures is 1.58 and that toward the retributive ones is 1.85. The heterogeneity of attitudes is not significantly different between facilitated and non-facilitated groups.
Comparison of Heterogeneity and Group Averages of Attitudes Before Discussion: Facilitated Versus Non-Facilitated Groups (t-tests).
Note. Significance was tested at α = .05.
Testing H1b and Analysis of RQ2
For the analysis of explanatory hypotheses and research questions regarding the impact of facilitation on quality of deliberation and attitude change, a multilevel modeling approach would seem most valid, as the data are structured in two levels (individuals and schools). However, the intraclass correlation coefficients, which identify the amount of between-group variability of the dependent variables, were rather small (.145 for quality of deliberation, .031 for change in restorative attitudes, .033 for change in retributive attitudes). Thus, there was no need for a multilevel modeling approach. Consequently, the group-level predictor variables were entered into a regression analysis as individual-level variables, meaning that group values were assigned to individuals. A hierarchical ordinary least-squares multiple regression analysis was undertaken (Cohen, Cohen, West, & Aiken, 2003): three successive linear regression models were estimated for each of the three dependent variables. This procedure allows researchers to test whether each successive model fits better than the previous one. Before the analyses, we checked for multicollinearity, and there were no such issues.
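The intraclass correlation used to judge the need for multilevel modeling can be illustrated with a one-way ANOVA estimator, ICC(1). This is a minimal sketch: which ICC estimator was actually used is not stated, so the formula, the function name `icc1`, and the scores are assumptions.

```python
from statistics import mean


def icc1(groups):
    """One-way ANOVA estimate of ICC(1) for a list of groups of scores."""
    n_groups = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])

    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (n_groups - 1)
    ms_within = ss_within / (n_total - n_groups)

    # Effective group size; equals the common size when groups are balanced.
    n0 = (n_total - sum(len(g) ** 2 for g in groups) / n_total) / (n_groups - 1)

    return (ms_between - ms_within) / (ms_between + (n0 - 1) * ms_within)


# Hypothetical deliberation-quality scores for three discussion groups.
example = [[3.8, 4.0, 4.1], [3.9, 4.2, 3.7], [4.0, 3.6, 4.1]]
icc = icc1(example)
```

Values near zero mean that knowing a participant's group says little about that participant's score, which is the situation in which single-level regression is usually considered defensible.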
Table 4 reports the results of regression analyses on total quality of deliberation index. In the first step, we added only demographic control variables; neither teacher’s gender, nor teacher’s age, nor the number of pupils at school was significantly associated with total quality of deliberation. The model fit itself was also not significant (F = 0.57, p = .636).
Regression Coefficients of Independent, Contextual, and Control Variables on the Perceived Quality of Group Deliberation.
Note. N = 181.
*p < .05, **p < .01.
In the second step, we added restorative index (group means of attitudes toward restorative measures before discussion), retributive index (group means of attitudes toward retributive measures before discussion), heterogeneity of restorative index, and heterogeneity of retributive index. This provides us with results regarding RQ2. In comparison with the model in Step 1, the increase in R2 was significant (ΔR2 = .153, p ≤ .001). Retributive index (group before discussion) (β = .181, p = .026) and heterogeneity of restorative index (β = −.355, p ≤ .001) were found to be significant predictors of the perceived quality of deliberation index.
In the final step, we added the facilitation variable and, in comparison with the model in Step 2, the increase in R2 was significant (ΔR2 = .063, p ≤ .001). Facilitation has a significant, weak to moderate influence on the quality of deliberation (β = .268, p < .001).
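The test of whether a step improves on the previous model is the usual F test for a change in R2. The sketch below shows that computation; the function name and the numbers are hypothetical and are not taken from Tables 4 to 6.

```python
def f_change(r2_reduced, r2_full, n, k_full, m_added):
    """F statistic for the R^2 increase when m_added predictors enter the model.

    n is the sample size and k_full the number of predictors in the full model.
    """
    df_resid = n - k_full - 1
    return ((r2_full - r2_reduced) / m_added) / ((1 - r2_full) / df_resid)


# Hypothetical example: one predictor added to a seven-predictor model, N = 101.
f_stat = f_change(r2_reduced=0.20, r2_full=0.25, n=101, k_full=8, m_added=1)
```

The resulting statistic is compared with an F distribution with `m_added` and `n - k_full - 1` degrees of freedom. When a single predictor is added, this test is equivalent to the t test of that predictor's coefficient, so the two p values coincide.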
Testing H2b and Analysis of RQ3
To test the hypothesis and research question regarding the impact of facilitation on changes in attitudes, we performed two sets of regression analyses: one in which changes in attitude toward retributive disciplinary measures were the dependent variable and another in which changes in attitudes toward restorative disciplinary measures were the dependent variable.
Table 5 reports the results of the regression analysis on change in the restorative index. As above, the regression was conducted in three steps. In the first step, we added only demographic control variables, and the model fit was not significant (F = .614, p = .607). In Step 2, predictors that pertain to dimensions of quality of deliberation, initial attitudes, and heterogeneity of attitudes were added to the model. The increase in R2 was significant (ΔR2 = .261, p ≤ .001). In addition to the initial group restorative attitude, two dimensions of quality of deliberation had a statistically significant impact on change in attitudes toward restorative disciplinary measures: appraisal respect (β = −.299, p = .005) and argumentation (β = −.207, p = .041).
Regression Coefficients for the Changes of Restorative Index.
Note. N = 183.
*p < .05, **p < .01.
In the third step, we added the facilitation variable and, in comparison with the model in Step 2, the increase in R2 was significant (ΔR2 = .025, p = .016). The facilitation variable has a statistically significant impact on changes in attitude toward restorative measures (β = .196, p = .016). Individuals’ initial attitudes (β = −.492, p ≤ .001) and appraisal respect (β = −.332, p ≤ .001) are also significantly associated with change in the restorative index.
Table 6 shows the results of the hierarchical regression analysis on change in the retributive index. We added demographic control variables in the first step, and the model did not significantly fit the data (F = 1.832, p = .143).
Regression Coefficients for the Changes in Retributive Index.
Note. N = 183.
*p < .05, **p < .01.
In the second step, predictors that pertain to dimensions of quality of deliberation, initial attitudes, and heterogeneity of attitudes were added to the model. In comparison with the model in Step 1, we can note a significant increase in R2 (ΔR2 = .170, p < .001). In this model, the predictors readiness for change (β = −.173, p = .003) and retributive index (group before discussion) (β = −.338, p < .001) have a significant impact on the change in the retributive index.
In the final step, we added the facilitation variable and, in comparison with the model in Step 2, a significant increase in R2 can be noted (ΔR2 = .045, p = .002). Facilitation is a statistically significant predictor of change in attitude toward retributive disciplinary measures (β = .255, p = .002).
Discussion
The main purpose of our study was to test the effects of facilitation on the quality of deliberative discussions and attitude changes regarding controversial issues. As suggested by comparison of the facilitated and non-facilitated groups and confirmed by a series of regression analyses, facilitation indeed plays a significant role in raising the (perceived) quality of group deliberation and in changes of attitudes in the direction of more democratic disciplinary school measures. Even after controlling for a number of sociodemographic, attitudinal, and group characteristics, facilitation demonstrated a moderate impact on the key dependent variables. This result is especially noteworthy if we take into account that 1.5 hours of deliberation can produce such an influence.
Empirical Support for Assumptions of Facilitation’s Role
Our analysis shows a positive impact of facilitation on the quality of deliberation. The regression analysis showed that facilitation is a statistically significant predictor of the perceived total quality of group deliberation index, whereas the bivariate analysis suggests that the impact is present especially in the dimensions of equal participation, readiness for change, and recognition respect. Facilitation thus appears to be an important procedural element that enables all participants to voice their opinions, become open to various perspectives, and show at least a minimum of respect for others. In contrast, facilitation, at least the type used in our study, did not manage to improve the quality of discussions in terms of participants giving justifications for their statements, reflecting on what others said, being honest in their utterances, and showing appraisal. A partial explanation may be that the topic itself did not invite as much reasoning and analytical thinking as it did empathizing. The result that facilitation does not have an impact on honesty and appraisal respect might be attributable to the fact that the discussion groups were composed of people who likely know one another from their work environment. How individuals perceive the honesty and respect of others (whether it is low or high) in the discussion group is likely predetermined by the history of relationships in the work environment. As these qualities (or lack of them) are probably perceived as relatively stable traits, it is not surprising that 1.5 hours of discussion did not change such perceptions.
Furthermore, our analyses suggest that the facilitated individuals are inclined to express more positive attitudes toward restorative measures, whereas their inclination toward retributive measures declined. On the contrary, in the non-facilitated groups, attitudes in favor of retributive measures became stronger. This may be explained by the fact that the deliberative criteria, which are exposed via the facilitation process, reflect the restorative norms of inclusion, cooperation, and identification with other people’s roles. The changes in attitude regarding the degree of retribution in the group of non-facilitated individuals show the prevalence of voices in favor of a unilateral, formalist, and strict retribution system that does not reflect the weighing of various views and arguments, which is one of the key purposes of deliberation.
Independently of facilitation, the perceived quality of deliberation, albeit only some dimensions, was also shown to play a role in attitude change. The strongest impacts were that of appraisal respect and argumentation on the changes in restorative attitudes and of readiness to change on change in retributive attitudes. The negative impact of perceived appraisal respect on attitude change suggests that the more the individuals perceived other group members and their contributions negatively in terms of competence, the less they were inclined to change their attitudes toward restorative measures. In addition, when the group deliberation was perceived as being based on arguments, individuals were more likely to express more positive attitudes toward restorative disciplinary measures after the discussion. In predicting attitude changes toward retributive measures, only readiness for change had a marginal impact. This impact was negative, suggesting that participants who perceived the group as constructive and willing to change its opinions were less likely to develop more positive attitudes toward retributive measures. This may be connected with the already-discussed finding that the nature of deliberation resembles restorative disciplinary measures and at least implicitly opposes some aspects of retributive measures.
Importance of Context in Deliberation Studies
One important element of our empirical study pertains to the set of variables that describe the individual- and group-level context of the discussions. First, they are important as control variables because taking them into consideration in the explanatory models isolates the impact of facilitation on the quality of deliberation and attitude change. Second, they show how individuals’ initial attitudes and diversity of attitudes in groups could affect the deliberation process itself. Facilitated and non-facilitated groups in our experiment did not differ significantly at the beginning of the discussions in individuals’ initial attitudes, group attitudinal averages, or group attitudinal heterogeneity.
The analysis showed that, among all the variables, the most important ones were related to the initial attitudes about restorative and retributive discipline that teachers held before they entered into the discussions. Among those with more positive starting attitudes, attitudinal changes were less likely. At the same time, individuals’ initial attitudes (whether retributive or restorative) did not significantly predict the perceived quality of group deliberation.
Regarding attitudinal homogeneity/heterogeneity, there is one notable impact, and it concerns only the perceived quality of group deliberation. Group heterogeneity regarding attitudes toward restorative measures moderately lowered the perceived quality of group deliberation. This result points to an internal tension in the deliberation process, which, on one hand, requires heterogeneous perspectives on a given issue but, on the other hand, is perceived as being of lower quality when such heterogeneity is present. One perspective that helps to understand this tension is that of the online community research field, which has demonstrated a strong association between a sense of community and trust toward group members (Blanchard, Welbourne, & Boughton, 2011). More precisely, the congruence of beliefs (as one dimension of a sense of community) will result in assessing others as trustworthy, honest, and respectful during communication. However, the deliberation process should surely not disregard the need for heterogeneity, and the role of facilitation seems to be particularly significant in groups with a history of relationships and differing attitudes. In addition, our study suggests that the tensions invited by a diversity of opinions can be limited to certain attitudes. The reason for the finding that the heterogeneity of attitudes toward retributive disciplinary measures did not affect quality of deliberation may lie in the fact that such disciplinary measures are much less complex than the restorative ones.
Limitations
Our study has several limitations that may encourage future studies. Some of these are somewhat standard and can be assumed from the study design. One pertains to sample demographics. Due to the fact that around 80% of primary school teachers in Slovenia are female, most participants in our deliberative discussions were women. In further studies, it would be reasonable to form a gender-balanced sample in the case the influence of gender on deliberation is measured, which has already been partly tested (Mendelberg & Karpowitz, 2007; Spada & Vreeland, 2013), but not in such experiments. The sample’s specific characteristics also include the participants’ equal education level and similar social status. On one hand, this may be an advantage because the participants are more equal and are not distinct concerning social differences; on the other hand, these factors’ influence on the participants’ attitude change and (the perception of) deliberativeness could also be measured in other contexts, as already partly tested (e.g., Abdel-Monem et al., 2010) but again not in similar experimental conditions. Our sample consists of participants from school collectives (preexisting groups), which may have predisposed participants to group interactions, particularly in the non-facilitated groups. Especially for the purposes of researching the deliberative democracy framework, further experiments with people who do not know one another are recommended.
Next, in the experimental groups that were professionally facilitated, this role was performed by a single professional facilitator. It would be relevant to introduce variation in the form of several qualified facilitators. Ideally, variation would also be introduced in the issue under deliberation and in facilitation styles. In the latter case, research should look at differences not only between non-facilitated and facilitated groups but also between non-facilitated groups and the different types of facilitated groups that appear in various typologies of facilitation (Reykowski, 2006; Ryfe, 2006; Trénel, 2009).
Cultural context would also need to be considered in the type of experiment that we undertook. Jelavic and Salter (2014) argue that facilitation should be sensitive to the national culture, which plays an important role in the cognitive processes of participants and also in defining which methods of facilitation are acceptable and comfortable. In this study, we did not pay special attention to this issue, as Slovenia seems to be a typical European Union (EU) country: its position on statistics regarding values is close to the average of EU member states (European Values Study, 2016).
Implications for Research and Practice
Our research methodology and findings present important contributions to the scientific and practitioners’ communities. For one, the study clearly demonstrated the need for structured discussions. In light of the omnipresent tacit assumption of the positive role of facilitation, this finding was expected, yet the limited empirical insights were till now somewhat controversial. Our study shows that, in a relatively short time, the facilitated discussions managed to achieve more democratic results, in terms of the way the controversial issue at hand was discussed and the direction of attitudinal change, in comparison with non-facilitated discussions.
Another implication for practice and research stems from the moderate influence of initial attitudes on reluctance to change attitudes during discussion. This influence is especially pronounced when the attitudes are of a somewhat less democratic nature. This suggests that in this type of deliberative discussion we cannot expect large-scale transitions in participants’ attitudes. Similar phenomena can nowadays be observed in the online world, where people isolate themselves in echo chambers to prevent any kind of intervention that would shake their existing attitudes (Jacobson, Myung, & Johnson, 2016). Consequently, we suggest that the facilitation process should more directly address initial attitudes and the values behind them and make them more transparent. With certain techniques, a facilitator might then induce participants to reflect on their own positions and soften their attitudes.
The findings of our study also have practical implications, particularly for the educational field. Discussions on the topic of disciplinary action and education in a broader sense are certainly dependent on factors such as school culture (e.g., implemented restorative justice, non-violent communication, or some other programs regarding relationships and/or educational principles), including the existing cultures of sharing and reflecting on other issues (e.g., some schools have regular supervisions, and in some schools principals might be trained in leading in a dialogic way). Our study indicates that following the democratic elements of deliberative discussion could lead to more democratic patterns of educational practice.
Regarding the research implications, our results confirm the appropriateness of distinguishing between one total measure of deliberation quality and individual dimensions of deliberation. Consequently, we suggest that future studies continue to explore the distinct dimensions of deliberation, the relationships between them, and the different effects that they have on the discussion and resultant attitudes. If further research were to corroborate our result that only certain dimensions have an impact on attitude change, then the facilitation process might be made more effective by being sensitive to such findings.
What remains for further investigations is also to use a mixed-methods approach in analyzing the participants’ statements. We focused only on the discussants’ self-assessments of deliberativeness. Our suggestion for future studies is to code those deliberative criteria that can be coded and thus cross-validate the perceptions of group deliberation using the actual quality of group deliberation (see Stromer-Galley, 2007).
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Slovenian Research Agency under Grant No. L5-5547.
