Abstract
Evaluation theories can be tested in various ways. One approach, the experimental analogue study, is described and illustrated in this article. The approach is presented as a method worth using in the pursuit of what Alkin and others have called descriptive evaluation theory. Drawing on analogue studies conducted by the first author, we illustrate the potential benefits and limitations of analogue experiments for studying aspects of evaluation and for contributing to the development and refinement of evaluation theory. Specifically, we describe the results of two studies that examined stakeholder dialogue under different conditions of accountability frame, interpersonal motives, and epistemic motives. We present the studies’ main findings while highlighting the potential for analogue studies to investigate questions of interest concerning evaluation practice and theory. Potentials and pitfalls of the analogue study approach are discussed.
Calls for Descriptive Evaluation Theory
Although some evaluation theory is empirically based, “much more of it is hypothetical, conjectural, and unproven, still waiting to be tested” (Shadish, 1998, p. 2). Although empirically based contingency theories are the norm in many social science fields that overlap with evaluation, in general they remain a distant goal for the discipline of evaluation (Donaldson & Crano, 2011; Mark, Donaldson, & Campbell, 2011). Over the past several decades, evaluators have issued repeated calls for the field to develop more descriptive, empirically based evaluation theory (Alkin, 2003; Campbell & McGrath, 2011; Christie, 2011; Cousins, 2003; Donaldson & Lipsey, 2006; Henry & Mark, 2003; Mark, 2008; Mark et al., 2011; Miller, 2010; Shadish, Cook, & Leviton, 1991; Smith, 1979, 1993). Fewer of these calls, however, have offered explicit guidance about how precisely to go about developing that descriptive evaluation theory. The general idea of increasing the amount of empirical research in evaluation is appealing to many, at least in principle. But too few supporters of this general notion are actually conducting research on evaluation (Szanyi, Azzam, & Galen, 2013). The sheer size of the gap between evaluation theory and empirical research that tests evaluation theory, coupled with a lack of specific guidance about how to address this gap, may leave many would-be evaluation researchers feeling rather overwhelmed and unsure where to start (Campbell & McGrath, 2011).
Until recently, calls to advance the empirical scholarship of evaluation made impassioned arguments for the benefits of such research but lacked clarity on how precisely to go about generating more descriptive research. In an attempt to move the discussion beyond general calls for more research on evaluation, Henry and Mark (2003) proposed an agenda, suggesting six types of research studies that might be used to advance evaluation research. Mark (2008) went somewhat further, proposing a taxonomy of the types of questions that might be the focus of research on evaluation as well as the types of studies that might be conducted. Notably, for the present article, he suggested the analogue study as one method (among others) that may be particularly useful for answering some of evaluation’s important theoretical questions.
We hope to further the discussion by describing our recent work using the experimental analogue study as a method to test causal hypotheses about theories of stakeholder dialogue. Campbell and Mark (2006) initially examined stakeholder dialogue in an analogue situation, with the goal of assessing the consequences of a set of alternative ways of carrying out stakeholder interactions. Since then, Campbell has conducted additional research using the analogue study to examine stakeholder dialogue under different conditions. In this article, we present some of what we have learned about stakeholder dialogue in particular and comment more generally on the analogue study as a tool for research on evaluation. First, we provide a description of the analogue study as a general research method.
Features of the Analogue Study
Analogue studies are intended to mirror or approximate the reality of a social phenomenon, even while the study occurs in a different setting than the one of interest. In part, the idea is to emulate, as closely as possible, a real-life situation (in this case, stakeholder dialogue) while also controlling for extraneous variables (Heyman & Slep, 2004). Analogues also permit comparisons to be made in a way that will facilitate conclusions about the variables of interest, beyond what might be possible in a more natural setting (Aronson & Carlsmith, 1968).
Although the distinction is not universally accepted, we differentiate simulation studies and analogue studies. In simulation studies, participants are asked to imagine themselves in a particular role and to make decisions or behave as though they were in that role. A number of simulation studies have been conducted on evaluation topics, primarily utilization, where evaluators or others are presented with information and asked to make decisions as though they were in the situation described (Braskamp, Brown, & Newman, 1982; Brown & Newman, 1982; Brown, Newman, & Rivers, 1984; Brown & Prentice, 1987).
Participants in analogue studies, on the other hand, are behaving in quasi-naturalistic social situations (Heyman, Malik, & Slep, 2009). In other words, participants are not simply imagining a scenario. Rather, they are directly involved in a situation, while at the same time they are aware that it has been created for the purpose of conducting a research study. The analogue situation is designed in such a way as to elicit from participants the kinds of behavior the researcher is interested in studying. The skillful analogue researcher strives to create a study situation that elicits responses like those that would occur in a natural setting. In our studies, research participants meet face-to-face to actually engage in a dialogue about their opposing views on a policy issue. Although all of our participants are aware that they are participating in a research study, their dialogue with each other is real, not imagined. Bringing together stakeholders who actually have opposing views on the issue to be discussed adds to our ability to elicit and observe the responses of interest.
A key purpose of an analogue study is to allow the researcher to control and manipulate aspects of the situation so as to be in a better position to make stronger causal inferences about the processes being studied. Analogue studies are often used in developmental and clinical psychology, for example, to study parent–child or family interactions. In such studies, situations are set up so that the researcher can “prompt the behavior of interest and can constrain the context enough to minimize unwanted, unstandardized influences” (Heyman & Slep, 2004, p. 164). In our work, stakeholder dialogue is prompted by asking participants with opposing views on a policy issue to come to an agreement, through discussion, about how to prioritize the importance of different issues involved. We constrain the context by conducting the study in a laboratory setting and by following the same script and procedures for each pair of stakeholders who participate in the study. The only systematic variation in procedures across groups is through our experimental manipulations. The primary benefit of the analogue study, as just described then, is its use as a hypothesis-testing tool, allowing the researcher to isolate—and test the effect of—possible determinants of behavior. The idea is not to seek a comprehensive ecology or phenomenology of the domain of interest (e.g., stakeholder processes) but to investigate more focused questions, such as how the processes and outcomes of stakeholder dialogue are affected by structuring the dialogue in one specific way or another.
Stakeholder Dialogue: An Evaluation Practice in Search of a Descriptive Theory
Stakeholder dialogue is widely presumed to result in benefits ranging from stakeholder empowerment to increased validity of research results (Brandon, 1998; Cousins, 2003; House & Howe, 1999). Key decisions about program and policy evaluations are often made with the input of diverse stakeholders who have competing interests. For example, stakeholder input may be used to guide decisions about which of many possible program outcomes to measure, in the face of limited resources. Stakeholder dialogue has been used as a method for sorting through and resolving such divergent interests (e.g., Abma et al., 2001; Greene, 2000; House & Howe, 2000).
Dialogue has been proposed as a starting point for resolving stakeholder differences by helping people to identify and sort through their divergent opinions and the values underlying them (Greene, 2000; House & Howe, 1999; Mark, Henry, & Julnes, 2000; Ryan & DeStefano, 2001). In addition to uncovering “genuine” interests and values, proponents claim that dialogue can also help stakeholder groups to “put aside narrow self-interest and address issues among themselves through respectful, reciprocal conversation” (Ryan & DeStefano, 2001, p. 190). Moreover, the nature of the dialogue process is expected to impact the quality of dialogue outcomes. Dialogue characterized by mutual perspective taking and cooperation should help generate evaluation decisions that more accurately represent the interests of various stakeholder groups, whereas competitive and adversarial dialogues are expected to yield biased decisions that privilege a particular stakeholder interest over others (Baur, VanElteren, Nierse, & Abma, 2010; Greene, 2001).
For many evaluators, including a variety of stakeholder perspectives in an evaluation is critical, and stakeholder dialogue can be one way of achieving this goal of inclusion. The ideal stakeholder dialogue is a cooperative process, yielding outcomes that represent the interests of all stakeholders. But precisely how to achieve this ideal is not well understood. The classic problem for those who engage stakeholders in dialogue is how to bridge the gulf between visions of the ideal (cooperative, mutually respectful dialogue) and the all-too-frequent reality of adversarial interactions between stakeholders who have competing interests (Greene, 2000). A set of related questions remains, then: How precisely does one achieve cooperative dialogue? When is dialogue unlikely to be cooperative? How might we facilitate integrative outcomes while discouraging biased ones? Which specific aspects of the stakeholder dialogue context are malleable? Which are not? To answer questions like these, a more precise, empirically based understanding of stakeholder dialogue is needed. Unfortunately, evaluators in search of guidance for practice decisions will find volumes of prescriptive advice but very little in the way of descriptively oriented evaluation theory (Alkin, 2003; Christie, 2011; Smith, 1993). Thus, to increase understanding of the conditions under which stakeholder dialogue is more or less likely to be cooperative (or, conversely, adversarial), an experimental analogue study has much to offer. In the absence of clearly specified evaluation theory, we drew on social psychological theories of negotiation (Beersma & DeDreu, 2002; DeDreu, Beersma, Stroebe, & Euwema, 2006; DeDreu, Weingart, & Kwon, 2000) combined with observations from the evaluation literature (Abma, 2000; Greene, 2001; Ryan & DeStefano, 2001) to design a stakeholder dialogue situation that is controlled, consistent with accumulated findings in the negotiation literature, and rooted in evaluators’ practice knowledge.
In our analogue studies, we used motivated information processing theory to examine the impact of accountability frames, stakeholders’ interpersonal prosocial motives, and epistemic motives on the quality of stakeholder dialogue processes and dialogue outcomes. Within motivated information processing theory, the motives of the participants in a negotiation are viewed as key predictors of the nature of negotiation processes and outcomes (Beersma & DeDreu, 2002; Carnevale & DeDreu, 2006; DeDreu et al., 2006). For instance, the negotiation process is more likely to be cooperative and to focus on problem solving, thereby yielding jointly beneficial outcomes under two conditions: (1) when concern for others is high rather than low (prosocial interpersonal motives; DeDreu et al., 2000), and (2) when negotiators are motivated to consider the issues at hand deeply, rather than cursorily (epistemic motives; TenVelden, Beersma, & DeDreu, 2010). Accountability to a constituency is also a primary driver of competitive negotiation strategies, but framing that accountability in terms of broader, more heterogeneous group interests (accountability frame) can have positive effects on the negotiation process and outcomes (Campbell & Mark, 2006; Dovidio, Gaertner, Niemann, & Snider, 2001). In a series of laboratory analogue studies, we have been exploring the influence of these contextual variables on different aspects of stakeholder dialogue. In the next section, we describe what our analogue situations looked like and what we learned from them.
Stakeholder Dialogue in the Laboratory: Details and Outcome Measures
To date, we have conducted stakeholder analogues with a total of 96 dyads and have manipulated three different independent variables. Campbell and Mark (2006) have previously reported a portion of these data (n = 61 dyads), focusing in particular on the interactive effect of accountability audience and prosocial instructions on observed dialogue and self-reported outcomes. The results reported here also include data from an additional 35 dyads, who discussed a different policy issue (i.e., a mandatory transit pass), were presented with an additional independent variable (epistemic motive), and completed an additional outcome measure of conflict-handling styles. In the present article, we focus on the main effects of each independent variable separately on participants’ self-reported evaluations of the dialogue process and outcomes.
Stakeholder Dialogue: The Analogue Situation
In our analogue studies, we have used convenience samples of undergraduate students who interact in pairs. We take care to mimic, as closely as possible, the presence of stakeholders with differing values and self-interest in a policy issue. First, by focusing on ongoing policy debates relevant to our student study participants, we are able to recruit actual policy stakeholders. Dialogue topics have included policies surrounding university involvement in the off-campus behavior of students (e.g., alcohol use) and a policy to implement a mandatory, student-funded transit pass. In addition to choosing relevant policies for stakeholder dialogue, we also deliberately create pairs who differ in their self-interested positions on the issue (i.e., each pair contains a self-identified policy supporter and a policy opponent). Finally, our participants believe that their videotaped discussions will be shared with campus and community interest groups with a particular view on the policy, and they are encouraged by the research team to act as “representatives” of these interest groups. In other words, we try to create a believable constituency in the lab.
Participants’ activities are modeled on an important task for many stakeholder dialogues in evaluation: prioritizing the evaluation issues. Stakeholder pairs are given a list of 6 to 10 issues that need to be considered to properly evaluate the university policy in question. The issues have been constructed to reflect both sides of the policy debate. The chief goal for the stakeholder dyad is to try to agree on a priority ranking of these critical evaluation issues. The task is structured so that each individual first ranks the issues in order of his or her personal priority, without any discussion with the stakeholder partner. After completing this ranking task independently, the stakeholder pair must work together (i.e., engage in dialogue) to decide on a final priority ranking of the evaluation issues. Once the dialogue is complete, participants respond to a questionnaire that assesses their self-reported negotiation strategies as well as their perceptions of the dialogue process and outcomes. These serve as our primary dependent variables and will be described in more detail later. We now turn to the experimental manipulations we have used in these analogue studies.
Experimental manipulations: What factors influence the dialogue process?
The primary advantage of the analogue study is the ability of the researcher to manipulate aspects of the situation in order to examine causal relationships between contextual variables and the behaviors and outcomes of interest. In our studies, prior to engaging in any dialogue, stakeholder pairs have been randomly assigned to different experimental conditions that are hypothesized to influence the quality of the dialogue process and outcomes. The experimental manipulations we have used include accountability frame, interpersonal prosocial motives, and epistemic motives. Next, we briefly describe each of these constructs and our operationalization of them.
Accountability frame: Homogeneous versus heterogeneous
Accountability is a complex, multidimensional construct, affecting people’s thoughts and judgments in different ways depending on, for example, the characteristics of the group to which one is accountable (i.e., the accountability audience), and how accountability is construed by the perceiver (Kramer, Pommerenke, & Newton, 1993; Pruitt & Carnevale, 1993; Tetlock, 1991, 1992). Within the social psychology literature, negotiators who are expected to justify their actions to another party (e.g., accountable negotiators) tend to be defensive, competitive, and less likely to engage in perspective taking, relative to those who are only negotiating on their own behalf (Kramer, 1995; Pruitt & Carnevale, 1993; Tetlock, 1991, 1992). Similarly, conflicts between evaluation stakeholder groups can stem from “strict adherence to narrow self-interest” (Ryan & DeStefano, 2001, p. 190), attributable in large part to what we would classify as accountability pressures. Following these observations, we set out to frame the accountability experience differently in order to reduce some of this press to self-interest.
To examine the impact of accountability frame on the dialogue process, we have randomly assigned dyads to either feel accountable to (a) a self-interested group with a single, unified position on the policy issue being discussed (homogeneous accountability frame) or (b) a group representing many different policy positions, including, but not limited to, the position of the research participant (heterogeneous accountability frame). We manipulate accountability frames by informing our stakeholder participants that their videotaped discussions will be shared with one of three possible campus and community groups that have a vested interest in the policy being considered (i.e., the accountability audience). Each participant is encouraged to view himself or herself as a representative of this accountability audience. Describing the audience as a like-minded interest group whose views on the policy are completely aligned with the participant’s views (i.e., either strongly ‘for’ or strongly ‘against’ the policy), and whose membership all tends to share this same viewpoint, activates a homogeneous accountability frame. In this condition, each member of the dialogue pair represents a different accountability audience, and self-interested behaviors are predicted. The heterogeneous accountability frame, by contrast, is activated by describing the constituency as a varied interest group that is open to a variety of views on this policy issue and is committed to working through differences of opinion (i.e., within-group heterogeneity). In the heterogeneous condition, each member of the dialogue pair represents the same, diverse, accountability audience, and more cooperative dialogue is predicted.
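The dyad-level random assignment described above can be illustrated with a minimal sketch. The dyad identifiers, the seed, and the choice to balance group sizes by dealing shuffled dyads to conditions in rotation are all assumptions made for illustration; the article states only that dyads were randomly assigned.

```python
import random
from collections import Counter


def assign_dyads(dyad_ids, conditions, seed=None):
    """Randomly assign each dyad (not each person) to one condition.

    Group sizes are kept within one of each other, which is an
    illustrative choice rather than a procedure reported in the article.
    """
    rng = random.Random(seed)
    shuffled = list(dyad_ids)
    rng.shuffle(shuffled)
    # Deal the shuffled dyads to the conditions in rotation.
    return {
        dyad: conditions[i % len(conditions)]
        for i, dyad in enumerate(shuffled)
    }


# Hypothetical dyad identifiers; both partners in a dyad share its condition.
dyads = [f"dyad_{n:02d}" for n in range(1, 13)]
assignment = assign_dyads(dyads, ["homogeneous", "heterogeneous"], seed=42)
counts = Counter(assignment.values())
```

Because the dyad is the unit of assignment, both partners always receive the same accountability frame, which matches the dyad-level analysis used later in the article.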
Interpersonal motives: Prosocial versus control
In addition to the influence of accountability pressures, many stakeholders are influenced by the interpersonal motives they bring to the dialogue situation. Failure to adequately consider the views and positions of other stakeholder groups can significantly derail the dialogue process. Although prosocial motives can be studied as a trait, De Dreu and Van Lange (1995) provide compelling evidence for the impact of manipulated interpersonal motives in negotiation situations. Consistent with that research, we manipulated interpersonal prosocial motives through simple predialogue instruction sets. Stakeholder pairs are randomly assigned to either a prosocial motive condition or to a control condition. Prior to engaging in dialogue, participants in the prosocial motive condition are instructed to “ask for more information about your partner’s position, such as reasons for why he/she feels this way; think about ways in which your opinions are similar to those of your partner; repeat back in your own words, what you see as your partner’s essential arguments; and try to point out reasonable elements in your partner’s position” (Carnevale & Pruitt, 1992; De Dreu, 2007; Rapoport, 1960).
As a contrast to the prosocial instructions, participants in the control condition are asked simply to “maintain eye contact, speak loudly and clearly, use everyday language, and to listen carefully to their partner while he/she is speaking.” More positive outcomes are expected when prosocial motives are activated, relative to the simple control instructions.
Epistemic motives: Deep versus shallow processing
Stakeholder participation can range from relatively shallow polling or consultation, to extensive deliberation of the issues surrounding an evaluation. Some evaluators count deep participation as a prerequisite for truly effective stakeholder dialogue (House & Howe, 1999; Ryan & Johnson, 2000). This observation is in line with the findings on epistemic motives in negotiation and group decision making (DeDreu, 2007; TenVelden et al., 2010). Groups of negotiators who are required to document a deep, considered approach to the negotiation issues tend to engage in significantly more cooperative behaviors than do groups not given such instruction (Scholten, Van Knippenberg, Nijstad, & DeDreu, 2007).
Stakeholder pairs in one of our studies 1 were randomly assigned to conditions designed to either evoke deeper, more careful thought about the topic of discussion (high epistemic motive) or not receive any special instructions on processing the information (low epistemic motive). Immediately before engaging in dialogue, participants in the high epistemic motive condition are given the following instructions: “after you finish your discussion, I will conduct a short interview with each of you to talk about how the discussion went, the quality of the discussion process, and the ways in which you were able to come to an agreement on each specific point. To keep track of the discussion process, I will ask you to jot down a few points as you go along to help you remember the reasoning behind the joint ranking decisions.” By contrast, participants in the low epistemic motive condition are not given any special instructions and simply begin dialogue with their partner. The expectation was that dialogue with deeper processing (high epistemic motive) would yield more positive outcomes than dialogue with more shallow processing.
Outcomes: Measuring effective stakeholder dialogue
To measure the effects of our experimental manipulations on the quality of the stakeholder dialogue process and outcomes, a variety of self-report measures have been used (see Appendix for the measures used). In all of our studies, participants provided self-reports of satisfaction with the dialogue (2-item scale, α = .85), positive views of the experience (5-item scale, α = .60), perceived change in one’s own and partner’s views (2-item scale, α = .88), and optimism about the dialogue outcomes (4-item scale, α = .75; Campbell & Mark, 2006). A measure of self-reported conflict-handling style has also been administered to 35 dyads (DeDreu, Evers, Beersma, Kluwer, & Nauta, 2001). The Dutch Test for Conflict Handling (DUTCH) comprises five 4-item subscales that measure self-reported use of particular styles of conflict handling in a negotiation process: yielding to the wishes of the other party (α = .28); compromise, or the use of split-the-difference strategies (α = .63); forcing one’s views on the other party (α = .57); problem solving, or searching for solutions with mutual benefit (α = .78); and avoidance of conflict altogether (α = .66).
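For readers less familiar with the α coefficients reported above, the following sketch shows how Cronbach's alpha is conventionally computed from item-level responses. The response data are invented for illustration and are not drawn from the studies; the function name is likewise hypothetical.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents' item-score lists.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)
    """
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Variance of each item across respondents.
    item_vars = [variance([row[j] for row in item_scores]) for j in range(k)]
    # Variance of the summed scale score across respondents.
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Invented responses to a hypothetical 2-item scale scored 1 to 7.
responses = [[6, 7], [5, 5], [3, 4], [7, 6], [2, 3]]
alpha = cronbach_alpha(responses)
```

Values near 1 indicate that the items vary together across respondents, whereas low values (such as the .28 reported for the yielding subscale) indicate that the items do not form an internally consistent scale.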
Key Findings From Analogue Studies of Stakeholder Dialogue
The findings we report here are based on a series of one-way multivariate analyses of variance (MANOVA) examining the main effects of each contextual variable separately (accountability frame: heterogeneous vs. homogeneous; interpersonal prosocial motive: prosocial vs. control; epistemic motive: high vs. low) on our self-report outcome variables. Because random assignment was done at the dyadic level, and to avoid issues of dependence in the data, all analyses were conducted at the level of dyad.
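One common way to move from individual responses to a dyad-level data set is to average the two partners' scores on each measure. The article does not state how dyad scores were formed, so the averaging rule, the exclusion of incomplete dyads, and the record layout below are assumptions made purely for illustration.

```python
from collections import defaultdict
from statistics import mean


def dyad_means(records):
    """Collapse (dyad_id, participant_id, score) records to one score per dyad.

    Assumption: a dyad's score is the mean of its two members' scores,
    and dyads with a missing partner response are dropped.
    """
    by_dyad = defaultdict(list)
    for dyad_id, _participant_id, score in records:
        by_dyad[dyad_id].append(score)
    return {d: mean(s) for d, s in by_dyad.items() if len(s) == 2}


# Invented satisfaction scores on a 1-to-7 scale.
records = [
    ("d1", "p1", 6.0), ("d1", "p2", 4.0),
    ("d2", "p3", 5.0), ("d2", "p4", 7.0),
    ("d3", "p5", 3.0),  # partner's response missing, so d3 is dropped
]
dyad_scores = dyad_means(records)
```

Analyzing one value per dyad sidesteps the non-independence of partners who have just talked with each other, at the cost of halving the number of observations.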
Effects of Accountability Frame on Stakeholder Dialogue Outcomes
A one-way MANOVA was conducted with accountability frame as the independent variable, and measures of optimism, positivity, satisfaction, and perceived change as dependent variables. 2 The multivariate omnibus test was statistically significant, F(4, 51) = 3.95, p = .007. Table 1 provides the univariate results for this analysis. Dyads who adopted a heterogeneous accountability frame tended to view the dialogue process more optimistically and were more satisfied with the experience relative to dyads with a homogeneous accountability frame. Accountability frame did not significantly impact positive views of the dialogue process or perceptions of change in the dialogue partner. While not uniformly consistent, there is at least some evidence to suggest that the way in which the accountability experience is framed can impact the dialogue process in a positive way.
Table 1. Effects of Accountability Frame on Stakeholder Dialogue Outcomes.ᵃ
Note. ns = not significant; SD = standard deviation. ᵃ Scale ranges from 1 to 7.
Effects of interpersonal, prosocial motives
Another key observation from the work we have done so far is that compared to dyads where no special instructions were given, dyads primed with prosocial motives showed better outcomes on a variety of measures.
A one-way MANOVA was conducted with prosocial motive (prosocial vs. control) as the independent variable, and measures of optimism, positivity, satisfaction, and perceived change as dependent variables. 3 The multivariate omnibus test was statistically significant, F(4, 89) = 3.19, p = .017. The univariate results, provided in Table 2, show that compared to dyads given no special instructions (control), dyads primed with prosocial motives were significantly more satisfied with the dialogue process and outcomes, felt more optimistic about reaching agreement with people who hold different views, and perceived the process more positively overall.
Table 2. Effects of Prosocial Motive on Stakeholder Dialogue Outcomes.ᵃ
Note. SD = standard deviation. ᵃ Scale ranges from 1 to 7.
A separate one-way MANOVA was conducted for the 35 dyads who completed the DUTCH inventory, to examine the effect of prosocial motive on self-reported avoidance, forcing, compromising, and problem-solving conflict-handling strategies. 4 A statistically significant multivariate effect emerged, F(4, 29) = 3.16, p = .028. The univariate results provided in Table 3 indicate that relative to dyads given no special instructions (i.e., controls), prosocially motivated dyads reported using more problem-solving and conflict-avoidance strategies and fewer forcing strategies. Indeed, providing negotiating groups with instructions designed to promote mutual perspective taking seems to be a relatively simple and efficient method to promote more effective stakeholder dialogue and is certainly one worth exploring in practice.
Table 3. Effects of Prosocial Motive on DUTCH Conflict Handling Scales.ᵃ
Note. DUTCH = Dutch Test for Conflict Handling; ns = not significant; SD = standard deviation. ᵃ Scale ranges from 1 to 5.
Effects of epistemic motives
The analysis failed to reveal an effect of epistemic motives on stakeholder dialogue. The results of the MANOVA that included all of the dialogue outcome variables did not reach statistical significance, F(8, 25) = 1.61, p = .17, so we did not explore this effect any further.
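The degrees of freedom attached to the multivariate F tests reported in this section can be read as a rough check on each analysis. For a one-way MANOVA comparing exactly two groups, the exact F statistic has degrees of freedom (p, N - p - 1), where p is the number of dependent variables and N is the number of units analyzed. The sketch below applies that relationship to the reported values; it assumes a two-group design with dyads as the unit, and the resulting counts are our inference rather than figures stated in the article.

```python
def implied_design(df_hypothesis, df_error):
    """Recover (dependent variables, units analyzed) from a two-group MANOVA F.

    Assumes exactly two groups, so that F has df (p, N - p - 1).
    """
    p = df_hypothesis
    n_units = df_error + p + 1
    return p, n_units


# Degrees of freedom as reported in the text for each omnibus test.
reported_df = {
    "accountability frame": (4, 51),
    "prosocial motive": (4, 89),
    "prosocial motive (DUTCH scales)": (4, 29),
    "epistemic motive": (8, 25),
}

implied = {label: implied_design(*dfs) for label, dfs in reported_df.items()}
for label, (p, n) in implied.items():
    print(f"{label}: {p} dependent variables, {n} dyads in the analysis")
```

Any gap between these implied counts and the dyad totals given earlier would presumably reflect incomplete data or analysis-specific exclusions, which this sketch cannot confirm.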
Summary and Discussion of Results
The experimental analogue does very well at isolating variables of interest, manipulating possible causal variables in a laboratory setting, and examining the effects on a carefully chosen set of outcome variables. In the case of the analogue studies summarized here, we believe they reveal an interesting pattern of results and observations about stakeholder dialogue, patterns that are worthy of further investigation. Although single studies are hardly the final authority on any topic, we suggest that analogue studies can be a worthwhile avenue for advancing our knowledge about an area of evaluation practice, sometimes as a starting point drawing on related literatures and at other times testing with greater causal confidence assertions made from other forms of evidence about evaluation practice. In either case, analogue studies may also suggest future directions for research on evaluation.
In particular, the results of this research suggest a role for the way in which accountability or group membership is “framed” in stakeholder dialogue. Others have found that the way in which group membership and identity is framed, and the extent to which inclusiveness and commonality (vs. exclusion and differences) are emphasized, can have a striking impact on important group dynamics and outcomes (Cunningham & Chelladurai, 2004; Nier, Gaertner, Dovidio, Banker, Ward, & Rust, 2001). Steering away from an emphasis on differences and toward a focus on highlighting similarities, including perhaps a superordinate group identity, can be extremely beneficial. The main focus of such interventions is to change people’s perceptions of the inclusivity of different social groups. Related empirical tests have shown that changing perceptions of groups of people, from “different groups working separately” to “different groups on the same team,” has reduced prejudice and increased cooperation among people in different racial groups (Dovidio, Gaertner, & Validzic, 1998). The results of our work suggest that a similar “intervention” may be effective in stakeholder dialogue settings. The chief goal with interventions such as these is to attempt to loosen the grip of partisanship, and to encourage stakeholders to view group membership boundaries as more permeable, allowing for a broader definition of “sameness.” Certainly there are other ways that evaluators might successfully reframe accountability perceptions. Some of these possibilities will be explored in future research.
Encouraging effects were also observed with the manipulation of prosocial motives. Almost consistently, prosocially motivated dyads demonstrated more positive reactions to the dialogue process, relative to controls. Specifically, dyads who were instructed to engage in mutual perspective taking and to try to truly understand their partner’s viewpoint viewed the dialogue process more positively, were more satisfied with the process and outcome, and were more willing to problem solve and less likely to force their views on the other party. Prosocially motivated dyads also reported higher levels of conflict-avoidance behavior compared to dyads given no special instructions.
In responsive evaluation approaches, an important goal is to motivate stakeholders with opposing views to approach the dialogue situation ready to truly listen to and appreciate different viewpoints. The goal is to achieve “mutual understanding among stakeholders and to gain respect across differences” (Baur et al., 2010, p. 245). With this in mind, homogeneous and heterogeneous “circles” or dialogue groups are sometimes used to facilitate mutual learning among stakeholders with different interests (Abma, 2000, 2006). The results of our research lend further empirical support for such responsive methods while also demonstrating promise for techniques such as providing stakeholders with simple instruction sets to promote quality dialogue processes.
Although we found promise with interventions based on accountability frame and prosocial motive, the evidence did not support a role for epistemic motivation in stakeholder dialogue settings. Given that we have only examined epistemic motive with a total of 35 dyads to date, quite possibly our manipulation was not powerful enough to yield the predicted effects. Moreover, simply giving our stakeholders a brief warning that they would be asked to report back the process of the negotiation at the end of the experiment may not have created a sufficiently potent manipulation; a more structured, stronger manipulation (such as providing stakeholders with ample time and encouragement to think through the issues before engaging in dialogue and multiple reminders to consider the issues carefully) may yet facilitate mutual perspective taking and productive stakeholder dialogue. Continued work is needed to determine whether epistemic motivation truly has no valuable role in stakeholder dialogue or whether the problem lies with our operationalization or our test of this variable. Given the strong and consistent effects of epistemic motives observed in the negotiation literature (TenVelden et al., 2010), we are not yet ready to give up on this factor as a potentially effective contextual manipulation.
Limitations of Analogue Studies
Having discussed the findings from our analogue studies of stakeholder dialogue, we turn now to an acknowledgment of some of the limitations of our studies and of analogue methods in general. Chief among the criticisms of analogue studies are limits with respect to external validity. A primary strength of the experimental analogue study is that it privileges internal validity, providing a setting conducive to random assignment to conditions and with that the potential to strengthen causal claims (especially for effects that are modest in size relative to naturally occurring variation). Analogues are, however, by very definition, contrived; they involve situations created by the researcher specifically to examine the behavior of interest. This contrived nature of the analogue study leaves it vulnerable to the concern that its findings may not generalize to other people, settings, or times (Goldman, 1976).
In our research, we used convenience samples of undergraduates who engaged in dialogue in a lab setting. Although our research participants were actual stakeholders in terms of the issues being discussed, they may be quite different from a “typical” evaluation stakeholder. We also used pairs of stakeholders rather than the larger dialogue groups that are probably more typical in actual evaluation settings. This aspect of the context may further limit the generalizability of our findings. Finally, our stakeholders engaged in dialogue in a controlled lab setting, which is not the typical setup for actual stakeholder dialogue. Together, these realities could limit the generalizability of our results to actual stakeholders in actual evaluation settings. More generally, it can be argued that analogue studies fail to capture the rich contextual characteristics of real evaluation settings, limiting their usefulness or applicability to real settings.
Still, many of the criticisms that might be launched with respect to the analogue study can be answered with an encouragement to view the analogue study within a larger framework of research on evaluation. Alone, the analogue study has its flaws, but viewed in the context of a broader program of research, it can be seen as an important piece of a larger puzzle. For example, as a follow-up to the analogue studies described here, the first author is currently interviewing practicing evaluators for their perceptions of our analogue findings; the goal is to draw on evaluators’ experience and professional judgments to assess techniques like those we manipulated in the lab (i.e., accountability manipulations, prosocial motive manipulations, epistemic motive manipulations) in terms of their real-world feasibility and utility. In the end, we hope to shed light on whether these kinds of contextual manipulations (perhaps with certain modifications) have the potential to be effective with real stakeholder dialogues. This work is an intermediate step toward externally validating the results of our analogue studies, with a more distant goal of application in the field. More generally, analogue experiments can be part of a sequential mixed-method design (Greene, 2007). Other methods, such as reflective case studies or surveys, could come first, contributing to the development of research questions and hypotheses to be tested experimentally in analogue research. Such other methods could alternatively come after an analogue study, extending external validity to the findings from the internally valid analogue experiment. Taken in the context of a broader portfolio of research on evaluation, there are many important benefits of the analogue study, three of which are discussed next.
General Benefits of the Analogue Study for Evaluation Theory Building
Having outlined the specific findings (and limitations) from a series of analogue studies on stakeholder dialogue in particular, we turn now to a discussion of some of the more general benefits of the analogue study for evaluation theory building. A well-done analogue study can have at least three potential benefits: (a) permitting stronger causal inferences about evaluation processes, (b) forcing a clearer explication of evaluation theories, and (c) fostering conceptual interchange between evaluation and related disciplines. We discuss each of these in turn, drawing where necessary on examples from the study just described.
Analogues Can Permit Stronger Causal Inferences
One of the primary advantages of the analogue study is that it allows the researcher to test very specific hypotheses and to exert experimental control to isolate the variables of interest. Existing evaluation theory often leaves the practicing evaluator with little by way of specific guidance about important procedural details. How, for example, might an evaluator promote mutual perspective taking among a group of stakeholders who must decide on an evaluation’s priorities? Is it possible for relatively simple techniques or strategies, such as the instructions we give to stakeholders, to positively influence the dialogue process? Unless we can systematically compare similar cases where such guidance is and is not offered, we cannot know the potential for such strategies to work. Although a number of ideas and suggestions for improving aspects of evaluation practice exist in the literature, few of these ideas are based on systematic, transparent, empirical evidence. Because analogues permit the researcher to use random assignment to conditions, and to exert experimental control, relatively strong causal inferences can result, giving researchers and would-be practitioners perhaps greater confidence in the expected outcomes from a particular technique for practice. The evidence from our work suggests that providing stakeholders with simple instructions designed to encourage mutual perspective taking has a positive effect on stakeholder dialogue outcomes and processes. Such causal evidence may support many evaluators’ existing working hypotheses about how to improve stakeholder dialogue. Causal evidence from analogue studies may also, however, call evaluators’ working hypotheses into question or may reveal the potential in strategies or techniques that evaluators have not yet tried in their own practice.
Despite the encouraging findings, a degree of modesty is warranted when interpreting the results of our analogue studies. Indeed, the results from a single analogue study do not make a theory of stakeholder dialogue. Still, findings from studies such as these can make important contributions to developing evaluation theory, at the least by adding a type of evidence that has so far been missing. In any discipline, good theories are built of cumulative evidence, generated from a variety of methods. The experimental evidence provided here both adds to and complements existing practice-based evidence about stakeholder dialogue and the factors that facilitate its effectiveness.
Analogues Can Encourage Greater Theoretical Explication
The very act of designing an analogue study is an exercise in conceptual and theoretical explication. Deciding how to set up the analogue situation, how to manipulate important contextual variables, and how to measure outcomes of interest absolutely requires conceptual clarity and clear hypotheses about how the variables of interest work together—in other words, a working theory. In our work, we were forced to confront issues of defining, in concrete, measurable terms, predictors of effective stakeholder dialogue (i.e., accountability frames, stakeholder motives) as well as measuring outcomes associated with effective dialogue. Also essential in designing our analogue studies was consideration of which contextual factors may influence the quality of dialogue and how precisely to define and operationalize them.
Beyond issues of simply defining constructs, the analogue researcher must also think dynamically about the process being studied. A working theory about how particulars of context affect the dialogue process, and how some contextual variables might interact with other contextual and/or individual difference variables, is essential not only to the design of a good analogue study but also for the development of a good theory of stakeholder dialogue. Increased conceptual and theoretical clarity may be beneficial far beyond the single evaluation research study. Such explication adds to the broader task of theoretical scaffolding; building “deeper” and more precise theories of evaluation practices and processes is essential for theoretical growth.
Analogues Can Foster Cross-Disciplinary Interchange
A stronger commitment to the use of analogue studies in research on evaluation may also strengthen ties between evaluation and related social science disciplines. Important findings from communication theory and decision theory did much to enrich the early simulation studies on evaluation utilization (Braskamp et al., 1982; Brown et al., 1984). Other social sciences have much to offer in the way of accumulated knowledge about processes that evaluators encounter every day (Donaldson, Gooler, & Scriven, 2002; Fleming, 2011; Geva-May & Thorngate, 2003; Sanna, Panter, Cohen, & Kennedy, 2011; Taut & Brauns, 2003; Tindale & Posavac, 2011; Vaessen & Leeuw, 2010). The field of social psychology alone has amassed evidence on themes including persuasion, attitude change, decision making, bargaining and negotiation, organizational change, and hundreds of others too numerous to mention. The analogue study is an excellent platform for combining the hunches and working hypotheses from evaluation practice with parallel phenomena from social psychology, cognitive psychology, organizational psychology, and countless other disciplines. A marriage of disciplines can result in theoretical expansion and refinement for both evaluation and other disciplines. For example, Campbell and Mark (2006) found that a better understanding of accountability itself was gleaned through a process of comparing the existing work in social psychology on this topic to what is known about the accountability experience in evaluation contexts (Greene, 2000; Ryan & DeStefano, 2001). Disciplinary interchange (e.g., each discipline informing the other) can be mutually beneficial, opening doors to new ways of looking at and thinking about the concepts studied and their compatibility in other domains.
Conclusion
If the long-standing and growing call for more research on evaluation is to be answered, it will likely be fueled in part by more widely shared ideas about how precisely to conduct this research, that is, about the various forms this research may take. Well-conducted analogues should be considered one option, indeed one potentially valuable option, among the many method choices that exist for empirically testing evaluation theory (Mark, 2008). Our research, summarized here, has shown the potential for analogue studies in the context of exploring optional ways of framing stakeholder dialogue. Similarly, well-designed analogue studies on a wide array of evaluation practices could contribute to the evidence base for a more descriptive evaluation theory.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) declared the following financial support for the research, authorship, and/or publication of this article. Some of this research was supported by a grant to the first author from the Social Sciences and Humanities Research Council of Canada.
