Abstract
This article introduces a set of evidence-based principles to guide evaluation practice in contexts where evaluation knowledge is collaboratively produced by evaluators and stakeholders. The study unfolded in four phases: two pilot phases exploring the desirability of developing a set of principles; an online questionnaire survey that drew on the expertise of 320 practicing evaluators to identify dimensions, factors, or characteristics that enhance or impede success in collaborative approaches to evaluation (CAE); and finally a validation phase involving a subsample of 58 evaluators who participated in the main phase. The principles introduced here stem from the experiences of evaluators who have engaged in CAE in a wide variety of evaluation settings and contexts and the lessons they have learned. They are understood to be interconnected and loosely temporally ordered. We expect the principles to evolve over time, as evaluators learn more about collaborative approaches in context. With this in mind, we pose questions for consideration to stimulate further inquiry.
Introduction and Background
The principles introduced here are an empirically derived system for thinking about collaborative approaches to evaluation (CAE). They are intended to support considerations of professional practice, both generally and in situ. In spearheading this work, we were swayed by the argument that expert behavior is not simply a product of experience, but the disposition to use experience as a learning mechanism (Daley, 1999). The principles presented in Figure 1 as an interconnected set are grounded in the experiences of over 300 practicing evaluators. In synthesizing what worked (or did not work) during their collaborations, and in considering why these approaches worked (or did not work), we began to see patterns in how evaluators’ decisions in collaborative contexts shaped the nature of their work and, ultimately, the way they perceived success in their evaluation practices. Continuous learning about such patterns can be informative in refining what Kennedy (1983) referred to as “working knowledge.”
Figure 1. An integrated set of principles for use in guiding collaborative approaches to evaluation.
Working knowledge is holistic and shaped by a meaningful synthesis of theory, skills, experiences, assumptions, beliefs, and one’s personal orientation to practice. A rounded working knowledge of evaluation is more likely to help practitioners function in new or unfamiliar contexts, identify complexity, and foresee the effects of potential decision options. This set of principles is offered as a mechanism to support the development of professional working knowledge and evaluator expertise in the use of CAE.
In this paper, we elaborate on the rationale for developing principles for CAE and then describe the empirical process we used to devise them. We then turn to a description of the principles augmented by evaluators' authentic experiences and finally to considerations for application and use, as well as ongoing inquiry.
Why Principles? Why Now?
To guide our work, we adopted the Oxford Dictionary (2015) definition of principles, namely, the “foundation for a system of belief or behaviour or for a chain of reasoning.” The task of generating principles that could be both explicit and well-grounded meant finding a way to reveal the nature of collaborative approaches as it exists in practice (McNiff, 2013). Rosch (1999) argued that distilling knowledge accumulated from experience is the optimal way to derive a system of principles. Patton (2015b) concurred and suggested ways to proceed: “Principles are built from lessons that are based on evidence about how to accomplish some desired result. Qualitative inquiry is an especially productive way to generate lessons and principles precisely because purposeful sampling of information-rich cases, systematically and diligently analyzed, yield rich, contextually sensitive findings” (pp. 715–716).
This same thoughtful reflection from both evaluation theorists and practitioners is what buttresses the Program Evaluation Standards, 3rd Edition (Yarbrough, Shulha, Hopson, & Caruthers, 2010). These standards already serve as an effective primary resource for evaluator thinking and practice generally. It was the particular complexity inherent in evaluator/stakeholder interdependence and the influence this interdependence has on more conventional evaluation practices, however, that led us to envision a complementary resource such as principles.
The timing for this project seemed optimal given that the CAE family is growing and now includes a wide range of familiar evaluation approaches and models. These members include community based (e.g., Mark & Shotland, 1985), fourth generation (e.g., Lincoln, 1989), participatory (e.g., Cousins & Earl, 1992; Cousins & Whitmore, 1998; King, 2007), transformative (e.g., Cousins & Whitmore, 1998; Mertens, 2009), deliberative democratic (House & Howe, 2000), empowerment (Fetterman, 2001; Fetterman & Wandersman, 2005), collaborative (O’Sullivan, 2004, 2012; Rodriguez-Campos, 2005, 2012), and developmental evaluation (Patton, 2011), as well as significant change technique (Davies & Dart, 2005) and participatory rural appraisal (Chambers, 1994), to provide only a partial list. Each of these approaches provides evaluators with a strategic direction for their work and material to inform decisions of practice.
While acknowledging the importance of these unique approaches, like Dahler-Larsen (2009) we argue that, “ … the same evaluation model or approach would probably work differently depending on the political, strategic, cultural, and organizational conditions under which it is applied, and evaluators would be intelligent if they kept their evaluation practices flexible and adaptive to the varying contexts” (p. 312). We have taken the stance that applications of CAE are likely to be most powerful when they remain responsive to the purpose and context for the requested evaluation and to the needs and capacities of stakeholders. We contend that fidelity to the processes and strategies associated with a single model has the potential to obfuscate the need for evaluators to be continuously adaptive to the social, historical, ecological, and cultural complexities of the evaluation context (Cousins, Whitmore, & Shulha, 2013). For this reason, we maintain that applications of CAE stand to benefit greatly from being informed by principles.
Cousins et al. (2013) argued that efforts to compartmentalize CAE are limiting in that such a focus privileges the given evaluation approach over the context within which the program is being implemented. In answering concerns raised about this position by Fetterman, Rodriguez-Campos, Wandersman, and O’Sullivan (2014), we provided clarification but first confirmed our support for all those making efforts to advance the theory and practices associated with CAE and expressed little doubt that access to well-defined methods, procedures, and processes serves a significant need for many practitioners. Evidence for this is seen in the degree of interest in workshops and resources that provide training and professional development associated with these approaches. However, our case remains firm. During evaluations—especially those in which processes and products are grounded in close human interaction—there are likely to be continuous contextual disturbances. If the selected approach does not address the specifics of how to proceed in the light of such disturbances, evaluators may either feel compelled to stay the course or feel lost at sea as to how to proceed (Cousins, Whitmore, & Shulha, 2014). Either way, the outcome is bound to be less than desirable. In her reflections on collaborative evaluation, the approach promoted by O’Sullivan (2012) and Rodriguez-Campos (2012), Fitzpatrick (2012) provides further support for our position: “…I would argue, distinguishing among different models of evaluation may not be an appropriate goal at this stage of evaluation. Evaluation capacity building, mainstreaming evaluation, attention to context and other recent issues which have drawn the attention of evaluators have been offered not as new models, to replace old, but, rather, as issues that any evaluator should consider and use, as appropriate, to the situation of the evaluation” (p. 559).
General evidence-based principles for collaborative approaches to evaluation are intended to yield guidance rather than direction. Such guidance will be particularly informative when the need for a collaborative approach is being deliberated, when evaluators are not intimately familiar with or wedded to a specific approach, and where the complexity of the program and its environment requires flexibility and adaptation of the collaborative inquiry (Cousins et al., 2014). We agree with Brown (2013) that the hallmark of professional evaluator decision-making is “the enactment of choice among alternative courses of action made in response to perceived changes in circumstances and conducted in a context of ambiguity” (p. 2). Consequently, the value of these principles will rest in their capacity to illuminate complexity rather than resolve it, to inform decisions rather than prescribe them.
And so, we introduce a set of empirically grounded principles that individually, and as a set, show promise in guiding evaluation practice in contexts where the meaning of evaluation is jointly constructed by evaluators and stakeholders. We make no claims that this set as it stands today is either exhaustive or enduring. It is, however, a product of the collective wisdom of evaluators who both embrace CAE in their practice and took the time and interest to help inform how these practices could be understood. Our hope is that as collaborative approaches are both used and refined, empirical and scholarly work will seek to test the veracity of these principles. In encouraging such work, we recognize that the principles themselves are likely to evolve.
Deriving the Principles
The interconnected set of principles in Figure 1 emerged from a 4-year multiple-method, multi-phase study, the sequence of which is summarized in Figure 2. Phase 1 began with deliberations about how best to access the experiences and insights of practicing evaluators around collaborative practices. Ultimately, three of us (Cousins, Whitmore, and Shulha) each generated short narratives describing two evaluations: one considered to be highly successful, and the other considered to be less than successful. Our use of counterexamples was partly based on our knowledge that they can provoke deeper thinking about the qualities of a complex phenomenon in case-based learning (Poumay, 2001). As we shared our examples, we realized they represented work with a variety of collaborators in assorted evaluation contexts. We also realized that no single standard could be used to judge the success of the evaluation itself. The criteria we had each used to judge our efforts as either “successful” or “less than successful” were personalized and typically connected to expectations we each had of ourselves and for the evaluation. By constructing these self-reports, however, we described instances and circumstances in which the quality of collaboration was deemed to have made some sort of impact on the quality of the evaluation. It was this intersection, as understood by practicing evaluators, that we wanted to tap into and use to underpin the yet-to-be-developed principles.
Figure 2. Sequential phases of study from 2011 to 2014. Note. AEA = American Evaluation Association; CES = Canadian Evaluation Society; IDEAS = International Development Evaluation Association.
We used Phase 2 to gauge the appeal of identifying and constructing an overarching set of principles for CAE. To do this, we facilitated two “think tank” sessions at annual meetings of the American Evaluation Association (AEA, Phase 2a, 2011, 2012), both sponsored by the Topical Interest Group: Collaborative, Participatory and Empowerment Evaluation. The interest and feedback from participants in these sessions spurred our further research. We also learned from them the importance of inviting evaluators to be collaborators in this research, not just data sources for the project. Subsequently, we invited participants from both think tank sessions to pilot test an online instrument we developed, asking evaluators to submit comments on their experiences in completing the requested task (Phase 2b). Based on this feedback, we further revised the instrument in preparation for distribution to a wider range of evaluators.
In Phase 3, we invited evaluators from three professional evaluation societies to participate in the research: the AEA, the Canadian Evaluation Society (CES), and the International Development Evaluation Association (IDEAS). We ultimately received 320 usable responses, the majority of which came from AEA members (93%). Almost three-quarters of the participants (72%) self-identified as evaluation practitioners; the remaining respondents occupied a variety of roles, specifically, teachers/trainers (6.9%), researchers/theorists (10.0%), commissioners or overseers of evaluation (3.4%), and others (6.9%). The majority of participants (54.2%) had over 11 years of experience working as an evaluator.
The qualitative data the participants provided were coded using NVivo (version 10) and analyzed using conventional inductive techniques (Patton, 2015a). Data were sorted into categories that we called contributing factors. Each contributing factor identifies certain influences, conditions or reasons, positive or negative, which shaped the success of the respective applications of CAE. Factors that contributed positively illuminated how certain conditions shaped evaluation success. Factors that contributed negatively typically implicated conditions, attitudes, or events that frustrated the process. Once we identified contributing factors, we grouped them into meaningful clusters and used each cluster to underwrite the naming of the respective principle, an iterative, time-consuming collaborative process. We ultimately generated eight draft principles supported with evidence through this process.
In Phase 4, we circulated the draft principles in a summary paper to the 297 participants who had indicated (in Phase 3) their willingness to participate in the validation process. Included in this communication was an online survey that invited participants to provide feedback on the principles as well as suggested directions for ongoing inquiry and field-testing. Fifty-eight participants provided timely responses. When asked about the importance of each principle as a mechanism to guide CAE (from 1 = “not important” to 7 = “very important”), participants rated each of the proposed principles highly (M = 6.46, SD = 0.57, N = 56). At the same time, many offered important comments on how to improve communication clarity for each principle. As a team, we then worked together to remove obscurities and ambiguities and to take into account participants’ feedback as well as input from critical friends and anonymous reviewers.¹ Table 1 summarizes the results.
Table 1. Building Principles to Guide Collaborative Approaches to Evaluation.
Principles to Guide Collaborative Approaches to Evaluation
The principles described below are not presented in any order of importance. As illustrated in Figure 1, the principles are conceptualized as a set of interdependent considerations that are relevant when using CAE. The importance of interconnectedness became evident in our analysis, where participants’ quotations about their collaborative experiences were occasionally used to support the development of more than one principle. In the ensuing presentation, we draw from participants’ verbatim quotations to support each of the principles.
Two considerations are pivotal to the use of the principles. First, we contend that the set of principles is not a menu from which evaluators ought to choose in undertaking collaborative work. Adherence to individual principles is a matter of degree, as opposed to a “whether-or-not” proposition. In short, we see each principle as being essential to evaluation practice in a collaborative context; the extent to which any given principle is important will depend entirely on contextual conditions, circumstances, and complexities. For an elaborate discussion of the essentiality of principles, see Patton (2015a).
A second consideration is that the principles should not be prioritized a priori; we lay claim to only a loose temporal order. To reiterate, we reason that a decision about which principle to emphasize, and when, is likely to be contingent on the purpose of the evaluation, the stage of the evaluation, the context in which the CAE application is being implemented, and the emergence of complexities as the evaluation unfolds.
With this in mind, we now turn to a brief explanation of each principle.² Note that contributing factors associated with each principle appear in the diagrammatic excerpt from Figure 1 used to identify the respective principle.
Principle Descriptions
Clarify motivation for collaboration
As evaluators and stakeholders move toward undertaking a collaborative approach, significant attention should be paid to what this means in practice and why such an approach is desirable. While, on the face of it, the need for these understandings appears self-evident, our respondents reminded us that policies mandating CAE are not unusual, and the charge to collaborate certainly does not guarantee collaborative practices in action. Not engaging in a frank, joint discussion with stakeholders around a request or proposal for a CAE can be costly. Learning too late that “the grant application articulated a collaborative approach that the managers did not support” can derail not only any hope for a meaningful collaboration but also the chance for a meaningful evaluation.
The clarification process suggested by this principle is the task of making understandings around the purpose of the evaluation, the information and process needs embedded in the purpose, and the expectations of stakeholders around the collaboration explicit and transparent for all collaborators. As one evaluator told us, this is one way to have “all stakeholders understand and be deeply committed to the project.”
The presence of a “global misunderstanding of the evaluation purposes for both the evaluation team and the main stakeholders” can leave “staff, [feeling] not part of the evaluation committee, and [as though] it is not part of their job to participate.” When the purpose is agreed upon, “stakeholders add additional perspectives of what is feasible to implement, and what they are willing to commit to [in] implementation.”
Some purposes appear more conducive to the use of CAE than others. When the collaborative approach worked well, we were often told that program improvement, opportunities for individual and organizational learning, and organizational capacity building were high on the list of purposes. We also heard that:
The program managers greatly valued the potential for the evaluation to help them continuously improve the program. The stakeholders were very interested in determining how their program worked, how to improve the program, and how to demonstrate outcomes. Everyone had a common goal of program improvement. The staff was committed to learning how to conduct an evaluation.
In contrast, when accountability and legitimizing purposes were the focus of the evaluation, it was more problematic for the collaboration. In one context, “government renewal of the program was at stake in the evaluation and this made the program participants quite defensive.” Another respondent lamented, “…data were used to justify the program rather than improve it.”
Clarifying the expectations around CAE requires learning about the extent to which stakeholders will welcome engagement and be prepared to work at fostering the approach. In one instance, success in the collaborative approach was attributed to “stakeholders [who] leveraged their network for the benefit of the design, data collection, and validation of findings.” In contrast, one funder “used the evaluation to force collaboration … among stakeholders.” Sensing reluctance and coercion at the program site early in the process and then taking time to understand and temper these dispositions may be one way to avoid intentional or unintentional sabotage of the evaluation by stakeholders down the road. One example illustrated this clearly: “The program developers and implementers wanted to do their own evaluation.… When we were hired they participated but changed everything – this was not positive participation.”
Not only is it important to establish the meaning of the CAE application early, there are also benefits to reinforcing this meaning over time. Doing so appears to provide a touchstone as the evaluation unfolds. As one respondent put it, “… we both knew we were committed to collaboration, which didn’t make it easy, it just meant we both kept that goal salient when things got tricky.” Nurturing CAE can begin with an agreement about how information gets communicated among those actively involved. In one case, “The organization ensured the evaluator [he/she] could speak independently with all stakeholder groups.” In comparison, another evaluator described how the collaboration was thwarted because he or she did not or could not “fully tap into motives of various stakeholders to voluntarily offer information and participate in the evaluation.”
Evaluators citing successful CAE projects have probed, documented, and shared among stakeholders the information and process needs underpinning their work together. “The clients knew what they wanted to understand, but didn’t know how to find out the answers, so I was able to work with them to sort through their ideas and set priorities.” Sometimes these needs and priorities are discovered through formal evaluation activities. “Our stakeholders are always involved in identifying the evaluation questions and this [was] successful to make sure we [were] collecting information that [would] be useful to them.” Not agreeing on the needs to be addressed can have serious consequences in how evaluators and stakeholders perceive the quality of the collaboration and ultimately the evaluation. “Since everything was a priority and everything got measured, nothing was done well.” Another evaluator lamented, “we couldn’t acknowledge or work through differences or prioritize anything.”
Taking what is learned about information and process needs and through joint effort translating these needs into the evaluation design also enables evaluators and stakeholders to examine assumptions about how collaboration might meet these needs. “People are more likely to buy into this [collaborative] process when they have a hand in designing it.” As much as evaluators may be open to joint design, they may encounter stakeholders who are less than enthusiastic. “Not all stakeholders shared a sense of urgency or expected value added by conducting an evaluation.” Left unattended, this may be an indication of a root problem requiring attention. “Program stakeholders did not want the evaluation to occur. [We learned] it had been requested by outside management.”
Engaging stakeholders early and often about their motivation for working collaboratively is no guarantee of a successful collaboration. Evaluators have told us, however, that processing information and evidence about the proposed purpose for the evaluation, identifying the information and process needs underpinning the call for evaluation, and clarifying stakeholder expectations around how the evaluation will unfold can help both evaluators and stakeholders to articulate the assumptions and needs that the call for collaboration is intended to address.
Foster meaningful relationships
A successful CAE, we were told, relies on the quality of the relationships that evaluators and stakeholders are able to develop and sustain. When reflecting on these relationships, one evaluator described how the evaluation “…was conducted in a highly cooperative, and collaborative organizational context, with abundant positive peer/professional relations and a wholesome, trusting, organizational climate.” Valuing and utilizing the contributions of stakeholders appear to be central to a context of mutual respect. One evaluator talked about how critical the understandings of stakeholders were to their team: “the stakeholders knew much more about the context of the evaluation than we could absorb in the short time we were on the job.” At the same time, evaluators recognized that their ability to establish credibility with new stakeholders may be, in part, a function of previous processes: “The stakeholders valued our relevant prior experience on very similar projects. They knew we had a reputation for doing good work on this type of project. They were very open and eager to work with us.” For evaluators, the danger in assuming that reputation can automatically foster a strong relationship is in not remembering the attention, skills, and the effort that were central to developing this reputation.
Trust among evaluators and their stakeholders is not ensured by a contract. Trust requires purposeful effort on all sides and is more effective when the effort is transparent. As one evaluator shared, “evaluators need to demonstrate that they want to, and are listening.” It was encouraging to hear that when respect and trust were in place, they helped stakeholders to “… be honest with the evaluators about their strengths and more importantly weaknesses.” In at least one case, trust and respect also facilitated patience with the collaborative process: “The genuine effort to be collaborative was recognized by partners even if it wasn’t always perfectly implemented as a collaborative process by the evaluator.” In such a context, evaluators must avoid “too many unspoken assumptions.” Instead, as one evaluator reported, when energy was invested to “work out differences directly with each other … learning was less threatening.”
It was clear from evaluator stories that the quality of an emerging relationship does not rest solely in the hands of the evaluator. Rather than proceed in the face of resistance, however, it may be wise to revisit the initial motivation for a collaborative approach and the evaluation itself. Some evaluators lamented the fact that despite their best efforts, “stakeholders did not act as partners and had very little trust in the evaluation process” and “program developers weren’t willing to engage with evaluators.” And finally, “from my standpoint, no matter how accessible I tried to be, no matter how many pies and cookies I baked for team meetings, staff still saw me as an outsider whose main job was to check up on them.”
Huberman (1999) documented the influence of sustained interactivity on research utilization. Respect and trust also appear to benefit from this same form of structured and sustained interactivity. For example, one respondent shared that “close and constant contact was instrumental to real-time communication and relationship building.” We heard about the importance of having “clearly defined and communicated expectations, roles, and responsibilities.” With these in place, it is no doubt easier to value and use the “frequent feedback from program managers and local evaluators during all stages of the evaluation.”
“Cultural competence is a stance taken toward culture, not a discrete status or simple mastery of particular knowledge and skills” (AEA, 2011, p. 1). Thus, we were not surprised to learn that relationships are reinforced when CAE projects are monitored for their capacity to acknowledge, respect, and honor diversity. Evaluators who felt their CAE experience had been successful described different approaches to adopting this stance. One purposefully worked through “a values and needs identification process that was inclusive and participatory.” Another put together a “multicultural team of evaluators all of whom were skilled evaluators in their own right as well as bringing different cultural lenses.” A culturally competent stance can only work in favor of CAE. Such a stance was reported to be central in working with “an innovative program in a strongly cultural space. Without collaboration [the evaluation] would have been largely meaningless, as cultural stakeholders wanted their say.”
Becoming culturally competent in evaluation requires evaluators to “maintain a high degree of self-awareness and self-examination to better understand how their own backgrounds and other life experiences serve as assets or limitations in the conduct of an evaluation” (AEA, 2011, p. 1). If we adopt Gray’s (1989) notion that collaboration is “a process through which parties who see different aspects of a problem [or issue] can constructively explore their differences and search for solutions that go beyond their own limited vision of what is possible” (p. 5), then the ability to build and maintain respectful and valued relationships can be viewed as an expression of cultural competence in evaluators.
Develop a shared understanding of the program
All evaluations are encouraged to “document programs and their contexts with appropriate detail and scope for the evaluation purposes” (Yarbrough et al., 2010, p. 185). When the approach to evaluation is collaborative, engaging stakeholders in documenting the goals, objectives, and intended implementation of the program was reported to help the evaluation itself run more smoothly. Typically, stakeholders are included in the program description process: “The involvement of stakeholders provided a more accurate definition of the terms, problems, and population needs [and] culture.” A less conventional way to go about this task, according to one respondent, involved “conducting interviews with program participants [which] allowed funder stakeholders to understand how the program worked.” Another cautioned against this approach because “the program was in transition and difficult to find a consistent thread/voice among program participants.” No matter how the description process takes place, focusing on a mutual understanding of what is being evaluated can reduce the likelihood of stakeholders moving forward in the evaluation with “unrealistic expectations about the program outcomes/design.”
Practicing evaluators also told us that there is value in moving beyond the elements of a logic model as a guide for describing and discussing the program. A shared understanding of both the organizational context within which the program is operating and the organization’s capacity for engaging in CAE may be equally important. Successful collaborations were connected to working with “a program manager who was intent on making sure that her program was successful, constantly improving and had the documentation to prove it” and with “supervisors [who] supported program developers, implementers, and front-line staff to have time to work on evaluation.” In one practitioner’s experience, it was unclear whether an evaluation was ever conducted, but the option for collaboration for this evaluator certainly vanished when disturbing evidence about the program’s organization led to this judgment: “Institutional racism [was] very strong and embedded throughout the bureaucratic system. [It was] perpetuated in research and evaluation.”
It was not uncommon for the evaluators in our study to begin their collaborative work with stakeholders feeling confident in the capacity of the organization to embrace the process. The problem was that over the life of the evaluation, this capacity may have diminished or disappeared. We were told about “a mid-project change in administration [that] decreased political support for the project, [and] the motivation for stakeholders to participate,” and how “significant organizational turnover occurred at the dissemination and use phase, so new leadership wanted to follow a new vision rendering the work irrelevant.” Whether or not evaluators have the wherewithal to mitigate the erosion of commitment to CAE will depend both on the skills of the evaluator and emerging conditions within the organization. In any case, continuous monitoring of the organization can alert evaluators to conditions that may erode the willingness or ability of stakeholders to engage.
Promote appropriate participatory processes
What does it mean for stakeholders to be “involved” in a CAE? One way to answer this question is to refer to the significant work that has been done to name, define, document, differentiate, and compartmentalize specific forms of evaluations that promote a collaborative approach (Fetterman et al., 2014). Alternatively, if we adopt Gray’s (1989) notion of collaboration as cited above, involvement in a collaborative approach to evaluation can be operationalized in a more contextually responsive way. Being involved in a collaborative approach then becomes defined by the context-specific decisions and processes that evaluators and stakeholders make, in order to use their differing visions, in search of optimal ways to address identified information and process needs. Regardless of whether CAE is predetermined, emergent or some combination of both, decisions will need to be made about the optimal form of participation; specifically, who will participate in the collaboration, how those identified will participate, and who will have control over decision making during the various phases of their joint effort (Cousins & Chouinard, 2012; Cousins & Whitmore, 1998). References to these essential facets of participation infused the experiences that evaluators shared with us.
Evaluators in this study were often inclined to attribute the degree of success they experienced in using CAE to the way one or more of these three dimensions played out. Where they varied was in where along these dimensions they chose to operate. For example, one evaluator described how … participants were close to—and ultimately owned—the data. They helped design the tools, collect the data, analyze the data, interpret the data, and presented findings. It wasn’t just buy-in to the process and outcome—it was implementing the process themselves (not being led through) and generating (not being given and asked for their thoughts about) and owning the outcomes.
Many evaluators reported on the importance of identifying and considering a variety of stakeholders, especially when these individuals or groups “otherwise might not have been involved.” Doing so made it possible to “prevent a few stakeholders and other decision makers from remaining silent partners.” One evaluator was proud that for the first time in a specific context, “the project had a more in-depth process to hear beneficiaries’ voices.” We heard, however, that the actual challenge might not be in identifying the diversity of stakeholders most appropriate for inclusion in the CAE project, but in negotiating their participation. “Decision makers did not want those best equipped to contribute … to participate as anything more than sources of data—following marching orders.”
The optimal depth of participation of stakeholders will be different depending on the purpose or form of collaborative approach. There was evidence, however, that in CAE experiences considered to be successful, those who were identified as being central to the collaboration were also engaged in shaping the approach. In one instance, “stakeholders and the evaluator participated in conceptualizing the project, before it was even funded.” The evaluation appears to benefit when these same stakeholders are engaged in meaning making at critical points along the way: “Preparing reports involved stakeholders so that multiple ways of sharing evaluation information was done. These methods included briefs, interim reports at various milestones, presentations, and final report with an executive brief.” We were also told of successful CAE applications where “stakeholders interpreted the findings, generated and implemented recommendations for change, and presented the findings to their colleagues.” While there is no rule for how deeply stakeholders should participate in the evaluation, the worst-case scenario appears to be when there are “unclear expectations for participation in evaluation activities.”
Decision making within CAE may just be the most difficult participatory process to manage. One evaluator did attribute the success of the evaluation directly to the “evaluator [being] open to sharing the control of the evaluation, particularly as it related to instrument choice and development, data collection methods, and interpretation of data.” Many more, however, attributed their less than successful experience in implementing a collaborative approach to evaluation to the complications that arose around control over decision making. In these examples, it is interesting to note that the power issues were not always between the evaluator and stakeholders. All stakeholders are not created equal. Some have greater influence over others and do believe their voice should carry greater weight in articulating the evaluation findings. The advisory group did not have any real influence to change the evaluation design or the survey assessment tools. My sense is that they perceived their lack of influence and the group did not meet as a group beyond one or two meetings. The steering committee was not representative, and unduly influenced or even forced the whole process. Program developers wanted too much involvement and did not have the skills or experience to help with the evaluation; they co-opted the evaluation and then used it to report what they wanted to report, rather than reality. Funders ultimately were poor collaborators. [They] dominated … power dynamics [were] dysfunctional. A vulnerable group [was able] to drive the evaluation. Because they represented the majority [of] participants, at any meeting, they were able to control data interpretation and meaning making. Instead of an evaluator leading the stakeholders through a well thought out process, the program team was somewhat in charge and the evaluator was at their mercy.
Monitor and respond to the resource availability
The full cost required to support CAE, including budget, time, and skilled personnel, can remain abstract to both evaluators and stakeholders, sometimes until well into the process. Typically, this set of resources works together to shape the feasibility of the collaborative approach. A change in one can dramatically influence the others.
Only the evaluator headlining this principle made reference to a purposeful redistribution of funds in support of the time and effort required of the stakeholders engaged in the collaboration. It was more common to hear from evaluators that there were “insufficient funds to meet the expectations for the evaluation.” If the collaboration is identified as “part of the job” for those who will be heavily involved, then asking what will be removed from the list of their responsibilities during the evaluation may be a way to revisit the purpose of and expectations around the CAE.
Even when everyone is confident that the evaluation has been appropriately funded, fiscal conditions are known to change. Evaluators told us about how a “budget crisis led to an evaluation cutback.” Such changes are often beyond the control of those collaborating. But evaluation funds can also be vulnerable in the face of existing or emerging organizational priorities. “The funded initiative was the result of available, external funding. Although the management had agreed the [evaluation] work was important, really they just wanted the money.”
It is not unusual for CAE to require more time to implement than conventional approaches. This appears to be especially true when there are process and capacity-building goals. Stakeholders may need to negotiate the timing of their engagement or be coached in the skills defining their participation. One evaluator attributed success in a CAE project to “taking the time not to rush the process and [providing] ample opportunity for partners to provide feedback.” The fact that the evaluator was willing to take the time necessary for meaningful collaboration “was recognized and appreciated by partners.” Being conscious of externally imposed time constraints, and of their implications for the time available to grow the collaboration and nurture the appropriate inquiry skills, is critical. One evaluator told us about how success was constrained by “limits [that] existed around the school year.” Another lamented, “the yearly evaluation [was] based on a funding cycle [that was] too short to capture more qualitative changes.”
In CAE, the most important resource may be the people who are working together. Many evaluators talked about having difficulty motivating stakeholders to stay engaged, and it was not because the evaluation was going badly. Instead the collaboration suffered from emerging conditions within the evaluation context. For example, [There was] sporadic participation by stakeholders … because of schedule conflicts and work demands. The program [was] coming to an end and staff [were] looking for other job opportunities. [There was] staff turnover due to job instability within the organization. [This] significantly limited the ability to engage more stakeholders.
Whether grounded in a specific and well-defined collaborative approach or evolving from an examination of evaluation purposes, needs, expectations, and context, a collaborative approach works best when the personnel collaborating can, as a team, bring to the table an appropriate combination of facilitation and inquiry skills. Predicting what these skills might be can help to establish realistic expectations for roles and responsibilities. “The evaluator was not an expert in the program content area and absolutely needed stakeholders to provide clarity about how the data would be used and what the boundary conditions were for asking questions of intended beneficiaries.”
Assessing the extent to which the skills critical to the approach are present is one step. One evaluator said, “the program managers did not have adequate data management skill,” while another noted that “the evaluation team did not have breadth of experience to deal with complexity of program.” Deciding what to do about the need for more expertise in a supposedly collaborative approach is more problematic; one fallback position is “more reliance on the evaluator for their skill sets and expertise.”
Monitor evaluation progress and quality
Evaluations will progress toward important findings and useful outcomes as long as the design driving the evaluation remains credible. Evaluation design decisions are typically the most appropriate and powerful at the moment they are agreed upon. Our participants reminded us, however, that, as the CAE study unfolds, it is not unusual for the evaluation context (i.e., people, programs, organizational context, information needs, and process needs) to change. The continued implementation of early design decisions, therefore, may actually be problematic in moving the evaluation forward. Frank conversations at these times allow for modifications that can keep the evaluation on track. For example, “[the] climate in which the evaluation was being conducted was conducive for talking about disagreements openly and honestly and coming to a resolution that everyone was okay with moving forward.”
Evaluators talked about the need to remain “flexible [and] very open to [making] adjustment/changes in the evaluation design and implementation.” Staying attuned to the consequences of early design decisions “on an ongoing basis” can alert the evaluator and stakeholders to the need to re-envision the path forward. Acknowledging, and sometimes confronting each other with, any accelerating lack of fit between the intended evaluation design and the capacity of the collaboration to implement this design can be productive and evaluation saving. As one respondent reported, “the evaluation design changed and adapted to include negotiations around stakeholder needs and requests.” A preliminary assessment of the organization’s stability and evaluation history may help to avoid a situation such as this one: “The evaluation design, specified in a grant proposal, [became] awkward and burdensome, and it could not be changed after the grant was awarded.”
Evaluators in this study identified issues around data collection as the most common threats to evaluation progress and technical quality. Evaluators are typically well trained in how to create logically and methodologically sound evaluation designs, ones that will lead to reliable/dependable data and justified/trustworthy findings. Our participants revealed that issues around accuracy (Yarbrough et al., 2010) are often no different in CAE than they are in non-collaborative designs. For example, “sadly, data requested were not accurate because stakeholders were ‘hiding’ duplication and overlap of services for their clients” and “evaluation data, by the time it reached those who could use it best, was so massaged by the client … [it had to look just so] that it was watered down and meaningless.”
It is apparent, however, that there are some unique pitfalls when purposefully engaging stakeholders in data decisions. Assuming that stakeholders are appreciative of the implications of data quality on findings and outcomes may be the first of these. “Frontline staff, who were responsible for collecting the data, did not understand the importance of getting it collected accurately.” Also problematic are situations where collaborators see response rates as the gold standard for assessing the quality of their work in developing or administering data collection tools. For example, “the staff changed instrumentation during the collection phase but did not inform the evaluator about the need to change the instrument.” This situation may have been triggered by a legitimate concern for those acting as data sources and how they were interacting with the instruments. Another evaluator commented, “instruments used in evaluation were interpreted like a test to be passed.” Not all stakeholders will be engaged in shaping elements of the evaluation even when a collaborative approach is guiding decision-making.
It is critical to remember that individuals and groups whose experiences, ideas, and feelings are central to answering the evaluation questions are also stakeholders. Guiding collaborators on how to help those offering data to see the importance of their role and the logic behind the data collection tools may be necessary. Doing so will improve the likelihood that data collection instruments are working as intended and may even build evaluation capacity in the collaborators themselves.
In an ideal world, the resources to build the data collection skills of those involved are integrated into the budget. For example, “the stakeholders provided the personnel to train and collect our data systematically across all sites and followed up on any data collection problems that we identified.” A more common story is of stakeholders striking out with minimal or no preparation. As a consequence, we were told that “some of the changes to the evaluation made by stakeholders resulted in questions that were duplicative or difficult to interpret” and “measurable indicators were poorly chosen and did not answer critical questions about program benefits.” In the absence of formal training of stakeholders, clarifying the assumptions that stakeholders have concerning the value of data collection instruments and the processes of data collection is recommended. Such attention may reduce the amount of monitoring necessary during data collection itself and go a long way in preserving the integrity of the evaluation.
Promote evaluative thinking
Archibald (2013) defined evaluative thinking as “an attitude of inquisitiveness and a belief in the value of evidence, that involves skills such as identifying assumptions, posing thoughtful questions, pursuing deeper understanding through reflection, and perspective taking and making informed decisions in preparation for action” (November, 2013, p. 5). Preskill and Torres (1999) described how evaluative thinking requires “members [to] come together to engage in the learning processes of: (1) Dialogue, (2) Reflection, (3) Asking Questions, and (4) Identifying and Clarifying Values, Beliefs, Assumptions, and Knowledge” (p. 45). When evaluative thinking is done collaboratively, it is intended to make evaluation processes and findings “more meaningful to stakeholders, more useful for decision makers and more effective within an organization” (Torres et al., 2000, p. 28). The strong relationship between inquiry and learning already established as part of formal evaluative thinking was verified by participants in this project.
It was common to see the successes of both the collaborations and their evaluations attributed to a commitment to inquiry. For example, “the organization put evaluation [as] the top priority. [They were] willing to spend time on it” and “the evaluation had the buy-in of agency management and staff at all levels.” Specific reference was often made to the disposition of stakeholders toward evidence-informed decision making: The program managers have used evaluative data for program improvement and to communicate program successes to stakeholders. The program administrator was committed to collecting data and using information for program improvement, accountability and future funding purposes. [They] believed in the value of using a rigorous evaluation design and data collection/analysis procedures.
How does an evaluator help stakeholders to keep the inquiry focused on learning? This was the challenge faced by at least one evaluator in our study who described “low interest in learning about project successes and challenges.… [It is] easier to implement [a program] when you [think you] already know it, than to modify and change according to the findings, conclusions, recommendations.” From what participants in this study said, the key may be to suppress any early inclinations to assure collaborators of the current merit, worth, and significance of their program. In more successful contexts, significant energy seems to have been spent helping collaborators first become invested in the learning process and prepared for the unexpected. For example, Because of the stakeholder commitment, results were used as an opportunity to learn and grow. Program designers and implementers were open to feedback and recommendations based on data from the evaluation. Stakeholders were willing to accept negative or contrary results without killing the messenger.
The educative role may be equally important when it is apparent that stakeholders have misunderstandings about how evaluation might work for them. In one context, an evaluator reported that “the culture of the group receiving the evaluation was not one that fostered data use. They saw the evaluation as being for ‘someone else’ or as a ‘proof of concept’ to justify expenditures.” In another, “line staff were not invested in the evaluation and had no interest in learning about evaluation because they felt it did not apply to their lives. Once the grant was over, they felt they’d never see evaluation again.” Evaluators may need to draw liberally on the strategies suggested by Archibald (2013) and Preskill and Torres (2000) if they are to help stakeholders move beyond initial obstructing assumptions. In the absence of strong evidence that stakeholders and their organizations are invested in evaluative thinking, the experiences of evaluators who participated in this study suggest that it would be wise to cultivate it.
Follow through to realize use
Kirkhart’s (2000) framework on evaluation influence was pivotal in helping evaluators expand their vision of the consequences of evaluation for persons and systems. Our data were not rich enough to examine how all three dimensions Kirkhart suggested (i.e., source, intention, and time) may have been molded by CAE. They did reveal, however, that evaluation outcomes, as stimulated by both the evaluation results and processes, were shaped by collaboration. The influence of CAE on outcomes was easily characterized using a framework proposed by Cousins and Whitmore (1998). Practical outcomes at the organizational level influence program, policy and structural decision making. For stakeholders, practical outcomes are seen through a change in disposition toward the program or evaluation, and the development of program skills including systematic evaluative inquiry. Transformative outcomes are connected primarily to a change in the way organizations and individuals view the construction of knowledge and the distribution and use of power and control. Such changes are labeled transformative only when they promote the type of social change that enhances independence and democratic capacities of ordinary people. The strength of this framework is its recognized capacity to capture outcomes that can surface in participatory contexts and to stimulate thinking about how to proceed with CAE when these outcomes are a priority (King, 2007).
In this study, collaboration influenced practical outcomes directly through the way findings were derived. By working with the data together, “stakeholders were better able to consider multiple reasons for results.” One evaluator identified the need for buy-in from those not normally targeted for participation. Expanding the stakeholder group can be an appropriate adaptation as the evaluation unfolds. “Representatives from among the program’s intended beneficiaries were actively involved in data collection and in reporting results; the evaluation [findings] had greater success in getting a serious hearing when program decisions were made.” When stakeholders were not engaged in the production of findings, it was more common to hear of a “miscommunication or misunderstanding of what the final product (i.e., the report) could tell us.” One evaluator lamented, “we tried to get them engaged in interpreting results but to no avail. The management team had no time to give to … meaning making.”
One positive but possibly complicating consequence of collaborating around data is that stakeholders may form conclusions around findings as the evaluation proceeds. Helping stakeholders suspend judgments until the data are in can be challenging. Acting on partial answers to evaluation questions, however, can be perilous for practical outcomes, as the following comments suggest. The agency did not wait long enough for results, decided that early intervention had succeeded, adopted a version system wide, and saw costs balloon because the intervention did not [work]. … the results created problems for senior management. They revealed that major barriers to applying evidence-based practices were their internal systems.
Possibly the most disappointing experience for evaluators using CAE is to discover that, despite best efforts, there was “no real intent to do anything with the results. Once [the evaluation] was complete, they could get back to business as usual.” For those confronting this kind of frustration, documentation throughout the evaluation on how the collaborative approaches were conceptualized and implemented can at least enable a productive internal meta-evaluation of assumptions and decisions (Yarbrough et al., 2010). This form of continuous professional learning typifies evaluators who grow, rather than repeat, their practice. Despite these pitfalls, CAE appears to enhance opportunities for practical outcomes, as seen in this example: “Participatory data interpretation process led to insights about program improvement that were immediately adopted and implemented.”
Transformative outcomes have arisen in contexts where evaluative inquiry has been used to deliberatively engage stakeholders in the social construction of knowledge, and when these emerging understandings were then used both to propel the evaluation forward and to re-shape the program and the organization (Flores, 2007; Harnar, 2012; Monkman, Miles, & Easton, 2007). Our participants confirmed this stance even though they did not explicitly identify transformative outcomes as a goal for their work. At the organizational level, our participants reported that “working collaboratively deepened the sense of community among the stakeholders,” and that “stakeholders became more empathetic to intended beneficiaries of results due to understanding the complex nature of problems.”
Transformational outcomes were implied more when the facilitating evaluator appeared to be skillful in promoting inquiry and to have expertise in human and social dynamics. Being prepared to work toward transformational outcomes almost certainly means being prepared to work in contexts where there are differences and even conflict. For example, “meetings were held with stakeholders with different viewpoints and interests, and when the program started there was some tension among them. Meetings and co-creation of evaluation promoted meaningful discussion, understanding and helped to reduce tension.” In another context, “the collaborative approach was highly successful because it supported relationship building between the two communities. The processes revealed that the two communities had not had formal relationships for decades and this needed to be addressed.”
Evaluators in this project confirmed a relationship between the use of collaborative approaches and the appearance of practical and transformational outcomes. We realize, however, that these types of outcomes are not totally independent. In some instances, the actual choice of using a collaborative approach was evidence of an intentional effort to generate outcomes that would have immediate and positive consequences for stakeholders and their organizations. Disappointment in their ability to consistently accomplish this goal suggests that evaluators working on CAE would be wise to negotiate with stakeholders: (a) the range of outcomes possible given the scope of the evaluation, (b) which outcomes are most worthy of purposeful attention, and (c) how joint effort might best facilitate these outcomes.
Moving the Principles Forward
The principles as derived and described above reflect the understandings that current evaluation practitioners bring to the complexity of implementing CAE. We introduce these principles, not as advocates for or owners of them. Rather, we encourage colleagues—academics, practitioners, commissioners, and stakeholders in evaluation—to take these principles forward and make them “the subject of continuous analysis and renewal through dialogue and systematic inquiry” (Cousins et al., 2013, p. 18). Four possibilities come to mind, although we would not see the list as exhaustive. These possibilities are using the principles as (a) a guide to planning and implementing CAE; (b) a basis for retrospective reflection on completed projects (with an eye to surfacing lessons learned); (c) a framework for structuring evaluation education, training, and professional development; and (d) a conceptual framework for ongoing research on evaluation (RoE).
To be relevant, these principles must show potential for adding value to our collective efforts. To be powerful, the nuances of interdependence need to be made even more explicit. The true test of these principles will be the extent to which they are able to strengthen collaborations and enhance evaluator working knowledge.
We also challenge readers to keep in mind the boundaries of the knowledge that has shaped them. For any set of principles to be truly appropriate for collaborative work, stakeholders need to be included in the conversation. This point is most certainly not lost on us. The principles could, no doubt, benefit greatly from having a wide variety of stakeholders and evaluation teams reflecting systematically on the complexities of interdependence (Jill Chouinard, personal communication, November 2014). Such efforts could shed important light on how these guiding principles might best be refreshed. In our view, other important questions that remain unaddressed or only partly addressed include at least the following: How specifically do evaluators define success in CAE? How do these principles align with specific members of the CAE family? Do some principles more than others complement these approaches? To what extent are these principles responsive to contextual complexities? Do the principles resonate well in varying cultural contexts? Are these principles differently useful to novice and seasoned evaluators? And if so, in what ways?
Our sense is that the principles, when used as a set to guide and reflect on collaborative practice, hold strong potential for enhancing the success of such evaluations, and we encourage ongoing, well-documented field trials to confirm this hunch. But there are other potential benefits of the principles. Within the realm of evaluation education and training, one can easily imagine the development of instructional modules around each principle. Might they not also serve as an interesting basis for organizational evaluation policy reform? Regardless, it is our conviction that the principles require solid test-driving opportunities and that they should be revisited and perhaps re-engineered sometime not too far down the road.
In conclusion, we want to underscore the contributions of the significant group of evaluators who helped to shape this research and development project, and the hundreds more who were willing to contribute their experiences. Our effort to develop these principles was sustained by the commitment of colleagues to the project and to the idea of improving evaluation practice. We regard this as the beginning of a dynamic and evolving conversation and look forward to the dialogue.
Footnotes
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
