Abstract
There is a growing call for evaluation capacity building (ECB); however, the area currently lacks a rich research base, and there are few robust methods and practices through which to define it. The argument in this article is that the impact of evaluation is mediated by program stakeholders’ engagement in evaluation activities. This mediation provides a foundation for a consideration of the merit of ECB. This study sought to find confirmation that stakeholder engagement in evaluation could influence the outcomes of a program and thus provide evidence as to the merit and significance of evaluation engagement. Multiple forms of evaluation data were collected for two long-term public health evaluations and coded and collated against an evidence-based rubric in close consultation with the community. The influence of stakeholders’ willingness and capacity to engage in evaluation activity was also investigated. The analysis revealed that stakeholders’ engagement in evaluations provided reasonably unique contributions to the overall program outcomes. This article provides the impetus for evaluation-based capacity building, in that it presents empirical evidence that willingness, capacity to engage in evaluation activities, and the use of evaluation information increase the probability of achieving desired outcomes and sustainability.
Background
Over the past several years, evaluation capacity building (ECB) has increasingly dominated discussion as it has become a critical component in many areas of evaluation. According to Preskill and Boyle (2008), ECB is a process by which strategies are designed and implemented to assist individuals, groups, and organizations in the process of conducting effective, useful, and professional evaluation practice. While there is a growing base of theoretical and empirical ECB literature (Cousins, Goh, Clark, & Lee, 2004; Preskill & Boyle, 2008), there is still a low level of research and evaluation of various ECB initiatives. At its heart, ECB lacks construct specificity in that its purpose, definition, and worth remain questionable.
Why is ECB slow off the mark in demonstrating its worth and developing a meaningful common language? Why are individuals and organizations not embracing a concept that intuitively makes sense and is in high demand? Through our current base of knowledge, we know that ECB is a complex phenomenon involving issues of individual learning, organizational change, sustained change, program processes, and various outcomes (Preskill & Boyle, 2008). Yet, despite the encouraging headway made by many leading evaluation experts (Cousins et al., 2004), the “how to” of ECB is still to be developed and evaluated.
Wing (2004) suggested “the challenges to evaluating ECB derive from the complex interplay within and across individual and organizational factors, the difficulty of isolating causal factors of change both for the organization and for program outcomes, and the long-term goals of sustainable change.” Labin, Duffy, Meyers, Wandersman, and Lesesne (2012), in the development of their integrative ECB model and research synthesis, tackled the question of why evaluators should bother with ECB. They suggest that the need for ECB might stem from a combination of internal and external factors. Preskill and Boyle (2008), in turn, suggest that understanding the motivation and expectations for engaging in ECB is critical for the successful implementation of ECB programs. Ultimately, we need to determine the merit of stakeholder engagement in ECB and be confident that ECB does add value. To do this, we must also be assured that evaluation per se adds value. It can be argued that if a majority of evaluation sponsors are not convinced of the worth of evaluation as a discipline, then few are going to invest in capacity development in evaluation. Like many worthwhile notions, the effects of evaluation are difficult to isolate, and so it follows that attributing an influence to ECB is also difficult. This leads to the continuing plea for evaluators to specifically identify the areas in which they add value and address the key driving evaluation question: What is the true impact of evaluation?
Once we have a greater understanding of the nature of this impact, we may further define the role of stakeholders in this process and thence the needed capacities to be developed. It is the case that the impact of evaluation is mediated by program stakeholders’ engagement in evaluation activities (Clinton, O’Connor, & Mahony, 2010). The contention in this article is that this mediation provides the foundation for considering the merit of ECB. This study sought to find evidence that stakeholder engagement in evaluation could influence the outcomes of a program and therefore provide evidence as to the merit and significance of evaluation engagement.
The Evaluation Projects
The two long-term public health evaluations that this study focuses on include an explicit requirement to illustrate the impact of evaluation on the organizations’ learning environments, thus embedding an element of ECB in both projects. These evaluations provided an opportunity to measure engagement in evaluation activities as well as evaluation use, influence, and impact.
Both evaluations were multisite and multi-intervention physical activity and nutrition health promotion programs with a focus on community collaboration. The programs were conducted across two large regions in New Zealand with a total of 292 initiatives across the two programs. A culturally adapted form of the Centers for Disease Control Evaluation Framework for Public Health provided high-level guidance and the structure to guide the collection of multiple forms of data in both evaluations (Clinton et al., 2010). A common set of constructs was collected in both evaluation projects over a period of 4 years (see Table 1). These constructs allowed for the measurement of both the process and product for each program.
Measurement Constructs Utilized.
Method
Multiple forms of data collection were used, including reports, surveys, case studies, interviews, observations, and available secondary data. These data were collected regularly by the evaluators, program providers, or funders. Common dimensions were used to code key information from the 292 initiatives using scoring rubrics developed as part of the evaluation; the rubrics and their criteria were created from an evidence base (see Table 2). The standards for the indicators were developed in various workshops and in consultation with the communities. The community stakeholders assessed the exemplars on a scale of 1–10, where 1 was not evident and 10 was advanced. Stakeholders were also asked to explain their judgments, and this information was recorded. Thus, the coding of the evaluation information was informed by a research base and stakeholder input.
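The coding step described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that several stakeholder ratings of one initiative are averaged within each rubric dimension; the article does not state how individual judgments were combined, and the `Rating` record, the `code_initiative` helper, the dimension names, and the example values are all hypothetical.

```python
from dataclasses import dataclass
from statistics import mean

# Scale bounds taken from the text: 1 = not evident, 10 = advanced.
SCALE_MIN, SCALE_MAX = 1, 10


@dataclass
class Rating:
    """One stakeholder judgment of one initiative on one rubric dimension."""
    initiative_id: str
    dimension: str
    score: int
    rationale: str  # stakeholders were asked to explain their judgments


def code_initiative(ratings):
    """Return the mean score per dimension (averaging is an assumption)."""
    by_dimension = {}
    for r in ratings:
        if not SCALE_MIN <= r.score <= SCALE_MAX:
            raise ValueError(f"score {r.score} is outside the 1-10 scale")
        by_dimension.setdefault(r.dimension, []).append(r.score)
    return {dim: mean(scores) for dim, scores in by_dimension.items()}


# Illustrative ratings only; these are not values from the study.
example = [
    Rating("init-001", "evaluation_readiness", 7, "Provides progress reports"),
    Rating("init-001", "evaluation_readiness", 5, "Limited use of findings"),
    Rating("init-001", "collaboration", 8, "Shares information across sites"),
]
coded = code_initiative(example)
```

Keeping the rationale alongside each score mirrors the practice of recording stakeholders’ explanations, so that a coded value can be traced back to the judgment behind it.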
Summary of Scoring Rubric.
While all dimensions form a part of this research, a key focus is the influence on the program process of stakeholders’ level of readiness to engage in evaluation activities. Evaluation readiness in this context can be described as the willingness and capacity to engage in evaluation activities: capacity relates to experience, skills, and knowledge of evaluation, whereas willingness relates more to attitude and motivation. Willingness is a construct often discussed by evaluation practitioners yet not often researched. Preskill and Torres (2009) and Owen (2006) discussed the importance of understanding stakeholders’ attitude and agenda for evaluation, and Labin et al. (2012) suggested that positive attitudes were not often considered in research on ECB, whereas negative attitudes were described as a barrier.
The methodology not only allowed the evaluators to obtain information on the level of evaluation engagement from stakeholders and the organizations, but it also provided an opportunity to add further insight into the construct. The rubric (Table 2) illustrates that it was possible to obtain a high level of achievement on the evaluation readiness construct when the functioning of the specific initiative groups was characterized by several of the following markers. High-achieving groups were fully aware of and engaged in the evaluation process, and they readily provided the evaluation team with documentation of their progress in the form of one-on-one interviews, meeting minutes, progress reports, and various other documents. Further, these groups needed to display a keen interest in using the findings from the evaluation to inform their own future practice and to improve upon the current successes and implementation of the program. High-achieving groups also demonstrated that they had the capacity within their organization to engage in both internal and external formal evaluation processes. Evaluation information collected from these groups demonstrated a good knowledge of evaluation processes and the ability to provide detailed information, showing a depth of understanding about the project. These initiatives valued and supported the time and resources devoted to evaluation activities (e.g., evaluation was noted in job descriptions for project managers and community boards).
An additional construct to be considered was program adaptation, as it is suggested that this is strongly related to evaluation engagement. Program adaptation relates to the degree of change observed within the initiative or organization as a consequence of information gathered and used through evaluation activities. For example, following the development of a program logic, were the initiative objectives tightened? Alternatively, did the organization commence monitoring adherence to the logic model?
Combining the coded information collected from the evaluation of both programs allowed for an analysis of the influence of evaluation for each program and over time. The data collected from the 292 initiatives over a period of 4 years were analyzed using descriptive statistics, followed by a factor analysis to understand the strength of the high-level variables. Subsequently, a structural equation model was fitted to explore relationships between the variables.
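As a rough illustration of the first analytic step, the sketch below computes a mean, a sample standard deviation, and a Pearson correlation for coded rubric scores using only the Python standard library. The helper names and the scores are invented for the example and are not data from the study; the factor analysis and structural equation model would in practice call for dedicated statistical software, which this sketch does not attempt to reproduce.

```python
from statistics import mean, stdev


def describe(scores):
    """Return (mean, sample standard deviation) for a list of coded scores."""
    return mean(scores), stdev(scores)


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / (ss_x * ss_y) ** 0.5


# Made-up rubric scores for five initiatives (not data from the study).
readiness = [3, 5, 6, 8, 9]
sustainability = [2, 4, 6, 7, 9]

readiness_mean, readiness_sd = describe(readiness)
r = pearson_r(readiness, sustainability)
```

Correlations of this kind, computed across all pairs of dimensions, form the matrix on which a factor analysis would then operate.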
Results
The means and standard deviations are presented for the 292 initiatives in Table 3; they range from 2.60 to 7.27 and thus are not exceptionally high. Note that adaptation has an inverse relationship and should be low. Adaptation is a measure of initiative change as a consequence of evaluation information, and these indicators show a reasonable level of adaptation across both programs.
Factor Structure of the Evaluation Components.
The factor analysis revealed two very clean factors that can be categorized as variables relating to either program process or program products (Table 4). Evaluation readiness and engagement loaded strongly on the process factor. The correlation between these two factors was very low (r = −.22), indicating that they were providing reasonably unique contributions to the total evaluation.
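To make “reasonably unique” concrete, the reported correlation can be converted into a proportion of shared variance. The following is a worked illustration only, under the assumption that r is an ordinary product-moment correlation between the two factors:

```latex
r^2 = (-0.22)^2 = 0.0484 \approx 4.8\%
```

On this reading, the process and product factors share less than 5% of their variance, which is consistent with treating them as largely distinct components of the evaluation.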
Factor Means.
As is the case for most programs, it was the program outcomes and sustainability that determined the merit and significance for these two programs. A structural equation model was used to build an understanding of what factors contribute to program outcomes and program sustainability, as well as to understand the role of evaluation in this process. All dimensions were used in analysis; however, the contribution of evaluation engagement was considered the focal point (Table 5).
Evaluation Readiness.
Overall, the analysis demonstrated that all the dimensions in some way contribute to program outcomes and sustainability. A full structural equation model was analyzed, whereby all six predictors were related to the two outcomes (χ² = 147.23, df = 16, p < .001; root mean square error of approximation = .0443). The predictors explained 54% of the variance for program sustainability and 81% of the variance of the progress toward a goal (Figure 1). The analysis also revealed that the degree of implementation, achievement of key performance indicators (KPIs), and subsequent adaptation were significant predictors of progression toward outcomes. The model illustrates that organizational development, evaluation readiness, and collaboration are all good predictors of sustainability, although the weighting varied according to time and program. The place of degree of implementation is highlighted, as it is the only dimension to load on both outcome variables.
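For readers less familiar with the phrase “variance explained,” the sketch below shows the general calculation: one minus the ratio of residual variation to total variation in an outcome. The `r_squared` helper and the numbers are hypothetical and chosen only to make the arithmetic easy to follow; they do not reproduce the model or the data reported here.

```python
from statistics import mean


def r_squared(observed, predicted):
    """Proportion of variance in `observed` accounted for by `predicted`."""
    grand_mean = mean(observed)
    ss_total = sum((y - grand_mean) ** 2 for y in observed)
    ss_residual = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1 - ss_residual / ss_total


# Made-up outcome scores and model predictions (not data from the study).
observed = [2, 4, 6, 8]
predicted = [3, 4, 6, 7]
share = r_squared(observed, predicted)
```

In this toy case the predictions account for 90% of the variation in the outcome; the 54% and 81% figures reported above are read in the same way for sustainability and progress toward a goal, respectively.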

Program evaluation and its relationship to outcomes and program sustainability.
Discussion
The model demonstrates that it is too simplistic to suggest that an outcome can be achieved through an intervention activity alone, as a complex mixture of constructs comes into play. As would be expected, the degree of implementation is the most significant factor for both program outcomes and sustainability. While the importance of the “dosage” of an intervention cannot be overstated when looking for outcomes, the mediating variables must not be ignored. This study has shown that these mediating variables are intertwined or perhaps work in unison. Take, for example, key performance indicators: high levels of achievement of KPIs relate specifically to program outcomes, as does program adaptation; both contribute in some way to the program. Following on from this, we have demonstrated that using evaluation information to implement program adaptation is essential to achieving outcomes. Consequently, for change to occur and hence for a program to be successful, stakeholders need to engage in a process of critical reflection, whereby information is sought, reflected upon, and used, and adaptation occurs.
Process variables, such as organizational development and collaboration, were found to provide a critical contribution to the sustainability of a program. It has long been established that if a program is to continue successfully, having appropriate resources, a plan for quality implementation management, and an appropriate workforce is essential. While the research presented confirms this premise, it also highlights that evaluation activity has an essential role in this process. Similarly, collaboration is a worthwhile construct for both program development and evaluation. While many do not see collaboration as important, and in this model it certainly does not contribute to achieving specific outcomes, it does relate to program sustainability. Labin et al. (2012, p. 18) argued that collaboration is a critical variable for ECB, as “it is an essential thread in the fabric of ECB efforts.” The construct of collaboration in these programs and in the evaluation suggested that stakeholders were sharing ideas, information, and action, and hence thinking and working collectively in an evaluative way. The critical component here is the willingness and capacity of the stakeholders to engage in this process of reflection and adaptation collaboratively.
The analysis presented in this article demonstrates that evaluation engagement and the use of evaluation information are essential components of the process of achieving program outcomes and sustainability, thus highlighting the critical role of evaluation. However, it must be acknowledged that evaluative activity and thinking do not come easily, particularly for those stakeholders who have their hearts and funds in the project. Too often, evaluation simply detracts from the intervention. However, this study provides the motivation for project providers, funders, and policy makers to engage in evaluation. The model is able to provide a clear picture of the place of evaluation in bringing about sustainable change. What does this mean for ECB? It simply provides the impetus for ECB, in that it presents the empirical evidence that willingness, capacity to engage in evaluation activities, and the use of evaluation information increase the probability of achieving desired outcomes and sustainability. Therefore, increasing the willingness and capacity of stakeholders to engage in evaluation is a worthwhile pursuit, and one that should be at the forefront for organizations aiming to sustainably achieve their desired outcomes.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
