Abstract
In this article we defend the idea that theory-based evaluations—contribution analysis, logic analysis, and realist evaluation—are complementary components of a new theory in evaluation. We also posit that we are currently observing the emergence of a fifth generation in evaluation: the explanation generation. Theory-based evaluations have featured prominently in the discourse of evaluators since the mid-1980s. They have developed mainly in response to the need for evaluation of complex interventions. In this article we analyze certain approaches that have matured in their design and application. We use the framework of Shadish et al. to analyze the ontological, epistemological, and methodological foundations of various theory-based approaches in evaluation to appraise their similarities and differences. We observe that all these approaches are grounded in critical realism. Similarities seen in their ontological, epistemological, and methodological positionings, as well as their complementarity in terms of the evaluative questions they address, suggest we may be observing the consolidation of a new theory in evaluation and the emergence of a fifth generation.
Introduction
In this article we defend the idea that theory-based evaluations—contribution analysis, logic analysis, and realist evaluation—are complementary components of a new theory in evaluation. We also posit that we are currently observing the emergence of a fifth generation in evaluation: the explanation generation.
By comparing various theory-based approaches on their positions in relation to knowledge construction, valuing, and use—dimensions used by Shadish et al. (1991) and Alkin (2004)—we see they share foundational principles and answer complementary, and even overlapping, evaluative questions in such a way that they appear to constitute a new evaluative theory. Theory-based evaluations, which have emerged in reaction to current “normal” evaluation practice, assert the need for a program theory when evaluating complex interventions. Their emergence marks a new explanation generation in evaluation that follows the four generations—measurement, description, judgment and pluralism—identified by Guba and Lincoln (1989).
Since the 1980s there has been an emergence and consolidation of theory-based evaluations (Blinded for review; Coryn et al., 2011). These are characterized by the development of a plausible program theory, which is a primary product of the evaluation and upon which are based results, recommendations, and conclusions. Evaluations of this type differ from logic model construction in their analytical focus: their objective is not to report on the expected links between resources, processes, and outcomes, but rather to provide a validated model that will enable a judgment to be made on the intervention being evaluated. Called theory-based, theory-driven, theory-anchored, or theory-oriented evaluations (Blinded for review; Coryn et al., 2011; Donaldson, 2007; Rogers, 2007), they are all aimed at reinforcing the explanatory power of evaluations (Weiss, 1997).
Program theory evaluation has historically attracted considerable interest in the field of evaluation. In the early days, this enthusiasm prompted Bickman (1987, 1990) and Rogers et al. (2000) to publish three issues of the journal New Directions for Evaluation in which the benefits, difficulties, and advances of these types of evaluations were discussed. Since then, various trends have been noted in the field of evaluation: an expansion in the number of evaluations identifying themselves as theory-based evaluations, an increased application of theoretically developed approaches giving prominence to program theory, and a growing body of literature presenting critical or synthetic analyses of theory-based evaluations (Coryn et al., 2011; Donaldson, 2007; Sridharan and Nakaima, 2012).
Since the 2000s, there has been a consolidation of approaches based on program theory. We can trace the methodological literature describing the approaches and their applications, as well as the literature that has strengthened the theoretical and methodological aspects by incorporating lessons learned from those applications. We will not here systematically analyze the practice of theory-driven evaluations, as has been done by Coryn et al. (2011); rather, we will analyze some of the approaches that have become more common over the past 20 years, to understand their foundations and compare them. Our objective is to attempt an in-depth reading of this movement to discern more clearly its core components and variations.
First, we present the methodological approach and its results by describing the selected evaluative approaches and where they stand in relation to the three foundations identified by Shadish et al. (1991): knowledge construction, valuing, and use. We then discuss the consequences of these results in terms of the consolidation of theories in the evaluation field.
Framework for comparing theory-based approaches
Chen (1990: 17) defined theory as “a frame of reference that helps humans to understand their world and to function in it.” Program theory is defined as “a specification of what must be done to achieve the desired goals, what other important impacts may also be anticipated, and how these goals and impacts would be generated” (Chen, 1990: 43). In the literature, definitions, boundaries, and differences among program theory, intervention theory, and theory of change are not clear-cut (Funnell and Rogers, 2011; Weiss, 1997). As Weiss (1997) suggested “to keep the terminology simple”, here we will use the term program theory to represent all causal pathways, including mechanisms and influences. Theory-based evaluations are based largely on program theory and are aimed at refining it using a variety of evaluative devices. A theory-based approach to evaluation represents “any evaluation strategy or approach that explicitly integrates and uses stakeholder, social science, some combination of, or other types of theories in conceptualizing, designing, conducting, interpreting, and applying an evaluation” (Coryn et al., 2011: 201). Program theory is the linchpin of the theory-based approach to evaluation; it describes how inputs, activities, and outputs centred on the program’s process theory lead to immediate, intermediate, and final outcomes or impacts (Chen, 1990; Shadish et al., 1991).
Our objective is to analyze some of the approaches that have been conceived, applied, and adapted over recent years, and not to produce an exhaustive analysis of theory-based evaluations. Specifically, we will analyze three approaches—logic analysis, contribution analysis, and realist evaluation—by examining foundational articles, as well as articles that have proposed their refinement. Each approach will be analyzed in relation to the dimensions identified by Shadish et al. (1991) and taken up by Alkin (2004), which are knowledge construction, valuing, and use. Shadish et al. (1991) identified these as the foundational dimensions upon which all evaluation approaches should be assessed in terms of their positioning to understand in what ways they constitute a theory, as well as to appraise their similarities and differences.
The knowledge construction component addresses the issue of determining how evaluators construct reliable knowledge (Shadish et al., 1991). They approach this concept in relation to ontology, epistemology, and methodology. Ontology relates to positioning with regard to the state of reality of the subject being evaluated. It can be classified into three views: “1) the reality exists and is governed by immutable natural laws that are knowable; 2) the reality exists, but cannot be grasped objectively in all its complexity; and 3) the reality is a mental, social, or experimental construct, and is therefore multifaceted” (Champagne et al., 2011: 255, authors’ translation). Epistemological positioning has to do with the relationship between the evaluator and the object of evaluation and standards of knowledge development. That relationship can be purely objective or subjective, or objective while incorporating contextual subjectivity, which could allow the use of evaluation results to be extrapolated to similar contexts (Champagne et al., 2011; Shadish et al., 1991). Methodological positioning refers to the technical devices and mechanisms used in the evaluation to develop knowledge (Champagne et al., 2011).
Valuing is central in evaluation. This component clarifies the evaluator’s position regarding values (Shadish et al., 1991). Three objectives are possible: the development of a meta-theory – “the study of the nature of and justification for valuing” (Shadish et al., 1991: 48); a prescriptive theory; and/or a descriptive theory (Alkin, 2004; Champagne et al., 2011; Chen, 1990; Shadish et al., 1991). A meta-theory uses a reliable and structured method to substantiate the value’s foundation and justification (Shadish et al., 1991). A descriptive theory describes the values without necessarily assigning any supremacy (Chen, 1990; Shadish et al., 1991), whereas a prescriptive theory relates particularly to the supremacy of certain values when it indicates “what ought to be done or how to do something better” (Chen, 1990: 40). A prescriptive theory is characterized by a directed action, a formal conceptualization of the intervention, and an implementation strategy, as well as a range of options from which to select criteria related to outcomes (Chen, 1990).
The use component of evaluation theory refers to the utility of evaluative assessment in resolving social problems (Alkin, 2004; Shadish et al., 1991). This dimension is made up of three essential elements: intention, types of use, and ways of fostering use (Alkin, 2004; Champagne et al., 2011; Chen, 1990; Shadish et al., 1991). The intention of the evaluation is formative when data are collected prospectively and the results are used to inform decision-making, the implementation process, quality assurance, and reporting. Evaluation has a summative role when information is collected retrospectively to determine whether or not the intervention should be repeated (Stufflebeam and Coryn, 2014). The intention is developmental when the objective of the evaluation is to foster the emergence of social innovation and the intervention’s transformation in a dynamic environment (Patton, 2011). Although presented in the literature as exclusive categories, in practice we observe overlaps between these different intentions. They are more aptly considered to be abstract categories for analyzing approaches than exclusive categories into which evaluations fit. There are three types of use: conceptual, symbolic, and instrumental (Alkin, 2005; Champagne et al., 2011; Chen, 1990; Patton, 1988; Shadish et al., 1991; Weiss, 1988). Conceptual use refers to changes in the conceptualization of a problem, of the intervention, or of potential solutions. Symbolic use is intended to demonstrate or legitimize certain positions that have been predetermined by the stakeholders. Instrumental use refers to real and programmatic changes related to the results of the evaluation, expressed in terms of decisions or their implementation.
Ways of fostering use refer to all methods employed (Shadish et al., 1991), whether those are methodological processes to foster the transfer of results into other contexts, the commitment of stakeholders and their control over the evaluation process, or the means used to disseminate results.
Synopsis of selected approaches
Logic analysis is “a type of program theory evaluation that uses scientific knowledge to evaluate the validity of the intervention’s theory and identify promising alternatives to achieve the desired effects” (Rey et al., 2012: 62). It is used to test the plausibility of the program theory (Brousselle and Champagne, 2011; Champagne et al., 2011). It sheds light on the program’s strengths and weaknesses, elucidates the links between the program’s design and the production of desired outcomes, and identifies contextual influences (Brousselle and Champagne, 2011; Contandriopoulos et al., 2015; Rey et al., 2012; Tremblay et al., 2013). Logic analysis examines: “(a) the important characteristics the interventions must have to achieve the effects and (b) the critical conditions required to facilitate the implementation and produce the effects” (Rey et al., 2012: 63). There are two types of logic analysis: direct and reverse (Rey et al., 2012). Direct logic analysis scrutinizes essential characteristics of the intervention and critical conditions leading to desired or other effects. It has important similarities with realist review, which documents the links between context, mechanisms, and outcomes using existing literature in the scientific and empirical fields (Pawson and Tilley, 2004; Pawson et al., 2005). Here we chose logic analysis because it encompasses both direct and reverse logic analysis. Reverse logic analysis explores the best means to attain desired outcomes or other outcomes (Brousselle and Champagne, 2011; Rey et al., 2012). Logic analysis relies on available scientific knowledge, either evidence-based or expert knowledge (Brousselle and Champagne, 2011; Blinded for review; Contandriopoulos et al., 2008). Both direct and reverse logic analysis involve three steps: 1) building the logic model; 2) developing the conceptual framework; and 3) evaluating the program theory (Brousselle and Champagne, 2011).
Contribution analysis is an effect analysis approach (Mayne, 2001, 2008, 2011, 2012a). It examines, through credible causal claims, the contribution rather than the attribution a complex program is making to expected outcomes and impacts in complex settings (Delahais and Toulemonde, 2012; Mayne, 2011, 2012a, 2012b, 2015). Contribution analysis involves six key steps: 1) setting out the cause–effect issue to be addressed; 2) developing the postulated theory of change and risks to it, including rival explanations; 3) gathering evidence on the theory of change; 4) assembling and assessing the contribution claim and challenges to it; 5) seeking out additional evidence; and 6) revising and strengthening the contribution story (Mayne, 2012a: 272). It aims to “infer plausible association between the program and a set of relevant outcomes by means of systematic inquiry” (Lemire et al., 2012: 295). Dybdal et al. (2010) indicate that contribution analysis ascertains the program’s contribution by establishing the postulated theory of change, identifying key threats to impacts pathways, establishing other contributing factors, and evaluating the principal rival explanations. It considers uncertainty in evaluating complex dynamic programs (Biggs et al., 2014). Five criteria with regard to the embedded theory of change must be met to infer causality for the program: “1) plausibility of the theory of change; 2) implementation as outlined in the theory of change; 3) evidentiary confirmation of key elements; 4) identification and examination of other influencing factors; and 5) the extent to which key alternative explanations have been disproved” (Mayne, 2011: 7). These steps can be supplemented by the use of the Relevant Explanation Finder (Biggs et al., 2014; Lemire et al., 2012), a tool that helps to clearly identify the factors influencing the chain of impacts, as well as alternative explanations.
Realist evaluation assesses complex programs by probing what works, for whom, and under what circumstances (Pawson and Tilley, 1997, 2004). Realist evaluation involves four core steps: 1) articulating the program theories to be tested; 2) collecting data to test the hypotheses; 3) testing the hypotheses; and 4) interpreting and refining them (Mehdipanah et al., 2015; Pawson and Tilley, 1997, 2004; Ranmuthugala et al., 2011; Salter and Kothari, 2014). It uncovers underlying implicit or explanatory theory leading to the program and its multiple components, and it identifies contextual factors that spearhead pathways of change to produce expected outcomes (Jagosh et al., 2015; Pawson, 2002; Pawson and Tilley, 1997, 2004; Ridde et al., 2012; Salter and Kothari, 2014). It is a logic of inquiry that illuminates the program theory underlying the inherent characteristics of program implementation (Hewitt et al., 2012; Pawson and Tilley, 1997, 2004) to investigate the generative mechanisms associated with the program (M), the contexts under which the pathways operate (C), and the ways in which outcomes occur (O) (Salter and Kothari, 2014). Context–mechanism–outcome (CMO) configurations, as outlined by Pawson and Tilley (1997, 2004), foster the examination of recurrent patterns in the midst of complex social reality through in-depth explanations of causal pathways. This helps the evaluator articulate the program theory to be investigated and test hypotheses to produce transferable advice based on that theory and to inform decisions as well as evidence-based policy-making processes (Hewitt et al., 2012).
Positioning of the approaches
Logic analysis, realist evaluation, and contribution analysis all have their foundations in the critical realism paradigm (Hewitt et al., 2012; Mayne, 2015; Pawson and Tilley, 1997; Pawson et al., 2004; Ranmuthugala et al., 2011; Rey et al., 2012; Salter and Kothari, 2014; Tremblay et al., 2013). Critical realism is founded on “ontological realism, epistemological relativism and judgmental rationality” (Groff, 2004: 10). The discussion presented here is largely inspired by the work of Bhaskar (2008) and other authors who built on his ideas.
All these approaches hold that reality exists and is governed by natural laws, but that it can be understood only partially and cannot be captured in all its detail. An action is causal only when its effect is activated by a particular mechanism in a particular context (Pawson and Tilley, 1997). According to Pawson and Tilley, “this proposition—causal outcomes follow from mechanisms acting in contexts—is the axiomatic base upon which all realist explanation builds” (1997: 58). This principle underlies generative causation, upon which all the aforementioned approaches are based. Consequently, realism is the cornerstone of realist evaluation (Jamal et al., 2015; Pawson and Tilley, 1997, 2004; Ridde et al., 2012; Salter and Kothari, 2014). Logic analysis, realist evaluation, and contribution analysis also have the power to explain unexpected outcomes by uncovering mechanisms hidden beneath the surface.
Logic analysis examines critical features of the main intervention or alternatives, as well as pillar conditions that might explain how the intervention leads to desired or other outcomes. Social interventions are socially mediated and conveyed by human agents, such that the trajectory from causal mechanisms to outcomes is nonlinear, unstable, and recursive. Logic analysis helps to uncover causal pathways that may be observable, or else discernible but not always perceptible, or even hidden. It uses multiple methods to gather knowledge from different domains and various types of data to show how the intervention causes social benefit and desired effects. It is able to shed light on unexpected results, such as how resistance to change impedes achievement of desired outcomes. Even if the foundational literature on logic analysis is not always clear, the arguments presented above, when carefully considered, support the notion that logic analysis is within the stream of critical realism. They also explain how an evaluator can grasp knowledge about causal pathways underlying the intervention.
As discussed above, contribution analysis assesses an intervention’s contribution by taking into account other explanatory factors and rival explanations. Realist explanatory mechanisms operate in three ways: 1) by reflecting the intervention’s embeddedness within the structured essence of social reality; 2) by providing statements that describe how both macro and micro mechanisms impact the intervention; and 3) by demonstrating how the intervention outputs are generated from stakeholders’ activities and resources (Pawson and Tilley, 1997). “Contributory causes are called INUS causes: an Insufficient but Necessary part of a condition that is itself Unnecessary but Sufficient for the occurrence of the effect” (Mackie, 1965: 245). Contribution analysis is very useful for studying interventions nested in multilevel systems that are commonly characterized by uncertainties, turbulence, and dynamic mechanisms operating in an open system (Mayne, 2015). The philosophy underlying contribution analysis is centred on critical realism, making it a powerful tool for assessing underlying assumptions and risks behind causal links and for identifying other key competitors, as well as unexpected results in the outcomes chain, inasmuch as it integrates the three previously mentioned layers of ontological realism to infer a causal relationship between intervention and outcomes. Regularities in logic analysis and realist evaluation, as well as in contribution analysis, are the causal pathways that have been generated, which represent the chain of impacts and contextual factors that are likely to influence the production of effects (Easton, 2010; Mayne, 2012b).
All the approaches share the same
Whereas each approach addresses a different evaluative question, they all proceed in the same way: their research design is a stepwise approach that includes constructing program theories and then enriching and validating them based on knowledge derived from various sources of data and fields of knowledge. The fact that each approach addresses a different question explains why their main use intentions (formative, summative, or developmental) are different.
Table 1 shows how each approach is positioned in relation to the dimensions of knowledge construction (ontological, judgment rationality, epistemological, methodological positions), valuing, and use.
Table 1. Comparative positioning of logic analysis, contribution analysis, and realist evaluation.
Framing the existence of a new theory in evaluation and the rise of the 5th generation in evaluation
The three approaches—logic analysis, contribution analysis, and realist evaluation—selected because they all have a strong focus on program theory, present one particular feature of interest: while they have similar bases for characterizing and differentiating evaluative theories, the evaluative questions they address are complementary. They are empirically tested approaches intended to respond to the main evaluative questions by: 1) assessing the plausibility of the intervention (logic analysis); 2) analyzing an intervention’s effects (contribution analysis); and 3) analyzing the intervention’s implementation (realist evaluation). However, the boundaries between them are not clear cut.
In fact, the three approaches have important overlaps that make them non-exclusive and complementary. For example, with respect to program theory, one can easily argue that direct logic analysis addresses mainly the resources–activities–outcomes mechanisms while contribution analysis puts the emphasis on documenting the outputs–outcomes–impacts chain, making these two approaches complementary components of the program theory. With a logic analysis further scrutinizing consequences as impacts, what would be the difference between a contribution analysis and a logic analysis? The answer is probably the stage at which the evaluation is done, with logic analysis staying at a more theoretical level, not documenting observed effects, whereas contribution analysis explains observed and documented effects of the intervention.
The same kind of argument could be made regarding differences between realist evaluation and contribution analysis. As both aim to document context–processes–effects mechanisms, aren’t they similar approaches? It is quite obvious in reading about these approaches that they are different and offer complementary evaluation perspectives, but their differentiation is not as clear cut as we would hope, which is not surprising as, in the end, the objective is to build an evidence-based program theory.
A third example would be the differences and similarities between logic analysis and realist evaluation. Realist evaluation relies mainly on empirical data and observations; when the source of information shifts to written data, the type of evaluation done—realist review—is similar to direct logic analysis.
As we have seen above, logic analysis, contribution analysis, and realist evaluation share several foundations related to ontology, judgment rationality, epistemology, valuing, and components of the use dimension. There are methodological differences in how they construct a valid program theory while responding to complementary but still different evaluation questions. Differences in their use are due to the fact that they address different types of questions. In our view, the coherence in the positionings of logic analysis, realist evaluation, and contribution analysis, combined with their complementarity, suggests strongly that these are the components of a new emerging theory in evaluation. Clearly, further work on this topic would help to deepen the understanding of their links and complementarity in terms of evaluation questions. This work would lay the groundwork for describing this new theory, which represents a departure from the existing evaluative questions of plausibility, effect, and implementation analysis.
If we analyze the construction of theory-based evaluations from a historical perspective, the consolidation of a new evaluative theory might even mark the emergence of a fifth generation in evaluation: the explanation generation. In fact, Guba and Lincoln (1989) identified, in the history of the domain of evaluation, four generations of evaluation—measurement, description, judgment, and pluralism—the fourth being the only one marking a real paradigmatic shift. We would not go so far as to refer to paradigm change when describing the movement coalescing around the consolidation of theory-based evaluative approaches. Nevertheless, similarities in positioning around ‘anomalies’ are another element confirming the rise of a fifth generation. Here we observe the same phenomenon as for the first three generations: approaches sharing the same paradigmatic foundations are emerging in response to certain challenges that cannot be resolved using current mainstream approaches. The first proponents of theory-based evaluations were, in fact, reacting to black-box evaluations. Suchman (1967), Weiss (1972), and Chen and Rossi (1983) proposed that evaluations would have greater explanatory power if they included a well-founded logic model that detailed the action mechanisms of the intervention being evaluated. Chen (1990: 18) pointed out that “black box evaluations may provide a gross assessment of whether or not a program works but fail to identify the underlying mechanisms that generate the treatment effects, thus failing to pinpoint the deficiencies of the program for future program improvement and development” and that they are not sensitive to the influence of political and organizational contexts.
The legitimacy of this type of evaluation was greatly advanced by Weiss (1972) and Suchman (1967), who noted that failure to find program effects could, when not attributable to faulty evaluation design, be due to either inadequate implementation or wrong theory (Bickman, 1987; Birckmayer and Weiss, 2000; Blinded for review; Chen, 2004; Weiss, 2007).
Constructing and analyzing program theory also appears to be an essential method for resolving the problems inherent in complex interventions (Dubois et al., 2012; Morell, 2010): How to deal with uncertainty created by interdependency among numerous actors who are constantly evolving and adapting? How to adapt to non-linear and sometimes unpredictable relationships? How to assess emergent and unanticipated outcomes resulting from relationships that are sometimes non-linear? (Morell, 2010; Shiell et al., 2008). Morell (2010) points out that program theory is crucial for evaluating this type of intervention because it helps reduce uncertainty. In 2012, Dubois and colleagues, in a special issue of the Canadian Journal of Program Evaluation on the evaluation of complex interventions (Houle et al., 2012), showed that evaluators employed program theory in their evaluative practices as a core approach to better encompass the complexity of interventions (Dubois et al., 2012; Zimmerman et al., 2012). The explanatory power of a theory is helpful in anticipating the unexpected, understanding surprising phenomena, and reducing uncertainty (Morell, 2010). Likewise, the fifth generation underscores the explanatory power of contextual characteristics, implementation processes, and causal pathways to show, by identifying expected effects and impacts, how an intervention’s activities and outputs led to outcomes. As the intervention unfolds, several implicit causal mechanisms result in the cumulative success or failure of the entire intervention or some of its components. Theory-based approaches to evaluation are used to shed light on these mechanisms that operate in open systems and are embedded in multiple social systems.
Our analysis suggests a new reading of theory-based approaches. This reading is not based on an exhaustive analysis of approaches, nor has it been validated by different theory designers. These two elements could bring significant information to the discussion.
Conclusion
Evaluation theoreticians often deplore the fact that new theories emerge without having been tested on the ground. The analysis of several theory-based approaches that have already been tested reveals the extent to which these approaches share similar foundations while being complementary in terms of the evaluation questions to which they can be applied. Together they form the basis of a new theory in evaluation that could, in light of its historical evolution, mark the emergence of a fifth generation in evaluation.
Acknowledgements
We are thankful to John Mayne and two anonymous reviewers who commented on the draft of this article, bringing useful insights that contributed to improving the manuscript.
Funding
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
