Abstract
George Grob presented the fifth and final Eleanor Chelimsky Forum address at the 2017 annual meeting of the Eastern Evaluation Research Society. In this commentary, I respond to several points that George raises in the American Journal of Evaluation paper based on that address. An overarching theme of my comments involves the potential for strengthening the linkages between evaluation theory and evaluation practice. Research on evaluation should also be added to this pairing.
George begins his paper by alluding to a song from Fiddler on the Roof. I did not realize in advance that the papers for this year’s Chelimsky Forum were supposed to begin with a reference from a musical. Unfortunately, musical theater is not my strongest Jeopardy category. Perhaps an apt lyric for my presentation comes from the musical Oklahoma!: “the farmer and the cow[poke] should be friends.” I will return to this lyric but first turn to the origins of this forum and then to comments inspired by George Grob’s outstanding paper.
As George noted, the Forum was inspired by Chelimsky’s (2012) address to the annual meeting of the Eastern Evaluation Research Society (EERS). In that paper, Eleanor offered suggestions aimed at increasing the interplay between what we refer to as evaluation theory and evaluation practice. In essence, Eleanor advocated that we find processes whereby, first, the kind of people who get labeled as evaluation theorists would hear from the kind of people we call evaluation practitioners about the sorts of problems typically encountered in evaluation practice, and second, the two groups could engage in dialogue about possible resolutions to these problems.
Eleanor offered four specific suggestions “for practitioners and theorists to reflect together on…problems experienced in practice, along with [the resulting suggestions for] theoretical modifications, efforts at resolution, and follow-up” (Chelimsky, 2012, p. 11). Eleanor’s suggestions were (a) a specialized, ongoing forum; (b) brief reports written when an evaluation finds a conflict with theory; (c) a blog, listserv, or other electronic gathering; and (d) an annual debate…on a specific “balance” issue. As to the first of these suggestions, the ongoing forum, Eleanor indicated that the mission would be “the presentation of often-encountered practitioner experience that appears to challenge theory in some explicit or implicit way” (Chelimsky, 2012, p. 11). The aim of the ongoing forum would be to “surface the various unresolved issues, commonly raised by theorists and practitioners, for discussion by experts in theory and practice, along with a panel of diverse, experienced evaluators” (Chelimsky, 2012, p. 11).
In a sense, the distinction between evaluation theorists and evaluation practitioners is fictive. All or at least almost all of the people who are thought of as evaluation theorists spend part of their time doing evaluation. Indeed, theorists’ practice experiences are likely to be a major source of their ideas as well as a test bed for trying out any new approach they may devise. As to evaluation practitioners, they have, at least implicitly, some kind of mental model or theory of evaluation that serves to help guide their practice.
Despite its arbitrariness, there is value to the distinction between evaluation theorists and practitioners at least in terms of relative emphasis. There are outstanding practitioners, for example, who may have detailed perspectives on evaluation but do not try to disseminate them. In contrast, there are others, called evaluation theorists, who invest much of their time writing and speaking about their ideas regarding how evaluation should be done.
Reading Eleanor’s suggestions for more productive exchanges between evaluation theorists and practitioners reminds me of comments Ernie House made in the context of the so-called qualitative–quantitative (or paradigm) wars. House (1994) suggested that a good deal of human capital was misdirected toward that debate and instead could have been used in far more productive ways. In this light, Eleanor’s suggestions can be viewed as a set of ways to try to ensure that evaluation theorists’ work is directed toward issues that do make a difference in evaluation practice. With 2017 being the last Chelimsky Forum, it seems worthwhile for us collectively to think about additional ways to achieve Eleanor’s aim, so that those writing about evaluation can gain input from full-time practitioners about which matters are especially worthy of attention.
In the remainder of this article, I offer a set of thoughts stimulated by George Grob’s excellent paper. Given the motivation Eleanor Chelimsky had when she inspired this Forum, my comments in large part are drawn from or at least linked to what might be called evaluation theory.
Schools of Thought
Early in his paper, George notes that various “schools of thought” about evaluation have emerged, and varied methods have been employed. This observation leads me to put in a brief plug for evaluation theory. Why should practitioners bother to be familiar with evaluation theory? The primary answer is that evaluation theory can serve as a guide to practice. It can help navigate the choices associated with different schools of thought and varied method options. As Will Shadish said in 1998 in his presidential address to the American Evaluation Association, “Without evaluation theory, evaluation practice is little more than a collection of methods and techniques without guiding principles for their application” (Shadish, 1998, p. 13). In other words, what the field needs is people who know when, where, why, and how different methods could and should be used in evaluation practice. Theory tells us that.
Even more valuable is familiarity with multiple evaluation theories or, put differently, being multilingual with respect to evaluation theories. This gives evaluators an understanding of different rationales and perspectives regarding when, where, why, and how different methods could and should be used. Understanding these different perspectives can increase evaluators’ value, relative to others with similar methodological skills but without the same conceptual frameworks to aid in thinking about when, where, why, and how to use particular methods for an evaluation. Evaluation theory’s potential as a guide to evaluation practice is an important benefit but not the only one. Shadish (1998) also highlights that evaluation theory is, or at least should be, an important part of our very identity as evaluators.
Eleanor Chelimsky’s suggestions in her 2012 EERS paper were aimed at helping to focus evaluation theorists’ attention on current challenges in evaluation practice as well as to encourage dialogue between theorists and practitioners on these challenges and suggested solutions. Given the potential benefits of evaluation theory, perhaps it is worth thinking about interchanges on a broader range of topics. For instance, evaluation conferences might routinely include sessions on general implications of evaluation theory for practice. Given the many pathways by which people come to be evaluation practitioners, and the numerous people who come into evaluation each year, such sessions could be a worthwhile addition to evaluation capacity building.
Proof
George organizes the body of his paper in five sections. In the first of these, George reflects on the “Adequacy of Proof,” or as he alternatively phrased it, “Understanding ‘Proof’ and the ‘Truth.’” Evaluation theorists have long been concerned about what constitutes credible, valid, or trustworthy evidence. This is reflected, for example, in the review of evaluation theorists presented by Shadish, Cook, and Leviton in their 1991 book, Foundations of Program Evaluation: Theories of Practice. Shadish, Cook, and Leviton (1991) contend that good evaluation theories need to address five issues. One of these is knowledge construction, that is, how we construct knowledge and justify knowledge claims. Evaluation theorists have drawn on a substantial range of perspectives on knowledge construction from the critical realism of Campbell to the social constructivism of Guba and Lincoln. These theoretical perspectives lead, respectively, to such evaluation practices as ruling out plausible alternative explanations by experimental or quasi-experimental design and conducting member checks to verify inferences.
George presents a more pragmatic account in his “standard professional evaluation model” and “universal proof.” This account is not based on an abstract philosophical school but on what George has found to be effective in practice over the years and in various settings. That experience has led George to be an advocate of mixed methods, that is, the combination of quantitative and qualitative methods, in evaluation. Evaluation theorists such as Greene (2007) also endorse the use of mixed methods. On this point, however, I will offer a minor dissent from George’s position. While the combination of qualitative and quantitative methods will often make sense, I don’t think this is universal. For instance, the use of two quantitative or two qualitative methods may at times be preferable.
As something of an aside, the other four components of the Shadish et al. (1991) model are social programming, which includes a perspective on how programs operate and the role of programs in social change; valuing, including how one should explicate value issues and select the values to which to attend in an evaluation and its interpretation; use, involving what kinds of use are most important (when) and what the evaluator should do to facilitate use (under various conditions); and practice, a component that draws on the other components and indicates how in particular to do evaluation and how the profession of evaluation should be arranged. This five component model can serve as a framework for comparing the contents of different evaluation theories as shown in Shadish et al. In my experience, it is also valuable for evaluators to be familiar with the framework in and of itself. For example, relatively new evaluators often have not thought about the social programming component (or the equivalent for those evaluating something other than social programs). Attention to the nature and role of the program (or other object of evaluation) can be quite useful, such as by helping to identify possible times or places where evaluative information may have leverage to make a difference.
Evaluator Independence
George turns from the topic of proof to that of evaluator independence. Discussions of evaluator independence often focus on organizational arrangements such as whether an evaluator is internal or external and what the reporting line is for an evaluation unit. George expands upon the usual discussion, emphasizing instead the personal attributes and skill set of the evaluator. George indicates that his reflections on evaluator independence have led him “to look inward rather than outward.…[And from this exercise he] concluded that threats to independence are fundamentally within the evaluator rather than the client” (Grob, 2018, p. 127). In this way, George’s perspective on evaluator independence is related to work on evaluator competencies (e.g., Stevahn, King, Ghere, & Minnema, 2005). That is, George emphasizes the way in which evaluator independence depends on the characteristics and skills of the evaluator.
One of the challenges with the “inward” approach to evaluator independence is that, especially for those who are not directly involved with the evaluator, it can be difficult to convey convincingly that the evaluator’s behavior and personal style beget independence. In contrast, it is easy to convey a structural arrangement such as that an agency has contracted with an external evaluator or engaged a meta-evaluator (Scriven, 2009). While easier to communicate, these structural arrangements, as thoughtful advocates such as Scriven acknowledge, are not in and of themselves full guarantees of evaluator independence. An external evaluator can have conflicts of interest, such as the desire for future contracts.
Evaluators probably could give more attention to how we could better demonstrate and convey the kind of inward-focused independence George describes, especially when communicating outside of smaller networks. In any case, there is wisdom in George’s concluding thought on this subject, “The most effective way to protect your independence is to be independent” (2018, p. 129).
Cheap Evaluation
George notes that it is common both for funded projects to require evaluation and for the evaluation budget to be some fixed percentage of the overall project budget. As George points out, this can result in small evaluation budgets that create severe constraints on the evaluation work that can be done. As possible approaches to “evaluation on the cheap,” he suggests the evaluator serve as a kind of coach or evaluation consultant to grantee staff. This is one potentially good approach in the face of severe budget constraints.
Other options have been pointed out by various evaluation theorists. For example, formative evaluation will often be cheaper, and possibly more flexible, than summative, so one might do formative work to the extent the limited budget allows. Wholey’s rapid feedback evaluation is another option, with an evaluation expert drawing conclusions and making recommendations based on existing program data as well as on observations and other evidence collected during a site visit. Evaluability assessment, as described by Wholey and others, might also be appropriate, establishing whether the program or project is ready for more intensive summative evaluation. Alternatively, capacity building might be called for, such as in the development of better data systems (for a review of the various evaluation approaches Wholey has suggested, see Shadish, Cook, & Leviton, 1991).
We in the evaluation community might also do more in terms of offering judgments about funders’ practices, including with respect to their approach to mandating and funding evaluation. An argument can be made, I think, against requiring a fixed percentage for project evaluation. Projects vary in terms of the extent to which they provide opportunities for important learning, which may at best be weakly related to overall budget size. More generally, we evaluators could probably do far more in terms of making recommendations about future evaluations in our reports and other interactions with funders. The educative function of evaluation, which Cronbach wrote about long ago, should apply to lessons about evaluation as well. Future efforts in evaluation theory and practice might include more focus on evaluators’ relationships with others, including those who set the policies or make ad hoc decisions that directly affect evaluation (Mark, Cooksy, & Trochim, 2009).
Evaluation Reports
George refers to new and innovative ways for sharing evaluation findings, in contrast with the standard paper report with technical appendices from years ago. I offer three additions to his comments about reports. First, evaluation findings may be more consequential when presented in broader frameworks. A program’s effectiveness, for example, might be discussed in terms of a broader class of interventions to which the program belongs. Or findings could be laid out in the context of both other evaluations and results from relevant research literatures. Second, the findings that matter may not always be the results about the focal evaluation question(s). For example, early evaluations of interventions for the homeless appear to have been influential partly because of descriptive information about the homeless. Early welfare reform evaluations were probably important partly because they simply demonstrated the feasibility of successful implementation of alternatives to traditional welfare. Again, the “core message[s]” and “concise ‘takeaways’” to which George refers might not be based only on findings about the key evaluation question(s). Third, important consequences sometimes may arise because of the procedural attributes of the evaluation. For example, the widespread measurement of children’s well-being as an outcome in several state-level welfare reform evaluations may make children’s well-being a more salient consideration in subsequent policy deliberations. Mark and Mills (2007) describe such effects as “procedural influence,” expanding on Patton’s (1997) notion of process use.
Evaluator Development and Careers
The final issue that George raises, involving careers and career development for evaluators, falls within the practice component of Shadish et al.’s (1991) five-component model of evaluation theory. My impression is that the training issue, like professional issues in general, has received less attention from evaluation theorists than most of the issues included in the Shadish et al. model. A notable exception is the final chapter in Cronbach and associates’ (1980) Toward Reform of Program Evaluation, a substantial portion of which is devoted to discussion of evaluator training. In part, this chapter laid out a plan for training “a kind of evaluator-for-all-seasons.” For this all-purpose evaluator, Cronbach and colleagues supported providing various evaluation classes and experiences within the context of PhD-level training in a relevant discipline. They also noted the need for training for those who would have narrower responsibilities.
Elaborating on the idea that different training and professional development experiences are needed for various groups, different experiences are likely needed depending on what pathway a person has taken into evaluation. Some new evaluators are trained in evaluation, while other newcomers have strong backgrounds in research methods but no evaluation training per se, and yet others enter evaluation from service provision or administration. Different training and professional development experiences are also likely to be desirable for people with different expected levels of involvement with evaluation. Different training options might be appropriate for the long-term professional evaluator (which was more George’s focus) than for the occasional and/or short-term evaluator, and still different professional development would be fitting for evaluation users.
Like the issue of evaluators’ training and professional development, the issues of evaluator independence (and the more general issue of evaluator role) and of evaluation funding practices fall in the professional issues portion of Shadish et al.’s practice component. A case can probably be made that such issues deserve more attention from contemporary evaluation theorists and others. This may be especially true for education for evaluation users and for those who set the policies and practices that guide and constrain evaluation practice.
The Farmer and the Cowpoke
The song from the musical Oklahoma! is based on the idea of conflict between farmers who wanted fences to protect their fields of crops and those who raised cattle and so instead preferred open ranges. When I first heard the song, growing up in a rural area in central Nebraska, this made no sense to me. Conflicts about wide-open ranges were unheard of. Most people in the area raised crops and cattle. The farmer and the cowpoke didn’t need encouragement to be friends; they were usually one and the same person. When this wasn’t true, the interdependencies were obvious. A person who exclusively or predominantly raised cattle needed hay, corn silage, and other cattle feed from those who were more focused on crops.
Perhaps I’ve taken too seriously the implicit challenge from George to make use of an analogy from musical theater. Or perhaps we can expand on Eleanor Chelimsky’s valuable suggestions about how to increase communication between theorists and practitioners. Perhaps we can encourage more members of the evaluation community to be involved in evaluation theory. We might adopt a slogan, “Evaluation theory is too important to be left in the hands of a few people labeled as evaluation theorists.” The tendency toward exclusivity, at least in the past, is evident in the Shadish et al. (1991) book, where the chapters describe the perspectives of seven major evaluation theorists. Evaluation theory building should be the responsibility of a wider portion of the evaluation community. This is probably even more the case for theory testing and revision, which can involve case studies of the experiences of single evaluations but alternatively can involve an array of methods used in research on evaluation. Thinking of the scene from Oklahoma!, we need another group, in addition to farmers and cowpokes, on the dance floor. That is, those doing research on evaluation need to be added to the mix.
We might also try to shift focus away from evaluation theories, which are expected to have something of a soup-to-nuts comprehensiveness. People often talk about evaluation theories with the expectation that they should address most if not all of the five components described by Shadish et al. (1991). Instead, people might more commonly concentrate on one or more theoretical issues, such as stakeholder participation, evaluation question formulation, or use and influence. If evaluation theory can be developed, trialed in practice, and tested in research in smaller bite sizes, it becomes more feasible for a wider array of people to be involved in evaluation theory.
In closing, I will try to take Eleanor Chelimsky as a model—always an excellent idea—and offer four suggestions for increasing interplay on theory, practice, and research across (and within) members of the evaluation community. (a) Evaluation conferences might hold round table sessions (or another forum) for people who want to partner with others on some integration of practice, research on evaluation, and theory development or testing. For example, an evaluator about to begin a major evaluation might find a partner in someone who wants to test competing practice advice from two theories. (b) We could advocate for agencies and other funders of evaluation to offer supplementary funds when an explicit theory test is piggybacked on an evaluation proposal. (c) This journal (or others) could establish a practice–theory section or even better a theory–practice–research section with a priority for short and possibly untraditional pieces. This could include brief reports from evaluation practitioners who have garnered evidence about an evaluation theory issue from a recent evaluation. Lowering the cost of entry should increase the ability of practitioners to contribute their insights about evaluation theory issues. (d) One or more of the suggestions Eleanor made could be employed to go further in achieving the aims she suggested. Or the first or third suggestion here could be adapted toward that end.
The farmer and the cowpoke can be friends. The farmer and the cowpoke can collaborate. The farmer can also be a cowpoke. And the agricultural researcher, while not in the original musical, can similarly be in the mix.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
