Evaluation Practice and Theory

Abstract

Debra Rog presented the 2014 Eleanor Chelimsky Forum address, offering ways to integrate evaluation theory and practice by abundant use of practice examples. These examples illustrate the effective use of the Ladder of Abstraction from semantics, working from the concrete to the abstract and back again.

Keywords

evaluation policy, evaluation practice, evaluation theory, theory–practice relationship

The Eastern Evaluation Research Society founded the Eleanor Chelimsky Forum specifically to improve the dialogue between evaluation practice and theory. For her presentation to the Forum and her article for the American Journal of Evaluation, Debra Rog explored three themes, namely, infusing evaluation theory into practice, infusing practice into theory, and steps that the evaluation profession is taking to integrate the two. Integrating them is a vital function, important for any field of inquiry. Aristotle first made the formal distinction between epistêmê (theory) and technê (practice), and the Stanford Encyclopedia of Philosophy notes that they have “an intimate positive relationship” (Parry, 2014). As Pressman and Wildavsky (1984) put it in their seminal book on implementation, there is an important difference between “knowing that” certain features are desirable and “knowing how” to do anything about it. Knowing both theory and practice denotes excellence in evaluation. Debra’s article exemplifies this kind of excellence.

Infusing Theory Into Practice

Debra outlined how program theory, social science theory, and evaluation theory help us make sense of the evaluation context and offer guidance about how to create more successful evaluations. Her examples infused theory into practice very well indeed. Debra’s applications of social science theory in the areas of small group participation and psychological ownership represent important tools for the evaluator’s toolbox: they very directly help evaluators get high-quality data from stakeholders, settle on an evaluation question (where possible), and prevent derailing of the process. These observations represent generalized knowledge for evaluation practice! Debra’s “cupcake intervention” (to make sure that program staff feel appreciated) had an undeniable impact on data quality, although by itself it would be deemed too concrete for theory. Yet, as one of a set of activities under the more abstract rubric of “program staff engagement,” it was a brilliant addition. Her illustration of evaluation theory was even more compelling. Evaluation theory recognizes the dimensions of knowledge and utility. By engaging homeless shelters directly, rather than sitting on an evaluator’s “throne” to accept whatever participants were offered, Debra worked to assure a high-quality sample of participants so that better knowledge and utility were assured. This example drives home not only her identification as an evaluator but also the way that concrete practice impinges on our general understanding of what constitutes good evaluation.

Infusing Practice Into Theory

One of the five dimensions of evaluation theory that Shadish, Cook, and Leviton (1991) discussed was practice, so it sounds circular to say one should infuse something into itself. Yet this aspect of evaluation theory needs much more discussion and amplification. As Debra pointed out, expertise in evaluation is an individual professional’s generalized knowledge about practice, that is, the ability to recognize contexts and draw upon a wide-ranging repertoire of responses. It is not too far a step from individual expertise to more general, shared knowledge about evaluation. Theorists ranging from Weiss, to Rossi, to others in the current day, derived many of their ideas from their extensive experience of evaluation practice.

Debra distinguished general evaluation expertise from content expertise. We need a mix of both, just as an evaluation team could use collective expertise in sampling, management of data collection, writing of reports, and stakeholder engagement. Evaluators have different strengths, but together they can offer better evaluation capacity (Leviton, 2001). Content expertise is often an indispensable part of the mix, however. Having overseen more than 100 evaluations in my current role, I can state that it is sometimes disastrous to give evaluation contractors “on-the-job training” in a content area with which they lack familiarity. Debra described her ability to recognize patterns in the area of homelessness in a way that a generalist simply could not. Sometimes it is helpful for evaluators to offer a fresh perspective independent of content, so the idea of goal-free evaluation still has plenty of merit. But expertise on a topic should usually allow for a better test of assumptions, a better theory of change, and a product that is more useful to stakeholders than a content novice might offer.

Beyond content expertise, certain aspects of practice still seem to be grossly underreported. For example, I see very little on how to engage stakeholders well and efficiently, some exceptions being a paper by Preskill and Jones (2009) that was commissioned by the Robert Wood Johnson Foundation as well as the splendid chapter by Bryson and Patton (2010) and the systematic review by Brandon and Fukunaga (2014). On management of evaluations, there is little or nothing, although Debra herself authored a chapter on this subject for multisite evaluations (Rog, 2010). The lack of attention to management of evaluations may be due to the competitive advantage that contract research firms perceive if they keep management expertise to themselves. But guidance on the management of applied research, in general, is absent from the literature, and no one learns it except through craft knowledge. In 2010, Darlene Russ-Eft and I searched everywhere for writing on the management of applied social research. Except for very spotty discussions of the management of data collection in the survey research literature, we could not find it. Beyond the supervision of data collectors, inspection and cleaning of data, and organizing around stakeholder discussions, there is a lot to consider in the management area, ranging from appropriate budgets and timelines to human resource management and building a team with complementary expertise. The Robert Wood Johnson Foundation commissioned a guidance document on research management, especially to address timeline and budget, the most common problems in our experience (Nakashian, 2007).

For practice, we also need to discuss the impact of Requests for Proposals (RFPs) on the shaping of federal and private funder evaluations, and both the intended and unintended consequences of the RFP on the actual products. From the perspective of the evaluation profession, what should a good RFP look like? Do procurement regulations help or hinder the process? And how prescriptive should an RFP be, given the nature of the evaluand? Let me offer examples. As Scheirer (2012) has pointed out, programs have a life cycle, and evaluation focus needs to be matched to their stage of development. When they are brand new, they may require approaches such as empowerment, developmental, or good old-fashioned formative evaluation. As programs mature and become routinized, more conventional evaluation methods could follow. But RFPs for evaluation do not often recognize these distinctions. This may be a waste of taxpayer money. I could find very little on this topic, other than a highly useful and enduring chapter by Weidman (1977) that was reprinted in Eleanor Chelimsky’s (1985) edited volume on program evaluation for the American Society for Public Administration. Here again, Chelimsky appears as a major influence for good in our profession.

Private funders are not immune from flaws in the RFP process. I recently invited proposals to evaluate a major initiative of the Robert Wood Johnson Foundation. I invited firms and people who I knew had both content and methods expertise and would do a superb job. But several of these expert individuals and firms declined to bid on the evaluation. Why? Probably, I was overly prescriptive in the RFP about what the evaluation tasks were. I made these choices based on my literature review and the need to address stakeholders’ priority questions. But an overly prescriptive RFP may have precluded the evaluators’ opportunity to address the challenges using their unique expertise, thus defeating my purpose in selecting them. My lesson from this situation reflects Weidman’s (1985) first of 10 hints for better RFPs, that is, avoid unnecessary constraints. My version of this hint: Leave some scope for gifted evaluators to address the evaluation questions in their own way. It is a balance between prescription and trust in evaluators’ expertise.

Integrating Evaluation Theory and Practice

Debra suggested that Weick’s (1984) concept of small wins and large gains could help us integrate theory and practice. She offered a variety of evaluation venues, meetings, and institutes that can do so. But her own experiences, shared in the literature, offer much more specific examples of “small wins,” and their description connects with readers at both an experiential level and an abstract level. We find value in Patton’s (2008) Utilization-Focused Evaluation for much the same reason: it gives the reader a sense of how to address the big questions of evaluation based on specific practice examples. It is Patton’s ability to move easily between the general and the specific that makes him such an effective writer.

Based on the need to connect with readers at both the practice level and the theoretical level, here’s my own interpretation of small wins leading to big gains for this kind of integration. As in Debra’s article, they would best occur intentionally, bit by bit, reviewing specific examples of practice from diverse perspectives of theory and deliberately challenging theoretical statements on the basis of practice. This approach, frankly, would be a refreshing change of pace. Dreary, singular, one-by-one descriptions of evaluations are often a terrible letdown if the presenter does not relate the evaluation to the larger issues of the field. In the same way, the total abstraction I hear in so many discussions of evaluation theory is disappointing because no specifics are offered with which to test the abstractions. Do I believe this or that abstract statement about evaluation? On what basis and with what qualifiers?

Sometimes, in discussions of evaluation theories or approaches, it seems we are talking past each other with little understanding. The concept of the Ladder of Abstraction from semantics (Hayakawa & Hayakawa, 1992) helps us to understand what might be missing from our discussions of theory. Abstracting from specifics is something that human beings naturally do. Communicating well, however, requires that we (1) understand the level of abstraction that others are using and (2) ensure that the abstractions derive from the same specifics. I believe that evaluation theorists could better communicate with each other by more abundant use of, and mutual understanding of, the same specific evaluation practice issues. There are at least two potential resources for this purpose. First, Debra described several studies with evaluation practice as their empirical focus. She suggested that much more could be done to utilize these empirical results, both to test existing theory and to develop better theory. Second, we could do more to utilize specific case studies of evaluation, testing and elaborating on theoretical positions in light of those specifics. In other words, integration of theory and practice requires that we go up and down the Ladder of Abstraction repeatedly, using examples to create or evaluate generalizations. As Hayakawa and Hayakawa (1992, p. 93) put it, a speaker “whose high-level abstractions can systematically and surely be referred to lower-level abstractions is not only talking but saying something.”

Footnotes

Acknowledgment

I gratefully acknowledge collaboration with Darlene Russ-Eft on the question of evaluation management.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

References

Brandon, & Fukunaga, L. L. (2014). The state of the empirical research literature on stakeholder involvement in program evaluation. American Journal of Evaluation, 35, 26–44.

Bryson, J. M., & Patton, M. Q. (2010). Analyzing and engaging stakeholders. In J. S. Wholey, H. P. Hatry, & K. E. Newcomer (Eds.), The handbook of practical program evaluation (3rd ed., pp. 30–54). San Francisco, CA: Jossey-Bass.

Chelimsky, E. (Ed.). (1985). Program evaluation: Patterns and directions. Washington, DC: The American Society for Public Administration.

Hayakawa, S. I., & Hayakawa, A. R. (1992). Language in thought and action (5th ed.). San Diego, CA: Harcourt.

Leviton, L. C. (2001). Building evaluation’s collective capacity. American Journal of Evaluation, 22, 1–12.

Nakashian (2007). A guide to strengthening and managing research grants. Princeton, NJ: The Robert Wood Johnson Foundation. Retrieved January 21, 2015, from http://www.rwjf.org/content/dam/web-assets/2007/11/a-guide-to-strengthening-and-managing-research-grants

Parry (2014, Fall Edition). Episteme and techne. In E. N. Zalta (Ed.), The Stanford encyclopedia of philosophy. Retrieved from http://plato.stanford.edu/archives/fall2014/entries/episteme-techne/

Patton, M. Q. (2008). Utilization-focused evaluation (4th ed.). Thousand Oaks, CA: Sage.

Preskill, & Jones (2009). A practical guide for engaging stakeholders in developing evaluation questions. Princeton, NJ: The Robert Wood Johnson Foundation. Retrieved January 21, 2015, from http://www.rwjf.org/content/dam/web-assets/2009/01/a-practical-guide-for-engaging-stakeholders-in-developing-evalua

Pressman, J. L., & Wildavsky (1984). Implementation: How great expectations in Washington are dashed in Oakland; or, why it's amazing that federal programs work at all. Berkeley: University of California Press.

Rog, D. J. (2010). Designing, managing and analyzing multisite evaluations. In J. S. Wholey, H. P. Hatry, & K. E. Newcomer (Eds.), The handbook of practical program evaluation (pp. 208–242). San Francisco, CA: Jossey-Bass.

Scheirer, M. A. (2012). Expanding evaluative thinking: Evaluation through the program life cycle. American Journal of Evaluation, 33, 264–277.

Shadish, W. R., Cook, T. D., & Leviton, L. C. (1991). Foundations of program evaluation: Theorists and their theories. Newbury Park, CA: Sage.

Weidman, D. R. (1977). Writing a better RFP: Ten hints for obtaining more successful evaluation studies. Public Administration Review, 37, 714–717. Accessed on February 9, 2015 at https://www-jstor-org.web.bisu.edu.cn/discover/10.2307/975341?sid=21105299061071&uid=3739832&uid=4&uid=2129&uid=70&uid=2&uid=3739256

Weick, K. E. (1984). Small wins: Redefining the scale of social problems. American Psychologist, 39, 40–49.