Abstract
The Practice of Health Program Evaluation provides an overview of the evaluation process for public health programs while diving deeper to address select advanced concepts and techniques. The book unfolds evaluation as a three-phased process consisting of identification of evaluation questions, data collection and analysis, and dissemination of results and recommendations. The text covers research design, sampling methods, and quantitative and qualitative approaches. Types of evaluation are also discussed, including economic assessment and systems research as relative newcomers. Aspects critical to conducting a successful evaluation regardless of type or research design are emphasized, such as stakeholder engagement, validity and reliability, and adoption of sound recommendations. The book encourages evaluators to document their approach by developing an evaluation plan, a data analysis plan, and a dissemination plan, in order to help build consensus throughout the process. The text offers a good bird’s-eye view of the evaluation process and gives evaluation experts guidance on how to navigate political waters and advocate for their findings to help effect change.
Grembowski, D. (2016). The practice of health program evaluation (2nd ed.). Thousand Oaks, CA: Sage. ISBN-13: 978-1483376370. Paperback, 352 pp. $85.98.
Over the past quarter century, evaluation of public health programs has grown substantially, due in part to increased emphasis on implementing evidence-based practices, focus on prevention, and scarcity of resources. Program evaluation is the systematic collection of information about the activities, characteristics, and outcomes of programs to make judgments about the program, improve program effectiveness, and/or inform decisions about future program development (Centers for Disease Control and Prevention, 2011). Public health evaluation has grown into a specialized field and has given rise to professionals with expertise in associated areas, from health economics to implementation science. The Practice of Health Program Evaluation is a good resource for individuals entering the field as well as seasoned practitioners. The book offers a foundation of basic evaluation concepts, while covering many aspects of evaluation that are often neglected, such as developing sound recommendations, disseminating results to encourage use of evaluative findings, and navigating political waters.
The first chapters of the book provide the reader with an understanding of the fundamentals and tenets of evaluation. The importance of engaging stakeholders is emphasized, and the reader is reminded that the final determination of a program’s worth is a political decision made by those stakeholders based on factual findings from the evaluation as well as the values they hold. From there, the book likens evaluation to a three-act play: In Act 1, decision makers and other stakeholders define the questions the evaluation will answer. During Act 2, research methods are applied to collect and analyze data to answer the questions raised. Last, in Act 3, answers to evaluation questions are disseminated to provide insights that may influence program decisions.
The evaluation play opens with Act 1 starting in Chapter 3. It is during this act that the process of defining evaluation questions is laid out in four steps: (1) develop a conceptual or logic model that shows the chain of causation in accordance with the program theory; (2) identify program objectives and categorize them as immediate, intermediate, and ultimate objectives; (3) formulate evaluation questions based on program theory and objectives; and (4) select which questions to answer based on consensus, budget, how the findings will be used, and other factors.
Chapter 4 begins Act 2 by addressing how program impact can be determined using different research designs. Types of experimental and quasi-experimental designs are described, complete with accompanying diagrams to illustrate the structure of each design in a succinct way. The text identifies the strengths and weaknesses of each design, which can be particularly useful when publishing evaluation research. Threats to internal and external validity are discussed, as well as ways to help mitigate them, such as adding a control group, making observations before program implementation, or adding variables to explore alternate explanations of program effects. Related to publishing evaluation research, the book reiterates the importance of reporting the three “Rs” of impact evaluation: representativeness of findings, robustness of the program, and replicability of the intervention.
Subsequent chapters in Act 2 detail other types of evaluation, including economic analysis and implementation evaluation. Chapter 5 lists four types of economic evaluation: (1) cost-effectiveness analysis, (2) cost–benefit analysis, (3) cost minimization analysis, and (4) cost–utility analysis. The book provides guidance on conducting an economic evaluation, which includes identifying an alternative to the program for comparison purposes, describing how inputs result in benefits and costs, and conducting a sensitivity analysis. Chapter 6 categorizes implementation evaluation into four types according to purpose (descriptive or explanatory) and timing of data collection (cross-sectional or longitudinal). These types of evaluation are most often used to monitor implementation and consider program changes or to explain program outcomes estimated in an impact evaluation. This chapter also briefly describes mixed methods and how quantitative and qualitative research can be integrated or can supplement one another in a comprehensive evaluation.
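As an illustration of the economic evaluation steps described above, the following minimal Python sketch compares a program with an alternative using an incremental cost-effectiveness ratio and then varies one input in a one-way sensitivity analysis. The ratio is a standard summary in cost-effectiveness analysis, but its use here is an assumption for illustration and is not drawn from the book; the function names and all dollar and outcome figures are hypothetical.

```python
def icer(cost_program, effect_program, cost_alternative, effect_alternative):
    """Incremental cost-effectiveness ratio: extra cost per extra unit of effect."""
    delta_effect = effect_program - effect_alternative
    if delta_effect == 0:
        raise ValueError("No difference in effect; the ratio is undefined.")
    return (cost_program - cost_alternative) / delta_effect


def one_way_sensitivity(base_case, parameter, values):
    """Recompute the ratio while varying a single input and holding the rest fixed."""
    results = {}
    for value in values:
        scenario = dict(base_case, **{parameter: value})
        results[value] = icer(**scenario)
    return results


# Hypothetical figures: costs in dollars, effects in cases prevented.
base_case = {
    "cost_program": 120_000.0,
    "effect_program": 60.0,
    "cost_alternative": 80_000.0,
    "effect_alternative": 40.0,
}

if __name__ == "__main__":
    print("Base case:", icer(**base_case), "dollars per case prevented")
    for cost, ratio in one_way_sensitivity(
        base_case, "cost_program", [100_000.0, 120_000.0, 140_000.0]
    ).items():
        print(f"Program cost {cost:,.0f} -> {ratio:,.0f} dollars per case prevented")
```

If the ratio changes sharply across plausible input values, the conclusion of the economic evaluation is sensitive to that input and should be reported with corresponding caution.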
During the latter part of Act 2, the importance of developing an evaluation plan that reflects the evaluation type, questions, and research methods is emphasized. The plan should outline the proposed populations, sampling, measurement, data collection, and data analysis to identify methodological problems before the evaluation begins. Due to the importance of such aspects of the evaluation to valid design, findings, and recommendations, the next three chapters are devoted to addressing these methodological details.
Chapter 7 describes probability and nonprobability sampling and instances in which each sampling method is most appropriate. Calculating minimum sample size based on the type of evaluation, the effect size, and the size of the population is detailed. This chapter also discusses sufficient power and how this relates to Type I and Type II errors in determining statistical significance of differences.
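To make the relationship among effect size, power, and error rates concrete, the sketch below estimates a minimum sample size per group for a two-sided comparison of two group means. It uses the common normal-approximation formula, which is an assumption here and not necessarily the calculation the book presents; the function names are hypothetical, and the finite population correction is the standard survey-sampling adjustment, which may or may not suit a given evaluation design.

```python
import math
from statistics import NormalDist


def min_sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-sample comparison of means.

    effect_size is the standardized mean difference (Cohen's d).
    alpha is the Type I error rate; power is 1 minus the Type II error rate.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # guards against Type I error
    z_beta = standard_normal.inv_cdf(power)           # guards against Type II error
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


def finite_population_correction(n, population_size):
    """Reduce n when the sample is drawn from a small, known population."""
    return math.ceil(n / (1 + (n - 1) / population_size))


if __name__ == "__main__":
    n = min_sample_size_per_group(effect_size=0.5)
    print("Per-group n for a medium effect:", n)
    print("Adjusted for a population of 500:", finite_population_correction(n, 500))
```

Smaller expected effects, stricter significance levels, or higher desired power all increase the required sample size, which is why these choices belong in the evaluation plan before data collection begins.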
Chapter 8 acknowledges formal and informal data collection as part of the evaluation process, with the latter providing an essential cultural, social, and political context. The book also distinguishes between quantitative and qualitative data collection. The steps of quantitative data collection are identified, along with detailed instructions that enable the next phase. Steps include deciding what concepts to measure, identifying measures of the concepts through operational definitions, assessing and selecting reliable and valid measures, identifying data sources for each measure, organizing measures into independent and dependent variables for the purposes of analysis, and finally, collecting the data. The book also briefly describes qualitative methods most commonly used in the evaluation of health programs: ethnography, participant observation, field observation, informal interviewing, focus groups, and document content analysis.
Regardless of the type of data collection, concepts of validity and reliability are introduced, and their relevance and importance to evaluation are emphasized. Types of validity include face, content, criterion, and construct validity. The text also describes ways to assess reliability of a measure, including interrater/interobserver, intrarater/intraobserver, split-half, test–retest, and alternate form reliability.
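As one example of how interrater reliability might be assessed, the sketch below computes Cohen's kappa, a widely used chance-corrected agreement statistic for two raters assigning categorical codes. Kappa is chosen here as an assumption for illustration; the review does not say which statistics the book recommends, and the function name and ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters coding the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must code the same nonempty set of items.")
    n = len(rater_a)
    # Proportion of items on which the two raters gave the same code.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal code frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[code] * counts_b[code] for code in counts_a) / (n * n)
    if expected == 1:
        raise ValueError("Kappa is undefined when both raters use a single code.")
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical codes assigned by two observers to the same four sessions.
    observer_1 = ["yes", "yes", "no", "no"]
    observer_2 = ["yes", "no", "no", "no"]
    print("Kappa:", cohens_kappa(observer_1, observer_2))
```

A value of 1 indicates perfect agreement and a value near 0 indicates agreement no better than chance, so raw percent agreement alone can overstate reliability.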
Act 2 comes to a close with data analysis to yield results that can answer each evaluation question. Chapter 9 espouses developing a question-oriented data analysis plan, which identifies the overarching aim of the analysis, data that are needed, appropriate analytic techniques, and how results will be formulated and disseminated to decision makers and other stakeholders. Similar to data collection, separate sections describe aspects of quantitative and qualitative data analyses. The book recommends starting quantitative analysis with basic descriptive statistics and histograms to detect missing data or coding errors and to ascertain data skewness. Variables should subsequently be categorized by type and level of measurement to determine potentially appropriate statistical techniques. The research question will also help indicate whether descriptive, bivariable, or multivariable analysis is needed. More advanced techniques to help correct for selection bias are explained as well.
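The first-pass checks described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: missing values are represented as None, coding errors are approximated as values outside a plausible range, and skewness is computed as the population moment coefficient. The function name, the range limits, and the sample values are all hypothetical.

```python
from statistics import mean, median, pstdev


def describe(values, valid_min, valid_max):
    """Summarize one numeric variable and flag missing or implausible entries."""
    present = [v for v in values if v is not None]
    in_range = [v for v in present if valid_min <= v <= valid_max]
    if not in_range:
        raise ValueError("No valid observations to summarize.")
    center = mean(in_range)
    spread = pstdev(in_range)
    # Moment coefficient of skewness; positive values indicate a long right tail.
    if spread > 0:
        skewness = sum((v - center) ** 3 for v in in_range) / len(in_range) / spread ** 3
    else:
        skewness = 0.0
    return {
        "n_total": len(values),
        "n_missing": len(values) - len(present),
        "n_out_of_range": len(present) - len(in_range),
        "mean": center,
        "median": median(in_range),
        "skewness": skewness,
    }


if __name__ == "__main__":
    # Hypothetical ages; None is a missing response and 999 a likely coding error.
    ages = [34, 41, None, 29, 999, 38, 85]
    for name, value in describe(ages, valid_min=0, valid_max=120).items():
        print(name, value)
```

A mean well above the median, together with positive skewness, suggests a right-skewed variable, which in turn informs the choice of statistical technique in later steps.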
The evaluation text recognizes that data collection and analysis are often not separate activities when using qualitative methods, yet leaves it to the research methods literature to describe those analytic techniques in detail. The importance of triangulating qualitative findings is emphasized, with four types of triangulation identified: (1) theory triangulation, (2) data triangulation, (3) methods triangulation, and (4) discipline triangulation.
The third and final Act of the evaluation play entails the dissemination of findings and recommendations. Chapter 10 provides guidance on how to develop sound recommendations and encourage their adoption. The reader is advised to consider all issues and staff for improvements, collaborate with decision makers and stakeholders throughout the process, and decide whether changes should be fundamental or incremental. To facilitate their practical application, recommendations should be defensible, specific, realistic, and timely and use plain language. The book encourages the development of a dissemination plan for evaluation findings, which should detail the type of recommendations that will be shared with each stakeholder, when those results will be shared, and the mode of distribution. A successful evaluation can achieve many outcomes, such as changing a program, establishing program legitimacy, influencing institutions and interest groups, or even building collaboration among program implementers (Fetterman, 2001).
In final review, The Practice of Health Program Evaluation guides the reader through each stage of evaluation by using the creative vantage point of a three-act play. Grouping the evaluation process into discernible activities, and breaking each task into concrete steps, helps demonstrate how one phase informs the next to build an accurate program story. This enables the beginning evaluator to understand basic principles and digest the evaluation process. More advanced topics, concepts, and techniques are also addressed for those with evaluation expertise.
One of the book’s weaknesses is the lack of importance given to implementation evaluation, and its oft-underused potential to contribute to the evidence base. Another is the glossing over of qualitative methods, especially when compared with the detail and attention afforded to quantitative methods through step-by-step instruction. This may inadvertently underestimate the importance of qualitative data to conducting a comprehensive evaluation. In addition, the book’s treatment of ethical issues is quite brief and fails to acknowledge important principles, such as fairness, transparency, empowerment of program participants, and privacy and confidentiality (Gopichandran & Krishna, 2013). Yet these weaknesses are slight in what proves to be a solid resource on health program evaluation practice. This book will only grow in relevance to health education practitioners as greater accountability is requested of public health programs over the next quarter century.
