Abstract

This is a very diverse issue: it includes overviews of performance management and realist evaluation, considers the ascendancy of experimentalism in development evaluation, and looks closely at participatory processes in a civil society coalition. Something to appeal to and annoy all our readers!
Increasing the rationality, efficiency and responsiveness of public management is at the heart of most public sector reforms, and performance management systems are arguably the prototypical example of such reforms. Jan van Helden, Åge Johnsen and Jarmo Vakkuri consider how we should evaluate performance management systems from a life-cycle perspective. The authors suggest that considering the entire performance management life-cycle, through design, implementation, use and (re)assessment, allows for a better understanding of the practical problems that public managers face and for the identification of research and evaluation priorities. Van Helden and colleagues argue that, up to now, there has been too little focus on the use and assessment parts of the cycle. This wide-ranging and authoritative consideration of what we know about performance management systems highlights the limited theoretical underpinnings of many existing evaluative studies. In particular, the lack of evidence about causal links between performance management systems and intended outcomes signposts a rich agenda for future evaluative efforts.
This issue contains two reflective articles on the realist evaluation tradition. The first is by Ray Pawson and Ana Manzano-Santaella and so speaks with the voice of one of the founders of realist evaluation. The second, by Bruno Marchal and colleagues, reviews cases of realist evaluation in the health domain. These two articles should be read in conjunction with each other and can be seen to mark an early stage in the maturation of this still new evaluation tradition, which can now be discussed less from an advocacy position and more on the basis of the experience and trials faced by practicing evaluators.
Ray Pawson and Ana Manzano-Santaella review some of the many evaluations that describe themselves as ‘realist’ and ask the apparently simple question: ‘are they really realist?’ The authors indicate why even some that ask variants of the ‘what works for whom in what circumstances’ question – through what mechanisms and in what context – may not really be part of the realist ‘family’. Some present no theory to explain apparently significant statistical analyses; others advance explanations but include no data about outcome patterns. The authors pare down what, from a realist perspective, counts as ‘conjecture’ and theory building, and reinforce the need to focus on both processes and outcomes, the quantitative and the qualitative. Pawson and Manzano-Santaella see the purpose of this kind of analysis as ultimately about methodological development that will help evaluators refine the realist approach through practical experience.
Bruno Marchal, Sara van Belle, Josefien van Olmen, Tom Hoerée and Guy Kegels consider the application of realist evaluation in the health domain. Situating the realist evaluation school, as elaborated by Pawson and associates, as a form of theory-based evaluation, the authors begin by attempting to clarify some of the often-confused terms such as ‘Theory of Change’ and ‘theory-driven evaluation’, as well as realist evaluation itself. These distinctions are then deployed in a systematic search strategy and review. This review sets out to establish how realistic evaluation principles are used and applied in healthcare evaluations, and what methodological problems have been encountered. After a substantial scan and the elimination of non-realist evaluations (e.g. solely theory-driven; or advocacy papers without an empirical base) the authors were left with 19 studies. They point out the confusion, even at a philosophical level, in what is still a ‘young’ and immature evaluation tradition. At the same time, Marchal and colleagues identify genuine methodological challenges that need to be overcome – such as how to differentiate mechanisms from context and how to identify theory in areas where not much is known. These authors at times emphasize the absence of ‘methodological guidance’ to take realistic evaluation forward. But they also – like Pawson and Manzano-Santaella – seem to regard the analysis of evaluation experience and careful conceptualization as the best way to develop the guidance that is needed.
Robert Picciotto, in a vigorous ‘opinion piece’, revisits the rise of experimentalism, a powerful source of contention in the evaluation of international development. Picciotto sets the debate in context: how failures to demonstrate the effectiveness of aid were met by the claims of advocates of experimentalism that they would be able to demonstrate ‘what works’. The author further maps the international development debate against the historical emergence of positivist and experimental traditions in philosophy, economics and medicine. Picciotto highlights a disjuncture: experimental methods ‘are best suited to the assessment of simple and stable programs’, even though ‘the development enterprise is mostly made up of complex, adaptable interventions implemented in volatile environments’. He also points out that randomized experiments are not, as is sometimes claimed, the only ‘scientific basis of ascertaining causation or attribution’. Picciotto’s advocacy of ‘mixed methods’ expresses an emerging consensus in many parts of the development evaluation community, but one that is not yet reflected among those major aid agencies that are still committed to the search for certainty in an indeterminate and complex world.
Yuriko Sato, unlike other authors in this issue, comes from an experimental and statistical evaluation tradition. She is interested both in impact and in the magnitude of impact. She proposes using the ‘Standardized Mean Difference’ (SMD) between a treatment group and a control group as a measure of impact. For many readers of this journal the greater interest will be in the object of evaluation – Japan’s foreign student policy towards Thailand, which supports Thai students studying in Japanese universities with the aim of influencing their attitudes and behaviour towards Japan. This case does lend itself in part to a quasi-experimental approach – there is, after all, an intervention, and there is a large population of students who experience this intervention. But there are also limitations, as Sato acknowledges. However, the main purpose of this study was to ‘trial’ a method, thus highlighting its strengths and weaknesses. The author recognizes that the use of SMD as an impact measure and the overall design ‘cannot explain the reasons why certain interventions are effective while others are not’. For this, Sato argues, qualitative analysis is needed – and many evaluators would argue so is some kind of theory-based evaluation.
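For readers less familiar with the measure, a standardized mean difference can be sketched as below. This is only an illustration under stated assumptions, not Sato's own procedure: it uses the common pooled-standard-deviation variant (often called Cohen's d), the function name is ours, and the scores are invented for the example rather than drawn from the study.

```python
from math import sqrt
from statistics import mean, variance


def standardized_mean_difference(treatment, control):
    """Difference in group means divided by the pooled standard deviation.

    Assumption: the pooled-SD (Cohen's d) variant is used; the article
    summarised above may define the measure differently.
    """
    n_t, n_c = len(treatment), len(control)
    if n_t < 2 or n_c < 2:
        raise ValueError("each group needs at least two observations")

    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = (
        (n_t - 1) * variance(treatment) + (n_c - 1) * variance(control)
    ) / (n_t + n_c - 2)
    if pooled_var == 0:
        raise ValueError("pooled variance is zero; the SMD is undefined")

    return (mean(treatment) - mean(control)) / sqrt(pooled_var)


# Hypothetical attitude scores, invented purely for illustration.
treated_scores = [2.0, 4.0, 6.0]
control_scores = [1.0, 3.0, 5.0]
print(standardized_mean_difference(treated_scores, control_scores))  # 0.5
```

Because the difference is expressed in standard-deviation units, it indicates the size of an effect but, as the article itself acknowledges, not the reasons behind it.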
Hélène Laperrière, Louise Potvin and Ricardo Zúñiga are concerned with the evaluation of participatory programmes. In this case the programme is located in a partnership setting where the State and a coalition of HIV/AIDS civil society organizations are expected to work together. The authors also actively provided the coalition with support and resources as part of their research, and the study adopts a civil society perspective. Laperrière and her colleagues conclude that, in order to engage in an evaluation, the civil society coalition needed a particular kind of ‘evaluability assessment’ that recognized the disparities of power between the parties. This ‘socio-political’ framework for ‘evaluability assessment’ considerably extends how that term is usually understood. It builds on Friedberg’s sociology of organizations, which distinguishes among an idealized managerial discourse, the formal structure, and the informal structures that arise from social interactions. Informal or ‘unruly’ interactions, rather than formal hierarchical relations, were key to understanding the coalition’s evaluation needs. The socio-political framework grew out of the tension between the more formal and informal ways of interacting in the partnership.
