Abstract

Over the last 20 years we have seen the widespread institutionalisation of evaluation into policy and practice across many countries. The publication of the International Atlas of Evaluation (Furubo et al., 2002) was a landmark in mapping this institutionalisation process. It described and compared 21 countries in terms of 9 ‘indicators’ that sought to capture the institutionalisation of evaluation and the strength of evaluation cultures. Steve Jacob, Sandra Speer and Jan-Eric Furubo have now revisited this Atlas ten years later, using the same indicators and scales to try to estimate what has changed in 19 of the initial 21 countries. As with the Atlas itself, the process has been subjective, although by surveying 4 or 5 experts in each country this exercise has taken steps to strengthen the judgement and measurement process. Overall there has been an improvement across the 19 countries reviewed: 15 out of the 19 are now assessed as being in the ‘high maturity’ category and none are in the low category. However, Jacob, Speer and Furubo suggest that differences remain in terms of some of the indicators used – there is diffusion but not convergence. The authors also point out that, although no countries have gone backwards over the last decade, changes in policy and funding for evaluation could well lead to such setbacks in the future. Inevitably, as the Atlas captured evaluation maturity in the early years of the 21st century, this article, aiming for comparability, remains focussed on the same mainly industrialised OECD countries. It would be good to see the baseline for comparison extended in years to come to the many other countries that have also now embraced and begun to institutionalise evaluation, often in quite different ways.
Mita Marra focuses on gender inequality through the lens of complexity theory and realist evaluation. Her choice of conceptual and theoretical perspective follows her chosen evaluation object, gender cooperation between men and women, which, although enacted in households, is shaped by broader policies and opportunities. At this macro level Marra is interested to know ‘what actions produce equitable outcomes, such as sharing care responsibilities within the household; unbiased assessment and formalization of reward systems within the workplace; and access and capacity in decision making within political institutions and private firms’. At the same time the author utilises realist concepts to understand ‘the social mechanisms of cooperation’. In so doing Marra highlights the ‘links between social representation, social action, and public policy’ – how, for example, emotional, cognitive and economic mechanisms can trigger more or less gender-cooperative behaviour. Mita Marra’s project is conceptual rather than methodological and illustrates how thinking within complexity science and even neuroscience frameworks can both help identify evaluable mechanisms and at the same time inform agendas for political, institutional and interpersonal change.
Ivan Horrocks and Leslie Budd describe their evaluation of an EU-funded public and community e-services programme, EGOV4U, that sought to address problems of digital exclusion and inclusion. The authors followed a theory-driven approach that relied for its causal analysis of ‘impact’ on realist thinking in terms of ‘generative mechanisms, causal powers and contiguous contexts’. The evaluation wanted to know ‘how widening access to ICT-mediated multi-channel government and public services contributes to combating social exclusion and thereby increases public value’. In so doing Horrocks and Budd cast their theoretical net still more widely than programme theory, building on seven ‘capitals’ derived from Bourdieu as ‘impact factors’. (A ‘capitals’ approach has featured previously in this journal; see Elkins and Medhurst in Evaluation 12.4, and Radej in issue 17.2.) In an article that reflects in considerable depth on the approaches taken and the reasons for their choices, the authors explore some of the central dilemmas of realist evaluation, such as how to put together CMOs (context–mechanism–outcome configurations) and how to identify dominant mechanisms.
Sam Porter challenges the way that Ray Pawson reads and interprets Roy Bhaskar, the champion of ‘critical realism’, who, it should be noted, sadly died in November 2014. Porter mounts his challenge from a philosophical position, demonstrating admirable familiarity with Bhaskar’s text. To set this dispute in context: in the opening chapter of The Science of Evaluation, Pawson identifies Bhaskar as one of his ‘intellectual heroes’, whose 1978 text A Realist Theory of Science Pawson credits with the core concept of ‘generative causation’ and the motto: ‘Theory without experiment is empty. Experiment without theory is blind.’ Later in the same book Pawson considers how realist evaluation differs in its understanding of complexity from what he understands to be Bhaskar’s stance as a ‘critical realist’. Porter’s concern is with this later discussion, arguing ‘that the weaknesses that Pawson ascribes to critical realism are for the most part unfounded, and that its differences with realist evaluation are not as crucial as he makes them out to be’. Nonetheless, Porter does grant two differences which he questions – ‘the interpretation of the relationship between social structure and agency, and the role of values in evaluation research’.
These criticisms of realist evaluation, according to Porter, have implications for evaluative practice in the realist tradition. I leave it to others in the realist fraternity to debate these points – and invite them to do so. However, I was left wondering how such challenges measure up against what realist evaluators actually do! For example, Mita Marra indeed ‘conceives of social mechanisms as generative powers that result from the distribution of material, cultural and jurisdictional resources embedded in relatively enduring social relations’. At the same time Marra, whilst recognising constraints on human agency rooted in social structure, also identifies cognitive and emotional mechanisms of change that are themselves one expression of human potential and agency. Similarly, neither Marra nor Horrocks and Budd could be accused of ignoring ethical and value concerns: ‘the degree to which an intervention supports the development of people’s capacities and potentialities, and the degree to which it inhibits them’ seems central to both of these articles espousing realist evaluation precepts.
Very often those who advocate combining (or ‘mixing’) methods do so from the point of view of ‘triangulation’, arguing that conclusions can be strengthened if different methods lead to the same results. An alternative rationale for combining methods stems from the recognition that different methods – and methodological designs – are good at doing different things and together ensure ‘a more rigorous methodology than each of the individual approaches on their own’. One distinctive feature of the article by Geske Dijkstra and Antonie de Kemp is that it exemplifies this logic. The authors combined case studies drawn from the literature and cross-country econometric comparisons, based on OECD data. Both analyses were set within a theoretical approach that starts with an intervention theory. A second distinctive feature of this article is that it concerns ‘budget support’, a policy instrument widely debated in development aid. This type of policy instrument is not unfamiliar in many policy domains – a highly contextualised and multi-component policy instrument that faces several challenges that make counterfactual approaches very difficult to implement, even for those who favour that family of designs and methods. At the same time the authors argue (citing Howard White) that counterfactual approaches ‘can be applied to some components of the intervention theory’ if not to others. (Echoes here perhaps of Gill Westhorp’s ideas about ‘nested’ evaluation designs; see Evaluation 19.4.)
Ralph Stacey’s complexity matrix, first developed in the 1990s – and frequently reproduced since then – is the basis of the well-known distinction between the simple, complicated and complex. Marlèn Arkesteijn, Barbara van Mierlo and Cees Leeuwis argue that this matrix, based on the two dimensions of agreement and certainty, is insufficient to understand the persistence of development problems such as poverty. They argue for a third dimension, system stability, to explain how problems become systemically embedded and reinforced. The authors discuss three mechanisms that support stability: interlinked ‘system rules’, ‘mutual dependence between actors’ and ‘material components’ such as infrastructures and technologies. These mechanisms underpin path-dependence and problem lock-in. Arkesteijn, van Mierlo and Leeuwis highlight the limitations, in the face of complex and path-dependent problems, of both rigid, instrumental evaluation approaches such as the ‘logical framework’ and constructivist, soft-systems, learning-oriented and participatory evaluation approaches. They favour instead a ‘reflexive perspective’ that ‘should encourage groups of diverse actors to reflect on the rules and relations underlying current practices in order to induce institutional changes’. According to the authors, this approach offers the promise of both evaluating change and supporting the change process.
Readers with interests in the institutionalisation of evaluation and in realist evaluation will also be interested in the source material signposted in the News from the Community section towards the end of this issue. It includes new publications by Gill Westhorp and by Marlène Läubli Loud and John Mayne, among others.
Institutionalisation, realism, reflexivity, combining designs and theory-based evaluations – all laced with a healthy measure of philosophical rigour. Not a bad way to begin 2015. A Happy New Year to all!
