Baselines and monitoring: More than a means to measure the end

Abstract

Monitoring is largely ignored in its capacity to provide a distinct contribution to evaluation. It is often thought of as a process of collecting data to feed into an evaluation, rather than for its own powerful transformative potential. Evaluation is considered a mechanism for producing findings that enable learning, improvement and decision-making; but what if monitoring could produce these same outcomes with, in some cases, greater alignment to quality characteristics of utility, timeliness, feasibility, propriety, accuracy, completeness and monitoring accountability? This article examines the utilisation and value of monitoring through a case study of a government funded 12-month rural health project in Victoria, Australia. The project initially commissioned a baseline to assess against post-project outcomes. However, adopting a utilisation-focused perspective to prepare for use and support stakeholder engagement enabled implementation of a multipurpose monitoring framework. The case study provides examples of monitoring in action with timely learning, decision-making and improvements resulting in incremental system and behaviour changes, rather than relying on periodic outcome recommendations at evaluation completion. This article adds to evaluation theory and practice through highlighting monitoring as a significant mechanism for enabling learning, decision-making, and improvement.

Keywords

baseline, developmental evaluation, evaluation, everyday evaluation, health, informal evaluation, monitoring, rural, urgent care, utilisation, utilisation-focused evaluation

Introduction

Evaluation is experiencing an era of increasing demand. Government and non-profit programmes are particularly likely to have mandated evaluation expectations, which regularly include evaluation by independent external consultants. While evaluation basks in the limelight, monitoring is sidelined, rarely mentioned except as the ‘M’ in monitoring and evaluation (M&E). This article argues that monitoring is downplayed and misunderstood, to the detriment of programmes and projects that could benefit from the real-time data and opportunities for learning and improvement that monitoring can offer. The rural Australian health project examined as a case study in this article was initially ambivalent about explicitly focusing on monitoring processes and data, instead simply requesting a baseline to measure against post-project results. This article explores the utilisation and value of monitoring efforts in the case study to assess the potential of monitoring as a mechanism to enable learning, decision-making and improvement. The article does this by presenting the results of the case study project in terms of data utilisation for learning, decision-making and improvement, and then unpacking the value of these results in the ‘Discussion’ section by analysing them against key characteristics of data quality. As well as elucidating the distinct contribution of monitoring, the findings presented in this article analyse whether monitoring and a system of everyday or informal evaluation could potentially provide a replacement for traditional evaluation in some situations, freeing up traditional evaluation to take the next step into deep evaluative research.

While evaluation is often conceived as an independent concept, monitoring tends to be coupled with evaluation rather than standing alone. Evaluation is defined as a methodical process of collecting data about an evaluand and deducing its worth, value, quality, significance and merit (Stufflebeam & Coryn, 2014). Evaluations are expected to provide relevant and credible information that impacts decision-making, helps organisations be accountable to their stakeholders, and verifies that evaluands are fulfilling their aims (Mikkelsen, 2005; Organisation for Economic Co-operation and Development-Development Assistance Committee, 2010). Despite many desired uses, programme and project evaluation’s main purpose is commonly cited as improvement (Harman, 2019; Patton, 2012; Stufflebeam & Coryn, 2014; Wadsworth, 2011). Similarly, monitoring can identify concerns and catalyse programme and project improvement. However, while evaluation is usually dependent on interval-based cycles of review with resulting artefacts such as reports with recommendations for uptake, monitoring enables utilisation of continuous real-time data to fine-tune implementation (Owen, 2006).

Monitoring is an ongoing act of gathering data to ensure that programmes and projects are functioning as expected. It is ‘the planned, continuous and systematic collection and analysis of program information able to provide management and key stakeholders with an indication of the extent of progress in implementation, and in relation to program performance against stated objectives and expectations’ (Markiewicz & Patrick, 2016, p. 12). Monitoring is typically understood as a base for evaluation, with evaluation incorporating monitoring data alongside additional datasets such as interviews and focus groups (Markiewicz & Patrick, 2016; Mikkelsen, 2005). As such, monitoring is rolled into evaluation and classified as an activity within evaluation (Bell & Aggleton, 2016; Hatry et al., 2015). The value of monitoring as a distinct activity is largely ignored, a point raised in at least three doctoral theses which highlight that monitoring is neglected and misunderstood in the evaluation literature (Boardman, 2019; Guijt, 2008; Kelly, 2019). Despite this article arguing the value of monitoring as a replacement for evaluation in some situations, and arguing that monitoring is distinct from evaluation, we strongly acknowledge that this distinction is not clear cut and the separation between monitoring and evaluation is a grey continuum rather than two entirely separate entities. This grey is further blurred by a trend towards more frequent evaluative approaches that sit between monitoring and evaluation, such as developmental evaluation. While ‘in many ways there is little difference between developmental evaluation and ongoing efforts by M&E staff to monitor and evaluate complex programs at work’ (Simister, 2017, p. 2), within this article, we argue that developmental evaluation has a stronger intent to make evaluative conclusions about the worth, value, quality, significance, and merit of an evaluand than the monitoring and everyday evaluation approaches we discuss in this article. The latter focuses on building rigour around extant systems and processes to gather small-scale learning and decision-making information for real-time improvements. A key tangible difference within the understanding of monitoring and evaluation in this article is that evaluation tends to produce written reports at intervals that make evaluative judgements and list recommendations. While evaluative thinking is definitely at play during the everyday evaluative activities described in the case study used for this article, the formality of the process and outputs differ from standard evaluation.

The literature on evaluation utilisation lays the groundwork for recognition of monitoring’s potential contribution to programme improvement. Fifty years of research on evaluation use shows that evaluation suffers from common non-use, misuse and low-use (King & Alkin, 2019; Maloney, 2017; Stufflebeam & Coryn, 2014). Despite investigations and solutions aimed at improving utility, evaluation continues to be underutilised (Kelly, 2021). Meanwhile, monitoring practices in multiple sites have demonstrated ability to engender change and improvement (Guijt, 2008; Kelly, 2019). A stark difference between these approaches is that evaluation tends to be top-down, driven by donors or organisational executives, while monitoring tends to be enacted and maintained at ground-staff level with bite-sized incremental changes implemented day-to-day.

Evaluation approaches including developmental evaluation and real-time evaluation seek to combine the value of monitoring with the analytical power of evaluation and can be highly effective approaches. However, they may require greater skill levels and resources to implement than monitoring alone, and may prioritise external evaluators, thereby reducing the ability for the system to be internally sustainable in the long term. In addition, while these approaches have significant merit, their application is not universal. Monitoring systems of some description are ubiquitous in social programming; therefore, monitoring provides an opportunity for organisations to build on existing processes without necessitating substantial extra resourcing or capability building. Throughout this article, ‘evaluation’ refers to interval-based standard forms of evaluation, rather than to ongoing iterative approaches such as developmental and real-time evaluation.

Theoretical framework

This article uses the case study of a rural Australian health project to unpack monitoring utilisation and value against core evaluation and data quality standards to determine the potential for monitoring to replace evaluation in some scenarios. A review by Chen et al. (2014) examined 49 characteristics of data quality and found that ‘Completeness, accuracy, and timeliness were the three most-used attributes’ (p. 5170). In addition, they highlighted the importance of data use as an area requiring investigation when considering data quality (Chen et al., 2014). As these four characteristics (completeness, accuracy, timeliness and utility) are applicable to monitoring data, we have used these as a framework to assess the quality of monitoring in the case study example. While these characteristics are a useful measure of data quality, they fail to provide information about why monitoring may be more appropriate, ethical and feasible than evaluation in some situations. As such, we have integrated these data quality characteristics with core evaluation standards.

Using evaluation standards to assess monitoring is based on the rationale, assessed through the case study, that monitoring can produce the desired results of evaluation by helping programmes and projects learn, make decisions and improve. As such, examining the case study’s monitoring journey against core evaluation standards helps ascertain areas where monitoring may be more beneficial at fulfilling evaluation criteria and quality than evaluation itself. Furthermore, using evaluation standards as a theoretical framework helps identify areas where monitoring may require increased rigour.

There have been numerous attempts to standardise evaluation quality to provide guidelines for evaluators and their commissioners. This article employs the Joint Committee programme standards to unpack the quality potential of monitoring activities in the case study project. The Joint Committee on Standards for Educational Evaluation began as the collaborative effort of a coalition of education professionals in 1974 who wished to define evaluation quality. The most recent version of the Standards was published in 2011 and sets forth five essential sets of standards for programme evaluation: utility, feasibility, propriety, accuracy and evaluation accountability (Yarbrough et al., 2011). While the utility and accuracy characteristics overlap with those determined as key by Chen et al. (2014), this article includes additional characteristics from the Joint Committee standards: feasibility, propriety and evaluation accountability (known hereafter as monitoring accountability). Inclusion of these characteristics will help examine the viability and ethical stance of a monitoring approach. These characteristics are expanded and used as a framework in the ‘Discussion’ section of this article to determine the extent to which the monitoring activities undertaken in the case study project meet with accepted notions of data and evaluation quality.

Background of the case study project

This article employs a single case study approach to examine monitoring. The case study for this article is a 12-month rural health project funded by an Australian government body. The project was led by one small rural health service in partnership with two additional small rural health services, one large regional health service and the state-wide emergency transport service (ambulance). These five services are referred to as project partners throughout this article. The focus is on community use (patient presentations) of urgent care or emergency care. The project sought to address the unnecessary burden placed on the regional health service’s emergency department by community members who could have their care needs met at their local rural health services’ urgent care department.

Health services in Australia are categorised and funded by size (small rural, rural or regional). Small rural health services offer nurse-led urgent care departments where general practitioner doctors (GPs) attend under an on-call arrangement, whereas the regional emergency department has doctor attendance at all times. This project was initiated because community members and ambulance personnel were choosing the regional emergency department regardless of care need.

A co-located evaluator from a university department of rural health was based at the lead site to oversee coordination of the project data collection across the multiple partner sites. Historically, these health service organisations used external evaluation consultants and their expectations were for an independently verified, objective summative evaluation at project end with minimal stakeholder involvement throughout the evaluation process.

Originally, the project lead organisation requested the evaluator to capture the baseline data at the regional health service’s emergency department (the assumed site of the problem) and then return in 12 months to assess the change that was expected to have occurred by project end. Feeling that she could offer the project greater value with a modified approach, the evaluator engaged in a discursive process with stakeholders to unpack their evaluation needs. This involved discussing the project strategies and actions affecting not only the regional health service’s emergency department, but also the rural health services’ urgent care departments and the ambulance service, which had capacity to relieve the presentation burden facing the regional health service.

The evaluator invested considerable time in challenging preconceived notions of effective data capture and worked with staff from the multiple sites to develop a streamlined and rigorous monitoring framework to enable continuous data capture and allow incremental, real-time fine-tuning improvements. Multiple ad hoc monitoring systems were in use at the beginning of the project and health service staff change management was required to identify gaps and rework these systems.

The evaluator is part of a hub-and-spoke model established with a large metropolitan university that, through outlying rural health department initiatives, partners with local health services to embed co-located evaluation and research positions. This positioning within the rural health service was advantageous for the evaluator in understanding the project complexities and its evaluation needs, and for establishing rapport with staff and other stakeholders. First, the evaluator sought to establish the issue of concern, the project objectives in addressing the issue, and the evaluation scope and boundaries. This information was necessary to develop the key evaluation questions with the intended users of the evaluation (Harman, 2019; Patton, 2012).

The issue: The high burden on the regional health service’s emergency department, partly due to the large numbers of people from outlying rural areas accessing the service for care. These people could potentially have their care needs met at their local small rural health services’ urgent care department.

Project objectives: To lower the number of local people attending the regional service and redirect them to local services.

Scope: This project focused on care needs which could be appropriately and safely managed at the local small rural health services’ urgent care department. This was based on the current triage assessment categories, which range from one (immediately life-threatening) to five (low urgency with minor pain, minor symptoms or minor wounds). Small rural health services’ urgent care departments have the capacity to assess all categories and treat categories three, four and five. The regional emergency department can attend to all categories but, in particular, needs to have the resource capacity to respond promptly to categories one and two.

Boundaries: Project geographical boundaries were established to align with project objectives and help define the scope of data sources and collection to answer the key evaluation questions. This included de-identified, aggregated evidence on where people lived (postcodes) to determine their nearest and most appropriate health service (cross referenced to triage category).
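
As a minimal illustrative sketch, the scope and boundary rules described above can be read as a simple filter over de-identified presentation records. The article does not publish the project's postcodes, field names or tooling, so every name and value below (the placeholder postcodes, `Presentation`, `is_in_scope`) is hypothetical; only the triage categories (one to five, with three to five treatable locally) come from the text.

```python
from dataclasses import dataclass

# Hypothetical placeholder postcodes; the project's real boundaries are not published.
LOCAL_CATCHMENT_POSTCODES = {"P001", "P002"}

# From the scope statement: small rural urgent care departments can treat
# triage categories three, four and five.
LOCALLY_TREATABLE_CATEGORIES = {3, 4, 5}


@dataclass(frozen=True)
class Presentation:
    """One de-identified patient presentation (hypothetical record shape)."""
    postcode: str
    triage_category: int  # 1 (immediately life-threatening) to 5 (low urgency)


def is_in_scope(presentation: Presentation) -> bool:
    """Return True when the presentation could have been managed locally."""
    if presentation.triage_category not in range(1, 6):
        raise ValueError("triage category must be between 1 and 5")
    return (
        presentation.postcode in LOCAL_CATCHMENT_POSTCODES
        and presentation.triage_category in LOCALLY_TREATABLE_CATEGORIES
    )
```

Under these assumptions, a category four presentation from a catchment postcode would count as in scope, whereas a category two presentation from the same postcode would not.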

The evaluator developed initial baseline report cards for all five partners encompassing the financial year prior to project commencement. These collated available data on variables such as patient location, age, triage category type and issue, and method of arrival (self-transport or ambulance). The report cards presented the data as monthly figures for each variable to examine time of year or seasonal variances. Subsequently, visual data dashboards for each partner site were developed using Microsoft Excel. These included the baseline figures from the original report cards and updated them monthly with ongoing monitoring data, demonstrating patterns and trends between the current project year and the year prior to project commencement.
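
The project built its dashboards in Microsoft Excel, and the underlying workbook is not described in detail. Purely as an illustration of the comparison logic, the sketch below shows one way monthly presentation counts for a project year could be set against the baseline year. The function names and the use of Python are assumptions made for this example, not the project's method.

```python
from collections import Counter
from datetime import date
from typing import Dict, Iterable


def monthly_counts(presentation_dates: Iterable[date]) -> Dict[int, int]:
    """Count presentations per calendar month (1-12), including empty months."""
    counts = Counter(d.month for d in presentation_dates)
    return {month: counts.get(month, 0) for month in range(1, 13)}


def compare_to_baseline(
    baseline_dates: Iterable[date], project_dates: Iterable[date]
) -> Dict[int, Dict[str, int]]:
    """Pair each month's baseline count with the project-year count and the change."""
    baseline = monthly_counts(baseline_dates)
    current = monthly_counts(project_dates)
    return {
        month: {
            "baseline": baseline[month],
            "current": current[month],
            "change": current[month] - baseline[month],
        }
        for month in range(1, 13)
    }
```

Because months are matched by calendar month, a financial-year baseline still lines up July with July and so on, which is what allows seasonal variation to be read across the two years. The same comparison could be repeated for each variable on the report cards, such as triage category or method of arrival.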

This article focuses on the mechanics of the monitoring framework, and does not display the data that were captured by this framework. As such, no formal ethics process was undertaken and all data remain with the project partners who agreed that findings could be utilised for research and project improvement. Ethical considerations were discussed with the team and responsibility for upholding ethical commitments was accepted by the project partners. As part of the review of monitoring activities, the evaluator interviewed key project partner staff whose de-identified comments and perspectives are provided throughout this article.

Monitoring’s ability to enhance learning, decision-making and improvement in the case study project

A utilisation-focused perspective was employed by the evaluator to negotiate and develop the new monitoring system in collaboration with project staff with a strong focus on ‘intended uses by specific intended users’ (Patton, 2012, p. 82). This process identified that monitoring the number of in-scope patient presentations at the regional health service’s emergency department in the year prior to project commencement and comparing this figure again at project end was far too reductionist to offer any information about why or how a change occurred, or to make improvements along the way. The evaluator and project partners assessed the monitoring strategies required to enable data mining of all partner sites for provision of quality information to enhance stakeholders’ learning and ability to make decisions about and improve the project. The use of the baseline report cards and embedding the monitoring activity of visual data dashboards using utilisation-focused strategies was found to be effective in a number of ways, as detailed below.

Initial baseline report cards surfaced learnings and a deeper awareness of the complexities of the issues surrounding urgent care centre usage and therefore increased consideration of feasible project strategies to meet objectives. Compiling the baseline reports highlighted the existing gaps in each of the three small rural health services’ urgent care departments’ processes. It was found that there were significant data gaps in the services’ documentation due to poor or outdated systems. These baseline data activities led to decisions around administrative changes and improvements in data collection processes.

Recording patient presentations was found to be inefficient as these presentations were recorded in multiple ad hoc ways such as in hardcopy books, electronic spreadsheets and entries into patient management systems. Due to the new baseline data collection activities, one partner site learned that there were wide discrepancies in patient presentation data records and actioned staff professional development in electronic client management systems. This resulted in consistent and efficient data collection of patient presentations.

Learning about these discrepancies catalysed a ripple effect across the rural health service sites where staff initiated actions to decommission all hardcopy systems, commence electronic record systems and train staff in the use of electronic client databases. As a result of learnings from the monitoring framework, stakeholders made evidence-informed decisions to improve their systems through the introduction of these standardised electronic recording methods and removal of ad hoc paper-based methods. This occurred alongside staff communication across all shifts to ensure change management.

Project partners became engaged in the project and the evaluative activities due to recognising the value the baseline report cards had for their organisations. It was then easier for the evaluator to advocate for a strong monitoring framework and for services to convince staff to make changes in their data collection habits as they could see benefit. Staff reported that the learnings and improvements made as a result of the monitoring system were highly valuable for the rural health services, as underreporting affects funding and staffing allocation.

The visual data dashboards were presented at every project control and operational group meeting and sent out with meeting minutes to enable in-depth discussion among staff at all levels (administrative and front-line) within each partner site. This kept partners engaged as they reviewed and discussed the monthly updates in activities we consider informal or everyday evaluation (Kelly, 2021; Wadsworth, 2011). Attendance at meetings of nominated representative managers, team leaders and front-line staff, which was originally poor, grew to over 90% for the duration of the project. Staff commented that relationships and collaboration within and between partner sites were strengthened due to the monitoring data and the information it provided that enabled them to learn and make informed decisions to support project improvement.

The visual data dashboards also provided relevant information for decision-making for the project partners. For example, monitoring data showed that high numbers of in-scope patients were presenting to the regional emergency department with a particular health care issue, and investigation revealed a lack of a referral pathway in the local area. The monitoring data showed that children and youth presenting with certain ailments (particularly abdominal pain) were over-represented at the regional service. Further exploration discovered that this anomaly was due to a lack of local rural practitioner confidence in treating ailments common to this group and low practitioner understanding of which tests to complete prior to escalation. Once partner sites learned about this gap, they were able to take action to support this need through specific practitioner training and implementing clinical pathways documentation to guide practitioner decision-making. The monitoring data were useful for identifying the issue, providing a rationale for action and supporting decisions.

The monitoring dashboards have continued since the completion of the project. The original Excel design has been improved and replaced by a more sophisticated database. The services’ investment in these more costly and rigorous systems demonstrates the value now placed on monitoring data. Everyday evaluation through discussion and review meetings to examine the monitoring data has become an embedded process with ongoing use of this information for learning, decision-making and improvement.

Mapping monitoring against characteristics of quality

The results of monitoring activities in the case study project demonstrate the potential for monitoring to enable important learnings to inform decision-making and improvement. This section assesses the quality of these monitoring outcomes by analysing them against the core programme evaluation and data quality characteristics introduced at the beginning of this article. Determining the quality of monitoring seeks to promote the profile of monitoring and highlight its potential for catalysing transformative change. This section is framed around the five key programme evaluation standards: utility, feasibility, propriety, accuracy and evaluation (monitoring) accountability; augmented by four key data quality characteristics identified through Chen et al.’s (2014) review: data use, completeness, accuracy and timeliness. Analysis of how the case study meets these indicators of quality provides a rigorous review of the monitoring approach in the case study project and presents implications for practice. While the evaluation standards were developed to assess quality in evaluation, they are highly applicable to monitoring and focus on characteristics we consider vital to monitoring, which are not all identified in standard data quality assessment methods. Utility and accuracy are noted as important characteristics of both data quality and evaluation. However, evaluation standards of feasibility, propriety and evaluation (monitoring) accountability sit outside of standard data quality assessments but are relevant to monitoring. As such, for the purpose of this discussion, we include each of the five evaluation standards and rename evaluation accountability to monitoring accountability, as noted in the ‘Introduction’ section.

Data use is highlighted as a key characteristic of data quality in general (Chen et al., 2014). In addition, utility and utilisation are noted as a (or even the) key element of quality evaluative practice (Kelly, 2019; Patton, 2012). The Joint Committee utility standards expect that all stakeholders are considered and that the purpose of the evaluation is negotiated with their input and support (Yarbrough et al., 2011). These standards expect that evaluations are useful and utilised and set forth that, for this to occur, evaluations must be meaningful to users, have findings communicated effectively and in a timely manner, and engage qualified evaluators. The utilisation of evaluative data was a key concern for the evaluator in the case study project, hence a utilisation-focused perspective was employed from the start to support and maximise instrumental and process uses (Patton, 2012).

The influence of the monitoring approach on decision-making was demonstrated through noteworthy instrumental use by project stakeholders in the case study, made possible by regularly scheduled space to engage in everyday evaluation practices where stakeholders could discuss findings and implement change. The focus on real-time production of information made possible through the monitoring process highlights the inherent nature of the timeliness characteristic of data quality in the case study. The communication mechanisms for support of the monitoring framework were embedded within the project, through the monthly control (leadership level) and operational (front-line staff) group meetings. The influence of this strategy in fostering utility through continuous and timely communication is typified by a project partner in an interview comment:

I think the team leading the project have been very good at communicating . . . the way their data has been presented has been impressive . . . so from a project management point of view that’s been excellent. (Project partner #04)

Processes to support meaningfulness for users were evident from interviews with project partners. Their ability to reflect on change since the implementation of the monitoring framework is exemplified in this interview excerpt:

There has been a whole lot of change. There has been much more open doors between the organisations and an understanding of what’s happening in each one of those [urgent care] facilities. That’s been a really good sharing and so in that sense; the data collection and looking at what we are doing and now implementing those changes. (Project partner #05)

Process in the utility standards is additionally connected to subsequent knowledge translation of monitoring results for change and improvement. The utility of the data was increased through the improved monitoring system and the embedded communication mechanisms, as sharing between stakeholders further informs and provides space for reflection on emerging findings (Donnelly et al., 2014). Thus, attention to meaningfulness for users is a precursor to adopting change (Patton, 2012).

The feasibility standards refer to evaluations’ cost-effectiveness, viability, practicality, contextual-sensitivity and ability to cause minimal disruption to the organisations involved (Yarbrough et al., 2011). The feasibility standards are particularly important in regard to M&E, especially for organisations with limited resources such as non-profits, and are rarely considered as a general data quality characteristic; hence, the need to include the Joint Committee standards in this analysis of monitoring quality. In this case study project, stakeholders noted the tensions of health care expenditure and pressure to show project value-for-money through evaluation outcomes. Feasibility standards were demonstrated through transitioning from the basic Excel monitoring framework to a more robust and permanent system that was devised and pursued by project partners to enhance financial and context feasibility. The value of monitoring as a distinct strategy essentially provided a feasibility study; a proof of concept approach rather than the evaluation segmented as an event. Partner services became invested strategically, intellectually and financially (resulting in instrumental use), and behaviourally, generating change in culturally embedded practices on collection and use of data (resulting in process use).

The propriety standards expect that evaluators act in an ethically appropriate manner and ensure they work inclusively and respectfully with stakeholders (Yarbrough et al., 2011). Like feasibility, propriety and ethics are rarely considered in general reviews of data quality. However, propriety and ethics are of vital importance to M&E, especially in situations where M&E is operating in human-centred situations such as health-care and social programming. The propriety standards include requirements to engage stakeholders in planning and decision-making, when feasible, and communicate with stakeholders transparently and responsively. Upholding ethical obligations is a core principle for evaluators (United Nations Evaluation Group, 2016). The propriety standards in this case study were grounded across the partner services’ hierarchy through inclusive strategies with direct care staff involvement that encouraged active bottom-up participation from project stakeholders. This interlinked with other standards as stakeholder inclusion and ownership translate into understandable and relevant evaluation findings that enhance utilisation (Patton, 2012).

Process use in the propriety attribute included presenting the monitoring data as a communication device (dashboards) to increase the ability of stakeholders to engage with the data. Adams et al. (2015) indicate that building stakeholders’ capacity to appreciate evaluative data is a form of the propriety attribute. This was demonstrable in the case study where stakeholder opinions were valued and guided the design of the monitoring framework and uptake of real-time findings.

The accuracy standards expect that the data sources are rigorous and defensible. Accuracy is also identified as one of the three most-used characteristics of data quality (Chen et al., 2014). Accuracy highlights the importance of strong systematic data collection systems leading to methodological analysis and reasoned conclusions (Yarbrough et al., 2011). In addition, the accuracy standards encompass the need for accurate reporting of information and stakeholder confidence in data accuracy. Accurate data support an organisational culture of grounding decisions in verifiable evidence, which is important for strengthening funder willingness to support initiatives (Maxwell et al., 2016).

By streamlining and rebuilding the rural health services’ monitoring framework, this project was able to systematically reorganise data collection techniques to enhance accuracy and minimise space for human error or data loss. Stakeholders’ strong buy-in and involvement with the monitoring framework meant that they trusted the data and were confident relying on the findings to make changes, such as implementing new clinical pathways in response to needs identified from the data.

Data accuracy was strengthened by the early identification that the project required a monitoring strategy, rather than a simple baseline data capture and event-based summative evaluation. This led to an increased focus on data completeness as another key characteristic of data quality (Chen et al., 2014). In pursuit of data completeness, the monitoring framework in the case study sought to holistically capture operational data to expose areas where data were being poorly recorded or lost. The gaps and inaccuracies in existing data collection systems in some partner organisations would have remained undetected had monitoring not become a focus. In addition, a utilisation-focused perspective that privileged a focus on the intended users helped navigate the tricky space of critically reviewing inadequate data collection systems. The project partner sites were positive about reviewing gaps due to their collaborative role as partners in the evaluative process and their desire for complete and accurate data. This resulted in process uses whereby stakeholders actioned quality improvements to implement more robust data collection methods with reporting mechanisms to facilitate use. An interview participant captured the value of their investment in the monitoring framework through their ability to make beneficial change:

One thing that has come from the project, [previously] we were manually recording our [urgent care] presentations. So now we have learnt how to put them on [the new database]. We can now get some good reports out of [the new database]. That’s a regular admin type of change. So that’s one big change we have put into place. (Project partner #02)

The final set of standards outlines the need for evaluation accountability, here named monitoring accountability. This involves an expectation that monitoring systems will be subjected to internal and external review to examine strengths and weaknesses, and that the monitoring data are carefully recorded and stored (Yarbrough et al., 2011). In the case study project, the monitoring actions provided transparent records with strong built-in mechanisms for research tracking and data storage. While the monitoring framework is yet to be reviewed by someone working outside the project, the framework is constantly being reviewed by the staff of the multiple project partners and the co-located evaluator. Specific stakeholder meetings are scheduled at regular intervals for discussing the monitoring framework, to unpack what is working well and areas that could be improved. Reflecting on the framework through a process of everyday evaluation provides stakeholders and the evaluator with time and space to consider possible amendments and additions, as well as offering a dedicated opportunity to talk through issues and identify potential problems.

Implication of findings

This article highlighted monitoring as offering a distinct contribution to the evaluative purposes of learning, improving and supporting decision-making. Rather than a method of providing supplementary data for interval-based evaluations, monitoring in this case study has demonstrated an ability to be useful and engender change, while upholding expected standards of evaluation and data quality. Rather than being secondary to evaluation, this article has shown that thoughtful implementation and engagement with monitoring data can enhance utilisation through monitoring’s continuous and embedded nature that requires high levels of stakeholder ownership and everyday evaluative interaction.

There is common rhetoric surrounding the need for evaluation to be embedded (McCoy et al., 2014; Patton, 2011; Rogers et al., 2019). However, embedding interval-based evaluation in practice is extremely difficult. Evaluation often ends up being an added burden that organisations commission an evaluator to conduct with minimal stakeholder input (Kelly, 2019). Conversely, as demonstrated in this case study, the monitoring framework utilised the co-located evaluator as a facilitator to guide the development and maintenance of a framework that was largely stakeholder led. As stakeholders were engaged in the development of the framework and are responsible for keeping it running and using the findings, this has resulted in an embedded method of continuous organisational learning and improvement.

This case study project has provided an example to demonstrate that continuous, user-led monitoring supported by everyday evaluative discussions between stakeholders can fulfil many of the findings and process uses, and meet the level of quality, expected from evaluation. As interval-based evaluation often struggles to achieve these uses, it is time to turn the spotlight on timely and embedded monitoring as a solution.

Acknowledgements

The authors would like to acknowledge that the case study described in this paper was funded by Better Care Victoria and led by Numurkah District Health Service.

ORCID iD

Leanne M Kelly

References

Adams, Nnawulezi, & Vandenberg (2015). Expectations to change (EC2): A participatory method for facilitating stakeholder engagement with evaluation findings. American Journal of Evaluation, 36(2), 243–255. https://doi.org/10.1177/1098214014553787

Bell & Aggleton (2016). Interpretive and ethnographic perspectives: Alternative approaches to monitoring and evaluation practice. In Bell & Aggleton (Eds.), Monitoring and evaluation in health and social development: Interpretive and ethnographic perspectives (pp. 1–14). Routledge. http://doi.org/10.4324/9781315730592-1

Boardman (2019). Exploring quality in the implementation of development projects: Insights from development NGOs. Deakin University.

Chen, Hailey, & Wang (2014). A review of data quality assessment methods for public health information systems. International Journal of Environmental Research and Public Health, 11(5), 5170–5207. https://doi.org/10.3390/ijerph110505170

Donnelly, Letts, Klinger, & Shulha (2014). Supporting knowledge translation through evaluation: Evaluator as knowledge broker. Canadian Journal of Program Evaluation, 29(1), 36. https://doi.org/10.3138/cjpe.29.1.36

Guijt (2008). Seeking surprise: Rethinking monitoring for collective learning in rural resource management. Wageningen University.

Harman (2019). The great nonprofit evaluation reboot: A new approach every staff member can understand. CharityChannel LLC.

Hatry, Newcomer, & Wholey (2015). Evaluation challenges, issues, and trends. In Newcomer, Hatry, & Wholey (Eds.), Handbook of practical program evaluation (4th ed., pp. 816–832). Jossey-Bass.

Kelly (2019). What’s the point? Program evaluation in small community development NGOs. Deakin University.

Kelly (2021). Evaluation in small development non-profits: Deadends, victories and alternative routes. Palgrave Macmillan. http://doi.org/10.1007/978-3-030-58979-0

King & Alkin (2019). The centrality of use: Theories of evaluation use and influence and thoughts on the first 50 years of use research. American Journal of Evaluation, 40(3), 431–458. http://doi.org/10.1177/1098214018796328

Maloney (2017). Evaluation: What’s the use? Evaluation Journal of Australasia, 17(4), 25–38. http://doi.org/10.1177/1035719X1701700404

Markiewicz & Patrick (2016). Developing monitoring and evaluation frameworks. SAGE.

Maxwell, Rotz, & Garcia (2016). Data and decision making: Same organization, different perceptions; different organizations, different perceptions. American Journal of Evaluation, 37(4), 463–485. https://doi.org/10.1177/1098214015623634

McCoy, Rose, & Connolly (2014). Approaches to evaluation in Australian child and family welfare organizations. Evaluation and Program Planning, 44, 68–74. https://doi.org/10.1016/j.evalprogplan.2014.02.004

Mikkelsen (2005). Methods for development work and research (2nd ed.). SAGE.

Organisation for Economic Co-operation and Development-Development Assistance Committee. (2010). Glossary of key terms in evaluation and Results Based Management.

Owen (2006). Program evaluation: Forms and approaches (3rd ed.). Allen & Unwin.

Patton, M. Q. (2011). Developmental evaluation: Applying complexity concepts to enhance innovation and use. Guilford Press.

Patton, M. Q. (2012). Essentials of utilization-focused evaluation. SAGE.

Rogers, Kelly, & McCoy (2019). Evaluation literacy: Perspectives of internal evaluators in non-government organizations. Canadian Journal of Program Evaluation, 34(1), 1–20. http://doi.org/10.3138/cjpe.42190

Simister (2017). Developmental evaluation. https://www.intrac.org/wpcms/wp-content/uploads/2017/01/Developmental-evaluation.pdf

Stufflebeam & Coryn (2014). Evaluation theory, models, and applications (2nd ed.). Jossey-Bass.

United Nations Evaluation Group. (2016). Norms and standards for evaluation. United Nations Evaluation Group.

Wadsworth (2011). Everyday evaluation on the run (3rd ed.). Allen & Unwin.

Yarbrough, Shulha, Hopson, & Caruthers (2011). The program evaluation standards: A guide for evaluators and evaluation users (3rd ed.). SAGE.