Steering a Swarm: Compliance and Learning in a Municipal Performance Regime

Abstract

This is a descriptive longitudinal case study of Ontario’s Municipal Performance Measurement Program that examines what happens in the interaction between performance regimes and public agencies. Specifically, from internal databases, archives, and public documents, this study tests propositions of compliance and benchmarking theories with all 444 municipalities in Ontario. Contrary to expectations, there is little evidence that compliance eventually declines for medium to large municipalities; compliance of small municipalities declined for a stretch of years, before being reversed. In addition, we find limited evidence of widespread organizational learning through benchmarking in this intelligence regime.

Keywords

benchmarking, compliance regime, performance regime, local government, intergovernmental relations

Introduction

Municipal performance regimes emerged in the early 2000s, in many cases created by central and provincial governments as a way to provide oversight and guidance to municipal governments. Perhaps the most important goal of these regimes was to institutionalize the process of benchmarking among municipalities. This is accomplished through the “use of formal authority, resources control or information coupled with the way in which these institutions actually use the powers available to them” to increase performance (Talbot & Wiggan, 2010, p. 62). This growing literature on performance regimes includes conceptual expositions (Hood, 2007, 2012), descriptive (Hood, 2008; Nutley, Downe, Martin, & Grace, 2012) and comparative studies (Haubrich & McLean, 2006), reform assessment studies (Downe, Grace, Martin, & Nutley, 2008; Martin, Downe, Grace, & Nutley, 2010; Charbonneau, 2011), and natural experiments (Burgess, Wilson, & Worth, 2013; Propper, Sutton, Whitnall, & Windmeijer, 2010).

Answering Talbot’s (2010) call for “empirical work on what actually happens in the interaction between performance regimes and public agencies” (p. 121), this article offers a longitudinal case study of the Municipal Performance Measurement Program (MPMP) of Ontario, Canada. The study’s aim is threefold. First, we test an element of compliance theory, notably that, because of a lack of incentives or sanctions in the MPMP, weak enforcement, and high autonomy, compliance levels should theoretically be low and in decline (Etienne, 2011, p. 318; Weaver, 2014, pp. 246-247, 249-250). A correlate of this proposition notes that, with heterogeneous populations, “particularly low levels of individual compliance are likely to be concentrated among targets with multiple, and very serious, resource and/or autonomy barriers to compliance” (Weaver, 2014, p. 253). Hence, in small municipalities, the downward trend should be especially sharp. Second, we will examine the MPMP to see whether the necessary conditions for interorganizational learning through benchmarking formulated by Ammons and Roenigk (2015, p. 311) are present as the performance regime becomes more transparent through the years. Finally, we will discuss the explicit and implicit assumptions and theories (Martin et al., 2010) of compliance and learning held by intelligence regimes like the MPMP.

This performance regime involves the Ministry of Municipal Affairs and Housing (MMAH) of Ontario, Canada, and all 444 municipalities found in Ontario. Importantly, the focus of the present study is from the provincial ministry’s point of view as it engages in the process of steering hundreds of municipalities, and not that of hundreds of municipalities interacting with one ministry. To meet these aims, we use both public and previously unreleased archival documents and databases to create a fine-grained chronological description of the patterns of compliance and learning among the 444 municipalities participating in this flexible performance regime.

Examining interorganizational relationships within performance arrangements is central to our understanding of authority, information, control, and agency performance. The Canadian context is especially interesting for theory testing. It is a member of the Anglosphere (Pollitt, 2015), a group of countries exporting public management ideas. Although it is not as large as the United States, it has seen moderate reform and rates of change in its machinery of government and the size of its public sectors. It offers an alternative to smaller counterparts such as the United Kingdom, Australia, or New Zealand (Halligan, 2013, p. 358). The importance of these concepts and relationships spans national borders and forms of government, highlighting the importance of this study for an international audience. The results presented here help us better understand the machinations of current performance regimes and the nature of control and compliance, and help us create more appropriate reforms to these systems with mandates, information sharing, and incentive systems in mind.

Methodological Challenges to Evaluating Administrative Reforms and Benchmarking

Administrative reforms take place on a somewhat frequent basis; nevertheless, it is rare that administrative reform processes offer guidance on how reforms should be evaluated. Looking at five landmark white papers in the United Kingdom over 41 years, Pollitt (2013, p. 916) found that none of them offered clear targets or yardsticks to assess whether or not they were successful. Reviewing the European literature on performance-oriented management reform, Pollitt and Dan (2013) identified 519 studies covering the topic, of which few presented any detailed knowledge of outcome assessment of these reforms. Indeed, they found that “claims and counterclaims outnumber hard, carefully collected evidence by a substantial margin” (Pollitt & Dan, 2013, p. 24).

A primary challenge in assessing administrative reforms, specifically performance regimes, is that control groups generally do not exist at the time of implementation, creating an acute attribution problem. There are, occasionally, natural experiments in which regimes become more demanding in one locale (England, for example) while staying the same in a different location (Scotland; Propper et al., 2010). In yet other cases, regimes may become more opaque, as was the case of Welsh schools (Burgess et al., 2013). However, we are not aware of any randomized controlled trials implementing a municipal performance regime within a country or a province.

In addition to these difficulties, there are myriad performance measures and multiple goals to be tracked (Moynihan, 2013, p. 500), leaving government-wide benchmarking reforms or performance regimes difficult to study in a methodologically rigorous manner. Hence, the effects of reforms on performance are seldom evaluated (Pollitt, 2009, 2013). These difficulties extend to the examination of benchmarking, and assessing the efficacy of benchmarking has been made difficult in part by “the absence of a coherent theory of public sector benchmarking—one relating benchmarking to interorganizational learning” (Ammons & Roenigk, 2015, p. 309). For these reasons, the goals of performance regimes are frequently ambiguous.

There are a number of competing purposes of performance measurement, which in some cases may produce a lack of clear goals for performance regimes, complicating their evaluation. Moreover, there is wide variation internationally as to what constitutes performance within a performance regime. Grace (2012) notes that some of the challenging questions that emerge for those establishing performance regimes are how to optimize performance improvement and elevate the role of benchmarking (p. 57). Unless these questions are addressed in initial program design, regimes may face ambiguous goals at best, and conflicting or competing goals at worst. Even the now-defunct and highly structured English Comprehensive Performance Assessment (CPA) did not escape from such competing logics:

. . . the system developed in an ad hoc, pragmatic way, with no careful evaluation of the costs and benefits of different kinds of performance indicator regimes, no attention whatsoever to the well-documented history of target systems, and no attempt even in the more cerebral parts of government to fashion anything approaching a workable theory of when it is better to use subjective rather than administrative data as performance indicators, or of the circumstances in which it is better to use performance indicators as targets, rankings or background intelligence. (Hood, 2008, p. 12)

At that time, Hood (2008, p. 16) opined that English local government would be better off placing more emphasis on intelligence regimes, and less on targets or rankings regimes. Intelligence regimes are, in Hood’s (2007) conceptualization, those in which organizations share information for the purpose of gaining knowledge, rather than scrutinizing and evaluating which organization is doing better (p. 95; Hood, 2012, p. s88). A recent natural experiment in Welsh schools examined a ranking regime devolving into an intelligence regime and found a decrease in school effectiveness (Burgess et al., 2013, p. 66). Taken together, these circumstances present increasing challenges to studying performance regimes.

The Necessary Conditions for Performance: Compliance, Learning, and Capacity

Characteristics of performance regimes, including the overarching goals of those arrangements, are important; if interpretation of purpose is not shared, there will be a trade-off between compliance and organizational learning (Moxham, 2013, p. 198). If a system is designed with a strict and central emphasis on compliance, it would be unlikely to foster performance improvement (Moxham, 2013, p. 198). Although we recognize this challenge, our argument here is different. We agree with Weaver (2014, p. 243) that some level of compliance is a necessary condition to achieve policy or reform objectives.

The challenge of a performance regime is to create a system in which compliance with the policy objective is achieved alongside organizational learning, and in which the interplay of these characteristics leads to lasting performance improvement. The present study does not delve into the performance gains of the MPMP, nor does it try to explain the determinants of variance in yearly compliance rates. Rather, following the lead of studies on performance regimes (Hood & Dixon, 2010) and performance reforms (Hood & Dixon, 2015), we work to examine and verify whether claimed aspects of reform are indeed present. We focus on three assumptions held about performance regimes and reforms, formulated by compliance theory (Weaver, 2014) and benchmarking theory (Ammons & Roenigk, 2015), that have to do with compliance, learning, and capacity.

Compliance in Performance Regimes

There are a number of different types of performance regimes, and these differences may affect issues of compliance. Before being able to improve performance or increase accountability, members of these regimes need to comply with the demands of the MMAH responsible for oversight, notably the submission of data for the mandated set of performance indicators. This is especially important for intelligence regimes in which there are few demands from one level of government to members at a lower level. Intelligence regimes are, at their essence, “light” versions of compliance regimes. In our discussion here, to comply is to “. . . behave in ways that are consistent with the enunciated objectives of the policy” (Weaver, 2014, p. 243).

For public managers, compliance with a mandate can be favored over the purposeful use of information intended by the spirit of the mandate. Reviewing research on different U.S. federal performance reforms, notably the Government Performance and Results Act (GPRA) and the now-defunct Program Assessment Rating Tool (PART),¹ Moynihan (2013, p. 502) found that PART and GPRA stressed data collection and dissemination routines, and gave relatively little attention to routines to move from passive to purposive performance information use. The GPRA Modernization Act of 2010 added such routines as cross-agency priority goals, agency priority goals, and quarterly reviews for managers of federal departments and agencies. A recent study pooling the data from two distant Government Accountability Office surveys found that the newly imposed routines increased the purposeful use of performance indicators (Moynihan & Kroll, 2016).

A number of European performance regimes share certain similarities here. Surveys of public managers in Portugal note that public managers do not see formal activity plans or reports as performance tools, but rather as compulsory procedures to which they must attend when the budget is finished (De Araújo & Branco, 2009, p. 570). A statistical analysis of approximately 200 municipal managers in Germany by Kroll (2014, p. 187) found that a cynical attitude toward performance information use is an important impediment to performance information use, even when data quality and measurement system maturity are taken into account. A study by Martí, Royo, and Acerete (2012) looked at patterns of disclosure from Spanish municipalities’ websites. An examination of these reports indicated that only 14 out of the 56 cities present service performance indicators, showing a low level of accountability related to the performance of local public services and low levels of compliance with accounting regulations (Martí et al., 2012, p. 878).

Compliance has been noted in extant reports on the Ontario MPMP. A critical review of the reports posted online by municipal managers in Ontario in 2008 stated that municipal managers complied with the mandatory guidelines, but few went out of their way to go beyond this (Schatteman, 2010, p. 542). This calls into question whether these efforts move beyond compliance and into organizational learning and improved performance.

Learning in Performance Regimes

According to Ammons and Roenigk (2015), three conditions need to be met for local government productivity gains via benchmark-induced learning. The first condition is that variations are present in services and results. The second condition is based on internal and external benchmarking as “methods to identify exemplary service delivery must be available and used” (Ammons & Roenigk, 2015, p. 311). The third condition is fulfilled by external benchmarking because “an effective means of interorganizational knowledge dissemination must be employed” (Ammons & Roenigk, 2015, p. 311).

As a goal for a performance regime, learning can in some cases be in opposition to the enforcement of imposed targets. Indeed, learning can be achieved away from public scrutiny or discussion of targets. However, even then, public reports must be produced. In a performance regime, tangible proof of widespread organizational learning may be hard to collect. In the words of Ammons and Roenigk (2015),

When the residue from a benchmarking project includes obvious changes in rules, policies, or procedures—especially when such changes bring new results—the evidence of organizational learning is unmistakable. When the residue is less tangible, as with subtle changes to organizational culture or an informed decision to make no changes to current procedures, evidence of organizational learning may be more difficult to detect. (p. 320)

The likelihood that organizational learning takes place diminishes if a central agency does not provide feedback on the interpretation of attained level of performance (Moxham, 2013, p. 198). This makes finding evidence of organizational learning difficult in intelligence regimes. Hence, for benchmarking theory, public organizations must collect and use performance indicators that enable comparisons with themselves in previous periods (internal benchmarking), or with their peers or with best practices (external benchmarking). If comparisons are not present, managers may have few occasions to make sense of and ultimately learn from the data, instead only complying with a mandate. Learning forums are an opportunity for managers and supervisors to discuss performance data (Moynihan, 2008, p. 179). Referring to these learning forums, Moynihan (2013, p. 504) maintains that a common flaw of performance management reforms is that little to no attention is directed to creating “routines of use.” These routines, a hallmark of legitimate learning, are necessary for shaping organizational capacity for collecting, reporting, and using performance data.

Capacity Building

Institutional and individual capacities to design and use performance information appropriately are often taken for granted in performance management research (Hunter & Nielsen, 2013, p. 15). A lack of administrative capacity may severely inhibit performance-oriented reforms, and longstanding deficits in skills can take years to remedy (Pollitt & Dan, 2013, p. 21); both can affect compliance (Weaver, 2014, p. 262). In their study on performance measurement use, Berman and Wang (2000) defined capacity as managers being able to “(1) relate outputs to operations; (2) collect timely data; have (3) staff capable of analyzing performance data; (4) adequate information technology; and support from (5) department heads and (6) elected officials” (p. 417). Capacity, then, is an important consideration in evaluating performance regimes and reform efforts.

Empirically examining complex concepts such as performance regimes requires attention to the context of those regimes, including an understanding of provincial and local governments, and legislative activity related to these organizational arrangements.

Context and Case Selection

Performance Measurement Systems in North America

To systematically study the effects of performance measurement on reporting in a North American context, we turn our attention to municipal performance measurement regimes in Canadian provinces. This particular context allows us to examine systematic data to assess the main effects of performance measurement. There are relatively few North American equivalents to the now defunct English Comprehensive Area Assessment, where local authorities were mandated to report on standardized indicators to the central government (Audit Commission, 2009). In the Canadian confederation, three systematic and mandatory regimes were operating in early 2016: Ontario’s MPMP, Quebec’s Municipal Management Indicators, and Nova Scotia’s Municipal Indicators. These are the only municipal regimes in North America in which all municipalities are mandated to collect and report the same set of performance indicators. The context of these systems is important, and a more thorough understanding of local governments is necessary before more closely assessing performance regimes.

Municipalities in Ontario

Local government in Canada follows the council-manager model often found in the United States. Like their American counterparts, local governments in Canada are responsible for fire, police, public works, water treatment and distribution, refuse collection, parks and leisure, land use, and transit services (Robertson & Ball, 2002, p. 395). As a reform, the MPMP was not a management fad in circulation, but rather a reform with formal endorsement. The distinction is important (Halligan, 2013, p. 357) for later appreciating the patterns in compliance behavior. Another important contextual aspect for the MPMP is the division of services offered by municipalities in the province of Ontario. Between 1990 and 2007, a local services realignment was undertaken, when “the province shifted downward responsibilities and/or costs for such social programs as public (social) housing, public health, land ambulances, and social assistance, as well as public transit, water and sewer systems, and policing in rural areas” (Tindal & Tindal, 2009, p. 175). The former relationship between the province of Ontario and municipalities was described as “one-size-fits-all benevolent dictatorship” (Siegel, 2009, p. 22).

The Municipal Act of December 2001 came into effect on January 1 of 2003 (Tindal & Tindal, 2009, p. 185). In exchange for increased powers, greater accountability and more stringent reporting requirements were agreed on (Tindal & Tindal, 2009, p. 185). The modalities were in part crafted through consultations with the Association of Municipalities of Ontario, which represents most municipalities. The role of the MMAH is one of advocacy in that its primary responsibility is to “represent the interests of municipalities in cabinet and ensure that municipalities have the legislative powers and financial and other resources needed to carry out their mandate” (Siegel, 2009, p. 37). Many municipalities in Ontario are highly dependent on provincial funding, which could be leveraged by the MMAH to exercise more control over municipalities (Siegel, 2009, p. 42). Interestingly, that power was not used by the MMAH to leverage municipalities on their performance indicators as it generally is for financial information reports. Last, prior to this devolution, municipal mergers took place on a massive scale. From 1996 to 2000, the number of municipalities in Ontario was reduced to 445 from 850 (Siegel, 2009, p. 28). The Municipal Act creates the context in which the Ontario MPMP is set, providing a setting that shapes data collection and reporting, authority, information sharing, and interorganizational relationships. Simultaneously, municipalities in Ontario gained autonomy vis-à-vis the provincial government through an amendment to the Municipal Act aimed at “beginning a new era in which local governments across Ontario have new powers and more autonomy reflective of their status as mature, responsible governments” (Ministry of Municipal Affairs and Housing [MMAH], 2007).

The Ontario Municipal Performance Measurement Program

As a compliance regime, the MPMP provides resources to help municipalities in collecting and reporting performance. It does not use positive or negative incentives, nor does it require or prohibit behaviors (Weaver, 2014, p. 252). In addition, the MPMP does not require reporting on previously established targets, nor does the MMAH rank municipalities on the values of their indicators or rate their performance (Pollanen, 2011, p. 21). As such, meaning surrounding performance is derived by local managers; it is not shared at the provincial level. The MPMP qualifies as an intelligence performance regime, not a rankings or targets regime (Charbonneau, Bromberg, & Henderson, 2015). Intelligence regimes,

. . . do not imply prior judgments about what should be maximized or what the desirable floor or ceiling level of activity should be, in contrast with both targets and rankings, which do require and rest on explicit judgments about what really matters. (Hood, 2012, p. s88)

In the Ontario MPMP, local governments are mandated to collect the values for a set of indicators shared by all other municipalities, send the values of the indicators to the provincial government, and report these measures to the public. Failure by local governments to comply in collecting, sharing, and reporting the values for the indicators does not carry consequences. Thus, the compliance requirements for the MPMP are accountability requirements rather than being focused on performance improvement.

Subsequent sections describe data collection, analytical strategies, results, and concluding comments focusing on the evolution of the Ontario MPMP.

Method and Data

Method

Although others have used cross-sectional interviews with high-ranking officials to study these phenomena (e.g., Nutley et al., 2012), we use more objective written documents and a database containing detailed filing information from 444 municipalities over the span of a decade. The filing dates of performance information denote compliance behavior, and not self-reported survey questions of municipal or individual behavior. This is in line with recent methodological recommendations for more realistic measurement in public administration research (Wright, 2015, p. 800). When assessing the effects of a reform, it is important to keep in mind that longstanding, basic characteristics of arrangements require months or even years to be radically modified (Pollitt & Dan, 2013, p. 17). To be reliable and curb revisionist history, it is important to limit the use of retroactive interviews (Eisenhardt & Graebner, 2007, p. 28), where interviewees are asked to provide input on multiple measurement periods over the course of one interview. The use of contemporary data in the form of documents and annual databases permits the establishment of trends and limits overfitting our explanations to revisionist interviews. Thus, our study uses a method of consilience. Consilience “denotes a range of tests, no one of which is necessarily conclusive on its own, but when put together constitute more powerful evidence if the various results tend to point in the same direction” (Hood & Dixon, 2010, p. i286; 2015, p. 17).

Studying a performance regime longitudinally enables us to uncover its developments over time, something that is seldom done when studying these regimes (Martin, Nutley, Downe, & Grace, 2016, p. 28). The causal mechanisms we are examining here are likely to come about slowly and not suddenly (Falleti & Lynch, 2009, p. 1153).

Data

This article analyzes reports and internal documents about the MPMP from the MMAH of Ontario. These internal documents include program memos, as well as meeting minutes of 24 sessions of the MPMP Advisory Committee from January 2002 to August 2014. Meeting minutes are a rich source of information about the MPMP, as they chronicle discussion of stakeholders from diverse backgrounds, including MMAH staff members and various external stakeholders. They also provide a window into discussions, plans, rationales, and potential alternatives that are contemporaneous, not retrospective. A total of 53 additional public documents were also analyzed, including official MPMP reports and guides and PowerPoint presentations from MPMP program managers. All documents were coded and analyzed by two of the three researchers, and codes were created inductively. Any disparities in the comparisons of codes were resolved through discussion and clarification. From this single set of coded segments emerged a total of 30 single-level themes.

The researchers also had access to five databases used by managers at MMAH, which keep track of non-submission patterns of MPMP data by municipalities. Lastly, the researchers analyzed updated reports displayed on municipal websites. The inclusion of a quantitative element detailing submission and reporting in our analysis of qualitative data in the form of reports and memos adds evidence that we have not selectively chosen our qualitative data to support interpretations (Maxwell, 2010, p. 479). Methodologically, we are using a series of straw-in-the-wind tests and hoop tests (Van Evera, 1997, pp. 31-32) to test elements of compliance theory and benchmarking theory. Failing these easy tests weakens or eliminates our hypotheses (Collier, 2011, p. 825).

Framing the Reforms and Performance Activities: Three Time Periods

The compliance aspect of the MPMP performance regime is focused intently on accountability. Accordingly, our discussion here is framed around the compliance and learning realities of the program, based on two landmark changes to accountability requirements and the three distinct time periods that they delimit. These changes are the implementation, in May of 2007, of a web tool called the Municipal Information & Data Analysis System (MIDAS), an online password-protected portal for external benchmarking queries, and the start, in August of 2009, of publishing the comparative MPMP data with Financial Information Return (FIR) data on the MMAH website. The presence of a mandatory set of indicators enables comparisons across the municipalities in Ontario. The possibilities for organizations engaged in external benchmarking are an important factor in organizational change; the presence of benchmarking was the main explanatory factor for performance management across many public services in a recent meta-analysis about performance improvement (Gerrish, 2016).

We are interested in incremental (Georges & Jones, 2000, p. 674) inter-unit (Ployhart & Vandenberg, 2010, p. 97) changes in learning and compliance from the point of view of the MMAH, the agency overseeing the performance regime. Although Weaver (2014) does not mention the time dynamics of compliance theory, compliance theory tells us that compliance levels should be initially low when units are autonomous, and a declining trend in compliance will occur once an absence of sanctions is observed for non-compliers (Etienne, 2011, p. 318; Weaver, 2014, pp. 246-247, 249-250). Moreover, the rich MPMP data make it possible to test the peer effect hypothesis that “[s]imply knowing non-compliers may signal that noncompliance is unlikely to be detected and punished. But there is also a potential normative effect. Compliance is likely to be higher when non-compliance is seen as socially unacceptable” (Weaver, 2014, pp. 248-249). More simply, non-compliance begets more non-compliance. In addition, the data allow a test of the target heterogeneity hypothesis that “strategies that secure compliance from the modal member of the target population may not work for all” (Weaver, 2014, p. 251). After the first period, learning from external benchmarking should be easier; after the third period, the second and third conditions for learning hypothesized by Ammons and Roenigk (2015, p. 311), internal and external benchmarking, should be the easiest to observe.

Results

Our results from the content analysis of documents and databases are presented with a focus on compliance, accountability, and learning for each of the three time periods of the regime.

Compliance

Compliance in this discussion relates primarily to the submission of documents to the provincial MPMP from local governments. Since May of 2000, there has been, with only one exception, a due date for municipalities to forward their MPMP schedules (separate reports on efficiency and effectiveness measures, plus background information) to MMAH. The only exception was 2009, when MMAH offered an extension until June 30 of 2010 to transmit MPMP data.

Issues of compliance in the first period, which includes the initiation of the MPMP in 2000 through the introduction of the MIDAS online web tool in April of 2007, are marked by significant variation in actual submission between performance and financial data. In a January 15, 2002 MPMP Steering Committee meeting, it was noted that 89% of municipalities had transmitted their FIRs by September 30, 4 months late, versus 75% of MPMP schedules (MPMP advisory committee meeting, 2002b). It was believed that recently restructured municipalities were the laggards in reporting. This report marks the first indication that MPMP was keeping track of municipal compliance with submission of data.

On February 26, 2003, an MPMP advisory committee agreed that an MMAH employee would investigate issues of compliance and would examine the correlation between late financial documents and MPMP 2000 and 2001 submissions to decrease municipal submission tardiness (MPMP advisory committee meeting, 2003). However, levels of compliance in the initial period remained relatively level. The 2003 FIR and MPMP rate of compliance with submission was similar 1 year later (MPMP advisory committee meeting, 2004). The Steering Committee remained keenly interested in issues of report submission, as indicated by substantive discussion of compliance in the September 2006 and March 2007 meetings (MPMP advisory committee meeting, 2006b, 2007).

Discussion after implementation of MIDAS indicated a shift in focus to a more granular understanding of compliance via report submission. In March of 2009, the advisory committee asked for a summary of municipalities not submitting their MPMP, by region and population size (MPMP advisory committee meeting, 2009).

Concern with compliance continued in the third period, marked by the initiation of online publication of performance and financial data starting in August of 2009. An MPMP manager for MMAH, while presenting at a meeting in June 2010, stressed that a mandated regime like the MPMP has three unique challenges: compliance oversight; data warehousing and access; and enthusiasm and engagement versus simply doing what is required by legislation (MPMP Program Manager, 2010, p. 6). Although MMAH was cognizant of these challenges, compliance rates, as measured by timely transmission of MPMP data, remained relatively low. During a March 2012 advisory committee meeting, a member of the MMAH noted that since 2008, there had been a slight drop in timeliness for MPMP submission.

In 2012, an intern at MMAH analyzed the presence of MPMP reports on municipal websites for all municipalities in Ontario, and in July of that year, a briefing note was produced for the MMAH that outlined compliance with the submission of MPMP reports (see Table 1 below, adapted from the official version). Table 1 indicates that municipalities from remote (non-central) regions struggled to post up-to-date reports online.

Table 1.

First Evidence of Compliance: MPMP Report Submission and Publication, by Regions, as Monitored by MMAH in 2012.

Region	Number of municipalities per region	Percentage of 2010 MPMP reports submitted	Percentage of 2010 MPMP reports submitted and available online	Percentage of municipalities with up-to-date reports
Central	78	92	63	62
Eastern	114	82	53	47
North Eastern	110	67	45	36
North Western	34	79	52	41
Western	108	86	58	55
All regions	444	81	54	48

Source. Adapted from MPMP Program Manager’s note (2012b)

Note. MPMP = Municipal Performance Measurement Program; MMAH = Ministry of Municipal Affairs and Housing.
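
As a reader's check on Table 1, and not as part of MMAH's own analysis, the "All regions" row can be approximately recovered as the average of the regional percentages weighted by the number of municipalities in each region. The short Python sketch below assumes the figures transcribed from Table 1; the variable and function names are illustrative only, and because the regional percentages are themselves rounded, the recomputed values are approximations.

```python
# Figures transcribed from Table 1, per region:
# (number of municipalities, % submitted, % submitted and online, % up-to-date)
TABLE_1 = {
    "Central": (78, 92, 63, 62),
    "Eastern": (114, 82, 53, 47),
    "North Eastern": (110, 67, 45, 36),
    "North Western": (34, 79, 52, 41),
    "Western": (108, 86, 58, 55),
}


def weighted_percentages(rows):
    """Average each percentage column, weighting by municipalities per region."""
    total = sum(n for n, *_ in rows.values())
    sums = [0, 0, 0]
    for n, *percentages in rows.values():
        for i, pct in enumerate(percentages):
            sums[i] += n * pct
    return total, [s / total for s in sums]


total, averages = weighted_percentages(TABLE_1)
print(total, [round(a) for a in averages])  # 444 [81, 54, 48]
```

The recomputed values agree with the "All regions" row, which suggests that the provincial figures are dominated by the three most populous regions in terms of municipality counts (Eastern, North Eastern, and Western).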

That year, and again in 2014, tables were generated to identify municipalities that did not submit MPMP data on time. The breakdown of compliance for small versus medium and large municipalities was not addressed in the MMAH reports from their monitoring databases. Figures 1 and 2 present this submission information separately for the two size groups, and comparing them enables us to assess compliance theory’s target heterogeneity hypothesis that “strategies that secure compliance from the modal member of the target population may not work for all” (Weaver, 2014, p. 251).

Figure 1.

Second evidence of compliance: Evolution of MPMP schedules submission for municipalities of less than 10,000 residents (2004-2013).

Figure 2.

Second evidence of compliance: Evolution of MPMP schedules submission for municipalities of 10,000+ residents (2004-2013).

Figures 1 and 2 above indicate that there is a pattern of late submission and non-submission for municipalities of less than 10,000 residents from 2004 to 2010. However, the pattern is later reversed, coinciding with training sessions throughout the province¹ and follow-up with non-compliant municipalities. A caveat to these findings is that the municipalities that do not submit MPMP data are generally small. Although there are many small municipalities in Ontario, most Ontarians live in large cities. Therefore, the share of the total population included in municipalities that never submitted their MPMP data is relatively small. Regardless of size, these issues of compliance raise related questions of accountability and the manner in which these public organizations interact.

Accountability

We now present evidence of accountability, treated here as a component of compliance, for the same three time periods. In April of 2003, 15 MPMP advisory committee members organized a visioning exercise to plan for the next 3 to 5 years. A manager in a larger municipality raised questions of accountability to the public: “[d]ue to lack of great interest by the public with municipal performance results, would it be more prudent to have the program move in the direction of measures useful for management versus accountability?” (MPMP advisory committee meeting, 2003b, p. 3). Early iterations of the MPMP reporting and use of data were seen as ineffective, and a shift in focus to management was seen as a possible improvement on the existing system.

The post-MIDAS period reflected a move to strengthen the MPMP as a tool for accountability. Revisions to the MPMP Handbook in 2007 provided a view of the explicit assumptions held by the decision makers behind the operational design (Martin et al., 2010) of the benchmarking regime in Ontario on how accountability should work:

Performance measurement strengthens accountability. Government today is very complex, so it is important that elected officials and public servants inform taxpayers what the government plans to achieve, what it is actually accomplishing and what public services cost. With this information, taxpayers can make informed decisions about the level of services they desire. This notion of accountability is fundamental to our form of government.

Measuring performance and setting targets effectively establishes an understanding between municipal staff and council, under which all parties develop a clearer understanding of the expected results or standards for each service area. The result is a shared accountability framework between staff and council, which benefits everyone. It helps focus council’s decision-making and helps municipal staff understand the level and type of service delivery required. For the most part, municipalities already serve their taxpayers well, and that is something the public has a right to know. Performance measurement demonstrates to taxpayers how they are being served and the value they are receiving for their tax dollars. (MMAH, 2007, p. 5)

Shared accountability between elected officials and managers comes from clear information from indicators and agreed-on targets. Performance levels for services are assumed to be of high quality. Data that are useful to managers are shared with the public. This explicit logical model does not consider that measures from managers need to be different from measures for the public.

Discussions in an MPMP advisory committee meeting in April 2009 indicated that posting the MPMP data along with the FIR data on MMAH’s website would improve access by the public and improve transparency (MPMP advisory committee meeting, 2009, p. 2). At that same meeting, several other arguments were presented under the “Rationale for posting MPMP Schedules on FIR home page.” These arguments remain an interesting window into the decision to eventually make the MPMP data public. Since the FIR was made public, MMAH found that the quality of the data improved (MPMP advisory committee meeting, 2009, p. 2). Demands from municipalities for comparative data were made to MMAH. A risk of having the media, or a consulting firm like BMA Management Consulting, use the data for rankings was raised. At that point in time, MMAH considered adding a preamble where the MPMP data would be made public. The preamble was constructed as follows: “The information we are reporting is a starting point. Do not base your comparisons on the raw data. Please consider the explanatory notes and factors influencing results that are reported by individual municipalities” (MPMP advisory committee meeting, 2009, pp. 2-3).

The shift to public postings of the MPMP and FIR data on a website indicated an increased official focus on accountability as an end. The official message from the MMAH to municipal managers is that the MPMP is a management tool for improved service delivery and cost savings, and a tool to increase public transparency and accountability through public reporting (MPMP program manager, 2010, p. 13). The third and fourth goals of the MPMP are to inform council members, and to help build cases for grant funding (MPMP program manager, 2010, p. 13). Before presenting the MPMP to municipal managers, a program manager spelled out emphatically what the program is not. “Trust—Province does NOT and will never encourage inappropriate comparisons or attempt to embarrass/punish ‘poor performance.’ MPMP measures are NOT used as a resource to assess successful municipalities for provincial transfer payments and grants” (MPMP program manager, 2010, p. 13). Competition between municipalities is not the working theory of performance improvement. With MPMP data, MMAH is using neither higher provincial transfer (carrots) nor lower transfers (sticks), but rather persuasion (sermons; Vedung, 2010). Accountability and transparency to the public are stressed as a “higher-order objective” for MPMP (MPMP advisory committee meeting, 2012b, p. 2).

During this period, program staff continued to evaluate and assess the proper role of using terms such as accountability and transparency in communication with the public. After touring different localities throughout the province of Ontario for training sessions on MPMP, an MPMP program manager reported his conclusions about the state of the program to the MMAH (2013). In this report, the program manager opined that “words such as accountability and transparency don’t communicate well” (MMAH, 2013, p. 9). Instead, he proposed to present the benefits of the performance regime by informing the public so they better understand the work of the municipalities. He also believed that the regime is more attractive to municipal managers if the benefits of sustaining economic development opportunities are highlighted (MMAH, 2013, p. 9). One of the direct results of the workshops is that 30 participating municipalities that had not previously filled out their MPMP schedules were up-to-date after their training (MMAH, 2013, p. 5).

Other developments in terms of the modalities of the program were discussed at an in-house MPMP meeting in February of 2014. One element under consideration was to require municipalities to report individual performance measurement schedules publicly (MPMP program manager, 2014, p. 9). As of February 2014, committee members were still divided over the possible focus on managerial use of indicators over citizen use (MPMP advisory committee meeting, 2014a, p. 2). It is acknowledged that delays in the timeliness of data reporting are due in part to the comprehensiveness of measures covered by the FIR (MPMP advisory committee meeting, 2014b, p. 2). As of August 2014, there were no changes in sight to modify the Municipal Act for public reporting requirements. MMAH still encourages municipalities to transfer their data. Reporting of MPMP data about services is not tied to grants (MPMP advisory committee meeting, 2014b, p. 2), unlike FIR data that are needed before September 30 to qualify for provincial grants and transfers.

Learning

Hood (2012, p. s86) identified learning and diagnostics as the causal mechanisms explaining how performance improvement would come about in an intelligence regime. We now present evidence of learning for the three time periods indicated above. First, we present evidence of learning in internal benchmarking, and then evidence of the contributions of external benchmarking to learning processes within the MPMP.

Learning via internal benchmarking

Initial stages of the MPMP program showed little formal interest in or emphasis on internal benchmarking processes. Discussions in March of 2007 (MPMP advisory committee meeting, 2007) centered on a decision to mandate 5-year reporting for all municipalities, instead of the 3-year recommendation proposed by the Public Sector Accounting Board (created to set standards for public-sector financial accountability), to allow historical comparisons within municipalities. Other than this mandated reporting, little happened in relation to internal benchmarking for this period.

In contrast, increasing emphasis on internal benchmarking characterized the second period. The MPMP Handbook published by the MMAH in 2007 stated that comparisons of indicators from year to year are the most common type of comparison (p. 18). It is also specified that municipal performance does not change radically from year to year, but that small changes can become significant over the long run (MMAH, 2007, p. 22). The handbook also stipulates that “an important trend could be missed if department results were not reported over time” (MMAH, 2007, p. 22). The handbook concluded that “performance measurement is far more than the annual reporting of past data” (MMAH, 2007, p. 22).

Learning as a result of internal benchmarking became a more central and reified practice when performance reporting was made public via the Internet. At the June 2010 conference presentation mentioned earlier, one of the talking points from an MPMP manager was that non-compliance was present in all regions, partly because of a lack of penalties for not being compliant, although compliance rates vary considerably. The MPMP manager added that the culture of government staff should be changed. To foster the change, the “right incentive and reward system” should be found, “so that people are not doing things just because they have to do it” (MPMP program manager, 2010, p. 7). A constant view of the MMAH is that different stakeholders can evaluate and learn in two ways: “One solution is to track a municipality’s own performance year after year. Another approach is to compare the municipality’s results with those of other, similar municipalities” (MMAH, 2005, p. 2; 2013, p. 1).

From analyses performed in 2012 at MMAH, we assessed in Table 2 the proportions of municipalities where citizens could access chronological context and internal benchmarking data, for the latest MPMP indicators, on their municipality’s website. The intern employed by MMAH compiled data on the presence of internal benchmarks in the annual reports available on the municipalities’ websites, but not external benchmarks.

Table 2.

Evidence of Learning Through Benchmarking: The Presence of Internal Benchmarking Information of MPMP Data in Recent Annual Reports Published Online, by Regions, as Monitored by MMAH in 2012.

	2010 or 2011 MPMP reports with internal comparisons
Region	# No internal comparison in the report	# Some internal comparisons in the report	# Systematic internal comparisons in the report	% Some or systematic internal comparison in the report
Central	16	1	21	58
Eastern	18	4	32	67
North Eastern	23	0	17	43
North Western	3	1	10	79
Western	15	1	43	75
All regions	75	7	123	63

Source. Produced from MMAH’s briefing notes (2012b).

Note. MPMP = Municipal Performance Measurement Program; MMAH = Ministry of Municipal Affairs and Housing.
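
The percentage column of Table 2 is not explained in the source note, but it appears to be the share of reviewed reports showing some or systematic internal comparisons. The Python sketch below is a reader's reconstruction under that assumption, using the counts transcribed from Table 2; the names are illustrative, and rounding half up is assumed because it reproduces the North Eastern value (17 of 40 reports, or 42.5%, shown as 43).

```python
# Counts transcribed from Table 2, per region:
# (no internal comparison, some internal comparisons, systematic comparisons)
TABLE_2 = {
    "Central": (16, 1, 21),
    "Eastern": (18, 4, 32),
    "North Eastern": (23, 0, 17),
    "North Western": (3, 1, 10),
    "Western": (15, 1, 43),
}


def share_with_comparisons(none, some, systematic):
    """Percent of reports with some or systematic comparisons, rounded half up."""
    reports = none + some + systematic
    with_comparisons = some + systematic
    # Integer arithmetic avoids floating-point surprises at exact .5 values.
    return (200 * with_comparisons + reports) // (2 * reports)


shares = {region: share_with_comparisons(*c) for region, c in TABLE_2.items()}
totals = tuple(sum(column) for column in zip(*TABLE_2.values()))
overall = share_with_comparisons(*totals)

print(shares)
print(totals, overall)  # (75, 7, 123) 63
```

Under this reading, the denominators are the 205 reports reviewed online rather than all 444 municipalities, so the 63% figure describes published reports and not the full population of municipalities.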

Consequences for MPMP non-submission were discussed during a June 2013 advisory committee meeting. Because the 2011 MPMP data were missing for nearly a quarter of the municipalities, a dedicated project would be needed to acquire “a complete set of municipal lane kilometre data.” After an outreach effort by the ministry staff to secure roads and bridges data, the result was a boost in municipal lane kilometer data from the initial 77% to 100%.

A July 2012 ministry memo underlined that, like adding comments to accompany results and using the MMAH reporting template, comparing current and past results is encouraged but not mandatory for municipalities (MPMP Program Manager, 2012b, p. 2). Multi-year reports made their way into the 2012 template and were posted on the FIR website in the spring of 2014. Organizational learning occurred through a number of processes related to both external and internal benchmarking.

Learning via external benchmarking

External benchmarking activities also varied over the course of the MPMP program history. Early on, the MMAH was asked repeatedly by municipalities to view data from comparable municipalities (MPMP advisory committee meeting, 2002a, p. 2), and resultant recommendations from the MMAH indicated that requests should be made directly to specific municipalities (MPMP advisory committee meeting, 2002a, p. 2). At the end of that same year, the MMAH would provide summative data reports without identifying information tied to specific municipalities, to overcome resistance from municipalities (MPMP advisory committee meeting, 2002b, p. 1). A number of committee members affected by the web tool project expressed concern over invalid comparisons from municipalities themselves. “It was suggested that the web tool provide municipalities with a list of similar municipalities for comparison purposes” (MPMP advisory committee meeting, 2006a, p. 1).

MMAH formally recognized the importance of external comparison in the MPMP Handbook. “Comparisons make it possible to discover which municipalities have practices that may be emulated” (MMAH, 2007, p. 18). Just after this recommendation, the handbook states that there is little value in the raw indicators if they are not compared with something else. The handbook clarifies this in the sections titled “sharing performance measurement results” and “analyzing results,” and presents similar arguments in both. For analytic purposes,

Comparing Performance: Sharing performance data is very useful. It is reasonable to ask why some municipalities are able to achieve apparently better efficiency or effectiveness results and determine whether they use management or service delivery methods that could be copied. Even if differences are due to factors beyond a municipality’s control, sharing the data is useful for both taxpayers and municipal officials to understand local performance in light of local circumstances. (MMAH, 2007, p. 21)

A summary report published by the MMAH about the 2005 results (MMAH, 2008, p. 8) highlighted characteristics of municipalities that enable proper comparisons, including geography (regions of Ontario), municipal type (tier), and population size. In the same report, MMAH went as far as to include a detailed checklist for comparisons (MMAH, 2008, p. 11), with 15 items, grouped in eight sub-categories, under “local circumstances,” “council decisions and budget,” and “municipal structure” categories.

The introduction of MIDAS changed the fundamental abilities of municipal managers to engage in reporting and comparison. In April of 2009, it was argued that even if municipalities were now able to generate and customize comparative reports for themselves through MIDAS, there was still value for the MMAH to post the MPMP data. A bandwagon effect could manifest itself where “other municipalities may be motivated to start comparing their own results with other municipalities and become interested in what MIDAS has to offer” (MPMP advisory committee meeting, 2009, p. 2).

The third phase, the introduction of public reporting, saw increased external benchmarking at various points in the data analysis, reporting, and discussion phases. A presentation at the September 2012 Ontario East Municipal Conference (OEMC) by an MPMP program manager presented a three-step logic model where management purposes and learning come after accountability:

1. MPMP: data collection and reporting (transparency)

2. MIDAS: data mining (SWOT analysis and evidence-based decision making)

3. OMKN: identification and adoption of municipal service area best practices (innovation): sharing knowledge, improving service delivery.

Learning in this case specifically comes from customized external benchmarking at Step 2 through MIDAS queries and from best practices showcased by the Ontario Municipal Knowledge Network at Step 3 (MPMP program manager, 2012a, p. 47).

It was noted in December of 2012 that city councilors are attracted to external comparisons (MPMP advisory committee meeting, 2012b, p. 2), thus reiterating the value of comparative data via the existence of the MPMP. Nevertheless, comparisons between municipalities would be complicated by locally determined capital thresholds and allocations of costs among different services (MPMP advisory committee meeting, 2013, p. 4).

Reporting on the training sessions conducted in different localities throughout the province of Ontario, where 344 municipal managers from 162 municipalities participated, an MPMP program manager remarked that small municipalities, particularly the ones in the remote north of Ontario, discarded external comparisons. “It was evident that there are many unchallenged myths such as the feeling that the data was going to make someone look bad rather than developing a baseline to improve performance” (MMAH, 2013, p. 8). What is more, although around 50% of the participants said that they were familiar with MIDAS, less than 20% said that they actually used it (MMAH, 2013, p. 7).

The results presented here over three key time periods of the MPMP program’s history indicated shifts in perspectives of compliance, accountability, and learning via internal and external benchmarking. The next section discusses how these three time periods shift with these key trends.

Discussion

Explicit and Implicit Assumptions and Theories About Learning in an Intelligence Regime

Discussing the state of the complex English CPA municipal target regime in 2008, Hood (2008, p. 12) lamented that very little thought was given to why subjective data are better than hard administrative data, and under which conditions it is better to “use performance indicators as targets, rankings or background intelligence.” Reading the public and internal documents of the MPMP program in Ontario, it becomes evident that substantial thought was given to those questions. At the launch of the program, many municipalities did not have sufficient capacities to use performance information to inform their decisions. Accordingly, a non-confrontational and flexible approach, an intelligence regime, was chosen. An intelligence regime was not seen as a gateway to rankings or targets by MMAH. Rather, with time, learning would occur through the use of performance measures and analyses of internal and external benchmarking. Hence, it is a natural bottom-up process of citizen inquiry to which the managers would be held to account, rather than a fabricated top-down target system.

Performance improvement could occur only if learning happened via the purposeful analysis of data. This process of benchmarking as a learning mechanism is hypothesized in a recent meta-analysis of 49 studies where benchmarking is associated with higher performance (Gerrish, 2016). The theory of performance improvement in an intelligence regime is that capacity begets learning, which begets improvement. There is no pressure to improve performance itself, only to foster capacity. Although examples of external benchmarking are documented for large municipalities with another set of indicators (Alcantara, Leone, & Spicer, 2012, p. 122), evidence for the conditions necessary to learning (Ammons & Roenigk, 2015, p. 320) is only present in a fraction of municipalities in the MPMP. This does not mean that learning did not take place. Rather, it means that even with access to virtually all the information that MMAH has, we are not able to find tangible proof that indicates widespread learning.

Detecting Learning in Intelligence Regimes

To the question of how an intelligence regime aims to enhance performance, Hood (2012) answered that such a regime “[e]ncourage(s) informed choice or developing learning capacity and diagnostic power by adding knowledge about performance, avoiding ratchet effects, threshold effects, and output distortion from gaming behavior” (p. s86). Detecting learning is difficult, particularly in the midst of reforms. The closest trace of a necessary, but not sufficient, condition for learning is for municipal managers to collect internal data of municipalities for many years, or to collect data from other municipalities. As we have seen, from the public reports collected and analyzed, very few municipal annual reports include internal benchmarks. The presence of external benchmarks in annual reports was not, however, studied by MMAH.

The first scenario that might indicate an absence of learning is that among municipal managers who do collect and transmit the value of their MPMP data showing elementary levels of compliance, few bother to analyze their values in light of their own city’s trends or look at the values of other cities. The lack of internal benchmarks in municipal annual documents would be a failed hoop test for that scenario where performance measurement activities are targeted at internal management and reporting to the public. However, a second scenario that would indicate the presence of learning is possible. Performance measurement could either be for internal management or public reporting. In that scenario, managers would indeed collect performance information for learning purposes, but would not include these benchmarks in their public reports.

To validate this second scenario, we could look at the activity within the MIDAS system, where there would be traces of ad hoc external comparison queries formulated by managers. However, the subcontractor running the MIDAS website on behalf of the Association of Municipalities of Ontario does not have the log history of queries made in the system. Along with the non-analysis of external benchmarks in annual reports, it suggests that MMAH does not consider the monitoring of external benchmarking fundamental, even after the latest time period.

Another way to estimate the frequency of learning and capacity-building activities on the part of municipal managers would be to ask them directly in a survey. The closest thing to a comprehensive survey on the use of external benchmarking data is the survey distributed in the regional training sessions run by the MMAH analogous to a learning forum (Moynihan, 2008, p. 179). As we have seen, few municipal managers declared that they are using the MIDAS web tool to compare their municipalities with others.

Aiming to develop capacity building is important in a performance regime. However, discerning whether capacity building happened is a difficult task.

However, gaining external support might be argued to extend the window of opportunity for getting started on behavior change by injecting new energy into the collaboration while also possibly promoting attitude change toward collaboration, especially among fence sitters. Capacity building would not provide these benefits. There is also the danger that creating capacity does not assure that efforts to change either attitudes or behavior will ever take place—It would certainly be not unheard of for an organization to use such capacity mostly for, say, symbolic or turf-building activities. (Kelman & Hong, 2015, p. 147, our emphasis)

In other words, there is a danger that capacity building becomes an objective that is assumed but is impervious to evaluation. This would make the MPMP like most performance reforms that are hard to evaluate (Pollitt, 2009, 2013; Pollitt & Dan, 2013).

Explicit and Implicit Assumptions and Theories About Compliance

MPMP managers endeavored to keep track of accountability compliance over the years. This makes sense, as the MPMP has accountability requirements from its inception. However, this should not be taken for granted, as it was not done in the intelligence regime in the neighboring province of Quebec (Charbonneau, 2011). The MPMP does not have performance improvement requirements. As an intelligence performance regime, where municipalities are free to set their own objectives and priorities, it becomes devilishly difficult to establish whether mass performance improvement is taking place. Contrary to targets or ranking regimes, MMAH could not just count how many targets were met or whether bottom performers caught up to middle or top performers. As such, it should be no surprise that MMAH did not try to make sure that performance improved on a grand scale.

The MPMP was designed and retooled as a voluntary management and reporting tool. The management portion of it rested on a loosely coupled theory of learning. The FIR, a relative of the performance measurement system, is solely a reporting tool. However, if municipalities are really late submitting FIR data, that is, more than 4 months late, they can be disqualified from provincial grants and transfers. The compliance rate for the FIR is virtually 100%.² The compliance rate for MPMP data is not. Notably, there is no downward trend in compliance with the MPMP, although compliance theory predicts otherwise. Efforts from MPMP managers in the later period can be lauded for that reversal.³

Conclusion

Weaver (2014) predicted varying levels of compliance with performance efforts. Promotion of the MPMP program after many years of existence tells us that compliance is indeed an issue. The several training sessions that took place in five regions of Ontario between June and November 2012 are a concrete manifestation of this. As a counterfactual, we can read the copious literature on the now defunct mandatory target performance regime in England on the extent of performance management. All localities used the indicators and were evaluated by the central government based on those indicators. Despite this, there are no studies on the extent of the use of performance measurement. Compliance, as measured by the collection of performance information, was not an issue.

In this research, we look at an additional case to test the limits of the general compliance framework developed by Weaver (2014). Most reform outcomes centered on performance are difficult to evaluate (Pollitt, 2013), because they are not designed to be evaluated (Pollitt, 2009; Pollitt & Dan, 2013). Despite the usual methodological difficulties with attributing results to administrative performance reforms (Bovaird, 2014; Moynihan, 2013), it was possible to provide evidence supporting likely results of the reforms: stable compliance for midsized municipalities, with periodical downward compliance trends for small municipalities, coupled with limited internal benchmarking, and lower external benchmarking. The Field of Dreams model for performance regimes works for some, but not all. Ultimately, what can be demonstrated through the rich detail of qualitative analysis is that intelligence regimes may breed suboptimal compliance. Assumptions about compliance, learning through benchmarking, but also accountability (Charbonneau, 2011) and improvement (Andrews & Martin, 2010; Charbonneau, Bromberg & Henderson, 2015) have been put to the test for intelligence regimes in Wales, Quebec, and Ontario. The benefits of intelligence regimes expected by Hood (2008, 2012) have not been witnessed where such a regime has been implemented. Regardless, examining the implementation and effects of actual performance regimes is essential for improving knowledge and practice about current and future reforms.

Footnotes

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The first author received financial support from the Fonds de recherche du Québec - Société et culture.

Notes

Author Biographies

Étienne Charbonneau is an associate professor of public management at the École nationale d’administration publique, Montreal, Canada. He is also the codirector of the CERGO research center. He is a Fellow at the Center for Organization Research and Design (CORD), based at Arizona State University. He holds a PhD from the School of Public Affairs and Administration at Rutgers-Newark. His research focuses on performance management and citizen satisfaction.

Daniel Bromberg is an assistant professor of public administration at the University of New Hampshire. He holds a PhD from the School of Public Affairs and Administration at Rutgers-Newark. He is a CORD Fellow. His research interests include collaborative governance, government contracting, e-government, and performance management.

Alexander C. Henderson is an assistant professor in the Department of Health Care and Public Administration at Long Island University. He holds a PhD from the School of Public Affairs and Administration at Rutgers-Newark. He is a CORD Fellow. His current research examines frontline behavior in emergency medical services organizations, as well as broader inquiry into the structuring of public safety services. He previously served as chief administrative officer, operational officer, director, and volunteer with several emergency services organizations in suburban Philadelphia.

References

Alcantara, Leone, & Spicer (2012). Responding to policy change from above: Municipal accountability and transparency regimes in Ontario. Journal of Canadian Studies, 46, 112-137.

Ammons, D. N., & Roenigk, D. J. (2015). Benchmarking and interorganizational learning in local government. Journal of Public Administration Research and Theory, 25, 309-335.

Andrews & Martin (2010). Regional variations in public service outcomes: The impact of policy divergence in England, Scotland and Wales. Regional Studies, 44, 919-934.

Audit Commission. (2009). Is there something I should know? Making the most of your information to improve services. London, England.

Berman & Wang (2000). Performance measurement in U.S. counties: Capacity for reform. Public Administration Review, 60, 409-420.

Bovaird (2014). Attributing outcomes to social policy interventions—“Gold standard” or “fool’s gold” in public policy and management? Social Policy & Administration, 48, 1-23.

Burgess, Wilson, & Worth (2013). A natural experiment in school accountability: The impact of school performance information on pupil progress. Journal of Public Economics, 106, 57-67.

Charbonneau (2011). Assessing the effects of an intelligence performance regime: Quebec’s municipal management indicators, 1999-2010. International Review of Administrative Sciences, 77, 733-755.

Charbonneau, É., Bromberg, & Henderson, A. C. (2015). Performance improvement, culture, and regimes: Evidence from the Ontario municipal performance measurement program, 2000-2012. International Journal of Public Sector Management, 28, 105-120.

Collier (2011). Understanding process tracing. Political Science and Politics, 44, 823-830.

De Araújo, J. F. F. E., & Branco, J. F. A. (2009). Implementing performance-based management in the traditional bureaucracy of Portugal. Public Administration, 87, 557-573.

Downe, Grace, Martin, & Nutley (2008). Best value audits in Scotland: Winning without scoring? Public Money & Management, 28, 77-84.

Eisenhardt, K. M., & Graebner, M. E. (2007). Theory building from cases: Opportunities and challenges. Academy of Management Journal, 50, 25-32.

Etienne (2011). Compliance theory: A goal framing approach. Law & Policy, 33, 305-333.

Falleti, T. G., & Lynch, J. F. (2009). Context and causal mechanisms in political analysis. Comparative Political Studies, 42, 1143-1166.

George, J. M., & Jones, G. R. (2000). The role of time in theory and theory building. Journal of Management, 26, 657-684.

Gerrish (2016). The impact of performance management on performance in public organizations: A meta-analysis. Public Administration Review, 76, 48-66.

Grace (2012). From the improvement end of the telescope: Benchmarking and accountability in UK local government. In Fenna & Knüpling (Eds.), Benchmarking in federal systems (pp. 41-60). Melbourne, Australia: Productivity Commission.

Halligan (2013). The role and significance of context in comparing country systems. In Pollitt (Ed.), Context in public policy and management: The missing link? (pp. 356-373). Cheltenham, UK: Edward Elgar.

Haubrich & McLean (2006). Evaluating the performance of local government: A comparison of the assessment regimes in England, Scotland and Wales. Policy Studies, 27, 271-293.

Hood (2007). Public service management by numbers: Why does it vary? Where has it come from? What are the gaps and the puzzles? Public Money & Management, 27, 95-102.

Hood (2008). Options for Britain: Measuring and managing public services performance. Political Quarterly, 79(s1), 7-18.

Hood (2012). Public management by numbers as a performance-enhancing drug: Two hypotheses. Public Administration Review, 72(s1), s85-s92.

Hood & Dixon (2010). The political payoff from performance target systems: No-brainer or no-gainer? Journal of Public Administration Research and Theory, 20(s2), i281-i298.

Hood & Dixon (2015). A government that worked better and cost less? Evaluating three decades of reform and change in UK central government. New York, NY: Oxford University Press.

Hunter, D. E. K., & Nielsen, S. B. (2013). Performance management and evaluation: Exploring complementarities. New Directions for Evaluation, 137, 7-17.

Kelman & Hong (2015). This could be the start of something big: Linking early managerial choices with subsequent organizational performance. Journal of Public Administration Research and Theory, 25, 135-164.

Kroll (2014). Why performance information use varies among public managers: Testing manager-related explanations. International Public Management Journal, 17, 174-201.

Martí, Royo, & Acerete (2012). The effect of new legislation on the disclosure of performance indicators: The case of Spanish local government. International Journal of Public Administration, 35, 873-885.

Martin, Downe, Grace, & Nutley (2010). Validity, utilization and evidence-based policy: The development and impact of performance improvement regimes in local public services. Evaluation, 16(13), 31-42.

Martin, Nutley, Downe, & Grace (2016). Analysing performance assessment in public services: How useful is the concept of a performance regime? Public Administration, 94, 129-145.

Maxwell, J. A. (2010). Using numbers in qualitative research. Qualitative Inquiry, 16, 475-482.

Ministry of Municipal Affairs and Housing. (2005). Municipal Performance Measurement Program Summary of 2002 Results. Toronto, ON. 17p.

Ministry of Municipal Affairs and Housing. (2007, January 2). Amendments to Municipal Act, 2001 proclaimed. http://news.ontario.ca/archive/en/2007/01/02/Amendments-To-Municipal-Act-2001-Proclaimed.html

Ministry of Municipal Affairs and Housing. (2008). Municipal Performance Measurement Program Summary of 2005 Results. Toronto, ON. 170p.

Ministry of Municipal Affairs and Housing. (2013). Exploring the FIR Forest: Unlock the Analytical Power of Financial Information Return and Municipal Performance Measurement Program Data Summary Report of the training sessions that took place in five regions from June-November, 2012. Toronto, ON. 39p.

Ministry of Municipal Affairs and Housing. (2014). MPMP non-submission municipalities database. Toronto, ON.

Moxham (2013). Measuring up: Examining the potential for voluntary sector performance measurement to improve public service delivery. Public Money & Management, 33, 193-200.

Moynihan, D. P. (2008). The dynamics of performance management: Constructing information and reform. Washington, DC: Georgetown University Press.

Moynihan, D. P. (2013). Advancing the empirical study of performance management: What we learned from the program assessment rating tool. American Review of Public Administration, 43, 499-517.

Moynihan, D. P., & Kroll (2016). Performance management routines that work? An early assessment of the GPRA Modernization Act. Public Administration Review, 76, 314-323.

MPMP Advisory Committee. (2002a). January 15th MPMP Advisory Committee Meeting. In: Ministry of Municipal Affairs and Housing, Toronto, ON. 1-3.

MPMP Advisory Committee. (2002b). December 10th MPMP Advisory Committee Meeting. In: Ministry of Municipal Affairs and Housing, Toronto, ON. 1-3.

MPMP Advisory Committee. (2003a). February 26th MPMP Advisory Committee Meeting. In: Ministry of Municipal Affairs and Housing, Toronto, ON. 1-3.

MPMP Advisory Committee. (2003b). April 8th MPMP Advisory Committee Meeting. In: Ministry of Municipal Affairs and Housing, Toronto, ON. 1-10.

MPMP Advisory Committee. (2004). November 24th MPMP Advisory Committee Meeting. In: Ministry of Municipal Affairs and Housing, Toronto, ON. 1-4.

MPMP Advisory Committee. (2006a). May 16th MPMP Advisory Committee Meeting. In: Ministry of Municipal Affairs and Housing, Toronto, ON. 1-3.

MPMP Advisory Committee. (2006b). September 29th MPMP Advisory Committee Meeting. In: Ministry of Municipal Affairs and Housing, Toronto, ON. 1-4.

MPMP Advisory Committee. (2007). March 7th MPMP Advisory Committee Meeting. In: Ministry of Municipal Affairs and Housing, Toronto, ON. 1-4.

MPMP Advisory Committee. (2009). April 27th MPMP Advisory Committee Meeting. In: Ministry of Municipal Affairs and Housing, Toronto, ON. 1-5.

MPMP Advisory Committee. (2012b). March 8th MPMP Advisory Committee Meeting. In: Ministry of Municipal Affairs and Housing, Toronto, ON. 1-4.

MPMP Advisory Committee. (2013). June 6th MPMP Advisory Committee Meeting. In: Ministry of Municipal Affairs and Housing, Toronto, ON. 1-5.

MPMP Advisory Committee. (2014b). August 26th MPMP Advisory Committee Meeting. In: Ministry of Municipal Affairs and Housing, Toronto, ON. 1-2.

MPMP Program Manager. (2010). Service Improvement Information Sharing Transparency. In: CEPMA conference, June 10th. Toronto, ON.

MPMP Program Manager. (2012a). Roll up the Municipal Performance Measurement Program to win! Assessing value in the MPMP. Kingston, ON. 1-52.

MPMP Program Manager. (2012b). Briefing Note. In: Ministry of Municipal Affairs and Housing July 17th. Toronto, ON. 1-4.

MPMP Program Manager. (2014). Municipal performance measurement. In: Ministry of Municipal Affairs and Housing February. Toronto, ON. 1-10.

Nutley, S. M., Downe, J. D., Martin, S. J., & Grace, C. L. (2012). Policy transfer and convergence in the UK: The case of local government performance regimes. Policy & Politics, 40, 193-209.

Ployhart, R. E., & Vandenberg, R. J. (2010). Longitudinal research: The theory, design, and analysis of change. Journal of Management, 36, 94-120.

Pollanen, R. M. (2011). Relative performance benchmarking of local governments: Case of Ontario municipalities. International Journal of Business and Public Administration, 8, 19-33.

Pollitt (2009). Structural change and public service performance: International lessons? Public Money & Management, 29, 285-291.

Pollitt (2013). The evolving narratives of public management reform: 40 years of reform white papers in the UK. Public Management Review, 15, 899-922.

Pollitt (2015). Towards a new world: Some inconvenient truths for Anglosphere public administration. International Review of Administrative Sciences, 81, 3-17.

Pollitt & Dan (2013). Searching for impacts in performance-oriented management reform. Public Performance & Management Review, 37, 7-32.

Propper, Sutton, Whitnall, & Windmeijer (2010). Incentives and targets in hospital care: Evidence from a natural experiment. Journal of Public Economics, 94, 318-335.

Radin, B. A. (1998). The Government Performance and Results Act (GPRA): Hydra-headed monster or flexible management tool? Public Administration Review, 58, 307-316.

Robertson & Ball (2002). Innovation and improvement in the delivery of public services: The use of quality management within local government in Canada. Public Organization Review, 2, 387-405.

Schatteman (2010). The state of Ontario’s municipal performance reports: A critical analysis. Canadian Public Administration, 53, 531-550.

Siegel (2009). Ontario. In Sancton & Young (Eds.), Foundations of governance: Municipal government in Canada’s provinces (pp. 20-69). Toronto, Ontario, Canada: University of Toronto Press.

Talbot (2010). Theories of performance: Organizational and service improvement in the public domain. New York, NY: Oxford University Press.

Talbot & Wiggan (2010). The public value of the national audit office. International Journal of Public Sector Management, 23, 54-70.

Tindal, C. R., & Tindal, S. N. (2009). Local government in Canada. Toronto, Ontario, Canada: Nelson Education.

Van Evera (1997). Guide to methods for students of political science. Ithaca, NY: Cornell University Press.

Vedung (2010). Policy instruments: Typologies and theories. In Bemelmans-Videc, M. L., Rist, R. C., & Vedung (Eds.), Carrots, sticks & sermons: Policy instruments & their evaluation (2nd ed., pp. 59-76). New Brunswick, NJ: Transaction Publishers.

Weaver, R. K. (2014). Compliance regimes and barriers to behavioral change. Governance, 27, 243-265.

White (2012). Playing the wrong PART: The program assessment rating tool and the functions of the president’s budget. Public Administration Review, 72, 112-121.

Wright, B. E. (2015). The science of public administration: Problems, presumptions, progress, and possibilities. Public Administration Review, 75, 795-805.