Are Performance Management Practices Associated With Better Outcomes? Empirical Evidence From New York Public Schools

Abstract

Performance management is widely assumed to be an effective strategy for improving outcomes in the public sector. However, few attempts have been made to empirically test this assumption. Using data on New York City public schools, we examine the relationship between performance management practices by school leaders and educational outcomes, as measured by standardized test scores. The empirical results show that schools that do a better job at performance management indeed have better outcomes in terms of both the level and gain in standardized test scores, even when controlling for student, staffing, and school characteristics. Thus, our findings provide some rare empirical support for the key assumption behind the performance management movement in public administration.

Keywords

performance management, management analysis, educational outcomes

Introduction

How to improve performance and achieve better outcomes is a question that has long faced the public sector and one that has received ever-increasing attention, especially since the reinventing government movement (Moynihan & Pandey, 2005, 2010; Wholey, 2001). During the past few decades, the adoption and implementation of performance measurement and performance management approaches to improving public service delivery and achieving key outcomes have become an important component of administrative reform around the world (Moynihan, 2006; Pollitt, 2006; Wholey, 1999; Yang & Hsieh, 2007). The performance management approach has been embraced for its potential capacity to provide critical information, to evaluate actions, to make better decisions, to allocate resources more effectively, and to strengthen bureaucratic accountability (Behn, 2003; Holzer & Yang, 2004; Poister, 2003).

As opposed to a more traditional focus in public administration on inputs and compliance, performance-based management holds that organizational performance can be improved by focusing on results and effectiveness, highlighting the relationship between managerial effort and achievement of key organizational outcomes (Moynihan, 2006; Moynihan & Pandey, 2005). Performance management is assumed to improve organizational outcomes by specifying goals and setting performance targets and then attempting to achieve those goals through effective management practices (Poister, 2010; Wholey, 1999). As part of this process, data collected from performance measurement systems are essential because they inform public managers about where to direct resources and improvement efforts and provide a way to monitor how much progress is achieved over time. In this way, the use of performance measures improves decision making about future strategies and managerial practices in ways that presumably enhance organizational effectiveness and outcomes (Behn, 2003; Holzer & Yang, 2004; Sanger, 2008).

Although there is widespread acceptance of the performance management approach in public sector organizations, the question of whether it is actually associated with better organizational outcomes has rarely been tested empirically. To a large extent, the lack of prior research along these lines stems from the difficulty of systematically comparing organizations in terms of their engagement in performance management practices, which involve a complex bundle of activities including goal setting, performance measurement, analysis and reporting, and feedback to staff and others involved in the production of outcomes. In addition, it is quite difficult in the public sector to systematically assess these outcomes and, in turn, organizational performance itself. Thus, both the independent variable (performance management practices) and the dependent variable (performance outcomes) turn out to be rather difficult to measure. Fortunately, the New York City Department of Education provides both a systematic assessment of the performance management practices of schools and their leaders and standardized test scores of students as a measure of performance for more than 1,000 of its public schools. This provides a unique opportunity to empirically examine the relationship between performance management practices and outcomes.

This article proceeds in three parts. First, we review the literature on performance measurement and management, with a focus on the key assumption in the literature that performance management practices are positively associated with better outcomes. Second, we introduce our measures of the main independent and dependent variables and present our statistical findings. Finally, we discuss the conclusions and policy implications of our findings.

Literature Review and Theoretical Framework

The new public management (NPM) movement has changed both the ways in which public services are delivered and the premise on which public organizations are administered (Denhardt & Denhardt, 2000; Fattore, Dubois, & Lapenta, 2012; Sanderson, 2001). A key feature of NPM is its attention to organizational results and effectiveness, particularly highlighting the usefulness of performance information. Moreover, public managers are expected to play active, entrepreneurial roles in achieving effectiveness and efficiency by reforming internal management practices and improving their managerial skills (Moynihan & Pandey, 2005). In other words, there are expectations that the link between organizational outcomes and managerial effort should be strengthened; as Moynihan and Pandey (2005) observed, “[public] managers face increased calls to justify their management choices in the context of performance” (p. 422).

The established focus on outcomes in the public sector is reflected in the increasing diffusion of performance measurement systems. The successful utilization of performance measurement in the private sector encouraged this trend in the public sector. Over the past several decades, a burgeoning literature in public administration has facilitated the adoption and implementation of performance measurement and management (Poister & Streib, 1999). It is widely accepted that the problem of information asymmetry is more salient in a top-down institutional environment, which hinders the achievement of performance effectiveness. Therefore, the incentive for developing performance measurement systems often stems from a desire to create an effective control mechanism that addresses such challenges (Heinrich, 1999; Sanderson, 2001). The advocates of performance measurement claim that tracking program progress through measurable indicators not only helps identify gaps between targets and actual outcomes of governmental activities but also provides a more effective mechanism to hold public employees accountable for meeting performance standards (Hatry, 2006; Heinrich, 2002; Propper & Wilson, 2003; Roberts, 2002).

Even though governmental agencies may commit to using performance measurement, there is no guarantee that simply collecting and reporting performance-related data will lead to better outcomes (Bouckaert & Peters, 2002; Hatry, 2002; P. Smith, 1995). Evidence suggests that, without translating performance statistics into information that can be used in actual decision making and management practices, increasing amounts of performance data make little difference in actual organizational performance (Kravchuk & Schack, 1996). Empirical studies also suggest that performance-relevant data generated by outcome-based management systems might not always help in directing program managers’ attention to the most productive performance-management activities (Heinrich, 1999). Based on the literature on the performance management practices of governments, Wegener (1998) argued that “evaluation remains, in the vast majority of applications, a stand-alone tool not integrated into the public governance process” (p. 191). Kravchuk and Schack (1996) also urged that the data collected from performance evaluation systems be reported through “preestablished, highly focused feedback channels” (p. 352). Moynihan and Pandey (2010) even labeled the utilization of performance information as one of “the biggest questions for performance management” research (p. 850). In this context, there are increasing calls in public administration to shift government assessment practices from an evaluation and control orientation to one in which performance information is used to strengthen internal management and improve public service delivery (Sanderson, 2001).

This issue highlights another key assumption underpinning performance management: that management practice matters to public performance. A number of general public management studies suggest that organizational performance is attributable to factors under the control of managers (e.g., Andrews & Boyne, 2010; Boyne, 2003; Ingraham, Joyce, & Donahue, 2003; Meier & O’Toole, 2002; Moynihan & Pandey, 2005; Nicholson-Crotty & O’Toole, 2004; Rainey & Steinbauer, 1999; K. B. Smith & Meier, 1994). These studies contend that, given public managers’ authority and discretion over personnel and resource allocation, their capacity and skills do account for performance improvement through mechanisms such as taking advantage of various resources and technology, motivating and coordinating key actors, providing ongoing feedback, designing tasks and reshaping work settings, and leveraging other inputs to performance.

Correspondingly, management practices become key mechanisms through which performance measurement leads to improved organizational effectiveness. If the quality of management practices is poor, the linkage between performance-related information and performance improvement is weakened and the desired organizational outcomes are unlikely to be achieved. Therefore, generating performance-related data should not be an end in itself but rather a means to allow public managers to evaluate how much progress their operation has achieved over time, inform better decisions that adapt to new priorities or needs, and eventually bring about improvement of key organizational outcomes. In sum, performance management practices attempt, in Rousseau’s (2006) words, to “[move] professional decisions away from personal preference and unsystematic experience toward those based on the best available scientific evidence” (p. 256).

Several studies furnish empirical support for the notion that performance management practices are positively associated with indicators of organizational effectiveness. Melkers and Willoughby (2005) analyzed the use of performance management in the operations of local governments. Based on an extensive mail survey of administrators and budget officials in local governments, they concluded that the adoption of an outcome-based management system allows government agencies to have a better understanding of how much progress they have made toward achieving their goals. Moreover, collecting relevant data about results also appears to have positive impacts on communication within governments, program management, and budgetary decision making. The potential of performance management was also reported in Berman and Wang’s (2000) study of county-level governments, which indicated that governments using performance measurement more often tend to value its influence on the improvement of organizational goal setting and bureaucratic accountability.

Additional support for a link between performance management practices and outcomes comes from studies of federally funded programs such as the Job Training Partnership Act (JTPA). As “one of the pioneers in using performance management” (Barnow, 2000, p. 119), JTPA developed its performance management system consistent with the requirements of the Government Performance and Results Act, setting performance targets by which each program is evaluated in order to determine whether it has achieved desired performance goals, with subsequent rewards and penalties. In this way, local job training programs were encouraged to perform in ways that had a positive impact on program outcomes in a multilevel and decentralized system (Barnow, 2000). The results of Heinrich’s (1999) study suggest that governmental decisions about the renewal of training program contractors and funding allocations did rely considerably on performance data. To this extent, the JTPA performance management system played a significant role in creating incentives for employees and service providers to align their efforts with the goals of state and federal policy. In another study of JTPA, Heinrich (2002) also pointed out that, even though there are concerns about the reliability of administrative data generated by performance management systems, evidence suggests that the utilization of these data still has potential to improve organizational performance.

Evidence regarding the effectiveness of performance management is also provided by studies about the significance of goal clarification and target setting (Boyne & Chen, 2007). Governmental agencies are often criticized as having goals and objectives that are more ambiguous and difficult to measure than their counterparts in the private sector (e.g., Dahl & Lindblom, 1953; Pandey & Rainey, 2006; Rainey, 2009; Warwick, 1975). Performance management calls attention to organizational goals by explicitly focusing on expected levels of performance, developing measurable indicators of goal achievement, and engaging in a strategic planning process around goals and objectives.

In the field of educational administration and policy, much has been written about the factors influencing student learning and performance. Some of the modifiable factors found to influence student academic achievement and other educational outcomes include the skills of school leaders (e.g., Cheng, 1994; Ekholm, 1992; Hallinger, Bickman, & Davis, 1996), teachers’ qualifications (e.g., Clotfelter, Ladd, & Vigdor, 2007; Darling-Hammond, 2000; Wayne & Youngs, 2003), teachers’ attitudes and instructional practices in class (e.g., Brophy, 1986; Carbonaro & Gamoran, 2002; Palardy & Rumberger, 2008), as well as connections and cooperation among the school, family, and community (e.g., Epstein & Sheldon, 2002; Fan & Chen, 2001; Sheldon, 2003). Although these are some of the key modifiable drivers of educational outcomes, attention has also been directed to the potential of reforms in school management systems. In particular, education researchers have investigated the effects of policies that hold school leaders accountable for test scores (e.g., Carnoy & Loeb, 2002; Hanushek & Raymond, 2005; Ladd, 1999). But we know of no prior empirical studies that have explicitly looked at how variation in performance measurement practices at the school level, within a given school system, is related to educational outcomes.

According to Weiss (1995), it is essential to explicate the assumption underlying the relationship between an educational intervention and the intended outcomes, what she calls a theory of change highlighting the logic of cause–effect linkages that describe how and why an initiative works. With respect to performance management in public schools, the theory of change follows this logic: schools use performance indicators to monitor student progress on a regular basis throughout the school year; school leaders use this information to provide feedback and advice to teachers; in turn, teachers use the information and received feedback to help individual students, especially those at risk of not meeting standards, and to generally improve their teaching effectiveness; finally, students receiving this support and enhanced instruction do better on their year-end standardized tests and other outcome measures. In sum, performance management is based on the assumption that the information collected from performance measures in an educational context will provide critical information to school leaders and teachers for setting goals, monitoring plans, modifying practices, and ultimately having a positive influence on students’ test scores.

However, there remain concerns about the actual effectiveness of performance management in practice. A number of prior studies have observed a weak correlation between performance management systems and actual performance outcomes, in part because the performance indicators themselves are often only loosely related to outcomes. Indeed, even in Heinrich’s (1999) study of JTPA, she argues that “the performance standards system [of JTPA program] is not well designed, as performance measures are not strongly correlated with program goals” (p. 363). Barnow (2000) used training sites of JTPA as the unit of analysis and compared enrollees’ earnings and employment after the termination of training programs across different sites. The purpose was to examine whether programs ranking higher in terms of their measured performance were also those that had more positive impacts on trainees (particularly postprogram earnings and employment). His results were somewhat disappointing, as measured indicators produced by JTPA’s performance management system were not closely linked to key program outcomes.

In sum, even though the existing literature generally makes the case for performance management and its potential usefulness in the public sector, there is scant and inconsistent empirical evidence for its actual impact on the performance or outcomes of public organizations. Therefore, we aim to contribute some additional empirical evidence regarding the effects of performance management practices on performance outcomes in the context of public education. We do this using data on New York City public schools, which are unique in many ways because the city’s school system provides both a systematic assessment of performance management practices in its 1,700 schools and standardized test scores for most of its 1.1 million students (New York City Department of Education [DoE], 2012). The city’s school system is perhaps the most diverse in the country, and it has had a tradition of school-level autonomy and experimentation in educational and management approaches (Ravitch, 2010). Thus, the New York City context is ideal for this study because it provides consistent measures of management practices and educational outcomes across a large number of public schools that vary in philosophy and administration. We describe our data and measures in more detail in the next section.

Data and Measures

The data for our study were obtained from the New York City DoE, which both issues a Quality Review of each school’s management practices (DoE, 2008) and provides more traditional statistics on test scores (DoE, 2009a), student characteristics, school staffing, and other school characteristics (DoE, 2009b). After matching and cleaning the available data, there were a total of 1,004 primary and middle schools in our final analytical sample. We excluded high schools from the sample because they do not participate in the same standardized tests and are thus not directly comparable to elementary and middle schools. Because change in student performance is likely to lag management practices, we employ the student test scores and control variables from the 2008-2009 data, 1 year later than the Quality Review data (2007-2008). This lag also at least partly addresses potential ambiguity about the direction of causality in the relationship between the variables (Andrews & Boyne, 2010).

Dependent Variables

There is certainly much debate in education policy about the best ways to measure school or student performance, in particular whether test scores alone should be the main focus or whether multiple performance indicators should also be considered. Although public schools produce multiple outputs and outcomes and thus should arguably be assessed by multiple indicators (Meier & O’Toole, 2002), we have chosen to use two outcome measures based on test scores for several reasons. To begin with, gains in test scores have become the primary performance outcome of interest for most school systems in the United States, particularly gains for low-income, minority, and other disadvantaged students. In addition, the performance management system in New York City public schools is one largely focused on test scores, for example, by identifying underperforming students and intervening to help them raise their test scores by the end of the year. Teachers in the system are also now assessed by the level and gains in test scores of their students. Therefore, we chose to focus both on the overall average level of student achievement, measured by test scores, as well as a measure of student progress for disadvantaged students, and to do so for both English language arts (ELA) and math. Thus, we have four dependent variables in our analysis:

The proportion of all students in the school who meet proficiency standards (Level 3 or 4) in ELA

The proportion of all students in the school who meet proficiency standards (Level 3 or 4) in math

The proportion of students in the school in the lowest third of the distribution who demonstrated progress (gains) on their ELA scores

The proportion of students in the school in the lowest third of the distribution who demonstrated progress (gains) on their math scores

The descriptive statistics for these variables are shown in Table 1.

Table 1.

Descriptive Statistics.

	Mean	SD	Min	Max
Dependent variables
Proportion meeting ELA standards (Levels 3 and 4)	0.69	0.16	0.07	1.00
Proportion meeting math standards (Levels 3 and 4)	0.83	0.12	0.34	1.00
ELA progress of lowest third	0.87	0.06	0.59	1.00
Math progress of lowest third	0.75	0.09	0.38	0.97
Independent variable
Performance management index	3.51	0.60	1.14	5.00
Control variables
Low-income students (squared)	5146.79	2525.23	5.76	10000.00
Recent immigrants	0.02	0.04	0.00	0.78
Black or African American	33.60	29.69	0.00	96.30
Hispanic or Latino	39.98	26.45	1.80	100.00
Asian or Native Hawaiian/Pacific	11.81	17.19	0.00	92.30
American Indian or Alaska Native	0.48	0.51	0.00	4.30
Female	49.45	5.63	0.00	100.00
Student stability	92.98	3.59	76.45	100.00
Log total enrollment	6.25	0.60	3.66	7.67
Student–teacher ratio	11.91	3.30	0.83	30.78
Teachers with 2+ years experience	64.87	19.32	0.00	100.00
Teachers with master’s degree	82.79	10.62	27.00	100.00

Note. ELA = English language arts.

Independent Variables

The New York City DoE’s Quality Review, from which we form our independent variable of interest, comes from an onsite visit to each school conducted by a team of experienced and trained external reviewers from the DoE who observe a wide range of activities at the school related to instructional practices, organizational climate, management techniques, and leadership strategy (DoE, 2008). The 2007-2008 Quality Review had 35 criteria related to gathering data, planning and setting goals, aligning instructional strategy to goals, aligning capacity building to goals, and monitoring and revising. Each school was rated on these criteria using a 5-point scale, with 5 being outstanding and 1 being underdeveloped.

To examine the pattern and potential dimensions of the 35 criteria, we conducted an exploratory factor analysis (using principal components factor analysis, the eigenvalue criterion for retaining factors, and varimax rotation in Stata 11). The factor analysis extracted one especially strong factor that accounted for fully 53% of the variance in the 35 criteria scores. The seven criteria that loaded on this factor are as follows:

Item 2-1: To what extent do school leaders and faculty engage in collaborative processes to set rigorous, objectively measurable goals for improvement, and to develop plans and timeframes for reaching those goals?

Item 5-1: To what extent do the school’s plans for improving student outcomes include interim goals that are objectively measurable and have suitable time frames for measuring success and making adjustments?

Item 5-2: To what extent do the school’s plans for improving teacher outcomes include interim goals that are objectively measurable and have suitable time frames for measuring success and making adjustments?

Item 5-3: To what extent do teachers and faculty use periodic assessments and other diagnostic tools to measure the effectiveness of plans and interventions for individual and groups of students in key areas?

Item 5-4: To what extent do teachers and faculty use the information generated by periodic assessments and other progress measures and comparisons to revise plans immediately in order to reach stated goals?

Item 5-5: To what extent do school leaders track the outcomes of periodic assessments and other diagnostic measures and use the results to make strategic decisions to modify practices to improve student outcomes?

Item 5-6: To what extent do school leaders and staff use each plan’s interim and final outcomes to drive the next stage of goal setting and improvement planning?
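The extraction step described above can be illustrated with a minimal sketch. It runs a principal components analysis on the correlation matrix of synthetic ratings and retains components with eigenvalues greater than 1, which is one common reading of "the eigenvalue criterion" and is an assumption here. The function name, the synthetic data, and the omission of the varimax rotation are all illustrative choices; this is not the Stata procedure used in the study.

```python
import numpy as np

def principal_components(ratings):
    """Eigen-decompose the item correlation matrix (unrotated principal components)."""
    corr = np.corrcoef(ratings, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)          # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]                # sort largest first
    eigvals = np.clip(eigvals[order], 0.0, None)     # guard against tiny negative values
    eigvecs = eigvecs[:, order]
    share = eigvals / eigvals.sum()                  # proportion of variance per component
    loadings = eigvecs * np.sqrt(eigvals)            # item-component correlations
    return eigvals, share, loadings

# Synthetic data (hypothetical): 300 schools, 6 items driven by one common factor.
rng = np.random.default_rng(0)
factor = rng.normal(size=(300, 1))
ratings = factor + 0.5 * rng.normal(size=(300, 6))

eigvals, share, loadings = principal_components(ratings)
retained = int((eigvals > 1.0).sum())                # assumed eigenvalue-greater-than-1 rule
```

With real Quality Review scores, `share[0]` would correspond to the proportion of variance attributed to the first factor, and the rows of `loadings` would indicate which criteria load on it.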

These seven items clearly can be seen as indicators of the effective practice of performance management, including an emphasis on measurable goals, ongoing feedback, targeted intervention, and strategic decision making based on performance information. Therefore, we constructed a scale based on these seven items (taking their average), which has a Cronbach’s alpha coefficient of reliability of .94, suggesting a very high degree of internal consistency among items. The descriptive statistics for this seven-item scale, which is the measure of performance management practices we use in our subsequent analysis, can be found in Table 1. Additionally, it should be noted that the scores on this scale range widely from a low of just more than 1 to a high of 5 (a perfect score) and, moreover, are distributed in a nearly normal fashion around a mean of 3.5. Thus, there is ample variation in the quality of performance management practices across schools in our sample. Finally, it is important to note that, in contrast to many previous studies in public administration, our measure of performance management practices comes from a systematic outside assessment by a team of reviewers and not a self-report by agencies or administrators themselves.
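As a minimal sketch of how a seven-item scale and its reliability coefficient might be computed, the following uses the standard formula for Cronbach's alpha with sample variances. The ratings are made up for illustration and the function names are hypothetical; nothing here reproduces the study's data.

```python
from statistics import mean, variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

def scale_score(row):
    """Scale score as the simple average of the item ratings."""
    return mean(row)

# Hypothetical ratings for five schools on seven criteria (1 to 5 scale).
ratings = [
    [4, 4, 3, 4, 4, 5, 4],
    [2, 2, 2, 3, 2, 2, 2],
    [5, 5, 4, 5, 5, 5, 5],
    [3, 3, 3, 3, 4, 3, 3],
    [1, 2, 1, 1, 2, 1, 1],
]

index = [scale_score(row) for row in ratings]
alpha = cronbach_alpha(ratings)
```

Values of alpha close to 1 indicate that the items move together closely, which is what justifies averaging them into a single index.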

Control Variables

In any analysis of factors influencing public school performance, it is necessary to control for school, teacher, and student characteristics that are known to influence school performance and may be beyond the control of school leaders (Meier & O’Toole, 2002). Following the specification of a fairly standard educational production function (Hanushek, 1979), and given the available data, we have included 12 control variables that have been tested extensively in the existing literature (e.g., Caldas & Bankston, 1997; Clotfelter et al., 2007; Fowler & Walberg, 1991; Greenwald, Hedges, & Laine, 1996; Schwartz, Schmitt, & Lose, 2012), as shown in Table 1.

Regardless of the performance management skills of school leaders, schools with a higher percentage of students from low-income or disadvantaged backgrounds might be expected to have more difficulty attaining high performance (Caldas & Bankston, 1997). Therefore, we include the percentage of low-income students (squared, because this distribution was highly left skewed) and the percentage of students who are recent immigrants. We also include the racial-ethnic composition of the student body (represented by four variables), as well as the percentage of female students (even though schools do not vary as much in gender distribution). Finally, we include a measure of student stability, or the proportion of students who remain at a given school during the year, because it is a proxy for the stability of families and also because it is likely to constrain the ability of school leaders to affect student performance.

School structure and resources are also factors determining academic achievement, even though there are ongoing controversies concerning the magnitude of the influence of these factors (Hanushek, 1996, as cited in Meier & O’Toole, 2002, p. 637). Four measures of school structure and resources are included: total enrollment (logged, because the distribution is highly right skewed), the student–teacher ratio, the percentage of teachers having more than 2 years of teaching experience in the school, and the percentage of teachers holding a master’s degree or above. There are debates about how much school size may influence performance, but it is still important to account for this factor. We assume that the student–teacher ratio would be negatively related to student performance, whereas teacher experience and qualifications would be positively related to performance. Again, Table 1 presents the descriptive statistics for all of the control variables.

Analysis and Results

To examine the effect of performance management on school outcomes, we estimated ordinary least squares models for each of our four dependent variables. The models all include our main independent variable, which is the performance management index, as well as the 12 control variables described above. The results are presented in Table 2. To facilitate interpretation of effect sizes across variables and models, we report all of the regression coefficients in standardized (beta) form.
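To make the notion of standardized (beta) coefficients concrete, here is a minimal sketch that fits ordinary least squares on z-scored variables. The data are synthetic, only two predictors are used rather than the full set, and all variable names are hypothetical; it illustrates the general technique rather than the study's estimation.

```python
import numpy as np

def standardized_betas(X, y):
    """OLS on z-scored predictors and outcome; the intercept is zero by construction."""
    Xz = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    yz = (y - y.mean()) / y.std(ddof=1)
    betas, *_ = np.linalg.lstsq(Xz, yz, rcond=None)
    return betas

# Synthetic data (hypothetical): a management index, one control, and an outcome.
rng = np.random.default_rng(0)
n = 500
pm_index = rng.normal(3.5, 0.6, n)
stability = rng.normal(93.0, 3.6, n)
outcome = 0.04 * pm_index + 0.01 * stability + rng.normal(0.0, 0.1, n)

X = np.column_stack([pm_index, stability])
betas = standardized_betas(X, outcome)

# Equivalent route: rescale the raw slopes by sd(x) / sd(y).
raw = np.linalg.lstsq(np.column_stack([np.ones(n), X]), outcome, rcond=None)[0][1:]
rescaled = raw * X.std(axis=0, ddof=1) / outcome.std(ddof=1)
```

Because each beta is expressed in standard deviation units, coefficients for predictors measured on very different scales (an index from 1 to 5 versus a percentage) can be compared directly.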

Table 2.

Regression Results (Standardized Coefficients).

	Performance Level (% Meeting Standards)		Progress (% Lowest Third Making Progress)
	ELA	Math	ELA	Math
Performance management index	0.15***	0.22***	0.08**	0.09***
Low-income (squared)	−0.13***	0.00	0.17***	0.00
Recent immigrants	−0.16***	−0.10***	−0.16***	0.02
Black	−0.24***	−0.30***	−0.13**	−0.28***
Hispanic	−0.28***	−0.21***	0.08	−0.17***
Asian	0.13***	0.10***	−0.01	−0.01
Native American	−0.01	−0.03	0.07**	−0.03
Female	0.11***	0.10***	−0.05	−0.10***
Student stability	0.25***	0.15***	−0.21***	−0.01
Log total enrollment	−0.10***	−0.10***	0.02	−0.03
Student–teacher ratio	0.16***	0.15***	−0.10**	−0.06
Teachers with 2+ years experience	0.11***	0.15***	0.03	−0.07
Teachers with master’s degree	0.10***	0.17***	0.01	−0.08*
Adj. R²	0.72	0.53	0.17	0.06

Note. ELA = English language arts.

***p < .01. **p < .05. *p < .1.

We begin with the performance models predicting the proportion of students meeting ELA and math test proficiency standards, which appear in the first two columns of Table 2. The models fit the data well, with an adjusted R² of .72 for the ELA model and .53 for the math model. More important, in both models the performance management index is significantly associated with better student achievement, even holding the 12 control variables constant. Indeed, the coefficients suggest that a one standard deviation increase in the performance management index is associated with a 0.15 standard deviation increase in the proportion of students meeting ELA standards and a 0.22 standard deviation increase in the proportion of students meeting math standards. Multiplying by the standard deviations for the proportions meeting ELA and math standards from Table 1, these estimates imply a gain of about 2 to 3 percentage points in the proportion of students meeting standards with each standard deviation improvement in performance management.

The last two columns in Table 2 show the progress models, which predict the proportion of students in the lowest third of the distribution who made progress, or gains over last year, on their ELA and math scores. The overall model fit is not as good, with an adjusted R² of .17 for ELA and only .06 for math. But it is important to point out that predicting a 1-year change in a subgroup of students, as contrasted with the level of test scores for all students, is perhaps more difficult (because of regression to the mean and other random effects) and thus this lower overall explanatory power is perhaps to be expected. Still, the performance management index is significantly associated with both ELA progress and math progress of the lowest third of students. The size of the effects, although not as large as in the models predicting the level of performance, is still not trivial. The coefficients suggest that a one standard deviation increase in the performance management index is associated with a little less than a tenth of a standard deviation change in progress, which translates (again using the standard deviations from Table 1) into an increase of just under 1 percentage point (roughly 0.5 for ELA and 0.8 for math) in the proportion of students demonstrating progress in their ELA and math scores.
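The conversions from standardized coefficients to percentage points in the two preceding paragraphs can be reproduced with a short calculation. The sketch below takes the rounded coefficients from Table 2 and the rounded standard deviations from Table 1, so the results are approximate; the dictionary keys and function name are illustrative.

```python
# Rounded standardized coefficients for the performance management index (Table 2).
beta = {
    "ELA level": 0.15,
    "Math level": 0.22,
    "ELA progress": 0.08,
    "Math progress": 0.09,
}

# Rounded standard deviations of the four outcome proportions (Table 1).
sd_outcome = {
    "ELA level": 0.16,
    "Math level": 0.12,
    "ELA progress": 0.06,
    "Math progress": 0.09,
}

def percentage_point_change(b, sd):
    """Change in the outcome, in percentage points, per 1 SD increase in the index."""
    return b * sd * 100

changes = {
    name: round(percentage_point_change(beta[name], sd_outcome[name]), 2)
    for name in beta
}
```

The two level outcomes come out between 2 and 3 percentage points, and the two progress outcomes come out below 1 percentage point, consistent with the magnitudes reported in the text.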

Although not of primary interest, it is worth briefly noting the findings for the control variables. Nearly all of the control variables are strongly related, as expected, to the proportion of students meeting ELA and math test proficiency standards. The race-ethnicity variables are especially strong predictors, as are the variables for student stability, student–teacher ratio, and teacher experience and qualifications. These same control variables, however, are less consistently related to the progress made by students in the lowest third in ELA and math. This, of course, accounts for the lower explained variance (R²) in these models. But, as noted before, it may simply be more difficult to explain the proportion of students from the lowest third who made gains in their scores (because of regression to the mean and other random effects) than to explain the average level of proficiency of all students in a school.

Finally, we should note some diagnostics about our regression models. To begin with, the predictors are only moderately correlated with each other, with the strongest correlation between percentage of teachers who have a master’s degree and percentage of teachers who have more than 2 years teaching experience (r = .62). Moreover, there was no sign of multicollinearity in the models, with a mean VIF of only 2.2 and the largest individual VIFs at 5.0 and 4.6 (for the percentage of Hispanic/Latino students and percentage of African American students, respectively). We also tested the models for heteroscedasticity and found none. Although a few schools had studentized residuals greater than 3 in absolute value (indicating an outlier), dropping these cases did not change the direction, significance level, or substantive interpretation of the main regression coefficients.
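For readers unfamiliar with the variance inflation factor (VIF), the diagnostic can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' procedure: the function names and the toy data are hypothetical, and the sketch uses the standard identity that the VIF of predictor j, 1 / (1 - R_j²), equals the j-th diagonal element of the inverse of the predictors' correlation matrix.

```python
from math import sqrt


def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length numeric columns."""
    n = len(columns[0])
    means = [sum(col) / n for col in columns]
    centered = [[v - m for v in col] for col, m in zip(columns, means)]
    norms = [sqrt(sum(v * v for v in col)) for col in centered]
    k = len(columns)
    return [
        [sum(a * b for a, b in zip(centered[i], centered[j])) / (norms[i] * norms[j])
         for j in range(k)]
        for i in range(k)
    ]


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    k = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def vifs(columns):
    """VIF for each predictor: the diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


# ASSUMPTION: toy predictor columns for illustration only, not the study's data.
toy_predictors = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 1.0, 4.0, 3.0, 5.0],
    [5.0, 3.0, 4.0, 1.0, 2.0],
]
values = vifs(toy_predictors)
print("VIFs:", [round(v, 2) for v in values])
print("mean VIF:", round(sum(values) / len(values), 2))
```

A common rule of thumb treats VIFs above roughly 10 as a sign of problematic multicollinearity, which is why the largest values reported above (5.0 and 4.6) are not regarded as a concern.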

In sum, our regression results are robust and clearly suggest a link between performance management as practiced by New York City public school leaders and both the level of student performance and the progress made by the lowest third of students, in both ELA and math, even controlling for other important factors that are widely documented in the research on educational productivity.

Discussion and Implications

Using data from New York City’s Department of Education, our study has provided a unique test of the association between performance management practices and performance outcomes for about 1,000 public schools. We found that, consistent with the theory and aims of the performance movement in public management, those schools that engage more fully and effectively in performance management practices do indeed perform better as measured by both the level and gain in standardized reading and math test scores. Moreover, the standardized effects of performance management on these outcomes are relatively large, indeed equal to or greater than the effect of teacher qualifications, which is a well-established focus of educational improvement efforts. As a result, our study offers both methodological advantages and substantive findings with important implications for the theory and practice of performance management.

As for the methodological advantages, we had consistent measures on a large sample of about 1,000 schools and were able to examine the association of performance management with both the level and gain in reading and math test scores, which are widely considered to be important measures of school performance. More important, we were able to employ a more objective index of performance management practices, based on teams of outside observers visiting schools as part of New York City’s Quality Review rather than relying on self-reporting by managers as is often done in other studies. As a result, the measure of performance management employed in our study reduces some of the potential problems plaguing studies using self-reported methods, such as social desirability bias and common source bias. Indeed, we would argue that New York City’s Quality Review provides an ideal data source to tap the basic elements of performance management: an emphasis on measurable goals, ongoing feedback, targeted intervention, and strategic decision making based on performance information.

The substantive findings from our study have important implications for the theoretical and practical debates around performance management reforms. The results provide strong and consistent evidence of a positive impact of performance management on the improvement of organizational outcomes. The relationship appears with respect to both the level and gains in test scores, for reading and math, and holds up even after controlling for other factors such as student, staffing, and school characteristics. Given the detailed inventory of performance management practices captured by the Quality Review, this evidence is consistent with the notion that performance measurement and management practices do matter. Thus, our findings contrast with the generally weak correlations between performance management and outcomes reported in prior studies of job training programs.

Although our study provides evidence for the effectiveness of performance management on outcomes, it still has limitations that should be acknowledged. First, caution needs to be taken in generalizing the results beyond the context of education policy and New York City’s public education system. It may be that reading and math scores are more directly modifiable, through the intervention of teachers and support services, than outcomes in job training or other policy areas. Clearly, efforts to study the link between performance management practices and outcomes should examine a variety of policy contexts. In addition, New York City’s school system has put substantial emphasis and resources into developing its performance measurement and management systems, with increasingly sophisticated tools available to principals and other school leaders for tracking individual student and teacher performance. New York is also the largest and one of the most diverse school systems in the nation. Thus, it is uncertain how much the relationship between performance management and outcomes would generalize to other public school systems.

Second, improving students’ school performance is a complicated process that involves various individual and contextual factors. Our analysis controlled for the potentially confounding effects of various student, staff, and school characteristics, for which we had available data. However, there may be unobservable variables that still could have influenced our results. For example, there could be institutional or community issues or crises at a school that affected test scores and also required the immediate attention of school leaders, distracting them from a focus on performance measures and management. Another possibility is that some schools may raise more private money, through their parent organizations, freeing up the resources and time available to school leaders to focus on performance. Strong parent organizations may also be associated with higher test scores. In addition, the qualifications and experience of school leaders, which unfortunately we did not have data to measure, may be associated with school performance and student learning independent of performance management practices. Taken together, all of these considerations suggest that performance management practices in our study may be partially endogenous with respect to test scores because both may be influenced by external factors that we were unable to measure and thus control for statistically. Future researchers who have data on these potentially important omitted variables could make a contribution by taking them into account. Still, we believe our models do adjust for many important student, teacher, and school characteristics.

Third, the causal relationship between engagement in performance management practices and performance outcomes might be reversed. That is, schools that are performing better on standardized test scores may, because of these more encouraging outcomes, invest more time and effort in the process of reviewing and discussing performance measures. We attempted to address this issue by lagging the index of performance management practices, but still the possibility of reverse causation remains because both management practices and outcomes are likely to be serially correlated. If future research could better test the causal relationships across time, the direction of causality could be addressed and, to some extent, the impact of endogenous factors on school performance would be ruled out as well.

Finally, even though our findings indicate that student academic outcomes may be influenced by performance management practices, the concrete processes and mechanisms through which this happens at the school and classroom level need to be investigated. To this end, it would be very useful to conduct case studies and qualitative observations of performance management as practiced in schools. For example, future studies could focus on the implementation process and operations of public management systems in comparable schools with high and low test scores, with particular attention paid to their performance management practices. By doing this, researchers might identify other essential activities and processes that are not considered in our study, but are still critical to determining the successful transformation of performance information into desired outputs and outcomes of public organizations.

Footnotes

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

Author Biographies

Rusi Sun is a PhD student in the School of Public Affairs and Administration, Rutgers University-Newark. Her research interests include public performance management and improvement, and organizational theory.

Gregg G. Van Ryzin is an associate professor in the School of Public Affairs and Administration, Rutgers University-Newark. He has published widely on various topics in public administration, urban affairs, and program evaluation and is author (with Dahlia K. Remler) of Research Methods in Practice (SAGE). He can be reached at vanryzin@rutgers.edu.

References

Andrews, & Boyne, G. A. (2010). Capacity, leadership, and organizational performance: Testing the Black Box Model of public management. Public Administration Review, 70, 443-454.

Barnow, B. S. (2000). Exploring the relationship between performance management and program impact: A case study of the Job Training Partnership Act. Journal of Policy Analysis and Management, 19, 118-141.

Behn, R. D. (2003). Why measure performance? Different purposes require different measures. Public Administration Review, 63, 586-606.

Berman, & Wang (2000). Performance measurement in U.S. counties: Capacity for reform. Public Administration Review, 60, 409-420.

Bouckaert, & Peters, B. G. (2002). Performance measurement and management: The Achilles’ heel in administrative modernization. Public Performance & Management Review, 25, 359-362.

Boyne, G. A. (2003). Sources of public service improvement: A critical review and research agenda. Journal of Public Administration Research and Theory, 13, 367-394.

Boyne, G. A., & Chen, A. A. (2007). Performance targets and public service improvement. Journal of Public Administration Research and Theory, 17, 455-477.

Brophy (1986). Teacher influences on student achievement. American Psychologist, 41, 1069-1077.

Caldas, S. J., & Bankston (1997). Effect of school population socioeconomic status on individual academic achievement. Journal of Educational Research, 90, 269-277.

Carbonaro, W. J., & Gamoran (2002). The production of achievement inequality in high school English. American Educational Research Journal, 39, 801-827.

Carnoy, & Loeb (2002). Does external accountability affect student outcomes? A cross-state analysis. Educational Evaluation and Policy Analysis, 24, 305-331.

Cheng, Y. C. (1994). Principal’s leadership as a critical factor for school performance: Evidence from multi-levels of primary schools. School Effectiveness and School Improvement, 5, 299-317.

Clotfelter, C. T., Ladd, H. F., & Vigdor, J. L. (2007). Teacher credentials and student achievement: Longitudinal analysis with student fixed effects. Economics of Education Review, 26, 673-682.

Dahl, R. A., & Lindblom, C. E. (1953). Politics, economics and welfare. New York, NY: Harper.

Darling-Hammond (2000). How teacher education matters. Journal of Teacher Education, 51, 166-173.

Denhardt, R. B., & Denhardt, J. V. (2000). The new public service: Serving rather than steering. Public Administration Review, 60, 549-559.

Ekholm (1992). Evaluating the impact of comprehensive school leadership development in Sweden. Education and Urban Society, 24, 365-385.

Epstein, J. L., & Sheldon, S. B. (2002). Present and accounted for: Improving student attendance through family and community involvement. Journal of Educational Research, 9, 308-321.

Fan, & Chen (2001). Parental involvement and students’ academic achievement: A meta-analysis. Educational Psychology Review, 13(1), 1-22.

Fattore, Dubois, H. F. W., & Lapenta (2012). Measuring new public management and governance in political debate. Public Administration Review, 72, 218-227.

Fowler, W. J., & Walberg, H. J. (1991). School size, characteristics, and outcomes. Educational Evaluation and Policy Analysis, 13, 189-202.

Greenwald, Hedges, L. V., & Laine, R. D. (1996). The effect of school resources on student achievement. Review of Educational Research, 66, 361-396.

Hallinger, P. B., Bickman, & Davis (1996). School context, principal, leadership, and student reading achievement. Elementary School Journal, 96, 527-549.

Hanushek, E. A. (1979). Conceptual and empirical issues in the estimation of educational production functions. Journal of Human Resources, 14, 351-388.

Hanushek, E. A. (1996). School resources and student performance. In Burtless (Ed.), Does money matter? (pp. 43-73). Washington, DC: Brookings Institution.

Hanushek, E. A., & Raymond, M. E. (2005). Does school accountability lead to improved student performance? Journal of Policy Analysis and Management, 24, 297-327.

Hatry, H. P. (2002). Performance measurement: Fashions and fallacies. Public Performance & Management Review, 25, 352-358.

Hatry, H. P. (2006). Performance measurement: Getting results (2nd ed.). Washington, DC: Urban Institute Press.

Heinrich, C. J. (1999). Do government bureaucrats make effective use of performance management information? Journal of Public Administration Research and Theory, 9, 363-394.

Heinrich, C. J. (2002). Outcomes-based performance management in the public sector: Implications for government accountability and effectiveness. Public Administration Review, 62, 712-725.

Holzer, & Yang (2004). Performance measurement and improvement: An assessment of the state of the art. International Review of Administrative Sciences, 70(1), 15-31.

Ingraham, P. W., Joyce, P. G., & Donahue, A. K. (2003). Government performance: Why management matters. Baltimore, MD: Johns Hopkins University Press.

Kravchuk, R. S., & Schack, R. W. (1996). Designing effective performance-measurement systems under the Government Performance and Results Act of 1993. Public Administration Review, 56, 348-358.

Ladd, H. F. (1999). The Dallas school accountability and incentive program: An evaluation of its impacts on student outcomes. Economics of Education Review, 18(1), 1-16.

Meier, K. J., & O’Toole, L. J. (2002). Public management and organizational performance: The effect of managerial quality. Journal of Policy Analysis and Management, 21, 629-643.

Melkers, & Willoughby (2005). Models of performance-measurement use in local governments: Understanding budgeting, communication, and lasting effects. Public Administration Review, 65, 180-190.

Moynihan, D. P. (2006). Managing for results in state government: Evaluating a decade of reform. Public Administration Review, 66(1), 77-89.

Moynihan, D. P., & Pandey, S. K. (2005). Testing how management matters in an era of government by performance management. Journal of Public Administration Research and Theory, 15, 421-439.

Moynihan, D. P., & Pandey, S. K. (2010). The big question for performance management: Why do managers use performance information? Journal of Public Administration Research and Theory, 20, 849-866.

New York City Department of Education. (2008). Quality review 2007-2008 [Data file]. Retrieved from http://schools.nyc.gov/Accountability/tools/review/default.htm

New York City Department of Education. (2009a). Progress report citywide elementary/Middle/K-8 2008-2009 [Data file]. Retrieved from http://schools.nyc.gov/Accountability/tools/report/default.htm

New York City Department of Education. (2009b). School demographics and accountability snapshot 2008-2009 [Data file]. Retrieved from http://schools.nyc.gov/Accountability/data/default.htm

New York City Department of Education. (2012). Home page. Retrieved from http://schools.nyc.gov/AboutUs/default.htm

Nicholson-Crotty, & O’Toole, L. J. (2004). Public management and organizational performance: The case of law enforcement agencies. Journal of Public Administration Research and Theory, 14(1), 1-18.

Palardy, G. J., & Rumberger, R. W. (2008). Teacher effectiveness in first grade: The importance of background qualifications, attitudes, and instructional practices for student learning. Educational Evaluation and Policy Analysis, 30, 111-140.

Pandey, S. K., & Rainey, H. G. (2006). Public managers’ perception of organizational goal ambiguity: Analyzing alternative models. International Public Management Journal, 9, 85-112.

Poister, T. H. (2003). Measuring performance in public and nonprofit organizations. San Francisco, CA: Jossey-Bass.

Poister, T. H. (2010). The future of strategic planning in the public sector: Linking strategic management and performance. Public Administration Review, 70, s246-s254.

Poister, T. H., & Streib (1999). Performance measurement in municipal government: Assessing the state of the practice. Public Administration Review, 59, 325-335.

Pollitt (2006). Performance management in practice: A comparative study of executive agencies. Journal of Public Administration Research and Theory, 16(1), 25-44.

Propper, & Wilson (2003). The use and usefulness of performance measures in the public sector. Oxford Review of Economic Policy, 19, 250-267.

Rainey, H. G. (2009). Understanding and managing public organizations (4th ed.). San Francisco, CA: Jossey-Bass.

Rainey, H. G., & Steinbauer (1999). Galloping elephants: Developing elements of a theory of effective government organizations. Journal of Public Administration Research and Theory, 9(1), 1-32.

Ravitch (2010). The death and life of the great American school system: How testing and choice are undermining education. New York, NY: Basic Books.

Roberts, N. C. (2002). Keeping public officials accountable through dialogue: Resolving the accountability paradox. Public Administration Review, 62, 658-669.

Rousseau, D. M. (2006). Is there such a thing as “evidence-based management”? Academy of Management Review, 31, 256-269.

Sanderson (2001). Performance management, evaluation and learning in “modern” local government. Public Administration, 79, 297-313.

Sanger, M. B. (2008). From measurement to management: Breaking through the barriers to state and local performance. Public Administration Review, 68, S70-S85.

Schwartz, R. M., Schmitt, M. C., & Lose, M. K. (2012). Effects of teacher-student ratio in response to intervention approaches. Elementary School Journal, 112, 547-567.

Sheldon, S. B. (2003). Linking school–family–community partnerships in urban elementary schools to student achievement on state tests. Urban Review, 35, 149-165.

Smith, K. B., & Meier, K. J. (1994). Politics, bureaucrats, and schools. Public Administration Review, 54, 551-558.

Smith (1995). On the unintended consequences of publishing performance data in the public sector. International Journal of Public Administration, 18, 277-310.

Warwick, D. P. (1975). A theory of public bureaucracy. Cambridge, MA: Harvard University Press.

Wayne, A. J., & Youngs (2003). Teacher characteristics and student achievement gains: A review. Review of Educational Research, 73(1), 89-122.

Wegener (1998). Evaluating competitively tendered contracts local governments in comparative perspective. Evaluation, 4, 189-203.

Weiss, C. H. (1995). Nothing as practical as good theory: Exploring theory-based evaluation for comprehensive community initiatives for children and families. In Connell, J. P., Kubisch, A. C., Schorr, L. B., & Weiss, C. H. (Eds.), New approaches to evaluating community initiatives: Concepts, methods, and contexts (pp. 65-92). Washington, DC: Aspen Institute.

Wholey, J. S. (1999). Performance-based management: Responding to the challenges. Public Productivity & Management Review, 22, 288-307.

Wholey, J. S. (2001). Managing for results: Roles for evaluators in a new management era. American Journal of Evaluation, 22, 343-347.

Yang, & Hsieh, J. Y. (2007). Managerial effectiveness of government performance measurement: Testing a middle-range model. Public Administration Review, 67, 861-879.