Abstract
Performance management is widely assumed to be an effective strategy for improving outcomes in the public sector. However, few attempts have been made to empirically test this assumption. Using data on New York City public schools, we examine the relationship between performance management practices by school leaders and educational outcomes, as measured by standardized test scores. The empirical results show that schools that do a better job at performance management indeed have better outcomes in terms of both the level and gain in standardized test scores, even when controlling for student, staffing, and school characteristics. Thus, our findings provide some rare empirical support for the key assumption behind the performance management movement in public administration.
Introduction
How to improve performance and achieve better outcomes is a question that has faced the public sector for some time and one that has received ever increasing attention, especially since the reinventing government movement (Moynihan & Pandey, 2005, 2010; Wholey, 2001). During the past few decades, the adoption and implementation of performance measurement and performance management approaches to improving public service delivery and to achieving key outcomes has become an important component of administrative reform around the world (Moynihan, 2006; Pollitt, 2006; Wholey, 1999; Yang & Hsieh, 2007). The performance management approach has been embraced for its potential capacity to provide critical information, to evaluate actions, to make better decisions, to allocate resources more effectively, and to strengthen bureaucratic accountability (Behn, 2003; Holzer & Yang, 2004; Poister, 2003).
As opposed to a more traditional focus in public administration on inputs and compliance, performance-based management holds that organizational performance can be improved by focusing on results and effectiveness, highlighting the relationship between managerial effort and achievement of key organizational outcomes (Moynihan, 2006; Moynihan & Pandey, 2005). Performance management is assumed to improve organizational outcomes by specifying goals and setting performance targets and then attempting to achieve those goals through effective management practices (Poister, 2010; Wholey, 1999). As part of this process, data collected from performance measurement systems are essential because they inform public managers about where to direct resources and improvement efforts and provide a way to monitor how much progress is achieved over time. In this way, the use of performance measures improves decision making about future strategies and managerial practices in ways that presumably enhance organizational effectiveness and outcomes (Behn, 2003; Holzer & Yang, 2004; Sanger, 2008).
Although there is widespread acceptance of the performance management approach in public sector organizations, the question of whether it is actually associated with better organizational outcomes has rarely been tested empirically. To a large extent, the lack of prior research along these lines is because of the difficulty of systematically comparing organizations in terms of their engagement in performance management practices, which involves a complex bundle of activities including goal setting, performance measurement, analysis and reporting, and feedback to staff and others involved in the production of outcomes. In addition, it is quite difficult in the public sector to systematically assess these outcomes and in turn organizational performance itself. Thus, both the independent variable (performance management practices) and the dependent variable (performance outcomes) turn out to be rather difficult to measure. Fortunately, the New York City Department of Education provides both a systematic assessment of the performance management practices of schools and their leaders as well as standardized test scores of students as a measure of performance for more than 1,000 of its public schools. This provides a unique opportunity to empirically examine the relationship between performance management practices and outcomes.
Therefore, this article proceeds in three parts. First, the literature on performance measurement and management will be reviewed, with a focus on the key assumption in the literature that performance management practices are positively associated with better outcomes. Second, we introduce our measures of the main independent and dependent variables and present our statistical findings. Finally, the conclusions and policy implications of our findings will be discussed.
Literature Review and Theoretical Framework
The new public management (NPM) movement has changed both the ways in which public services are delivered and the premise on which public organizations are administered (Denhardt & Denhardt, 2000; Fattore, Dubois, & Lapenta, 2012; Sanderson, 2001). A key feature of NPM is its attention to organizational results and effectiveness, particularly highlighting the usefulness of performance information. Moreover, public managers are expected to play active, entrepreneurial roles in achieving effectiveness and efficiency by reforming internal management practices and improving their managerial skills (Moynihan & Pandey, 2005). In other words, there are expectations that the link between organizational outcomes and managerial effort should be strengthened, as Moynihan and Pandey (2005) observed that “[public] managers face increased calls to justify their management choices in the context of performance” (p. 422).
The established focus on outcomes in the public sector is reflected in the increasing diffusion of performance measurement systems. The successful utilization of performance measurement in the private sector encouraged this trend in the public sector. Over the past several decades, a burgeoning literature in public administration has facilitated the adoption and implementation of performance measurement and management (Poister & Streib, 1999). It is widely accepted that the problem of information asymmetry is more salient in a top-down institutional environment, which hinders the achievement of performance effectiveness. Therefore, the incentive for developing performance measurement systems often stems from a desire to create an effective control mechanism that addresses such challenges (Heinrich, 1999; Sanderson, 2001). The advocates of performance measurement claim that tracking program progress through measurable indicators not only helps identify gaps between targets and actual outcomes of governmental activities but also provides a more effective mechanism to hold public employees accountable for meeting performance standards (Hatry, 2006; Heinrich, 2002; Propper & Wilson, 2003; Roberts, 2002).
Even though governmental agencies may commit to using performance measurement, there is no guarantee that simply collecting and reporting performance-related data will lead to better outcomes (Bouckaert & Peters, 2002; Hatry, 2002; P. Smith, 1995). Evidence suggests that, without translating performance statistics into information that can be used in actual decision making and management practices, increasing amounts of performance data make little difference in actual organizational performance (Kravchuk & Schack, 1996). Empirical studies also suggest that performance-relevant data generated by outcome-based management systems might not always help in directing program managers’ attention to the most productive performance-management activities (Heinrich, 1999). Based on the literature of performance management practices of governments, Wegener (1998) argued that “evaluation remains, in the vast majority of applications, a stand-alone tool not integrated into the public governance process” (p. 191). Kravchuk and Schack (1996) also urge that the data collected from performance evaluation systems should be reported through “preestablished, highly focused feedback channels” (p. 352). Moynihan and Pandey (2010) even label the utilization of performance information as one of “the biggest questions for performance management” research (p. 850). In this context, there are increasing calls in public administration to transfer government assessment practices from an evaluation and control orientation to one in which performance information is used to strengthen internal management and improve public service delivery (Sanderson, 2001).
This issue highlights another key assumption underpinning performance management: that management practice matters to public performance. A number of general public management studies suggest that organizational performance is attributable to factors under the control of managers (e.g., Andrews & Boyne, 2010; Boyne, 2003; Ingraham, Joyce, & Donahue, 2003; Meier & O’Toole, 2002; Moynihan & Pandey, 2005; Nicholson-Crotty & O’Toole, 2004; Rainey & Steinbauer, 1999; K. B. Smith & Meier, 1994). They contended that, given their authority and discretion over personnel and resource allocation, public managers’ capacity and skills do account for performance improvement through mechanisms such as taking advantage of various resources and technology, motivating and coordinating key actors, providing ongoing feedback, designing tasks and reshaping work settings, and leveraging other inputs to performance.
Correspondingly, management practices become key mechanisms through which performance measurement leads to improved organizational effectiveness. If the quality of management practices is poor, the linkage between performance-related information and performance improvement is weakened and the desired organizational outcomes are unlikely to be achieved. Therefore, generating performance-related data should not be an end in itself but rather a means to allow public managers to evaluate how much progress their operation has achieved over time, inform better decisions that adapt to new priorities or needs, and eventually bring about improvement of key organizational outcomes. In sum, performance management practices attempt, in Rousseau’s (2006) words, to “[move] professional decisions away from personal preference and unsystematic experience toward those based on the best available scientific evidence” (p. 256).
Several studies furnish empirical support for the notion that performance management practices are positively associated with indicators of organizational effectiveness. Melkers and Willoughby (2005) analyzed the use of performance management in the operations of local governments. Based on an extensive mail survey of administrators and budget officials in local governments, they concluded that the adoption of an outcome-based management system allows government agencies to have a better understanding of how much progress they have made toward achieving their goals. Moreover, collecting relevant data about results also appears to have positive impacts on communication within governments, enhancing program management, as well as informing budgetary decision making. The potential of performance management was also reported in Berman and Wang’s (2000) study. The authors indicated that, among county-level governments, those using performance measurement more often tend to value its influence on the improvement of organizational goal setting and bureaucratic accountability (Berman & Wang, 2000).
Additional support for a link between performance management practices and outcomes comes from studies of federally funded programs such as the Job Training Partnership Act (JTPA). As “one of the pioneers in using performance management” (Barnow, 2000, p. 119), JTPA developed its performance management system consistent with the requirements of the Government Performance and Results Act, setting performance targets by which each program is evaluated in order to determine whether it has achieved desired performance goals, with subsequent rewards and penalties. In this way, local job training programs were encouraged to perform in ways that had a positive impact on program outcomes in a multilevel and decentralized system (Barnow, 2000). The results of Heinrich’s (1999) study suggest that governmental decisions about the renewal of training program contractors and funding allocations did rely considerably on performance data. To this extent, the JTPA performance management system played a significant role in creating incentives for employees and service providers to align their efforts with the goals of state and federal policy. In another study of JTPA, Heinrich (2002) also pointed out that, even though there are concerns about the reliability of administrative data generated by performance management systems, evidence suggests that the utilization of these data still has potential to improve organizational performance.
Evidence regarding the effectiveness of performance management is also provided by studies about the significance of goal clarification and target setting (Boyne & Chen, 2007). Governmental agencies are often criticized as having goals and objectives that are more ambiguous and difficult to measure than their counterparts in the private sector (e.g., Dahl & Lindblom, 1953; Pandey & Rainey, 2006; Rainey, 2009; Warwick, 1975). Performance management calls attention to organizational goals by explicitly focusing on expected levels of performance, developing measurable indicators of goal achievement, and engaging in a strategic planning process around goals and objectives.
In the field of educational administration and policy, much has been written about the factors influencing student learning and performance. Some of the modifiable factors found to influence student academic achievement and other educational outcomes include the skills of school leaders (e.g., Cheng, 1994; Ekholm, 1992; Hallinger, Bickman, & Davis, 1996), teachers’ qualifications (e.g., Clotfelter, Ladd, & Vigdor, 2007; Darling-Hammond, 2000; Wayne & Youngs, 2003), teachers’ attitudes and instructional practices in class (e.g., Brophy, 1986; Carbonaro & Gamoran, 2002; Palardy & Rumberger, 2008), as well as connections and cooperation among the school, family, and community (e.g., Epstein & Sheldon, 2002; Fan & Chen, 2001; Sheldon, 2003). Although these are some of the key modifiable drivers of educational outcomes, attention has been directed also to the potential of reforms in school management systems. In particular, education researchers have investigated the effects of policies that hold school leaders accountable for test scores (e.g., Carnoy & Loeb, 2002; Hanushek & Raymond, 2005; Ladd, 1999). But we know of no prior empirical studies that have explicitly looked at how variation in performance measurement practices at the school level, within a given school system, is related to educational outcomes.
According to Weiss (1995), it is essential to explicate the assumption underlying the relationship between an educational intervention and the intended outcomes, what she calls a theory of change highlighting the logic of cause–effect linkages that describe how and why an initiative works. With respect to performance management in public schools, the theory of change follows this logic: schools use performance indicators to monitor student progress on a regular basis throughout the school year; school leaders use this information to provide feedback and advice to teachers; in turn, teachers use the information and received feedback to help individual students, especially those at risk of not meeting standards, and to generally improve their teaching effectiveness; finally, students receiving this support and enhanced instruction do better on their year-end standardized tests and other outcome measures. In sum, performance management is based on the assumption that the information collected from performance measures in an educational context will provide critical information to school leaders and teachers for setting goals, monitoring plans, modifying practices, and ultimately having a positive influence on students’ test scores.
However, there remain concerns about the actual effectiveness of performance management in practice. A number of prior studies have observed a weak correlation between performance management systems and actual performance outcomes, in part because the performance indicators themselves are often only loosely related to outcomes. Indeed, even in Heinrich’s (1999) study of JTPA, she argues that “the performance standards system [of JTPA program] is not well designed, as performance measures are not strongly correlated with program goals” (p. 363). Barnow (2000) used training sites of JTPA as the unit of analysis and compared enrollees’ earnings and employment after the termination of training programs across different sites. The purpose was to examine whether programs ranking higher in terms of their measured performance were also those that had more positive impacts on trainees (particularly postprogram earnings and employment). His results were somewhat disappointing, as measured indicators produced by JTPA’s performance management system were not closely linked to key program outcomes.
In sum, even though the existing literature generally makes the case for performance management and its potential usefulness in the public sector, there is scant and inconsistent empirical evidence for its actual impact on the performance or outcomes of public organizations. Therefore, we aim to contribute some additional empirical evidence regarding the effects of performance management practices on performance outcomes in the context of public education. We do this using data on New York City public schools, which is unique in many ways because the city’s school system provides both a systematic assessment of performance management practices in its 1,700 schools along with standardized test scores for most of its 1.1 million students (New York City Department of Education [DoE], 2012). The city’s school system is perhaps the most diverse in the country, and it has had a tradition of school-level autonomy and experimentation in educational and management approaches (Ravitch, 2010). Thus, the New York City context is ideal for this study because it provides consistent measures of management practices and educational outcomes across a large number of public schools that vary in philosophy and administration. We describe our data and measures in more detail in the next section.
Data and Measures
The data for our study were obtained from the New York City DoE, which both issues a Quality Review of each school’s management practices (DoE, 2008) and provides more traditional statistics on test scores (DoE, 2009a), student characteristics, school staffing, and other school characteristics (DoE, 2009b). After matching and cleaning the available data, there were a total of 1,004 elementary and middle schools in our final analytical sample. We excluded high schools from the sample because they do not participate in the same standardized tests and are thus not directly comparable to elementary and middle schools. Because change in student performance is likely to lag management practices, we employ the student test scores and control variables from the 2008-2009 data, 1 year later than the Quality Review data (2007-2008). By doing this, some of the potential confusion regarding the direction of causality in the relationship between the variables is also at least partly addressed (Andrews & Boyne, 2010).
Dependent Variables
There is certainly much debate in education policy about the best ways to measure school or student performance, in particular whether test scores alone should be the main focus or whether multiple performance indicators should also be considered. Although public schools produce multiple outputs and outcomes and thus should arguably be assessed by multiple indicators (Meier & O’Toole, 2002), we have chosen to use two outcome measures based on test scores for several reasons. To begin with, gains in test scores have become the primary performance outcome of interest for most school systems in the United States, particularly gains for low-income, minority, and other disadvantaged students. In addition, the performance management system in New York City public schools is one largely focused on test scores, for example, by identifying underperforming students and intervening to help them raise their test scores by the end of the year. Teachers in the system are also now assessed by the level and gains in test scores of their students. Therefore, we chose to focus both on the overall average level of student achievement, measured by test scores, as well as a measure of student progress for disadvantaged students, and to do so for both English language arts (ELA) and math. Thus, we have four dependent variables in our analysis:
The proportion of all students in the school who meet proficiency standards (Level 3 or 4) in ELA
The proportion of all students in the school who meet proficiency standards (Level 3 or 4) in math
The proportion of students in the school in the lowest third of the distribution who demonstrated progress (gains) on their ELA scores
The proportion of students in the school in the lowest third of the distribution who demonstrated progress (gains) on their math scores
The descriptive statistics for these variables are shown in Table 1.
Table 1. Descriptive Statistics.
Note. ELA = English language arts.
Independent Variables
The New York City DoE’s Quality Review, from which we form our independent variable of interest, comes from an onsite visit to each school conducted by a team of experienced and trained external reviewers from the DoE who observe a wide range of activities at the school related to instructional practices, organizational climate, management techniques, and leadership strategy (DoE, 2008). The 2007-2008 Quality Review had 35 criteria related to gathering data, planning and setting goals, aligning instructional strategy to goals, aligning capacity building to goals, and monitoring and revising. Each school was rated on these criteria using a 5-point scale, with 5 being outstanding and 1 being underdeveloped.
To examine the pattern and potential dimensions of the 35 criteria, we conducted an exploratory factor analysis (using principal components factor analysis, the eigenvalue criterion for retaining factors, and varimax rotation in Stata 11). The factor analysis extracted one especially strong factor that accounted for fully 53% of the variance in the 35 criteria scores. The seven criteria that loaded on this factor are as follows:
Item 2-1: To what extent do school leaders and faculty engage in collaborative processes to set rigorous, objectively measurable goals for improvement, and to develop plans and timeframes for reaching those goals?
Item 5-1: To what extent do the school’s plans for improving student outcomes include interim goals that are objectively measurable and have suitable time frames for measuring success and making adjustments?
Item 5-2: To what extent do the school’s plans for improving teacher outcomes include interim goals that are objectively measurable and have suitable time frames for measuring success and making adjustments?
Item 5-3: To what extent do teachers and faculty use periodic assessments and other diagnostic tools to measure the effectiveness of plans and interventions for individual and groups of students in key areas?
Item 5-4: To what extent do teachers and faculty use the information generated by periodic assessments and other progress measures and comparisons to revise plans immediately in order to reach stated goals?
Item 5-5: To what extent do school leaders track the outcomes of periodic assessments and other diagnostic measures and use the results to make strategic decisions to modify practices to improve student outcomes?
Item 5-6: To what extent do school leaders and staff use each plan’s interim and final outcomes to drive the next stage of goal setting and improvement planning?
These seven items clearly can be seen as indicators of the effective practice of performance management, including an emphasis on measurable goals, ongoing feedback, targeted intervention, and strategic decision making based on performance information. Therefore, we constructed a scale based on these seven items (taking their average), which has a Cronbach’s alpha coefficient of reliability of .94, suggesting a very high degree of internal consistency among items. The descriptive statistics for this seven-item scale, which is the measure of performance management practices we use in our subsequent analysis, can be found in Table 1. Additionally, it should be noted that the scores on this scale range widely from a low of just more than 1 to a high of 5 (a perfect score) and, moreover, are distributed in a nearly normal fashion around a mean of 3.5. Thus, there is ample variation in the quality of performance management practices across schools in our sample. Finally, it is important to note that, in contrast to many previous studies in public administration, our measure of performance management practices comes from a systematic outside assessment by a team of reviewers and not a self-report by agencies or administrators themselves.
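The scale construction described above can be illustrated with a minimal sketch. It assumes the common variance-based formula for Cronbach’s alpha and uses made-up ratings for five schools on three items; the function names and the data are hypothetical and are not drawn from the Quality Review, and the original analysis was run in Stata rather than Python.

```python
from statistics import mean, variance


def scale_scores(ratings):
    """Average each school's item ratings into a single scale score."""
    return [mean(row) for row in ratings]


def cronbach_alpha(ratings):
    """Cronbach's alpha from a list of rows (schools) by columns (items).

    alpha = k / (k - 1) * (1 - sum of item variances / variance of row totals)
    """
    k = len(ratings[0])
    item_columns = list(zip(*ratings))
    item_variance_sum = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in ratings])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical ratings on a 1-5 scale: 5 schools x 3 items (illustrative only).
ratings = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 3, 3],
    [4, 5, 4],
    [5, 4, 5],
]

scores = scale_scores(ratings)
alpha = cronbach_alpha(ratings)
print("scale scores:", scores)
print("Cronbach's alpha:", round(alpha, 3))
```

When items move together across schools, the variance of the row totals is large relative to the summed item variances and alpha approaches 1, which is the sense in which a value such as .94 indicates high internal consistency.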
Control Variables
In any analysis of factors influencing public school performance, it is necessary to control for school, teacher, and student characteristics that are known to influence school performance and may be beyond the control of school leaders (Meier & O’Toole, 2002). Following the specification of a fairly standard educational production function (Hanushek, 1979), and given the available data, we have included 12 control variables that have been tested extensively in the existing literature (e.g., Caldas & Bankston, 1997; Clotfelter et al., 2007; Fowler & Walberg, 1991; Greenwald, Hedges, & Laine, 1996; Schwartz, Schmitt, & Lose, 2012), as shown in Table 1.
Despite the performance management skills of school leaders, schools with a higher percentage of students from low-income or disadvantaged backgrounds might be expected to have more difficulty attaining high performance (Caldas & Bankston, 1997). Therefore, we include the percentage of low-income students (squared, because this distribution was highly left skewed), and the percentage of students who are recent immigrants. We also include the racial-ethnic composition of the student body (represented by four variables), as well as the percentage of female students (even though schools do not vary as much in gender distribution). Finally, we include a measure of student stability, or the proportion of students who remain at a given school during the year, because it is a proxy for the stability of families and also because it is likely to constrain the ability of school leaders to affect student performance.
School structure and resources are also factors determining academic achievement, even though there are ongoing controversies concerning the magnitude of the influence of these factors (Hanushek, 1996, as cited in Meier & O’Toole, 2002, p. 637). Four measures of school structure and resources are included: total enrollment (logged, because the distribution is highly right skewed), the student–teacher ratio, the percentage of teachers having more than 2 years of teaching experience in the school, and the percentage of teachers holding a master’s degree or above. There are debates about how much school size may influence performance, but still it is important to account for this factor. We assume that the student–teacher ratio would be negatively related to student performance, whereas teacher experience and qualifications would be positively related to performance. Again, Table 1 presents the descriptive statistics for all of the control variables.
Analysis and Results
To examine the effect of performance management on school outcomes, we estimated ordinary least squares models for each of our four dependent variables. The models all include our main independent variable, which is the performance management index, as well as the 12 control variables described above. The results are presented in Table 2. To facilitate interpretation of effect sizes across variables and models, we report all of the regression coefficients in standardized (beta) form.
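As an illustration of what standardized (beta) coefficients mean, the following sketch fits ordinary least squares on z-scored variables using only the Python standard library. It is a simplified stand-in, not the estimation code behind Table 2: the helper names, the two toy predictors, and the outcome values are all hypothetical.

```python
from statistics import mean, stdev


def zscore(values):
    """Rescale a variable to mean 0 and standard deviation 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def standardized_betas(predictors, outcome):
    """OLS on z-scored variables via the normal equations.

    With every variable centered, the intercept is zero and is omitted.
    """
    zx = [zscore(col) for col in predictors]
    zy = zscore(outcome)
    xtx = [[sum(a * b for a, b in zip(ci, cj)) for cj in zx] for ci in zx]
    xty = [sum(a * b for a, b in zip(ci, zy)) for ci in zx]
    return solve(xtx, xty)


# Hypothetical data: the outcome is an exact linear mix of two predictors.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
y = [2 * a + 3 * b for a, b in zip(x1, x2)]

betas = standardized_betas([x1, x2], y)
print("standardized betas:", [round(b, 3) for b in betas])
```

Each beta equals the unstandardized slope multiplied by the ratio of the predictor’s standard deviation to the outcome’s, which is why coefficients reported in this form can be compared across predictors measured in different units.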
Table 2. Regression Results (Standardized Coefficients).
Note. ELA = English language arts.
***p < .01. **p < .05. *p < .1.
We begin with the performance models predicting the proportion of students meeting ELA and math test proficiency standards, which appear in the first two columns of Table 2. The models fit the data well, with an adjusted R² of .72 for the ELA model and .53 for the math model. More important, in both models the performance management index is significantly associated with better student achievement in both ELA and math, even holding the 12 control variables constant. Indeed, the coefficients suggest that a one standard deviation increase in the performance management index is associated with a 0.15 standard deviation increase in the proportion of students meeting ELA standards and a 0.22 standard deviation increase in the proportion of students meeting math standards. Multiplying by the standard deviations for the proportions meeting ELA and math standards from Table 1, these estimates imply a gain of about 2 to 3 percentage points in the proportion of students meeting standards with each standard deviation improvement in performance management.
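The conversion from standardized coefficients to percentage points is a single multiplication, sketched below. The betas (0.15 and 0.22) come from the text, but the outcome standard deviation used here is a placeholder assumption, not the Table 1 value, so the printed numbers illustrate the arithmetic rather than reproduce the reported estimates.

```python
def beta_to_outcome_units(beta, sd_outcome):
    """Change in the outcome, in its own units, per one-SD change in the predictor."""
    return beta * sd_outcome


# ASSUMPTION: placeholder standard deviation in percentage points.
# This is NOT the value reported in Table 1; substitute the real figure.
ASSUMED_SD_POINTS = 15.0

for label, beta in [("ELA", 0.15), ("math", 0.22)]:
    change = beta_to_outcome_units(beta, ASSUMED_SD_POINTS)
    print(f"{label}: {change:.2f} percentage points per SD of the index")
```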
The last two columns in Table 2 show the progress models, which predict the proportion of students in the lowest third of the distribution who made progress, or gains over last year, on their ELA and math scores. The overall model fit is not as good, with an adjusted R² of .17 for ELA and only .06 for math. But it is important to point out that predicting a 1-year change in a subgroup of students, as contrasted with the level of test scores for all students, is perhaps more difficult (because of regression to the mean and other random effects) and thus this lower overall explanatory power is perhaps to be expected. Still, the performance management index is significantly associated with both ELA progress and math progress of the lower third of students. The size of the effects, although not as large as in the models predicting the level of performance, is still not trivial. The coefficients suggest that a one standard deviation increase in the performance management index is associated with a little less than a tenth of a standard deviation change in progress, which translates (again using the standard deviations from Table 1) into just less than a 1 percentage point increase in the proportion of students demonstrating progress in their ELA and math scores.
Although not of primary interest, it is worth briefly noting the findings for the control variables. Nearly all of the control variables are strongly related, as expected, to the proportion of students meeting ELA and math test proficiency standards. The race-ethnicity variables are especially strong predictors, as are the variables for student stability, student–teacher ratio, and teacher experience and qualifications. These same control variables, however, are less consistently related to the progress made by students in the lowest third in ELA and math. This, of course, accounts for the lower explained variance (R²) in these models. But, as noted before, it may simply be more difficult to explain the proportion of students from the lowest third who made gains in their scores (because of regression to the mean and other random effects) than to explain the average level of proficiency of all students in a school.
Finally, we should note some diagnostics about our regression models. To begin with, the predictors are only moderately correlated with each other, with the strongest correlation between percentage of teachers who have a master’s degree and percentage of teachers who have more than 2 years of teaching experience (r = .62). Moreover, there was no sign of multicollinearity in the models, with a mean VIF of only 2.2 and the largest individual VIFs at 5.0 and 4.6 (for the percentage of Hispanic/Latino students and percentage of African American students, respectively). We also tested the models for heteroscedasticity and found none. Although a few schools had studentized residuals greater than 3 in absolute value (indicating an outlier), dropping these cases did not change the direction, significance level, or substantive interpretation of the main regression coefficients.
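For readers unfamiliar with the variance inflation factor (VIF), the sketch below computes it from the inverse of the predictor correlation matrix, whose j-th diagonal entry equals 1 / (1 - R²) from regressing predictor j on the others. This is one standard way to obtain VIFs and is offered as an illustration only; the function names and the toy predictors are hypothetical, and it is not the diagnostic routine used for the reported values.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vifs(predictors):
    """Return one VIF per predictor (diagonal of the inverse correlation matrix)."""
    corr = [[pearson(a, b) for b in predictors] for a in predictors]
    inverse = invert(corr)
    return [inverse[j][j] for j in range(len(predictors))]


# Hypothetical predictors (illustrative only); x1 and x2 correlate at r = 0.8.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]

values = vifs([x1, x2])
print("VIFs:", [round(v, 2) for v in values])
print("mean VIF:", round(mean(values), 2))
```

A VIF of 1 means a predictor is uncorrelated with the others, and larger values indicate that more of its variance is shared with them.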
In sum, our regression results are robust and clearly suggest a link between performance management as practiced by New York City public school leaders and both the level of student performance and the progress made by the lowest third of students, in both ELA and math, even controlling for other important factors that are widely documented in the research on educational productivity.
Discussion and Implications
Using data from New York City’s Department of Education, our study has provided a unique test of the association between performance management practices and performance outcomes for about 1,000 public schools. We found that, consistent with the theory and aims of the performance movement in public management, those schools that engage more fully and effectively in performance management practices do indeed perform better as measured by both the level and gain in standardized reading and math test scores. Moreover, the standardized effects of performance management on these outcomes are relatively large, indeed equal to or greater than the effect of teacher qualifications, which is a well-established focus of educational improvement efforts. As a result, our study provides both methodological advantages and substantive findings with important implications for the theory and practice of performance management.
As for the methodological advantages, we had consistent measures on a large sample of about 1,000 schools and were able to examine the association of performance management with both the level and gain in reading and math test scores, which are widely considered to be important measures of school performance. More important, we were able to employ a more objective index of performance management practices, based on teams of outside observers visiting schools as part of New York City’s Quality Review rather than relying on self-reporting by managers as is often done in other studies. As a result, the measure of performance management employed in our study reduces some of the potential problems plaguing studies using self-reported methods, such as social desirability bias and common source bias. Indeed, we would argue that New York City’s Quality Review provides an ideal data source to tap the basic elements of performance management: an emphasis on measurable goals, ongoing feedback, targeted intervention, and strategic decision making based on performance information.
The substantive findings from our study have important implications for the theoretical and practical debates around performance management reforms. The results provide strong and consistent evidence of a positive impact of performance management on the improvement of organizational outcomes. The relationship appears with respect to both the level and gains in test scores, for reading and math, and holds up even after controlling for other factors such as student, staffing, and school characteristics. Given the detailed inventory of performance management practices captured by the Quality Review, this evidence is consistent with the notion that performance measurement and management practices do matter. Thus, our findings contrast with the generally weak correlations between performance management and outcomes reported in prior studies of job training programs.
Although our study provides evidence for the effectiveness of performance management on outcomes, it still has limitations that should be acknowledged. First, caution needs to be taken in generalizing the results beyond the context of education policy and New York City’s public education system. It may be that reading and math scores are more directly modifiable, through the intervention of teachers and support services, than outcomes in job training or other policy areas. Clearly, efforts to study the link between performance management practices and outcomes should examine a variety of policy contexts. In addition, New York City’s school system has put substantial emphasis and resources into developing its performance measurement and management systems, with increasingly sophisticated tools available to principals and other school leaders for tracking individual student and teacher performance. New York is also the largest and one of the most diverse school systems in the nation. Thus, it is uncertain how much the relationship between performance management and outcomes would generalize to other public school systems.
Second, improving students’ school performance is a complicated process that involves various individual and contextual factors. Our analysis controlled for the potentially confounding effects of various student, staff, and school characteristics, for which we had available data. However, there may be unobservable variables that still could have influenced our results. For example, there could be institutional or community issues or crises at a school that affected test scores and also required the immediate attention of school leaders, distracting them from a focus on performance measures and management. Another possibility is that some schools may raise more private money, through their parent organizations, freeing up the resources and time available to school leaders to focus on performance. Strong parent organizations may also be associated with higher test scores. In addition, the qualifications and experience of school leaders, which unfortunately we did not have data to measure, may be associated with school performance and student learning independent of performance management practices. Taken together, all of these considerations suggest that performance management practices in our study may be partially endogenous with respect to test scores because both may be influenced by external factors that we were unable to measure and thus control for statistically. Contributions could be made by future researchers who have data on these potentially important omitted variables and can take them into account. Still, we believe our models do adjust for many important student, teacher, and school characteristics.
Third, the causal relationship between engagement in performance management practices and performance outcomes might be reversed. That is, schools that are performing better on standardized test scores may, because of these more encouraging outcomes, invest more time and effort in the process of reviewing and discussing performance measures. We attempted to address this issue by lagging the index of performance management practices, but still the possibility of reverse causation remains because both management practices and outcomes are likely to be serially correlated. If future research could better test the causal relationships across time, the direction of causality could be addressed and, to some extent, the impact of endogenous factors on school performance would be ruled out as well.
Finally, even though our findings indicate that student academic outcomes may be influenced by performance management practices, the concrete processes and mechanisms through which this happens at the school and classroom level need to be investigated. To this end, it would be very useful to conduct case studies and qualitative observations of performance management as practiced in schools. For example, future studies could focus on the implementation process and operations of public management systems in comparable schools with high and low test scores, with particular attention paid to their performance management practices. By doing this, researchers might identify other essential activities and processes that are not considered in our study, but are still critical to determining the successful transformation of performance information into desired outputs and outcomes of public organizations.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
