Abstract
New York Police Department (NYPD) stop and frisk policy has come under increasing scrutiny in recent years and has been analyzed exclusively in terms of its equity and effectiveness. This study adds a third approach of policy outcome analysis, technical efficiency, by employing a pooled data envelopment analysis of all Stop, Question, and Frisk data from all NYPD precincts from 2004 through 2010 (3,410,300 total stops resulting in 1,721,955 total frisks). The results reveal that the NYPD is input inefficient in many precincts (mean IOTA score = .40) but slightly more output efficient (mean IOTA score = .50). The most efficient precincts and boroughs are also identified to set performance benchmarks for frisks within the NYPD. According to the input-oriented results (the equity side), there should have been 1,091,846 fewer frisks given the outputs produced (arrests, guns, and contraband), and the output-oriented results (the effectiveness side) suggest the NYPD should have produced 179,056 more arrests, found 6,306 more pistols, and found 59,883 more instances of contraband to be technically efficient, given the level of frisks throughout the NYPD. Though a certain amount of inefficiency is enshrined in the frisk decision, these results are placed in the context of police actions and outcomes in the NYPD over this time period, and are used to inform both sides of the current debate. This research is unique to the police efficiency literature and sets the foundation for future research that fully models efficiency antecedents as well as the outcomes that result from inefficient frisks.
Introduction
The stop and frisk policies of the New York Police Department (NYPD) have come under increasing scrutiny throughout the past decade. As Jeremy Travis (in Jones-Brown, Gill, & Trone, 2010) summarizes: For the past several years, there has been a lively debate in New York City on the efficacy of the stop, question and frisk policies of the New York City Police Department. Strong claims are made on both sides of the debate. Proponents of these practices claim that they have made substantial contributions to the crime decline in New York City and have become an essential tool in the Police Department’s crime prevention toolkit. Critics of these practices claim that the stop, question and frisk policies have had an unwarranted disparate impact on communities of color and have undermined the legitimacy of the police and the justice system.
The effectiveness of the policy has been introduced into the debate less often and mainly from the police perspective to justify the local crime control approach. This framework views the stop and frisk as a tool of policing that has been effective in reducing crime in New York City and as such, it has been used aggressively by the NYPD toward that end (see Spitzer, 1999, p. 70). The aggressive policing methods in New York City are founded in Broken Windows theory combined with technology-based management and zero-tolerance patrol at the street level (Bass, 2001), which suggests that enhanced stop and frisk is a purposive course of action to reach a desired end (i.e., a policy) rather than an isolated social phenomenon. Maintaining order, a main focus of this approach, requires proactive police action, increased arrests for lower level offenders (Kelling & Bratton, 1998) and corresponds with an increased application of stops and frisks (Davis, Ortiz, Galinskiy, Ylesseva, & Briller, 2004; Geller & Fagan, 2010; Schneider, Chapman, & Schapiro, 2009; Spitzer, 1999; Wilson, 1994). Overall, there is less empirical evidence on the effectiveness of NYPD stop and frisk policy than on the equity of its application.
Missing altogether is the third focus of policy analysis that has the potential to objectively inform the other two sides of this debate: technical efficiency. While the equity perspective is mainly concerned with limiting inputs (frisks) and the effectiveness perspective mainly focuses on increasing outputs (e.g., finding of weapons and contraband), the efficiency perspective integrates both to determine the relationship between frisks employed and outputs produced. The present study represents the first-known empirical longitudinal efficiency analysis of NYPD “stop and frisk” policy and seeks to inform the broader debate through exploration of four specific research questions: How technically efficient are NYPD precincts in their frisking of suspects and what is the departmental trend over time? Given the level of outputs produced, how many fewer frisks should occur in New York City each year for the department and its precincts to be efficient? Given the amount of frisks performed each year, how many more gun seizures, contraband discoveries, and arrests should the NYPD be producing for the department and its precincts to be efficient? What are the most and least efficient precincts in the NYPD relative to stop and frisk policy?
To study these research questions, data envelopment analysis (DEA) is introduced and a review of the police efficiency literature utilizing DEA is undertaken to place the current study in context as well as show its unique value within the existing research. Next, the methodology is discussed within the framework of two U.S. Supreme Court decisions—Terry and Dickerson—which guide the selection of outputs. The DEA results are then presented, leading to a discussion of NYPD frisk efficiency outcomes while informing the effectiveness and equity debate and benchmarking the top performing precincts and boroughs. Furthermore, specific attention is paid to the value of this methodology and the results for NYPD management, policy makers and stakeholders, and limitations are clearly delineated as are needed areas of future research on this important topic.
Police Research Using DEA
DEA has been a tool of policy evaluation since its conceptual inception by Farrell (1957), who introduced the technique by calculating the relative productive efficiency in American agricultural systems. Since then, the technique has been modified into a programmatic format (most notably in an important paper by Charnes, Cooper, & Rhodes, 1978) and has been used to analyze a wide range of industries, including but not limited to banks, airlines, agricultural systems, military units, and universities. DEA is an established methodology to determine the relative efficiency of public organizations (Athanassopoulos & Curram, 1996; Camanho & Dyson, 1999; Drake & Simper, 2000; Ludwin & Guthrie, 1989; Nyhan, 2002; Nyhan & Martin, 1999a) and compares organizational entities, referred to as decision-making units (DMUs), based upon their comparable input utilization to produce outputs. DEA produces a single efficiency score (IOTA score) for each DMU to identify the most efficient relative entity in the sample (rated as 1.0) against which the inefficiency of the other DMUs can be evaluated (Camanho & Dyson, 1999). In addition, DEA produces “slack” results which detail the mix of additional output or reduced inputs needed for the inefficient DMUs to be as efficient as the top performing entity, making it an important management tool. An advantage of DEA is that it can be used on either cross-sectional or longitudinal data, the latter of which permits the analysis of a DMU over time in comparison to other DMUs as well as itself. Furthermore, this methodology can be focused on the minimization of inputs to achieve a given level of output (input-oriented specification) or on the maximization of outputs given a certain level of input (output-oriented specification). 
Finally, DEA can assume either constant returns to scale (CRS), in which output changes in equal proportion to input, or variable returns to scale (VRS), in which output per input utilized may decrease or increase with the scale of operation.
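For reference, the two orientations are commonly written in the DEA literature as the linear programs below. This is a textbook sketch, not necessarily the exact specification solved for this study. The notation is assumed here for illustration: x_j is the single input of DMU j, y_{rj} is its r-th output, \lambda_j are the intensity weights placed on peer DMUs, and o indexes the DMU being evaluated.

```latex
% Input-oriented model: shrink the input of DMU o as far as the frontier allows.
\begin{aligned}
\min_{\theta,\,\lambda}\quad & \theta \\
\text{s.t.}\quad & \sum_{j=1}^{n} \lambda_j x_j \le \theta\, x_o, \\
& \sum_{j=1}^{n} \lambda_j y_{rj} \ge y_{ro}, \qquad r = 1,\dots,s, \\
& \sum_{j=1}^{n} \lambda_j = 1 \qquad \text{(VRS only; dropped under CRS)}, \\
& \lambda_j \ge 0, \qquad j = 1,\dots,n.
\end{aligned}

% Output-oriented model: expand all outputs of DMU o proportionally at fixed input.
\begin{aligned}
\max_{\phi,\,\lambda}\quad & \phi \\
\text{s.t.}\quad & \sum_{j=1}^{n} \lambda_j x_j \le x_o, \\
& \sum_{j=1}^{n} \lambda_j y_{rj} \ge \phi\, y_{ro}, \qquad r = 1,\dots,s, \\
& \sum_{j=1}^{n} \lambda_j = 1 \qquad \text{(VRS only; dropped under CRS)}, \\
& \lambda_j \ge 0, \qquad j = 1,\dots,n.
\end{aligned}
```

In this notation the input-oriented score is \theta^{*}, and the output-oriented score is conventionally reported as 1/\phi^{*}, so that both lie between 0 and 1 and equal 1.0 for a DMU on the frontier.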
In American criminal justice system research, there is one published study that applied the methodology to the study of criminal courts in North Carolina (Lewin, Morey, & Cook, 1982) and two published studies in the field of corrections, one which analyzed the efficiency of Michigan prisons (Butler & Johnson, 1997) and another which focused on the efficiency of juvenile justice providers in Florida (Nyhan, 2002). DEA has been used more extensively in police research but has been applied far more frequently to policing entities in countries outside the United States, including the United Kingdom (Drake & Simper, 2000, 2001, 2002, 2003, 2004; Thanassoulis, 1995), Australia (Carrington, Puthucheary, Rose, & Yaisawarng, 1997), Portugal (Barros, 2007), Spain (Diez-Ticio & Mancebon, 2002; Garcia-Sanchez, 2007, 2009), Taiwan (Sun, 2002), and India (Verma & Gavirneni, 2006). Applications focusing on policing organizations and systems in the United States have appeared less frequently in the literature (Ferrandino, 2012; Goltz, 2006; Moore, Nolan, & Segal, 2005; Nyhan & Martin, 1999b).
Though DEA was originally used to study the efficiency differences in public program providers that provided the same program in different locations (Charnes, Cooper, & Rhodes, 1978), it has been applied most commonly in the policing literature in comparative analyses at the organizational level. Consistently, across nations, this research has found inefficient use of police resources within comparative organizational sets. In the studies of American policing, Nyhan and Martin (1999b) had a mean IOTA score of .69 (range .31 to 1.0) and .89 (range .54 to 1.0) in the initial model of their sample (CRS and VRS scores, respectively) while Goltz (2006) had a mean IOTA score of .74 (range .24 to 1.0). Ferrandino (2012) found municipal policing agencies to have higher mean IOTA scores than their university counterparts in Florida (.86 and .42, respectively) though Florida university departments had a higher mean IOTA score (.77) than their North Carolina counterparts (.68). Moore, Nolan, and Segal (2005) did not report a mean IOTA score for the police section of their study. International research also finds efficiency disparities relative to the best performing DMUs in all the studies conducted. These findings make clear two important aspects of DEA. One is its ability to identify the most efficient DMU for others to be compared against. The second is its clear statistical enumeration of the mix of excessive inputs or output shortages that other DMUs must reconcile in order to be as efficient as the top performing DMUs, which set the efficiency frontier. Both of these aspects of DEA are central to the research questions explored in the present study.
The present study differs from the previous police efficiency research in several notable ways. First, the unit of analysis in most prior studies is the policing agency or department. In studies of American police: Nyhan and Martin (1999b) analyzed 20 municipal departments from around the United States; Moore et al. (2005) compared 46 different departments from across the nation; Goltz (2006) comparatively examined 113 Florida municipal departments and sheriff agencies; and Ferrandino (2012) analyzed the technical efficiency of 10 Florida university police departments in comparison to their 9 local municipal counterparts as well as 9 university counterparts in the North Carolina system. All of these studies focused on police organizations rather than precincts within a single policing organization, as the present study does. While some of the international research has used precincts as the unit of comparative analysis, mainly due to the centralized policing structures in these countries (see Barros in Portugal, 2007 and Sun in Taiwan, 2002), no prior U.S. studies have examined the technical efficiency of precincts that comprise a single department, a central feature of the present study. Kane (2002, 2003, 2005, 2006) has extensively researched the NYPD with precincts as the unit of analysis, providing validity for the current approach despite his studies being more theoretical in nature and not dealing with efficiency outcomes or stop and frisk policy. Ridgeway (2007) further validates the use of this unit of analysis by noting that stop, question, and frisk training is conducted at the precinct level in the NYPD, suggesting variation in the delivery of this training.
The size and organizational structure of the NYPD—one of the largest police departments in the world structured as a central headquarters with precincts—lends itself well to a within department rather than between department analysis as it would be difficult to find a comparable organizational set or data collected on the same measures in the same manner.
The second unique feature of the present study that contrasts with prior police efficiency research is the focus on a specific policy (stop and frisk) rather than wider organizational structures, processes, or measures. Most prior police DEA studies are based on analyzing resource inputs such as department staffing levels (Carrington et al., 1997; Moore et al., 2005; Thanassoulis, 1995); total budgets and/or other organizational costs (Drake & Simper, 2000; Goltz, 2006) or some combination of these inputs (Barros, 2007; Nyhan and Martin, 1999b; Verma & Gavirneni, 2006). In terms of outputs, most have analyzed performance measures such as crime rates, response times, clearance rates, or index crimes reported; functional measures related to the police process such as traffic citations, arrests, distance patrolled, or calls for service; or some combination of these outputs (Barros, 2007; Carrington et al., 1997; Drake & Simper, 2000; Ferrandino, 2012; Goltz, 2006; Nyhan and Martin, 1999b; Moore et al., 2005; Thanassoulis, 1995; Verma & Gavirneni, 2006). No prior studies have employed DEA to analyze the efficiency of a specific policy within a policing organization despite efficiency being an important policy outcome to evaluate.
Third, the present study is longitudinal, permitting efficiency outcomes to be measured both between and within DMUs, providing a contrast with the cross-sectional approach in other police efficiency studies (Carrington et al., 1997; Diez-Ticio & Mancebon, 2002; Garcia-Sanchez, 2009; Goltz, 2006; Nyhan & Martin, 1999b; Thanassoulis, 1995; Verma & Gavirneni, 2006) or the more limited analysis of 3–5 years of data (Barros, 2007; Drake & Simper, 2000; Ferrandino, 2012; Sun, 2002), which is less than the 7 years of pooled data analyzed here. This approach also reveals efficiency trends over time at the precinct, borough, and departmental level to more specifically inform frisk policy relative to its application and outcomes. Furthermore, there is a specific analytical framework established (the current equity/effectiveness debate and the legal standards established by Terry and Dickerson) that enhances the validity of this methodological application (Nyhan & Martin, 1999a). Thus, the present study is unique to the police efficiency literature in its analysis of organizational units within a single department; its focus on a specific policy that is widely debated on equity and effectiveness grounds; and its longitudinal approach that permits assessments of change within and between precincts (and boroughs) over time to more specifically inform a controversial policy within the nation’s largest policing organization.
Data, Sample, and Variables
The data used in this study derive from the NYPD Stop, Question, and Frisk annual databases, which are comprised of individual stops in which an officer completes a UF-250 form (NYPD, n.d.). To create precinct- and borough-based data, descriptive statistics of all variables were run and entered by precinct for each year. Then, precinct level data were aggregated for borough- and departmental-level statistics. The present study pools the data for all 76 precincts for the 7-year period covering 2004–2010 (76 × 7 = 532 DMUs). With four total variables (one input and three outputs), this sample size greatly exceeds the minimum requirements of DEA (Nyhan & Martin, 1999a).
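The pooling step described above can be sketched as follows. This is a minimal illustration, not the procedure actually used to build the data set: the record layout, field order, and sample values are hypothetical and do not reproduce the UF-250 form, and the sketch assumes that outputs are counted only for stops in which a frisk occurred.

```python
from collections import defaultdict

# Hypothetical stop-level records (illustrative only, not the UF-250 layout):
# (year, precinct, frisked, arrested, pistol_found, contraband_found)
stops = [
    (2004, 1, True, False, False, False),
    (2004, 1, True, True, True, False),
    (2004, 1, False, False, False, False),
    (2005, 1, True, False, False, True),
    (2004, 5, True, True, False, False),
]


def pool_dmus(records):
    """Aggregate stop records into one DMU per (precinct, year)."""
    dmus = defaultdict(
        lambda: {"stops": 0, "frisks": 0, "arrests": 0, "pistols": 0, "contraband": 0}
    )
    for year, precinct, frisked, arrested, pistol, contraband in records:
        row = dmus[(precinct, year)]
        row["stops"] += 1
        if frisked:  # assumption: outputs are tallied for frisked stops only
            row["frisks"] += 1
            row["arrests"] += int(arrested)
            row["pistols"] += int(pistol)
            row["contraband"] += int(contraband)
    return dict(dmus)


dmus = pool_dmus(stops)
for key in sorted(dmus):
    print(key, dmus[key])

# With 76 precincts observed over 7 years, the pooled sample has 76 * 7 DMUs.
n_dmus = 76 * 7
print(n_dmus)
```

Keying each DMU by the (precinct, year) pair is what allows a precinct to be compared both with other precincts and with itself in other years.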
The input measure for the present study is total frisks. This measure has the legal precedent founded in Terry as being an established, legitimate police action. Furthermore, it has been argued by the equity research that this input should be much more limited by the NYPD while those that believe it is an essential crime control component suggest that this input should remain at current levels or be increased. The Terry standard—that a frisk is justified if undertaken to find weapons for the sake of officer or public safety during a reasonable stop of a citizen—holds throughout all police departments and their respective officers. This suggests that, theoretically, it is a comparable input across all NYPD precincts and its application, if consistent with legal standards, should lead to similar outputs produced.
There are three output measures employed: pistols recovered, contraband found, and arrests. Terry (1968) frisks relate directly to the suspicion that the stopped citizen may be armed, meaning pistol recovery is an expected outcome of a frisk while Minnesota v. Dickerson (1993) extends this to include contraband recovery if discovered through the “plain feel doctrine” during a valid application of Terry. The finding of a gun and/or contraband relates directly to the legal precedent established in both Terry and Dickerson, respectively, and both variables were utilized in the statistical analysis by Ridgeway (2007). More broadly, the arrest is an output measure that theoretically flows from the officer’s belief that a frisk, not just a stop, is justified by the reasonable suspicion of criminal activity on the part of the suspect. Thus, regardless of whether a gun or contraband is found, many suspects may have warrants, may physically resist officers undertaking the frisk, or may be committing another offense that is uncovered, suggesting that an arrest is an output that should be related to the frisking of a suspect based on reasonable suspicion of criminal activity. Any lack of arrest, gun, or contraband discovery during a frisk means the frisk has been employed inefficiently, separate and apart from being legal, effective, or equitable.
The present DEA assumes VRS based on the nature of frisks and the varying local policies toward the frisk between and within precincts over time. Furthermore, precincts vary in size, environment, and other factors while officers may vary in their motivations for the frisk, making it unwise to assume CRS for frisks conducted. Both input and output orientations are used and reported: the input orientation informs the equity perspective (how many fewer frisks should be conducted to achieve given output levels) while the output orientation informs the effectiveness perspective (how much more output is needed to justify the given level of frisks). DEAP software was utilized to conduct the analysis.
Thus, the analysis that follows objectively determines (a) the departmental trend in frisk efficiency over time; (b) how many fewer frisks should be conducted by the NYPD, given the number of outputs produced, based on the realization that an unsuccessful frisk has ramifications on the individual and mostly minority communities (informing the equity perspective); (c) how many more outputs need to be produced to justify the high levels of frisks undertaken by police in the NYPD (informing the effectiveness perspective); and (d) what are the best and worst performing precincts over the time period studied which provide performance benchmarks in future NYPD efficiency analyses.
Results
Descriptive Statistics
From 2004 to 2010, the NYPD recorded 3,410,300 stops of citizens with 1,721,955 resulting in frisks (50.4% of stops: 41.5% in 2004 and 56.2% in 2010). Of those that were frisked, 162,237 were arrested (9.4% of frisks). Police found 4,383 pistols (a rate of 2.55 per 1,000 frisks) and reported 50,468 contraband findings (a rate of 29.3 per 1,000 frisks). These full results mask the wide variability between years (see Figure 1 for visual comparison of total stops, total frisks, and excess frisks over time; excess frisks are detailed in the following section). For example, total stops increased 92% from 2004 to 2010 while total frisks increased by 161% and arrests increased 155%, suggesting that the police are both more proactive in their stopping of suspects and more aggressive in frisking those they stop. However, the results of this enhanced proactivity and aggression are questionable. The finding of pistols, the main reason for frisking a stopped suspect, increased just 9.3% over this time period (from 610 to 667, respectively), with the rate dropping from 4.7 pistols per 1,000 frisks in 2004 to 1.98 per 1,000 frisks in 2010. The finding of contraband increased 101% between 2004 and 2010, but the rate decreased from 38.6 per 1,000 frisks to 29.7 per 1,000 frisks. Arrest rates during frisks declined from 106.6 per 1,000 in 2004 to 102.5 per 1,000 in 2010. These results suggest a diminishing return of output per input utilized over time.
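The rates quoted above follow directly from the reported totals. The short sketch below reproduces a few of them; the helper names are illustrative, and only figures stated in the text are used.

```python
def per_thousand(count, frisks):
    """Rate of an outcome per 1,000 frisks."""
    return 1000 * count / frisks


def pct_change(start, end):
    """Percentage change from start to end."""
    return 100 * (end - start) / start


TOTAL_FRISKS = 1_721_955  # frisks, 2004-2010, as reported in the text

arrest_pct = 100 * 162_237 / TOTAL_FRISKS               # arrests as a share of frisks
pistol_rate = per_thousand(4_383, TOTAL_FRISKS)         # pistols per 1,000 frisks
contraband_rate = per_thousand(50_468, TOTAL_FRISKS)    # contraband per 1,000 frisks
pistol_growth = pct_change(610, 667)                    # pistols found, 2004 vs. 2010

print(round(arrest_pct, 1), round(pistol_rate, 2),
      round(contraband_rate, 1), round(pistol_growth, 1))
```

Expressing outputs per 1,000 frisks is what makes the diminishing return visible: the raw counts rise over the period while the rates fall.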

Figure 1. Total stops, total frisks, and excess frisks, New York Police Department (NYPD) by year, 2004–2010.
DEA
Input oriented
In DEA, efficiency scores are called IOTA scores, with a score of 1.0 representing a completely efficient DMU (precinct for each individual year). Scores less than 1.0 show the level of inefficiency of that DMU compared to the best performer (e.g., an IOTA score of .90 means that precinct is 10% less efficient than the best performing DMU). As shown in Figure 2, the mean department-level input-oriented IOTA score over the full study period was .40 (.47, .42, .35, .38, .37, .38, and .43, respectively) with 2004 being the most efficient year (.47) and 2006 the least efficient (.35). Of the 532 DMUs, 15 (2.8%) were deemed technically efficient (1.0; see Table 1, Input-oriented columns, for full results). As shown in Table 3 (Input-oriented columns), Manhattan and Staten Island were the most efficient boroughs (.49 mean IOTA score each), followed by Queens (.41), Bronx (.35), and Brooklyn (.31). Between 2004 and 2010, the efficiency score for Manhattan increased by .078, the Bronx increased by .015, and the other three boroughs all saw decreases in their technical efficiency to varying degrees (Brooklyn −.05, Queens −.18, and Staten Island −.35).

Figure 2. Line graph of input and output IOTA scores, 2004–2010.
Table 1. Efficiency Scores, by Precinct, by Year.
Efficiency scores at the precinct level also varied greatly within and between precincts over time (see Table 1, Input-Oriented Columns). Directly comparing the first year (2004) IOTA score with the last year (2010), 35 precincts had lower efficiency scores (ranging from −.10 to −.69), while 10 showed no change (−.01 to .01) and 31 showed an increase in efficiency score (ranging from .04 to .70). This comparative approach masks some variations within precincts throughout the other years. One glaring example is the 44th Precinct in the Bronx, which showed no overall change between 2004 and 2010. This precinct was technically efficient (1.0) in both 2004 and 2010 but had a mean IOTA score of .42 due to severe inefficiencies in the other years (0.34, 0.14, 0.16, 0.15, and 0.16, single year scores from 2005 to 2009, respectively). Across the 532 DMUs (precinct-years), the IOTA scores ranged from .07 (the 67th in 2005 and the 71st in 2006) to 1.0 for the 15 DMUs that were technically efficient. Eighty-seven DMUs (16.4%) had IOTA scores less than .20; 223 (41.9%) had IOTA scores ranging from .20 to .40; 138 (25.9%) had scores ranging from .40 to .60; 49 (9.2%) had scores ranging from .60 to .80; and the remaining 35 (6.6%) had scores ranging from .80 to 1.0. The top performing precincts over the full study span are all from Manhattan: the 25th (.80), the 7th (.71), the 9th (.70), the 20th (.69), and the 10th (.64). The lowest performing precincts are the 104th in Queens (.16), the 73rd in Brooklyn (.17), the 115th in Queens (.18), and the 78th in Brooklyn (.19).
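As a quick check on the figures in this paragraph, the sketch below recomputes the 44th Precinct mean and the share of the 532 pooled DMUs in each score band. Only values reported in the text are used; the band labels are shorthand introduced here.

```python
from statistics import mean

# Input-oriented scores reported for the 44th Precinct, 2004 through 2010.
scores_44th = [1.0, 0.34, 0.14, 0.16, 0.15, 0.16, 1.0]
mean_44th = mean(scores_44th)

# Counts of pooled observations per score band, as reported in the text.
band_counts = {
    "<.20": 87,
    ".20-.40": 223,
    ".40-.60": 138,
    ".60-.80": 49,
    ".80-1.0": 35,
}
total_dmus = sum(band_counts.values())
band_shares = {band: round(100 * n / total_dmus, 1) for band, n in band_counts.items()}

print(round(mean_44th, 2), total_dmus, band_shares)
```

The band counts sum to 532 rather than 76, which is why they are best read as precinct-year observations and not as distinct precincts.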
The slack results are an important aspect of this research to examine in depth at the departmental, borough, and precinct levels (see Table 2, Input Oriented Column labeled Total Excess Frisks). The input slack denotes how many excess inputs are utilized to achieve a given level of outputs. In essence, these slack statistics show how many fewer frisks inefficient DMUs would require to attain the same level of arrests, pistol recoveries, and contraband findings and thus be as efficient as the top performing DMUs. In 2004, there were 73,108 excess frisks in the NYPD for the output attained, suggesting that the 129,727 frisks performed would need to be reduced by 56% to achieve technical efficiency (refer to Figure 1 and the bottom rows of Table 2 for the excess frisks trend over time). This trend continued in earnest through 2009. In 2005, there were 105,956 excess frisks (63.3% of total frisks); in 2006, there were 147,638 excess frisks (68% of total frisks); in 2007, there were 156,687 excess frisks (64% of total frisks); in 2008, there were 191,641 excess frisks (65% of total frisks); and in 2009, there was a high of 218,929 excess frisks (66% of total frisks). In 2010, the number of excess frisks declined from this high to 197,833 (58% of total frisks), despite this year having the most total frisks of the 7-year period. From 2004 to 2010, there were 1,091,846 excess frisks performed, given the output levels reached for the entire NYPD (see Figure 1 for excess frisk totals for the department each year).
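The link between an input-oriented score and excess frisks can be illustrated with a small sketch. It assumes that, with a single input, the excess is the radial reduction (1 − score) × frisks; whether the software reports slack in exactly this way is not confirmed by the text. The three DMUs below are hypothetical and are not drawn from the study.

```python
def excess_frisks(frisks, score):
    """Frisks above the level implied by an input-oriented score (radial reduction)."""
    return frisks - score * frisks


# Hypothetical DMUs: (label, frisks, input-oriented score). Not study values.
dmus = [("A", 1000, 1.0), ("B", 4000, 0.25), ("C", 2000, 0.5)]

total_frisks = sum(frisks for _, frisks, _ in dmus)
total_excess = sum(excess_frisks(frisks, score) for _, frisks, score in dmus)
excess_share = 100 * total_excess / total_frisks
mean_score = sum(score for _, _, score in dmus) / len(dmus)

# Reported 2004 department totals: 73,108 excess frisks out of 129,727 frisks.
share_2004 = 100 * 73_108 / 129_727

print(total_excess, round(excess_share, 1), round(mean_score, 2), round(share_2004, 1))
```

In the hypothetical set, 57.1% of frisks are excess even though the unweighted mean score is about .58. Aggregate excess is weighted by frisk volume, so it need not equal one minus the mean score; this may help explain why the 2004 excess share (about 56%) differs from one minus that year's mean score of .47.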
Table 2. Mean Efficiency Scores, Actual Measures, and Slack Statistics, by Precinct.
Note. DEA = data envelopment analysis.
The borough-level slack results (see Table 3, Input-Oriented Columns) reveal that Brooklyn precincts had the most total excess frisks (415,560 or 38% of excess; 30.6% of NYC population); Queens was second (259,801 or 24% of excess; 27.2% of the population); the Bronx third (218,930 or 20% of excess; 16.9% of population); Manhattan fourth (168,186 or 15% of excess; 19.4% of population); and last was Staten Island (31,719 or 3% of excess; 5.7% of population). Between 2004 and 2010, total frisks in Brooklyn increased by 97% while excessive frisks increased 113%; in Queens total frisks increased 216% while excess frisks increased 335%; in the Bronx total frisks increased 237% while excess frisks increased 218%; in Manhattan total frisks increased 148% while excess frisks increased 77.8% and in Staten Island total frisks increased 280% while excess frisks increased by 476%. Only in Manhattan and the Bronx did excess frisks increase at a lesser rate than total frisks, the main reason these were the only two boroughs that saw efficiency increase between 2004 and 2010.
Table 3. Efficiency Scores and Slack Statistics, by Borough by Year.
Note. BX = Bronx; BY = Brooklyn; MH = Manhattan; QU = Queens; SI = Staten Island.
At the precinct level over the full study period, total excess frisks ranged from 605 in the 22nd precinct to 63,796 in the 75th precinct (see Table 2, Input-oriented columns). A boxplot of excess frisks by year reveals the 75th precinct to be an outlier in the distribution in all seven study years. This could be partially explained by the fact that in 2011, the most reported index crimes occurred in this precinct (3,407) despite index crimes decreasing by 24% from 2001 to 2011. That said, total stops in this precinct remained flat from 2004 to 2010 while frisks increased by 96%, meaning that frisking increased even though stops overall did not. In 2004, 30% of stops resulted in frisks; by 2010 this share had risen to 59% despite crime dropping. This precinct was inefficient due to this large increase in input (frisks) without corresponding increases in outputs (pistols, contraband, and arrests). The 73rd precinct, second with 57,263 total excess frisks, was also an outlier in all 7 years; these were the only two precincts to share that distinction, though other precincts were outliers in the first 4 years (there were no other outliers in the final 3 years). In all, there were 16 precincts (21%) with fewer than 5,000 total excess frisks; 16 precincts (21%) with between 5,000 and 10,000 total excess frisks; 27 precincts (35.5%) with between 10,000 and 20,000 excess frisks; and the remaining 17 precincts had at least 20,000 total excess frisks over the study period.
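Boxplot outliers are conventionally flagged with the 1.5 × IQR rule, which can be sketched as below. The text does not state which rule or software produced the boxplots, so this is an assumption, and the excess-frisk values are hypothetical rather than taken from Table 2.

```python
from statistics import quantiles


def iqr_outliers(values):
    """Return values beyond 1.5 * IQR from the quartiles (common boxplot rule)."""
    q1, _, q3 = quantiles(values, n=4)  # quartiles, default 'exclusive' method
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical excess-frisk counts for eight precincts in one year.
excess_by_precinct = [600, 2000, 3000, 4000, 5000, 6000, 7000, 30000]
outliers = iqr_outliers(excess_by_precinct)
print(outliers)
```

Applying the rule separately to each year's distribution, as the text describes, allows a precinct to be an outlier in some years and not in others.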
While crime rates, density, land use, and other environmental factors assuredly vary between precincts, some precincts are clearly utilizing the frisk in far greater numbers than other precincts to achieve the same relative levels of output. Thus, it is crucial to explore the output-oriented technical efficiency of the department-, borough-, and precinct-level units to determine output production shortages, given the level of input utilized to produce arrests, pistols, and contraband.
Output oriented
As shown in Figure 2, the mean output-oriented IOTA score over the full study period was .50 (.55, .50, .44, .47, .48, .50, and .56, respectively). The most efficient year was 2010 (.56) with 2006 being the least efficient year (.44). As shown in Table 3, Staten Island was the most output-efficient borough over the study period (.61), followed by Manhattan (.56), Queens (.51), the Bronx (.47), and Brooklyn (.43). Between 2004 and 2010, Manhattan’s IOTA score increased by .09 and the Bronx’s by .08, Brooklyn experienced no change, and efficiency scores decreased in Queens by .09 and in Staten Island by .16. Efficiency clearly varies within and between boroughs over time.
Efficiency scores at the precinct level also varied greatly within and between precincts over time when focused on output production (see Table 1, Output-Oriented Columns). Directly comparing the first year (2004) IOTA score with the last year (2010), 26 precincts had lower efficiency scores (ranging from −.11 to −.81), while 12 showed little change (−.01 to .01) and the remaining 38 had higher efficiency scores (ranging from .03 to .70). Across the 532 DMUs (precinct-years), efficiency scores ranged from .10 to 1.0. Eighteen DMUs (3.4%) had IOTA scores less than .20; 170 (32%) had IOTA scores between .20 and .39; 198 (37.2%) had IOTA scores between .40 and .59; 86 (16.2%) had IOTA scores between .60 and .79; and the remaining 60 (11.3%) had IOTA scores over .80, including 15 (2.8%) that were technically efficient (1.0). As with the input-oriented analysis, the top five performing precincts are all from Manhattan: the 25th (.87), the 7th and 9th (.78), the 32nd (.77), and the 20th (.75). The five least efficient precincts in this analysis are the 63rd in Brooklyn (.26); the 78th in Brooklyn (.27); the 104th in Queens; the 71st in Brooklyn (.28); and the 49th in the Bronx and the 70th in Brooklyn (.30).
The slack results for the output-orientation analyze how much production shortage exists in order to maximize outputs given the levels of input utilized (see Table 2, Output-Oriented Columns, Output Shortages). In essence, given the number of frisks employed, this analysis determines how many more arrests, guns, and contraband should be produced to be as technically efficient as the top performing precincts which set the efficiency frontier. At the department level from 2004 to 2010, for the amount of frisks undertaken, the NYPD would have had to make 179,056 more arrests, find 6,306 more pistols and find 59,883 more instances of contraband to be technically efficient (see Figure 3 for visualization of arrests, arrest shortages, contraband found, and contraband shortages over the study span). These totals would represent a 110% increase in arrests, a 144% increase in finding pistols, and a 119% increase in contraband recoveries to justify current frisk levels. Between 2004 and 2010, arrest shortages increased by 68.5%; pistol shortages increased by 81% and contraband shortages increased 86.8%. Despite these increases, arrest and pistol shortages fell in 2010 to pre-2006 levels while contraband shortages were lower than 2008, suggesting some recent improvement in producing these outputs.
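The percentage increases quoted above are the reported shortages divided by the outputs actually produced. A brief sketch using only the totals stated in the text (the dictionary names are illustrative):

```python
# Outputs actually produced, 2004-2010, as reported in the text.
actual = {"arrests": 162_237, "pistols": 4_383, "contraband": 50_468}

# Department-level output shortages from the output-oriented analysis.
shortage = {"arrests": 179_056, "pistols": 6_306, "contraband": 59_883}

# Required increase, as a percentage of what was actually produced.
required_increase = {k: round(100 * shortage[k] / actual[k]) for k in actual}

# Output levels the department would have needed at the same number of frisks.
efficient_level = {k: actual[k] + shortage[k] for k in actual}

print(required_increase)
print(efficient_level)
```

Read this way, technical efficiency at the observed frisk volume would have required roughly doubling arrests and contraband recoveries and more than doubling pistol recoveries.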

Figure 3. Total arrests, total contraband found, and total shortages, 2004–2010.
At the borough level, each borough's share of the total shortage was fairly similar across all three output measures from 2004 to 2010 (see Table 3, Output-Oriented Columns, Output Shortages). In terms of arrest shortages, Brooklyn needs 59,162 more arrests (33% of total slack), Queens 44,802 (25% of total), the Bronx 35,792 (20% of total), Manhattan 34,774 (19% of total), and Staten Island 4,522 (3% of total). These results were similar for pistol recovery shortages: Brooklyn needs 2,275 more pistols recovered (36% of total), Queens 1,608 (25% of total), the Bronx (19% of total), Manhattan 1,157 (18% of total), and Staten Island 94 (1% of total). Finally, in regard to contraband recovery shortages, Brooklyn needs 20,474 more contraband recoveries (34% of total), Queens 14,823 (25% of total), the Bronx 11,798 (20% of total), Manhattan 11,175 (19% of total), and Staten Island 1,612 (3% of total). All of these total shortages assume the same level of frisk inputs that occurred from 2004 to 2010. At the precinct level over the full study period, arrest shortages ranged from 443 in the 20th precinct to 4,941 in the 115th precinct. Pistol shortages ranged from 10 in the 20th precinct to 215 in the 73rd precinct. Contraband shortages ranged from 163 in the 20th precinct to 1,806 in the 73rd precinct.
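The borough shares can be recomputed from the reported counts and the department-level totals. The sketch below does that and also reports how much of each department total is not accounted for by the listed boroughs. The Bronx pistol count is not stated in the passage, so it is left as `None`. The residual for pistols is therefore only an implied value, subject to the same small rounding differences visible in the other two outputs, and should not be read as a reported figure.

```python
# Borough-level output shortages, 2004-2010, as reported in the text.
# None marks the Bronx pistol count, which the passage does not state.
shortages = {
    "arrests": {"Brooklyn": 59162, "Queens": 44802, "Bronx": 35792,
                "Manhattan": 34774, "Staten Island": 4522},
    "pistols": {"Brooklyn": 2275, "Queens": 1608, "Bronx": None,
                "Manhattan": 1157, "Staten Island": 94},
    "contraband": {"Brooklyn": 20474, "Queens": 14823, "Bronx": 11798,
                   "Manhattan": 11175, "Staten Island": 1612},
}
department_totals = {"arrests": 179056, "pistols": 6306, "contraband": 59883}


def borough_shares(output):
    """Percentage of the department-level shortage attributed to each borough."""
    total = department_totals[output]
    return {b: round(100 * n / total)
            for b, n in shortages[output].items() if n is not None}


def unallocated(output):
    """Department total minus the sum of the borough counts that are stated."""
    known = sum(n for n in shortages[output].values() if n is not None)
    return department_totals[output] - known


for output in shortages:
    print(output, borough_shares(output), "unallocated:", unallocated(output))
```

The arrest and contraband residuals are a handful of units, which suggests rounding in the borough figures. The pistol residual is much larger because it absorbs the unstated Bronx count, and as a share of the department total it is consistent with the 19% reported for the Bronx.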
Limitations
There are several limitations in the present study that need to be noted and briefly discussed. First, the analysis only includes data reported by police. It is safe to assume that police would not omit successful frisk outcomes (arrests, gun seizures, and contraband seizures) from the data set, but they have been accused of underreporting frisks in the past (see Spitzer, 1999), creating a dark figure of unrecorded frisks. This limitation suggests that efficiency scores, if biased by missing data, could be lower than reported here but would most likely not be higher. Therefore, there is a chance that scores are inflated in general. If reporting became more accurate over the 7-year period, the efficiency scores may not follow the U-shape produced in this study, as 2004 may be inflated more than the subsequent years. The changes over time need to be understood with this limitation in mind.
Two notes of caution are needed when interpreting the increases in both stops and frisks. First, there have been media reports in the New York City press of lawsuits brought by former police officers who claim they were given unofficial quotas for stops and frisks, though these lawsuits have yet to be concluded (Parascandola, 2011). Thus, if police officers in the NYPD view stopping and frisking suspects, and filling out the subsequent UF-250 form, as an expected and important measure of officer productivity (Ridgeway, 2007), these statistics could reflect that expectation, especially given the large increase in stops and frisks. Second, the data could reflect not greater police activity in terms of stops and frisks but greater reporting of the incidents each year as the policy becomes more scrutinized and/or institutionalized. Both reasons reflect organizational policies that could explain the increase in inputs, making the inclusion of outputs important to analyze in this context.
Another limitation is that while this analysis informs the equity perspective in terms of total frisks conducted, it does not include further equity measures such as race/ethnicity or age. Additionally, while it informs the effectiveness perspective in terms of frisk outcome levels, it does not make any statement about the quality of successful frisks relative to case dismissals, convictions, or officer reprimands for misconduct. Therefore, as NYPD stop and frisk policy is examined in more depth using various methodologies and approaches, more comprehensive measures of quality frisks are needed. Future research also needs to combine efficiency outcomes with other precinct-level organizational, social, and environmental variables to explain the variations in efficiency more fully than the current study sought to. Related to this limitation is the fact that specialized units within divisions (such as plain clothes and narcotics units) were undoubtedly represented in the samples but were incorporated into the precinct in which the frisk occurred. Future research can analyze the efficiency outcomes between and within these units for greater specificity.
A final limitation is the fact that efficiency may not be a goal of police officers or policy makers in the NYPD, especially if the social goal is deterrence and the organizational goal is productivity. As such, the NYPD may not seek frisk efficiency but rather greater inputs regardless of outputs, or greater total output levels regardless of input. Others who seek equity are more concerned with limiting inputs, especially in light of the low level of outputs produced, which should concern those focused on effectiveness as well. Future research on both equity and effectiveness of NYPD stop and frisk policy should include the foundations of efficiency analysis as researched here (both inputs and outputs) to more holistically examine their complex relationship. That said, if efficiency is not a concern of either side, these results risk remaining merely academic, when they can and should be considered by policy makers as well as by those who criticize or support NYPD policy.
Discussion and Policy Implications
The frisking of suspects by sworn police officers is a subjective action by its legal and situational nature, but it is not intended to be arbitrarily or capriciously undertaken. As a subjective action, there will always be frisks that do not result in further police action, meaning a certain level of inefficiency will always be present in this tool of policing. However, as the NYPD comes under increasing scrutiny for its stop and frisk policies, the results of the present study are meant to objectively inform assessments of its performance at the precinct, borough, and departmental levels by benchmarking top performers and identifying DMUs that are not efficient in their application of frisks. Policy makers could also apply this technique in varying time frames (e.g., monthly or annually) at the individual officer level to more quickly identify officers who are inequitably frisking citizens (similar to the conceptual approach of Ridgeway, 2007). By not analyzing frisks relative to their outcomes, department policy makers and administrators are risking more oversight through legislative, executive, or judicial action, which would be detrimental if the policy has been as effective in reducing crime as claimed. Furthermore, they also risk more lawsuits and bad publicity on the equity side of the equation if they focus on inputs alone, potentially affecting their ability to utilize this tool in the future. By combining the two perspectives as done here, the NYPD has the potential for more effective and equitable application of frisks, mitigating these externally mounting risks to the internal management process.
Ridgeway (2007), using data from 2006, speculated that the NYPD appeared to have an excess of 170,000–250,000 stops of the 506,429 reported stops conducted that year (33.5–49.4% of total). While police may have a multitude of reasons for stopping and questioning a citizen, the frisk is a much more intrusive interaction bound by legal standards (see Terry, 1968). As such, the present study focused on the frisk and its outcomes rather than the stop itself. The input-oriented efficiency analysis indicates that, by the standard of technical efficiency, the NYPD would have had to reduce total frisks from 1,721,955 to 630,109 (a 63.4% reduction) relative to the actual level of arrests, pistols, and contraband produced by this action from 2004 through 2010. This would equate to roughly 90,000 frisks per year on average rather than the current average of 245,993 per year. Thus, there exists some evidence that the NYPD may overstop (Eterno & Silverman, 2012; Ridgeway, 2007) and overfrisk as a result of local crime control policy, especially as the trend holds over a 7-year period within and between boroughs and precincts. Policy makers should focus more on the input side of frisk efficiency to improve overall performance, a move that may assuage those attacking the local policy on equity grounds but could also improve efficiency by raising the proportion of frisks that produce intended outputs. Though still inefficient overall, 31 precincts and 1 borough improved their input efficiency between 2004 and 2010 while 38 precincts and 2 boroughs improved their output efficiency between the 2 years, a positive finding of the analysis. These precincts and boroughs, along with the top performing DMUs, provide a performance benchmark for other DMUs to analyze if they seek to improve on this measure throughout the organization.
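The input-oriented figures in this paragraph follow from simple arithmetic on the reported totals, sketched below. All numbers are taken from the text, and the variable names are illustrative only.

```python
# Reported totals for 2004 through 2010.
total_frisks = 1_721_955      # frisks actually conducted
efficient_frisks = 630_109    # frisks consistent with technical efficiency
years = 7

# Frisks in excess of the technically efficient level.
excess_frisks = total_frisks - efficient_frisks

# Required reduction as a percentage of actual frisks.
reduction_pct = round(100 * excess_frisks / total_frisks, 1)

# Average annual frisks, actual versus technically efficient.
actual_per_year = total_frisks / years
efficient_per_year = efficient_frisks / years

print(f"excess frisks: {excess_frisks:,}")
print(f"required reduction: {reduction_pct}%")
print(f"actual per year: {actual_per_year:,.1f}")
print(f"efficient per year: {efficient_per_year:,.1f}")
```

The excess computed here matches the 1,091,846 fewer frisks cited in the abstract, and the annual averages correspond to the roughly 90,000 and 245,993 figures in the paragraph.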
The input-oriented results would be welcomed by those who seek more equity in NYPD stop and frisk policy (fewer frisks), but would run counter to the local NYPD policy that seeks to deter crime through increasing frisks. The output-oriented analysis takes the levels of frisks conducted as given, consistent with this perspective, and indicates how many more outputs would need to be produced at these input levels to achieve maximum efficiency. Department wide over the full period of study, there would need to be 179,056 more arrests, 6,306 more pistol recoveries, and 59,883 more contraband recoveries. These numbers seem excessive until they are placed in the context of the full annual activity statistics within the NYPD from 2004 to 2008. Using available data from the Office of Juvenile Justice and Delinquency Prevention (OJJDP, n.d.), the NYPD made 2,321,002 arrests over this 5-year time period, with 97,798 deriving from stop and frisk interactions (4.2%) as delineated in the current SQF data for the same time period. Adding the shortage of 121,992 arrests would equate to 219,790 total arrests that would have been needed from 2004 to 2008 to justify the frisk levels utilized. If these additional arrests were realized, the total arrests originating from stop, question, and frisks would have been 9% of the revised NYPD arrest total, up from the realized 4.2%, but still a fraction of total NYPD arrests. Had these outputs been reached, the level of frisks utilized would have been justified from the technical efficiency perspective and would perhaps have somewhat blunted the claims of inequity in frisk applications.
Carrying this approach through to pistol recoveries, there were 4,278 more pistols that would need to have been recovered during frisks from 2004 to 2008 to justify the frisk levels utilized. Over this time period, the 3,052 pistols recovered represented just 6.8% of the total deadly weapons arrests made by the NYPD (45,052). To be efficient given the frisk levels, there would have had to be 7,330 pistol recoveries, which would be 14.9% of total pistol recoveries over this time period. Even if the department were technically efficient on this output given the 1,053,565 frisks from 2004 to 2008, frisks would still account for only roughly 15% of pistol recoveries in the city. In terms of contraband recovery, the NYPD made 453,781 drug abuse violation arrests, with 32,162 coming from stop and frisk interactions (7.1%) and a shortage of 40,072 contraband recoveries. Even if the department were technically efficient on this output, this would represent 14.6% of contraband recovery from 2004 to 2008. Thus, the focus on stop and frisk policy misses a great deal of police action in New York City and, while important, must be augmented with broader police process and outcome analysis in future studies.
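The actual and technically efficient shares for 2004 to 2008 can be recomputed from the reported counts. The sketch below assumes that the revised share adds the shortage to both the stop-and-frisk count and the department-wide total, which is the reading that reproduces the percentages in the text. The function and variable names are illustrative.

```python
def sqf_shares(nypd_total, from_sqf, shortage):
    """Return (actual %, revised %) of NYPD activity that derives from frisks.

    Assumption: the revised share adds the shortage to both the
    stop-and-frisk count and the department-wide total.
    """
    actual = 100 * from_sqf / nypd_total
    revised = 100 * (from_sqf + shortage) / (nypd_total + shortage)
    return round(actual, 1), round(revised, 1)


# (NYPD total, count from stop and frisk, shortage), 2004-2008, from the text.
figures = {
    "arrests": (2_321_002, 97_798, 121_992),
    "pistols": (45_052, 3_052, 4_278),
    "contraband": (453_781, 32_162, 40_072),
}

results = {name: sqf_shares(*values) for name, values in figures.items()}
for name, (actual, revised) in results.items():
    print(f"{name}: actual {actual}%, if technically efficient {revised}%")
```

Under this assumption each output roughly doubles its share of department-wide activity at full technical efficiency, yet none exceeds about 15%.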
Future research should fully model the role of frisk efficiency in terms of its causes (organizational, environmental, and political) as well as its effects (crime rates, deterrence, citizen opinion, and other important measures). This is a complex relationship that lies at the heart of the effectiveness perspective, and one that requires specific causal modeling using precise analytics with proper temporal sequencing (such as structural equation modeling). One could argue that lower efficiency (increased inputs and decreased outputs) actually supports the deterrence position that the increased chance of being frisked lowers crime in the form of fewer people carrying guns or contraband, though more recently scholars generally supportive of the NYPD have questioned this position as illogical, fatally flawed, inequitable, and possibly illegal (see Eterno & Silverman, 2012). While beyond the general scope and purpose of this article, a correlation analysis was conducted that analyzed the percentage decrease in crime by precinct between the years 2001 and 2011 and several pertinent variables from the present data set that covers 2004 to 2010. The percentage change in the number of frisks between the individual years 2004 and 2010 was negatively and significantly correlated with index crimes, but the correlation was weak (r = −.376, p < .01) for such an extended period of analysis. The percentage change in total stops was negatively but not significantly correlated with index crimes (r = −.158, p > .05), but the percentage change of stops resulting in frisks was negatively and significantly correlated with the percentage change in index crime (r = −.257, p < .05). Furthermore, the mean output efficiency score from 2004 to 2010 was positively and significantly correlated with the percentage change in reported index crime (r = .235, p < .05), but the input-oriented mean IOTA score was not significantly correlated with the percentage change in index crimes (r = .162, p > .05).
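The strength and significance of these correlations can be gauged with a short calculation. The sketch below converts each reported r into the share of variance explained (r squared) and into a t statistic, t = r * sqrt((n - 2) / (1 - r^2)). It assumes n = 76 precincts, a figure inferred from the precinct counts reported earlier rather than stated for this analysis, and it uses approximate two-tailed critical values for 74 degrees of freedom taken from standard t tables.

```python
import math

N_PRECINCTS = 76   # assumption: one observation per precinct
T_CRIT_05 = 1.993  # approximate two-tailed critical t, df = 74, alpha = .05
T_CRIT_01 = 2.644  # approximate two-tailed critical t, df = 74, alpha = .01

# Correlations with the percentage change in index crime, as reported in the text.
correlations = {
    "pct change in frisks": -0.376,
    "pct change in stops": -0.158,
    "pct change in stops resulting in frisks": -0.257,
    "mean output-oriented IOTA": 0.235,
    "mean input-oriented IOTA": 0.162,
}


def t_statistic(r, n):
    """t statistic for testing whether a Pearson correlation differs from zero."""
    return r * math.sqrt((n - 2) / (1 - r * r))


def significance(t):
    """Classify |t| against the approximate critical values above."""
    if abs(t) >= T_CRIT_01:
        return "p < .01"
    if abs(t) >= T_CRIT_05:
        return "p < .05"
    return "p > .05"


summary = {}
for label, r in correlations.items():
    t = t_statistic(r, N_PRECINCTS)
    summary[label] = (round(100 * r * r, 1), round(t, 2), significance(t))
    print(f"{label}: r^2 = {summary[label][0]}%, t = {summary[label][1]}, {summary[label][2]}")
```

Under these assumptions the significance levels agree with those reported, and even the strongest correlation accounts for only about 14% of the variance in crime change, which is consistent with describing it as weak.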
Taken together, one could perhaps attempt to argue that these correlations support the effectiveness position: Frisks are more strongly correlated with crime reductions than are mere stops, and less efficient frisk application within a precinct is correlated with a decrease in index crimes. However, this does not imply causality, and these preliminary correlations over an extended period of time are weak to slightly moderate in strength. Thus, future studies of NYPD stop and frisk policy need to be comprehensive in studying the predictors and effects of efficiency and effectiveness within a framework that includes equity as the foundation of the ability to perform a legal frisk. This work adds efficiency data, which did not previously exist, to this timely and important end. Going forward, all three perspectives need to be integrated to holistically study this critical social and law enforcement issue in America’s largest and most diverse city. As it currently stands for NYPD policy makers, the equity and efficiency research together outweighs the effectiveness literature, making their case for increased frisks more difficult to make, especially when crime reductions were already occurring before the surge in frisks began (Eterno & Silverman, 2012).
Efficiency scores can also be correlated and modeled in future research with police misconduct complaints at the precinct and officer levels of analysis to study the wider impacts of frisk inefficiency. This approach could then be replicated in other large departments to see if the results are the same throughout the organizational field or are more directly related to local departmental policy, factors, or outcomes. That said, this work is unique within the police efficiency literature using DEA and has the potential to spur new avenues of research on police policy outcomes, especially as it replicates the findings of Geller and Fagan (2010) that the NYPD is experiencing diminished returns on its stops and frisks. This reality may eventually reach a point where the NYPD is mandated to more closely balance its frisks against the outcomes of that legal yet controversial action, similar to how it was forced to produce public data on stops and frisks such as that used in this analysis.
Conclusion
In sum, these results represent two extremes relative to each perspective. While clearly there should be no quota on frisks to meet, there should also not be a floor, which the input-oriented results produce. On the other end of the spectrum, it is difficult to produce more outputs absent more effective and incisive decision making, as the outcome of a frisk is not known until after it is conducted yet clearly more frisks are being conducted than should be. Thus, the efficiency analysis conducted here points to a combination of these perspectives: Frisks should be reduced by some degree in accordance with the officer safety standard of Terry, and successful frisks should be increased through analysis of the most successful officers, precincts, and divisions to inform the wider department, thus making the NYPD more efficient, effective, and equitable in their application of frisks. Moreover, while full technical efficiency may not be feasible or even desirable, the performance and outcome benchmarks produced here more holistically inform the NYPD, its precincts and their respective borough commands of their comparable frisk efficiency. In the future, just as COMPSTAT has become institutionalized within the NYPD, so too may analytical techniques such as DEA, a proven management tool applicable to stop and frisk policy which is a complex and growing social issue.
Footnotes
Declaration of Conflicting Interests
The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author received no financial support for the research, authorship, and/or publication of this article.
