Abstract
Background:
Evaluations are routinely conducted by government agencies and research organizations to assess the effectiveness of technology in criminal justice. Interdisciplinary research methods are salient to this effort. Technology evaluations are faced with a number of challenges including (1) the need to facilitate effective communication between social science researchers, technology specialists, and practitioners, (2) the need to better understand procedural and contextual aspects of a given technology, and (3) the need to generate findings that can be readily used for decision making and policy recommendations.
Objectives:
Process and outcome evaluations of technology can be enhanced by integrating concepts from human factors engineering and information processing. This systemic approach, which focuses on the interaction between humans, technology, and information, enables researchers to better assess how a given technology is used in practice.
Subjects:
Examples are drawn from complex technologies currently deployed within the criminal justice system where traditional evaluations have primarily focused on outcome metrics. Although this evidence-based approach has significant value, it can fail to fully account for the human and structural complexities that compose technology operations.
Conclusions:
Guiding principles for technology evaluations are described for identifying and defining key study metrics, facilitating communication within an interdisciplinary research team, and understanding the interaction between users, technology, and information. The approach posited here can also enable researchers to better assess factors that may facilitate or degrade the operational impact of the technology and answer fundamental questions concerning whether the technology works as intended, at what level, and at what cost.
Keywords
Introduction
Governing bodies and research institutes routinely support efforts to conduct evaluations of technology used in the criminal justice system. These evaluations face several challenges and opportunities including (1) the need to facilitate effective communication between key stakeholders including, but not limited to, social science researchers, technology specialists, and practitioners, (2) the need to better understand the procedural and contextual aspects of a given technology, and (3) the need for research studies that can be used for decision making and to generate specific policy recommendations.
In this article, we begin by discussing the evidence-based criminal justice movement and the sometimes divergent missions and goals of research and practice organizations that can limit technology evaluations and the utility of the results obtained. We then discuss guiding principles that incorporate a human factors engineering (HFE) and information processing approach in the study of criminal justice technologies. These guiding principles address research challenges and opportunities when conducting technology evaluations in the criminal justice system while strengthening the utility of technology-related criminal justice research.
Informing Practice With Research: The Evidence-Based Movement
Contemporary criminal justice has seen increased attention toward developing a knowledge base regarding “what works” in enhancing public safety (Sherman et al., 1997). This movement, influenced by the field of evidence-based medicine, advocates for the direct use of research evidence in practice (Sherman, 1998). Evidence-based criminology has made significant strides in the increased use of science to inform policy and practice (Koper, Lum, & Willis, 2014; Lum, Koper, & Telep, 2011; Sherman, Farrington, Welsh, & MacKenzie, 2002; Skogan & Frydl, 2004; Weisburd & Neyroud, 2011). The evidence-based movement has also been welcomed by both criminologists and practitioners interested in increasing the effectiveness of public safety policies (Clear, 2010; Welsh, 2006).
In accumulating knowledge on what works, evidence-based criminology classifies scientific evidence according to the methodological strength of their research designs, emphasizing the use of randomized controlled experiments and sufficiently rigorous quasi-experiments. The minimum interpretable design includes a measure of the outcome of interest before and after an implementation and comparable control conditions (Farrington, Gottfredson, Sherman, & Welsh, 2002). Proponents declare that a program must be supported by at least two studies using at least a minimum interpretable design to be included in the what works category (Farrington et al., 2002, p. 18).
Notions of Evidence and Their Implications for Criminal Justice Research
Despite its appeal, an evidence-based approach can sometimes be constraining when it comes to deriving specific policy implications. Difficulties arise when researchers consider what is meant by “scientific evidence,” “best practices,” and what works (Moore, 2006, p. 323). In this sense, evidence-based criminology’s emphasis on methodological quality is viewed as overly rigid and potentially restrictive to the formation of new ideas and approaches (Clear, 2010; Greene, 2014; Sparrow, 2011). Moreover, outcomes represent but a small fraction of important issues in criminal justice practice. Many important issues may not readily lend themselves to rigorous statistical evaluation, a research challenge that stretches beyond the criminal justice field. For example, Greenhalgh and Russell (2010) argue that rigid experimentation does not reflect basic realities of eHealth programs in the medical field, which often have multiple and sometimes competing goals, and that program outcomes typically “erode and change over time and across contexts” (p. 2). Thus, a focus on the statistical measurement of variables, which defines experimental science, does not completely capture the often complicated nature of real-world practices.
Within the crime prevention arena, Eck (2002) illustrates a similar situation regarding the effectiveness of metal detectors in preventing airplane hijackings. Eck (2002) noted that since hijackings have always been a rare event, an analysis of the reduction in such incidents following the installation of metal detectors at airports would not be sufficiently powered to achieve statistical significance. Despite this fact, Eck argues that “no sensible person would claim that these metal detectors are ineffective and demand their removal. Yet, when we use the classical experimental design as the benchmark, this is exactly what we are implying” (p. 284). Further, the cost of a human, system, technical, or procedural error, regardless of how infrequent, could result in a lapse of security and an ensuing catastrophic event. That is, the very notion of what data should be considered “important” and ultimately statistically significant is inherently different when comparing classical experimental design versus the importance of studying and minimizing human or system errors (U.S. Department of Justice, 2014). Ironically, ignoring such errors, albeit rare in occurrence, may result in statistically significant differences in terms of lives lost and economic costs incurred compared to correcting such errors beforehand. The interaction between humans and technology is an important area of study for criminal justice researchers, given that human and system performance issues, even at a micro level, can impact macro-level outcomes.
Arguments posed by Greenhalgh and Russell (2010) and Eck (2002) reflect Moore’s (1995) observation that researchers are often guilty of having “too narrow a view of what constitutes knowledge valuable enough to use in confronting public problems” (pp. 302–303). A recent study by Sidebottom, Guillaume, and Archer (2012) raises the question of what is the proper definition of evidence. In this project, the Warwickshire Police and City Council partnered with the Jill Dando Crime Science Institute to study the theft of customer bags from shopping carts in supermarkets. To avert the problem, a supermarket chain installed safes within shopping carts so customers could securely lock their bags while shopping. Given the limited time frame of the intervention (3 months) and the singular target site, too few theft incidents occurred for a sufficiently powered outcome evaluation. However, surveys of customers conducted to explore the causal mechanisms by which the shopping cart safes may help curtail bag thefts generated a number of important findings. For example, the safes were noticed by a significant number of shoppers who reported few difficulties operating the devices. Many shoppers also reported that they would use the safes again and that the safes were too small for certain bags. Hence, collecting data and feedback from actual users, studying the operational design of the safes, how they were used, and the context they were being used in revealed a range of anticipated and unanticipated perceptions of utility. If the definition of evidence were expanded beyond just outcomes to include an analysis of human performance, operational procedures, and contextual factors, then the Sidebottom et al. (2012) findings and research approach can be viewed as having increased utility for decision makers. Such findings may ultimately influence theft outcomes and future implementations.
Criminal justice research, however, has focused predominantly on outcomes of interest while overlooking the causal mechanisms that may help explain why the outcome occurred (Sullivan, McGloin, & Kennedy, 2012). Indeed, for practitioners, understanding precisely why a specific program achieved its desired outcome or not is just as important as knowing whether the program worked (Greene, 2014; McGloin & Thomas, 2013).
The Challenge of Criminal Justice Technology Evaluations
According to Weisburd and Neyroud (2011), criminal justice agencies have typically adopted new technologies without first evaluating their effectiveness. In their view, criminal justice practitioners have tended to give new technologies the benefit of the doubt, assuming the technology works in theory but knowing “little about how to use such technologies so that they work best” (p. 7). In all, it is clear that more comprehensive research approaches are needed for conducting evaluations of technology used in the criminal justice system. An emphasis on outcomes is important, but this must be coupled with a continuous focus on the users of the technology and an analysis of the relevant operational procedures and processes. Such a systemic approach is essential for understanding the results of outcome evaluations and for developing policy recommendations.
Technological evaluations unfortunately follow the approach of criminal justice as a whole, with an almost exclusive focus on outcomes. Such an approach fails to consider other important procedural and contextual factors operating which have a direct impact on the effectiveness of the technology. As an example, consider two popular technological criminal justice interventions: the video surveillance of public places (i.e., closed circuit television [CCTV]) and the electronic monitoring of offenders. Both these strategies are reliant on a series of complex technological and human performance factors to achieve their desired outcomes. More specifically, a range of different user and systems tasks, subtasks, procedures, and operations have to be implemented and executed successfully and reliably for the technology system to work.
With both CCTV and electronic monitoring, the required tasks and procedures are also interrelated, with later tasks and procedures contingent upon successful completion of earlier tasks. For example, CCTV technology requires (1) installation of cameras which have a continuous connection to electricity and a hardwired or wireless telecommunications network, (2) continuous relay of video footage from the cameras to a central station, (3) retroactive or real-time video footage monitoring by a human operator, (4) detection of criminal infractions contained in the footage by the operator, (5) notification of the police of the criminal infraction, and (6) apprehension by the police of the offender observed committing the criminal infraction, either on scene or at a later date following an investigation (LaVigne, Lowry, Markman, & Dwyer, 2011; Ratcliffe, 2006). Similarly, the use of electronic monitoring requires that (1) an offender is fitted with a monitoring device; (2) the device continuously sends a signal to a monitoring center; (3) the community supervision agency is notified in a timely manner when the offender violates an inclusion, exclusion, or mobile exclusion zone condition of their release; (4) the agency determines a course of action to address the violation; and (5) the course of action is adequately enacted by the agency (Harris, 2013).
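The contingent structure of these task chains can be illustrated with a short sketch. It assumes, purely for illustration, that each step succeeds independently with some probability, so that the chance of the whole chain succeeding is the product of the step probabilities. The step labels paraphrase the six CCTV steps listed above; the probabilities and the function names `chain_success` and `weakest_step` are hypothetical placeholders, not values or methods taken from any study.

```python
from math import prod

# Hypothetical per-step success probabilities for the six CCTV steps.
# These numbers are illustrative assumptions, not empirical estimates.
cctv_steps = {
    "cameras installed and connected": 0.99,
    "footage relayed to central station": 0.98,
    "footage monitored by operator": 0.90,
    "infraction detected by operator": 0.70,
    "police notified": 0.80,
    "offender apprehended": 0.60,
}


def chain_success(step_probabilities):
    """Probability that every step succeeds, assuming independent steps."""
    return prod(step_probabilities)


def weakest_step(steps):
    """Name of the step with the lowest assumed success probability."""
    return min(steps, key=steps.get)


overall = chain_success(cctv_steps.values())
print(f"Overall chain success: {overall:.3f}")
print(f"Weakest step: {weakest_step(cctv_steps)}")
```

Under these assumptions, the overall success probability is always lower than that of the weakest single step, which is one way to see why an outcome-only evaluation can hide the step at which a technology system is failing.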
As the prior examples describe, outcomes are in large part a function of human and system performance as well as the operational procedures and policies that govern and direct the usage of such technology. Conversely, the absence of clear and effective usage procedures and policies or the implementation of inaccurate or incomplete procedures and policies may detrimentally impact desired outcomes. Unfortunately, for both CCTV (Welsh & Farrington, 2009) and electronic monitoring (Renzema & Mayo-Wilson, 2005), the vast majority of research has exclusively focused on whether or not the technology produced the desired outcome (i.e., reduced crime rates and recidivism) while the specific tasks, procedures, and usage context have gone largely unexplored. Therefore, while practitioners may be able to gain a general or overall sense of whether a particular technology works or not, they are largely left without any specific performance data regarding why the technology works or not and at what level and cost. Hence, it is challenging for researchers to develop subsequent recommendations that are specific enough to aid in decision making and policy development.
Scope of Current Study
This article demonstrates how technology evaluations can be executed and expanded to account for human, procedural, and contextual factors. We begin with a review of an evaluation of the public CCTV system in Newark, New Jersey, which was headed by an author of this article. This project continuously highlighted the importance of human performance and procedural factors, which led to refined analyses and findings beyond outcome measures. Guiding principles are then presented for technology evaluations, which can identify and define study metrics and facilitate communication within an interdisciplinary research team. The guiding principles are presented within the context of an ongoing multipronged evaluation of a global positioning system (GPS)-based offender monitoring system. Lastly, an overall human factors development approach is discussed, which stresses the importance of continually focusing on the actual users of the technology throughout the design and evaluation of the system.
The Importance of Human Factors in Technology Evaluations: An Applied Example
In 2006, the city of Newark, New Jersey, committed to more readily incorporating technology to improve public safety. One program involved a public CCTV camera system. From 2007 through 2010, a 146-camera system was installed over several phases. The main goal was the reduction in two outcome metrics: overall street-level crime and public disorder. The Newark Police Department (NPD) also established a video surveillance unit (VSU), which had responsibility for the day-to-day CCTV operations. During all shifts, two VSU operators, under the supervision of a police sergeant, monitored the cameras to detect incidents of crime and disorder. Upon detecting an incident, operators report the event via the department’s computer-aided dispatch system, with police dispatch being conducted in a “differential response” manner: higher priority incidents are addressed before those with lower priority, a process considered standard operating procedure in police departments across the United States (LEITSC, 2008).
Initial technology evaluations of Newark’s system found limited evidence of effectiveness. Caplan, Kennedy, and Petrossian (2011) found, of the first 73 cameras installed, auto theft was the only crime type included in the analysis that experienced an overall reduction. However, a more procedural analysis of the individual camera locations, rather than the entire CCTV system, found auto theft levels did not change at more than half (39 of 73) of the individual camera sites. Replications of these analyses conducted after the full 146-camera system was in place produced similar results. Piza (2014) found that the full CCTV system generated modest auto theft reductions in only one of four police precincts, while less than half (54 of 146) of the individual camera sites showed any evidence of an auto theft reduction (Piza, Caplan, & Kennedy, 2014a). None of the other five crime types included in the study experienced any significant reductions (Piza, 2014).
To improve the operational effectiveness of the technology, the NPD collaborated with a research team from John Jay College and Rutgers University on a series of studies funded by the National Institute of Justice. One study measured how well Newark’s CCTV system increased the “certainty of punishment,” which prior research identified as the key component of deterrence, within CCTV target areas (Piza, Caplan, & Kennedy, 2014b). Crime incidents detected and reported by CCTV were closed by an enforcement action at a significantly higher rate than crime reported via the 9-1-1 emergency line, suggesting that punishment certainty was indeed heightened by CCTV. The potential benefits of increased punishment certainty were negated, however, by the fact that proactive surveillance activity was a fairly rare occurrence. Over the 165-week study period, both proactive detections of crime incidents by VSU operators and subsequent enforcement by police officers steadily decreased. While a weekly average of 26.84 detections and 9.47 enforcement actions occurred during Phase 1 of the CCTV operation (when 11 cameras were in place), activity steadily fell to a weekly average of 2.11 detections and 1.22 enforcement actions by the time the system expanded to 146 cameras. Regression analyses found that the continuous expansion of the CCTV system had a negative effect on proactive CCTV activity, and each additional phase of camera installation caused up to a 47% reduction in weekly detection and enforcement levels. This suggests that the expansion of the CCTV system absent an increase in manpower, and the increased amount of information being collected, overloaded the VSU operators and prevented early levels of proactive activity from being maintained.
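The scale of this decline, and of the accompanying growth in operator workload, can be made concrete with a small calculation. The sketch below uses only the weekly averages and camera counts reported above, together with the two-operator staffing level described earlier. Treating that staffing level as constant across phases is an assumption made for illustration, and the helper names `percent_decline` and `cameras_per_operator` are hypothetical.

```python
# Assumption: two VSU operators per shift in every phase, as described earlier.
OPERATORS_PER_SHIFT = 2

# Weekly averages and camera counts as reported in the text.
phase_1 = {"cameras": 11, "detections": 26.84, "enforcement": 9.47}
full_system = {"cameras": 146, "detections": 2.11, "enforcement": 1.22}


def percent_decline(before, after):
    """Percentage drop from `before` to `after`."""
    return (before - after) / before * 100


def cameras_per_operator(cameras, operators=OPERATORS_PER_SHIFT):
    """Camera-to-operator ratio for a given system size."""
    return cameras / operators


for metric in ("detections", "enforcement"):
    drop = percent_decline(phase_1[metric], full_system[metric])
    print(f"Weekly {metric}: {drop:.1f}% decline")

for label, phase in (("Phase 1", phase_1), ("Full system", full_system)):
    ratio = cameras_per_operator(phase["cameras"])
    print(f"{label}: {ratio:.1f} cameras per operator")
```

Placing the two ratios side by side shows how sharply the monitoring load per operator grew while proactive activity fell, which is the pattern the regression analyses point to.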
A tangential study by Piza, Caplan, and Kennedy (2014c) further documented the importance of analyzing the interaction between humans and technology and how such interactions can directly impact outcome measures. The study explored the use of CCTV as an early intervention mechanism to detect and disrupt street-level activity that can lead to serious violence. From viewing and coding CCTV footage immediately preceding and including serious violent crime incidents, researchers found that violent crimes were typically preceded by multiple “intervention opportunities”: less serious incidents that provided sufficient probable cause or reasonable suspicion for police to intervene. Despite the occurrence of such incidents and the fact that VSU operators viewed these incidents in real time, operators made the decision not to report the vast majority of intervention opportunities. In retrospective interviews with researchers, operators reported that large queue times (i.e., the amount of time between the reporting of a crime and dispatch of a police officer) discouraged them from reporting many of the criminal infractions they observed. As an example of such, Piza et al. (2014c) reported that researchers once witnessed an operator monitoring open-air drug activity on a computer monitor. As reported by Piza et al. (2014c), “After stating that she often views these same individuals engaging in similar behavior, the operator was asked why she didn’t report the incident, to which she responded, ‘Because by the time the radio car gets there they’ll be long gone’” (p. 12). Operator beliefs were supported by quantitative data from this study, as the queue times associated with all but two of the intervention opportunities were likely too large for police to have arrived prior to the violent crime incidents, had they been dispatched.
The findings of this prior research (Piza et al., 2014b; Piza, Caplan, & Kennedy, 2014c) show that the operational impact of CCTV technology is directly tied to human and procedural factors operating within traditional realms of policing (i.e., crime reporting, officer dispatch, and officer response). The large camera-to-operator ratios and the differential response policy of police dispatch represent “surveillance barriers” that minimize the effectiveness of CCTV (Piza et al., 2014b). This suggests that improvement in CCTV could be achieved by analyzing and applying the findings pertaining to human performance and other usage factors. With this in mind, the research team and NPD conducted a randomized controlled trial to test how removing these surveillance barriers influences the effectiveness of CCTV (Piza et al., 2014d). To minimize the camera-to-operator ratio, an additional CCTV operator was deployed to the control room and dedicated to monitoring only the treatment area cameras during the experiment. To bypass delays inherent in the differential response manner of deployment, the experimental operator was assigned two patrol cars for the purpose of responding to the incidents detected on treatment cameras. Incidents were not reported through the computer-aided dispatch (CAD) system but were relayed via two-way radio directly to the field supervisor patrolling with the experimental police units. The experimental strategy generated statistically significant, and sizable, reductions in violent crime and social disorder. This is noteworthy in light of CCTV’s limited effect on most street-level crime types, as observed in Newark (Caplan, Kennedy, & Petrossian, 2011; Piza, 2014; Piza et al., 2014a) as well as in the general CCTV literature (Welsh & Farrington, 2009).
The findings directly support the hypothesis that the integration of CCTV with proactive police activity generated by human operators produces a crime control benefit greater than what has previously been achieved via “stand-alone” camera deployment (Piza et al., 2014d).
The multilevel analysis of Newark’s CCTV system highlights the importance of including an analysis of human and procedural factors in the evaluation of criminal justice technologies. Such an approach enabled researchers to contextualize their findings at each step of the evaluation and to develop and refine research questions in order to maximize the policy relevance of the studies. Despite these benefits, the fact that a human factors approach was incorporated in a post hoc manner, as the evaluation progressed, prevented the research team from providing additional insights. For one, outside of the general change in dispatch policy, the research shed little light on the actual activities of the VSU operators. Specifically, the research did not identify which aspects of proactive monitoring, if any, improved with the policy change. In addition, the experience of a key stakeholder group, the responding police officers, was not measured. Since the apprehension of offenders is contingent upon actions of the responding officers, it would have been helpful to measure officer experiences, both in the standard CCTV operation and within the experimental strategy. Lastly, the research did not measure the long-term sustainability of the experimental strategy, specifically in regard to how the policy changes impacted the workloads of VSU operators and police officers.
Building on such lessons learned, we advance a set of guiding principles for criminal justice technology evaluations. Making extensive use of tenets from the disciplines of HFE and information processing, guiding principles are presented within the context of a forthcoming multipronged evaluation of a GPS-based offender monitoring system. The research design incorporates HFE at each step of the research process. By focusing on human factors from the start, the design enables the research team to measure procedural and contextual aspects of the technology and increase the utility of the findings.
Guiding Principles for Criminal Justice Technology Research
With the rise of evidence-based criminology, researcher and practitioner collaborations have increased in popularity, moving beyond ad hoc projects to overarching, stable partnerships between academic institutions and criminal justice agencies (Henry & Mackenzie, 2012). In an attempt to facilitate such research partnerships, which are often characterized by the divergent needs and goals of the two sides (Blomberg, 2009; Wellford, 2009), guiding principles for criminal justice technology evaluations are developed and discussed. These principles are explicitly geared toward identifying and defining study metrics as well as facilitating communication within an interdisciplinary research team. While these principles directly relate to research on criminal justice technologies, the ideas presented are applicable to researcher and practitioner collaborations in criminal justice generally.
Guiding Principle 1: Focus on Organizationally Driven Core Metrics From the Onset
To increase the likelihood that a technology evaluation will generate policy-relevant findings, and to facilitate better communication throughout the research process, a guiding principle is described, which directly ties such research to broader mission statements. This involves identifying and operationally defining metrics derived directly from the mission statements of the organizations sponsoring the research, and using such metrics as a means to provide ongoing direction and focus for the evaluation. Although mission statements tend to be written broadly or generically, such statements also identify important metrics that are of overriding or core importance to a given organization. Such core or guiding metrics can be operationalized at the onset of the research process. Ideally such definitions should be concrete, observable, and include behaviorally based examples of the project technologies.
For instance, a mission statement could contain broad objectives such as “provide public safety, offender accountability, and fiscal savings” or “provide safe and effective technology solutions.” The question of exactly what “safe” and “effective” mean within the context of a particular technology evaluation, and ensuring this is operationally defined, is an important consideration early on in the research process. Having clearly defined metrics is particularly important when considering human performance and when answering the question: Does the technology work and at what level? For example, within the context of an evaluation of GPS technology used as a deterrent to intimate partner violence, an important metric related to safety is the accuracy of the tasks performed by users and operators monitoring the system as well as the time it takes to correctly complete a given task or procedure. Performance metrics and goals can be developed using the following format: U% of a sample of end users should be able to correctly perform T% of critical tasks within X time and with no more than E errors (Salvemini, 1999; Smith & Siochi, 1995). Applying this format, benchmark performance goals can also be specified (e.g., 90% of operators monitoring offenders should be able to correctly detect an alarm 99% of the time within 30 seconds from the appearance of the alarm on the operator’s display) and used to assess if the technology is being used above or below expectations. Note the intention here is not to advocate a specific performance goal or standard but to demonstrate how such goals could be constructed in light of technical factors, existing policy, and metrics an organization deems important. In the absence of specific human performance metrics and goals, and of the tracking of human performance against them, our understanding of just how well the technology system is performing is less informed.
Such performance metrics can potentially be developed into standards, guidelines, and specific performance goals for a given task, process, or procedure. Such clarity would also provide insight into other areas such as training, operational procedures, and policy decisions.
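The goal format described above can be sketched as a simple check against observed trial data. The example below is a minimal illustration under stated assumptions: the `Trial` record, the function names, and the sample data are all hypothetical, and the default thresholds mirror the illustrative 90% / 99% / 30-second benchmark rather than any recommended standard. The error cap E from the general format is omitted for brevity.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """One alarm presented to one operator (hypothetical record layout)."""
    operator_id: str
    detected: bool
    seconds_to_detect: float


def operator_success_rates(trials, time_limit_s):
    """Share of trials each operator detected correctly within the time limit."""
    totals, successes = {}, {}
    for t in trials:
        totals[t.operator_id] = totals.get(t.operator_id, 0) + 1
        ok = t.detected and t.seconds_to_detect <= time_limit_s
        successes[t.operator_id] = successes.get(t.operator_id, 0) + int(ok)
    return {op: successes[op] / totals[op] for op in totals}


def goal_met(trials, user_share=0.90, task_share=0.99, time_limit_s=30.0):
    """True if at least `user_share` of operators reach `task_share` success."""
    rates = operator_success_rates(trials, time_limit_s)
    if not rates:
        return False
    passing = sum(1 for rate in rates.values() if rate >= task_share)
    return passing / len(rates) >= user_share


# Hypothetical data: operator A detects every alarm quickly; operator B misses 5 of 100.
trials = [Trial("A", True, 10.0) for _ in range(100)]
trials += [Trial("B", True, 12.0) for _ in range(95)]
trials += [Trial("B", False, float("inf")) for _ in range(5)]

print(operator_success_rates(trials, time_limit_s=30.0))
print("Benchmark goal met:", goal_met(trials))
```

Keeping the per-operator rates separate from the pass/fail decision means the same data can show which operators fall short and by how much, and not only whether the benchmark was reached.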
Including such an organizationally driven, top-down approach during criterion development as well as considering human performance metrics should (a) ensure that the technology research is tied directly to the overarching mission of the organization, that is, a clear focus on the big picture is consistently maintained in terms of how research questions are framed and what is to be measured, and (b) enable the refinement of other potential metrics, study goals, and questions in the early stages of the research process. To the extent a proposed technology study does not address any of the core metrics, these considerations may also serve as a gating criterion prompting the research practitioner and organization to do further up-front analysis to clarify the goals and scope of the research as well as the needs, expectations, and concerns of the respective research organizations and practitioner partners. Effective communication of expectations and mutual agreement on project goals are necessary ingredients for a research collaboration to be successful (Sullivan, Khondkaryan, Moss-Racusin, & Fisher, 2013). Each research organization should clearly see what the overriding goals, metrics, and research questions are, and, more importantly, how these align with the mission of the organization. Assuming all parties are on the same page, which is critical at this juncture, the research can begin to move forward and consider key process and operational components. If not, continued discussion, iteration, and integration of ideas are necessary.
Assessing operational impact
Executing technology evaluations that can be better used for decision making and for generating specific policy recommendations requires an assessment of the operational impact of the technology in a number of key areas. For instance, the operational impact of complex technical systems, such as the GPS monitoring of offenders or the use of CCTV systems, can be felt across various user groups and organizations and, as is true of the performance of most technical systems, varies based on human performance factors, usage contexts, and other technical and environmental factors. Operational impact is arguably a composite or multivariate criterion and can be defined via three categories of core metrics: (a) economic metrics via objective measures such as cost savings, operating costs associated with implementation, maintenance, and day-to-day operation as well as cost comparisons relative to previous technology or technology to be upgraded; (b) human performance metrics; and (c) technology metrics and parameters. Viewing operational impact as a composite criterion enables the research team to begin to answer a critical and fundamental question, that is, does the technology work, at what level, and at what cost?
The collection and analysis of user and system errors is a critical component in assessing operational impact. Human errors may be categorized in a number of ways including (a) errors of omission, characterized by the leaving out of an appropriate step in a process; (b) errors of insertion, characterized by the adding of an inappropriate step to a process; (c) errors of repetition, characterized by the inappropriate adding of a normally appropriate step to a process; and (d) errors of substitution, characterized by the use of an inappropriate object, action, place, or time instead of the appropriate object, action, place, or time (Senders & Moray, 1991). In some cases, error probabilities may also be determined for a given series of events and a given number of options a user has to choose among (Sharit, 2012).
To the extent the technology is not performing to stated goals or expectations, such an analysis can inform researchers as to procedural and other technology areas that may be problematic as well as potential issues of user training, motivation, or other factors. This analysis, coupled with an analysis of the technology’s performance parameters (for instance, system reliability, false alarm rates), provides researchers with a comprehensive view of precisely how effectively the technology is being used and a basis for identifying areas for enhancement. In defining operational impact, additional core metrics within each category may be included based on the specific technology to be studied and the mission of the sponsoring research organization. Such metrics need to be carefully selected and defined by the research team at the onset of the evaluation.
Guiding Principle 2: Build a Bridge Between the Technical and Research Mind-Sets
Sullivan, Hunter, and Fisher (2013) highlighted the importance of discussing the products to be developed at the outset of a research project and how such a discussion contributes to designing a project that is sure to answer the questions being asked and reduces challenges regarding the dissemination of unexpected and potentially unfavorable findings. According to Sullivan et al. (2013), the likelihood for success is greatest when the researcher and practitioner discuss and agree on (a) the products that will result from the study, (b) the intended audience for those products, and (c) the goal of disseminating those products to the intended audience. Moreover, communicating findings in as simple and nontechnical language as possible increases the likelihood that the information will be used to affect policy and practice.
Given the differences in training and expertise of social science researchers, technology developers, and practitioners, this is not a trivial task. There are different viewpoints and mind-sets at play during any technology evaluation. An approach is needed to bridge from the technical mind-set to the research design mind-set, so a researcher can speak the same language early on in the research process. This will help facilitate communication with research team members and the development of final metrics and study designs.
To illustrate this guiding principle, consider an evaluation assessing the impact of GPS technology on offender monitoring. Figure 1 depicts a technology model for GPS monitoring that was contained in a baseline report for the project (Harris, 2013). This model was developed to (a) generically depict the overall technology and its major components and (b) facilitate discussions with researchers, technology specialists, and practitioners during the design and execution of the technology evaluation. In essence, this model-building approach has provided stakeholders with a better understanding of the technology and streamlined the conceptualization of major variables of interest.

Figure 1. Technology overview of global positioning system (GPS) components used in the supervision of offenders convicted of intimate partner violence (Harris, 2013).
The major components of the technology system are depicted in Figure 1, such as the transmission of communications data among the various pathways, hardware devices, and user groups. There are a myriad of factors that could impact the technology being evaluated, undoubtedly leading to many interesting paths or directions the evaluation can take. Technology evaluations, particularly those involving such large-scale, complex technology as shown, would be facilitated by having clearly defined approaches for (1) defining the scope of the technology evaluation and keeping it in focus and (2) generating initial and final research questions that are tied to the ultimate research objectives. Developing such approaches is complicated throughout a technology research study by the disconnect between technical and research views of the technology. A mixed methods approach and framework is needed to help bridge this gap, wherein researchers and technologists could communicate and collaborate more effectively regarding both the technical aspects and social science research issues/questions. Ideally, a goal of such a framework would be to translate and depict the relevant research domains and metrics into a revised, research-friendly, technical representation of the system.
To address the aforementioned challenges, a human factors engineering (HFE) approach is used. This discipline is composed of two major activities: (1) analyzing user capabilities, tasks, and the work environment and (2) applying the results of this analysis to the design and testing of products, systems, and work environments (Karwowski, 2012; Salvemini, 1998). Human factors studies are routinely executed across different technological settings and government agencies including the Federal Aviation Administration, Food and Drug Administration, National Highway Traffic Safety Administration, and various branches of the U.S. Army and aerospace industry. Given that the cost of a system or user error in such contexts can be catastrophic, major goals of such research efforts include the reduction of user, system, and procedural errors to ensure the safe and effective use of technology systems and the development of agency-specific policy and guidelines for technology usage and development. This approach further exemplifies the importance of executing technology evaluations that go beyond a predominant focus on outcome measures.
An HFE approach is inherently interdisciplinary, employing methodologies from several knowledge domains including cognitive psychology, computer science, and organizational psychology. A human factors approach to systems development is also inherently collaborative and metric driven, requiring that users, developers, and subject matter experts work together throughout the research and development process. Measurable goals and objectives for design and usability as described earlier are also established from the onset of system development (Salvemini, 1998).
Human information processing models of user behavior are used to analyze and improve the interaction between people, technology, and information; to develop user performance metrics and methodologies; and to develop recommendations based on findings. Recommendations are developed in the form of specific user requirements and other design guidelines and potential standards for improving the effectiveness of the technology, which speaks directly to answering fundamental process evaluation questions. Information processing models assume that humans, like technology systems, have identifiable stages of information processing, from which data are passed serially or in parallel from one stage to the next. Of particular importance is the analysis of how technology users initially detect or sense information presented from the technology system (i.e., from data, signals, alarms, system messages, etc.), how that information is subsequently interpreted, stored, and recalled from the user’s memory, as well as the associated task(s) a user is required to perform as a result. According to Proctor and Vu (2006), an information processing approach (1) provides the foundation for much of contemporary cognitive science, cognitive neuroscience, and human-computer interaction and (2) uses common language and concepts to integrate concepts across different domains, levels, systems, and disciplines. Hence, this approach is particularly relevant to criminal justice technology research and the study of user interfaces (UIs).
The technology view of the GPS research evaluation in Figure 1 was translated and reillustrated from an HFE and information processing perspective. Figures 2 and 3 depict an information processing and HFE view of this same technology.

Figure 2. Human factors engineering view of global positioning system (GPS) depicting major user interfaces, research variables, and human information processing characteristics.

Figure 3. Organizational and other contextual factors that may impact global positioning system (GPS) technology research outcomes.
Figures 2 and 3, unlike Figure 1, emphasize the key research areas, overall research domains, and user interfaces for the evaluation. The term user interface denotes a flow of information between the user and system. Such information or data could be visual, auditory, or tactile, depending on the system and the types of controls and data entry devices used to operate the system. By considering information processing models, user and system behavior can be studied via a similar framework. More specifically, by speaking in terms of information or data flow, researchers can have a more integrated discussion of the technical and human/behavioral aspects (operating efficiency) of the technology system being evaluated. Humans also have quantifiable limits as to the amount and range of visual and auditory information they can correctly attend to and process (Kondraske & Vasta, 1995; McBride, 2005; Woodson, 1981). Considering such human limitations in a technology evaluation, along with the extent to which the technology fails to account for human behavior and characteristics and the consequences that result, provides a more detailed view of how well the technical system is performing.
At least four categories of research questions and metrics were identified for consideration in the technology evaluation: (1) what are the characteristics of the UIs? (2) what are the system or technology areas of concern for those UIs? (3) what are the design characteristics of the information or data being transmitted across those UIs? and (4) what are the overall contextual and organizational factors within which the technology is used? As shown in Figure 2, four categories of UIs and user groups are denoted: (1) monitoring service staff, (2) offender supervisory staff, (3) the offender, and (4) the victim. Each user group has different potential variables of interest and interaction with the technology. Regardless of the user group, the design characteristics of the information presented to those users are key considerations as are those associated with “information design.”
Figure 3 depicts the technology parameters of interest for the evaluation as well as the contextual and organizational factors that could also affect the operational impact of the technology. Technology researchers need to be aware of organizational sources of human error. According to Bogner (1994), latent errors may also occur based on the delayed-action consequences of incorrect decisions made in the upper echelons of the organization system. These include decisions concerning design and construction of equipment, structure of the organization, planning, scheduling, budgeting, personnel selection, and training. Consideration of such factors provides a more systemic research view of the technology under study. More importantly, it provides insight into the impact that procedural elements of the technology system or policy may have on resulting outcome measures. When researchers, technologists, and practitioners share a common information processing framework for discussing the technology being researched, more collaborative and integrated discussions can commence earlier in the research process.
Guiding Principle 3: Adopt an Information Processing Approach to Study Procedural and Operational Issues
As with any complex technological system such as GPS tracking, things can and do go wrong (St. John, 2013). An important goal in conducting criminal justice technology evaluations is to understand why a given technology is effective or not as well as translating such findings into meaningful policy recommendations. This is challenging, given that the technologies used within the criminal justice system may entail several types of jobs, work activities, and user groups, and also vary widely in terms of cognitive complexity. An information processing approach is also critical for evaluating technology with respect to procedural and operational issues. According to Wickens and Carswell (2012), “information processing lies at the heart of human performance. In a plethora of situations in which humans interact with systems, the operator must perceive information, transform that information into different forms, take actions on the basis of the perceived and transformed information, and process the feedback from that action, assessing its effect on the environment” (p. 117). Such an approach can be used in a technology evaluation and can point the researcher to important human performance, operational, and procedural issues that impact the ultimate effectiveness of the technology.
Within the example of GPS supervision of offenders, information varies depending on the user group in question. As Figures 2 and 3 previously illustrated, such information is presented or displayed to the user via a variety of hardware and software devices, from other people, and from the environmental or usage context. For the user at the monitoring center, for example, data are presented via computer displays, mapping software, voice communications, instruments on a control panel (if any), and through direct and cross communications with other personnel and supervisors. Depending on the circumstance and the action(s) taken or not taken, researchers can then assess whether the action was appropriate or whether some type of error was committed. Decision-making processes can be captured diagrammatically using cognitive task analysis to depict the steps of the decision-making process as well as key informational inputs from the technology (Wei & Salvendy, 2004). Further, depending on the type and severity of the error committed, this may directly impact the operational accuracy and efficiency of the technology and the system overall (U.S. Department of Justice, 2014). The error could be a system error or another type of error, such as a decision-making error on the part of the user. Certainly, in the case of someone at a monitoring center for GPS technology, their specific job is to detect and receive signals, messages, and alerts, correctly interpret these pieces of information, and take some type of action as a result. Judgment calls and decision making are central to the process and may vary by individual, organizational policy, and training experience.
Adopting an information processing approach is necessary to develop needed definitions, standards, and procedures that operators would use in the handling of critical system data such as alerts. For instance, the meanings of the terms “event” and “alert” may differ from one vendor or service provider to the next. A distinguishing factor is how and where data are processed into useful information before being passed to an agency or an agency practitioner. For example, an event may be generated in a system due to loss of connectivity between an offender tracking device and the monitoring center, or in response to a zone violation (Harris, 2013). If the event is restored or cleared in the system before a predetermined threshold is exceeded, then the event will be ignored; otherwise an alarm or alert may be generated by the monitoring service provider and passed to the offender, victim, and/or supervising agency for their action. The exact conditions that lead to agency action depend on agency policy in regard to data consumption, service agreements, and how the service provider processes events into alerts, alarms, or other useful information.
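The event-to-alert logic described above can be sketched in a few lines. This is an illustrative sketch only, under stated assumptions: the `Event` fields, the `events_to_alerts` function, and the 300-second threshold are hypothetical and do not describe any vendor's actual processing rules, which, as noted, differ across service providers and agency policies.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Event:
    """Hypothetical record of a system event (field names are assumptions)."""
    kind: str                          # e.g., "connectivity_loss", "zone_violation"
    start_s: float                     # event start, in seconds from a reference time
    cleared_s: Optional[float] = None  # time the event was cleared; None if still open


def events_to_alerts(events: List[Event], threshold_s: float, now_s: float) -> List[Event]:
    """Return the events whose duration exceeds the threshold.

    Events cleared before the threshold is exceeded are ignored; events
    that are still open are measured up to the current time `now_s`.
    """
    alerts = []
    for event in events:
        end_s = event.cleared_s if event.cleared_s is not None else now_s
        if end_s - event.start_s > threshold_s:
            alerts.append(event)
    return alerts


# Illustrative data: a brief signal drop, a sustained zone violation,
# and a connectivity loss that has not yet been cleared.
log = [
    Event("connectivity_loss", start_s=0, cleared_s=60),
    Event("zone_violation", start_s=0, cleared_s=400),
    Event("connectivity_loss", start_s=100),
]
alerts = events_to_alerts(log, threshold_s=300, now_s=500)
print([a.kind for a in alerts])  # ['zone_violation', 'connectivity_loss']
```

Making the threshold an explicit parameter reflects the point that the same raw event stream can yield different alerts under different service agreements, which is one reason such definitions need to be documented in the research protocol.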
The notion of false alarms, particularly in the development of the research protocols, requires more precise definition and categorization. In the context of GPS offender monitoring, a general exclusion or mobile exclusion zone alarm can occur when offender and victim are within a certain distance of each other. It may be considered a “false alarm” if the victim is unequivocally in no danger, or the offender is not violating any terms of their orders. But the false alarm could be generated by an equipment malfunction resulting in an error in the location data of offender or victim, by a processing error in the monitoring services that incorrectly defines the boundaries of where the offender may be, or by the offender and victim being in proximity but legally so, such as at a court appearance or when transferring children under the terms of a custody agreement. While these may all be perceived as false alarms, they represent significantly different events and should be categorized appropriately based on frequency, importance, and severity. Such data would serve in part as one metric of how well the technology was actually performing and point to factors that may be degrading the effectiveness of the technology.
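One way to keep these distinctions visible in the data is to record a cause for each false alarm and report frequencies by cause rather than a single false alarm rate. The sketch below is a minimal illustration: the category names paraphrase the three sources described above, while the `FalseAlarmCause` enumeration and `summarize_alarms` function are hypothetical. Importance and severity ratings would need to be defined by the research team and are not modeled here.

```python
from collections import Counter
from enum import Enum
from typing import Dict, List, Optional


class FalseAlarmCause(Enum):
    """Hypothetical coding scheme for the false alarm sources in the text."""
    EQUIPMENT_MALFUNCTION = "equipment malfunction"
    PROCESSING_ERROR = "monitoring service processing error"
    LAWFUL_PROXIMITY = "lawful proximity"


def summarize_alarms(alarms: List[Optional[FalseAlarmCause]]) -> Dict[str, object]:
    """Summarize coded alarms; None marks an alarm judged to be valid."""
    counts = Counter(cause for cause in alarms if cause is not None)
    total = len(alarms)
    false_total = sum(counts.values())
    return {
        "total_alarms": total,
        "false_alarm_rate": false_total / total if total else 0.0,
        "by_cause": {cause.value: counts.get(cause, 0) for cause in FalseAlarmCause},
    }


# Illustrative coding of five alarms (made-up data, not study results).
coded = [
    None,
    FalseAlarmCause.EQUIPMENT_MALFUNCTION,
    FalseAlarmCause.LAWFUL_PROXIMITY,
    FalseAlarmCause.LAWFUL_PROXIMITY,
    None,
]
print(summarize_alarms(coded))
```

Reporting counts by cause separates problems that call for equipment or software fixes from those that call for policy changes, such as scheduling court appearances into the monitoring system.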
According to Marchand and Peppard (2013), any information-based initiative requires that the interaction between people and information be a central focus and that it is crucial to understand how people create and use information. According to the researchers, success for a given analytics system is achieved in part by challenging and improving the information it uses and how decisions are made. An information processing approach can also be used to better understand specific user tasks and the design of a user’s overall job (McCormick, 1979; Morgenson, Campion, & Bruning, 2012). By focusing on such aspects in the technology evaluation, more specific design and/or policy recommendations can be generated. For example, depending on the goals of a particular technology evaluation, and based on core metrics and other human performance criteria, recommendations, design goals, and standards may be generated regarding job requirements, work conditions, and the performance requirements of other human and system tasks. Such information is directly beneficial to a practitioner and provides direct, measurable impact that can influence agency operations.
Koper, Lum, and Willis (2014) discussed important challenges in using technology in policing. According to the authors, challenges can arise during implementation and with functionality problems with new technology. Koper et al. (2014) advocate user participation in the implementation of the technology as well as pilot testing and collection of data from users that can be incorporated into its final design. This approach can aid in the identification and correction of technology problems before implementation and for determining its most effective applications.
Figure 4 depicts an HFE development approach taken from Salvemini (1999), which is consistent with many of the recommendations by Koper et al. (2014) when applied to technology development and testing.

Figure 4. Human factors engineering development and testing approach (Salvemini, 1999).
In this process, users are continually involved throughout design, testing, and deployment of the technology. An important early step involves an analysis of user behavior, capabilities, tasks, and the work environment. This information is then used to generate design requirements and for iterative design testing with technology users. Human performance data and feedback collected are then applied to improve the design of the technology’s hardware and software as well as its operational, maintenance, and training procedures. For example, assuming testing involved an evaluation of the software interface used by operators of electronic monitoring technology, such human performance testing would reveal in part at what level users are performing a range of critical tasks. More specifically, the type and number of errors committed by users could then be further analyzed to identify potential trouble areas for the technology. Problems identified could involve system design deficiencies that are associated with human information processing errors, for instance, presenting too much information to users or presenting information that is unnecessarily complex or inconsistent. The results of this testing would enhance the utility of the research findings and generate specific design and process recommendations for improving the technology system.
Conclusions
This article presents a research approach for facilitating technology evaluations in the criminal justice system. This systemic approach, based on the disciplines of HFE and information processing, emphasizes the importance of analyzing the interaction between users, technology, and information at a procedural level. By including an analysis of human performance, procedural, and contextual factors as part of the technology evaluation, the present approach extends beyond traditional evidence-based research methodologies. A set of guiding principles is also presented for identifying and defining important study metrics as well as for facilitating communication within an interdisciplinary research team. Guiding principles can also inform the overall process for conducting criminal justice technology research as well as be integrated into an organization’s research process (Secret, Abell, & Berline, 2011). The guiding principles described here are particularly important in the early steps of the overall research process. This involves executing a critical initial analysis phase wherein researchers clearly define and understand the technology prior to any significant application or integration of social science methods. Only after the proposed guiding principles are addressed can an outcome evaluation take shape. This would have a direct impact on the technology researcher’s determination of the most appropriate research questions and designs in evaluating a given technology and its subsequent impact. Additionally, this phase has direct implications for the feasibility of conducting rigorous technology outcome evaluations and the ultimate utility of the research findings for practitioners and policy makers.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
