Abstract
The current cognitive engineering literature includes a broad range of models of human–automation interaction (HAI) in complex systems. Some of these models characterize types and levels of automation (LOAs) and relate different LOAs to implications for human performance, workload, and situation awareness as bases for systems design. However, some have suggested that the LOAs approach has overlooked key issues that need to be considered during the design process. Others are simply unsatisfied with the current state of the art in modeling HAI. In this paper, I argue that abandoning an existing framework with some utility for design makes little sense unless the cognitive engineering community can provide the broader design community with other sound alternatives. On this basis, I summarize issues with existing definitions of LOAs, including (a) presumptions of human behavior with automation and (b) imprecision in defining behavioral constructs for assessment of automation. I propose steps for advances in LOA frameworks. I provide evidence of the need for precision in defining behavior in use of automation as well as a need for descriptive models of human performance with LOAs. I also provide a survey of other classes of HAI models, offering insights into ways to achieve descriptive formulations of taxonomies of LOAs to support conceptual and detailed systems design. The ultimate objective of this line of research is reliable models for predicting human and system performance to serve as a basis for design.
Motivation
With the advent of sophisticated unmanned aerial systems (UAS; e.g., Lewis, 2013) and highly automated automobiles (National Highway Traffic Safety Administration [NHTSA], 2013), the design of automation, control modes, and interfaces has broader impact at work and in everyday tasks. Commercial UAS manufacturers as well as the automotive industry have great aspirations for automation. The FAA predicts the unmanned aerial vehicle industry will be worth $90 billion in a decade (Gent, 2015). Ford recently announced plans to produce an autonomous vehicle in high volume by 2021 (Ford, 2016). One of the main challenges in human–automation interaction (HAI) research has been developing models for predicting human and system performance to support design. To address this challenge, many classes of models have been developed in the literature, including computational models of decision making in HAI (Kirlik, Miller, & Jagacinski, 1993), “look-up tables” referencing potential automation outcomes (Endsley & Kiris, 1995; Kaber & Endsley, 1997; Parasuraman, Sheridan, & Wickens, 2000), finite-state models of user and machine behavior (Degani & Heyman, 2002), network models of human judgments when interacting with automation (Bisantz & Pritchett, 2003), mathematical models of reliance in HAI (Gao & Lee, 2006), and computational operator function models with state transitions for characterizing errors in use of automation (Bolton & Bass, 2011).
Despite this range of models, taxonomies of levels of automation (LOAs) have predominated in influence through the literature, possibly because of handiness and/or utility as a starting point for the systems design process. Such models are also now realizing use by a diverse range of policy groups and federal agencies, particularly for advanced vehicle automation. For example, the NHTSA recently adopted a new taxonomy of automation for automated vehicles developed by the Society of Automotive Engineers (SAE, 2014; see Table 1). This taxonomy has a focus on the system design issue of who (the driver or automation) is responsible for what vehicle functions and lists LOAs from manual driving to driver assistance with one vehicle control task, partial automation of multiple control tasks, automation permitting human intervention, and automation of all vehicle control tasks under all manageable conditions. The SAE has emphasized that the object of automation in this taxonomy is driving tasks and not vehicle subsystems. From this perspective, a tight taxonomy of LOAs was produced with one level being closely related to the next in terms of variations on function allocations among the driver and assistance systems. It is also important to note that the SAE identified the set of driving tasks to be automated based on capabilities of current vehicle technology; that is, the taxonomy only covers operational and tactical task performance (occurring in real time) versus strategic task performance, such as driver goal formulation, which extends beyond automation functions. Driving modes represent different levels of demand within the driving tasks (e.g., expressway merging, high speed cruising, etc.) that may or may not be manageable by identified assistance systems. The list is intended to provide a basis for classifying manufacturer technologies and future system design efforts and supporting highway regulations. 
It is also considered a key concept for increasing LOAs of vehicles. Such observations on the use and potential influence of automation look-up tables are not meant to imply validity but rather to indicate that adoption as a basis for design is expanding. One question is whether these are the right models to support design given our technological future.
Table 1. Levels of Driving Automation for On-Road Vehicles
Note. Represents automation of dynamic driving tasks but not strategic tasks (e.g., goal formulation), as addressed by the human driver.
In general, models of HAI need to accurately account for actual human behavior with automation as well as describe real system outcomes. Such accounting is critical to support system requirements specification beyond the conceptual design phase and for engineering practice. Related to these needs, it is possible that current taxonomies have made assumptions about human information processing that may not be completely accurate relative to human use of automation, and consequently, expectations for specific LOA outcomes may be inaccurate. However, this is not to say that in practice (or in specific systems) engineers have not been able to associate certain HAIs with specific LOAs.
As an organizing framework for the paper, I will initially review origins and advances of taxonomies of LOA as well as conceptual/theoretical criticisms of the concept. This review is not intended to be comprehensive; select coverage is provided to make specific points, and the review represents my interpretations of the literature. This will be followed by a brief additional review of empirical assessments of LOA taxonomies as a basis for objective identification of potential conceptual issues and the need for future research. I then settle on two specific concerns and provide some empirical evidence. This is followed by identification of a potential approach to further refinement of the LOA concept as well as identification of existing HAI modeling research that provides some insights to support the approach. In general, I advocate for further development of the LOA approach to enhance the predictive utility of existing frameworks, rather than abandoning a handy approach that supports conceptual system design without applying similar scrutiny to alternative approaches that address a similarly broad range of automation design problems.
Review of LOA Origins and Conceptual Issues
The research challenge of defining types and levels of automation in human-in-the-loop systems has been considered very seriously for quite some time. One early example can be found in Fitts’s (1951) MABA-MABA (“Men are better at; Machines are better at”) lists. Fitts’s intent was to provide a basis for making appropriate allocation of system functions to a human or machine for effective coordinated system performance. Fitts’s work was followed by several seminal studies on impacts of supervisory control and adaptive function allocation scenarios on human performance (e.g., Jordan, 1963; Rouse, 1988) with the objective of developing an empirical basis for the utility of human-machine coordination in various contexts. This research was interspersed with other work on how HAI could be specified for complex systems for which fully autonomous capability was not possible. One of Sheridan and Verplank’s (1978) objectives was to “develop taxonomic . . . models of man-machine interaction” (i.e., achievable LOAs) in the context of teleoperations to provide a greater engineering basis. They described different ways in which a decision could be made and implemented by the coordinated actions of a human operator and computer, constituting different LOAs. This work served as the foundation for several subsequent taxonomies of LOAs (Endsley, 1987; Endsley & Kaber, 1999; Endsley & Kiris, 1995; Kaber, 1996; Kaber & Endsley, 1997, 2004; Parasuraman et al., 2000), including reformulations of levels based on broader types of information processing automation. For example, Mica Endsley and I considered “monitoring” (system states), “generating” strategy options, “selecting” an “optimal” strategy, and “implementing” the chosen strategy to represent general and pervasive stages of information processing in human-automated systems (HAS). 
We systematically formulated assignments of these functions to human and machine (based on capacity estimation) to identify various LOAs with greater frequency of substitution and functional complexity of automation, indicating “higher” LOAs. Subsequently, Parasuraman et al. (2000) identified types and levels of automation based on stages of “pipeline” models of human information processing (e.g., Wickens, 1992, p. 17), including the functions of information acquisition, information analysis, decision making, and action implementation, as well as the assumptions of cognitive behavior implicit to those models. Their LOAs derived from an earlier taxonomy of LOAs by Wickens, Mavor, Parasuraman, and McGee (1998) as well as Sheridan and Verplank’s (1978) taxonomy. In addition, they presented a systematic process by which to select and implement LOAs while accounting for human performance outcomes, automation reliability, and risks of automating. Two notions emphasized in this work included (a) the importance of considering how a function or behavior of an operator is transformed (not supplanted) by automation such that operators in turn need to adapt to new technology and (b) the idea that LOAs identified or selected in conceptual and detailed design phases may change during system operations in response to unanticipated events. For example, during a process control operation, in the event of sensor detection of overpressurization of a flow line, computer control of a feed pump (to the line) may automatically switch to operator control to allow for pressure reduction along with productive flow maintenance. Parasuraman et al.’s iterative LOA (re)specification process provides one way of accounting for the need for operator adaptation or accommodation in such novel situations.
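The process control example above can be sketched in a few lines of code. This is only an illustration of an LOA changing during operations, not a description of any system in the cited work: the class name, the pressure threshold, and the choice to leave control with the operator once it has been handed over are all assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass
class FeedPumpController:
    """Hypothetical feed pump controller with two control modes."""

    pressure_limit_kpa: float  # assumed overpressure threshold, not from the source
    mode: str = "computer"     # the pump starts under computer control

    def update(self, line_pressure_kpa: float) -> str:
        """Process one sensor reading and return the resulting control mode."""
        # Hand control to the operator when overpressurization is detected.
        if self.mode == "computer" and line_pressure_kpa > self.pressure_limit_kpa:
            self.mode = "operator"
        # Assumption: control stays with the operator afterwards; a return to
        # computer control would need its own explicit rule.
        return self.mode


controller = FeedPumpController(pressure_limit_kpa=500.0)
modes = [controller.update(p) for p in (420.0, 480.0, 515.0, 470.0)]
print(modes)
```

In this sketch the third reading exceeds the assumed limit, so the mode changes from computer to operator control and remains there for the fourth reading even though the pressure has dropped.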
Some LOAs Should Not Be Used
However, several contemporary HAI researchers (e.g., Dekker & Woods, 2002; Kirlik, 1993) have expressed concerns with taxonomic or discrete approaches to defining levels or modes of automation for human use. Early on, Kirlik (1993) offered that under certain multitasking conditions, operators of complex aviation systems may rationally avoid using particular types or levels of automation. He identified the burden on the pilot in interacting with and monitoring advanced automated aids at specific LOAs; that is to say that LOA taxonomies also need to account for operational issues. However, this notion is akin to the concern identified by Parasuraman et al. (2000) that automation should be designed with sufficient flexibility to allow the human to adapt in near real time.
LOAs Don’t Address the “Right” Design Question
Less specific than Kirlik’s (1993) concern, Dekker and Woods (2002) contended that the objective of determining “who” (human vs. machine) does “what” (function) in complex systems control does not serve to advance HAI design and that the most critical design need is instead to facilitate human and automation coordination (i.e., “How do we make them [the human and machine] get along?”). Like Parasuraman et al. (2000), Dekker and Woods reiterated that automation necessarily changes an operator’s role. The solutions that Dekker and Woods offered were focused on methods for interface information presentation, with reference to Christoffersen and Woods’s (2002) work (events, patterns, prediction). Christoffersen and Woods emphasized the need for automation activities to be observable by humans, exploit human sensory capabilities, and allow for ease of human direction of automation functions. Ironically, such methods of HAI must ultimately be objectively evaluated by engineers to confirm that they are operationally effective, likely leading to additional recommendations on interface design and/or function allocation, for example, due to technological limitations of automation. Whereas Woods and colleagues have argued that it is most important to work on facilitating coordination between people and automation, in practice, engineers cannot escape the initial need to define and evaluate function allocations across people and machines in the systems design process. Thus, in my mind, the recommendations of Dekker and Woods actually bring us full circle to Fitts’s (1951) question of “who does what,” with the understanding that any function allocation decisions may be “fine-tuned” based on the nature of the HAI.
LOAs Are Too Coarse to Be Useful
Where Dekker and Woods (2002) took issue with what research questions we should be asking in HAI research, Pritchett, Kim, and Feigh (2014) raised concern with the nature of function allocation schemes and whether taxonomies of LOAs are actually useful for classifying real systems and providing a framework of sufficient resolution (among other concerns). Feigh and Pritchett (2014) said that particular complex system function allocations do not always establish a clear-cut division of work between the human and machine. They contended current machine and technological capabilities cannot always be neatly classified into existing taxonomies of automation. Therefore, the LOAs identified in the literature may have limited utility for analysis or as a basis for design, though I contend that the breadth of definition of LOAs could be manipulated to address this issue. Related to this, Pritchett et al. observed that some existing LOA definitions are coarse and do not capture different ways in which automation can be implemented/presented within the same LOA and consequently, potential variations in performance. If variations in automation implementation are limited to interface design, look-up tables can be expanded (in sublevels, etc.) to present a finer-grain classification of systems within a particular LOA that uniquely differ in the ways of automation presentation. However, if variations in implementation are not limited to interface design but are functional in nature, then such variations constitute deviations from the LOA specification; namely, differences in function allocations actually represent different LOAs and not simply changes in automation presentation within a LOA. Such a situation would undermine the utility of any look-up table from a design perspective.
LOAs Do Not Identify Responsibility for System Outcomes
Beyond these issues, Pritchett et al. (2014) also identified practical concerns regarding implementation of taxonomies of LOA in design, albeit with reference to select models from the literature (not including the one that Mica Endsley and I developed). They contended that prior definitions of LOAs have not addressed human and automation responsibility for system outcomes and that such considerations need to be made in terms of defining function allocations and for “team play,” particularly when a human operator remains responsible for the automation outcomes. However, the need for measurement of system performance outcomes as well as implications of use of automation on risk and reliability were previously identified by Parasuraman et al. (2000) as critical bases for making dynamic LOA selections beyond the conceptual and detailed system design phases. Parasuraman et al.’s model laid out a feedback process to account for performance outcomes in answering the function allocation question. Thus, some processes have been developed to reflect human performance and automation reliability in LOA selection.
Other Conceptual Criticisms of the LOA Approach
Beyond the aforementioned concerns, in several recent papers, Bradshaw, Johnson, and colleagues (Bradshaw, Hoffman, Woods, & Johnson, 2013; Johnson et al., 2011) have criticized the LOAs concept, most dramatically identifying “seven deadly myths of autonomous systems.” Some of the criticisms/myths raised by Bradshaw et al. (2013) also appeared in the report of the Defense Science Board on The Role of Autonomy in DoD Systems (DSB; Department of Defense Defense Science Board, 2012). Among other broad sweeping criticisms, Bradshaw et al. contended that the LOA concept is not scientifically grounded and that it is not useful for development of “autonomous” systems. A few specific dimensions of their “myths” were that the existing taxonomies of LOAs assume functional substitutions of humans and automation to be equivalent in terms of system operations and that substitution of automation (for the human) does not otherwise affect human behavior. (This “substitution myth” also appears in the DSB report; Department of Defense Defense Science Board, 2012.) Related to this, Johnson et al. (2011) lumped the Parasuraman et al. (2000) work together with other LOA studies, which they suggest have overlooked functional differences among agents in defining LOAs. In fact, similar concern actually motivated the prior Parasuraman et al. discussion of the need to iterate on the selection of LOAs throughout a system life cycle. Furthermore, if assumptions like homogeneity of agents or a lack of influence of automation substitution on human behavior had been part of the prior LOA research, then there would be no motivation for the empirical assessments of LOA effects on human performance (e.g., Endsley & Kaber, 1999).
Although there are several aspects of Bradshaw et al.’s (2013) myths, there is a fundamental issue underlying all the “myths”: The LOA concept refers to function-specific automation, whereas Bradshaw et al. address “levels of autonomy.” Autonomy and automation are completely different animals in my mind. As Bradshaw et al. note, autonomy implies agent viability, independence, and self-governance. Among these characteristics, only viability within a specific environment is relevant to function-specific automation. Autonomy of mechanized and computerized systems is a lofty and attractive goal for many types of systems but is often technologically or economically infeasible. For these very reasons, Sheridan and Verplank (1978) defined LOAs to best exploit computers for human aiding: They weren’t referring to “autonomous” systems! In general, Bradshaw and colleagues (2013; Johnson et al., 2011) reviewed a corpus of research focused on one construct (i.e., automation as a technology) yet made criticism from the perspective of another (i.e., autonomy as a state of being). I think this problem is fundamental to the majority of criticisms raised by Bradshaw et al. and renders many of the identified myths essentially invalid.
Synopsis of Inferences on LOA Origins and Conceptual Issues
Having said all this, any design references that we develop should effectively support designers and engineers. Referring back to Kirlik’s (1993) research, I agree that some LOAs may not be effective in terms of function allocations among humans and agents given particular operating contexts or task demands (thus, the motivation for taxonomies of LOAs to address various types of systems). Related to this, like Johnson et al. (2011), I think that system designers need to make detailed consideration of task context as well as how humans and machines may interact in parallel in complex system operations to describe and model approaches to teamwork for promoting effective interface design. Although addressing such complexity in cognitive and automated work systems analysis and design may be necessary, I don’t see prioritizing feasible function allocations (in advance of teamwork analysis) as a form of “reductive thinking.” Further, it is not clear how an engineer can begin to identify the “things the human depends on the computer for” or the “things the computer depends on the human for” until he or she knows what things the human and computer (each) may be doing (see Figure 3 in Johnson et al., 2011). Therefore, I believe the LOA approach actually addresses the primary question in automated systems design (“who” is doing “what”), and we can build on this approach for addressing the other questions of “how do we make the human and machine get along” (e.g., Dekker & Woods, 2002). Similarly, the new “autonomous systems reference framework” proposed by the DSB (Department of Defense Defense Science Board, 2012) also begins with a focus on “design decisions on explicit allocation of [system] functions and responsibilities between human and computer to achieve specific capabilities” (p. 4). The approach then advocates for coupling decisions involving high-level automation tradeoffs with development of feedback methods to make automation and system states highly visible. 
Ironically, despite the dramatic negative position of this work with respect to the LOA approach, the novelty of the new framework appears to be mainly in contextual considerations, such as mission phase and echelon. Finally, it is important to acknowledge here that Pritchett et al.’s (2014) argument regarding the LOAs concept being “too coarse” is likely relevant for many types of existing systems and could be frustrating for designers in application. On this basis, I summarize that the LOAs concept needs additional attention to provide detailed guidance for designers and engineers.
Brief Review of Empirical Findings on LOAs
It is also important to cover empirical analyses of the LOA approach to identify specific, objective research issues. Many individual experiments have quantified operational impacts of LOAs on human performance, workload, and situation awareness (e.g., Endsley & Kaber, 1999; Endsley & Kiris, 1995; Kaber, Onal, & Endsley, 2000; Manzey, Reichenbach, & Onnasch, 2012; Rovira, McGarry, & Parasuraman, 2007; Wright & Kaber, 2005). Most of these studies have been conducted in lab settings using multitasking simulations to develop descriptive accounts of human and system responses to LOAs to detail frameworks with outcomes that engineers can expect when designing for a specific LOA. In addition to the individual experiments, several meta-analyses have been conducted on empirical results of the LOA approach (Kaber et al., 2009; Onnasch, Wickens, Li, & Manzey, 2014; Wickens, Li, Santamaria, Sebok, & Sarter, 2010) with the objective of identifying common trends in performance, workload, and situation awareness across levels or specific performance phenomena when switching among LOAs to better support systems design.
Starting with the individual experiments, situation awareness, performance, and operator workload outcomes have not always been as expected. For example, in an early study (Endsley & Kaber, 1999), we anticipated a broad range of intermediate LOAs, blending human and machine authority in different ways, to lead to greater operator system awareness (due to task involvement) and performance through use of automation. Results revealed that only those intermediate levels closest to manual control and imposing higher operator workload produced expected gains in performance but without corresponding increases in operator situation awareness. In contrast, upper-intermediate modes of automation, including supervisory control, led to gains in situation awareness but without anticipated performance increases. (Here, I use general references to “low,” “intermediate,” and “upper” level automation to convey the extent of functional distribution of automation as well as complexity. As Johnson et al. [2011] have suggested of other similar works on taxonomies of LOAs, I make no assumption of consistent intervals of increase in the distribution or complexity of automation from one level to another.) In some cases (Kaber et al., 2000), we also found situation awareness to be lower under certain levels of intermediate (decision support) automation than manual control or full automation. A consistent effect has been declines in workload with increasing human-system automation. Here, it seems important to state that this empirical work was aimed at further characterizing potential effects of specific function allocation schemes for designers. That is, unlike Dekker and Woods’s (2002) contention that one might select a human-machine coordination scheme and “abracadabra,” a set of expected outcomes will ensue, we sought to identify exactly what might be the outcomes an engineer could expect from possible human-machine combinations.
With respect to the meta-analyses on empirical LOA research, Wickens et al. (2010) and then Onnasch et al. (2014) thoroughly integrated experimental results on types and levels of automation, as documented in the literature. Despite some of the unexpected results I identified previously, they observed trends of increased performance and reduced workload with increasing automation across 18 different studies. They also found degraded situation awareness occurring with more automation across studies. Their integration of experimental data and analysis also led to the “lumberjack” concept, that is, the intuition that greater human dependence on automation leads to greater performance problems on return to manual control under automation failures. This concept (and the supporting evidence) represents one of the few descriptive (evidence-based) “models” of LOA effects on human performance appearing in the current literature. Some might describe the concept as an empirical regularity supporting predictions of what might happen with specific LOAs. However, I don’t see this as being any different from a model for system design, as all models are abstractions of planned or observed system behaviors, which are intended for extension to novel operating conditions. Related to this, in another earlier integrated analysis of LOA effects on human performance, as a basis for recommendations on HAI in the life science domain (Kaber et al., 2009), we made similar observations that LOAs could be used to reduce operator workload during normal system operations but that operator involvement was critical to promoting error handling during periods of high demand.
Some Factors Contributing to Variability in LOA Results Across Studies
In general, the results of the individual LOA experiments on operator performance and situation awareness and a lack of clear response trends can be frustrating for designers. The results also suggest that some of our intuitions about automation and human performance are either imprecise or incomplete. One might say that a lack of consistency in these results could be attributed to experiment design issues, such as sample size and sensitivity, stimulus frequency, exposure durations, and feedback mechanisms. However, this explanation seems less likely given the plethora of empirical HAI studies that have been published since Fitts’s (1951) lists with very well structured experiment designs (e.g., Endsley & Kiris, 1995). Second, there is the possibility that, as Feigh and Pritchett (2014) note, many issues associated with LOAs or function allocation schemes may only be observed through the dynamics of simulation or actual operations. They suggested that such situations are most likely to occur when human and automation interactions are tightly coupled, as was the case in our prior empirical work. Feigh and Pritchett (2014) commented that any particular function allocation may appear to work well in a context with one temporal dynamic and yet be inappropriate in another context. Related to this, Wickens et al.’s (2010) bottom line to their meta-analysis was that expectations of automation effects on human information processing are often modified by many contextual factors.
Humans May Not Behave as Researchers and Systems Designers Expect
Another more likely explanation for deviations of automation-induced human performance outcomes from expectation may be that existing models of types and levels of automation (e.g., Kaber & Endsley, 1997; Parasuraman et al., 2000) are presumptive in their concepts of how human operators of complex systems actually behave. (Here, I use the term presumptive vs. normative as designer/engineer expectations of user behavior with automation are not necessarily equivalent to how users “should” behave relative to some defined criterion.) More specifically, models of human information processing have been referenced as bases for identifying system functions that can be performed by a human or computer (i.e., LOAs), and these models make assumptions about nominal human cognitive processing, including attention, perception, memory use, planning, and decision making. However, such models and consequently researcher expectations may not sufficiently account for human interaction with advanced automated or “unmanned” systems, which represent recent technological advances in sensing and processing capabilities. Such systems were likely not considered as interactive technologies in the formulation of earlier models of human information processing in human-in-the-loop systems. Related to this, Pritchett et al. (2014) observed that efforts of HAI researchers to establish system-expected outcomes for various LOAs may be compromised by the fact that humans will not always do what engineers expect them to do (nor is all behavior normative). Consequently, any link between rigid engineering definitions of function allocation and system performance may be mediated by operator behavior. This particular issue seems to be a major concern with respect to the validity of existing and any future models of automation in human-in-the-loop systems for use in design.
As an example, in Endsley and Kaber’s (1999) definition and modeling of the various LOAs in their taxonomy as part of the Multitask simulation, the capabilities of the automation functions were based on how human operators would be expected to perform the same functions (i.e., monitoring, generating, selecting, and implementing). The 10 levels of the taxonomy represent a “feasible” subset of 81 possible combinations of three forms of human-computer allocation to four general system functions. Assumption of greater human information processing capabilities (or equivalence of human and automated agent capabilities) might have led to a broader feasible subset of LOAs. (Even the early taxonomy of LOAs by Sheridan and Verplank [1978] made assumptions about “decision sub-elements” that could be addressed by a human and/or computer; see pp. 8–15, Table 8-1.) As another example, although Parasuraman et al. (2000) presented an approach by which to account for specific human performance issues (e.g., complacency, vigilance decrements, loss of situation awareness, etc.) in selecting types and levels of automation, from a design perspective, their identification of “what should be automated and at what level” (e.g., information acquisition, information analysis, etc.) made implicit and untested assumptions about human capabilities and how people behave when provided with automation aids.
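The figure of 81 possible combinations follows from assigning one of three allocation forms to each of four functions (3 to the power of 4). The short sketch below enumerates that space. The labels used for the three allocation forms are assumptions for illustration, and the sketch does not attempt to reproduce the feasibility criteria that reduced the space to the 10 levels of the taxonomy.

```python
from itertools import product

# The four general system functions named in the text.
FUNCTIONS = ("monitoring", "generating", "selecting", "implementing")

# Assumed labels for the three forms of human-computer allocation.
ALLOCATIONS = ("human", "computer", "human+computer")


def enumerate_allocations():
    """Return every assignment of one allocation form to each function."""
    return [
        dict(zip(FUNCTIONS, combo))
        for combo in product(ALLOCATIONS, repeat=len(FUNCTIONS))
    ]


all_schemes = enumerate_allocations()
print(len(all_schemes))  # 3 ** 4 combinations

# Two endpoints of the space: all functions human, all functions computer.
manual = {f: "human" for f in FUNCTIONS}
full = {f: "computer" for f in FUNCTIONS}
print(manual in all_schemes, full in all_schemes)
```

Any taxonomy of this kind is a chosen subset of such an enumeration, which is why the assumptions used to judge feasibility matter for the levels that result.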
The possible roles that can be prescribed to either the human or automation in complex system operations are dependent on underlying expectations of capability and performance. In addition, the Parasuraman et al. (2000) prescription of, for example, what support automation provides for each stage of information processing is also based on nominal notions of human behavior in the same role. These implicit assumptions, or the underlying foundations of LOA, in regard to human information processing and associated behaviors under different LOAs may not be accurate (or sufficiently justified), particularly if model predictions are not in line with empirical data (Wright & Kaber, 2005).
Definitions of Human Behaviors/Performance Consequences Are Imprecise
Beyond the presumptive modeling problem, the precision of existing definitions of human performance constructs used to assess LOA models is also critical. In their research, Dekker and Woods (2002) considered many of the human behavioral issues previously referenced in formulating LOAs or used as bases for quantitatively assessing automation technology and interface design (e.g., complacency, situation awareness, etc.) to be meaningless constructs or notions based on “folk” models of human performance. Parasuraman, Sheridan, and Wickens (2008) very strongly argued against these contentions by offering evidence of situation awareness, mental workload, and trust as being empirically verified cognitive constructs and providing useful bases for systems design and engineering. Although I agree that there is a scientific basis for complacency in human performance (e.g., Parasuraman & Manzey, 2010; Parasuraman, Molloy, & Singh, 1993), situation awareness (Endsley, 2015; Endsley & Kaber, 1999; Parasuraman et al., 2008), and trust (Lee & See, 2004), a point to take away from Dekker and Woods (2002) is the need for additional precision in defining such constructs of behavior and performance for automated systems analysis. From an engineering perspective, greater construct validity means more accurate assessments of human performance implications of automation, greater sensitivity for identifying common response trends among LOA implementations, and in turn, more accurate system models and design methods. That is to say, the observations of deviations of human responses to LOAs from expectation may also be attributable to limitations in current operational definitions of these responses.
Summary and Implications for Additional Research
As noted with respect to the burgeoning automated automobile industry, new taxonomies of LOAs have been formulated (e.g., SAE, 2014), and it is unlikely the concept of LOAs is going away any time soon. At this stage, I don’t think the main issue with LOA research is whether we are asking (or attempting to address) the right questions (i.e., what should be automated, to what extent should automation be applied to complex system functions, when should automation be applied, how should automation interact with humans); I think we have identified the right set of questions in one way or another. What seems important is how we prioritize outstanding modeling issues and advance systems design.
As in other related fields of research (e.g., human-computer interaction), it is likely that advances to valid and predictive models of HAI will come through incrementally building on prior research (cf. Norman & Verganti, 2013). If, instead, we want to “start over” in HAI modeling, then a return to other models of human information processing would likely be necessary, with consideration of how the nature of human and automation communication defines capability for performance of functions. I think the practical foundation for taxonomies of LOAs (characterizing human and automation capabilities) is clear; however, some may argue with the specific definitions of LOAs and a lack of consistent evidence in outcomes. Regarding definitions of LOAs, of course, any taxonomy is a social construct, just like many other contemporary human factors theories. The important point is whether the concept/theory serves to further explain or account for system variability (as Parasuraman et al., 2008, contended), and the existing meta-analyses on the LOA approach demonstrate this. Thus, I think incremental advances in the LOA approach may be worthwhile for supporting the looming broader HAS design practice.
Based on the previous review, there is a general sense that some researchers are not satisfied with the concept of LOAs and that there is a need to think deeper about how to ensure validity and utility of taxonomies of LOAs as bases for systems design. Fundamentally, the main concern is the accuracy of models providing predictions of specific human behaviors and performance with automation. Issues here include (a) developing further understanding of exactly what humans do when they use automation and (b) coming to additional precision in definition and measurement of behavior. In the remainder of the paper, I address these issues. First, I offer some evidence of the contention that LOAs may not completely articulate what humans will actually do with automation. I then address the need for greater precision in identifying behavioral outcomes of HAI design, which promises some advance in LOA frameworks as bases for systems design.
Need for Descriptive Models of HAI
Some contemporary research studies on HAI have presented methodologies and results that may point the way forward on how to make LOA taxonomies more descriptive in nature. In general, these studies include applications of general human performance modeling techniques for description or assessment of specific automation configurations. They have made use of various tools, simulations, and data sets for this purpose. Bolton and Bass (2011) used an enhanced operator function model language to represent human behavior in interaction with a patient-controlled analgesia pump as well as error behaviors, as predicted by Reason’s (1990) taxonomy of errors. That is, the authors developed models of HAI and performance predictions by considering a well-accepted theory of how human errors arise. The formal error models were based on predictions of how people make slips and mistakes in medical pump use. Such error state modeling can be considered as a descriptive approach to investigating HAI. This work represents an important contribution in advancing formal models of human behavior for characterizing the impact of automation use on errors. The work also provides a basis for developing additional formalisms based on actual observed slips and mistakes in technology use and assessing the utility of such models for predicting performance.
As another example of contemporary descriptive HAI modeling approaches, Degani and Heyman (2002) explored the utility of formal models for assessing relationships between user models of automation, automated system interfaces, and actual system capabilities. They used finite-state representations of expectations of user behaviors under various conditions. The research provided a quantitative approach for determining the degree of correspondence between machine states and user internal models of states as a basis for identifying adequacy of such models and system interfaces to support effective HAI.
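The idea of checking correspondence between machine states and a user's internal model can be illustrated with a small finite-state sketch. This is a simplified illustration and not Degani and Heyman's (2002) procedure: it assumes the user model is an abstraction that maps each machine state to a state the user can distinguish, and it flags any user-visible state and event whose outcome the user cannot predict. The machine, events, and state names are hypothetical.

```python
from collections import defaultdict


def find_ambiguities(machine_transitions, abstraction):
    """Find (user_state, event) pairs whose outcome the user cannot predict.

    machine_transitions: dict mapping (machine_state, event) -> next state
    abstraction: dict mapping each machine state -> user-distinguishable state
    """
    outcomes = defaultdict(set)
    for (state, event), nxt in machine_transitions.items():
        outcomes[(abstraction[state], event)].add(abstraction[nxt])
    # More than one possible outcome means the user model is inadequate here.
    return {key: sorted(vals) for key, vals in outcomes.items() if len(vals) > 1}


# Hypothetical machine: two internal "hold" modes look identical to the user.
machine = {
    ("hold_armed", "press"): "climb",
    ("hold_idle", "press"): "hold_idle",
    ("climb", "press"): "hold_idle",
}
user_view = {"hold_armed": "hold", "hold_idle": "hold", "climb": "climb"}

ambiguous = find_ambiguities(machine, user_view)
print(ambiguous)  # {('hold', 'press'): ['climb', 'hold']}
```

In this toy case, pressing the button from what the user sees as "hold" can lead to two different visible states, which is the kind of mismatch an interface would need to resolve.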
Gao and Lee (2006) developed a quantitative model of reliance on automation to predict user preference for manual and automated control in supervisory control situations based on projections of trust. The mathematical model was defined based on subjective expected utility theory and the subjective utility of operator action in interacting with automation. A simulation of the modeling approach was validated against empirical findings on operator reliance in actual automation use. Furthermore, the authors demonstrated how types of automation mediate operator behavior (in this case, measured in terms of trust).
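To make the notion of a threshold-based reliance decision concrete, the following is a minimal sketch and not the published Gao and Lee (2006) equations. It assumes a preference signal equal to trust minus operator self-confidence (the self-confidence term, the 0 to 1 scales, and the threshold value are assumptions made here for illustration) and switches control mode only when that signal crosses a threshold.

```python
def update_mode(current_mode, trust, self_confidence, threshold=0.1):
    """Return 'automatic' or 'manual' given a simple preference signal.

    preference = trust - self_confidence is an illustrative stand-in, not the
    published model. The mode changes only when the preference crosses the
    threshold in the direction away from the current mode.
    """
    preference = trust - self_confidence
    if current_mode == "manual" and preference > threshold:
        return "automatic"
    if current_mode == "automatic" and preference < -threshold:
        return "manual"
    return current_mode


mode = "manual"
history = []
for trust in (0.5, 0.65, 0.55, 0.3):  # hypothetical trust trajectory
    mode = update_mode(mode, trust, self_confidence=0.5)
    history.append(mode)
print(history)  # ['manual', 'automatic', 'automatic', 'manual']
```

The band between the two thresholds keeps the simulated operator from switching modes on small fluctuations in trust, which is one simple way to represent inertia in reliance.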
The previous brief survey illustrates that there are some formal HAI modeling approaches that have demonstrated potential to provide descriptive representations of human behavior. However, most are case studies and applications that have been limited to simulations or lab-based experiments. Beyond this, there are some aspects of the modeling approaches that are constraining in terms of types of behaviors and automated system operations that can be represented. However, some aspects of these approaches could be considered bases for extending taxonomies of LOAs to better account for actual human behaviors in automation use. For example, the formal modeling approach developed by Bolton and Bass (2011) could be used as a tool for verification of the implications of specific LOAs on human performance in a range of applications and task conditions, thus supporting the accuracy and precision of automation look-up tables. Although Degani and Heyman (2002) did not use an actual HAI data set for demonstrating their modeling approach, the method has potential to characterize actual user behaviors with types of automation given the limitation of discrete machine states and user behaviors. Aspects of the Gao and Lee (2006) modeling approach could be adopted for characterizing the influence of a broader range of human behaviors (e.g., complacency, satisficing, etc.) on performance with different types of automation and projections of the utility of specific LOAs relative to specific behaviors.
Need for Precision in Identification and Assessment of Behavior With Automation
Further, it is important to address the issue of what humans actually do when using automation. A key behavioral issue that appears to have received little consideration in existing models of HAI is the potential for satisficing (Simon, 1955) when using automation to achieve task goals. This issue may sound similar to addressing complacency in HAI, and thus, it is important to precisely define and differentiate these behaviors. Complacency is considered as a lack of suspicion of states of a system in the presence of limited awareness of modes of operation; that is, the system operator is satisfied with performance but may lack awareness of other safer or more efficient methods of operation. However, in most advanced automated, human-in-the-loop systems, we entrust management to highly skilled and often highly experienced operators. As we know, experts are not immune to complacency (Parasuraman & Manzey, 2010), but such behavior (by definition) should be more common among novices. Related to this, exposure to “failure free” (or highly reliable) automation can lead to lower expectancy for failure even among experts but more likely for novices. On the other hand, satisficing is the behavior of accepting a most accessible or readily identifiable operational solution that meets some minimum level of performance aspiration and can be common in work that has become routine for experts. It can represent an aversion to effort with knowledge of risks or potential losses for the purposes of efficiency in operation. Satisficing has also been identified as the “good enough” cognitive heuristic that can mediate how operators use automation (see following detail on Kaber et al., 2013). Simon (1955) contended that rational decision-making processes posed higher information processing requirements and thus humans rely on use of the heuristic to reduce cognitive load and speed up performance.
That is, satisficing is a behavioral tendency motivated by desires for workload reduction and efficiency in operations, but it is not unique to HAI (and the same goes for complacency). However, in the use of automated systems, those types and levels of automation posing additional information processing requirements on operators may lead to workload and operator switching to other more direct (but less optimal) modes of control to achieve desired systems states. In general, it is possible that satisficing methods may work under many operational circumstances without system losses; however, such behavior will not work in all cases, particularly high-demand, off-nominal conditions. Therefore, use of cognitive heuristics should be considered as leading to less reliable and potentially risk-prone performance when used in conjunction with various types of automation, and such behavior should be accounted for in characterizing implications of LOAs.
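The distinction between satisficing and exhaustive optimization can be sketched in a few lines. This is an illustrative contrast under stated assumptions: the control modes, their ordering by accessibility, and their performance scores are all hypothetical, and the aspiration level is an arbitrary value chosen for the example.

```python
def satisfice(options, score, aspiration):
    """Return the first option whose score meets the aspiration level.

    Options are assumed to be ordered from most to least accessible, so the
    search stops as soon as something "good enough" is found.
    """
    for option in options:
        if score(option) >= aspiration:
            return option
    return None


def optimize(options, score):
    """Return the highest-scoring option after evaluating all of them."""
    return max(options, key=score)


def performance(mode):
    """Look up the made-up performance score attached to a mode."""
    return mode[1]


# Hypothetical control modes, most accessible first, with invented scores.
modes = [("direct", 0.7), ("guided", 0.8), ("fully_checked", 0.95)]

print(satisfice(modes, performance, aspiration=0.6)[0])  # direct
print(optimize(modes, performance)[0])                   # fully_checked
```

The satisficing search evaluates one option and stops, whereas the optimizing search evaluates all three; the saving in effort is paid for with a lower-performing choice, which matters most when tolerances are tight.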
Evidence of the Satisficing Problem in HAI
An example of satisficing behavior in HAI can be found in a high-fidelity flight simulation experiment that I was involved in a few years ago (Kaber et al., 2013). In this study, we observed reliance of experienced pilots on advanced navigation automation and use of the “good enough” heuristic in vehicle flight path control.
With a sample of 15 active fixed-wing pilots with at least 15 years of line experience (mean of 14,895 hours), we presented a vertical take-off and landing (VTOL) scenario and simulated a failure of the aircraft-based navigation system. All pilots were extensively trained to criteria on manual controls and ground truth displays by an expert Apache pilot. The failure was reflected in both the common navigation display (ND) course line as well as a “highway in the sky” (HITS) tunnel display superimposed on an advanced primary flight display (PFD) under different experiment conditions. The failure caused the course guidance (course line, tunnel) to diverge from the raw (sensor) data on the localizer (LOC) during approach, as shown by a course deviation indicator. The tunnel dramatically drifted from the desired flight path.
In general, pilots accepted deviations of actual aircraft trajectory from the LOC 94.8% of the time when using the ND course line and 75.7% of the time when using the tunnel. These features were most visually accessible and usable for the pilots. In many cases, the LOC was “pegged” to ±3 dots of deviation from the desired trajectory (minimum and maximum extent of reception of simulated ground-based radar data) when we had to terminate test trials. Although the tunnel promoted pilot detection of failures compared to use of the ND course line, pilots focused on the automated guidance features (the course line on the ND and the HITS tunnel) versus the sensor data (LOC), leading to potentially unsafe flight conditions.
All the participants in our study were well aware of and highly trained on forms of aircraft guidance and raw sensor data. In general, it is unlikely that simulator performance was limited by pilot knowledge. The pilots were not unaware of the aircraft technology, nor did they lack suspicion of the state of the vehicle, but rather they appeared satisfied with the most accessible information presented to them (the ND course line or HITS tunnel), and it met their level of aspiration in aircraft control from moment to moment. However, an extra effort to check the LOC, one of the most common features of an advanced PFD in the commercial cockpit, would have led to immediate recognition of gross course deviations from path. The observed pilot behavior, involving a simpler task procedure with the ND course line or HITS tunnel, was indicative of aversion to effort with knowledge of availability of likely more effective and/or safer control actions, namely, satisficing. Although some other explanations of the observed responses were provided by Kaber et al. (2013), in regard to the navigation error detection, the occurrence of satisficing behavior seems highly likely.
Some might contend that “automation bias” (Skitka, Mosier, & Burdick, 1999) was at play in the pilot performance; however, in my view, such bias is not a “root cause” of human performance errors. Compelling automation design features, like the HITS, may exacerbate individual behavioral tendencies for complacency and satisficing, but I don’t think the situation is the other way around; that is, automation design does not cause complacency and satisficing, otherwise they would be behaviors unique to HAI.
Need to Account for Satisficing in HAI
Some research has addressed satisficing behavior in concepts of human interaction with complex automated systems. Hollnagel (1998) identified nonstandard behaviors, such as complacency and satisficing, in defining different cognitive control modes (CCM), including a strategic mode typically applied when interacting with automation, a tactical mode depending on the use of heuristics and rule-based behaviors (involving some satisficing), and an opportunistic mode relying heavily on satisficing behaviors such as applying immediate responses to an environment under demanding conditions. In the tactical CCM, for example, operators may intentionally make use of simpler task/decision heuristics for further optimizing performance along other demands or overall system performance. In general, depending on immediate performance objectives, task demands, and accessible resources (affordances), the CCMs represent selection behaviors with outcomes that may be considered nonoptimal but can be highly efficient and adequate for certain operations. This type of precision in human behavior specification (by Hollnagel, 1998) and outcome identification is needed to support formulation of effective systems design approaches and to support assessment.
Related to Hollnagel’s (1998) research, Feigh (2011) used the CCM approach as a foundation for decision support design for airline operational managers to support different patterns of cognitive behavior under each mode. Among several issues, the decision support strategies were directed at minimizing user effort (cycles in the tactical mode; see Feigh, 2011, Table 1) and providing comparative references on decision alternatives (in the strategic and tactical modes). The intention of the tool was that a decision maker could select different interface modes during system operation as they felt would best support their CCM. The approach was successful, and results revealed superior system performance when the interface mode mirrored the CCM. Having said this, Feigh (2011) assumed the capability to identify and characterize different operator decision behaviors, and the decision support design recommendations for each mode were based on these assumptions, including the occurrence of satisficing.
To effectively cause automation to support “optimal” human performance behavior, there are needs to: identify observable indicators of nonstandard and nonoptimal behaviors, such as satisficing; develop measures of indicators; and identify thresholds of operator effort aversion that may lead to differences in decision-making behavior and use of automation. In general, the prior LOA modeling research has not accounted for satisficing (and related modes of cognitive control for decision making) through the engineering cycle of measuring, modeling, and developing methods for control. There is a need for more detailed and accurate models of human performance when interacting with automation that would account for a range of human behavior. One starting point is to consider, for example, the conceptualizations of tactical and opportunistic behavior in HAI and potential satisficing responses and what they may mean to automation aid design for extension of the LOA approach.
Potential Solutions (Incremental Advances in LOA Models)
Using Empirical Studies of LOAs to Make Taxonomies Descriptive
To advance descriptive models of HAI, simulation studies of human-systems performance can examine specific design configurations and task scenarios. However, such efforts will not necessarily yield generalizable descriptive models for different domains. Results of many empirical studies need to be fed back into models to move from presumptive concepts of human work with automation to create more powerful descriptive characterizations of actual HAI. Addressing this need is, however, dependent on the capability to generalize behavior when developing LOA frameworks. Related to this, some have said that Jordan (1963) made calls for more data on human performance with automation many decades ago, and yet where are we now? I don’t disagree that there have been prior calls for experimental studies of LOAs to identify empirical regularities in use of automation, if possible. To some extent, this call has been answered, but the identification of regularities is ongoing. The findings of the Onnasch et al. (2014) meta-analysis provide one basis for such an effort, including identification of some patterns of actual human performance, workload, and situation awareness subject to LOAs. However, these patterns need to be translated into modifications of definitions of possible function allocation schemes as part of existing look-up tables and so on.
Human behavior can be accounted for in terms of both possible system function allocations as well as association of empirical results on HAI with specific LOAs. The space of possible types and levels of automation might be appropriately constrained based on understanding of satisficing behavior. Furthermore, specific automation implementation outcomes may need to be sorted between nominal and off-nominal operating conditions. In effect, I think the call for research should be for finer-grained taxonomies of alternative LOAs (modifications of the definitions of possible function allocation schemes) to yield descriptive models for human performance predictions.
The existing Parasuraman et al. (2000) model of types and levels of automation or the similar one that Endsley and I formulated a few years earlier (Kaber & Endsley, 1997) could be useful starting points for descriptive models of LOAs wherein consideration can be made of, for example, how human operator minimum levels of aspiration (aversion to effort) and efficiency in fulfilling roles, as identified through function allocation selection/schemes, mediate HAI outcomes. The definitions of information acquisition, information analysis, decision selection, and action implementation put forth by Parasuraman et al. were quite detailed but may not account for exactly how operator task experience, complacency, and satisficing influence delivery of system functions. In general, the convenience of automation can lead to levels of human effort that are less than what is expected by researchers in developing models, and consequently, automated system outcomes may ultimately be less than what is projected.
In any effort to enhance existing taxonomies of types and levels of automation, there is need for precise characterization of satisficing and other relevant human behaviors. Such characterizations should be made based on an empirical data set or mathematical function. The automation reliance equation developed by Gao and Lee (2006) used measures of valence for action as a basis to identify when human preference for automation would change. They also identified a response threshold for indicating preference for manual or automated control. A similar model could be formulated for satisficing behavior as well as a threshold to identify when different forms of human aiding (as suggested by Feigh, 2011) may be needed. Such an equation could then be used to temper expectations for human performance associated with various LOAs and promote the accuracy of model-based predictions. Engineers would not only be able to select LOAs based on (a) general performance, workload, and situation awareness trends across levels but could also consider (b) quantitative projections of the likelihood of satisficing behavior and so on (as an “operational risk” level) to narrow down the range of LOAs that might effectively support specific system performance. Table 2 presents an adaptation of the SAE (2014) driving LOAs taxonomy with representation of these two information components (with speculative content) and the objective of supporting designer decision making in the conceptual design phase. Instead of providing a narrative on function allocations within dynamic driving tasks, the table includes a matrix of roles and agent allocations for a broader range of functions, including navigation and specific types of monitoring. In addition, expected implications of each LOA on driver outcomes are tabled versus only identifying a responsible agent in the event of a functional failure.
This table retains identification of driving modes/conditions for which the system or LOA is feasible but resolves the conditions to “nominal” and “off-nominal” states versus considering specific driving circumstances without explicit identification. Another extension of the SAE concept is the potential for risk (i.e., credible hazard exposure) due to misapplication or misuse of the automation. In general, operational risk is higher under off-nominal modes for which driver satisficing behavior might be more likely due to high demand conditions.
Rough Example of Descriptive Levels of Automation (“Look-Up”) Table Concept With Speculative Automation Outcomes
Note. L = low; M = moderate; H = high.
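One way a designer might work with a descriptive look-up table of this kind is sketched here as a small data structure with a risk filter. Every row, field name, level name, and L/M/H value in the sketch is a placeholder invented for illustration; none is taken from the table above or from the SAE (2014) taxonomy.

```python
from dataclasses import dataclass

LEVELS = {"L": 0, "M": 1, "H": 2}  # ordering of the L/M/H codes


@dataclass(frozen=True)
class LoaEntry:
    name: str
    allocation: dict          # function -> responsible agent
    workload: str             # expected trend: "L", "M", or "H"
    situation_awareness: str  # expected trend: "L", "M", or "H"
    risk_nominal: str         # operational risk, nominal conditions
    risk_off_nominal: str     # operational risk, off-nominal conditions


def feasible(entries, max_risk, off_nominal=False):
    """Return names of entries whose operational risk is at most max_risk."""
    def risk(entry):
        return entry.risk_off_nominal if off_nominal else entry.risk_nominal
    return [e.name for e in entries if LEVELS[risk(e)] <= LEVELS[max_risk]]


# Placeholder rows; every value is invented for illustration only.
TABLE = [
    LoaEntry("assist", {"steering": "human", "monitoring": "human"},
             "M", "H", "L", "M"),
    LoaEntry("partial", {"steering": "system", "monitoring": "human"},
             "L", "M", "L", "H"),
    LoaEntry("conditional", {"steering": "system", "monitoring": "system"},
             "L", "L", "M", "H"),
]

print(feasible(TABLE, "L"))                    # ['assist', 'partial']
print(feasible(TABLE, "M", off_nominal=True))  # ['assist']
```

Keeping nominal and off-nominal risk as separate fields lets the same table answer both questions, so a level that looks acceptable in routine operation can still be excluded when high-demand conditions are considered.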
Using Feedback to Address Satisficing
Parasuraman and Riley (1997) said some time ago that under high level of “authority” automation, elaborate feedback mechanisms must be used to combat operator complacency and address potential losses in operator situation awareness. The intent of such feedback is to ensure operator awareness of automation activities and states and appropriate calibration of monitoring and intervention behaviors. Their recommendation might be useful for existing systems with poorly designed feedback mechanisms, such that novices may operate with limited awareness of all potential modes of control.
Addressing the issue of operator satisficing behavior under similar circumstances may require other forms of engineering controls, including mechanisms alerting operators to consistent patterns of “minimally acceptable” behavior under nominal conditions that would pose safety-critical states under off-nominal operating circumstances. That is, an operator’s minimum level of aspiration in system control could lead to an accident under “tight” performance tolerances (e.g., flight path control under clear day conditions vs. negotiating areas of convective activity). This type of interface design approach is akin to the approach Dekker and Woods (2002) recommended toward making automation more of a team player in complex systems. Still other administrative controls would include more rigorous operator training and internal calibration to maximum achievable states of accuracy versus maximum acceptable states of control error.
Conclusion
Any engineering modeling approach is evolutionary in nature. At this stage, I think further development of the LOA concept is worthwhile to enhance predictive utility. However, I don’t think abandoning an existing concept that is supportive of conceptual design of complex systems makes sense unless we have other equally approachable and conceptually sound alternatives for addressing a broad range of large-scale automation design problems. I also think the notion of “abandoning” a human factors research concept/method implies that at a minimum, those advocating such action should objectively demonstrate how the concept further clouds understanding of variability in human performance versus providing explanation.
Return to What to Automate Versus How to Make Humans and Automation Get Along
The aforementioned general approaches to advancing taxonomies of LOAs represent one potential strategy to positively extend existing research, based on empirical studies, a focus on making LOA taxonomies more descriptive of human roles in complex systems control, as well as refining definitions and measurements of human performance outcomes. However, this is not to say that the other issues of concern for Dekker and Woods (2002; i.e., how humans and automation get along) and Pritchett et al. (2014; who contributes to what system outcomes) and other researchers are not important. Of course, any lack of accounting for the nature of HAI in systems design may lead to human accommodation of aspects of “clumsy” automation. However, I think we need to be systematic in identifying a starting point for advancing theory, and I question how it is actually possible to discuss HAI if we do not have some concept of what the automation and human will do and how they will do it. Only with such information can engineers begin to talk about interface design and interaction protocols (i.e., “team play” will always be a strategy for integration of functional capabilities of “players”). Some may contend that there is a fundamental set of HAI principles that serves as a base for any taxonomy of LOAs. However, one major issue with this contention is change in human and automation capabilities. That is, the “rules of engagement” among autonomous and/or automated agents must change when the sophistication of an agent or the operating context changes. In general, in design and development of any engineered system, mappings of functional capabilities, function flow, and task allocation are critical conceptual bases for system interface design. Once the range of functional distributions for a domain has been identified and tradeoffs referenced from automation look-up tables, designers can proceed to discuss how various LOAs can be presented to optimize system performance.
From my perspective, this is basic human factors engineering, and I believe there are benefits/value to such look-up tables for conceptual and detailed systems design that motivate preservation and extension.
Benefits of Addressing Issues With LOA Taxonomies
A shift in the HAI modeling paradigm of taxonomies of LOAs from presumptive concepts of human behavior with automation to descriptive definitions of LOAs could lead to more accurate and precise predictions of human-automated systems performance. Satisficing is a potentially safety-critical behavior that needs to be clearly defined and, if at all possible, measured and quantitatively modeled and represented for enhancing our capability to predict human performance with complex automated systems. Beyond this, further refining our understanding of specific human performance constructs and measurement approaches should lead to more sensitive assessments of the implications of automation use on human performance and the potential to observe larger trends in behavior due to function allocation versus nuances due to interface design. Addressing these issues should make our understanding of the impact of LOAs on human performance more precise and complete. Ultimately, this line of research should lead to engineering models that can be effectively used as bases for predicting human and system performance, as well as operator workload and situation awareness outcomes, to support automation design and implementation processes.
Acknowledgements
I would like to thank Drs. Emilie Roth and Amy Pritchett as well as the four anonymous reviewers for their time and thoughtful comments on this writing. All input was very constructive and brought to my attention a number of conceptual issues. My work on this “thought piece” was supported, in part, by a grant from the National Aeronautics and Space Administration (NASA; Grant No. NNX16AB23A). Terry Fong was the project monitor. The views and opinions expressed are mine and do not necessarily reflect the views of NASA.
David Kaber is a distinguished professor of industrial and systems engineering at North Carolina State University (NCSU) and an associate faculty member in biomedical engineering and psychology. He is director of research for the Ergonomics Center of North Carolina and a NIOSH-sponsored occupational safety and ergonomics education and research program at NCSU. His current research interests include modeling and analysis of workload in unmanned systems operations, human performance and behavior in autonomous vehicle use, and design principles for automation transparency in human-in-the-loop systems. Kaber received his PhD from Texas Tech University in 1996. He is a fellow of the Institute of Industrial & Systems Engineers and the Human Factors & Ergonomics Society. He is a certified safety professional and certified human factors professional.
