Abstract
As autonomous and semiautonomous systems are developed for automotive, aviation, cyber, robotics and other applications, the ability of human operators to effectively oversee and interact with them when needed poses a significant challenge. An automation conundrum exists in which as more autonomy is added to a system, and its reliability and robustness increase, the lower the situation awareness of human operators and the less likely that they will be able to take over manual control when needed. The human–autonomy systems oversight model integrates several decades of relevant autonomy research on operator situation awareness, out-of-the-loop performance problems, monitoring, and trust, which are all major challenges underlying the automation conundrum. Key design interventions for improving human performance in interacting with autonomous systems are integrated in the model, including human–automation interface features and central automation interaction paradigms comprising levels of automation, adaptive automation, and granularity of control approaches. Recommendations for the design of human–autonomy interfaces are presented and directions for future research discussed.
Introduction
Autonomous capabilities are being developed for a wide range of systems in order to reduce labor, extend human capabilities, and improve human safety. In aviation, autonomous air vehicles are being developed to deliver cargo, surveil remote locations, take over control from pilots if an imminent collision is detected, and act as teammates with manned aircraft. Autonomous features are being added to automobiles to automatically park, maintain tracking within a lane, and control speed in conformance with traffic. Autonomous systems that can act in milliseconds are being matured to respond to cyber attacks. Autonomous robotic vehicles are being developed to deliver materials to remote locations, repair runways, and retrieve fallen soldiers from the battlefield. And higher levels of autonomy are being developed to integrate information across disparate data repositories, to create real-time health monitoring, and for a wide variety of other applications.
Although autonomy promises that systems will be able to perform actions alone, no matter how capable, most autonomous systems must still interact with humans who serve as supervisory controllers, responsible for directing and overseeing their performance, or as teammates who need to collaborate with them. Creating effective autonomous systems is thus dependent on the development of a successful approach to human–autonomy teaming.
Much of the research pertinent to autonomy has been conducted over the past 40 years in the field of human–automation interaction. As with automation, the goal of system autonomy is to achieve tasks with little or no human intervention. Because previous automation has been limited in its ability to achieve this goal, the term autonomy has recently become more prevalent. Autonomy is being designed to achieve functions independently, performing well under significant uncertainties for extended periods of time with limited or nonexistent communication and with the ability to compensate for system failures, all without external intervention (Krogmann, 1999). Whereas previous generations of automation have typically employed logic-based programming, today’s system autonomy efforts are leveraging computational intelligence and learning algorithms to better adapt to unanticipated and changing situations (Krogmann, 1999). In this sense, successful autonomy can be considered to be well-designed and highly capable automation, better able to adapt to a wider variety of conditions (U.S. Air Force, 2015). This concept is in agreement with Hancock (2016), who describes autonomy as a later evolution of automation that has historically been more limited in capability and scope. Although the present discussion will treat the two terms synonymously for the purposes of understanding how people interact with such systems, there are certain technical differences between the two in the underlying software with implications for human use that will be discussed at the end of the paper.
Actually achieving the goal of full system autonomy is quite difficult, and most systems will exist at some level of semi-autonomy for quite some time, which will be discussed in more detail later in terms of levels of automation. For this reason, humans and system autonomy will continue to need to interact, and the development of autonomous systems that can support this teaming should be based on a detailed foundation of research on human–automation interaction.
In this paper I will first discuss the challenges that system autonomy presents for the human supervisory controllers who oversee and direct the autonomy. A model of human–autonomy system oversight is presented that describes the relationship between system autonomy characteristics and human cognitive functions and performance for achieving successful oversight and intervention. Key features of the model are then discussed in more detail, including the human–automation interface and major aspects of the automation paradigm, comprising level of automation (LOA), adaptive automation (AA), and granularity of control (GOC). Finally, guidelines for the design of autonomous and semiautonomous systems are presented, along with needed research to support the move to greater system autonomy.
The Big Challenge: Out-of-the-Loop Loss of Situation Awareness
When autonomous software is introduced, an important role of the human operator becomes one of monitoring and intervening in situations that the software cannot handle. Unfortunately, most automation to date has suffered from brittleness, operating well for the range of situations it is designed to address but needing human intervention to handle situations its programming does not cover (Woods & Cook, 2006), a situation that will continue to exist at least to some degree in the future until fully reliable and robust system autonomy is developed. This automation brittleness creates a challenge for the human operator who may not realize that the automation is acting incorrectly or understand why. Additionally, unexpected automation transitions will occur when the automation suddenly passes control to the human operator who may not be ready to take over.
For example, an aircraft crashed into the river on takeoff from New York’s LaGuardia Airport because both the pilot and first officer were unaware that the auto-throttle was accidentally disarmed, creating insufficient speed for takeoff (National Transportation Safety Board, 1990). In another example, a power grid operator failed to realize system automation had stopped working, leading to a massive power outage in the northeastern United States (U.S.-Canada Power System Outage Task Force, 2004).
Operators of automation also find that the systems are hard to understand once they realize there is a problem. An Airbus A330 crashed into the ocean off the coast of Brazil following the malfunctioning of the cockpit automation due to blocked sensors. The confused pilots never correctly diagnosed the underlying problem, nor did they realize they were in a stall, leading to the loss of life of all aboard (BEA, 2012). These examples show that although automation can work correctly much of the time, when it does fail, the ability to resume manual control is critical, and there may be only fractions of a second available.
This out-of-the-loop (OOTL) performance problem creates a significant challenge for autonomy. People are both slow to detect that a problem with automation exists and slow to arrive at a sufficient understanding of the problem to intervene effectively (Wickens & Kessel, 1979; Young, 1969). Endsley and Kiris (1995) demonstrated that the OOTL problem was due to a fundamental loss of situation awareness (SA) when overseeing automation, a finding that has been confirmed in numerous studies (Kaber, Onal, & Endsley, 2000; Li, Wickens, Sarter, & Sebok, 2014; Manzey, Reichenbach, & Onnasch, 2012; Sethumadhavan, 2009). Low SA when overseeing automation occurs due to three primary mechanisms: (a) changes in information presentation with automation; (b) operator monitoring, vigilance, and trust; and (c) operator engagement (Endsley & Kiris, 1995), each of which will next be described in more detail.
Automation Information Presentation
Many automation interfaces do not provide needed information to the operator on the state of the automation and, often, little feedback on the state of the system it is controlling. These interface-related failures include cases where intentional design decisions are made to remove needed information as well as unintentional deletions. For example, tactile cues were accidentally removed with the advent of fly-by-wire flight controls in the F-16 military aircraft (Kuipers, Kappers, van Holten, van Bergen, & Oosterveld, 1990), information from vibration and smell was lost in the automation of process control operations (Moray, 1986), automation of auto-feathering systems in commercial aircraft failed to tell pilots when it shut down the engines (Billings, 1991), and on a Boeing 747 aircraft, the actions taken by a flight management system to compensate for a lost engine inadvertently masked impending loss of control from the pilots (National Transportation Safety Board, 1986). Key information needed for manual system operation may be missing or degraded as well as information needed to assess the state of the automation and its ability to perform tasks. Many automated systems also have a low level of transparency, which refers to the understandability and predictability of their actions.
Vigilance, Monitoring, and Trust
Automation often requires extensive human monitoring, a skill that people do not excel at as they reduce vigilance over even short periods, a problem that may be exacerbated by complacency or overreliance on automation (Parasuraman & Riley, 1997). Although automation is often introduced to lower human workload, Grubb, Miller, Nelson, Warm, and Dember (1994) showed that the act of passively monitoring automation can actually be a high-workload activity. (See Hancock, 2013, for a more detailed discussion of vigilance and its contributing factors.)
Compounding issues of vigilance and complacency that decrease attention allocation to critical information, the degree to which people monitor automation has been shown to decrease with increased trust in the automation (Hergeth, Lorenz, Vilimek, & Krems, 2016; Muir & Moray, 1996). Trust can be defined as “the attitude that an agent will help achieve an individual’s goals in a situation characterized by uncertainty and vulnerability” (Lee & See, 2004, p. 54).
Trust in automation has been found to be based on (a) system factors, including system validity and reliability, robustness, subjective assessments of system reliability, the recency of a system failure, system understandability and predictability, timeliness, and integrity; (b) individual factors, including perceived ability to perform the task, willingness to trust, and other personal characteristics (such as age, gender, culture, and personality); and (c) situational factors, including time constraints, workload, effort required, and the need to attend to other competing tasks. (For a more detailed review, see Hoff & Bashir, 2015, and Schaefer, Chen, Szalma, & Hancock, 2016.) Although each of these factors may have a role, a meta-analysis showed that system factors (most notably, system reliability and performance) had the greatest overall impact on trust, whereas individual and situational factors had a much lower impact (Hancock et al., 2011).
Although concern has been expressed over the potential for operators to overtrust automation (Lee & See, 2004), Moray and Inagaki (2000) make the compelling case that operators employ an optimal attention allocation strategy in which highly reliable automation does not need to be monitored very often. “Because an operator’s attention is limited, this is an effective coping strategy for dealing with excess demands. The result, however, can be a lack of situation awareness on the state of the automated system and the system parameters it governs” (Endsley, 1996, p. 167). Consistent with this finding, Wickens, Sebok, Li, Sarter, and Gacy (2015) developed and tested a model of automation complacency that showed operators’ monitoring performance to be based on the salience of information, the expectancy of the information source (which decreases as rate of change is lowered and reliability and performance of automation increase), the effort required to monitor the source, and the perceived value of the information source for the task.
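The four monitoring factors named above (salience, expectancy, effort, and value) can be illustrated with a small scoring sketch. This is not the published model of Wickens et al. (2015): it assumes, purely for illustration, that the factors combine additively with equal weights on a 0-to-1 scale and that attention is shared in proportion to the resulting scores. All names and numeric values below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class InformationSource:
    """One display or information source competing for attention (hypothetical type)."""
    name: str
    salience: float    # how conspicuous the source is, 0..1
    expectancy: float  # how much change the operator expects from it, 0..1
    effort: float      # cost of attending to the source, 0..1
    value: float       # perceived importance for the task, 0..1


def attention_score(src: InformationSource) -> float:
    """Assumed additive rule: salience, expectancy, and value attract; effort inhibits."""
    return max(0.0, src.salience + src.expectancy + src.value - src.effort)


def attention_shares(sources):
    """Normalize scores so that the shares across all sources sum to 1."""
    scores = {s.name: attention_score(s) for s in sources}
    total = sum(scores.values())
    if total == 0:
        return {name: 0.0 for name in scores}
    return {name: score / total for name, score in scores.items()}


def automation_display_share(expectancy: float) -> float:
    """Share of attention given to the automation display for a given expectancy."""
    sources = [
        InformationSource("automation_display", salience=0.3,
                          expectancy=expectancy, effort=0.2, value=0.6),
        InformationSource("competing_task", salience=0.5,
                          expectancy=0.7, effort=0.1, value=0.5),
    ]
    return attention_shares(sources)["automation_display"]


if __name__ == "__main__":
    # Lower expectancy stands in for more reliable automation with a slower rate of change.
    for expectancy in (0.6, 0.1):
        share = automation_display_share(expectancy)
        print(f"expectancy={expectancy:.1f} -> share={share:.2f}")
```

Under these assumptions, lowering expectancy alone reduces the share of attention given to the automation display, which matches the direction of the effect described in the text. The size of the reduction depends entirely on the hypothetical weights.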
The presence of competing tasks has also been found to increase trust in automation in a meta-analysis of studies (Wickens & Dixon, 2007), thus shifting operator attention away from the automation and decreasing SA of the functions being automated (Carsten, Lai, Barnard, Jamson, & Merat, 2012; Kaber & Endsley, 2004; Ma, Sheik-Nainar, & Kaber, 2005; Sethumadhavan, 2009). Interestingly, an accurate calibration of human trust in automation may not always be that important on its own (Merritt, Lee, Unnerstall, & Huber, 2015). Hoff and Bashir (2015) state that the degree to which trust affects operators’ actual dependence on automation is mitigated by the degree to which the operator is able to independently assess the system’s performance, the complexity of the automation, the novelty of the situation, the operator’s ability to perform the task manually, and the operator’s decision freedom.
Engagement: The Challenge of Passive Cognition
It is inherently difficult for operators to fully understand what is going on when acting as a passive monitor of automation as compared with when they are actively processing information to perform the same task manually due to lower levels of cognitive engagement. Actively engaging cognitively in a task improves one’s understanding of that task and retention of critical information. This result is similar to the generation effect (Slamecka & Graf, 1978), which shows superior memory for tasks that people are actively engaged in, and the testing effect (Roediger & Karpicke, 2006), which shows superior retention for tested material, as it requires people to actively use the information in a way that goes beyond passive reading or listening.
Endsley and Kiris (1995) were able to rule out complacency and differences in information presentation between automated and nonautomated conditions and still found lower SA to occur due to passive processing. Similar results have been found with experienced air traffic controllers (Endsley, Mogford, Allendoerfer, Snyder, & Stein, 1997; Endsley & Rodgers, 1998; Metzger & Parasuraman, 2001) and in other automated tasks (Manzey et al., 2012). The challenge of lower SA due to passive processing presents a significant hurdle for effective human–autonomy interaction.
The Automation Conundrum
These combined challenges create a fundamental automation conundrum: The more automation is added to a system, and the more reliable and robust that automation is, the less likely that human operators overseeing the automation will be aware of critical information and able to take over manual control when needed. More automation refers to automation use for more functions, longer durations, higher levels of automation, and automation that encompasses longer task sequences.
The automation conundrum potentially creates a fundamental barrier to autonomy in safety-critical systems, such as driving and aviation. As individual functions are automated, and as the reliability of that autonomy increases, less attention will be paid to those functions, SA of the human operator will be lowered, and the likelihood of an OOTL error in the case of unexpected automation transitions will increase.
A Model of Human–Autonomy System Oversight
The relationship between factors creating the automation conundrum is depicted in the human–autonomy system oversight (HASO) model (Figure 1). Overall, the performance of operators when overseeing and intervening in automation tasking is dependent on their level of SA and workload. The operator must have sufficient SA to realize that the present situation is outside of the bounds of automation capabilities, or that the automation is performing incorrectly for the present situation, in order to decide that an intervention is needed. Further, the operator must have sufficient time and resources to be able to make the intervention. Increasing automation reliability (ability to operate accurately) and robustness (ability to operate across a wide range of possible conditions) will act to decrease attention allocation to automation performance (and its relevant data, such as input parameters), as moderated through operator trust, along with the presence of competing tasks and demands.

Figure 1. Human–autonomy system oversight (HASO) model. The model depicts the key system design features that influence the human cognitive processes involved in successful oversight, intervention, and interaction with automated systems.
The model shows that improved automation performance, including both reliability and robustness, will increase the degree to which an individual will trust automation and therefore the likelihood he or she will fail to attend to the displays and environmental information that show how well the automation is performing, for example, externally available indications of system performance, such as whether the aircraft is lined up with the runway or the presence of other vehicles on a collision course. The presence of competing tasks, including both job-relevant tasks and extraneous tasks (e.g., sightseeing, texting, daydreaming, or conversing), and high levels of task demand (e.g., high workload characterized by time and effort demands) will increase perceived trust in the automation itself and act to direct attention away from automation-related displays and information associated with automated functions.
How attention is allocated across available information sources is also controlled dynamically, based on the operator’s current SA. That is, the current model of the situation drives the search for new information (Endsley, 1995a, 2015). The level of SA that operators derive based on their ongoing allocation of attention to competing information will be significantly influenced by several key features of the automation interface. In addition, the fundamental automation interaction paradigm employed, which establishes the ways in which the operator and the automation interact, will have a substantial impact on operator SA and performance in overseeing the automation and intervening when necessary. Each of these key system design features, shown in the HASO model, is discussed in more detail in the following sections.
As an aside, this model portrays the primary system design features and cognitive processes that are involved in determining human–autonomy oversight and intervention performance. There are also many individual differences that can be at play affecting cognitive constructs, such as trust (Hancock et al., 2011; Hoff & Bashir, 2015), SA (Endsley, 1995a, 2006; Endsley & Bolstad, 1994), and workload (Hancock & Meshkati, 1988), which in turn will create individual differences in OOTL and intervention performance. These are considered implicit in the model and are not further detailed here but may be relevant in particular cases. The level of expertise of the individual, for example, can significantly affect trust, SA, and workload perceptions and thus will affect the ability of the individual to provide effective system oversight.
Automation Interface
The impact on SA of decreases in attention to the automation (and related information) will be ameliorated to a significant degree by the quality of the system interface, including (a) the degree to which it effectively presents the needed information for decision making; (b) the salience of cues associated with the state of the automation, including modes and system boundary conditions; (c) support for mode transitions, including that needed to engage the automation and to detect and respond to unexpected automation transitions to manual control; and (d) the transparency of the automation for providing understandability of its actions and predictability of future actions (Endsley, Bolte, & Jones, 2003). An effective design of the automation interface can significantly aid in directly improving SA of automation and the system as well as improve the level of trust in the automation and the appropriate calibration of that trust (Endsley, Sollenberger, & Stein, 1999; Furukawa & Parasuraman, 2003; Hoff & Bashir, 2015; Miller, Pelican, & Goldman, 1999; Sklar & Sarter, 1999). The automation interaction paradigm employed is also highly pertinent.
The Automation Interaction Paradigm: Autonomous but Not Alone
In addition to the physical displays (visual, auditory, or tactile) provided to the operator, there are several fundamental aspects of the system that determine how the human and automation will interact, how roles and responsibilities will be allocated between them, and how often these allocations will change, which I term the automation paradigm in Figure 1. Often system developers give scant attention to these design decisions; however, the automation interaction paradigm has a significant effect on the complexity of the system and the level of engagement and workload of the operator, all of which significantly influence system oversight and intervention.
Complexity, engagement, and workload
First, automation can negatively affect SA by increasing system complexity through the addition of many features or modes, which creates more interactions among system components and a corresponding reduction in system predictability as the system increasingly considers multiple factors or component states (Endsley et al., 2003). Many branches within the system logic, as well as infrequent combinations of situations and events, can create various rarely seen system states and add to the challenge of fully comprehending the system. The flight management systems used in many aircraft are an example of this, with even well-trained pilots expressing surprise at system behaviors (McClumpha & James, 1994; Wiener & Curry, 1980). System complexity makes it more difficult for the operator to create an accurate mental model of the system needed for correct interpretation of information and projection of system states, including situations where manual control will be needed.
The apparent complexity of the system (a function of the cognitive complexity, display complexity, and task complexity of the system created by the automation interface) is most pertinent for forming good mental models, as opposed to actual software complexity (Endsley et al., 2003). Cognitive complexity, from a user standpoint, is a function of the system interface’s compatibility with the user’s goals, tasks, and mental models. Users need to be able to make a clear mapping between the system’s functionality as presented through the interface and their goals. For example, if the behavior of a semiautonomous automobile changes based on interactions between different functions, such as auto-steering and adaptive cruise control, this system is more cognitively complex than when behavior is consistent. Users need to be able to understand how to simply and directly command the vehicle to behave in a way that maps to their expectations.
Both information display complexity—driven by the density, grouping, and layout of displays—and task complexity associated with steps required by the operator—characterized by the number of paths, number of possible desired end states, conflicting interdependencies, and uncertainty in linkages—affect the development of good mental models, which are fundamental for the ability of operators to develop accurate comprehension and projection associated with automation (Endsley, 1995b). The transparency of the system significantly influences the ability to develop an understanding of what the automation is doing and how to affect its actions.
In addition, at least three aspects of the automation paradigm, the LOA, AA, and GOC, will directly affect both operator workload and engagement and thus their SA, as shown in Figure 1. Engagement has already been shown to directly influence SA. In addition, workload can directly affect SA in several ways (Endsley, 1993): (a) If workload is too low, SA can be low due to low arousal and vigilance decrements, and (b) if workload is too high, exceeding operator capacity, operators will have insufficient time to gather and process needed information, and SA can suffer. In general, there is an optimal range of workload in which operators have sufficient arousal and engagement to have the SA to perform well that is characterized by an inverted-U curve. Although many automation initiatives have been driven by a desire to reduce operator workload, these efforts have frequently neglected the need to also maintain high levels of operator engagement and SA (Endsley, 1995b).
LOA, AA, and GOC effects on engagement and workload, and thus on SA and autonomy oversight and intervention in the HASO model, will each be discussed in more detail.
LOA
Different LOAs have been described, relevant to different kinds of systems (Endsley & Kaber, 1997; Endsley & Kiris, 1995; Parasuraman, Sheridan, & Wickens, 2000; Sheridan & Verplanck, 1978). Endsley and Kiris (1995) demonstrated that the loss of SA associated with full automation could be reduced via the use of intermediate LOAs (semi-autonomy) that keep operators more engaged. Kaber and Endsley (1997, 2004) developed an LOA scale based on four major task stages: (a) information—monitoring or taking in information; (b) option generation—generating options or strategies for achieving goals; (c) action selection—selecting or deciding what option to employ; and (d) implementation—implementing or carrying out actions. They further applied these four stages in various combinations to form implementable LOAs, with each level involving differing amounts of automation and ways of combining human and automation inputs across each stage, as shown in Figure 2.

Figure 2. Levels of automation formed from the combination of human and computer performance across four task stages. From Designing for Situation Awareness: An Approach to Human-Centered Design (2nd ed., p. 185), by M. R. Endsley and D. J. Jones, 2012, Boca Raton, FL: CRC Press. Copyright 2012 by Taylor and Francis Group. Reprinted with permission.
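The idea of forming an LOA by allocating each of the four task stages to the human, the computer, or both can be sketched as a simple data structure. The sketch below is illustrative only and does not reproduce the entries of Figure 2: the type names, the helper functions, and the three example levels (including their labels) are hypothetical.

```python
from enum import Enum


class Stage(Enum):
    """The four task stages described in the text."""
    INFORMATION = "information"
    OPTION_GENERATION = "option generation"
    ACTION_SELECTION = "action selection"
    IMPLEMENTATION = "implementation"


class Agent(Enum):
    """Who performs a stage (hypothetical three-way allocation)."""
    HUMAN = "human"
    COMPUTER = "computer"
    BOTH = "human and computer"


def make_level(information, option_generation, action_selection, implementation):
    """Build one level of automation as a mapping from stage to agent."""
    return {
        Stage.INFORMATION: information,
        Stage.OPTION_GENERATION: option_generation,
        Stage.ACTION_SELECTION: action_selection,
        Stage.IMPLEMENTATION: implementation,
    }


def human_involved_stages(level):
    """Return the stages in which the human still takes an active part."""
    return [stage for stage in Stage if level[stage] is not Agent.COMPUTER]


# Hypothetical example levels; the labels are not taken from Figure 2.
ALL_HUMAN = make_level(Agent.HUMAN, Agent.HUMAN, Agent.HUMAN, Agent.HUMAN)
ALL_COMPUTER = make_level(Agent.COMPUTER, Agent.COMPUTER, Agent.COMPUTER, Agent.COMPUTER)
EXAMPLE_INTERMEDIATE = make_level(Agent.BOTH, Agent.BOTH, Agent.HUMAN, Agent.COMPUTER)

if __name__ == "__main__":
    examples = [("all human", ALL_HUMAN),
                ("intermediate", EXAMPLE_INTERMEDIATE),
                ("all computer", ALL_COMPUTER)]
    for label, level in examples:
        stages = [stage.value for stage in human_involved_stages(level)]
        print(f"{label}: human involved in {stages}")
```

Counting the stages in which the human remains involved is only a rough proxy for engagement. It is used here to show how intermediate levels keep the operator in more of the task than an all-computer allocation does.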
Parasuraman et al. (2000) presented a similar stage of automation taxonomy with the information monitoring factor divided into two levels, information filtering and information integration or inference, and with option generation and action selection combined in their action selection category. Action execution makes up their fourth level, equivalent to implementation in the Endsley and Kaber (1997) model. Table 1 unifies these two taxonomies and summarizes a considerable body of research evaluating how changing the LOA at each stage of a task affects human performance and cognition.
Table 1. Impact of Autonomy by Stage of Task Performance on Human Oversight and Intervention Performance, Workload, and Situation Awareness: A Summary of Research Findings
Note. SA = situation awareness; OOTL = out of the loop.
Extensive research shows that the effects of automation on SA, workload, and both normal and OOTL performance are quite different depending on which stage of a task it is applied to (Endsley & Kaber, 1999; Kaber et al., 2000). Onnasch, Wickens, Li, and Manzey (2014) performed a meta-analysis of 18 studies on LOA, revealing what they labeled a “lumberjack effect”; the more automation helps manual performance, the worse is the performance in recovering from automation failure (i.e., OOTL). Reviews of LOA research show the actual effect of automation is highly dependent on not only the stage of the task involved but also several other pertinent task characteristics, as shown in Table 1. This finding is discussed in more detail for each of the three main task stages: carrying out the actions associated with a task, making decisions about what should be done or how to best accomplish a task, and developing SA to support decision making.
I. Action
In many cases, automation is applied to carrying out or implementing physical tasks, such as washing dishes, following a predetermined route, or controlling automobile speed. In general, significant benefits for overall system performance can be derived from the automation of repetitive, routine tasks by reducing human manual labor, especially if such automation is highly reliable (Billings, 1997; Wiener & Curry, 1980). It also can improve performance in many risky, difficult, and time-intensive tasks. Although automation can reduce manual workload in many situations, cognitive workload has been found to be even higher during many peak-workload times if interaction with the system requires an excessive number of tasks (Bainbridge, 1983), particularly if the system has high failure rates (Manzey et al., 2012). Bainbridge (1983) called it the “irony of automation” that when workload is highest, automation is often of the least assistance.
However, OOTL problems are significantly greater with automation that allows a number of tasks to be sequenced for future implementation (i.e., advanced queuing of tasks), for example, navigation in which the route is laid out in advance or batch processing in which many steps in a process are sequenced in advance (Endsley & Kaber, 1999). Although the automation of continuous-control tasks (e.g., steering a car, aircraft, or robot arm) may reduce attention demands under good automation performance and free up attention for other tasks, there also will be significant OOTL problems (Kaber et al., 2000; Wickens & Kessel, 1979; Young, 1969). “Psychomotor tasks, for instance, may act to subtly transmit information important for SA . . . which may be jeopardized by automation schemes that decrease this type of input” (Endsley & Kiris, 1995, p. 392).
Automation changes a psychomotor task into a series of discrete cognitive tasks, which requires more conscious attention sporadically, as compared with continuous manual control, which generally requires lower attention over an extended period. Maintaining vigilance for discrete interventions can be quite challenging because attention has often been redirected to competing tasks. Studies have shown loss of not only Level 1 SA (perception of data) but also Level 3 SA (projection of future states) presumably due to low engagement and loss of subtle cues when not directly controlling the task (Kaber et al., 2000).
II. Decision making
Automation that aids decision making (making recommendations on what course of action to take, for example) has been found to be problematic due to a decision-biasing effect (Crocoll & Coury, 1990; Endsley et al., 2003; Lorenz, Di Nocera, Rottger, & Parasuraman, 2002; Reichenbach, Onnasch, & Manzey, 2011; Sarter & Schroeder, 2001). In general, when the automation is correct, people are more likely to make a correct decision; however, when it is incorrect, they perform worse than if they had received no decision advice at all (Layton, Smith, & McCoy, 1994; Olson & Sarter, 1999), a situation that is worse with more reliable automation (Metzger & Parasuraman, 2005; Rovira, McGarry, & Parasuraman, 2007). People may also be slowed down by the provision of decision advice in that they need to act to compare the system’s recommendation with other information to decide whether or not to agree with it (Endsley & Kiris, 1994; Madhavan & Wiegmann, 2005; Pritchett & Hansman, 1997).
This research points to a serial model of human–autonomy interaction in which people take in the advice of decision support systems along with other competing information and make comparisons in order to make a decision (Endsley et al., 2003), as shown in Figure 3. Although system designers have generally assumed parallel performance (i.e., that people will be able to back up automation as independent agents), which would produce better outcomes than either agent operating alone, this is not the case. Figure 3 shows that the reliability of two serial systems is reduced, compared with two systems in parallel, which explains the frequent lack of a general performance improvement for decision-making automation. Although many factors will influence whether automation performs better than humans or vice versa, including the competence and experience of the individual and the capability of the automation, the fact that these systems are not truly independent must be considered as a limiting factor on joint performance, along with other interface and system design features.

Figure 3. Explaining the decision-biasing effect. Decision automation recommendations create a serial system with human decision makers, potentially lowering overall decision reliability. Adapted from Designing for Situation Awareness: An Approach to Human-Centered Design (2nd ed., p. 180), by M. R. Endsley and D. J. Jones, 2012, Boca Raton, FL: CRC Press. Copyright 2012 by Taylor and Francis Group.
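The serial versus parallel comparison can be illustrated with standard reliability arithmetic. The following is a minimal sketch, not the source's own computation: the function names and the 0.9 and 0.8 values are hypothetical, and the two agents are treated as statistically independent purely to keep the arithmetic simple. In series, both the human and the automation must be correct, so the reliabilities multiply; in parallel, the joint system fails only when both fail.

```python
def serial_reliability(r_human: float, r_automation: float) -> float:
    """Both components must be correct for the joint decision to be correct."""
    return r_human * r_automation


def parallel_reliability(r_human: float, r_automation: float) -> float:
    """The joint system fails only when both components fail."""
    return 1.0 - (1.0 - r_human) * (1.0 - r_automation)


if __name__ == "__main__":
    # Hypothetical reliabilities chosen only for illustration.
    r_h, r_a = 0.9, 0.8
    print(f"serial:   {serial_reliability(r_h, r_a):.2f}")
    print(f"parallel: {parallel_reliability(r_h, r_a):.2f}")
```

Under these assumed values, the serial arrangement is less reliable than either agent alone, whereas the parallel arrangement is more reliable than either, which is the contrast the figure draws.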
SA and OOTL problems are also worse with decision automation as compared to manual performance (Endsley & Kaber, 1999; Endsley & Kiris, 1995; Kaber et al., 2000; Kaber & Endsley, 2004; Li et al., 2014; Onnasch et al., 2014) and as compared to automation that acts to integrate information to provide better SA (Endsley & Selcon, 1997; Onnasch et al., 2014; Sethumadhavan, 2009). Manzey et al. (2012) found that about half of people who accepted incorrect system advice failed to verify the recommendation, and the other half apparently checked all the relevant information but still followed the wrong advice, showing a lower level of SA associated with passive processing.
Whether the system only makes recommendations, with the human making the ultimate decision (management by consent), or the system makes the selection and implements it unless the human vetoes (management by exception) makes little difference to OOTL recovery times (Endsley & Kaber, 1999). Systems that aid decision making in other ways, such as critiquing systems (Guerlain et al., 1999), or systems that support what-if reasoning and contingency planning (Endsley et al., 2003), do not suffer from these ill effects, likely because human task engagement remains high and decision-biasing problems are avoided. Critiquing systems encourage decision makers to think through other information and possibilities after their initial decision is made, which avoids the serial process in Figure 3. What-if and contingency-planning systems encourage decision makers to think through multiple possibilities and prepare for them in advance, actually enhancing the process of developing SA and avoiding the downside of decision making based on a narrow subset of relevant information.
III. SA
There are significant benefits from automation that gathers and presents needed information (Level 1 SA) and from automation that better integrates information to support comprehension and projection needs (Levels 2 and 3 SA). This type of automation significantly reduces workload and can enhance both SA and performance with few negative effects (Endsley et al., 2003; Endsley & Selcon, 1997; Onnasch et al., 2014; Sarter & Schroeder, 2001). Aiding the formation of SA leaves the human active in decision making, engagement levels high, and OOTL problems minimized.
Automation that cues people to subsets of information that the system believes are important, however, falls prey to the same sort of decision-biasing effects as decision support systems. This type of automation would include, for example, systems that highlight an item in a visual scene or certain portions of a visual display. When they are correct, people’s performance is improved, but when there is key information that is not flagged by the system, performance is worse (Yeh, Wickens, & Seagull, 1999).
Similarly, automation that filters information can be problematic. Automated information filtering significantly limits information at any point in time by filtering out everything that is not pertinent to the specific task momentarily at hand. Because people need to rapidly switch between goals and tasks, however, they also need global information across goals in order to prioritize their tasks, and they need to project future potential problems to support proactive decision making and contingency planning. Therefore, automated information-filtering systems can significantly degrade SA and performance (Endsley, 1990; Endsley et al., 2003).
IV. Summary of LOA research and implications for autonomy in the HASO model
Figure 4 summarizes the research on LOAs to show that overall human engagement will be (a) high with automation that supports situation awareness, (b) lower with automation that acts to recommend or select courses of action, and (c) in between for automation that carries out the actions associated with most tasks but low if it involves advanced queuing of future tasks or continuous control tasks. Conversely, workload will be generally lowered by automation that supports SA or carries out tasks but can be significantly increased by automation that tries to make decisions. In addition, workload will often be much higher during OOTL recovery periods and when trying to control or adjust the automation itself, showing that workload effects are often time dependent.

Figure 4. Effect of levels of automation on operator engagement and workload. Oversight of autonomous systems will be negatively affected by low engagement levels and by high workload during automation control and out-of-the-loop recovery periods.
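The qualitative pattern summarized for Figure 4 can also be written as a small lookup table. The sketch below only restates the text in code form; the key names, the `Effect` type, and the coarse labels are hypothetical simplifications and are not values taken from the figure.

```python
from typing import Dict, List, NamedTuple


class Effect(NamedTuple):
    engagement: str  # qualitative operator engagement
    workload: str    # qualitative workload during normal operation


# Hypothetical encoding of the summary in the text above.
LOA_EFFECTS: Dict[str, Effect] = {
    "situation_awareness_support": Effect("high", "generally lower"),
    "decision_recommendation_or_selection": Effect("lower", "can be significantly higher"),
    "action_implementation": Effect("in between", "generally lower"),
    "action_implementation_queued_or_continuous": Effect("low", "generally lower"),
}


def low_engagement_types(effects: Dict[str, Effect]) -> List[str]:
    """Return the automation types the summary associates with reduced engagement."""
    return [name for name, e in effects.items() if e.engagement in ("low", "lower")]


if __name__ == "__main__":
    for name in low_engagement_types(LOA_EFFECTS):
        print(name)
```

The table deliberately omits the time-dependent effects noted in the text, such as the higher workload during OOTL recovery and while adjusting the automation itself.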
Future autonomous systems will generally involve automation applied to two or all three processing stages involved in a task. This design may take the form of full autonomy, in which human intervention is not possible, or supervisory control, in which operators are expected to intervene as necessary (also called human-on-the-loop). Because these high levels of autonomy will involve automation of decision making and continuous-control tasks or advanced queuing of tasks, human engagement can be expected to be low, with reduced SA and higher OOTL problems found. The lower engagement levels, lower SA, and OOTL performance challenges will significantly mediate expected improvements in manual workload and normal task performance under supervisory control. At the heart of the automation conundrum, these challenges will continue to be a problem as levels of system reliability and robustness improve with future autonomy unless overcome by significant improvements in the human–autonomy interface. Although not similarly negatively affected by low engagement, full autonomy is possible only under very high levels of reliability and robustness as no human intervention will be possible. Figure 4 details these effects as a part of the HASO model.
AA
A second automation paradigm shown in the HASO model has attempted to increase operator engagement by intentionally adding periods of manual control to otherwise automated task performance via AA (Rouse, 1988). AA periods of manual control can be activated based on set time periods, the occurrence of critical events, drops in human performance, physiological measures, or human models (Scerbo, 1996), and AA has been found to improve human–system performance (Parasuraman, Molloy, & Singh, 1993). AA primarily aids human performance by reducing workload (Hilburn, Jorna, Byrne, & Parasuraman, 1997; Kaber & Endsley, 2004; Kaber & Riley, 1999). Improvements in operator engagement, as measured by electroencephalography, have also been demonstrated (Bailey, Scerbo, Freeman, Mikulka, & Scott, 2003; Prinzel, Freeman, Scerbo, Mikulka, & Pope, 2003), and AA has been shown to improve performance during high-workload periods (Wilson & Russell, 2007). Thus, periods of AA can positively affect both workload and SA, via improved engagement, as a partial means of improving human oversight of autonomy.
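One way to picture the triggering criteria listed above is as a simple predicate over the current operator and system state. The sketch below is illustrative only: the `OperatorState` fields, the threshold values, and the `should_return_manual_control` function are hypothetical and are not drawn from the cited studies, and triggering from human models is omitted for brevity.

```python
from dataclasses import dataclass


@dataclass
class OperatorState:
    seconds_in_automation: float  # time since the last manual-control period
    critical_event: bool          # whether a critical event has occurred
    performance_score: float      # 0.0 (poor) to 1.0 (good), hypothetical scale
    engagement_index: float       # physiological index, hypothetical 0.0 to 1.0 scale


def should_return_manual_control(
    state: OperatorState,
    max_automation_seconds: float = 600.0,  # assumed time-based trigger
    min_performance: float = 0.7,           # assumed performance threshold
    min_engagement: float = 0.5,            # assumed physiological threshold
) -> bool:
    """Return True if any of the assumed AA triggers fires."""
    return (
        state.seconds_in_automation >= max_automation_seconds
        or state.critical_event
        or state.performance_score < min_performance
        or state.engagement_index < min_engagement
    )


if __name__ == "__main__":
    drowsy = OperatorState(120.0, False, 0.9, 0.3)
    print(should_return_manual_control(drowsy))
```

Combining the triggers with a logical OR is a design choice made for the sketch; a fielded system might weight or sequence them differently.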
GOC
GOC is a third major automation paradigm shown in the HASO model that is under the control of automation designers. GOC can range (high to low) from (a) manual control to (b) programmable control, requiring the programming of each task parameter and specification; (c) playbook control, selecting from a playbook of preset behaviors (Miller, 2000); and (d) goal-based control, where only a high-level goal needs to be provided to the system (U.S. Air Force, 2015). In general, workload should decrease with less control granularity so long as the control actions provide a clear mapping to user goals and mental models and the system provides for an easy transition from normal to non-normal conditions and back. However, because increased queuing of tasks also decreases SA, lower control granularity could lower SA and will create the need for operator interfaces that pay particular attention to keeping the operator informed of automation states and able to project future actions.
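The four granularity levels form an ordered scale, which can be sketched as an enumeration. The names `ControlGranularity` and `order_by_expected_workload` and the integer values are hypothetical. The ordering encodes only the general expectation stated above that workload decreases with less control granularity, and it ignores the stated conditions about goal mapping and transitions.

```python
from enum import IntEnum
from typing import Iterable, List


class ControlGranularity(IntEnum):
    """Granularity of control, from lowest (1) to highest (4)."""
    GOAL_BASED = 1    # only a high-level goal is provided
    PLAYBOOK = 2      # operator selects from preset behaviors
    PROGRAMMABLE = 3  # operator programs each task parameter
    MANUAL = 4        # operator controls the task directly


def order_by_expected_workload(levels: Iterable[ControlGranularity]) -> List[ControlGranularity]:
    """Sort from lowest to highest expected workload, assuming workload rises with granularity."""
    return sorted(levels)


if __name__ == "__main__":
    for level in order_by_expected_workload(ControlGranularity):
        print(level.name)
```

The same ordering, read in reverse, reflects the concern that lower granularity may reduce SA, which is why the interface must keep the operator informed of automation states.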
Guidance for Effective Autonomy–Human Interaction
In summary, the HASO model depicts the critical system-design features that will affect human performance in operating with autonomous and semiautonomous systems, including system reliability and robustness, the operator interface, and at least three automation paradigms inherent in the system design. The desire to build system autonomy that is trusted and effective ultimately depends on creating autonomy that is trustworthy—highly reliable across a wide range of conditions—and able to support human decision making and intervention in those cases where it is not.
Guidelines for the design of autonomy, developed based on extant research on automation interface features and automation paradigms in the HASO model, are shown in Table 2. They provide guidance for supporting operator SA and autonomy oversight that will significantly support this goal. The guidelines address key aspects of successful autonomy interfaces and managing complexity to improve operator understanding of the system. In addition, because autonomy directed at improving SA has been found to be highly beneficial, several guidelines that support SA are included. These guidelines are explained in more detail in Endsley et al. (2003), along with design methods for meeting each one.
Table 2. Guidelines for the Design of Human–Autonomy Systems
Source. From Endsley, Bolte, and Jones (2003).
Note. OOTL = out of the loop; SA = situation awareness.
Increases in the amount of autonomy provided by systems will make it increasingly important that attention is paid to the design of the human–autonomy interface via the application of guidelines such as these, coupled with careful testing to show that human operators fully understand what the autonomy is doing, what it is projected to do in the near future, and the limits of its boundaries for successful performance. To summarize some key recommendations laid out in Table 2, this goal can be accomplished through interfaces that have high levels of system transparency, providing both understandability and predictability of the system, along with the appropriate use of salient features to support operator understanding of key states and mode transitions. Information that is critical for understanding system reliability (e.g., how well it is functioning, its confidence level in fused information, or system assessments) as well as its robustness (meaning its ability to handle current and upcoming situations) needs to be made readily transparent to the operator.
Not only must the key information required for operator SA of the environment, the system, and the autonomy be identified and included in the interface, but that information must be presented effectively so that the operator stays in the loop and is able to oversee the autonomy. And this design must be accomplished with a manageable level of complexity that will allow the operator to develop accurate mental models and easily effect the appropriate control actions to keep the system in line with operator goals.
These guidelines were developed in the context of semiautonomous systems, that is, those in which the human operator is expected to be responsible for directing the system, overseeing its performance, and intervening when needed. At full autonomy, however, no intervention will be possible. In this case, the human may choose whether or not to use the autonomy, for example, by getting into an autonomous vehicle or activating autonomy for some stated purpose. Even then, the human will still need an accurate mental model of the system’s capabilities in order to develop sufficient trust in the system to choose to use it (U.S. Air Force, 2015), and the autonomy will need to meet expectations for cooperative behaviors (Chiou & Lee, 2016).
In addition, it is likely that people will still need to coordinate with completely autonomous systems for joint activities, such as proposed manned-unmanned vehicle teaming in military aviation. Therefore, although not all of the guidelines in Table 2 will apply to all cases of full autonomy, the majority will continue to be highly relevant for creating an understanding of the operation of the autonomous system to support trust and joint operations.
HASO Model Summary
The automation conundrum creates a significant barrier to the development of autonomous systems. The HASO model shows the central factors that interact to create this conundrum and the design features that can be used to help mitigate, although probably not completely overcome, it. The HASO model provides guidance for the development of automation interfaces and human–automation integration methods toward this goal. The key benefits of the HASO model are as follows:
The model integrates some 20 years of research on the effects of automation on trust and complacency with that on SA and OOTL performance to provide a more comprehensive model of factors affecting human oversight and intervention in working with various forms of autonomy.
The model organizes the extensive research literature on LOA, including two separate LOA taxonomies, providing guidance on how automation applied to each of three task stages affects human performance, as well as a number of relevant task-related factors.
The model ties AA, LOA, and GOC research together as key autonomy paradigms and shows how they affect SA, workload, and complexity as they relate to oversight and intervention.
The model details the main pathways affecting autonomy oversight and intervention performance, separating the monitoring and trust factors from the effects of engagement as each influences SA.
The model is largely consistent with existing models of complacency (Wickens et al., 2015) and trust (Hancock et al., 2011; Hoff & Bashir, 2015) but extends beyond them to place these constructs in the broader context of cognitive constructs and design features affecting autonomy oversight and intervention.
The model describes the key system-design features that affect human cognition in interacting with autonomy and points the way to design guidance for improving joint human–autonomy performance.
Although largely descriptive, based on a wide body of research results, the model should also be predictive in terms of the relationship between constructs and the direction of effects that can be expected from various manipulations in system design.
Toward Fully Autonomous Systems: Research Needs
Over the next 30 years, many systems will evolve to incorporate some level of semiautonomous capability via a gradual evolution of system control, with intermediate levels of autonomy being applied to various functions as the autonomy becomes more capable over time, can handle a greater range of functions, and can handle greater ranges of variability in the environment. In contrast, a more revolutionary approach promotes a complete shift to full autonomy. Google, for example, is promoting the development of a completely driverless car. Although the revolutionary approach may reduce some of the challenges involved in human–autonomy oversight and intervention, it also will be very difficult to achieve the levels of reliability and robustness needed for acceptable use (Woods, 2016).
As the capabilities of autonomy increase, the frequency of human intervention will likely decline; however, it is anticipated that some level of human–system interaction will continue to be required for the foreseeable future. Thus the evolutionary approach to autonomy is most likely to characterize technical development in many arenas, and the success of these semiautonomous systems will be highly dependent upon effective human–autonomy interfaces that overcome current challenges. The HASO model thus directly supports efforts in the development of such semiautonomous systems.
In addition, significant effort is being put into the development of systems that are envisioned as being fully autonomous, largely leveraging the capabilities of learning algorithms. The move from programmed automation to system autonomy based on machine learning brings with it many new challenges and research needs (U.S. Air Force, 2015).
Autonomy Software Validation
A means of validating software that has been created through learning algorithms is critically needed, as traditional methods fail to address the complexities of learning systems. Exhaustive testing of rules and potential system states will not be possible, and understanding boundary conditions will be difficult. The ability of the system to degrade gracefully and support human–autonomy interaction in such cases will need to be explicitly incorporated as a part of validation testing, and methods to support such efforts need to be established.
Learning System Consistency
Confusion may arise if individuals interact with multiple autonomous systems that are at different levels of learning, that is, exhibit nonstandardization of performance. If different models of a system learn different lessons, for example, due to the unique experience of different autonomous vehicles, the need to decide whether those lessons are accurate or generalizable to other systems creates new software management challenges, and complexity for human operators potentially increases if different autonomous systems behave differently. How should the operator maintain an accurate mental model of such systems if they vary in the details of how they perform and the situations they can handle? Further, as the autonomy learns new behaviors over time, it will be critical to develop means of conveying these changes to the human operators so that they will maintain an accurate mental model of the autonomy.
Transparency of Learning Systems
Unique challenges will arise for creating understandability and predictability of the autonomy that is critical for the human operator’s SA. The actual logic and lessons “learned” by neural networks and deep learning software are typically opaque not only to the human operator but also to software developers who may not fully understand how the system will behave in all circumstances. Although there are some techniques for deriving rules from such software (Huang & Endsley, 1997), these representations are likely to be incomplete and may not fully represent the entire complexity of behavior of the system. There is a significant need to develop methods for creating transparent interfaces that convey the needed system understandability and predictability to operators.
Human–Autonomy Teams: Different Concepts of Operation (COOs)
The COO inherent in the design of autonomous systems forms a fourth relevant automation paradigm that needs substantially more research. The COO includes such considerations as who is in charge and how interrelated are the tasks of the human and autonomy. Potential COOs include
Human as a supervisor over automation that acts as an aid or an assistant,
Humans and autonomy acting as collaborating teammates, and
Automation that oversees and acts as a limit on human performance.
Much automation research has been conducted based on a COO in which the human operator is a supervisor in charge of overseeing the performance of the system and ultimately responsible for it. With the development of systems that are envisioned as more capable and truly autonomous, many researchers have described an alternate model of interaction in which the human and autonomous systems are partners or teammates who collaborate on the performance of tasks (Taylor & Reising, 1994). For instance, a fully autonomous air vehicle may perform independently in many ways but could still need to coordinate with a pilot in a manned vehicle for joint operations to accomplish a mission. The passenger of an autonomous automobile may also need to interact successfully with it in order to set up a shared goal (destination) and to manage changing needs over the course of a trip.
Cuevas, Fiore, Caldwell, and Strater (2007, p. B64) define a human–automation team as “the dynamic, interdependent coupling between one or more human operators and one or more automated systems requiring collaboration and coordination to achieve successful task completion.” In this context, rather than assuming a one-way flow of information from the system to the human supervisor who directs it, an emphasis is placed on a two-way flow of information and on a wider range of teamlike behaviors, such as collaboration, coordination, and support for joint planning and replanning; reprioritization of goals; and reallocation of tasks.
Klein, Woods, Bradshaw, Hoffman, and Feltovich (2004) identify the need for intelligent systems working in teams to (a) enter into a basic compact of joint goals and understood roles, and signal the other team members when they cannot perform their assigned tasks; (b) possess adequate mental models of each other; (c) be predictable to each other; (d) be directable; (e) share their status and intentions; (f) be able to interpret the status and intentions of the other team members; (g) be able to negotiate goals, particularly when situations change and adaptations are necessary; (h) collaborate to include problem solving, replanning, and retasking as needed; (i) redirect teammates’ attention to important signals, activities, and changes without overwhelming each other; and (j) manage the costs of coordination to maintain acceptable workload levels.
Although the discussion of the HASO model was largely focused on a human supervisory COO, it can also be extended to the human–autonomy teaming COO. In addition to automation oversight and intervention, the broader range of joint behaviors (such as collaboration, coordination, joint planning and replanning, reprioritization of goals, and reallocation of tasks) are enabled interactions in the model in that they also rely significantly on the individual’s SA regarding the state of the system and the autonomy. In addition, the capability for autonomy to exhibit intersocial behaviors, such as cooperation, coordination, and collaboration, will be necessary.
A human–autonomy teaming COO will likely require very high levels of LOA, with very low levels of AA and GOC—thus direct human engagement in the autonomy’s tasks will be low. This COO creates a situation in which it will be critical to create advanced interfaces that support the need for shared SA between the human and the autonomy. Shared SA is fundamental to supporting coordinated actions across multiple parties who are involved in achieving the same goal and who have interrelated functions, such as those that will occur in human–autonomy teams. Both the human and autonomous teammates will require a high level of shared SA to support a number of fundamental operations (U.S. Air Force, 2015): (a) goal alignment as priorities change; (b) function allocation and reallocation based on the relative capabilities and status of both the human and the autonomy for performing various functions as well as keeping up with who is doing what; (c) communication of decisions, including strategies, plans, and actions, as the human and the autonomy make decisions about how to perform their various functions, allowing actions on related functions to be coordinated; and (d) task alignment supported by maintenance of an ongoing understanding of what actions have been taken by the other and how successful those actions are at achieving shared goals.
Shared SA is “the degree to which team members possess the same situation awareness on shared situation awareness requirements” (Endsley & Jones, 2001, p. 47), where shared requirements are those common aspects of the situation that are needed by both teammates. These aspects include basic data about the situation that are relevant to the goals and decisions of both parties, how each party is interpreting the situation, and projections made that are relevant to the other party.
For example, if the pilot of a manned vehicle and an autonomous vehicle need to coordinate to prosecute a target, certain information about the location, trajectory, speed, and capabilities of each other would need to be shared as well as any pertinent information each partner has regarding the target and its environment and his or her priorities, assessments, and planned actions. This shared SA gets them on the same page without overwhelming each other with nonpertinent details. Because there is a significant potential for the autonomy and the human to have very different assessments of the world driving their decisions due to different sensors and different mental or computational models, methods must be developed for sharing not just the low-level data upon which each is operating but also how that information has been interpreted and relevant future projections each has made.
Work on developing shared SA between human teammates can be leveraged to provide a model for supporting shared SA in human–autonomy teams (Endsley & Jones, 2001). Shared SA between teammates requires the following:
Perception: Shared perception of data needed for joint tasks, data validity, task status and actions of self and other, task assignments, and current goals and priorities of each.
Comprehension: Shared understanding of significance and meaning of data pertinent to interrelated tasks; impact of one’s own tasks on goals, system, environment, and other; impact of other’s tasks on goals, system, environment, and own tasks; and ability of self and other to perform assigned and prospective tasks.
Projection: Projected actions, strategies, and plans.
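To make the notion of a degree of shared SA concrete, the sketch below compares two teammates' views on a set of shared requirements spanning the three levels above. This is a hypothetical operationalization for illustration only, not a measure proposed in the cited work: the `shared_sa_fraction` function, the requirement names, and the example values are all assumptions.

```python
from typing import Dict, Iterable


def shared_sa_fraction(
    human: Dict[str, str],
    autonomy: Dict[str, str],
    shared_requirements: Iterable[str],
) -> float:
    """Fraction of shared requirements on which both teammates hold the same value."""
    required = list(shared_requirements)
    if not required:
        return 1.0  # nothing needs to be shared, so nothing can disagree
    matches = sum(
        1
        for key in required
        if key in human and key in autonomy and human[key] == autonomy[key]
    )
    return matches / len(required)


if __name__ == "__main__":
    # Hypothetical views held by a pilot and an autonomous vehicle.
    human_sa = {
        "target_location": "grid A4",        # perception
        "threat_assessment": "high",         # comprehension
        "planned_action": "approach north",  # projection
    }
    autonomy_sa = {
        "target_location": "grid A4",
        "threat_assessment": "low",
        "planned_action": "approach north",
    }
    print(shared_sa_fraction(human_sa, autonomy_sa, human_sa.keys()))
```

In this example the two teammates agree on the raw data and the plan but not on the interpretation, which is the kind of mismatch that sharing only low-level data would fail to reveal.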
Although considerable research has been conducted on the mechanisms, devices, and processes used in human–human teams to achieve shared SA (Bolstad & Endsley, 2000; Cooke et al., 2003; Endsley & Jones, 2001; Endsley & Robertson, 2000; Prince & Salas, 2000; Salas, Prince, Baker, & Shrestha, 1995), much less is known about how to create effective shared SA in human–automation teams. Future research is needed to establish effective methods for achieving the required level of shared SA in human–autonomy teams to enable effective team performance.
Conclusion
The HASO model provides guidance needed to support design decisions for many semiautonomous and fully autonomous systems currently in development. As the system reliability and robustness of autonomous systems continue to increase, as the autonomy is capable of performing for much longer periods, and as the LOA increases, the ability of human operators to maintain SA will be challenged. The design of the autonomy interface and the autonomy paradigms employed can significantly ameliorate, although probably not completely overcome, this problem. As long as human oversight of autonomous systems and intervention is required to achieve successful joint performance, the automation conundrum will undermine performance and safety in many applications. When fully autonomous systems become a reality, it is likely that SA and shared SA will continue to be needed to support collaboration and teaming for many tasks. Additional research is needed to enable successful development of interfaces that will support human operation with autonomous systems.
Key Points
An automation conundrum exists in which as more autonomy is added to a system, and its reliability and robustness increase, the lower the situation awareness of human operators and the less likely that they will be able to take over manual control when needed.
The human–autonomy system oversight model shows the key factors affecting human operators’ ability to intervene in autonomous system behavior.
Key features for the design of autonomy interfaces and autonomy interaction paradigms are presented.
The development of autonomous systems will drive the need for additional research to (a) support their validation, particularly in relation to graceful degradation; (b) support the development of operator mental models when working with learning systems that foster inconsistent behaviors; (c) create needed system transparency with autonomous systems based on machine learning; and (d) devise methods for supporting shared situation awareness in human–autonomy teams.
