Abstract
A growing number of domains for Human-Robot and Human-Machine Interactions (HRI/HMI) will involve fleets of autonomous machines. In these fleet environments, robots will encounter not just primary interactors in one-on-one encounters, but also secondary and even tertiary ones who are bystanders to direct human-to-robot interactions. Thus, the interaction paradigms used by such robots may need to be reconsidered to meet a growing diversity of interactions as fleet HRI/HMI applications continue to expand. Relying on use cases from field testing in mock urban environments, this paper discusses our lessons learned when supporting multiple “layers” of interactors and how they relate to needs that Human Factors and Ergonomics (HF/E) science can help to address.
Introduction
In most cases, the focus of human-robot and human-machine research has been on direct interactions between one human and one robot, or between one robot and a small group (Smith et al., 2020; Preusse et al., 2021). Although there are growing numbers of studies of humans working with multiple robots, these studies often involve the command and control of robots in simulation, or settings in which human users function primarily as monitors or supervisors of the autonomous fleet. The HF/E community is well versed in how to facilitate human supervision of fleets of uncrewed autonomous ground and aerial vehicles (UGVs and UAVs), for instance by creating control interfaces for deploying multiple vehicles for reconnaissance (Chen et al., 2014; Fern et al., 2011; McKendrick et al., 2014). However, those studies often focus on facilitating good human performance for the supervisor of the autonomy and not necessarily for the recipient of the autonomy, who could also be thought of as a user/interactor of the autonomous system.
Further, the rate at which fleets of robots are being deployed in the real world is steadily growing and is outpacing the study of their implementation. For instance, fleets of autonomous delivery robots are already deployed on several college campuses throughout the U.S. (Marks, 2019). Similar robots are also being used in airports (Triebel et al., 2016) and hospitals worldwide (Guzman et al., 2021) for disinfecting spaces, delivering supplies, and providing hospitality service support (Choi et al., 2020).
Given that such fleets of robots will be operating in large, potentially crowded environments, user experiences with these robots are growing beyond direct one-to-one interactions or one-to-many supervisory interactions to also include secondary and even tertiary ones. For instance, observing another person’s direct interaction is an example of a secondary interaction, because the observer’s perceptions of a robot may be changed by watching someone else’s (i.e., a primary) interaction with it. These observations could then influence subsequent interactions with robots when those secondary interactors become primary ones in the future. Repeated secondary observations of primary interactions may also scale up to represent tertiary interactions; that is, they may influence what people perceive about the larger organization that the fleet represents.
Thus, the experience of secondary and even tertiary interactions is worth exploring because experience as a non-direct interactor can alter the way a person engages in direct interactions with robots in the future. In the larger robotics community, crowds and bystanders who represent non-primary interactors have often been treated as obstacles to be studied, modeled, predicted, and/or avoided by autonomous systems altogether (Shiomi et al., 2014). However, these bystanders matter beyond simply being modeled by the system, as they are potential future direct interactors. They can also serve as sources of information for fleet supervisors, who also represent primary users, and that information could be used to improve the fleet in subsequent interactions.
Further, the deployment of robotic fleets may also present juxtaposing, and at times potentially competing needs and goals between direct interactors who represent different “users” of the fleet system. To illustrate, a supervisor of an entire robot fleet may be tasked with monitoring and commanding the system, and that supervisor may need to transition from supervisor to operator of a single robot—to intervene in an autonomous behavior, for instance. On the other hand, a user who directly experiences the presence, actions, or consequences of the robot in a public space may begin as a bystander to the system and then transition to a user of the system—by observing and then engaging with an airport kiosk robot, for example.
In both cases, the different types of “users” of this system transition between the spaces of direct and indirect interaction and represent two different sides of the interaction: on one hand, control of the system, and on the other, use or experience of the system (Yanco et al., 2002; 2004). Given the different applications of the interaction, the system requirements for designs that facilitate interactions may differ, especially as each type of interaction happens simultaneously and interactors can, and will likely need to, transition through types of interactions.
Relying on use cases from field testing of a fleet autonomy system in mock urban environments, the purpose of this paper is to discuss insights about user needs and preferences when engaged as primary, secondary, and tertiary interactors with the system and highlight needs that the HF/E community could help to address.
Field Testing and Use Case
The focal use case for our observations included field tests of a Defense Advanced Research Projects Agency (DARPA) sponsored project to develop supervised fleet autonomy for reconnaissance operations in complex urban environments. Such operations continue to be a difficult problem for the U.S. military and other governmental agencies because features of the terrain often expose personnel to significant risk. Current techniques rely heavily on foot patrols for reconnaissance, which are labor- and time-intensive. The development and deployment of autonomous systems in such environments can help overcome terrain challenges by combining autonomous sensing across sensor modalities (both static and mounted on autonomous ground and aerial robots) and integrating sensing with the recording of human interactions and responses to these robots. When combined, the components of these systems can aid personnel in learning about people and places of interest in the environment.
For our field testing, the supervised autonomous fleet consisted of several ground robots, including Clearpath Robotics Warthogs, Clearpath Robotics Huskies, and a Polaris Ranger off-road vehicle outfitted for autonomous navigation and driving. These ground vehicles were accompanied by several small-to-medium autonomous aerial drones (Teal Golden Eagles and Theiss Validus Hex), as well as mounted camera sensors in the outdoor urban environment. One of the purposes of field testing was to evaluate the autonomy engineering of the fleet and to record information about human interactions with the fleet. Field testing of this fleet autonomy system was conducted over the course of about 30 months (and is still ongoing), organized into several iterative agile engineering cycles of approximately 5-6 months. Multiple field tests took place at several different mock Urban Terrain installations across the U.S. Each location included multi-story buildings, stationary automobiles, and streetscapes, often with degraded buildings and other urban infrastructure. Each test event was organized into a series of 45- to 60-minute test runs, called “missions”, each of which represented a self-contained scenario. Each field-testing event was conducted over several days, with multiple missions run each day.
Participants were recruited to come to a testing installation to participate in the testing events as role players in the missions. Participants were tasked with assuming a “role” in the urban environment and their task involved playing the role of someone who would live and work in the environment, like a shop owner, morning commuter, or mayor of the city’s populace. In addition, some roles represented potentially “hostile” players whose motivation for moving around the environment may be nefarious, to undermine or thwart the system, plant improvised explosive devices, or to provide counter-surveillance of the autonomous vehicles.
In their role, participants were instructed to move through the urban terrain environment, engage with other role-playing participants, and engage in behaviors that would be consistent with their role (e.g., a commuter going to a bus stop). Although the participants were given a role and a background bio, and were placed within a specific scenario (e.g., a festival or a disaster recovery site), the scenarios and participant actions were not scripted. Throughout 2.5 years of testing, over 500 participants (between 10 and 80 per test run) were engaged as role players in the environment. Testing was split into individual mission scenarios, in which some of the role player participants congregated in crowds or groups of different sizes. For instance, a morning commuter’s role in a given mission might include stopping in a cafe crowded with other participants on their way to a vacant bus stop.
At the same time, the vehicles that comprised the fleet autonomy system would engage the human participants in interactions with the vehicles. To encourage interaction and engagement from participants, the vehicles would announce information (at times similar to public service announcements), ask participants to come over to the vehicle to provide information about themselves by recording their responses on a tablet mounted on the vehicles’ exterior, or issue a command for single or multiple participants to perform an action (e.g., to leave an area), among other actions. Vehicles could at times also create participant engagements that enticed them to congregate into crowds or groups, for instance by asking participants to approach the vehicle to hear an important announcement, or to pick up a bottle of complimentary water.
Additionally, evaluators with subject matter expertise (SMEs) in urban reconnaissance operations were hired to act as “system operators,” called Mission Commanders (MCs), who supervised the fleet of autonomous robots and directly controlled specific robots in the urban environment if needed. The MCs’ objective was to observe the environment, the vehicles, and the interactions between the participants and the vehicles, and to determine whether role players represented persons of interest (worth more scrutiny) or not. The MC would guide the autonomous robots in a supervisory manner, both individually and as a fleet, by selecting actions to execute from a large library of actions, and the vehicles would autonomously execute those commands.
In addition to general collection of the passive observations by the autonomous fleet, the MCs could deliberately guide the robots in enticing engagement from the participants with the goal to elicit behavioral responses that would help the human-machine team distinguish individuals as persons of interest or not. As such, the participants’ behavioral observations were gathered using a variety of sensors, and the observations were integrated into evidence chains to make recommendations for the MCs about role players. The end goal of the test was to accurately classify the role players who were assigned to play roles as threat or nonthreat. To perform fleet supervisory control, maintain situation awareness of people in the scene, and review collected evidence, the MCs used a map-based interface which supported all of these tasks.
Insights From Field Testing
Primary interactions
In one of the early agile engineering cycles of field testing, we evaluated participants’ sentiment and affect when interacting with the fleet system and when not interacting with the system, but passively experiencing it in the environment. To provide targeted feedback about the messages the system provided to the engineering team, we asked role players to estimate the number of messages they heard from the fleet system and report on their experience of the system. IRB approval was obtained from the University of Massachusetts, Lowell (UMass Lowell) to collect data about participants’ impressions of the system.
The participants self-reported their perceptions of the fleet system after completing a mission test run during each day of the field-testing event. Some participants completed questionnaires after more missions than others. Eight females, 10 males, and 1 unclassified person (N=19), who generally did not have hearing problems in either ear, completed the evaluation. Six of the role players had prior military/intelligence experience as government employees. The role players ranged in age from 19 to 79 years. All the role players’ primary language was English. Six role players completed high school or obtained a GED, three completed a two-year degree, eight completed a four-year degree, and two completed a master’s degree.
In general, the role players reported that they heard about five messages per mission. The role players’ sentiments while interacting with the fleet were assessed by asking how calm, comfortable, annoyed, irritated, secure, stressed, and discouraged they felt (items inspired by Mikolic et al., 1997), using 7-point Likert-type scales from Strongly Disagree (1) to Strongly Agree (7). In general, the role players on both event days neither agreed nor disagreed (Median = 4, SD = 2) that they felt calm while interacting with the Polaris Ranger, somewhat agreed (Median = 5, SD = 1) that they felt annoyed while interacting with the Polaris Ranger, neither agreed nor disagreed (Median = 4, SD = 1) that they felt secure while interacting with the Clearpath Robotics Husky, and disagreed (Median = 2, SD = 0) that they felt discouraged while interacting with the Theiss Validus Hex.
Participant sentiment while not interacting with the fleet system was evaluated in the same manner. In general, the role players on both event days agreed (Median = 6, SD = 1) that they felt calm while not interacting with the Polaris Ranger, Clearpath Robotics Husky, and Theiss Validus Hex (Median = 6, SD = 0), agreed (Median = 6, SD = 1) that they felt comfortable while not interacting with the Polaris Ranger, disagreed (Median = 2, SD = 2) that they felt annoyed or irritated while not interacting with the Polaris Ranger, strongly disagreed (Median = 1, SD = 0) that they felt stressed while not interacting with the Theiss Validus Hex, and disagreed (Median = 2, SD = 1) that they felt discouraged while not interacting with the Polaris Ranger. Participant sentiment seemed to be more positive when not interacting with the individual vehicles comprising the fleet system. Future work could help explain these results by investigating the effects of repeated exposure, for both direct and secondary interaction, on changes in these sentiments, and the degree to which these sentiments can be linked to future intentions to interact with the system over time.
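As an illustration of how the per-item summaries reported above could be tabulated, the following minimal Python sketch computes a median and sample standard deviation for one sentiment item under each context. The response values, the dictionary layout, and the helper name summarize are hypothetical and introduced only for this example; they are not the study's data or analysis code, and the sketch assumes one score per respondent per item.

```python
from statistics import median, stdev

# Hypothetical 7-point Likert responses (1 = Strongly Disagree, 7 = Strongly Agree).
# These values are invented for illustration and are NOT the study's data.
responses = {
    ("calm", "interacting"): [4, 3, 5, 4, 6, 2, 4],
    ("calm", "not_interacting"): [6, 6, 7, 5, 6, 6, 7],
}


def summarize(scores):
    """Return (median, sample standard deviation) for a list of Likert scores."""
    return median(scores), round(stdev(scores), 2)


# One summary tuple per (item, context) pair.
summary = {key: summarize(scores) for key, scores in responses.items()}

for (item, context), (med, sd) in summary.items():
    print(f"{item:>5} | {context:<16} | median={med} SD={sd}")
```

Reporting the median alongside a dispersion measure mirrors the format used in the text; with ordinal Likert data, an interquartile range would be an equally reasonable choice for dispersion.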
Secondary interactions
In another field test, we evaluated the effects of observing a direct interaction with the system on bystanders’ subsequent interactions, thus exploring whether they transitioned from secondary interactor to primary interactor. Specifically, we assessed whether the presence of a leader in a group and the size of the group influenced a participant role player to engage in an interaction with a robotic system across six different instances of interaction and across two test runs. Across test missions, group sizes ranged from 5 to 12 (Mean = 7.8). Two test run missions included a female leader role player, two included a male leader role player, and two had no leader role player present.
Eleven females and eighteen males (N=29) completed this field test evaluation. The role players ranged in age from 19 to 75 years. Seventeen role players completed high school or obtained a GED, six completed a two-year degree, three completed a four-year degree, two completed a master’s degree, and one completed a professional degree. IRB approval was obtained from UMass Lowell to collect data about participants’ behavior with and impressions of the system.
A group of role players were asked to be at a specific location in the mock urban environment at a particular time during the test mission. Once there, role players needed to decide whether they wanted to physically interact, via a tablet, with a Husky robot after hearing messages from the robot inviting them to engage with the tablet. A role player playing the part of a leader of this group was asked to engage with the Husky at all times after all messages from the robot. The other role players knew who their leaders were but were unaware that the leaders were asked to engage with the Husky.
The results suggest that first observations of interactions with the system were important for supporting subsequent direct interactions. The female and male leader conditions had higher engagement frequencies (45.5%, N=4) during Mission 1 than the no-leader condition (39%, N=2), whereas the no-leader condition had higher engagement frequencies (62.5%, N=2) during Mission 2 than the female leader condition (50%, N=2). Data from the male leader condition in Mission 2 were omitted due to technical errors.
The large group condition (11 or 12 role players) had slightly higher engagement frequencies (44.7%, N=3) during Mission 1 than the small group condition (5 or 6 role players; 42%, N=3), whereas the small group condition (81.5%, N=2) had higher engagement frequencies during Mission 2 than the large group condition (31%, N=2). Future evaluations are needed to determine whether similar patterns are observed with larger data sets and in more controlled settings. These data, however, may suggest that role players leveraged their leaders and larger groups to inform whether to engage with the Husky during Mission 1, but did not need the same conditions or information for subsequent missions. This is an important finding because results from earlier testing cycles suggested that participants felt (self-reported) pressure to comply with robots’ messages because other participants complied. However, in those settings, participants were not organized into groups or assigned designated group leaders. It is possible that, for early interactions with robots comprising the fleet, the first observations of direct interactions may be important for transitioning from secondary observation to primary interaction, especially if those interactions involve individuals who may be perceived as trusted.
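To make the comparison of engagement frequencies concrete, the sketch below tallies an engagement rate per mission and condition from a log of individual interaction opportunities. It assumes that an engagement frequency is the percentage of opportunities in which a role player engaged with the tablet, which is one plausible reading rather than a definition stated for this study. The observations list, the condition labels, and the function name engagement_rates are hypothetical.

```python
from collections import defaultdict

# Hypothetical log of interaction opportunities: (mission, condition, engaged).
# Invented for illustration only; these are NOT the study's observations.
observations = [
    (1, "leader", True), (1, "leader", False),
    (1, "leader", True), (1, "leader", False),
    (1, "no_leader", True), (1, "no_leader", False),
    (1, "no_leader", False), (1, "no_leader", False),
    (2, "leader", True), (2, "leader", False),
    (2, "no_leader", True), (2, "no_leader", True),
    (2, "no_leader", False), (2, "no_leader", True),
]


def engagement_rates(rows):
    """Return {(mission, condition): percent of opportunities with engagement}."""
    counts = defaultdict(lambda: [0, 0])  # key -> [engaged count, total count]
    for mission, condition, engaged in rows:
        counts[(mission, condition)][0] += int(engaged)
        counts[(mission, condition)][1] += 1
    return {key: 100.0 * hit / total for key, (hit, total) in counts.items()}


rates = engagement_rates(observations)

for (mission, condition), pct in sorted(rates.items()):
    print(f"Mission {mission} | {condition:<9} | {pct:.1f}% engaged")
```

Keying the tally by both mission and condition keeps the first-exposure and later-exposure rates separate, which is the contrast the findings above turn on.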
Tertiary interactions
In the last testing cycle, we mimicked our field-testing methods in a laboratory environment due to disruptions in testing caused by the COVID-19 pandemic. Participants were brought into the lab one at a time to complete a series of tasks that involved sorting, organizing, and moving household objects from one area in the lab to another. Periodically, an Omron ground robot would interrupt their tasks to entice interaction with the system. We investigated how different organizational identities governing the system influenced interactivity with the system and perceptions of the identity governing the system. In one case, the organization representing the fleet was suggested to be humanitarian in nature, and in the other, it was suggested to be responsible for security operations. Specifically, role player participants were asked to identify whom they perceived as the “authority” governing the operations of the robot. Half of the role player participants were introduced to the system as representing a humanitarian identity similar to a non-governmental medical organization, and the other half were introduced to the robot as representing a security identity similar to a U.S. military unit conducting security operations. All participants listened to several messages from the Omron robot requesting (in the case of the humanitarian organizational identity) or instructing (in the case of the security organizational identity) their engagement with the robot by interacting with a tablet mounted on the robot.
Eleven females and fifteen males (N=26) completed the evaluation. The participants ranged in age from 18 to 71 years. Nineteen participants completed high school or obtained a GED, two completed a two-year degree, two completed a four-year degree, and three completed a master’s degree. IRB approval was obtained from UMass Lowell for the evaluation procedures. The most common reason participants gave for interacting with the humanitarian robot identity was that they were interested in or curious about the system or the information or content of its messages, whereas the participants who interacted with the security robot identity felt that the system was attention-getting, alarming, forceful, urgent, threatening, and annoying, and that they felt obligated to engage. The most common reasons participants in both identity sessions gave for not fulfilling the robot’s instructions were that the robot’s message was provided too frequently, that there was not enough variability in its messages, or that they had already engaged with the system.
Across both organizational identity conditions, the identity that participants most often described as responsible for the fleet was the “robot,” “vehicle,” “machine,” or “message” itself. These findings seem to suggest that the entity that participants physically interact with may be more salient than the larger organization that the robot represents as a fleet. In this case, repeated direct interactions with the robot did not seem to affect the perception of the larger organizational identity governing the robot. However, in our testing environment, the robot did not fail in its actions or otherwise behave problematically. It is possible that if the robot had failed in problematic ways, people might have been more inclined to try to understand the organization governing the robot, perhaps for legal or other reasons.
User/operator interactions
We also asked MCs (representing user operators of the fleet) to provide feedback on the design of an interface created to supervise the fleet. Demographic information about the MCs is currently unavailable. Overall, the MCs were moderately satisfied with the user interface provided for them, rating it on a Likert-style scale at 4.6 out of 7 for satisfaction with the main (spatial) view, 5 out of 7 for satisfaction with the control of the fleet, and 4.4 out of 7 for satisfaction with the evidence review capabilities. However, they rated the perceived usefulness of observing role player interactions with the robots for understanding whether role players presented an imminent threat at only 3.67 out of 7, or somewhat useful. In informal interviews, they disclosed that it was hard to understand how people reacted to the vehicles, especially if they were not viewing the video feeds but rather were relying on the information aggregated and presented to them by the fleet system. To the best of our knowledge, they were not aware of the secondary and tertiary interactions as they unfolded during the mission runs. There appears to be a disconnect between designing robots for the intended interactants or bystanders of the system and designing the interfaces for control and monitoring of such robots. In our system, there was no way for the operator to directly interact with the role players via the robots, and, possibly as a consequence, role players identified the robot as the authority figure. In future work, it would be interesting to explore how telepresence affects operator understanding of the role player interactions, and vice versa, and how it affects the perception of robot identity and secondary interactions. More work is needed to better understand how we can assist user interactors like fleet supervisors to transition from monitor to direct interactor with other primary and secondary interactors. Feedback from the target user population indicates that having this ability may be helpful in the future.
Discussion and Conclusions
Primary interactions can have two sides: controller-supervisor interactions and user interactions. Novel methodologies are needed to support designs addressing these dual-nature robots and operator interfaces, where the operator is just as aware of the robot’s effect on people as people are of the intent of the robots. Direct primary interactions could use support to entice people to move from bystander to interactor. The results of the field tests show that there are many cases in which participants may have found the robots’ announcements uncomfortable or annoying when engaging with machines in the fleet, or in which they did not perceive an authority that would require or entice them to interact. However, the people around the system, in previous paradigms often considered only as bystanders, may be the key to garnering engagement, especially for early exposures to the fleet. The results of secondary interactions provide some early evidence (worth future exploration) that people may be leveraging their groups and specific individuals within them to make decisions about whether to become primary interactors.
The HF/E community can help to ensure that the design and usage of these systems is optimized for the different layers of human interaction across the fleet. The lack of perception of an authority or overarching organization that subsumes the entire fleet seems to be a difficult challenge when dealing with fleet autonomy. Participants in testing runs did not seem to perceive any one organization that was responsible for the system and MCs seemed to suggest that being able to communicate this information may have been helpful. New interaction paradigms like including the ability to add elements of telepresence may be promising here and represent a way to help user operator/supervisors to make the transition from supervisor to direct interactor. More work will be needed to determine how to optimally use telepresence in these cases and support good human performance when making these transitions (e.g., balancing workload and situation awareness for operators).
Right now, our findings seem to suggest that supporting comfort and minimizing negative sentiment is and will continue to be a difficult challenge. One area in which the science of HF/E may be able to contribute is in signaling to humans that the system recognizes that the people in the environment are not merely obstacles. It might be helpful to communicate that the system has awareness of people and is actively working to make its presence in the environment comfortable and usable. Similar calls have been made in the HF/E community in the past, including calls for building etiquette into autonomous machines (Parasuraman et al., 2004). It is important not to discount the value of prioritizing the ease of these primary interactions and the value of bystanders seeing these interactions go well, especially for early encounters with the system. Findings suggest that in some cases people used the sight of groups and leaders interacting with the system as information to guide their own subsequent engagement.
Clearly, much more work is needed to support fleet interactions, and the HF/E community has the expertise to contribute to this work. Although these findings provide early insights, the work has a number of limitations. These studies were conducted throughout a series of field and engineering tests over the course of many iterative design cycles. Thus, it was difficult to systematically isolate variables with the same level of control that would be available in laboratory environments. Further, for testing that could be conducted in the lab, the spread of the COVID-19 virus meant that sample sizes were small. However, we believe the insights presented here can be a starting place for systematic investigations to be brought into the laboratory or conducted more systematically in the field. Additionally, considering the needs that support user transitions through the types of interactors is novel in much of the literature on human interactions with robots. This may serve as a starting place for advancing theory and practice for enabling future users to transition between types. Doing so will be increasingly necessary as the number of use cases for fleet autonomy and human-machine interactions with fleets grows across a variety of applications.
Acknowledgements
This research was performed under Defense Advanced Research Projects Agency (DARPA) contract #HR001120C0180, Urban Reconnaissance through Supervised Autonomy (URSA). The views, opinions, and/or findings expressed are those of the author(s) and should not be interpreted as representing the official views or policies of the Department of Defense or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon.
