Abstract
The emergence of Advanced Air Mobility (AAM) has seen the design of new electric vertical takeoff and landing (eVTOL) aircraft that will be utilized to serve the AAM market and associated use cases. Before operations commence, aircraft manufacturers must obtain a government-issued type certificate proving that the new aircraft meets prescribed safety levels. Human factors is an important aspect of AAM certification as the proposed designs have significant changes to how pilots will interact with the aircraft. Published certification standards create an opportunity for human factors researchers to generate meaningful ties between the research being conducted in their laboratories and the work being conducted in the industry to prove the effectiveness and safety of AAM systems. To facilitate this, the current effort identified human factors language used in the certification documents that is related to information processing on the flight deck and relevant to AAM aircraft. This language was then mapped to constructs and associated measures studied in the literature. This mapping can serve as a guide for human factors researchers to ensure the relevancy of the research being conducted for this emerging domain.
Introduction
The aviation community is currently aiming to provide air mobility as an alternative for everyday commute requirements through emerging concepts such as Advanced Air Mobility (AAM). AAM represents a rapidly growing ecosystem that will provide passenger and cargo transportation within urban and suburban areas in the low-altitude National Airspace System (NAS; Chancey et al., 2021). Original equipment manufacturers (OEMs) have proposed the launch of passenger AAM services through electric Vertical Take-off and Landing (eVTOL) aircraft using battery-powered propulsion technology (Goyal et al., 2021). These new eVTOL aircraft designs target simplified vehicle operations (SVO) in which inceptors and advanced flight displays are proposed to lower pilots’ workload and enhance situational awareness (SA) and flying performance (Osman et al., 2022).
This is an exciting time for aviation human factors researchers as there is an abundance of research questions to examine. However, there are several challenges associated with conducting research in this space. First, the industry is rapidly evolving with OEMs entering and exiting the market, resulting in various shifts in technology focus (e.g., design of both hybrid and electric propulsion vehicles, at varying levels of automation). Second, given the competitive nature of the industry, OEMs are sharing very limited information about their proprietary designs and associated testing. Third, because this is primarily a commercial endeavor, there is limited government funding for human factors research in AAM. As such, it is difficult to develop a program of human factors research that one can be confident will have a lasting impact on this emerging industry.
There is an opportunity to leverage information from the regulatory environment. One of the key challenges facing OEMs is that 14 Code of Federal Regulations (CFR) Part 21 requires all new aircraft designs to obtain a Type Certificate, which is the approval of the aircraft design and all of its components (FAA, 2022a). Before issuing this certificate, which attests that the aircraft meets stipulated safety levels, the FAA conducts a comprehensive review of the proposed design, ground/flight tests, and an evaluation of the aircraft’s maintenance/operational suitability (FAA, 2023a).
Human factors is a key part of the certification process for AAM OEMs as the proposed designs have significant changes to how pilots will interact with the aircraft. Airworthiness criteria for OEMs such as Joby Aero, Inc. and Archer Aviation contain language such as, “The pilot compartment, its equipment, and its arrangement to include pilot view must allow each pilot to perform their duties…without excessive concentration, skill, alertness, or fatigue.” (FAA, 2022b, p. 67407). This sets a standard that OEMs must meet to obtain certificates and creates an opportunity for human factors researchers to generate meaningful ties between research being conducted in their laboratories and work being conducted in industry to prove the effectiveness and safety of AAM systems. To facilitate this, the current effort identified human factors language used in certification documents relevant to AAM aircraft and mapped it to constructs and measures in the literature to serve as a guide for human factors researchers to ensure the relevancy of the research being conducted for this emerging domain.
Methods
First, we identified publicly available certification documents relevant to eVTOL aircraft. At the time of this writing, two FAA airworthiness criteria were available to the public: those for Joby Aviation S4 and Archer Aviation Midnight aircraft (FAA, 2022b, 2022c). We also identified 14 CFR Part 23, under which the FAA will type-certify the current generation of eVTOLs as special-class aircraft. Second, we reviewed each document and identified human factors key terms utilized to guide us toward relevant constructs. We focused on language related to information processing on the flight deck; language related to physical ergonomics was considered out of scope for this study. Four human factors researchers reviewed the three certification documents independently, each creating a list of key terms and associated document sub-sections and excerpts. The researchers then met to review independent findings and reach a consensus on the final list of key terms. Third, to understand the intent of the language used in the certification documents, which do not typically contain definitions, we reviewed additional FAA and industry-related reports (e.g., FAA, 1998, 2002, 2011a; Yeh et al., 2016) and held another consensus meeting to determine the appropriate interpretation of these terms. Fourth, we mapped the key terms used in the certification documents to human factors constructs in the extant literature. In several cases, this was a direct mapping; in other cases, we relied on our interpretation to identify the human factors construct that mapped most closely to what we believe was the FAA’s intent. In this process, we reviewed the extant literature that aligned with the terms and our interpretations to identify construct definitions and associated measures. Fifth, we developed use-case scenarios and identified metrics/instruments to facilitate empirical evaluation in an AAM research context.
Findings
A total of nine human factors terms related to information processing were identified from the certification documents and mapped to the constructs and measures in the extant literature. This is a non-comprehensive list of terms we believe are most relevant to information processing on the flight deck, including (a) exceptional skill, (b) excessive alertness, (c) excessive mental fatigue, (d) error minimization, (e) monitoring, (f) excessive concentration, (g) distraction, (h) discernibility, and (i) workload. The following paragraphs include (1) a description of the FAA language used, (2) our interpretation of this language, (3) the associated human factors construct from the literature, (4) construct measures, and (5) use case scenarios for assessment in an AAM context.
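One way to make the mapping in the following subsections easier to query is to hold it as a small lookup table. The sketch below is only an illustration: the names `TERM_MEASURES` and `terms_sharing` are hypothetical, and the measure labels are condensed shorthand for the measures named in each subsection rather than language taken from the certification documents.

```python
# Illustrative lookup: certification term -> example measures.
# Labels are condensed shorthand (an assumption), not FAA language.
TERM_MEASURES = {
    "exceptional skill": ["performance measures", "test batteries",
                          "flight hours", "certificate type"],
    "excessive alertness": ["eye blink frequency", "error rate",
                            "response rate", "Stanford Sleepiness Scale"],
    "excessive mental fatigue": ["Samn-Perelli Fatigue Scale",
                                 "heart-rate variability", "reaction time"],
    "error minimization": ["Human Error Template", "response time",
                           "accuracy", "error rate"],
    "monitoring": ["gaze data", "event detection", "SAGAT"],
    "excessive concentration": ["pupil dilation", "gaze data", "NASA-TLX"],
    "distraction": ["gaze data", "detection task"],
    "discernibility": ["response to visual cues", "SAGAT"],
    "workload": ["NASA-TLX", "Cooper-Harper scale",
                 "primary/secondary performance", "heart rate"],
}


def terms_sharing(measure):
    """Return the terms whose example measures include `measure`, sorted by name."""
    return sorted(term for term, measures in TERM_MEASURES.items()
                  if measure in measures)


# Which terms could a single eye-tracking data set speak to?
print(terms_sharing("gaze data"))
```

A lookup of this kind might help when planning a study, since one instrument (for example, an eye tracker) can serve as a dependent measure for several certification terms at once.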
Exceptional Skill
The term “exceptional skill” is used in 14 CFR Part 23, for example: “… aircraft is capable of continued controlled flight and landing, possibly using emergency procedures, without requiring exceptional pilot skill” (FAA, 2023b, p. 178), and similarly in other documents (FAA, 2022b, 2022c). We believe that this terminology refers to the skill level associated with a typical, proficient, appropriately certified, and rated pilot, not a novice pilot who recently received a certificate, nor a test or aerobatic pilot with exceptional skills. Elite pilots with competitive/test flying experience have been shown to perform better with respect to cognitive ability than other pilots with similar hours and non-pilot participants (O’Hare, 1997). Pilot skill can be measured utilizing performance measures (Hunter, 2003), test batteries (AlMamari & Traynor, 2019), and flight experience in flight hours and type of pilot certificate (Schriver et al., 2017), although an equivalent type of experience is important. AAM researchers should aim to utilize a sample of pilots who are comparable to the target population. For instance, if the target population is certified, powered-lift, commercial pilots, then samples should include pilots with similar skills and experience to those who will eventually be certified in this category, not pilots-in-training or expert test pilots. There is discussion of the potential for AAM pilots to receive less training than commercially certified pilots, so it might be beneficial for AAM researchers to examine differences between commercial pilots and more/less experienced pilots. However, others argue that the required training may not be less but may target a different skill set. Thus, ensuring that participant experience and/or training aligns with the appropriate skill set will be important for the generalizability of study findings. Regardless, it is important to ensure appropriate participant skills either by having participants self-report certificate ratings and flight hours or by assessing their skills in a simulator.
Excessive Alertness
The term “excessive alertness” is used in 14 CFR Part 23, for example: “The pilot compartment…must allow each pilot to perform his or her duties, … and perform any maneuvers within the operating envelope of the airplane, without excessive…alertness” (FAA, 2023b, p. 178), and similarly in other documents as “exceptional alertness” (FAA, 2011b, p. 65, 2022b). We believe excessive alertness refers to attention/vigilance required to attend to, detect changes in, and respond to multiple sources of information that exceeds the typical levels required to complete the flying tasks. Alertness is defined as a state of sustained attention that enables an individual to respond effectively to both internal and external stimuli (Souman et al., 2018) and has been associated with vigilance and described as being in opposition to boredom (Borghini et al., 2014). Alertness can be measured physiologically (e.g., eye blink frequency), by performance (e.g., error rate and response rate), and with self-report instruments (e.g., the Stanford Sleepiness Scale; Oken et al., 2006). Alertness is of particular interest in AAM because the proposed operational concepts, such as shorter flights with several repetitive legs, the majority of which are in high-risk, high-workload phases of flight (e.g., take-off, landing) in close proximity to hazardous terrain, may prove a challenge for alertness. AAM researchers should examine the impacts on pilots’ alertness within this operational context. For example, in examining novel displays conveying the state of automated systems, researchers could monitor pilots’ blink rate over the course of several back-to-back missions, then introduce an off-nominal state to determine if the pilot detects and responds to the anomaly in a timely manner.
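The blink-rate example above could be reduced to a per-leg summary along the following lines. This is a minimal sketch under stated assumptions: the function name `blink_rate_per_leg`, the input format (blink timestamps in seconds plus a list of leg start/end times), and the sample numbers are hypothetical and are not drawn from any cited study.

```python
def blink_rate_per_leg(blink_times_s, legs_s):
    """Return blinks per minute for each (start, end) flight leg, in seconds."""
    rates = []
    for start, end in legs_s:
        if end <= start:
            raise ValueError("each leg must have end > start")
        # Count blinks that fall inside the half-open interval [start, end).
        count = sum(1 for t in blink_times_s if start <= t < end)
        rates.append(count / ((end - start) / 60.0))
    return rates


# Hypothetical data: three back-to-back 2-minute legs.
blinks = [5, 20, 41, 63, 90, 110, 130, 170, 200, 250, 300, 350]
legs = [(0, 120), (120, 240), (240, 360)]
rates = blink_rate_per_leg(blinks, legs)
print(rates)
```

The per-leg rates could then be examined alongside the time taken to detect the off-nominal event; how a change in blink rate should be interpreted is left to the literature cited above.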
Excessive Mental Fatigue
In opposition to alertness is mental fatigue, a term used in 14 CFR Part 23, for example: “The pilot compartment must allow the pilot to perform their duties without excessive…fatigue…” (FAA, 2023b, p. 192), and similarly in other documents (FAA, 2002, 2022b). We believe that fatigue pertains to the pilot not having the mental capacity to perform their duties due to extended exposure to workload, vibrations, excessive alertness, or sustained physical exertion. Fatigue is defined as a physiological state of reduced mental and physical performance capability resulting from sleep loss, extended wakefulness, circadian phase, and workload (Phillips, 2015). Mental fatigue is associated with increased reaction time and with decreased cognitive flexibility and hand-eye coordination (Wingelaar-Jagt et al., 2021) and can be described as the absence of alertness. Fatigue can be assessed using the Samn-Perelli Fatigue Scale (Gander et al., 2013), physiologically (e.g., heart-rate variability; Matuz et al., 2021), and with reaction times. Researchers assessing AAM technology/operational procedures should assess the impact of the cognitive/physical workload demands on pilot fatigue. Participants could perform back-to-back flights utilizing realistic cockpit configurations with events to which a pilot must respond, while heart-rate variability and reaction time are monitored, followed by administration of the Samn-Perelli Fatigue Scale after the flights.
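Heart-rate variability can be summarized with several indices, and the text above does not commit to one. As a hedged illustration only, the sketch below computes RMSSD (the root mean square of successive differences between inter-beat intervals), a common time-domain index. The choice of RMSSD, the function name `rmssd`, and the sample intervals are assumptions made for this example.

```python
import math


def rmssd(rr_intervals_ms):
    """Root mean square of successive differences of RR intervals (ms)."""
    if len(rr_intervals_ms) < 2:
        raise ValueError("need at least two RR intervals")
    # Differences between each interval and the one that follows it.
    diffs = [b - a for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Hypothetical inter-beat intervals (ms) from one simulated flight.
rr_flight = [800, 810, 790, 805, 795]
print(round(rmssd(rr_flight), 2))
```

Computing the index separately for each back-to-back flight would give a series that can be compared with reaction times and the post-flight Samn-Perelli rating.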
Error Minimization
The term “minimize error” is used in 14 CFR Part 23, for example: “The system and equipment design must minimize flightcrew errors, which could result in additional hazards” (FAA, 2023b, p. 192), and similarly in other documents (FAA, 2022b, 2022c). We believe minimizing errors means reducing design-induced errors by ensuring error-prone facets of the system are designed out or by enhancing pilot recovery from errors. For example, an evaluation of touchscreen controls in the flight deck found that the error rate increased when pilots used the screen for data entry tasks (Dodd et al., 2014). Potential errors can be identified using the Human Error Template (Stanton et al., 2014), which can guide the design of simulation scenarios to elicit expected and unexpected errors, including infrequently expected scenarios. Further, the hazard caused by the error can be assessed, and how adequately a display enables the pilot to perform the task and to respond to, mitigate, and recover from such errors can be evaluated. Measures such as response time, accuracy, and error rate can be recorded. In the AAM context, some eVTOLs have a transition phase (e.g., during landing) from wing-borne flight to hover in which airspeed management can be challenging. Although this will likely eventually be fully automated, in initial manned flights, pilots must make a mental shift to recognize how the effects of aircraft attitude on airspeed change throughout this transition phase (Emerson et al., 2022). In such a scenario, researchers could measure errors in airspeed management and the pilot’s ability to recover.
Monitoring
The term “monitor” is used in 14 CFR Part 23, for example: “…information must be presented in a manner that the crewmember can monitor the parameter and determine trends…” (FAA, 2023b, p. 192), and in other documents in a similar fashion (FAA, 2022b; Yeh et al., 2016). We believe it refers to the pilot’s ability to observe and interpret information. Monitoring is defined as a sense-making process (Mumaw et al., 2020) and as the systematic observation and interpretation of the current state of the aircraft and its operational environment through the information presented on the flight deck (Isaac, 2014). Pilot monitoring can be assessed via gaze data (e.g., number/duration of fixations), event detection (e.g., reaction time, correct responses), and SA measures such as the Situation Awareness Global Assessment Technique (SAGAT; Mumaw et al., 2020). In AAM, many proposed vehicles are highly automated, shifting the pilots’ role to primarily monitoring. To assess the impact on pilots’ monitoring, researchers can use eye tracking to collect gaze data during simulated scenarios and assess the accuracy of responses to SAGAT queries and to off-nominal events.
Excessive Concentration
The term “excessive concentration” is used in 14 CFR Part 23, for example: “The pilot compartment, its equipment, and its arrangement…must allow each pilot to perform his or her duties…without excessive concentration” (FAA, 2023b, p. 192), and similarly in other documents (FAA, 2022b, 2022c). We believe excessive concentration refers to the pilots’ need to focus increased attention on one task, reducing the ability to multitask when required and creating the potential for cognitive tunneling. Concentration is defined as the ability of an individual to focus their attention on the task at hand without getting distracted by irrelevant stimuli (Wilson et al., 2006). With excessive concentration, the concern is increased selective attention demands; this differs from excessive alertness, which we believe is related more closely to increased divided attention demands. Pilots’ concentration can be measured using pupil dilation to signal attentional effort (Alnæs et al., 2014), eye gaze data, and the NASA Task Load Index (TLX). In the AAM context, pilots will be required to concentrate on a range of cues, and interface elements must not preclude the ability to multitask. Researchers can evaluate the effects of displays and procedures on a pilot’s ability to concentrate on multiple pieces of information in a simulated scenario in which the pilot has to multitask (e.g., monitoring flight path information during dense traffic and hazardous terrain, amidst non-critical flight deck alerts). Concentration can be measured by capturing the duration of eye fixation on each piece of information to ensure appropriate attention allocation, pupil dilation measures, and behavioral responses to events.
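Attention allocation from fixation data could be summarized as the share of total fixation time spent on each area of interest (AOI). The sketch below assumes fixations have already been detected and labeled by the eye-tracking software; the function name `dwell_share`, the AOI labels, and the durations are hypothetical.

```python
from collections import defaultdict


def dwell_share(fixations):
    """Map each AOI to its fraction of total fixation duration.

    `fixations` is an iterable of (aoi_label, duration_ms) pairs.
    """
    totals = defaultdict(float)
    for aoi, duration_ms in fixations:
        if duration_ms < 0:
            raise ValueError("durations must be non-negative")
        totals[aoi] += duration_ms
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {aoi: t / grand_total for aoi, t in totals.items()}


# Hypothetical fixations from one multitasking scenario.
fixations = [("flight_path", 400), ("traffic", 300),
             ("alerts", 100), ("flight_path", 200)]
shares = dwell_share(fixations)
print(shares)
```

A share heavily concentrated on a single AOI during a scenario that requires multitasking could be one indication of the cognitive tunneling concern described above, though any threshold would have to be justified by the study design.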
Distraction
In opposition to concentration is the concept of distraction, used in 14 CFR Part 23, for example: “Residual control forces must not fatigue or distract the pilot during normal operations of the aircraft and likely abnormal or emergency operations” (FAA, 2023b, p. 181), and similarly in other documents (FAA, 2022b, 2022d). We believe distraction refers to any situation or event that diverts a pilot’s attention away from critical tasks. Distraction can arise from sources such as physical exertion (e.g., residual control forces), multimodal cues, or physiological factors (e.g., stress). In driving research, distraction is defined as a situation in which attention is diverted from the primary task due to an event that compromises a person’s auditory, biomechanical, cognitive, or visual faculties (Pettit et al., 2015). Distraction can be measured using eye fixations on displayed information (Chen et al., 2019) and by a detection task in which random targets are presented in the periphery while executing a primary task (Olsson & Burns, 2009). Researchers could assess the impact of AAM operational procedures on pilot distraction in scenarios in which passengers engage the pilot and ATC communications (e.g., clearances) occur while the pilot is focused on a primary flight task (e.g., monitoring hazardous traffic or acute weather changes). Distraction could be assessed using fixation data on relevant/irrelevant information, and accuracy of, and time to respond to, relevant/irrelevant events.
Discernibility
Discernibility is used in 14 CFR Part 23, for example: “There must be a discernible means of providing system operating parameters required to operate the airplane, including warnings, cautions, and normal indications to the responsible crewmember” (FAA, 2023b, p. 192), and similarly in other documents (FAA, 2022d; Yeh et al., 2016). We believe discernibility refers to a pilot’s ability to perceive and interpret sensory information to make decisions and take appropriate actions. Perception has been measured by recording pilots’ responses to visual cues like runway obstructions (Grady & Thompson, 2017) and with the SAGAT (level 1; Keith et al., 2014). In AAM, an area in which pilot discernibility will be critical is emerging battery information displays. Battery power management will be different from fuel management, and there is a range of displays under development that attempt to provide pilots with the parameters necessary to discern battery power and time remaining for the mission. Methods such as showing pilots mockups of battery displays representing various states and asking pilots to interpret the information and report how they would proceed with the mission could assist in evaluating discernibility. More advanced simulation studies could assess how well pilots can discern the meaning in more dynamic settings based on the pilots’ resulting decisions and response times, or responses to SAGAT queries.
Workload
The term “workload” is used in 14 CFR Part 23, for example: “No airplane may exhibit any divergent longitudinal stability characteristic so unstable as to increase the pilot’s workload…” (FAA, 2023b, p. 181), and similarly in other documents (FAA, 2005, 2022b) with a focus on workload minimization. We believe that workload refers to how much physical/cognitive work a pilot must do versus the resources they have available to perform a specified task. Workload is defined as the demands on an operator’s mental and physical resources when completing a task (Webb et al., 2010), and high workload levels can degrade performance. Workload can be measured using the NASA-TLX or the Cooper-Harper scale, primary/secondary performance measures, and physiological measures (e.g., heart rate; Webb et al., 2010). In AAM, increased automation is proposed to result in lower pilot workload; however, advanced automation may create novel cognitive workload demands even with the reduction of human tasks. Researchers could assess workload throughout simulated missions as the pilots interact with automated systems/autonomy, through real-time monitoring of heart rate and performance and with the NASA-TLX.
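For reference, NASA-TLX responses are commonly scored either as the unweighted mean of the six subscale ratings (often called Raw TLX) or as a weighted mean in which each subscale's weight is the number of times it was chosen across the 15 pairwise comparisons. The sketch below illustrates both; the function names, the subscale keys, and the sample ratings and weights are hypothetical.

```python
SUBSCALES = ("mental", "physical", "temporal",
             "performance", "effort", "frustration")


def raw_tlx(ratings):
    """Unweighted mean of the six subscale ratings (0-100 each)."""
    return sum(ratings[s] for s in SUBSCALES) / len(SUBSCALES)


def weighted_tlx(ratings, weights):
    """Weighted mean; weights are pairwise-comparison tallies summing to 15."""
    if sum(weights[s] for s in SUBSCALES) != 15:
        raise ValueError("weights must sum to 15")
    return sum(ratings[s] * weights[s] for s in SUBSCALES) / 15


# Hypothetical responses from one participant after one simulated mission.
ratings = {"mental": 70, "physical": 20, "temporal": 60,
           "performance": 40, "effort": 50, "frustration": 30}
weights = {"mental": 5, "physical": 0, "temporal": 4,
           "performance": 2, "effort": 3, "frustration": 1}
print(raw_tlx(ratings), round(weighted_tlx(ratings, weights), 2))
```

Whichever variant is used, reporting it explicitly makes scores easier to compare across studies, since the weighted and unweighted values can differ noticeably for the same participant.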
Conclusion
The goal of this effort was to provide insight to human factors researchers in the AAM domain regarding constructs on which to focus their research to increase relevancy to the industry. This effort aimed to provide guidance to AAM researchers by analyzing FAA certification requirements to which all AAM OEMs will be held, extracting relevant human factors terms, and identifying associated constructs and measures from the literature. The resulting terms include (a) exceptional skill, (b) excessive alertness, (c) excessive mental fatigue, (d) error minimization, (e) monitoring, (f) excessive concentration, (g) distraction, (h) discernibility, and (i) workload. These findings can guide researchers in the selection of dependent variables to ensure research findings are relevant to industry OEMs and regulators. For example, assessing the impact of various automation schemes on pilot monitoring behaviors, fatigue, and the alertness required to detect anomalous situations can help ensure the results of the study are helpful in the design, development, and certification process. Included in this review are measures and use cases to assist in implementation. It should be noted that this is not a comprehensive list of all human factors terms relevant to certification; rather, it is a consolidated list of terms we believe are most relevant to information processing on the flight deck. We did our best to capture what we believe is the intent in the certification documents, but these are our interpretations and may not accurately reflect the intent of the FAA. Finally, this manuscript is not meant to be prescriptive, but to serve as a guide to bridge the gap between academic research and industry practice in this emerging domain.
