Abstract
Three aspects of Kaber’s paper are discussed: (a) the origins of the level-of-automation concept as related to various misconceptions in the literature regarding the intent of the original paper; (b) distinctions between descriptive, predictive, presumptive, and normative models; and (c) the difficulty, even impossibility, of making level-of-automation taxonomies into readily useful tools for system design.
Introduction
Kaber (2018 [this issue]) has provided a thoughtful and extensive discussion of how various investigators have struggled to make the level-of-automation (LOA) concept relevant to design of human–automation interaction. Kaber confronts many aspects of integrating human performance with that of automation, including the usefulness of current and relatively coarse LOA taxonomies in design, tendencies of humans to satisfice, identity of responsibility between human and machine for outcome, and so forth. I agree with most of Kaber’s arguments, but would comment on three aspects of the ongoing LOA discourse that might be helpful. First, I believe it is important to mention the origins of the LOA concept and the ensuing misinterpretations of the original intentions. Second, I would add some caveats about presumptive versus descriptive models and their role in making science useful (predictive) in system design. Third, I suggest that refining LOA taxonomies to become readily applicable design tools is likely to be an unachievable endeavor.
Origins of LOA
Having been credited with originating the idea of LOA (Sheridan & Verplank, 1978), as well as the generic and related notion of supervisory control (Sheridan, 1976), I would claim that in that first LOA publication there was never any intention to be descriptive of any particular experimental data. Further, the 10-point scale was never intended to be normative in any strict sense. I think the reason it caught on was that it was based on simply worded categories that were merely suggestive and enabled users to make whatever inferences were relevant to their problem. We surely know that all models are literally wrong in the sense of being incomplete: There are just too many parameters of the real world that cannot be included. The simpler the model, the more easily a designer can use it, not as an algorithm to get a final answer, but as a metaphoric tool to provoke disciplined thinking about the specific design problem at hand.
The motivation for our initial suggestion of the LOA idea was as follows: In July 1969, astronauts first walked on the moon, and there was much in the newspapers about computers and automation, but the clear implication was that the astronauts had steered the lunar lander to a safe spot themselves. Was the landing automated, or was it not? The principal motivation for that 1978 LOA suggestion was to clarify that automation is not an either–or. Having been a peripheral member of the Apollo lunar lander design team at MIT, I knew that control in some degrees of freedom was mostly automated, whereas control in others was mostly astronaut controlled, and that there were various options the astronauts could use. My student Bill Verplank and I were pondering underwater and space robots at the time, and the combination of circumstances led us to thinking that there were or could be many levels of automation, and so we somewhat arbitrarily suggested some categories, without ever claiming that the 10 levels we described were all there were, or that the scale was necessarily one-dimensional or necessarily monotonic. We only intended to suggest that there could be many different, possibly graded, levels of automation, and the simple phrases could hardly amount to descriptive or normative models.
It has been flattering and somewhat surprising that the LOA idea caught on, but at the same time disappointing that some people (no names) sharply criticized that simple 10-category scale as though it was intended to be an algorithm, or a prescription for designing automation, or whatever. Automation is an extremely broad term, covering a range of sensor, actuator, and decision technology, and an almost infinite variety of functions and ways it can interact with humans.
Descriptive Versus Presumptive Models of LOA
Kaber asserts that LOA taxonomies, as they have emerged, are “presumptive” and that descriptive models that represent how real people interact with real automation would be of significantly greater help to designers. However, Kaber does not explain in what sense LOA scales that have already been devised by various authors are presumptive. Googling “presumptive model”, one finds reference in the legal arena to guidelines for sentencing criminals based on precedent and, in the design theory arena, a technique for subjecting proposed designs to relevant stakeholders to ascertain whether there are provocations, constraints, or norms by which to shape the developing design. Thus, a presumptive model seems to me to be pretty close to what is commonly called a normative model, a model based on norms resulting from constraints operative in the considered context.
There is nothing wrong with encouraging descriptive models to record what happened in specific human–automation experiments or real-world situations, and perhaps these could be fashioned into an ordered LOA set of examples. By the usual definition, every descriptive model is a special case, so an LOA thus constituted from observed data would be a set of well-defined special cases.
Descriptive modeling customarily means that for each experiment the results are conditioned on the specific task, experimental design, and experimental parameters, which yields a descriptive model for that particular experiment. (The tendency, as is well known, is to overgeneralize the descriptive model from one’s experimental results.) I don’t deny that with sufficient replication (perhaps with some variation in task and parameters), a generalizable model might emerge that has predictive usefulness in design: Given a particular task and form of automation, certain results are likely to follow because the conditions match a prior descriptive model for that situation. Over time, a number of such descriptive models might merge into a truly useful (and likely many-dimensional) LOA scale. That, of course, would be tending toward normativity, based on the presumption of the specified class of tasks, measures, and forms of automation.
It might be worth reiterating the consensual meanings of descriptive, normative, and predictive with respect to modeling. In the strictest sense, a descriptive model says what did happen (conditional upon particular empirical data and context circumstances). A normative model says what should happen—if the underlying cause–effect relations are those assumed, that is, for defined technology and a perfectly trained and rational decision maker operating with a (given) objective function (Bell, Raiffa, & Tversky, 1988). A predictive model says what will happen. A descriptive model may be predictive if the task and parameters of the target context match those on which the model was based. A normative model is predictive if the underlying cause–effect relations (the mechanism) of the target situation match those assumed. Normative models are typically cleaner mathematically, as the cause–effect relationship is more amenable to discovery through transfer function identification techniques. It is common to adapt a normative model (e.g., in feedback control and signal detection) and infer parameters that make the model fit the given empirical data or show discrepancy from the assumed norm. We often fit empirical data to a normative model such as a mathematical function because the mathematical function is easy to specify logically, especially if there are a few parameters that can be adjusted to improve the fit. However, if there are many context parameters (e.g., as is the case with rule-based cognitive modeling), there is the danger of losing generality.
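As an illustration of adapting a normative model and inferring the parameters that fit empirical data, the following is a minimal sketch using the equal-variance Gaussian signal detection model. It is not taken from Kaber's paper or from any study cited here. The function name, the trial counts, and the log-linear correction for extreme rates are all assumptions made for the example.

```python
from statistics import NormalDist


def sdt_parameters(hits, misses, false_alarms, correct_rejections):
    """Infer (d_prime, criterion) under the equal-variance Gaussian
    signal detection model from observed response counts.

    Hypothetical helper for illustration only.
    """
    # Log-linear correction (an assumption of this sketch): keeps the
    # rates strictly between 0 and 1 so the inverse CDF is defined.
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)

    z = NormalDist().inv_cdf  # inverse of the standard normal CDF

    # Sensitivity: separation of the signal and noise distributions.
    d_prime = z(hit_rate) - z(fa_rate)
    # Criterion: 0 is the unbiased norm for equal priors and symmetric
    # payoffs; negative values indicate a bias toward responding "signal".
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))
    return d_prime, criterion


# Made-up counts for a hypothetical operator monitoring an alarm display.
d_prime, criterion = sdt_parameters(
    hits=90, misses=10, false_alarms=40, correct_rejections=60
)
print(f"d' = {d_prime:.2f}, criterion = {criterion:.2f}")
```

Here the fitted criterion plays the role of the "discrepancy from the assumed norm": the normative observer with equal priors and symmetric payoffs would sit at zero, and the sign and size of the inferred value describe how the observed behavior departs from it.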
Kaber appropriately points out that some LOAs are not appropriate to a given context. Clearly no LOA is normative for all human–automation situations. I would add an example: The currently salient Society of Automotive Engineers LOA scale (Society of Automotive Engineers, 2014) for self-driving cars implies for Levels 3 and 4 that drivers will be able to take over control in emergencies when the automation is befuddled. (Most experts doubt that drivers will be able to do so, owing to lack of readiness.) Kaber also mentions that humans do not always do what system designers expect them to do (an important factor in distinguishing descriptive modeling from normative modeling). Humans do not think like computers but operate on the basis of verbal heuristics (which is why I find fuzzy logic attractive as a means to translate from human thinking to computer algorithms). Simple, one-dimensional LOA scales are far from addressing these issues. However, as is well known, the more complex and less normative a model is, the less useful it is for the kind of metaphorical thinking so inherent in engineering design. Consider, for example, the way that simple linear models of effort, resistance, inertia, and flow permeate multiple physical domains (solid and fluid mechanics, heat, electricity) and, increasingly, economics and even human performance. The true relationship between independent and dependent variables in the real world is extremely nonlinear, but simple linear models go a long way in system design, and the same is true of many models of human performance (Sheridan, 2017).
An Unachievable Goal?
Kaber’s discussion suggests a plethora of dimensions of an LOA, as appropriate to different contexts, almost without limit. Various forms of adaptable and adaptive automation might also be included, assuming experiments including these factors and suitable descriptive models have been developed. Indeed, in the overall system design process, we have inadequate criteria for deciding which of human or machine is made responsible for setting the policy of control for different component tasks (e.g., phases of flight), which in turn determines whether a human or machine should actively adjust parameters of control on a more or less real-time basis. In other words, sometimes computers have authority over how people do control, and sometimes people have authority over how computers do control (Sheridan, 2011). The number of possible combinations of circumstance of how human–automation systems are constituted is huge. Ultimately, all the variables of human cognition and motor control combine with all the variables of computers, sensors, and actuator mechanisms, making a much larger set than either already large subset!
An LOA taxonomy based on descriptive models that is useful for human–automation design of complex systems may be a lot to hope for, simply because there are so many task contexts and varieties of automation technology, and because automation is constantly being improved—a moving target. I need only mention the recent spectacular success of deep learning (by multilayered neural nets), an automation technique that is totally context dependent. In my paper titled (with tongue in cheek) “Function Allocation: Algorithm, Alchemy or Apostasy” (Sheridan, 2000), I argue that making a science out of human–automation function allocation for system design may be an unachievable objective. Engineers will continue to employ scientific models for design decisions about physical system components, and some aspects of human–automation interaction will be aided by experimental insights. However, overall design of large-scale human–automation systems (for example, design of modern airplanes or air traffic control systems) will continue to be a matter mostly of experience, art, and iterative trial and error.
Footnotes
Thomas B. Sheridan is Professor Emeritus in the Mechanical Engineering and Aeronautics/Astronautics Departments at MIT. He is a former president of the Human Factors Society and is a member of the National Academy of Engineering.
