Abstract
Objective:
The aim of this study was to integrate empirical data showing the effects of interrupting task modality on the performance of an ongoing visual-manual task and the interrupting task itself. The goal is to support interruption management and the design of multimodal interfaces.
Background:
Multimodal interfaces have been proposed as a promising means to support interruption management. To ensure the effectiveness of this approach, their design needs to be based on an analysis of empirical data concerning the effectiveness of individual and redundant channels of information presentation.
Method:
Three meta-analyses were conducted to contrast performance on an ongoing visual task and interrupting tasks as a function of interrupting task modality (auditory vs. tactile, auditory vs. visual, and single modality vs. redundant auditory-visual). In total, 68 studies were included and six moderator variables were considered.
Results:
The main findings from the meta-analyses are that response times are faster for tactile interrupting tasks in the case of low-urgency messages. Accuracy is higher with tactile interrupting tasks for low-complexity signals but higher with auditory interrupting tasks for high-complexity signals. Redundant auditory-visual combinations are preferable for communication tasks during high workload and with a small visual angle of separation.
Conclusion:
The three meta-analyses contribute to the knowledge base in multimodal information processing and design. They highlight the importance of moderator variables in predicting the effects of interruption task modality on ongoing and interrupting task performance.
Applications:
The findings from this research will help inform the design of multimodal interfaces in data-rich, event-driven domains.
Introduction
Operators in a wide range of complex, event-driven domains, such as process control, aviation, and medicine, experience considerable attentional demands. They are required to monitor the performance of an ever-increasing number of automated systems, often resulting in data overload in the visual channel. In many cases, they also need to cope with a growing number of tasks and responsibilities. These new tasks and technologies bring with them an increased risk of interruptions of ongoing tasks and associated performance costs. The effective management of interruptions requires timely detection, accurate interpretation, and appropriate integration of interruptions while performing an ongoing task. A promising means that addresses both the challenge of data overload and the need for effective interruption management is a multimodal interface that distributes information across vision, audition, and touch (e.g., Oviatt, 2003; Sarter, 2002).
The benefit of employing multiple modalities for task and information presentation was first suggested by early research on time sharing (Navon & Gopher, 1979). This research gave rise to the multiple resource theory (MRT; Wickens, 1980, 2002, 2008), which posits that people have the ability to multitask by drawing from separate limited mental resources associated with four dimensions: processing stage, processing code, response type, and modality. With respect to the latter dimension, MRT predicts that multiple tasks and more information can be processed simultaneously if they are distributed across multiple sensory channels.
Traditionally, most information has been presented in visual form. However, the development of new tactile and auditory display technologies in the past two decades has made it possible to use nonvisual channels also. Some studies have confirmed the expected benefits of employing these modalities for interrupting tasks and messages. For example, Sklar and Sarter (1999) found that pilots on a simulated modern flight deck detected unexpected events more reliably with tactile signals than with visual signals yet performed no worse at their ongoing visual task. However, other studies have highlighted drawbacks and limitations of using or combining nonvisual sensory channels. For instance, if an interrupting task is presented in the auditory modality, it may inappropriately draw the operator’s attention away from the ongoing task (Banbury, Macken, Tremblay, & Jones, 2001; Wickens, Dixon, & Seppelt, 2005). The use of redundant modality combinations, which has traditionally been considered beneficial, can result in competition for attentional resources when the same message is presented and processed simultaneously in more than one sensory channel (Wickens, Prinet, Hutchins, Sarter, & Sebok, 2011).
These mixed findings present a challenge for designers of multimodal interfaces and motivated the present research. The aim here is to compare task performance in the context of interruption management, which Latorella (1996, 1998, 1999) defines as “the detection, interpretation, and integration of interruptions within ongoing task performance” (Latorella, 1996, p. 21). In the context of the reported meta-analyses, the ongoing task is a continuous visual task that is potentially disrupted by an interrupting task in a different modality. A prototypical scenario is a driver performing the visual-manual ongoing tasks of lane keeping and hazard monitoring who is periodically interrupted by a message from some in-vehicle device, such as a pedestrian crossing warning system.
To help designers determine which modality to use for presenting such warnings or any other potentially interrupting signal, the three meta-analyses conducted as part of this work integrated findings from numerous dual-task paradigm studies, that is, studies involving an ongoing visual-manual task and an interrupting task. Performance on both the ongoing and interrupting tasks was examined as a function of interrupting task modality. Specifically, the meta-analyses reviewed studies that compared tactile interruptions with auditory ones, visual interruptions with auditory ones, and redundant auditory-visual interruptions with auditory or visual ones.
Meta-analyses were employed because they provide numerous advantages, including seeing the “landscape” of a research domain, keeping statistical significance in perspective, minimizing wasted data, becoming intimate with the data summarized, asking focused research questions, and identifying moderator variables (Rosenthal & DiMatteo, 2001). Naturally, there are some costs of the meta-analysis technique, particularly related to the studies that are not selected for inclusion and the subjective coding of moderator variables. We discuss these potential limitations at the end of the article and note the importance of using meta-analyses and experimental results in a complementary fashion. Importantly, the analyses reported in this article employ a new meta-analytic technique, the ratio score, for synthesizing quantitative data across empirical studies (Wickens, Hollands, Banbury, & Parasuraman, 2012). Their outcomes contribute to a better understanding of multimodal information processing and help inform the design of multimodal interfaces in support of interruption management in a variety of workplaces.
Theoretical Background and Hypotheses
Prior to our research, two meta-analyses had compared information processing in the auditory and tactile modalities (Burke et al., 2006; Elliott, Coovert, & Redden, 2009). However, the studies included in those meta-analyses did not necessarily employ an interruption management paradigm. Thus, these meta-analyses did not provide a basis for making predictions about the relative effectiveness of audition versus touch for supporting multitasking.
Regarding the auditory and visual comparison, the original version of MRT predicts better performance if the ongoing and interrupting tasks are presented in different modalities. Specifically, with an ongoing visual task, better performance is expected if the interruption is auditory rather than visual because of less interference and competition for attentional resources (Wickens, 1980). At the same time, the opposite outcome would be expected because of auditory preemption; this term refers to the fact that, given the intrinsically more salient and disruptive nature of the auditory modality, an auditory signal is more likely than a visual one to capture and draw attention away from an ongoing visual task (Wickens & Liu, 1988; Wickens, Hutchins, Carolan, & Cummings, 2012).
With respect to auditory-visual redundancy, whereby the same information is presented in both channels, very few meta-analyses have surveyed the performance costs and benefits of redundant versus single-modality presentation for an interrupting and an ongoing task. The existing data offer no consistent conclusions (e.g., Wickens et al., 2011; Wickens & Gosney, 2003). Redundancy may result in increased accuracy; however, the dual information-processing load of reading and listening imposed by redundancy can delay the time to process information and therefore reduce efficiency. Furthermore, the added processing requirements of redundant information could result in a performance cost on the ongoing task.
To summarize, based on a review of existing work on multimodal task performance to date, the following hypotheses were formulated:
People will perform an interrupting task better when the two tasks are presented in different modalities.
With regard to performance on the ongoing visual task, no strong hypothesis can be offered, as the effect of interrupting task modality will ultimately depend on the relative strength of two offsetting factors: resource competition and auditory preemption. Our data will show which of these factors has a stronger impact on performance.
Presenting people with redundant auditory-visual interrupting tasks will lead to more accurate, but slower, performance for the interrupting task compared with the presentation of information using a single modality.
Redundancy is also expected to degrade people’s performance on the ongoing task.
Analytical Method
To test these hypotheses, three meta-analyses were conducted. Rosenthal and DiMatteo (2001) define a meta-analysis as
a methodology for (1) systematically examining a body of research and carefully formulating hypotheses, (2) conducting an exhaustive search and establishing inclusion/exclusion criteria for articles, (3) recording and statistically synthesizing the combined data and effect sizes from these studies, (4) searching for moderator variables to explain effects of interest, (5) and reporting results. (p. 62)
In the following sections, we briefly describe the method developed for the purpose of each of our meta-analyses, which closely parallels the aforementioned five steps of a typical meta-analysis but employs a new measure, ratio scores, to contrast performance for different task modalities and modality combinations (Wickens, Hutchins, et al., 2012).
Step 1: Formulating Hypotheses and Examining Available Literature
The four hypotheses we sought to examine are presented in the previous section. The general framework we adopted is that of interruption management (Steelman-Allen, McCarley, & Wickens, 2011; Trafton & Monk, 2007), whereby a continuous ongoing task is potentially disrupted by an interrupting task. We conducted a literature search using an iterative three-tiered approach. First, key terms (vision or visual; audition or auditory; touch, haptics, vibrotactile, or tactile; redundant or redundancy; modality; multimodal; cross-modal) were searched in Google Scholar, a number of applied journals, and other types of publications. Some examples include Army Research Laboratory Technical Reports, Ergonomics, Human Factors, IEEE Transactions on Haptics, the International Journal of Aviation Psychology, the International Journal of Human–Computer Studies, Naval Postgraduate School Technical Reports, the Proceedings of HCI International, the Proceedings of the Human Factors and Ergonomics Society, and Transportation Research Record: Journal of the Transportation Research Board.
Second, publications that were referenced in the articles found in the first tier were reviewed. Third, the tables of contents of the publications found in the first and second tiers were examined for additional relevant articles that might not have been captured in the keyword search. Overall, of the 150 journal articles, conference proceedings, and dissertations that were identified as being of potential interest, 68 (45%) were ultimately used. They were published between 1983 and 2012. Next, criteria were established to determine which of the publications should be included in the meta-analyses.
Step 2: Establishing Inclusion Criteria
The meta-analyses compare information presentation in three modalities within multitask paradigms: auditory, tactile, and visual. Redundant information presentation was also examined but was restricted to auditory-visual redundancy (A+V) because of the scarcity of data on redundant tactile modality pairings. To be included, studies needed at a minimum to involve examination of how the interruption of an ongoing visual task by another task in the same or different modality affected performance on the interrupting task. Ongoing task performance was considered if available. For example, a study for the auditory-tactile meta-analysis might report the average response time and response accuracy to an auditory and tactile warning indicating the presence of a pedestrian while a driver performs the visual driving task. Although a considerable number of studies address modality differences within the interrupting task–ongoing task paradigm, the comparisons of interest were not always all performed within a single study. Therefore, we decided to conduct three separate meta-analyses with the modality comparisons shown in Table 1, all with ongoing visual tasks.
Table 1. Overview of Modalities Compared in Each Meta-Analysis
In most studies in which there was a redundant A+V comparison, the interrupting task was auditory; only a few employed a visual interrupting task.
In the auditory-tactile (A-T) meta-analysis, we examined studies comparing the performance effects of presenting interrupting tasks in two modalities that can be used to offload vision: audition and touch. For example, Smith, Clegg, Heggestad, and Hopp-Levine (2009) compared the use of the auditory and tactile modality for alerting and orienting attention to an interrupting gauge reading task while participants were performing the ongoing visual task of identifying whether an aircraft was hostile. Note that a visual-tactile meta-analysis was not conducted because very few studies address this comparison.
In the auditory-visual (A-V) meta-analysis, we examined a larger, historically older population of studies that compare the performance effects of visual and auditory interruptions of an ongoing visual task (e.g., Wickens, 1980; Wickens & Liu, 1988). For example, Hurwitz and Wheatley (2002) examined how the appearance of the target letter P during a visual or auditory letter monitoring task affected an ongoing visual driving task. In this analysis, we considered a critical variable that is not present in the A-T meta-analysis, namely, the visual angle of separation (VAS) between the ongoing visual task and the location of delivery for the visual interrupting tasks.
In the redundant (A+V) meta-analysis, we focused on redundant auditory-visual interrupting task delivery, in contrast to auditory-only or visual-only presentations. For example, Haas, Hill, Stachowiak, and Fields (2009) compared the effects of visual and auditory-visual warnings in the context of a visual robotic planning task. This third meta-analysis included many of the same articles as the A-V meta-analysis because most of the A+V redundancy studies also contained single-modality auditory and/or visual control conditions. In addition, the redundancy meta-analysis again addressed the VAS, as defined here by the separation between the ongoing task and the visual source of information in the redundant A+V delivery.
The studies included in each meta-analysis can be found in Tables 3, 4, and 5 in the Results section. They are also denoted with * for the A-T meta-analysis, # for A-V, and + for A+V in the References. Note that the in-text citations to studies included in the meta-analyses are not marked with these symbols.
Step 3: Statistically Synthesizing the Combined Data With Ratio Scores
Typically, meta-analyses concerned with differences between two or more treatment conditions rely on Cohen’s d or Hedges’ g, whereby the effect size of each study is the difference in means divided by the pooled standard deviation (Rosenthal & DiMatteo, 2001). An important shortcoming of the effect size score is that it is an ambiguous measure, affected not only by raw effect size (e.g., percentage difference in means) but also by sample size (N) and variance. Furthermore, not all studies report data that allow extraction of an effect size statistic (Cohen’s d or Hedges’ g) for the between-modality comparisons of interest. Therefore, we decided to employ a different measure for representing contrasts: ratio scores (Wickens, Hutchins, et al., 2012). The performance effect of interrupting task modality in each study was compared for the interrupting task itself and the ongoing task, if available. For example, in the A-V meta-analysis, the ratio would be
\[ \text{ratio} = \frac{\text{performance with visual interrupting task}}{\text{performance with auditory interrupting task}} \]
In our ratio calculations, “performance” was always converted to a metric such that a larger number indicated “better” performance, meaning faster and/or more accurate. Ratios greater than 1.0 corresponded to better performance for the nonauditory modalities (visual or tactile) or represented a redundancy advantage. For measures such as response time or error rate, whereby lower values indicate better performance, the ratio was inverted so that the same interpretation held; for the A-V comparison, it was calculated as follows:
\[ \text{ratio} = \frac{\text{value with auditory interrupting task}}{\text{value with visual interrupting task}} \]
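The ratio-score convention described above can be sketched in a few lines of Python. This is only an illustration: the function name `ratio_score`, its parameters, and the condition means are hypothetical and are not taken from any study in the meta-analyses.

```python
def ratio_score(auditory: float, other: float, higher_is_better: bool) -> float:
    """Return a ratio in which values above 1.0 favor the non-auditory condition.

    `auditory` and `other` are condition means from one study; `other` is the
    visual or tactile condition. Both must be positive.
    """
    if auditory <= 0 or other <= 0:
        raise ValueError("condition means must be positive")
    if higher_is_better:
        # Accuracy-like measures: larger raw values already mean better performance.
        return other / auditory
    # Response time or error rate: invert so that a larger ratio is still better.
    return auditory / other


# Hypothetical condition means (illustration only).
rt_ratio = ratio_score(auditory=1.10, other=1.00, higher_is_better=False)
acc_ratio = ratio_score(auditory=0.90, other=0.81, higher_is_better=True)
```

With these made-up numbers, the response time ratio is above 1.0 (the non-auditory condition is faster) and the accuracy ratio is below 1.0 (the auditory condition is more accurate), which shows why the two measures need to be reported separately.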
Table 2. Direction of Performance Gain for Each Meta-Analysis
In most studies in which there was a redundant A+V comparison, the interrupting task was auditory; only a few employed a visual interrupting task.
Ratio scores have a number of benefits compared with traditional meta-analytical approaches. For example, since they are based purely on mean performance differences between two conditions within a study, ratio scores allow researchers to include the results of studies that did not report effect sizes or did not provide data for calculating effect sizes. However, the ratio score method is not without limitations. When raw ratios are defined as the basic data point within the meta-analysis, then traditional statistical comparisons, such as t tests, lose statistical power if the number of studies involved in that comparison is small. In addition, as ratios are averaged, large ratios from a single study will contribute disproportionally even if the two means defining the ratio did not differ significantly. Finally, ratios may create positively skewed distributions, making it important for researchers to carefully examine their data prior to analysis. Note that a recent systematic comparison of both the traditional effect size measures and ratio scores approaches revealed a high degree of consistency between the two measures in a meta-analysis of training strategies (Wickens, Hutchins, et al., 2012).
Step 4: Searching for Moderator Variables
An additional goal of the analysis was to both identify and determine the impact of possible moderator variables, which are defined as variables that affect the relationship between two other variables, in this case, interrupting task modality and performance on the ongoing and interrupting tasks. The moderator variables were suggested by recurring themes across studies, such as workload manipulations, and by earlier research suggesting that factors such as workload, urgency, and complexity play an important role in interruption handling (e.g., Hameed, Ferris, Jayaraman, & Sarter, 2009). The specific moderators for each meta-analysis are discussed in the Results section.
Step 5: Reporting Results
Given that ratios were generated from individual studies, or from multiple conditions within a study (e.g., ratios under both low and high workload or for both ongoing task and interrupting task), the condition’s ratio itself could be treated as a single data point in the meta-analysis. One source might provide one performance ratio only, whereas another source may yield several ratios reflecting multiple performance measures. For example, Straughn, Gray, and Tan (2009) compared compatible and incompatible auditory and tactile pedestrian crossing warnings wherein compatible refers to whether the “warning comes from the direction of the obstacle to be avoided” (p. 1). Since response times were reported for both compatibility conditions, two response time ratios resulted for the A-T meta-analysis. These data points across studies could then be subjected to statistical analysis, in the same manner that individual participant observations are analyzed with conventional statistical tests, for example, ANOVA or a t test, to see whether there is a significant difference between modalities with different levels of a moderator variable.
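As a rough sketch of how ratios at two levels of a moderator might be compared, the following computes a pooled-variance two-sample t statistic with the Python standard library. The pooled form is an assumption on our part, chosen because it gives n1 + n2 - 2 degrees of freedom; the helper `pooled_t` and the ratio values are hypothetical. The p value would still have to be looked up in a t distribution.

```python
import math
from statistics import mean, variance


def pooled_t(group_a, group_b):
    """Student's two-sample t statistic with pooled variance; returns (t, df)."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Hypothetical ratios for two levels of a moderator (illustration only).
low_level = [1.10, 1.20, 1.05, 1.15]
high_level = [0.90, 0.85, 1.00, 0.95]
t_stat, df = pooled_t(low_level, high_level)
```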
Also, the nature of ratio scores allowed the mean value of a set of ratios to be compared to 1.0 to see whether one modality was significantly better or worse than the other, that is, whether 1.0 lies outside of the 95% or 90% confidence interval around the mean. For the t tests and ANOVAs that were conducted as part of this study, p values less than .10 were classified as marginally significant, and p values less than .05 were considered significant.
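The comparison of a mean ratio against 1.0 can be illustrated with a confidence interval built from the standard library. This is a minimal sketch: `ratio_confidence_interval` is a hypothetical helper, the ratios are invented, and the critical t value is supplied by the caller (2.776 is the two-tailed 95% value for 4 degrees of freedom).

```python
import math
from statistics import mean, stdev


def ratio_confidence_interval(ratios, t_critical):
    """Confidence interval around the mean ratio.

    `t_critical` is the two-tailed critical t value for len(ratios) - 1
    degrees of freedom at the desired confidence level.
    """
    n = len(ratios)
    center = mean(ratios)
    half_width = t_critical * stdev(ratios) / math.sqrt(n)
    return center - half_width, center + half_width


# Hypothetical ratios (illustration only).
ratios = [1.02, 1.10, 1.08, 1.04, 1.06]
lower, upper = ratio_confidence_interval(ratios, t_critical=2.776)

# The mean ratio differs from 1.0 when 1.0 lies outside the interval.
differs_from_one = not (lower <= 1.0 <= upper)
```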
The following sections describe, for each meta-analysis, the number of studies and moderator variables that were included, the number and meaning of the ratios, and the results for each comparison.
A-T Meta-Analysis
A-T Method
The
Table 3. Auditory-Tactile Meta-Analysis Studies, Ratio Values, and Moderator Variable Classifications
Note. RT = interrupting task response time; Acc. = interrupting task accuracy; OT = ongoing task performance.
In addition to the main modality effects, the possible impact of the following moderator variables was analyzed:
Ongoing task workload (high vs. low). Low and high workload were extracted from the individual studies in which the factor was specifically manipulated within the experiment.
Interrupting task decision complexity (level of uncertainty within the signal). For low-complexity interruptions, such as a general warning, the interrupting task simply informs the operator of the occurrence of an event (zero bits of information). For high-complexity interruptions, the interrupting task requires some choice of action, such as turning left or right, and informs the operator of a set of possible events (i.e., more than zero bits of information).
Interrupting task urgency (alarm vs. notification). An alarm requires an immediate response to a critical task or event, whereas a notification informs the participant of a task or event that can be postponed. For example, an alarm can be a warning of an impending collision in an automobile or aircraft, and an example of a notification is the need to check tire pressure when feasible. Thus, an alarm was classified as high urgency and a notification as low urgency.
Interrupting task processing code (spatial vs. categorical). The former “relates to spatial relationships between stimulus components such as left-right,” whereas categorical information “refers to the extracted information [that] has symbolic meaning or refers to identity within a category” (Ferris & Sarter, 2010).
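The zero-bit versus more-than-zero-bit distinction used for the decision complexity moderator above can be made concrete with a small sketch. It assumes equally likely alternatives, so that the information in a signal is log2 of the number of possible events; that assumption and the helper names are ours, not the source's.

```python
import math


def decision_bits(n_alternatives: int) -> float:
    """Bits of information in a signal with equally likely alternatives."""
    if n_alternatives < 1:
        raise ValueError("a signal must have at least one possible event")
    return math.log2(n_alternatives)


def complexity_level(n_alternatives: int) -> str:
    """Classify a signal as low complexity (zero bits) or high complexity."""
    return "low" if decision_bits(n_alternatives) == 0 else "high"


# A general warning signals one possible event; a left/right cue signals two.
general_warning = complexity_level(1)
left_right_cue = complexity_level(2)
```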
A-T Results
Interrupting task performance.
The mean ratio score for interrupting task response time was 1.06 (n = 42 ratios; range = 0.78 to 1.46). This value is significantly greater than 1.0 (α = .05), indicating that averaged across all conditions, tactile interruptions are responded to 6% faster than auditory interruptions. Of the total 42 ratios, 28 (67%) showed this response time advantage for tactile versus auditory interruption tasks. The analysis of interrupting task accuracy data yielded a mean ratio of 1.06 (n = 24 ratios; range = 0.36 to 2.69), indicating a marginally significant tactile advantage (α = .10).
Ongoing task performance
All studies included in this meta-analysis employed an ongoing task, but only seven of them reported ongoing task performance data, and these generated nine ratios. Smith et al. (2009) and Stanley (2006) both generated two ongoing task ratios (see Table 3 for the ongoing task ratios from each study). Since there were not many ongoing task ratios, response time and accuracy were both considered and pooled together. The mean ongoing task performance ratio for these studies was 1.02 (range = 0.99 to 1.14), and this value was not significantly greater than 1.0 (α = .10).
Ongoing task workload
Workload was varied in only 4 of the 25 studies. A pairwise t test showed that the interrupting task response time ratios for low and high workload (ratio = 1.11 and ratio = 1.10, respectively) did not differ significantly from each other, t(8) = 0.16, p = .88. The effect of workload on interrupting task accuracy was not examined in this meta-analysis because only 1 of the 4 studies that varied workload reported interrupting task accuracy data for each modality-workload combination (Mohebbi, Gray, & Tan, 2009).
Interrupting task decision complexity (level of uncertainty within the signal)
There was no significant difference for response time between high and low complexity (ratio = 1.06 and ratio = 1.07, respectively), t(39) = 0.36, p = .72. This equivalence also holds true when comparing only those studies that reported within-experiment differences between modalities. However, when we excluded the one outlier ratio that was more than three standard deviations from the mean, for the studies that reported interrupting task accuracy (n = 12 studies), there was a significant difference in accuracy favoring the tactile modality at low complexity (ratio = 1.14) and the auditory modality at high complexity (ratio = 0.86), t(21) = 2.57, p = .02.
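The exclusion of a ratio lying more than three standard deviations from the mean can be sketched as follows. The sketch assumes that the mean and sample standard deviation are computed with the candidate outlier still included, which the text does not specify; `drop_outliers` and the data are hypothetical.

```python
from statistics import mean, stdev


def drop_outliers(ratios, z_limit=3.0):
    """Keep only ratios within z_limit sample standard deviations of the mean."""
    center, spread = mean(ratios), stdev(ratios)
    return [r for r in ratios if abs(r - center) <= z_limit * spread]


# Hypothetical accuracy ratios with one extreme value (illustration only).
ratios = [0.9, 1.0, 1.1] * 6 + [1.0, 2.7]
kept = drop_outliers(ratios)
```

With few ratios, a single extreme value inflates the standard deviation enough that it can never exceed the three-standard-deviation limit, so a screen like this only has an effect when the set of ratios is reasonably large.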
Interrupting task urgency
We found that 16 studies focusing on the presentation of low-urgency signals (notifications) produced a tactile advantage for response time (ratio = 1.09), whereas the 9 studies examining high-urgency signals (alarms) showed neither a tactile nor an auditory advantage (ratio = 1.00). This difference between alarm and notification ratios in response time was significant, t(23) = 1.99, p = .05. Further evidence of the difference is provided by the within-experiment results. Of the notification studies, 14 reported significant within-experiment response time differences between modalities, and 12 of these studies showed a significant tactile advantage (86%). Regarding interrupting task urgency and accuracy, there was no significant difference between alarms (ratio = 1.17) and notifications (ratio = 1.04), t(5) = 0.42, p = .69.
Processing code (spatial vs. categorical)
Of the 40 ratios in this analysis, 22 were classified as spatial (ratio = 1.06) and the remaining 18 were classified as categorical (ratio = 1.07). For response time, the difference between the mean interrupting task ratios was not significant, t(39) = 0.36, p = .72. Again, when we excluded the one outlier ratio that was more than three standard deviations from the mean, there was a significant difference in accuracy between spatial and categorical cues (ratio = 0.88 and ratio = 1.20, respectively), t(17) = 2.24, p = .04: Responses to spatial cues were more accurate with audition, and responses to categorical cues were more accurate with touch.
A-V Meta-Analysis
A-V Method
In this analysis, the
Table 4. Auditory-Visual Meta-Analysis Studies, Ratio Values, and Moderator Variable Classifications
Note. RT = interrupting task response time; Acc. = interrupting task accuracy; OT = ongoing task performance.
The same three moderator variables, aside from workload, employed in the A-T analysis were examined for this meta-analysis in addition to the following two:
Auditory permanence (permanent vs. transient). We examined differences between a relatively permanent (e.g., a repeated tone) versus a highly transient tone.
VAS. The angle of separation is measured by the number of degrees between the ongoing task’s center of focus and the interrupting task’s visual display. It was hypothesized that the larger this angle, the greater the visual cost (lower ratio) because of the increased scan required between the interrupting task and ongoing task. If this angle was not directly reported in the article, we estimated it from the geometry of the separation between information sources on the screen (e.g., 20 cm) and the typical seating distance from the screen in most experimental settings (i.e., approximately 60 cm). However, in nearly all studies, this information either was reported or could be estimated from figures depicting the experimental setup.
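The geometric estimate of the VAS can be illustrated with a short calculation. This sketch assumes that the ongoing task is fixated straight ahead and that the separation is measured along the screen plane, so the angle is the arctangent of separation over viewing distance; the source does not state which formula was used, and `visual_angle_deg` is a hypothetical helper.

```python
import math


def visual_angle_deg(separation_cm: float, viewing_distance_cm: float = 60.0) -> float:
    """Angle in degrees between the fixated ongoing task and an offset display."""
    if separation_cm < 0 or viewing_distance_cm <= 0:
        raise ValueError("separation must be >= 0 and distance must be > 0")
    return math.degrees(math.atan2(separation_cm, viewing_distance_cm))


# The example values from the text: 20 cm separation, about 60 cm seating distance.
vas = visual_angle_deg(20.0)
```

Under these assumptions, the 20 cm separation at a 60 cm viewing distance corresponds to a VAS of roughly 18 degrees.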
A-V Results
Interrupting task performance
The overall mean response time generated from 46 ratios was 0.88 (ratio range = 0.34 to 2.57). This value is significantly less than 1.0 (α = .05), showing a clear auditory advantage. The analysis of interrupting task accuracy data resulted in a mean ratio of 1.01 (n = 24 ratios; range = 0.33 to 3.30). This ratio was not significantly different from 1.0 (α = .10).
Ongoing task performance
Since there were only five response time ratios with respect to ongoing task performance, response time and accuracy ratios were pooled together to calculate the mean ongoing task ratio. The overall mean ratio of 1.13 (n = 33 ratios; ratio range = 0.31 to 3.54) was not significantly different from 1.0 (α = .10), indicating that the ongoing task was unaffected by interrupting task modality.
Interrupting task decision complexity (level of uncertainty within the signal)
There was a marginally significant difference between the 16 low-complexity ratios (ratio = 1.03) and the 30 high-complexity ratios (ratio = 0.82) with regard to response time, t(44) = 1.78, p = .08: While one performs a visual ongoing task, complex auditory events are processed faster than complex visual ones. There was no effect of signal complexity with regard to accuracy (low-complexity ratio = 0.86, n = 8; high-complexity ratio = 0.96, n = 15), t(21) = 0.91, p = .37.
Interrupting task urgency
The 13 interrupting task notification ratios and 33 interrupting task alarm ratios showed no significant difference for response time, t(44) = 0.31, p = .76, or accuracy, t(18) = 1.00, p = .33.
Processing code (spatial vs. categorical)
We classified 38 ratios in this analysis as spatial (ratio = 0.91) and another 30 ratios as categorical (ratio = 0.82). Processing code had a marginally significant effect on response time such that categorical cues were best presented with audition (ratio = 0.72), whereas for spatial cues, the auditory benefit was diminished although still marginally significant (ratio = 0.89), t(29) = 1.77, p = .09. Regarding accuracy, no significant difference between modalities was observed for spatial (ratio = 0.91) or categorical cues (ratio = 1.01), t(12) = 0.67, p = .52.
Auditory interrupting task permanence
The permanence of the auditory interrupting task had a marginally significant effect on the ratio. If it was fairly permanent (e.g., a repeated tone; mean ratio = 0.75), auditory interrupting task performance was better than if it was transient (mean ratio = 0.95), t(7.28) = 1.93, p = .09. For the transient signal, the ratio was not significantly less than 1.0, and thus there was no auditory benefit.
VAS
We performed a regression analysis on the combined response time and accuracy ratios against the VAS. The slope was not significantly different from zero, indicating that visual interrupting task performance costs did not increase with eccentricity (r = 0.13, p < .22).
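A regression of ratios on the VAS of the kind described above can be sketched with ordinary least squares. The helper `slope_and_r` and the data points are hypothetical and only show how the slope and the correlation coefficient are obtained; a significance test of the slope is not included.

```python
import math
from statistics import mean


def slope_and_r(x, y):
    """Least-squares slope of y on x and the Pearson correlation coefficient."""
    mean_x, mean_y = mean(x), mean(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx, sxy / math.sqrt(sxx * syy)


# Hypothetical VAS values (degrees) and performance ratios (illustration only).
vas_deg = [0, 10, 20, 30, 40]
ratios = [1.0, 0.9, 1.0, 0.8, 0.8]
slope, r = slope_and_r(vas_deg, ratios)
```

A negative slope in such an analysis would correspond to the hypothesized visual cost growing with eccentricity; a slope indistinguishable from zero, as reported in the text, would not.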
Redundancy (A+V) Meta-Analysis
A+V Method
In the redundancy meta-analysis, we examined 31 redundant A+V versus single-modality (auditory or visual) studies. Table 5 shows the studies and ratios for the A+V meta-analysis. The best of the two single-modality conditions was used in all cases. This single-modality baseline definition was chosen, because only with such a baseline can we assure that human information processing is truly exploiting redundancy and not just filtering the poorer of the two single modalities (see Wickens & Gosney, 2003). A redundancy ratio greater than 1.0 indicated a redundancy gain and one less than 1.0 indicated a redundancy cost. The three moderator variables that were included in this analysis were ongoing task workload, interrupting task type (communications, alert, or spatial), and VAS.
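The best-single-modality baseline described above can be sketched as follows. The function `redundancy_ratio`, its parameters, and the condition means are hypothetical; the sketch only shows how the better of the available single-modality conditions would serve as the reference for the redundant condition.

```python
from typing import Sequence


def redundancy_ratio(redundant: float, singles: Sequence[float],
                     higher_is_better: bool) -> float:
    """Ratio above 1.0 means a redundancy gain; below 1.0 means a redundancy cost.

    `singles` holds the single-modality condition means (auditory and/or visual).
    """
    if higher_is_better:
        # Accuracy-like measures: the best single modality has the largest mean.
        return redundant / max(singles)
    # Response time or error rate: the best single modality has the smallest mean.
    return min(singles) / redundant


# Hypothetical condition means (illustration only).
rt_ratio = redundancy_ratio(redundant=1.2, singles=[1.0, 1.3], higher_is_better=False)
acc_ratio = redundancy_ratio(redundant=0.95, singles=[0.90, 0.85], higher_is_better=True)
```

In this invented case the redundant condition is slower than the best single modality (ratio below 1.0) but more accurate (ratio above 1.0).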
Table 5. Redundancy (Auditory Plus Visual) Meta-Analysis Studies, Ratio Values, and Moderator Variable Classifications
Note. RT = interrupting task response time; Acc. = interrupting task accuracy; OT = ongoing task performance.
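The baseline definition described above can be illustrated with a minimal sketch. It assumes that each ratio is oriented so that values above 1.0 favor the redundant condition (for response time, the better single-modality time divided by the redundant time; for accuracy, the redundant accuracy divided by the better single-modality accuracy). The function name `redundancy_ratio` and all input values are hypothetical and are not taken from any study in Table 5.

```python
def redundancy_ratio(redundant, auditory, visual, higher_is_better):
    """Ratio of the redundant A+V condition to the better single modality.

    Assumed orientation: a value above 1.0 is a redundancy gain and a
    value below 1.0 is a redundancy cost.
    """
    if higher_is_better:
        # Accuracy-like measure: the better single modality is the larger one.
        best_single = max(auditory, visual)
        return redundant / best_single
    # Time-like measure: the better single modality is the smaller one,
    # and the ratio is inverted so that faster redundant responses exceed 1.0.
    best_single = min(auditory, visual)
    return best_single / redundant


# Hypothetical response times (seconds) and accuracies (proportion correct).
rt_ratio = redundancy_ratio(redundant=1.25, auditory=1.0, visual=1.5,
                            higher_is_better=False)
acc_ratio = redundancy_ratio(redundant=0.9, auditory=0.75, visual=0.6,
                             higher_is_better=True)
print(round(rt_ratio, 2), round(acc_ratio, 2))
```

With these illustrative inputs, the redundant condition is slower than audition alone (a response time ratio below 1.0) but more accurate (an accuracy ratio above 1.0). Comparing against the better single modality, rather than the average of the two, is what makes a ratio above 1.0 interpretable as a true redundancy gain.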
A+V Results
Interrupting task performance
The overall ratio for the interrupting task generated by 49 ratios was 0.97, which was not significantly less than 1.0 (α = .10). This indicates that redundant A+V presentation was on average as good as, but not better than, the best of the single-modality conditions, which usually was audition. However, this effect was qualified by a number of moderator variables as described later, some of which produce a true redundancy gain, and some actually illustrate a redundancy cost relative to the single-modality auditory condition.
There was a significant redundancy gain for accuracy (ratio = 1.34, α = .05) but a significant redundancy cost for response time (ratio = 0.87, α = .05). This large difference is important but not surprising: For most systems, redundancy helps guarantee security (by providing more ways for the information to be noticed) but at the expense of efficiency. To the extent that humans do not process visual and auditory information entirely in parallel (Wickens, 2002), the added cost of dual-channel processing will lead to a response time penalty. This penalty also manifests as time away from the ongoing task, which may explain the overall 7% redundancy cost for the ongoing task described next.
Ongoing task performance
For the ongoing task, the ratio was 0.93 calculated from 48 ratios, which was significantly less than 1.0 (α = .05), indicating that on average, there was a small redundancy cost to the ongoing task.
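Statements such as "significantly less than 1.0" compare a set of ratios against the no-difference value of 1.0. The sketch below shows one way such a comparison could be computed. It assumes a one-sample t test on the raw ratios, which this section does not confirm as the procedure actually used. The function name `t_against_one` and the ratio values are hypothetical.

```python
import math
import statistics


def t_against_one(ratios):
    """Return (mean, t statistic, degrees of freedom) for H0: mean ratio = 1.0."""
    n = len(ratios)
    mean = statistics.mean(ratios)
    # Standard error of the mean from the sample standard deviation.
    standard_error = statistics.stdev(ratios) / math.sqrt(n)
    t_stat = (mean - 1.0) / standard_error
    return mean, t_stat, n - 1


# Hypothetical ongoing-task ratios, one per study comparison.
ratios = [0.9, 1.0, 0.8, 1.1, 0.9, 0.7]
mean, t_stat, df = t_against_one(ratios)
print(f"mean = {mean:.2f}, t({df}) = {t_stat:.2f}")
```

A negative t statistic indicates a mean ratio below 1.0. Whether it is significant depends on the critical value for the chosen α and the degrees of freedom, which would be looked up separately.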
Ongoing task workload
When data were pooled across both response time and accuracy, ongoing task workload affected the redundancy gain for interrupting task communications information (e.g., data link) but not for other types of tasks, F(47) = 4.99, p < .01. Specifically, for interrupting communications tasks, there was a significant redundancy gain under high ongoing task workload (ratio = 1.67) but not low workload (ratio = 0.71; not significantly less than 1.0). In contrast, for the other two task types, workload did not alter the redundancy effect.
Interrupting task type
For response time, there was a significant interrupting task type interaction, F(2) = 4.20, p = .02, with a marginally significant redundancy cost for communication tasks, such as text–voice data link messages (ratio = 0.85). However, there was a marginal redundancy gain for alerting tasks (ratio = 1.06) and a significant gain for spatial tasks (ratio = 1.06). The accuracy ratio did not differ by task type.
VAS
The separation between the primary visual display of the ongoing task and the visual component of the interrupting task affected the redundancy gain. As shown in Figure 1, when these channels were close together but not overlaid, there was a clear redundancy gain at 5° (ratio = 1.56). However, when the visual channels were more widely separated, this effect regressed through 1.0, and a regression analysis yielded a significant slope (r = .29; p < .01). Note, however, that this slope does not include the points at 0° VAS because they are qualitatively different and involve clutter from the overlay of multiple displays (Horrey & Wickens, 2004). The analysis reveals that at wide visual angles, there was a redundancy cost. Because the best single task modality in these experiments was always auditory, such a regression essentially means that the visual component of the redundant A+V interrupting task information was either ignored or processed at a cost when it was widely separated from the ongoing task.

Figure 1. Ratio of the redundancy condition to the best single task modality as a function of visual angle of separation (ratio values greater than 1.0 indicate a redundancy gain).
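The regression of redundancy ratio on VAS can be sketched as follows. This is an illustration under stated assumptions, not the analysis reported here: It uses ordinary least squares and Pearson's r, the data points are hypothetical, and the helper name `slope_and_r` is invented for this example. Points at 0° are excluded before fitting, mirroring the exclusion of overlaid displays.

```python
import math


def slope_and_r(xs, ys):
    """Least-squares slope of ys on xs and the Pearson correlation r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx, sxy / math.sqrt(sxx * syy)


# Hypothetical (VAS in degrees, redundancy ratio) pairs.
points = [(0, 1.0), (5, 1.5), (10, 1.3), (20, 1.1), (30, 0.9), (45, 0.8)]

# Drop overlaid displays (0 degrees) before fitting.
nonzero = [(vas, ratio) for vas, ratio in points if vas > 0]
slope, r = slope_and_r([vas for vas, _ in nonzero],
                       [ratio for _, ratio in nonzero])
print(f"slope = {slope:.4f} per degree, r = {r:.2f}")
```

In this made-up data set, the ratio falls as separation grows, so the fitted line crosses 1.0 at some intermediate angle. The sign and size of the slope in any real analysis depend on how the ratios are oriented and on the studies included.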
Summary of Findings across All Meta-Analyses
Figure 2 provides a summary of the significant findings across the three meta-analyses. The findings are discussed in further detail in the following section.

Figure 2. Summary of marginally significant and significant findings across the meta-analyses in order of appearance: Auditory-tactile results are in the top white section, auditory-visual results are in the shaded gray section, and redundant auditory-visual results are in the bottom white section. Arrows indicate modality gain direction for each analysis.
Discussion
Operators in complex, data-rich domains experience visual data overload and an increased need for effective interruption management. Multimodal interfaces, which combine visual, auditory, and tactile information presentation, have been proposed as a promising means to address those challenges with processing multiple tasks or sources of information simultaneously. However, not enough is known about the relative benefits and shortcomings of employing and combining the various channels. To help fill this gap, three meta-analyses were conducted concerning the effects of interrupting task modality, for example, visual, auditory, tactile, or redundant auditory-visual, on the performance of an ongoing visual-manual task and the interrupting task itself. The impact of several moderator variables, such as workload and complexity, was examined as well.
Primary Hypotheses
Four primary hypotheses were proposed. First, on the basis of the strong predictions of MRT, we proposed that in the context of visual ongoing tasks, a nonvisual interrupting task would be processed more effectively than a visual one. This effect was directly confirmed in the A-V meta-analysis, and it was indirectly confirmed in the A-T analysis, in which tactile processing was found to be, on average, even more efficient than auditory processing when it was imposed during an ongoing visual task. Hence, by extrapolation, we would expect a tactile interrupting task to be processed better than a visual interrupting task, even though there were too few head-to-head comparisons of these two modalities to support a meta-analysis (but see Sklar & Sarter, 1999). The finding of nonvisual superiority for the interrupting task is also fully consistent with auditory preemption theory. Finally, we note that the cost to the visual interrupting task can be only partially explained by the peripheral effects of visual scanning, because a larger VAS did not significantly increase this cost.
Our second hypothesis, regarding the ongoing task, was weaker but still confirmed. That is, for the ongoing task, we proposed that the auditory benefits of separate resources would be offset by the auditory costs of preemption. Such weakening was clearly confirmed. Ongoing task performance did not differ significantly between interrupting task modalities (ratio = 1.02). This “absence of effect” was demonstrated experimentally by Wickens and Liu (1988) and Wickens et al. (2005), but the meta-analysis provided the added statistical power to confirm the null hypothesis.
Thus, on balance, the findings related to the first two hypotheses provide further evidence that when used appropriately, auditory signals can support multitasking in a nonintrusive manner (Kramer, 1994) and lead to a net gain in performance across multiple tasks.
The third hypothesis, concerning redundancy, was also confirmed. A statistically significant 34% redundancy gain was found for accuracy, and a 13% (ratio = 0.87) redundancy loss was observed for response time. The former reflects the increased security resulting from processing the same information in multiple channels, and the latter indicates the increased time that that processing requires compared with the single modality, which was almost always an auditory display.
This finding of a speed–accuracy trade-off with redundancy leads us to the fourth hypothesis: the predicted time cost of processing a visual interrupting task when it is coupled with an auditory display. Scanning such visual information presumably caused the small but significant 7% cost in ongoing task performance.
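The percentages quoted above follow directly from the ratios: A ratio of 1.34 is a 34% gain, a ratio of 0.87 is a 13% cost, and a ratio of 0.93 is a 7% cost. A minimal sketch of that conversion is given below. The function name `describe_ratio` is hypothetical, and the conversion assumes that percentages are expressed relative to the no-difference value of 1.0.

```python
def describe_ratio(ratio):
    """Express a ratio as a whole-number percent gain or cost relative to 1.0."""
    percent = round(abs(ratio - 1.0) * 100)
    if ratio > 1.0:
        return f"{percent}% gain"
    if ratio < 1.0:
        return f"{percent}% cost"
    return "no difference"


# Ratios reported in the Discussion: accuracy, response time, ongoing task.
for label, ratio in [("accuracy", 1.34),
                     ("response time", 0.87),
                     ("ongoing task", 0.93)]:
    print(label, describe_ratio(ratio))
```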
Finally, although not offered as a specific hypothesis, the A-T comparison yielded important findings as well. In particular, the A-T analysis showed that in many cases, participants responded even faster to tactile interrupting tasks than to auditory ones. This finding appears to contradict earlier recommendations not to use tactile cues alone when response time is critical (e.g., Hale & Stanney, 2004). However, a closer look at the data indicates that this contradiction is resolved when moderator variables are considered. The tactile advantage to response time vanished in the case of urgent interruptions, and the auditory presentation resulted in more accurate responding for more-complex signals. First, our study confirms the findings of Wogalter, Conzola, and Smith-Jackson (2002), who demonstrated that audition is a powerful means of getting attention and is effective in producing an alerting reaction. Second, it confirms the finding of Chang, O’Modhrain, Jacob, Gunther, and Ishii (2002) that without considerable training, it is difficult for users to interpret complex tactile signals or tactons. This difficulty is reflected not only in response time but also in accuracy, which was found to be lower for tactile signals with high signal complexity.
Additional Moderator Variables
VAS
The lack of effect of VAS was unexpected. However, before we truly conclude that VAS does not matter, we note that (a) the data points were highly variable, so it is not appropriate to confirm the null hypothesis; (b) the function beyond 15° does show visible increase in visual costs; and (c) when the three studies that varied VAS within the experiment were examined, all three showed a consistent and significant monotonic decrease in interrupting task performance with increasing VAS (e.g., Wickens, Dixon, & Seppelt, 2002).
Task type
The effect of task type was complex and was complicated by the fact that task type could not be coded in the same way across all three meta-analyses. For example, the A-T meta-analysis contained no studies involving communications tasks in the tactile modality. However, certain effects did emerge. When the auditory and tactile modalities were compared, the auditory interrupting task had better accuracy than the tactile one for spatial tasks, hence confirming the fluency of this modality for conveying spatial information (Begault & Pittman, 1996). This auditory spatial advantage was also confirmed by the fact that there was no loss in speed. However, when the two modalities were contrasted for categorical tasks, the tactile modality yielded better accuracy, with no loss in speed.
In partial contrast, when the auditory and visual interrupting tasks were compared, the auditory modality emerged as superior for categorical tasks with respect to response time, but there was no modality difference for spatial tasks in either response time or accuracy. The auditory speed advantage for categorical tasks possibly reflects the natural or compatible mapping between sound and language (Wickens, Sandry, & Vidulich, 1983), as many such tasks involved simple linguistic processing. But this auditory advantage disappears when spatial tasks are used, which often map naturally to the inherent spatial property of the visual system.
Finally, in the redundancy meta-analysis, we found that redundancy slowed the processing of communications task information relative to the single modality, which was almost always auditory. The slower response times may be attributed to the added time required to read the visual text component of the redundant information. In contrast, the relatively less complex, shorter messages of information inherent in the alerts and spatial tasks were not slowed by the added visual task and even resulted in a faster response time.
Applied Implications
Overall, the findings from the three meta-analyses highlight the fact that, rather than focusing on overall performance differences between modalities and modality combinations, it is critical to consider the effects of moderator variables when developing recommendations for the design of future multimodal and possibly adaptive interfaces. For example, redundant information presentation is beneficial for communication tasks only in case of high workload, and even then, only when accuracy is most important. In low-workload conditions, redundancy leads to improved performance only for alerting and spatial tasks. A+V redundancy is recommended only when the VAS between the visual ongoing task and the visual aspect of the interrupting task is small. Tactile messages lead to improved performance compared with audition for low-complexity and low-urgency messages. Thus, the sense of touch should be reserved for simple notifications. In contrast, the auditory channel results in better performance when a message is complex and urgent, suggesting the use of this channel for alarms and alerts. Also with regard to accuracy, the tactile modality is recommended for categorical tasks, whereas audition is recommended for spatial tasks. The importance of moderator variables strongly suggests a need for adaptive interface designs (Sarter, 2007; Scerbo, 1996; Trumbly, Arnett, & Johnson, 1994) whereby the nature of information presentation is varied depending on context in an effort to optimize information processing performance.
Limitations and Future Directions
There are a number of limitations to the approach taken for this study. We begin by describing those related to the meta-analytic approach in general. First, ideally, a meta-analysis should be analogous to a factorial experimental design, whereby moderator variables in the meta-analysis correspond to factors in the design and interactions between moderator variables can be examined in the same way as interactions between factors. However, in reality, this is rarely the case. Unlike in an experiment in which all cells are equally populated, in the meta-analysis, we are at the mercy of the population of studies available. At best, overlapping sets of studies will include examination of each moderator variable of interest.
Second, even when there are multiple studies involving a particular moderator variable, it is possible that two or more levels of the variable may be confounded with another variable. For example, suppose that all or most studies of high complexity involved communications tasks, and all or most of low-complexity studies involved alerting tasks. It would be difficult to establish the extent to which any difference was attributable to task type or to task complexity.
Third, as Simmons, Nelson, and Simonsohn (2011) have noted, there are multiple sources of bias created by the “researcher degrees of freedom,” or the biases that are associated with the decisions that researchers have to make when collecting and analyzing data. These biases can come in two forms:
We can be biased on how we coded moderator variables of the studies included and, indeed, what moderator variables we chose to identify in the first place (our justification was articulated in the Introduction).
The number of potential studies included will be biased downward by what is referred to as the “file drawer problem” (Rosenthal, 1979): A number of valid studies in a given area of research may be conducted but never reported, in part because of a bias toward reporting the presence rather than the absence of effects.
Fourth, we note that our meta-analysis employed the less conventional ratio analysis, as opposed to the effect size analysis, and the potential limitations of the former were previously described.
These limitations notwithstanding, we believe the current results are important because of the following:
They provide confirmation of effects that other studies have shown only in single experiments, thus reinforcing the validity of the prior finding that there is an advantage of modality separation for the interrupting task (Wickens et al., 1983).
They identify some new effects revealed by the “collective wisdom” of the meta-analysis in integrating multiple studies, primarily, the effects of urgency and complexity moderator variables on the relative benefits of tactile and auditory modalities for speed and accuracy.
They provide suggestions for important new directions of research, particularly in the area of redundancy effects, where the relatively low statistical power from few studies has left intriguing questions to be resolved regarding the circumstances of redundancy gain and loss. For example, given that the meta-analyses appeared to reveal a balance of effects of the MRT and auditory preemption theory on the performance of the ongoing task, further research is invited to establish the moderating variables that may tip this balance one way or the other. Also, since the studies included focused mostly on visual ongoing tasks, authors of future work can compare the different interrupting task modalities in the context of an ongoing auditory or tactile task. Finally, the meta-analyses reveal that workload is a moderator variable that researchers should consider in the design of future studies; only a limited number of experiments that were included in our analyses incorporated this variable.
Key Points
Significant differences between auditory and tactile interrupting tasks were observed as a function of two moderator variables: complexity and urgency. Accuracy was higher for tactile tasks in case of low-complexity signals; in contrast, high-complexity signals resulted in higher accuracy in the auditory modality. Faster responses were seen for low-urgency messages in the tactile modality, and there was no difference between the two modalities for high-urgency messages.
Audition, rather than vision, should be used for spatial and nonurgent tasks when accuracy is the primary concern and for categorical tasks when the issue of importance is response time.
Redundant auditory-visual combination should be used for communication tasks under high workload, for alerting and tracking tasks in low workload, and when there is a small visual angle of separation.
Acknowledgements
We would like to thank Kara Latorella (technical monitor) at NASA Langley Research Center and Durand Begault and Kenneth Goodrich for their direction, support, and guidance on this research. This work was performed for NASA Langley and the Integrated Intelligent Flight Deck program, under Contract NNX09AQ34A.
Sara A. Lu is a PhD candidate in the Department of Industrial and Operations Engineering, Center for Ergonomics, at the University of Michigan. She received her MSE in industrial and operations engineering from the University of Michigan in 2011.
Christopher D. Wickens is a senior scientist at Alion Science and Technology, Micro Analysis and Design Operations, in Boulder, Colorado, and professor emeritus at the University of Illinois at Urbana-Champaign. He received his PhD in psychology from the University of Michigan in 1974.
Julie C. Prinet is a PhD precandidate in the Department of Industrial and Operations Engineering, Center for Ergonomics, at the University of Michigan. She received her MS in engineering from the École Centrale de Nantes in 2011.
Shaun D. Hutchins is a senior human factors engineer at Alion Science and Technology in Boulder, Colorado. He is also a PhD precandidate in the School of Education at Colorado State University. He received his MA in experimental psychology with a minor in experimental statistics from New Mexico State University in 2007.
Nadine Sarter is a professor in the Department of Industrial and Operations Engineering, Center for Ergonomics, at the University of Michigan. She received her PhD in industrial and systems engineering from The Ohio State University in 1994.
Angelia Sebok is a principal human factors engineer/program manager at Alion Science and Technology in Boulder, Colorado. She earned her MS in industrial and systems engineering from Virginia Tech in 1991.
