Abstract
When participants respond to stimuli of two sources, response times (RTs) are often faster when both stimuli are presented together relative to the RTs obtained when presented separately (redundant signals effect [RSE]). Race models and coactivation models can explain the RSE. In race models, separate channels process the two stimulus components, and the faster processing time determines the overall RT. In audiovisual experiments, the RSE is often higher than predicted by race models, and coactivation models have been proposed that assume integrated processing of the two stimuli. Where does coactivation occur? We implemented a go/no-go task with randomly intermixed weak and strong auditory, visual, and audiovisual stimuli. In one experimental session, participants had to respond to strong stimuli and withhold their response to weak stimuli. In the other session, these roles were reversed. Interestingly, coactivation was only observed in the experimental session in which participants had to respond to strong stimuli. If weak stimuli served as targets, results were widely consistent with the race model prediction. The pattern of results contradicts the inverse effectiveness law. We present two models that explain the result in terms of absolute and relative thresholds.
One of the basic experimental setups for the behavioural study of multisensory integration is the redundant signals task (Hershenson, 1962; Miller, 1982; Raab, 1962). In this task, participants respond to stimuli of two classes that are presented either alone or in combination, for example, visual, auditory, or audiovisual stimuli. In its simple version, the participants are asked to press a key as fast and as accurately as possible as soon as any stimulus is detected. The results typically show that the responses to combined stimuli are, on average, faster than the responses for the single stimuli (redundant signals effect [RSE]).
Several models explain the effect, including race and coactivation models. In race models (Miller, 1982, Ineq. 2), the information of the two channels is processed separately (independent of stimulation in the other channel, “context invariance”), and the faster of the two channels determines the response. Under this assumption, the cumulative response time (RT) distribution (i.e., the probability of a response before time t) for AV must not exceed the summed RT distributions for the single auditory (A) and visual signal (V),
$$F_{AV}(t) \le F_{A}(t) + F_{V}(t) \quad \text{for all } t. \tag{1}$$
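The bound imposed by the race model can be illustrated with a small simulation. The following is a minimal sketch, not part of the original study: the shifted exponential finishing times, their parameters, and all function names are assumptions chosen only for illustration. Under an independent race, the difference between the redundant distribution and the sum of the unimodal distributions should stay at or below zero, apart from sampling noise.

```python
import random
from bisect import bisect_right


def ecdf(sorted_rts, t):
    """Proportion of response times less than or equal to t."""
    return bisect_right(sorted_rts, t) / len(sorted_rts)


def draw_auditory(rng):
    """Hypothetical auditory finishing time in ms (assumed distribution)."""
    return 200.0 + rng.expovariate(1.0 / 120.0)


def draw_visual(rng):
    """Hypothetical visual finishing time in ms (assumed distribution)."""
    return 180.0 + rng.expovariate(1.0 / 100.0)


def simulate_race(n_trials=20000, seed=1):
    """Simulate unimodal and redundant RTs under an independent race."""
    rng = random.Random(seed)
    rt_a = sorted(draw_auditory(rng) for _ in range(n_trials))
    rt_v = sorted(draw_visual(rng) for _ in range(n_trials))
    # Redundant trials: both channels run, the faster one triggers the response.
    rt_av = sorted(min(draw_auditory(rng), draw_visual(rng))
                   for _ in range(n_trials))
    return rt_a, rt_v, rt_av


def max_violation(rt_a, rt_v, rt_av, grid):
    """Largest value of F_AV(t) - F_A(t) - F_V(t) over the time grid."""
    return max(ecdf(rt_av, t) - ecdf(rt_a, t) - ecdf(rt_v, t) for t in grid)


rt_a, rt_v, rt_av = simulate_race()
worst = max_violation(rt_a, rt_v, rt_av, range(150, 801, 10))
print(f"max of F_AV - F_A - F_V over the grid: {worst:.4f}")
```

In empirical data, values of this difference that are reliably above zero are what the literature treats as evidence for coactivation.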
In many experiments, violations of Inequality 1 (“race model inequality”) were observed, which indicates some coactivation mechanism and, therefore, integrated processing of the stimulus components (e.g., Diederich & Colonius, 1987; Giray & Ulrich, 1993; Miller, 1982, 1986, 2016).
Where and how does coactivation occur? In a reanalysis of Miller’s (1986) data, Schwarz (1989, 1994) suggests channel-specific accumulation of sensory evidence over time; in the redundant stimulus, this activation is superposed (see also Diederich, 1995; Diederich & Colonius, 1991; Miller & Ulrich, 2003). The superposition of the two activated channels reaches the evidence criterion faster than the activity of the single channels. Schwarz’s (1994) model has been quite successful in explaining the data (e.g., Blurton, Greenlee, & Gondan, 2014; Gondan, Blurton, Hughes, & Greenlee, 2011; Gondan, Götze, & Greenlee, 2010), but is mute with respect to the processing stage at which coactivation occurs. Depending on the specific experimental task, superposition could reflect early integration of stimulus energy or coactivation of response tendencies at the response selection stage (e.g., Feintuch & Cohen, 2002), or both.
Several studies investigated the psychological and neurophysiological locus of the RSE. Using a redundant signals task with purely visual stimuli, Miniussi, Girelli, and Marzi (1998) found stronger RSEs and earlier event-related P1/N1 components for bilateral than for unilateral stimuli, which indicates an extrastriate locus of the RSE. Savazzi and Marzi (2008) favoured an early locus of the RSE because they found maximum RSEs for short stimuli with low intensity (inverse effectiveness, see below). Findings in visual pop-out search tasks led Zehetleitner, Müller, and Krummenacher (2008) to conclude that the RSE arises at the level of a master salience map where visual contrast features are pooled.
RSEs have received considerable attention in multisensory research (Giard & Peronnet, 1999; Molholm et al., 2002; Teder-Sälejärvi, McDonald, Di Russo, & Hillyard, 2002). Neurophysiological correlates of the RSE include event-related potential (ERP) differences (Foxe et al., 2000; Molholm et al., 2002; Murray et al., 2005), fractional anisotropy from diffusion tensor imaging (Brang, Taich, Hillyard, Grabowecky, & Ramachandran, 2013), and activity peaks in functional magnetic resonance images (Iacoboni & Zaidel, 2003). The majority of studies concluded that the RSE already arises at early sensory processing stages, with, for example, activation in visual areas modulated by the auditory stimulus (e.g., Murray et al., 2005).
Other studies point to the response selection stage as a potential locus of the RSE (e.g., Miller, 1982, Experiment 3; Miller & Reynolds, 2003; Miller, Ulrich, & Lamarre, 2001; Van der Stoep, Spence, Nijboer, & Van der Stigchel, 2015; Vu, Minakata, & Ngo, 2014). For example, in a psychological refractory period experiment (Pashler, 1984), Miller and Reynolds (2003) noticed that redundancy gains in Task 1 propagate to Task 2, whereas redundancy gains in Task 2 were unaffected by the onset asynchrony between Task 1 and Task 2. They concluded that redundancy gains arise within the bottleneck, which is indicative of a response selection stage. Evidence for a motor involvement was found by Diederich and Colonius (1987) in RT variance for bimanual responses and in response force measurements by Giray and Ulrich (1993). It should be noted, however, that—depending on the specific redundant signals task—redundancy gains might arise at different stages of the processing pathway (see, for example, Zehetleitner, Ratko-Dehnert, & Müller, 2015), and that some of the coactivation effects found at later processing stages might just be a downstream result of earlier perceptual processing stages.
An interesting variant of the task, which also inspired the present study, was used by Schröter, Frei, Ulrich, and Miller (2009): In one of their experiments, participants were asked to respond to the offset of binaurally presented auditory stimuli. Coactivation was found in both onset and offset conditions, but the offset condition elicited a larger redundancy gain than the usual onset condition. One of the key aims of the present study is to determine whether multisensory coactivation is the result of purely adding up stimulus energy (“early” integration), or whether it involves more complex processes at response selection stages (“late” integration).
To distinguish between perceptual stages and response selection, the present study used a go/no-go paradigm, with auditory and visual targets (i.e., go-signals) and distractors (i.e., no-go signals) either of weak or strong intensity, in different sessions. In one session, participants had to respond to strong stimuli (A, V, AV) and withhold their response to weak stimuli (a, v, av). In the other session, these roles were reversed; that is, participants had to respond to weak signals and withhold their response when strong stimuli were presented.
At the neural level, audiovisual superior colliculus neurons typically show an inverse relationship between stimulus strength and the increase in firing rate attributed to bimodal stimuli relative to the firing rate attributed to unimodal stimuli (“multisensory gain”). In the multisensory literature, this pattern of effects is known as the law of inverse effectiveness (Burnett, Stein, Chaponis, & Wallace, 2005; Jiang, Jiang, Rowland, & Stein, 2007; Meredith & Stein, 1983, 1986). According to inverse effectiveness, it would be expected that manipulating the intensity of the audiovisual targets would result in differential coactivation at perceptual stages. That is, one would expect higher redundancy gains in the weak target condition of the present experiment and low or no redundancy gain in the strong target condition. In contrast, if coactivation occurs at the response selection stage, one would expect coactivation in both conditions, evidenced by a large and significant violation of the race model inequality. Surprisingly, we observed quite a different pattern of results.
Method
Participants
Sixteen individuals (11 females; mean age = 30; age range = 21-47 years) participated in the experiment. All participants had self-reported normal or corrected-to-normal vision and hearing. Informed consent was collected following the guidelines from the local ethics committee and the Declaration of Helsinki.
Design
A 3 (sensory modality: auditory vs. visual vs. audiovisual) by 2 (intensity: weak vs. strong) by 2 (target feature: weak vs. strong) within-participants factorial design was used, with the order of the target stimulus (weak first vs. strong first) counterbalanced across participants. The main dependent variables were RT and accuracy. Accuracy was used for the estimation of the RT distribution (see below) and served to check whether the participants followed the task instructions.
Apparatus
A computer monitor with a 100 Hz refresh rate was used to present visual stimuli and two Scandyna speakers (65-20000 Hz response) were used to present auditory stimuli, which were all computer controlled by a PC running Windows XP. Participants’ responses were collected with a single button on a response pad. A desk-mounted Eyelink 1000 with a 250 Hz sampling rate was used to record eye movements to ensure that participants were fixating during stimulus presentation.
Visual stimuli were Gabor patches with a fixed spatial frequency of 6 cycles/degree (σ = 12) in six orientations (i.e., 0°, 30°, 60°, 90°, 120°, and 150°) on a grey background with a luminance of 13 cd/m². Gabor patches with weak intensity had a contrast of 8% (luminance of 14 cd/m²), and Gabor patches with strong intensity had a contrast of 90% (17 cd/m²). Auditory stimuli were pure-tone sine waves of six frequencies (i.e., 2112, 2376, 2640, 3168, 3520, and 4224 Hz). The weak-intensity tones had an amplitude of 55 dB(A) and the strong-intensity tones had an amplitude of 66 dB(A), so that weak and strong stimuli were very easily distinguishable. The variation of Gabor orientation and tone frequency served to avoid boredom and stimulus repetition effects that might possibly weaken the intensity manipulation. Both the visual and auditory stimuli were presented for 200 ms. Audiovisual stimuli were either combinations of two weak stimuli or two strong stimuli.
Task
Participants had to maintain their fixation on the centre of the monitor for 200 ms in order for the stimulus to be delivered after an exponentially distributed (mean = 325 ms, truncated to 300-800 ms) inter-trial interval. The stimuli were presented in a pseudo-randomised sequence and were displayed twice in two sessions of 22 blocks. One half of the participants began with the weak target session (22 blocks, Session 1), in which they responded to the weak-intensity stimuli, and had to ignore the strong-intensity no-go stimuli. Session 2 then consisted of another 22 blocks in which participants had to respond to the strong-intensity stimuli and ignore the weak-intensity stimuli. The other half of the participants completed the experiment in the opposite order. There were 110 trials for each of the 12 possible conditions (Session Type × Sensory Modality × Stimulus Intensity), which yielded a total of 1,320 trials. Because both components of audiovisual stimuli were either targets or no-go stimuli, we avoided response conflicts due to combinations of targets and distractors in the same stimulus—such response conflicts have been identified as possible sources of artificial redundancy gains in go/no-go tasks (Fournier & Eriksen, 1990).
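The trial structure described above can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the original experiment code: all names are hypothetical, a plain shuffle stands in for the unspecified pseudo-randomisation, blocks are assumed to be of equal size (660 trials per session over 22 blocks gives 30 trials per block), and the inter-trial interval is read as a 300 ms offset plus an exponential excess with a mean of 25 ms, resampled if it exceeds 800 ms.

```python
import itertools
import random

SESSIONS = ("weak_target", "strong_target")
MODALITIES = ("auditory", "visual", "audiovisual")
INTENSITIES = ("weak", "strong")
TRIALS_PER_CONDITION = 110
BLOCKS_PER_SESSION = 22


def build_trials(rng):
    """Return one shuffled trial list per session (go and no-go intermixed)."""
    sessions = {}
    for session in SESSIONS:
        target = session.split("_")[0]  # "weak" or "strong"
        trials = [
            {"session": session, "modality": m, "intensity": i, "go": i == target}
            for m, i in itertools.product(MODALITIES, INTENSITIES)
            for _ in range(TRIALS_PER_CONDITION)
        ]
        rng.shuffle(trials)  # assumption: no further sequence constraints
        sessions[session] = trials
    return sessions


def split_into_blocks(trials, n_blocks=BLOCKS_PER_SESSION):
    """Cut a session into equally sized blocks (equal size is an assumption)."""
    size = len(trials) // n_blocks
    return [trials[k * size:(k + 1) * size] for k in range(n_blocks)]


def sample_iti(rng, lower=300.0, upper=800.0, mean_excess=25.0):
    """Draw an inter-trial interval in ms (assumed shifted, truncated exponential)."""
    while True:
        iti = lower + rng.expovariate(1.0 / mean_excess)
        if iti <= upper:
            return iti


rng = random.Random(0)
sessions = build_trials(rng)
total = sum(len(trials) for trials in sessions.values())
print(total, "trials in total")
```

Because intensity alone defines the go/no-go status, every audiovisual trial in this list is either entirely a target or entirely a distractor, which mirrors the design choice explained in the text.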
Procedure
Participants were seated in front of a computer screen and were asked to place their heads upon a chin rest that was located on a tabletop. Participants were then calibrated such that their eye gaze was properly tracked. Then participants were given instructions on the task. Participants had to press a response button whenever they saw or heard a stimulus of a particular intensity (i.e., weak vs. strong) depending on the session. The stimuli were either visual, auditory, or audiovisual. Whenever the stimulus was not in the assigned go-signal intensity, participants had to withhold their response and ignore the no-go stimulus. They were told that the orientation of the visual stimuli and the frequency of the auditory stimuli were irrelevant for the task and to ignore these stimulus features. After the task instructions, participants completed a brief practice session (30 trials) to ensure they were familiar with the task. Once they had completed the practice session, they continued with the two experimental sessions. Participants were informed that they were able to take breaks at their will. The experiment took approximately 40 min to complete.
Data analysis
For exploratory analysis, mean response times were submitted to a 2 × 2 × 3 repeated-measures analysis of variance (ANOVA). The first factor was go-signal type, the second was the stimulus intensity, and the third was the sensory modality. RSEs were investigated by submitting the empirical cumulative distribution functions (eCDFs) for auditory, visual, and auditory-visual stimuli to the test of the race model inequality (Inequality 1).
For each participant, modality, and intensity, eCDFs were determined based on the correct responses and omitted responses (coded as infinitely long responses, Miller, 2004, Appendix A). The eCDFs of anticipations and false alarms for each no-go stimulus were subtracted from the eCDFs of the respective go stimulus, which is known as the “kill-the-twin” correction of the eCDF (e.g., Eriksen, 1988; Gondan & Minakata, 2016, Inequality 8). If the race model holds, the difference $d = F_{AV}(t) - F_{A}(t) - F_{V}(t)$ should be below or equal to zero. One-sample T-statistics were calculated for d at the RT percentiles 5, 10, 15, ..., 45 across participants and aggregated into a Tmax statistic using the maximum T across the tested percentiles. The distribution of the Tmax statistic under the race model assumption was determined by permutation (Gondan, 2010). The proportion of permutations that resulted in Tmax values greater than the observed Tmax was used to calculate the p value for the violation of the race model prediction. Separate permutation tests were performed for the two target conditions (weak and strong).
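The per-participant steps described above (eCDFs with omissions counted as infinitely slow, the kill-the-twin correction, and the difference d) might be sketched as follows. This is an illustration only: the function names are hypothetical, the toy response times are made up, and the choice of time points (for example, percentiles of a particular distribution) is left to the caller because the text does not spell it out.

```python
def ecdf(rts, n_trials, t):
    """Proportion of the n_trials presentations with a response at or before t.

    `rts` holds only trials with a response. Trials without a response count
    as infinitely slow, so they enlarge the denominator but never the numerator.
    """
    return sum(1 for rt in rts if rt <= t) / n_trials


def corrected_ecdf(go_rts, n_go, nogo_rts, n_nogo, t):
    """Kill-the-twin correction: go eCDF minus the matching no-go eCDF."""
    return ecdf(go_rts, n_go, t) - ecdf(nogo_rts, n_nogo, t)


def race_difference(data, t):
    """d(t) = F_AV(t) - F_A(t) - F_V(t), computed from corrected eCDFs."""
    f = {m: corrected_ecdf(*data[m], t) for m in ("A", "V", "AV")}
    return f["AV"] - f["A"] - f["V"]


# Made-up toy data for one participant:
# (go RTs, number of go trials, false-alarm RTs, number of no-go trials)
toy = {
    "A": ([300, 320, 350, 380, 400, 420, 450, 480, 500], 10, [250], 10),
    "V": ([290, 310, 330, 360, 390, 410, 440, 470, 490, 520], 10, [], 10),
    "AV": ([240, 260, 270, 280, 300, 310, 330, 350, 370, 400], 10, [230, 245], 10),
}

for t in (260, 300, 340):
    print(t, round(race_difference(toy, t), 3))
```

Positive values of d at a given time point indicate that redundant responses are faster than the race model bound allows at that point.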
Results
Mean reaction time
There was a main effect of intensity whereby strong go-signals elicited faster responses relative to weak go-signals, F(1, 15) = 15.3, p = .001. The main effect of modality was also significant: the AV go-signals yielded the fastest responses, followed by the V go-signals, and the A go-signals yielded the slowest responses, F(2, 30) = 85.1, p < .001. Redundancy effects were larger for strong signals than for weak signals, as indicated by a statistically significant intensity by modality interaction, F(2, 30) = 10.2, p = .002 (see Figure 1). Only a few false alarms and omissions were observed (Table 1). In the weak go-signal condition, false alarms occurred least frequently for strong audiovisual non-targets. In contrast, in the strong go-signal condition, the false alarm rates were highest for weak audiovisual non-targets. However, these differences were not statistically significant.

Figure 1. Mean reaction time as a function of go-signal intensity and modality.
Table 1. Mean RT and standard deviations per modality and task. RT: response time.
Test of the race model inequality
The eCDFs were calculated at the subject level for each of the go-signal conditions. Later, these eCDFs were submitted to a kill-the-twin correction to eliminate biases introduced by fast guesses and anticipations based on the observed behaviour in the no-go conditions. In those sessions in which participants had to respond to strong stimuli, the permutation test yielded statistically significant violations of the race model inequality (Tmax = 8.84, p < .001). In contrast, when participants had to respond to weak signals, responses to redundant audiovisual stimuli were consistent with a race model (Tmax = 1.06, p > .10, Figure 2).
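A Tmax test of this kind can be sketched with a generic sign-flipping permutation scheme. Note the assumption: sign flipping is a standard one-sample permutation approach used here for illustration, and the procedure of Gondan (2010) may differ in detail. The data are synthetic (16 simulated participants, 9 percentiles), and all names are hypothetical.

```python
import math
import random


def t_statistic(values):
    """One-sample t statistic of the values against zero."""
    n = len(values)
    mean = sum(values) / n
    var = sum((x - mean) ** 2 for x in values) / (n - 1)
    if var == 0.0:
        return 0.0 if mean == 0.0 else math.copysign(math.inf, mean)
    return mean / math.sqrt(var / n)


def t_max(d):
    """Maximum t across percentiles; d[i][j] is participant i at percentile j."""
    n_points = len(d[0])
    return max(t_statistic([row[j] for row in d]) for j in range(n_points))


def permutation_p(d, n_perm=2000, seed=0):
    """Observed Tmax and the proportion of permutations that exceed it."""
    rng = random.Random(seed)
    observed = t_max(d)
    exceed = 0
    for _ in range(n_perm):
        # Flip the sign of each participant's whole vector of differences.
        flipped = []
        for row in d:
            sign = rng.choice((-1.0, 1.0))
            flipped.append([sign * x for x in row])
        if t_max(flipped) > observed:
            exceed += 1
    return observed, exceed / n_perm


# Synthetic differences with a built-in positive shift (a simulated violation).
data_rng = random.Random(42)
violation = [[0.05 + data_rng.gauss(0.0, 0.03) for _ in range(9)]
             for _ in range(16)]
observed, p_value = permutation_p(violation)
print(f"Tmax = {observed:.2f}, p = {p_value:.4f}")
```

Flipping the sign of a participant's entire vector, rather than of single percentiles, keeps the dependence between percentiles intact within each permutation.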

Figure 2. RT distributions for the different intensities and modalities. A clear violation of the race model prediction is observed for (a) strong but not for (b) weak stimuli.
Discussion
The goal of the present study was to investigate the nature of redundancy gains within an audiovisual go/no-go paradigm and whether redundancy gains depend on the intensity of the go-signal as would be predicted by the law of inverse effectiveness. We implemented two experimental conditions using the same set of weak and strong auditory, visual, and auditory-visual stimuli, but two different tasks in separate sessions: in the “strong target” session, participants had to ignore weak signals and execute a speeded response when they heard or saw a strong signal, while in the “weak target” session, participants were given the opposite response instructions. In line with previous findings in go/no-go tasks (Blurton et al., 2014; Gondan et al., 2010; Grice & Canham, 1990; Grice, Canham, & Boroughs, 1984; Grice, Canham, & Gwynne, 1984), we expected strong audiovisual redundancy gains. Because weak activations leave more room for coactivation (Holmes, 2009), we expected even more pronounced coactivation in the weak target condition, that is, the experimental session in which participants were asked to selectively respond to the weak targets, but ignore the strong distractors (inverse effectiveness, see Burnett et al., 2005; Jiang et al., 2007; Meredith & Stein, 1983, 1986).
Interestingly, on the millisecond scale, redundancy gains in our study were greater in the strong target condition than in the weak target condition, which contradicts inverse effectiveness. In fact, statistically significant violations of the race model inequality (Inequality 1) were only found in the strong target condition, whereas in the weak target condition, redundancy gains were consistent with a race model account. Given the structural equivalence of the two tasks, this indicates two different mechanisms of multisensory integration in the same experimental setup: In the weak target condition, participants kept the channel-specific information separate, whereby weak redundant stimuli elicited speeded responses that were nearly equivalent to the distribution of minimum RTs for their unisensory constituents. In the strong target condition, participants effectively coactivated the different sources of information such that audiovisual, redundant signals elicited the fastest responses.
In the following, we propose a hybrid race and coactivation model that can account for the pattern of results observed in the present experimental setup (an alternative account is described below): The model is illustrated in Figure 3. Go/no-go discrimination is a comparison of the internal stimulus representation with a criterion that distinguishes weak and strong stimuli. The activation of the sensory channel accumulates over time until there is enough evidence to decide if the stimulus was weak or strong (Gomez, Ratcliff, & Perea, 2007; Link & Heath, 1975). In the superposition model, the channel-specific activations add up, such that the coactivated multisensory representation is just a stronger stimulus (Diederich, 1995; Schwarz, 1994; see Ulrich & Miller, 1997, for an overview on coactivation architectures). In the strong target condition, this results in a percept that is farther away from the discrimination boundary, so the participant can effectively make use of the coactivation mechanism for fast go/no-go decisions. Consistent with this coactivation account, most false alarms occurred in weak audiovisual no-go stimuli.
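The superposition idea can be made concrete with a small random-walk simulation. This is a minimal sketch under stated assumptions, not a fit to the present data: activation accumulates in 1 ms steps with a constant drift plus Gaussian noise, the channel noise is taken to be independent so that variances add, and the drift rates, noise level, and criterion are arbitrary illustrative values.

```python
import math
import random


def first_passage_time(drift, sigma, rng, criterion=100.0, max_steps=10000):
    """Number of 1 ms steps until the accumulated activation reaches the criterion."""
    activation = 0.0
    for step in range(1, max_steps + 1):
        activation += drift + sigma * rng.gauss(0.0, 1.0)
        if activation >= criterion:
            return float(step)
    return math.inf  # criterion not reached within max_steps


def mean(values):
    return sum(values) / len(values)


rng = random.Random(7)
N = 1000
DRIFT_A, DRIFT_V, SIGMA = 0.3, 0.4, 2.0  # illustrative values only

rt_a = [first_passage_time(DRIFT_A, SIGMA, rng) for _ in range(N)]
rt_v = [first_passage_time(DRIFT_V, SIGMA, rng) for _ in range(N)]

# Superposition: the drifts add up; with independent noise, variances add too.
sigma_av = math.sqrt(2.0) * SIGMA
rt_av = [first_passage_time(DRIFT_A + DRIFT_V, sigma_av, rng) for _ in range(N)]

# Race benchmark: the faster of two separately processed channels.
rt_race = [min(a, v) for a, v in zip(rt_a, rt_v)]

print(f"A: {mean(rt_a):.0f}  V: {mean(rt_v):.0f}  "
      f"race: {mean(rt_race):.0f}  superposition: {mean(rt_av):.0f}")
```

With these settings, the superposed accumulator reaches the criterion earlier on average than the faster of the two separate channels, which is the signature that distinguishes coactivation from a race.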

Figure 3. Influence of go-signal instruction on coactivation of weak and strong stimuli. “Early” coactivation of redundant signals leads to a stronger intensity percept which is helpful in strong targets because the combined percept is more distant from the discrimination barrier. In weak targets, it hinders performance because the combined percept is closer to the discrimination barrier. It is possible that the participants were able to adjust their processing mode to the two target types, such that coactivation effects were observed in strong targets, whereas in weak targets, redundancy gains were consistent with separate processing in a race model.
We interpret this finding as coactivation occurring at an early perceptual stage of information processing. The results are in line with findings in past redundant signals experiments that typically used the more salient stimulus as the target stimulus (“oddball task,” for example, Foxe et al., 2000; Gondan et al., 2010, Experiment 1, go/no-go task; Gondan, Niederhaus, Rösler, & Röder, 2005; Molholm et al., 2002; Teder-Sälejärvi et al., 2002). The RSE task with simple RT can be considered a special case of the go/no-go task with strong targets of the present study. In a simple RT task, participants have to discriminate a “strong” stimulus (i.e., the stimulus actually used) from a “weak” stimulus with intensity zero (no stimulus). Therefore, the present results for the strong target condition are also consistent with the numerous reports of coactivation effects in simple response tasks with audiovisual stimuli (Brang et al., 2013; Diederich, 1995; Diederich & Colonius, 1987; Foxe et al., 2000; Gondan et al., 2010; Miller, 1982, 1986, Experiment 1, simple response task; Molholm et al., 2002; Teder-Sälejärvi et al., 2002; Van der Stoep et al., 2015), and with the numerous reports of early perceptual stage of processing (Brang et al., 2013; Cavina-Pratesi, Bricolo, Prior, & Marzi, 2001; Foxe et al., 2000; Molholm et al., 2002; Mordkoff, Miller, & Roch, 1996). For example, Molholm et al.’s (2002) findings showed a multisensory interaction as early as 46 ms post-stimulus.
In contrast, in the weak target condition, redundancy gains were small and consistent with a race of the unisensory percepts. With weak targets, coactivation at the early sensory stage would have again resulted in a stronger percept, but one that is closer to the discrimination boundary. Consequently, go/no-go discrimination would be more difficult. Here, the present go/no-go task required participants not only to detect the sensory input but also to perceive and evaluate the contents of the stimulus components. That is, the bimodal components had to be integrated in some way to determine whether a response was required or not. However, perceptual coactivation would have resulted in a percept that is more difficult to discriminate; participants therefore apparently switched to separate processing, which still yields redundancy gains of limited size, satisfying the speed requirements of the experimental task. If multisensory integration had occurred at later processing stages (coactivation of response tendencies, for example, Feintuch & Cohen, 2002), we would have observed coactivation effects in both the strong and the weak target condition, but we did not. The hybrid model relies on the assumption that participants are able to shift strategically from coactive processing in the strong target condition to separate processing in the weak target condition. In the present experiment, this shift of strategy was technically possible because the weak target condition and the strong target condition varied across sessions (not on a trial-by-trial basis).
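The argument about the discrimination boundary can be illustrated with a toy calculation. The numbers below are arbitrary units chosen for illustration and are not derived from the stimulus intensities of the experiment; the only assumption carried over from the text is that early coactivation adds the channel-specific activations.

```python
WEAK, STRONG = 1.0, 4.0   # hypothetical unimodal percept strengths
CRITERION = 2.5           # hypothetical weak/strong discrimination boundary


def percept(channel_intensities):
    """Early coactivation (superposition): channel activations add up."""
    return sum(channel_intensities)


def distance_to_criterion(value, criterion=CRITERION):
    """Larger distances mean an easier weak/strong decision."""
    return abs(value - criterion)


conditions = {
    "weak unimodal": [WEAK],
    "weak audiovisual": [WEAK, WEAK],
    "strong unimodal": [STRONG],
    "strong audiovisual": [STRONG, STRONG],
}

for name, intensities in conditions.items():
    d = distance_to_criterion(percept(intensities))
    print(f"{name:>18}: distance to criterion = {d}")
```

In this toy setting, adding a second strong component moves the percept farther from the boundary, whereas adding a second weak component moves it closer, so coactivation would help with strong targets and hurt with weak targets.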
Alternatively, it might be possible that both early and late integration were simultaneously active in the current task. If early integration follows the logic of channel superposition (Schwarz, 1989, 1994), stimulus energy adds up, so that the audiovisual stimulus is simply perceived as a stronger stimulus. This mechanism is helpful in the strong target condition, whereas in the weak target condition, it hinders performance (Figure 3). In contrast, with late integration, the response tendencies coactivate, which boosts performance in both weak and strong targets. These response tendencies could, for example, arise from a comparison of the perceived channel-specific intensity signals with an internal criterion for strength discrimination, and feed into a module that triggers a response when enough evidence has been accumulated (e.g., Link & Heath, 1975).
Neither of the two mechanisms alone can explain the observed pattern of results: Pure early integration produces redundancy gains in strong targets, but is not consistent with the redundancy gain observed in the weak targets. Pure late integration predicts a violation of the race model inequality in both strong and weak targets—the latter was not observed. However, the simultaneous operation, or a probability mixture, of the two mechanisms might actually lead to the observed pattern of results. On one hand, substantial redundancy gains may be observed in the strong targets condition because both mechanisms operate in the same direction. On the other hand, small redundancy gains (that may not exceed the race model prediction) may arise in the weak targets condition because the early and late integration mechanisms operate in opposite directions. Further studies might investigate specific predictions that arise from the different architectures (hybrid, mixture, or simultaneous early and late integration).
The present study is limited by a post hoc conclusion from a null result (no significant violation of the race model inequality in the weak target condition), as well as an inbuilt confounding of target intensity and task type—by design, there are no responses to strong stimuli in the weak target condition and vice versa. The above conclusions are preliminary because of their post hoc nature, explaining phenomena after inspection of the results, and the two models described in this discussion might need verification in other experimental setups. Despite these limitations, the results of the present study represent an interesting instance of a violation of inverse effectiveness, and they underline the flexibility of the brain’s multisensory behaviour in response to different task demands (Andersen, Tiippana, & Sams, 2004; Powers, Hillock, & Wallace, 2009; Stevenson, Wilson, Powers, & Wallace, 2013).
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
