Abstract
Although it has been documented that musical training enhances multisensory integration, there is not yet a consensus as to how musical training influences the visual dominance effect in sensory dominance. The present study adopted the Colavita visual dominance paradigm, presenting auditory stimuli concurrent with visual stimuli, to investigate the visual dominance effect between music majors and nonmusic majors and compared the reaction time and response proportion of the two kinds of participants in the bimodal trials. The results showed that the proportion of simultaneous responses in bimodal trials of music majors is higher than that of nonmusic majors; the nonmusic majors show a greater difference between the proportion of “Visual-Auditory” trials and “Auditory-Visual” trials compared with the music majors; the ΔRT of the two responses of the nonsimultaneous bimodal trials of nonmusic majors is longer than that of music majors. The results indicated that musically trained individuals have an enhanced ability to bind visual and auditory information and show a lesser Colavita effect, that is, a reduced visual dominance effect, than their nonmusic major peers. We conclude that musical training extends beyond the field of vision or auditory domain, improves audiovisual integration, and reduces the visual dominance effect.
In daily life, our perception of the world relies on multiple sensory systems, and we need to process stimuli from different sensory modalities and integrate them into a unified, stable whole (Talsma et al., 2010). This process is called multisensory integration. However, the brain does not give equal weight to information from different modalities. There is competition between sensory modalities, which is referred to as the sensory dominance effect (Colavita, 1974). This effect is a result of evolution: to adapt to a complex natural and social environment, people must allocate limited cognitive resources effectively and give priority to more important sensory information (Posner et al., 1976). In most cases, visual stimuli receive preferential processing and dominate other sensory modalities, a phenomenon known as visual dominance (Huang et al., 2015).
An intriguing example of visual dominance is the Colavita effect; that is, when visual and auditory stimuli are presented at the same time, participants respond only or preferentially to the visual stimuli (Colavita, 1974). To explain the Colavita effect, different studies have varied the ratio of the stimulus types and the response demands (Egeth & Sager, 1977; Koppen & Spence, 2007c). Follow-up studies have suggested that the effect may result from the asymmetric inhibition-promotion relationship between vision and audition (Sinnett et al., 2008), in which auditory stimuli facilitate responses to visual targets whereas visual stimuli impair responses to auditory targets. Another view is that participants rely on unconscious stimulus processing, rather than conscious perception, to respond to the bimodal stimulus (Spence, 2009). A further account holds that, in the bimodal trials, participants sometimes do not process the auditory component, so it may not be consolidated into short-term memory. As a result, the participants simply “forget” to respond to the auditory target; this is the rapid forgetting account (Spence, 2009). More recently, it has been proposed that visual dominance may result from visual input interfering with auditory processing during the response and/or decision phase (Robinson et al., 2016). Although the Colavita effect is robust and occurs consistently (Hirst et al., 2018) and is related to the proportion of trials, response demands, spatial consistency, and the manipulation of attention (Fang et al., 2020; Koppen & Spence, 2007a, 2007c, 2007d; Yue et al., 2015), there is not yet a consensus on the mechanism underlying the Colavita effect.
However, a growing number of studies have focused on individual differences in sensory dominance. They found that these individual differences may result from differences in the extent of dependence on visual or auditory stimuli (Sekiyama et al., 2014; Stevenson et al., 2012). Recently, a meta-analysis indicated that the visual dominance effect in adults is significant, whereas children show a smaller, or even reversed, Colavita effect. This suggests that adults are more dependent on visual information because of their substantially developed visual system (Hirst et al., 2018), whereas children are more dependent on auditory information because auditory input tends to be more transient than visual input, as visual scenes are usually more persistent (Robinson & Sloutsky, 2004; Wille & Ebersbach, 2016). In addition, audiovisual integration ability is affected by neurodevelopmental disorders such as autism (Wallace & Stevenson, 2014). Participants with autism spectrum disorder (ASD) are unable to access information from more than one sense and have difficulties in integrating information into higher levels of processing, so they show no Colavita effect or even a reverse Colavita effect (Moro et al., 2012). Therefore, sensory dominance mainly depends on the extent of reliance on sensory information, which is related to the developmental process of the visual and auditory cortices (Graven & Browne, 2008a, 2008b; Hirst et al., 2018) and to individual differences in the ability to assimilate information from more than one sense (Wallace & Stevenson, 2014).
The present study investigated the hypothesis that a salient form of multimodal experience, musical training, can sharpen audiovisual processing. Due to long-term exposure to auditory (pitch) information during musical training, musicians’ auditory abilities, such as auditory working memory, are significantly enhanced (Ho et al., 2003; Landry & Champoux, 2017), which makes them more dependent on auditory information (Proverbio et al., 2016). Also, the activation of the cerebral cortex of musicians is more intense than that of the average person when hearing pure tones, which makes the auditory sensitivity of musicians stronger (Pantev et al., 1998). In addition, musical training can also enhance the development of visual skills. Some studies have shown that the visuospatial processes of musicians are enhanced following musical training because of long-term exposure to visual (musical notation) cues (Bidelman et al., 2013). Moreover, studies have shown that musical experience extends beyond simple auditory and visual processing, also enhancing audiovisual integration for combining multisensory cues more broadly (Bidelman, 2016). Musicians are more sensitive in detecting coherent audiovisual stimuli, and their audiovisual integration window is shorter (Lee & Noppeney, 2011, 2014). To date, studies on the effects of musical training on multisensory integration have mainly focused on the redundant signal effect (Li & Yang, 2020) and the multisensory illusion effect (Bidelman, 2016; Proverbio et al., 2016); there is no consensus as to how musical training affects sensory dominance.
Moreover, existing results of sensory dominance studies are contradictory and controversial (Politzer-Ahles & Pan, 2019; Proverbio et al., 2016). Some studies have used auditory (phoneme) and visual (talker’s lip movement) stimuli to compare the extent to which musicians and nonmusicians were affected by the McGurk effect, another visual dominance effect; the results indicated that musicians are not subject to the McGurk effect (Proverbio et al., 2016). However, others have shown a large and statistically significant McGurk effect with a larger sample size and within-participants manipulation (Politzer-Ahles & Pan, 2019). Musicians have developed finer acoustic abilities, so they rely more on acoustic cues than on visual cues to process speech information when compared with nonmusicians (Sams et al., 1991). Due to the interference of noise level (Proverbio et al., 2016) and the ceiling effect, the influence of musical training on sensory dominance still needs to be further explored. Moreover, it is unclear whether musicians’ supposed improvements in multisensory integration stem from domain-general benefits to audiovisual processing itself or from musicians’ well-known improvement in processing language and music-related stimuli (Bidelman, 2016; Bidelman et al., 2013). Unlike the McGurk effect paradigm, the robustness of the Colavita effect has been confirmed by research published over the last couple of years (Spence, 2009), and the Colavita effect paradigm is not affected by the individual’s previous experience because it adopts simple sensory stimuli, which can better reveal the low-level sensory attributes/features in the process of sensory modal processing and the pervasive nature of sensory dominance (Ngo et al., 2010; Spence, 2009).
In the classical Colavita paradigm, only the proportion of Visual-Only trials can be used to measure the size of the visual dominance effect (Colavita, 1974; Fang et al., 2020). Also, as the participants can hardly respond to the auditory target of the bimodal trials, no experimental data can be obtained to illustrate the size of auditory dominance. To investigate the behavioral mechanisms that show how musical training influences the direction and size of the Colavita effect, we need to be able to measure both the visual and the auditory dominance effect in the same paradigm. Therefore, the present study adopted a modified Colavita paradigm (Fang et al., 2020; Huang et al., 2015; Su et al., 2020) to investigate whether musical training influences the visual dominance effect in audiovisual integration. The modified Colavita paradigm focuses on the reaction times (RTs) to visual and auditory components in bimodal trials (Egeth & Sager, 1977), in addition to the proportion of visual versus auditory dominance in bimodal trials (Fang et al., 2020; Huang et al., 2015; Su et al., 2020). In the present study, when the participants were explicitly informed of the existence of bimodal trials, the mean RTs to the visual targets were significantly faster than those to the simultaneously presented auditory targets, which confirmed the Colavita effect on the reaction time level (Egeth & Sager, 1977). Therefore, visual responses could either precede or follow the auditory responses. Correspondingly, the former trials are classified as visual precedence trials, and the latter trials are classified as auditory precedence trials in a post hoc way (Fang et al., 2020; Huang et al., 2015; Su et al., 2020).
In such a paradigm, the direction of sensory dominance is defined by comparing the proportion of visual to auditory precedence trials: Visual dominance occurs when the proportion of visual precedence trials is significantly larger than the proportion of auditory precedence trials, and auditory dominance occurs vice versa. In addition, the magnitude of sensory dominance can be derived by calculating the RT difference between the visual and auditory responses in the same trial.
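The two measures defined above can be illustrated with a minimal sketch. It assumes that each two-response bimodal trial is stored as a pair of reaction times in milliseconds; the function name `dominance_summary` and the dictionary keys are hypothetical and not taken from the study. The sketch is purely descriptive: in the study, the direction of dominance is decided by a statistical comparison of the proportions, not by the raw values.

```python
from statistics import mean

def dominance_summary(trials):
    """Summarise two-response bimodal trials.

    trials: list of (rt_visual_ms, rt_auditory_ms) pairs, one per bimodal
    trial in which both keys were pressed at different times (hypothetical
    data layout, assumed for illustration).
    """
    # Visual precedence: the visual response came first.
    visual_first = [(v, a) for v, a in trials if v < a]
    # Auditory precedence: the auditory response came first.
    auditory_first = [(v, a) for v, a in trials if a < v]
    n = len(trials)
    return {
        # Direction: compare these two proportions.
        "p_visual_first": len(visual_first) / n,
        "p_auditory_first": len(auditory_first) / n,
        # Magnitude: mean RT difference between the two responses in a trial.
        "delta_rt_visual_first": mean(a - v for v, a in visual_first) if visual_first else None,
        "delta_rt_auditory_first": mean(v - a for v, a in auditory_first) if auditory_first else None,
    }

# Made-up reaction times, for illustration only.
example = [(300, 450), (320, 400), (500, 410), (310, 480)]
summary = dominance_summary(example)
```

In this made-up example, three of the four trials are visual precedence trials, so the descriptive pattern points toward visual dominance.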
To explore how musical training influences the Colavita effect, 58 healthy, age-matched, undergraduate male and female participants (29 nonmusic majors and 29 music majors) were engaged in both bimodal (audiovisual) and unimodal (only auditory and only visual) conditions. The group differences in the visual dominance effect were quantified by comparing the reaction proportion and the reaction times of participants in bimodal trials. As musical training can make individuals more dependent on auditory stimuli and enhance auditory sensitivity (Ho et al., 2003; Landry & Champoux, 2017; Proverbio et al., 2016), we hypothesized that the size of Colavita effect of music majors is smaller than that of nonmusic majors. Given that musical training can improve audiovisual integration (Bidelman, 2016; Lee & Noppeney, 2011, 2014), the hypothesis can be made that the proportion of simultaneous responses in bimodal trials of music majors is higher than that of nonmusic majors.
Method
Participants
To test the suitability of the sample size, we performed a sensitivity analysis of within-between subjects repeated measures in G*Power 3.1.9.2 (Faul et al., 2007, 2009). The input parameters were as follows: effect size f = 0.32, α err prob = 0.05, power (1–β err prob) = 0.80, number of groups = 2, number of measurements = 3, correlation among repeated measurements = 0 and the nonsphericity correction parameter = 54; therefore, the total sample size is 54, and the appropriate sample size per group was 54 / 2 = 27. Allowing for possible problems during the experiment (such as invalid participants), 29 participants were recruited for each group. Twenty-nine nonmusic majors (mean age: 20.2 ± 1.13 years; 9 males, 20 females) and 29 music majors (mean age: 19.8 ± 0.83 years; 6 males, 23 females) participated in this experiment. All of them were right-handed, with normal hearing and normal or corrected-to-normal visual acuity. Nonmusic majors were required to have never received systematic musical training, and music majors were required to be proficient in one musical instrument and to practice more than 20 hours a week. In addition, information related to the musical training of the music majors was collected (see Table 1). In accordance with the Declaration of Helsinki, all participants gave informed consent before the experiment and were paid after the experiment. This study was approved by the Ethics Committee of the Department of Psychology, Soochow University.
Musical Demographics of the Music Majors.
Apparatus and materials
The experiment was conducted in a soundproof, quiet and dimly lit room. In the experiment, each stimulus was presented on a 27-inch ASUS IPS monitor with a resolution of 2560 (horizontal) × 1440 (vertical) pixels and a refresh rate of 60 Hz and controlled by Presentation software (Neurobehavioral Systems Inc., Albany, CA; https://www.neurobs.com/). All the visual stimuli were presented on a black (RGB value: 0, 0, 0) background. The central fixation was a white “+” (0.5° × 0.5°) that appeared in the center of the screen for 700 ms. The visual target was a white sphere (2.98° × 2.98° of visual angle) that was presented for 50 ms, and both of them were presented. Participants were positioned 60 cm from a gray computer screen. The auditory target was a pure tone of 4,000 Hz that was presented for 50 ms via Logitech H340 headsets. Responses were collected with a DELL L100 keyboard.
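As a worked illustration of the stimulus sizes, the visual angles reported above can be converted to approximate on-screen sizes with the standard relation size = 2 × distance × tan(angle / 2). The physical sizes are not reported in the text; the values below are derived only from the stated angles and the 60 cm viewing distance, and the function name is hypothetical.

```python
import math

def visual_angle_to_size_cm(angle_deg, distance_cm):
    """Physical extent (cm) subtending `angle_deg` at `distance_cm`."""
    return 2 * distance_cm * math.tan(math.radians(angle_deg) / 2)

# Values taken from the text: 60 cm viewing distance,
# 2.98 deg target, 0.5 deg fixation cross.
target_cm = visual_angle_to_size_cm(2.98, 60)    # roughly 3.1 cm
fixation_cm = visual_angle_to_size_cm(0.5, 60)   # roughly 0.5 cm
```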
Design and procedure
There were three types of trials: Unimodal auditory trials, unimodal visual trials, and bimodal trials (see Figure 1) in which the auditory and visual targets were presented simultaneously. Participants were asked to press one key if they saw the visual target and to press another key if they heard the auditory target with the index and middle fingers of their right hand, and to respond as quickly and accurately as possible. The mapping between the auditory/visual stimulus and the two response keys was counterbalanced across participants.

Stimuli in the Present Study.
Before the formal experiment, the participants were given two instructions about the bimodal trials: (1) they were explicitly informed of the existence of the bimodal trials, and (2) they were instructed to make both responses whenever possible, pressing the two keys either at the same time or in sequence, starting with the key corresponding to the stimulus that they noticed first. There were 600 trials in total, among which the ratio of visual, auditory, and audiovisual stimuli was 2:2:1 (i.e., 40% visual stimuli, 40% auditory stimuli, and 20% bimodal stimuli). The three types of trials were presented in random order. Before the start of each trial, the fixation was presented for 700 ms, and the participants needed to fixate on it until the end of the trial. The intertrial interval (ITI) was randomized from 1,250 to 1,700 ms. All the participants completed 60 practice trials before the formal experiment, and the practice procedure was the same as that of the formal experiment.
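The trial structure described above can be sketched as follows. This is not the Presentation script used in the study; it is a minimal illustration that assumes a uniformly distributed ITI in whole milliseconds, and the function name, field names, and `seed` parameter are hypothetical.

```python
import random

def build_trial_list(n_trials=600, ratio=(2, 2, 1), iti_range_ms=(1250, 1700), seed=None):
    """Build a randomly ordered trial list with a visual:auditory:bimodal ratio."""
    if n_trials % sum(ratio) != 0:
        raise ValueError("n_trials must be divisible by the sum of the ratio")
    rng = random.Random(seed)
    labels = ("visual", "auditory", "bimodal")
    unit = n_trials // sum(ratio)
    # 2:2:1 with 600 trials gives 240 visual, 240 auditory, 120 bimodal.
    types = [label for label, share in zip(labels, ratio) for _ in range(unit * share)]
    rng.shuffle(types)
    return [
        {
            "type": trial_type,
            "fixation_ms": 700,                       # fixation duration from the text
            "iti_ms": rng.randint(*iti_range_ms),     # assumed uniform, inclusive bounds
        }
        for trial_type in types
    ]

trials = build_trial_list(seed=1)
```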
The present study investigated the effects of musical training on the Colavita effect through several indexes. First, we analyzed the proportions of trials in which the participants pressed only a single key and conducted a 2 (participant group: music majors vs. nonmusic majors) × 2 (type of incorrect bimodal trials: Visual-Only vs. Auditory-Only) repeated-measures analysis of variance (ANOVA) to verify the existence of the Colavita effect by observing whether the main effect of type of incorrect bimodal trials was significant. Second, we analyzed the proportions of trials in which the participants pressed two keys at different times and conducted a 2 (participant group: music majors vs. nonmusic majors) × 2 (type of correctly responded bimodal trials: Visual-Auditory vs. Auditory-Visual) repeated-measures ANOVA to verify the existence of the Colavita effect by observing whether the main effect of type of correctly responded bimodal trials was significant. Third, we planned paired t tests on simple effects to explore the difference in the proportion of Auditory-Visual trials and Visual-Auditory trials between the two groups. For the analysis of RTs, we excluded the trials with omissions, incorrect responses, and RTs exceeding 3 SDs from the mean RT for each condition. Moreover, we report the effect size by using
Results
In addition to the two types of unimodal trials, the bimodal trials were categorized into the following six types according to the recorded participants’ reaction time to the visual and/or auditory stimuli: (1) Visual-Auditory trials, in which participants first responded to the visual component and then to the auditory component; (2) Auditory-Visual trials, in which participants first responded to the auditory component and then to the visual component; (3) Visual-Only trials, in which participants responded to the visual component only; (4) Auditory-Only trials, in which participants responded to the auditory component only; (5) “Simultaneous” trials, in which participants’ reaction times for the auditory and the visual components were the same. Due to the error of the presentation software (Presentation Software package; Neurobehavioral Systems), the trials in which the absolute value of the reaction time difference between the auditory stimulus and the visual stimulus in a bimodal trial was less than 5 ms were also classified as simultaneous trials (Huang et al., 2015); and (6) “missed” trials in which no responses were recorded.
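The six-way classification above can be sketched as a small function. It assumes that a missing response is recorded as `None` and that reaction times are in milliseconds; the function and variable names are hypothetical, and the 5 ms tolerance is the one stated in the text.

```python
from collections import Counter

def classify_bimodal_trial(rt_visual, rt_auditory, simultaneity_ms=5):
    """Assign one bimodal trial to one of the six response types.

    rt_visual / rt_auditory: reaction time in ms, or None if that key
    was not pressed (assumed encoding).
    """
    if rt_visual is None and rt_auditory is None:
        return "Missed"
    if rt_auditory is None:
        return "Visual-Only"
    if rt_visual is None:
        return "Auditory-Only"
    # Identical RTs, or RTs less than 5 ms apart, count as simultaneous.
    if abs(rt_visual - rt_auditory) < simultaneity_ms:
        return "Simultaneous"
    return "Visual-Auditory" if rt_visual < rt_auditory else "Auditory-Visual"

def trial_type_proportions(trials):
    """Proportion of each response type among (rt_visual, rt_auditory) pairs."""
    counts = Counter(classify_bimodal_trial(v, a) for v, a in trials)
    return {label: count / len(trials) for label, count in counts.items()}

# Made-up reaction times, for illustration only.
example = [(300, 450), (420, 380), (350, None), (None, 400), (333, 336), (None, None)]
labels = [classify_bimodal_trial(v, a) for v, a in example]
proportions = trial_type_proportions(example)
```

Checking the two "only" cases before the 5 ms comparison keeps the subtraction from ever being applied to a missing response.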
Proportions of different trial types
The proportions of different types of bimodal trials of music majors and nonmusic majors are illustrated in Figure 2. First, we followed the first method of data analysis on the Colavita effect: The proportions of incorrect bimodal trials (i.e., Visual-Only and Auditory-Only bimodal trials) were submitted to a 2 (participant group: music majors vs. nonmusic majors) × 2 (type of incorrect bimodal trials: Visual-Only vs. Auditory-Only) repeated-measures ANOVA. The main effect of the type of incorrect bimodal trials was significant, F(1, 56) = 4.22, p = .045,

Proportions of the Six Different Types of Behavioral Conditions in the Bimodal Trials.
Furthermore, for the correct and nonsimultaneous bimodal trials, we submitted the proportions of the Visual-Auditory and the Auditory-Visual trials to a 2 (participant group: music majors vs. nonmusic majors) × 2 (type of correctly responded bimodal trials: Visual-Auditory vs. Auditory-Visual) repeated-measures ANOVA. The main effect of group was significant, F(1, 56) = 32.23, p < .001,
A comparison of the proportions of simultaneous trials between the two groups, t(28) = 8.92, p < .001, d = 1.93, indicated that the proportion of simultaneous trials of music majors (13.61%) was significantly larger than that of nonmusic majors (5.94%).
RTs in the bimodal trials
RTs in the Visual-Auditory and the Auditory-Visual bimodal trials were submitted to a 2 (participant group: music majors vs. nonmusic majors) × 2 (type of response: visual vs. auditory response) × 2 (response order: first response vs. second response) repeated-measures ANOVA. The main effect of group was not significant, F < 1. The main effect of type of response was significant, F(1, 56) = 14.30, p < .001,
These patterns of results suggested that, for both music majors and nonmusic majors, it was faster for the auditory responses to recover from the previous visual responses than for the visual responses to recover from the previous auditory responses: 92 ms vs. 112 ms for the music majors and 133 ms vs. 160 ms for the nonmusic majors (planned paired-sample t tests were significant in both groups, all p values < .05). To further explain the three-way interaction, we conducted a 2 (type of response: visual vs. auditory response) × 2 (response order: first response vs. second response) repeated-measures ANOVA separately for the music majors and the nonmusic majors (Figure 3(a) and (b)). The main effect of the type of response was significant, F(1, 28) = 5.44, p = .027,

RTs in the Bimodal and Unimodal Trials of the Experiment.
A paired-sample t test was adopted to compare the ΔRT of the two groups in the Visual-Auditory and Auditory-Visual trials (Figure 3(c)). In the Auditory-Visual trials, the ΔRT of the nonmusic majors (133 ms) was greater than that of the music majors (92 ms), t(28) = 3.28, p = .003, d = 0.88. Similarly, in the Visual-Auditory trials, the ΔRT of the nonmusic majors (160 ms) was greater than that of the music majors (112 ms), t(28) = 3.74, p = .001, d = 0.91. These results suggest that musical training affects the size of sensory dominance.
RTs in the unimodal trials
For RTs in the unimodal trials (Figure 3(d)), a 2 (participant group: music majors vs. nonmusic majors) × 2 (type of response: visual response vs. auditory response) repeated-measures ANOVA was performed. The main effect of the type of response was significant, F(1, 56) = 39.18, p < .001,
Discussion
The present study investigated the visual dominance effect in the audiovisual integration between music majors and nonmusic majors via the Colavita effect paradigm, a task requiring not only a response to unimodal visual and auditory stimuli but also a response to temporally offset auditory and visual cues. The results indicated that musically trained individuals (1) have a higher proportion of simultaneous responses in bimodal trials and (2) show a lesser visual dominance than their nonmusic major peers. That is, musical training extends beyond the field of vision or auditory domain, promotes audiovisual integration, and impacts the visual dominance effect.
In addition, these conclusions extend recent studies on musicianship and multisensory integration for speech and musical stimuli to general stimuli (Politzer-Ahles & Pan, 2019; Proverbio et al., 2016). Previous studies on the multisensory integration and sensory dominance of musicians and nonmusicians mostly used speech or musical stimuli, but their results may be affected by the enhanced language ability of musicians (Aizenman et al., 2018; Lee & Noppeney, 2011; Proverbio et al., 2016). The present study presented nonspeech and nonmusical stimuli and found that the proportion of simultaneous trials among the bimodal trials of music majors was 7.67 percentage points higher than that of nonmusic majors, which indicates that music majors have enhanced processing of concurrent audiovisual objects. In the nonsimultaneous bimodal trials of music majors, the Visual-Auditory proportion was smaller and the ΔRT was shorter, which reflects a smaller visual dominance effect. These findings extend previous results and reveal the effects of musical training on the visual dominance effect in audiovisual integration without lexical-semantic meaning or musical familiarity of the stimulus.
To account for the result that music majors show a lesser visual dominance than their nonmusic major peers, we propose three possibilities from the perspective of perceptual processing. First, visual dominance can be considered as an attentional bias toward the visual modality to compensate for the poor alertness of the visual system (Posner et al., 1976). Because music majors are exposed to musical notation during musical training (Lee, 2012; Wong et al., 2014), their visual perception is enhanced, thus reducing the need to compensate for the visual system. As a result, their attentional bias toward the visual modality is reduced, which leads to a lesser visual dominance. Second, auditory stimuli facilitate visual stimulus processing, whereas visual stimuli impair auditory stimulus processing (Sinnett et al., 2008). So far, no research has explained why participants do not quickly recognize that they made an incorrect response (Visual-Only bimodal trials) and then quickly respond to the ignored auditory target (Spence, 2009), even though they had enough time to make the supplementary response before the next trial (Koppen & Spence, 2007a, 2007c, 2007d). It is worth noting that the present study found that the ΔRT of the nonmusic majors was greater than that of the music majors in Visual-Auditory trials, which indicates that musical training can diminish the size of the Colavita effect. In the present study, for the music majors, the presentation of the visual stimulus did not reduce the perceptibility of the simultaneously presented auditory stimulus; thus, they would not “forget” to respond to the auditory target. Interestingly, at the end of the experiment, we asked the participants to evaluate the bimodal trials, and most of the nonmusic majors considered that the auditory stimuli appeared earlier than the visual stimuli, whereas the music majors tended to perceive the auditory and visual stimuli as being presented simultaneously.
Koppen and Spence (2007b) obtained a similar result: participants perceived the auditory stimulus to have been presented slightly ahead of a simultaneously presented visual stimulus. This result is in the opposite direction to what would be expected according to the previous explanation of the Colavita effect, according to which the visual stimulus dominates the participants’ responses because it is perceived first. However, Spence (2009) stated that there is a causal link between stimulus and response; in other words, the way we respond modulates our perception of the stimulus (unconscious stimulus processing). As a result, we can offer another explanation: musical training changes the way individuals respond and thus affects music majors’ perception of the stimulus. Also, it is known that musical training can improve the neural encoding and mental control of auditory stimuli (Bidelman, 2016; Paraskevopoulos et al., 2012). Therefore, we speculated that musical training could change this aftereffect of action on perception. Third, previous studies found that sensory dominance may reflect automatic attention to dynamic information, which means allocating attention to transient stimuli before processing more stable input (Robinson & Sloutsky, 2004; Spence, 2009; Wille & Ebersbach, 2016). Music majors have enhanced cognitive control of auditory working memory (Pallesen et al., 2010) because of long-term pitch memory during musical training, which keeps auditory information in memory longer and more stably (Pallesen et al., 2017). Therefore, attention is more equally distributed between visual targets and auditory targets for music majors because they allocate more attention to auditory information than the nonmusic majors.
The reason why the proportion of simultaneous trials of music majors is greater than that of nonmusic majors may be that musical training enhances audiovisual integration. Musical training is a multimodal cognitive activity that integrates information from different sensory modalities and the process of motor control and attention (Herholz & Zatorre, 2012). In addition, it affects audition (Nina & Bharath, 2010), vision (Lee, 2012; Wong et al., 2014), memory and their interaction (Guo et al., 2021; Pallesen et al., 2017). Therefore, musicians have enhanced audiovisual integration (Bidelman, 2016), audio-tactile integration (Landry & Champoux, 2017), and auditory-motor integration (Zatorre et al., 2007). Musical training enhances the ability of audiovisual integration, which means that music majors’ ability to perceive visual information and auditory information as a whole is enhanced (Lewkowicz & Ghazanfar, 2009). Furthermore, previous studies have found that participants responded significantly faster to the visual targets when visual targets were accompanied by accessory sounds (Quinlan, 2000; Sinnett et al., 2008), which means the auditory stimulus would facilitate the response to the visual stimulus. Consequently, we speculated that stronger processing ability of auditory stimulus can facilitate the response to bimodal trials, which means enhancement of auditory skills would improve audiovisual integration. As a result, they respond more accurately than nonmusic majors who do not have musical experience to the audiovisual targets presented simultaneously.
From a neurological point of view, the higher percentage of simultaneous responses in the bimodal trials of music majors compared with nonmusic majors may be because musical training enhances the connectivity between sensory modalities; that is, music majors can bind the information from different modalities faster (Bidelman, 2016). Many neuroimaging studies have shown that the connectivity between the sensory systems highly involved in musical training (i.e., hearing, vision, and movement) is enhanced (Grahn & Rowe, 2009), as well as the functional connectivity of brain regions involved in multisensory integration, and the corresponding cortical network forms a dynamic system that improves the processing efficiency of multisensory information (Herholz & Zatorre, 2012). Event-related potential (ERP) and functional magnetic resonance imaging (fMRI) studies show that perceptual training is related to the improvement of multisensory behavioral performance, resulting in the reduction of neural processing in perceptual tasks (Ding et al., 2003; Kelley & Yantis, 2010). Multisensory competition may be attributable to limited attentional resources (Mishra & Gazzaley, 2012). Also, divided attention to concurrent sensory signals may reduce neural responses (Lavie, 2005). However, trained individuals responded less to task-irrelevant information; that is, they showed improved neural efficacy (Mishra et al., 2011). Therefore, we speculated that the enhancement of audiovisual integration by musical training found here is due to an augmentation of more general cognitive mechanisms, such as attention distribution. Studies have shown that the enhancement of the ability to allocate audiovisual attention improves multisensory behavior and reduces sensorineural processing (Mishra & Gazzaley, 2012). Moreover, musical training increases attentional resources and/or enables one to distribute them more effectively, possibly across modalities (Strait et al., 2010).
Accordingly, it can be inferred that if musical training can enhance the ability to distribute attention across modalities, it will lead to the enhancement of the audiovisual integration ability of music majors observed in the present study, so the proportion of simultaneous trials among the bimodal trials is higher. In addition, when processing audiovisual stimuli, nonmusicians rely more on visual cues, whereas musicians activate the left inferior frontal gyrus more (Paraskevopoulos et al., 2015). Therefore, the difference between music majors and nonmusic majors may come from differences in the extent of dependence on visual and auditory information. Koppen and Spence (2007b) found that the visual stimulus must be presented 12 ms before the auditory stimulus in order for the two stimuli to be judged as being presented simultaneously, which is consistent with the nonmusic majors perceiving the auditory stimulus first. For music majors, however, after long-term auditory training, their auditory processing ability is enhanced, and they may be more sensitive to auditory stimuli. It is conceivable that in sensory modal competition, the visual dominance effect of music majors is decreased because they rely more on auditory stimulation than nonmusic majors.
As the proportion of simultaneous trials increases, the visual dominance effect is bound to decrease. The smaller visual dominance effect found in music majors may result from the enhancement of auditory abilities. Musical training can enhance auditory attention (Giard & Peronnet, 1999), auditory working memory (Pallesen et al., 2017), and auditory temporal acuity (Thomas et al., 2012). This enhanced attention may lead music majors to attend more to sound in bimodal trials. Therefore, musical training may enhance unisensory processing rather than affect sensory dominance itself. In the Colavita paradigm, participants must make behavioral decisions about bimodal stimuli, and their responses depend on their internal thresholds for visual and auditory targets (Spence, 2009). Studies have shown that musicians have a lower threshold for sound at a specific frequency; that is, they show enhanced auditory sensory coding (Zhang et al., 2019). Therefore, the reduced visual dominance effect in music majors compared with nonmusic majors may arise because they have a lower threshold for auditory targets and respond to them more easily. However, there were no significant differences in RTs to unimodal auditory stimuli between music majors and nonmusic majors. Therefore, we consider that the visual dominance effect may also be affected by audiovisual integration.
Overall, the findings of the present study can be interpreted as reflecting cognitive abilities shaped by musical training. However, whether the impact of musical training on visual dominance is due to enhanced audiovisual integration or to enhanced unimodal abilities remains unclear. Future work should therefore evaluate the relative contributions of uni- and multisensory brain mechanisms to the decreased visual dominance effect observed in musicians at the behavioral level. In addition, future studies could explore the correlation between years of musical training and the response proportions in bimodal trials of the Colavita paradigm, and whether the paradigm can be used to evaluate the effectiveness of musical training. Since musical training can improve visual and auditory processing and audiovisual integration, it may in the future serve as a method to improve cognitive processing ability in some people with ASD and other disabilities. In sum, the present study compared the Colavita effect between music majors and nonmusic majors to explore the sensory potentials associated with the cognitive development provided by musical practice. In general, music majors show enhanced audiovisual integration. This not only highlights the importance of musical training in cognitive development but also provides a theoretical basis for music learning, musical training, and its assessment, so as to take advantage of human brain plasticity.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by the 14th five-year plan of Jiangsu Province Education Science (B/2021/01/87), the Humanities and Social Sciences Research Project of Soochow University (22XM0017), the Undergraduate Training Program for Innovation and Entrepreneurship, Soochow University (202210285014Z) and the Interdiscipline Research Team of Humanities and Social Sciences of Soochow University (2022). M.Z. was also supported by the National Natural Science Foundation of China (31700939, 31871092) and the Japan Society for the Promotion of Science KAKENHI (20K04381).
