Abstract
Two experiments examined whether auditory imagery was localized to the left or right ear. Building on research of Prete and colleagues, in Experiment 1, participants imaged that a person spoke into their ear or that they spoke into an imaged person's ear. Valence of the message was positive (e.g., “you won!”) or negative (e.g., “you lost!”), and sex of the imaged person and whether the participant or imaged person was imaged to have moved were varied. Positively-valenced messages were more likely to be imaged at the right ear; negatively-valenced messages were more likely to be imaged at the left ear. In Experiment 2, participants imaged a positively-valenced (e.g., a kitten purring) or a negatively-valenced (e.g., fingernails scratching a chalkboard) nonverbal sound. Both positively-valenced nonverbal sounds and negatively-valenced nonverbal sounds were imaged at the right ear. Auditory imagery vividness and clarity, handedness, and preferred telephone ear did not generally correlate with ear preferences. Implications for lateralization of language, emotion, and auditory frequency; the inner voice/inner ear distinction; and methods of analysis are discussed.
Listeners are often faster and more accurate in processing verbal stimuli presented to their right ear than verbal stimuli presented to their left ear, and this has been referred to as a right ear advantage (for reviews, see Bryden, 1988; Hugdahl, 2003; for discussion of the neural basis, see Prete, D'Anselmo, Tommasi, et al., 2018; Tanaka et al., 2021). This pattern has been suggested to result from stronger contralateral connections than ipsilateral connections between the ears and the cerebral hemispheres and to most (right-handed) individuals having the bulk of language processing occur in the left hemisphere (e.g., Kimura, 1961, 1967). However, studies of the right ear advantage involve presentation of external stimuli, and there have not been many studies regarding whether an analogous pattern would occur with internally generated auditory imagery. The study reported here did not present external sound stimuli, but instead asked participants to generate auditory imagery and report whether that imagery was “heard” or “spoken” at their left ear or right ear or at the left ear or right ear of an imaged person, and localization at the left ear or at the right ear is referred to as a left ear preference or a right ear preference, respectively. Several areas in the left hemisphere are known to be specialized for language processing (e.g., Josse & Tzourio-Mazoyer, 2004), and auditory imagery of language involves primarily left hemisphere processing (e.g., McGuire et al., 1996). As auditory imagery often involves activation of the same neural structures as auditory perception (for review, see Hubbard, 2010, 2019), a right ear preference for auditory verbal imagery should occur.
Prete, D'Anselmo, Brancucci, et al. (2018) presented listeners with white noise in both ears. In some trials, a voice pronouncing a vowel sound at the left ear or at the right ear was also presented, and a right ear advantage occurred for detection of the vowel sound. At low voice intensities, a detected voice was more likely to be reported to be on the right, and this was referred to as an illusory right ear advantage. The right ear advantage might relate to the use of verbal stimuli, but it is possible there is a general bias to hear certain types of sounds on the right. For example, Deutsch (2013, 2019) describes two musical illusions that demonstrate a right-ear bias for higher pitches. In the scale illusion, the two ears hear different sequences of notes; in each sequence, the successive notes presented at each ear can vary widely in pitch, but listeners perceive two smoothly moving sequences, with the right ear initially hearing a high descending sequence that then ascends, and the left ear initially hearing a low ascending sequence that then descends. In the octave illusion, two notes an octave apart are simultaneously presented to the two ears, with the ear receiving the higher note changing with each presentation. Listeners perceive a single note that alternates between the ears, with the higher pitch always perceived on the right and the lower pitch always perceived on the left (for discussion of the neural basis of the latter illusion, see Brancucci et al., 2011; Prete et al., 2017). Remarkably, higher pitches were always heard on the right side in both illusions, even when the headphones were reversed, thus demonstrating a nonverbal illusory right ear advantage.
Several studies examined cerebral localization of verbal auditory hallucinations, and increased activity of the left cerebral hemisphere has been noted in auditory verbal hallucinations in patients with schizophrenia (e.g., Bentaleb et al., 2002) and in voluntary speech imagery in control participants (e.g., Shergill et al., 2001). Indeed, auditory verbal hallucinations in patients with schizophrenia have been hypothesized to result from hyperactivity of left hemisphere areas due to failure of top-down control (Hugdahl, 2009). Although several studies focused on asymmetries in cerebral activation (i.e., on cerebral lateralization) during imagery/hallucinations, asymmetries regarding imaged lateralization (i.e., on which side of subjective space the imagery/hallucination was perceived to be located) have received less investigation; indeed, in a review of literature on speech imagery (Alderson-Day & Fernyhough, 2015), imaged lateralization of speech was not discussed. There have been studies regarding whether hallucinated sounds appear to originate inside or outside of the body (e.g., Hunter et al., 2003; McCarthy-Jones et al., 2014), but whether those hallucinated sounds were localized on the left side or right side of the body was not emphasized. In one exception, participants imaged hearing a voice at one ear, and healthy controls and patients with schizophrenia who did not experience auditory verbal hallucinations imaged the voice at the right ear, but there was no ear preference for patients with schizophrenia who experienced auditory verbal hallucinations or for patients with bipolar disorder (Altamura et al., 2020).
Prete et al. (2016) had non-patient participants image that they were being spoken to at one ear or were speaking into one ear of another person, and they reported the imaged speech was localized at the participant's right ear or the imaged person's right ear, respectively. This pattern is consistent with a right ear preference for auditory verbal stimuli. However, a right ear preference for undefined sounds was also found, suggesting that the observed right ear preference was not specific to verbal imagery and questioning previous studies that found a left ear advantage for perception of nonverbal stimuli (e.g., King & Kimura, 1972; Piazza, 1977). Prete et al. considered the possible role of imaged movement of the sound source, and they claimed that a right ear preference occurred when imaged movement was related to listening but not when imaged movement was related to speaking (although this appeared limited to imaged movement of the participant and did not include imaged movement of another person). However, a difference in ear preferences between listening and speaking is not entirely consistent with motor approaches to speech (e.g., Galantucci et al., 2006; Liberman & Mattingly, 1985), which suggest similar neural structures are active in listening (perception) and in speaking (production). As speaking conditions and listening conditions in Prete et al. involved different imaged body movements of the participant or the imaged other person, perhaps these differences might have contributed to differences in ear preference. Lastly, ear preference was not influenced by whether the participant or the imaged person was male or female.
Having participants image that they were speaking to someone else or that they were listening to someone else speaking is reminiscent of the distinction in auditory imagery between the inner voice and the inner ear. Just as generating speech or listening to speech involves the voice and the ear, respectively, so too has auditory imagery of speaking and listening been hypothesized to involve an “inner voice” and an “inner ear”, respectively (for review, see Hubbard, 2013, 2018). Relatedly, Hurlburt et al. (2013) suggested that “inner speaking” is not the same thing as “inner hearing”. The inner voice and the inner ear are usually coordinated, but they can be experimentally dissociated (e.g., blocking subvocalization interferes with some auditory imagery tasks but not with other auditory imagery tasks, Smith et al., 1995). Consistent with the distinction between the inner voice and inner ear, different patterns of cortical activation occur when participants image pronouncing syllables or image listening to syllables (e.g., Tian et al., 2016; see also McGuire et al., 1996), and differences attributed to additional motor components in imagery for speaking than in imagery for listening have been reported (e.g., Tian & Poeppel, 2010, 2013). The difference between when participants imaged speaking and when participants imaged listening in Prete et al. (2016) might relate to differences between the inner voice and the inner ear; thus, ear preferences might depend upon or be influenced by differences in the mechanisms associated with the inner voice and with the inner ear.
In addition to differences in language processing across the two cerebral hemispheres, there is evidence that the two cerebral hemispheres might be specialized for processing different emotional valences, with the right hemisphere specialized for processing negatively-valenced stimuli, and the left hemisphere specialized for processing positively-valenced stimuli (e.g., Davidson et al., 1987). Relatedly, the right hemisphere has been suggested to be specialized for processing stimuli that an individual would typically withdraw from, and the left hemisphere has been suggested to be specialized for processing stimuli that an individual would typically approach (e.g., Gable et al., 2018). As an individual might be more likely to approach a positively-valenced stimulus and more likely to withdraw from a negatively-valenced stimulus, these different types of cerebral specializations are consistent. There is also evidence that the right hemisphere is specialized for processing lower frequencies and the left hemisphere is specialized for processing higher frequencies (Robertson & Ivry, 2000), and this has been found for spatial (e.g., Peyrin et al., 2004) and auditory (e.g., Deutsch, 1985) frequencies. The findings regarding cerebral specialization and asymmetry suggest that valence, approach/withdrawal, and auditory frequency might all influence ear preferences. If these influences were consistent (e.g., a high-pitched and positively-valenced stimulus), a stronger asymmetry in lateralization would be expected, whereas if these influences were inconsistent (e.g., a low-pitched and positively-valenced stimulus), a weaker asymmetry in lateralization would be expected.
Prete, Tommasi, et al. (2020) had participants image that they were being spoken to in one ear or were speaking into an ear of another person, and the messages being spoken were positively-valenced (e.g., “you won!”) or negatively-valenced (e.g., “you lost!”). They found a right ear preference for messages having positive valence but no ear preference for messages having negative valence. Additionally, Prete, Tommasi, et al. had participants image someone else stating that they were happy or angry, and there was a right ear preference when an imaged person stated that they were happy and no ear preference when an imaged person stated that they were angry. The right ear preference for positively-valenced messages is consistent with the suggestion that the left hemisphere is specialized for processing positive valence. Prete, Tommasi, et al. suggested the lack of a left ear preference for negatively-valenced messages might reflect a combination of right hemisphere specialization for negative valence and left hemisphere specialization for language; in essence, valence and language specializations canceled out, thus leaving no clear ear preference. There were no effects of whether the message referred to the participant or to the imaged person. Effects of the sex of the participant or of the imaged person were not considered, although when the imaged person spoke, a right ear preference for female voices and a left ear preference for male voices might have been predicted (cf. Prete et al., 2016), as male voices are typically lower in pitch and female voices are typically higher in pitch.
Two experiments that examined imaged localization of auditory stimuli are reported here. In Experiment 1, participants imaged being spoken to (i.e., listening) or speaking to another person. In one condition, both the participant and an imaged person were stationary, and the imaged person spoke to the participant. In a second condition, the participant imaged approaching another person and then speaking into one ear of that person, and in a third condition, the participant imaged being approached by another person, who then spoke into one of the participant's ears. The message that was spoken or heard was positively-valenced or negatively-valenced. Participants reported whether the imaged person spoke into their right ear or left ear or whether they spoke into the right ear or left ear of the imaged person. In Experiment 2, participants imaged nonverbal sounds that were positively-valenced or negatively-valenced. Participants reported whether the image was located at their right ear or at their left ear. In Experiments 1 and 2, control conditions involving images of the sound of a flute and the sound of a tuba were included to separate effects of pitch from effects of valence, language, or sex of an imaged person that spoke to the participant. Also, whether ear preferences were related to self-reported vividness or clarity of auditory imagery, handedness, and other demographic and individual differences variables were considered. Although the literature did not suggest specific predictions regarding the demographic and individual differences variables on ear preferences, these variables are often of interest in studies of imagery, and so were included.
Experiment 1
In Experiment 1, participants imaged being spoken to or speaking to another person, and valence of the message varied. There were three primary hypotheses. The first hypothesis involved speech, and given claims that language is primarily processed in the left hemisphere, whereas music (in non-musicians) and other non-speech sounds are primarily processed in the right hemisphere, a right ear preference for speech and a left-ear preference for non-speech could be predicted. The second hypothesis involved valence, and given claims that positively-valenced stimuli are processed in the left hemisphere, whereas negatively-valenced stimuli are processed in the right hemisphere, a right ear preference for positively-valenced verbal messages and a left ear preference for negatively-valenced verbal messages could be predicted. The third hypothesis involved pitch, and given claims that higher frequencies are processed in the left hemisphere, whereas lower frequencies are processed in the right hemisphere, a right ear preference for higher musical pitches and female voices and a left ear preference for lower musical pitches and male voices could be predicted. Given previous findings, the data were expected to align most closely with the second hypothesis. Additionally, comparisons of ear preferences for images involving movement of the participant or movement of the imaged person, and comparisons of ear preferences for images involving movement with images that did not involve movement, could shed light on the distinction between the inner voice and the inner ear.
Method
Participants
Participants were undergraduates at the University of South Carolina Upstate, who completed an online survey. The sample included 147 participants (116 female [78.9%], 131 right-handed [89.1%]) ranging from 15 to 45 years of age (M = 19.84, SD = 3.87). Participants self-identified primarily as White/Caucasian (70 [47.6%]) or Black/African-American (52 [35.4%]), with smaller numbers self-identifying as Hispanic/Latino (9 [6.1%]), Asian/Pacific Islander (14 [9.5%]), Native American (2 [1.4%]), and Other/Choose Not to Disclose (0 [0%]). There was a range of experience in performing in a band or choir (no experience = 43 [29.3%], <1 year = 20 [13.6%], 1–2 years = 34 [23.1%], 2–5 years = 32 [21.8%] and >5 years = 18 [12.2%]) and of experience in formal music training (no experience = 31 [21.1%], <1 year = 28 [19.1%], 1–2 years = 36 [24.5%], 2–5 years = 30 [20.4%] and >5 years = 22 [15%]). Lastly, there was a wide range in daily experience with earbuds or headphones (no experience = 6 [4.1%], <1 hour = 14 [9.5%], 1–2 hours = 38 [25.9%], 2–3 hours = 35 [23.8%], 3–4 hours = 21 [14.3%], >4 hours = 33 [22.4%]). Prete et al. (2016) and Prete, Tommasi, et al. (2020) used a between-participants design and included 50 participants within each condition, and the current study used a within-participants design (which has greater statistical power, Bellemare et al., 2014) with nearly three times the number of participants per condition, thus ensuring sufficient statistical power. Participants received partial course credit, and the study was approved by the Institutional Review Board at the University of South Carolina Columbia.
Measures
The online survey consisted of three questionnaires, 12 questions regarding the experimental hypotheses, two control questions, and eight questions regarding demographic and individual differences variables. The three questionnaires included the vividness subscale of the Bucknell Auditory Imagery Scale (BAIS-V; Halpern, 2015), the Clarity of Auditory Imagery Scale (CAIS; Willander & Baraldi, 2010), and the short form of the Edinburgh Handedness Survey (EHS; Veale, 2014). The BAIS-V consists of 14 items that are rated on a 1–7 scale. For each item, participants were instructed to form an image (e.g., the song “Happy Birthday”, the voice of a clerk on the phone), and they rated the vividness of that image (1 = no image present at all; 7 = as vivid as the actual sound). The CAIS consists of 16 items (e.g., a clock ticking, a car ignition), and participants were instructed to image each item and rate how clearly they heard the sounds on a 1–5 scale (1 = not at all; 5 = very clear). The EHS consists of 4 items (e.g., writing, using a toothbrush), and participants were instructed to image each item and rate their hand preference for each item on a 1–5 scale (1 = always left; 5 = always right). The 12 questions regarding the experimental hypotheses of ear preferences in each of several imaged speaking or listening scenarios are listed in Table 1. Two additional control questions regarding ear preferences for the imaged sound of a flute and for the imaged sound of a tuba were also included. The eight demographic and individual differences questions asked about the participant's sex, age, ethnic group, handedness, years of participation in a band or choir, years of formal instruction in music, how many hours per day they used earbuds or headphones, and at which ear they usually held their telephone during telephone calls.
Table 1. The Experimental Questions in Experiment 1.
Note. For each question, the participant responded whether they spontaneously imaged hearing the sound at their left ear or right ear or speaking into the left ear or right ear of an imaged person.
Procedure
The survey was implemented online in Qualtrics. The first page of the survey informed participants that the study was about auditory imagery and contained consent information in which participants were informed of their rights as participants in research. After participants clicked a consent box on the first page, they were presented with the BAIS-V, CAIS, and the EHS; the order of the three questionnaires, as well as the order of the questions within each questionnaire, was randomized across participants. After completing the questionnaires, the participants completed the experimental and control questions, and the order of these questions was randomized across participants. Participants then completed the demographic and individual differences questions.
Design
The survey contained a total of 56 questions. The BAIS-V, CAIS, and EHS involved 14, 16, and 4 questions, respectively. There were 12 questions related to the experimental hypotheses, two control questions involving imagery of musical instruments, and eight demographic and individual differences questions. Of the 12 questions related to the experimental hypotheses, one group of four questions (the stationary conditions) focused on ear preference for a message spoken into one ear from an imaged nearby stationary person, and these questions reflected all 2 × 2 combinations of the sex of imaged person (male, female) and valence of the message (positive, negative). One group of eight questions (the movement conditions) focused on ear preference for a message spoken into one ear when a participant imaged being approached by a person who spoke to that participant or imaged approaching a person and speaking to that person, and these questions reflected all 2 × 2 × 2 combinations of the sex of imaged person (male, female), valence of the message (positive, negative), and direction of movement (the stationary participant imaged being approached by a person who spoke to them, the participant imaged approaching a stationary person to whom he or she spoke). Following Prete et al. (2016) and Prete, Tommasi, et al. (2020), participants were instructed to imagine that the imaged person spoke into one of their ears or that they spoke into one ear of the imaged person. One group of two questions focused on ear preference for an image of the sound of a flute and an image of the sound of a tuba. Lastly, one group of eight questions involved demographic and individual differences measures.
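The item counts in the design can be checked by enumerating the factor combinations. The following is a minimal Python sketch; the factor labels and variable names are illustrative assumptions and are not the wording used in the survey.

```python
from itertools import product

# Factor levels as described in the Design section (labels are illustrative).
SEX = ("male", "female")
VALENCE = ("positive", "negative")
MOVEMENT = ("participant is approached", "participant approaches")

# Stationary conditions: 2 (sex) x 2 (valence) = 4 questions.
stationary = list(product(SEX, VALENCE))

# Movement conditions: 2 (sex) x 2 (valence) x 2 (movement) = 8 questions.
movement = list(product(SEX, VALENCE, MOVEMENT))

n_experimental = len(stationary) + len(movement)

# Item counts for the remaining parts of the survey, as reported in the text.
counts = {
    "BAIS-V": 14,
    "CAIS": 16,
    "EHS": 4,
    "experimental": n_experimental,
    "control (flute, tuba)": 2,
    "demographic / individual differences": 8,
}
total_questions = sum(counts.values())

print(n_experimental, total_questions)  # prints: 12 56
```

The enumeration yields the 12 experimental questions and the total of 56 questions stated above.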
Results
An analysis regarding comparisons across different experimental questions, and a separate analysis regarding the demographic and individual differences measures, are presented. In the analysis of the experimental questions, two types of analyses are reported. The first type of analysis compares ear preferences in different conditions, and these analyses involve ANOVAs. The second type of analysis compares the ear preferences within different conditions to chance, and these analyses involve t-tests.
Comparisons Across and Within Conditions
Separate analyses across conditions comparing the stationary conditions, movement conditions, and control conditions are presented; in these analyses, the dependent variable was the ear at which the image was localized, and data were coded to reflect the likelihood of a right ear preference. Separate comparisons within conditions examined whether the likelihood of a right ear response or a left ear response was greater than chance; in these analyses, the dependent variable was the likelihood of a right ear response or the likelihood of a left ear response.
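To illustrate the general logic of coding responses as the likelihood of a right ear response and comparing that likelihood to chance, a minimal Python sketch of a generic one-sample t-test follows. The response data, the function name `one_sample_t`, and the coding 1 = right ear, 0 = left ear are hypothetical; the sketch is not claimed to reproduce the exact procedure or the t values reported below.

```python
import math
import statistics


def one_sample_t(values, mu):
    """Return (t, df) for a one-sample t-test of mean(values) against mu."""
    n = len(values)
    mean = statistics.mean(values)
    # Standard error of the mean, using the sample standard deviation.
    se = statistics.stdev(values) / math.sqrt(n)
    return (mean - mu) / se, n - 1


# Hypothetical data for one condition: 20 participants,
# 14 right ear responses (coded 1) and 6 left ear responses (coded 0).
responses = [1] * 14 + [0] * 6

CHANCE = 0.5  # equal likelihood of a left ear or right ear response
t_stat, df = one_sample_t(responses, CHANCE)

print(f"proportion right = {statistics.mean(responses):.2f}, "
      f"t({df}) = {t_stat:.2f}")
```

A positive t indicates more right ear responses than expected by chance, and a negative t indicates more left ear responses than expected by chance.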
Stationary Conditions
Ear preferences for a message that was spoken at one ear of the participant by an imaged nearby stationary person were analysed in a 2 (sex: male, female) × 2 (valence: positive, negative) repeated measures ANOVA, and these data are shown in Figure 1. Positively-valenced verbal messages (M = .626, SE = 0.031) were more likely than negatively-valenced verbal messages (M = .463, SE = 0.034) to be imaged at the right ear than imaged at the left ear, F(1,146) = 15.023, p < .0001, MSE = 0.261, partial η2 = .093. The likelihood of a right ear response for positively-valenced verbal messages was significantly greater than chance (.5), t(146) = 36.50, p < .0001, and the likelihood of a right ear response for negatively-valenced verbal messages was significantly less than chance (.5), t(146) = −28.46, p < .0001. These patterns are consistent with a right ear preference for positively-valenced verbal messages and a left ear preference for negatively-valenced verbal messages. Whether the imaged person was male (M = .551, SE = 0.030) or female (M = .537, SE = 0.030) was not significant, F(1,146) = 0.159, p = .691, MSE = 0.171, partial η2 = .001, nor was the interaction of Sex × Valence significant, F(1,146) = 0, p = 1.0, MSE = 0.185, partial η2 = .000.

Figure 1. Ear preferences when a nearby stationary imaged person spoke to the participant in Experiment 1.
Movement Conditions
Ear preferences for a message that involved an imaged person approaching the participant and speaking into the participant's ear or the participant imaging approaching another person and speaking into that person's ear were analysed in a 2 (sex: male, female) × 2 (valence: positive, negative) × 2 (movement: approached, approaches) repeated measures ANOVA, and these data are shown in Figure 2. Positively-valenced verbal messages (M = .57, SE = 0.027) were significantly more likely than negatively-valenced verbal messages (M = .44, SE = 0.02) to be spoken into the right ear than into the left ear, F(1,146) = 12.34, p < .001, MSE = 0.377, partial η2 = .078. When the participant was approached by an imaged person, the likelihood of a right ear response for positively-valenced verbal messages was significantly greater than chance (.5), t(146) = 14.50, p < .0001, and the likelihood of a right ear response for negatively-valenced verbal messages was significantly less than chance (.5), t(146) = −20.06, p < .0001. When the participant imaged approaching a person, the likelihood of a right ear response for positively-valenced verbal messages was significantly greater than chance (.5), t(146) = 31.36, p < .0001, and the likelihood of a right ear response for negatively-valenced verbal messages was significantly less than chance (.5), t(146) = −27.54, p < .0001. These patterns are consistent with a right ear preference for positively-valenced verbal messages and a left ear preference for negatively-valenced verbal messages.

Figure 2. Ear preferences when participants were approached and spoken to by (top) or approached and spoke to (bottom) an imaged person in Experiment 1.
Whether the imaged person was male (M = .49, SE = 0.02) or female (M = .52, SE = 0.02) was not significant, F(1,146) = 1.095, p = .297, MSE = 0.199, partial η2 = .007, and whether the participants were approached by (M = .488, SE = 0.024) or approached (M = .519, SE = 0.027) an imaged person was not significant, F(1,146) = 0.958, p = .329, MSE = 0.287, partial η2 = .007. Sex × Valence, F(1,146) = 2.88, p = .092, MSE = 0.143, partial η2 = .019; Sex × Movement, F(1,146) = 1.032, p = .311, MSE = 0.162, partial η2 = .007; Valence × Movement, F(1,146) = 0.08, p = .779, MSE = 0.173, partial η2 = .001; and Sex × Valence × Movement, F(1,146) = 1.48, p = .226, MSE = 0.147, partial η2 = .010, were nonsignificant. As can be seen in Figure 2, there was generally a right ear preference for positively-valenced verbal messages and a left ear preference for negatively-valenced verbal messages. For positively-valenced verbal messages, there was a trend for a larger right ear preference when participants imaged being approached by a female than by a male, but there was no such trend when participants imaged approaching a person.
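The effect sizes reported for the ANOVAs can be related to the F ratios through the standard identity partial η2 = (F × df_effect) / (F × df_effect + df_error). The following minimal Python sketch applies that identity to several of the reported effects; the function name and dictionary layout are illustrative assumptions, and the F and partial η2 values are taken from the text.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (F, reported partial eta squared) for effects with df = (1, 146).
reported = {
    "Valence (stationary conditions)": (15.023, 0.093),
    "Valence (movement conditions)": (12.34, 0.078),
    "Sex x Valence x Movement": (1.48, 0.010),
}

for label, (f_value, eta_reported) in reported.items():
    eta = partial_eta_squared(f_value, df_effect=1, df_error=146)
    print(f"{label}: computed {eta:.3f}, reported {eta_reported:.3f}")
```

For each of these effects, the computed value rounds to the reported value.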
Control Conditions
The hypothesis that the left hemisphere is specialized for processing higher pitches and the right hemisphere is specialized for processing lower pitches predicted that imaged female speakers would be localized at the right ear of the participant and that imaged male speakers would be localized at the left ear. If a difference in ear preference as a function of the sex of the imaged speaker had been found, then the control questions regarding imaging of a flute and imaging of a tuba might have helped disambiguate whether the difference between ear preferences for a male voice and for a female voice was due to sex per se or due to differences in the average pitches of male and female voices. Although an effect of the sex of the imaged speaker was not found, it is still of interest whether an effect of auditory frequency on ear preference might occur for non-speech musical instruments. Curiously, imagery of the sound of a flute (M = .59, SE = 0.04), t(146) = 2.085, p = .039, and imagery of the sound of a tuba (M = .58, SE = 0.04), t(146) = 1.91, p = .058, both tended toward a right ear preference (significantly so for the flute and marginally so for the tuba); such a pattern does not appear consistent with suggestions that the left hemisphere is specialized for processing higher pitches and the right hemisphere is specialized for processing lower pitches or that nonverbal stimuli exhibit a left ear preference.
Individual Differences and Demographic Measures
Average scores (across scale items) were calculated for the BAIS-V, CAIS, and EHS for each participant and compared to ear preferences in each of the four stationary conditions and eight movement conditions. Also, the average scores of the BAIS-V and CAIS were compared to the midpoint of those scales in order to determine whether the ear preferences might reflect an average self-reported high (average significantly above the midpoint) or low (average significantly below the midpoint) level of vividness or clarity in the participant sample, and the average score on the EHS was compared to the midpoint of that scale in order to determine whether handedness in the participant sample was strongly lateralized. Additionally, participant age, number of hours per day that participants spent using headphones or earbuds, and to which ear their telephone was typically held when participants were talking on the telephone, were compared to ear preferences in each of the stationary conditions and movement conditions. Highly unequal numbers of participants in the different conditions of sex and ethnic group did not allow consideration of those potential individual differences.
Vividness
An average score on the BAIS-V was calculated (M = 4.65, SD = 1.01, SE = 0.08), and when the individual ratings of imagery vividness from the BAIS-V were compared with ear preferences for each of the experimental questions, there were no significant correlations, all rss < .16, ps > .06. When tested against a mean value of 4 (on the 7-point BAIS-V scale), average ratings of vividness were significantly higher, t(146) = 7.79, p < .001. Thus, participants tended to rate their imagery as relatively vivid.
Clarity
An average score on the CAIS was calculated (M = 3.99, SE = 0.06), and when the individual ratings of imagery clarity from the CAIS were compared with ear preferences for each of the experimental questions, there were no significant correlations, all rss < .16, ps > .06. When tested against a mean value of 3 (on the 5-point CAIS scale), average ratings were significantly higher, t(146) = 16.26, p < .001. Thus, participants tended to rate their imagery as relatively clear. As found in previous literature (e.g., Hubbard & Ruppel, 2021), scores on the BAIS-V and CAIS were positively correlated, rs = .587, p < .0001.
Handedness
An average score on the EHS was calculated (M = 4.44, SE = 0.08), and when the individual ratings of hand preferences from the EHS were compared with ear preferences for each of the experimental questions, there were no significant correlations, all rss < .15, ps > .08. When tested against a mean value of 3 (on the 5-point EHS scale), average ratings were significantly higher, t(146) = 17.52, p < .001. Thus, participants generally preferred to use their right hand.
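The comparisons of scale averages against scale midpoints can be approximated from the reported summary statistics, as t = (M − midpoint) / SE. The following minimal Python sketch does this for the CAIS and EHS; the function name is an illustrative assumption. Because the reported standard errors are rounded to two decimals, the recomputed values only approximate the reported t statistics.

```python
def t_from_summary(mean, se, test_value):
    """One-sample t statistic from a mean, its standard error, and a test value."""
    return (mean - test_value) / se


# Summary statistics as reported in the text (n = 147, df = 146).
scales = {
    "CAIS": {"mean": 3.99, "se": 0.06, "midpoint": 3, "reported_t": 16.26},
    "EHS": {"mean": 4.44, "se": 0.08, "midpoint": 3, "reported_t": 17.52},
}

for name, s in scales.items():
    t_approx = t_from_summary(s["mean"], s["se"], s["midpoint"])
    print(f"{name}: approximate t(146) = {t_approx:.2f}, "
          f"reported t(146) = {s['reported_t']:.2f}")
```

The recomputed values (about 16.5 and 18.0) fall close to the reported values, with the remaining gap attributable to rounding of the standard errors.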
Age
The age of the participant did not correlate with ear preferences for any of the experimental questions, all rss < .15, ps > .07, nor did the age of the participant correlate with the BAIS-V, rs = .129, p = .120, or CAIS, rs = .052, p = .529.
Earbud or Headphone Use
The number of hours per day that participants reported using earbuds or headphones did not correlate with ear preferences for any of the experimental questions, all rss < .16, ps > .16, nor did the number of hours per day that participants reported using earbuds or headphones correlate with the BAIS-V, rs = .045, p = .59, or CAIS, rs = −.055, p = .51.
Telephone Ear
Participants reported that when they spoke on the telephone, they usually held the phone to their right ear (M = .80, SE = 0.03), and this was significantly higher than chance (.5), t(146) = 8.87, p < .0001. However, whether participants usually held the telephone to their left ear or right ear did not correlate with ear preferences for any of the experimental questions except for the right ear preference when participants approached a female with a positively-valenced message, rs = .20, p < .02. Given the large number of comparisons, this correlation might be due to chance.
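The concern about the number of comparisons can be made concrete with a small sketch. It assumes the 12 comparisons (four stationary and eight movement conditions) are independent and each tested at α = .05; neither a correction nor independence is stated in the text, so both are assumptions made here for illustration, and the function names are hypothetical.

```python
def familywise_error_rate(alpha, m):
    """Probability of at least one false positive across m independent tests."""
    return 1 - (1 - alpha) ** m

def bonferroni_alpha(alpha, m):
    """Per-test criterion that holds the familywise rate at or below alpha."""
    return alpha / m

m = 12  # 4 stationary + 8 movement conditions
fwer = familywise_error_rate(0.05, m)
per_test = bonferroni_alpha(0.05, m)
print(round(fwer, 3), round(per_test, 4))
```

With 12 independent tests, the chance of at least one spurious significant correlation is roughly .46, and a Bonferroni-adjusted criterion would be about .004. Under these assumptions, a single correlation with p < .02 would not survive correction, which is in line with the caution expressed above.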
Musical Experience
Neither years of formal musical instruction, all rss < .16, ps > .08, nor years performing in a band or choir, all rss < .12, ps > .16, correlated with ear preference for any of the experimental questions. Neither years of formal musical instruction, rs = .076, p = .358, nor years performing in a band or choir, rs = .062, p = .356, correlated with the BAIS-V. Years performing in a band or choir did not correlate with the CAIS, rs = −.120, p = .356, but years of formal musical instruction were negatively correlated with the CAIS, rs = −.163, p = .049. Why increased musical instruction should lead to a decrease in clarity of auditory imagery is not clear, and given the large number of comparisons, this correlation might be due to chance. Neither years of music training nor years in a band or choir correlated with whether the flute or tuba was imaged at the right ear or left ear, all rss < .05, ps > .64.
Discussion
Positively-valenced verbal messages were more likely than negatively-valenced verbal messages to be imaged at the right ear rather than at the left ear, and this occurred when a stationary imaged person spoke to a stationary participant, when the participant imaged approaching a person and then spoke to that person, and when an imaged person approached and then spoke to the participant. Importantly, the likelihood of a right ear response was significantly greater than chance for positively-valenced verbal messages, and the likelihood of a left ear response was significantly greater than chance for negatively-valenced verbal messages. This pattern is consistent with that reported in Prete, Tommasi, et al. (2020), although the left ear preference for negatively-valenced verbal messages was significant in Experiment 1, whereas it did not reach significance in Prete, Tommasi, et al. The sex of the imaged person did not influence ear preferences. Auditory imagery of the sound of a high-pitched instrument (flute) and the sound of a low-pitched instrument (tuba) exhibited a right ear preference. The lack of differences between the stationary conditions and the movement conditions does not strongly support a distinction between the inner voice and the inner ear, and this will be addressed further in the General Discussion. The findings regarding demographic and individual differences measures will be addressed in the General Discussion. Overall, the data aligned most closely with the second of the three primary hypotheses (a right ear preference for positively-valenced stimuli and a left ear preference for negatively-valenced stimuli).
Experiment 2
It is possible that the right ear preference for positively-valenced verbal stimuli and the left ear preference for negatively-valenced verbal stimuli observed in Experiment 1 reflected hemispheric specialization for valence, with the left hemisphere specialized for positively-valenced stimuli and the right hemisphere specialized for negatively-valenced stimuli. However, valence was manipulated in Experiment 1 by changing the content of the verbal message, and so it is possible that verbal processing might have contributed to the ear preferences observed in Experiment 1. Also, the right ear preference for the two control stimuli, the imaged sound of a flute and the imaged sound of a tuba, might have occurred if both musical sounds were considered positive (or at least, not negative). Thus, it would be useful to compare ear preferences in auditory imagery for positively-valenced nonverbal sounds and for negatively-valenced nonverbal sounds. If a right ear preference for positively-valenced nonverbal sounds and a left ear preference for negatively-valenced nonverbal sounds are observed, then that would be consistent with a left hemisphere specialization for processing positively-valenced stimuli and a right hemisphere specialization for processing negatively-valenced stimuli. If a right ear preference is observed for positively-valenced nonverbal sounds and for negatively-valenced nonverbal sounds, then that would be consistent with a general right ear preference in auditory imagery. If no ear preferences are observed, that would be consistent with a lack of hemispheric specialization for different valences.
Method
Participants
Participants were from the same participant pool used in Experiment 1 and completed a similar online survey, and none had participated in Experiment 1. The sample included 144 participants (110 female [76.4%], 131 right-handed [91.0%]) ranging from 17 to 46 years of age (M = 20.31, SD = 4.21). Participants self-identified primarily as White/Caucasian (75 [52.1%]) or Black/African-American (52 [36.1%]), with smaller numbers self-identifying as Hispanic/Latino (7 [4.9%]), Asian/Pacific Islander (7 [4.9%]), Native American (7 [4.9%]), and Other/Choose Not to Disclose (3 [2.1%]). There was a range of experience in performing in a band or choir (no experience = 42 [29.2%], <1 year = 16 [11.1%], 1–2 years = 30 [20.8%], 2–5 years = 33 [22.9%] and >5 years = 23 [16.0%]) and of experience in formal music training (no experience = 32 [22.2%], <1 year = 18 [12.5%], 1–2 years = 36 [25.0%], 2–5 years = 26 [18.1%] and >5 years = 32 [22.2%]). Lastly, there was a wide range in daily experience with earbuds or headphones (no experience = 3 [2.1%], <1 hour = 19 [13.2%], 1–2 hours = 35 [24.3%], 2–3 hours = 29 [20.1%], 3–4 hours = 17 [11.8%], >4 hours = 41 [28.5%]). The relative percentages within the demographic and individual differences measures are similar to those in Experiment 1.
Measures
As in Experiment 1, the online survey consisted of three questionnaires, 12 questions regarding the experimental hypotheses, two control questions, and eight questions regarding demographic and individual differences variables. The three questionnaires, two control questions, and eight demographic and individual differences questions were the same as in Experiment 1. The 12 questions regarding the experimental hypotheses of ear preferences for positively-valenced sounds or for negatively-valenced sounds are listed in Table 2. Stimulus items for the questions regarding the experimental hypotheses were generated by having 50 undergraduate students (who did not participate in Experiment 1 or in the imagery trials in Experiment 2) each generate descriptions of 10 positively-valenced nonverbal sounds (which made them feel happy or good) and descriptions of 10 negatively-valenced nonverbal sounds (which made them feel upset or sad). The six most frequently named positively-valenced nonverbal sounds, and the six most frequently named negatively-valenced nonverbal sounds, were chosen as stimuli for Experiment 2.
Table 2. The Experimental Questions in Experiment 2.
Note. For each question, the participant responded whether they spontaneously imaged the sound at their left ear or at their right ear.
Procedure
The procedure was the same as in Experiment 1.
Design
The design was the same as in Experiment 1, with the following exceptions. Of the 12 questions related to the experimental hypotheses, six questions focused on ear preferences for positively-valenced nonverbal sounds, and six questions focused on ear preferences for negatively-valenced nonverbal sounds.
Results
As in Experiment 1, an analysis regarding comparisons across different experimental questions, and a separate analysis regarding the demographic and individual differences measures, are presented. In the analysis of the experimental questions, two types of analyses are reported. The first type of analysis compares ear preferences in different conditions, and these analyses involve ANOVAs. The second type of analysis compares the ear preferences within different conditions to chance, and these analyses involve t-tests.
Comparisons Across and Within Conditions
Rather than analysing each of the 12 experimental questions separately (as was done in Experiment 1), an average of the six positively-valenced questions and an average of the six negatively-valenced questions were calculated for each participant, and the subsequent analyses were based on these averages. Separate analyses across conditions comparing the valence conditions and control conditions are presented; in these analyses, the dependent variable was the ear at which the image was localized, and data were coded to reflect the likelihood of a right ear preference. Separate comparisons within conditions examined whether the likelihood of a right ear response or a left ear response was greater than chance; in these analyses, the dependent variable was the likelihood of a right ear response or the likelihood of a left ear response.
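The coding and averaging step described above can be sketched as follows. The response labels, the example participant, and the function name are hypothetical; the only details taken from the text are that responses were coded to reflect the likelihood of a right ear preference and that six items per valence were averaged for each participant.

```python
def right_ear_proportion(responses):
    """Proportion of 'right' responses, coding right = 1 and left = 0."""
    coded = [1 if r == "right" else 0 for r in responses]
    return sum(coded) / len(coded)

# One hypothetical participant: six positively-valenced and six
# negatively-valenced nonverbal sounds
participant = {
    "positive": ["right", "right", "left", "right", "left", "right"],
    "negative": ["left", "right", "right", "left", "right", "left"],
}

scores = {valence: right_ear_proportion(items)
          for valence, items in participant.items()}
print(scores)
```

With this coding, a score of .5 corresponds to chance, scores above .5 indicate a right ear preference, and scores below .5 indicate a left ear preference.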
Valence Conditions
Ear preferences for positively-valenced nonverbal sounds and negatively-valenced nonverbal sounds were analysed in a repeated measures ANOVA, and these data are shown in Figure 3. The ear preference for positively-valenced nonverbal sounds (M = .563, SE = 0.024) was not significantly different from the ear preference for negatively-valenced nonverbal sounds (M = .556, SE = 0.022), F(1,143) = 0.05, p = .82, MSE = 0.023, partial η2 = .00. The likelihood of a right ear response for positively-valenced nonverbal sounds was significantly greater than chance (.5), t(143) = 2.61, p < .010, and the likelihood of a right ear response for negatively-valenced nonverbal sounds was significantly greater than chance (.5), t(143) = 2.50, p < .020. These patterns are consistent with a right ear preference for positively-valenced nonverbal sounds and for negatively-valenced nonverbal sounds.
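Because the repeated measures ANOVA here has only two levels, its F(1, n − 1) equals the square of the paired t statistic, so the reported F = 0.05 corresponds to |t| of roughly 0.22. The sketch below illustrates that equivalence with hypothetical per-participant proportions; it is not the authors' analysis code, and the function names are assumptions.

```python
import math
from statistics import mean, stdev

def paired_t(a, b):
    """Paired t statistic: mean difference divided by its standard error."""
    diffs = [x - y for x, y in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))

def rm_anova_two_levels(a, b):
    """F(1, n - 1) for a one-way repeated measures ANOVA with two conditions."""
    n = len(a)
    grand = mean(a + b)
    ss_cond = n * ((mean(a) - grand) ** 2 + (mean(b) - grand) ** 2)
    subj_means = [(x + y) / 2 for x, y in zip(a, b)]
    ss_subj = 2 * sum((s - grand) ** 2 for s in subj_means)
    ss_total = sum((v - grand) ** 2 for v in a + b)
    ss_error = ss_total - ss_cond - ss_subj
    return ss_cond / (ss_error / (n - 1))

# Hypothetical right-ear proportions for five participants
positive = [0.67, 0.50, 0.83, 0.50, 0.67]
negative = [0.50, 0.50, 0.67, 0.67, 0.33]

t_stat = paired_t(positive, negative)
f_stat = rm_anova_two_levels(positive, negative)
print(abs(f_stat - t_stat ** 2) < 1e-6)
```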

Figure 3. Ear preferences when a participant imaged a positively-valenced stimulus or negatively-valenced stimulus in Experiment 2.
Control Conditions
Imagery of the sound of a flute (M = .556, SE = 0.041), t(143) = 1.34, p < .10, exhibited a marginally significant right ear preference, and imagery of the sound of a tuba (M = .60, SE = 0.04), t(143) = 2.55, p = .006, exhibited a significant right ear preference. These results are consistent with Experiment 1 and are not consistent with previous suggestions that the left hemisphere is specialized for processing higher pitches and the right hemisphere is specialized for processing lower pitches or that nonverbal sounds exhibit a left ear preference.
Individual Differences and Demographic Measures
As in Experiment 1, average scores (across scale items) were calculated for the BAIS-V, CAIS, and EHS for each participant and compared to ear preferences for each valence condition. As in Experiment 1, the average scores of the BAIS-V and CAIS were compared to the midpoint of those scales in order to determine whether the ear preferences might reflect an average self-reported high (average significantly above the midpoint) or low (average significantly below the midpoint) level of vividness or clarity in the participant sample, and the average score on the EHS was compared to the midpoint of that scale in order to determine whether handedness in the participant sample was strongly lateralized. Additionally, participant age, number of hours per day that participants spent using headphones or earbuds, and to which ear their telephone was typically held when participants were talking on the telephone, were compared to ear preferences in each valence condition. As in Experiment 1, highly unequal numbers of participants in the different conditions of sex and ethnic group did not allow consideration of those potential individual differences.
Vividness
An average score on the BAIS-V was calculated (M = 4.89, SE = 0.07), and when the ratings of imagery vividness from the BAIS-V were compared with ear preferences for each valence condition, there were no significant correlations, all rss < .07, ps > .47. When tested against a mean value of 4 (on the 7-point BAIS-V scale), average ratings of vividness were significantly higher, t(143) = 12.63, p < .001. Thus, participants tended to rate their imagery as relatively vivid.
Clarity
An average score on the CAIS was calculated (M = 3.98, SE = 0.06), and when the individual ratings of imagery clarity from the CAIS were compared with ear preferences for each valence condition, there were no significant correlations, all rss < .06, ps > .51. When tested against a mean value of 3 (on the 5-point CAIS scale), average ratings were significantly higher, t(143) = 17.18, p < .001. Thus, participants tended to rate their imagery as relatively clear. As found in Experiment 1 and in previous literature, scores on the BAIS-V and CAIS were positively correlated, rs = .43, p < .0001.
Handedness
An average score on the EHS was calculated (M = 4.49, SE = 0.08), and when the individual ratings of hand preferences from the EHS were compared with ear preferences for each of the experimental questions, there was a significant correlation for positively-valenced sounds, rs = .30, p < .001, and no correlation for negatively-valenced sounds, rs = .09, p > .30. The significant correlation between handedness and positively-valenced stimuli might reflect a common role of the left hemisphere. Alternatively, given the other results in Experiments 1 and 2, as well as the large number of correlations being observed, this correlation might be due to chance. When tested against a mean value of 3 (on the 5-point EHS scale), average ratings were significantly higher, t(143) = 18.22, p < .001. Thus, participants generally preferred to use their right hand.
Age
The age of the participant did not correlate with ear preferences for either positively-valenced sounds or negatively-valenced sounds, all rss < .08, ps > .63. Age was not correlated with scores on the BAIS-V, rs = −.07, p = .39, but did negatively correlate with scores on the CAIS, rs = −.20, p < .02. The significant negative correlation suggests decreases in auditory imagery clarity occur in later adulthood, but such a change has not been reported in the literature, and so this might be an area for future research. Alternatively, given the large number of comparisons, this correlation might be due to chance.
Earbud or Headphone Use
The number of hours per day that participants reported using earbuds or headphones did not correlate with ear preferences for either positively-valenced sounds or negatively-valenced sounds, all rss < .10, ps > .24, nor did the number of hours per day that participants reported using earbuds or headphones correlate with the BAIS-V, rs = .02, p = .85, or CAIS, rs = .06, p = .50.
Telephone Ear
Participants reported that when they spoke on the telephone, they usually held the phone to their right ear (M = 0.77, SE = 0.03), and this was significantly higher than chance (.5), t(143) = 7.71, p < .0001. When the correlations of telephone ear with ear preferences were examined, there was a significant correlation for positively-valenced sounds, rs = .18, p < .04, and no correlation for negatively-valenced sounds, rs = .16, p > .20. Although the r values for positively-valenced sounds and negatively-valenced sounds are quite similar, the p value for negatively-valenced sounds seems sufficiently far from the criterion for significance that a Type II error seems less likely. Interestingly, the pattern of correlations of ear preference and telephone ear parallels the pattern of correlations of ear preference and handedness noted earlier, and this might reflect a more general preference for a specific cerebral hemisphere.
Musical Experience
Neither years of formal musical instruction, all rss < .06, ps > .49, nor years performing in a band or choir, all rss < .06, ps > .55, correlated with ear preference for either positively-valenced sounds or negatively-valenced sounds. Neither years of formal musical instruction, rs = .10, p = .25, nor years performing in a band or choir, rs = .07, p = .42, correlated with the CAIS. Perhaps surprisingly, both years of formal musical instruction, rs = .21, p < .02, and years performing in a band or choir, rs = .26, p < .003, correlated with the BAIS-V. The lack of a correlation of measures of musical experience with scores on the BAIS-V in Experiment 1 is consistent with the possibility that the correlations observed in Experiment 2 were due to chance; alternatively, the correlation observed in Experiment 2 might reflect the greater auditory imagery in musicians than in nonmusicians (e.g., see Talamini et al., in press) and the lack of correlation in Experiment 1 might reflect a weak effect or a Type II error.
Discussion
The ear preference for positively-valenced nonverbal sounds was not significantly different from the ear preference for negatively-valenced nonverbal sounds, and the likelihood of a right ear response for positively-valenced nonverbal sounds and for negatively-valenced nonverbal sounds was significantly greater than chance. Additionally, auditory imagery of the sound of a high-pitched instrument (flute) and the sound of a low-pitched instrument (tuba) exhibited a right ear preference. These patterns are consistent with a general right ear preference for both positively-valenced nonverbal sounds and for negatively-valenced nonverbal sounds and are consistent with the right ear preference for undefined sounds in Prete et al. (2016). The data are not consistent with the hypothesis of a right ear preference for positively-valenced nonverbal stimuli and a left ear preference for negatively-valenced nonverbal stimuli. Rather, the presence of a left-ear preference for negatively-valenced verbal stimuli in Experiment 1, coupled with the presence of a right-ear preference for negatively-valenced nonverbal stimuli in Experiment 2, suggests that the ear preferences observed in Experiment 1 were influenced by language. In other words, a left ear preference for negatively-valenced imaged stimuli occurred when such stimuli were verbal, but not when such stimuli were nonverbal; why language processing has this effect on ear preferences is not entirely clear. The findings regarding demographic and individual differences variables will be addressed in the General Discussion.
General Discussion
In Experiment 1, participants exhibited a right ear preference for positively-valenced verbal messages and a left-ear preference for negatively-valenced verbal messages. This pattern occurred when an imaged nearby stationary person spoke to the participant, when an imaged person approached and then spoke to the participant, and when the participant imaged approaching and then spoke to an imaged person. In Experiment 2, participants exhibited a right ear preference for positively-valenced nonverbal sounds and for negatively-valenced nonverbal sounds. These patterns suggest a role for both valence and language in ear preferences for imaged auditory stimuli. Importantly, existence of an ear preference was established not just by comparing differences between the likelihood of a right ear preference and a left ear preference as in previous research, but by also demonstrating that the probabilities of a right ear preference or a left ear preference were significantly different from chance. The patterns of ear preferences across Experiments 1 and 2 did not consistently support any of the three primary hypotheses regarding hemispheric specialization, as verbal stimuli did not result in a right ear preference and nonverbal stimuli did not result in a left ear preference, positively-valenced stimuli did not result in a right ear preference and negatively-valenced stimuli did not result in a left ear preference, and higher auditory frequencies did not result in a right ear preference and lower auditory frequencies did not result in a left ear preference. Instead, imaged localization appears to involve a combination of (at least) valence and verbality.
Close examination of Figures 1 and 2 suggests the left ear preference for negatively-valenced verbal messages is slightly smaller in magnitude than the right ear preference for positively-valenced verbal messages, and this is consistent with the suggestion of Prete, Tommasi, et al. (2020) that ear preference reflects a combination of hemispheric specialization for language and hemispheric specialization for valence. The left ear preference for negatively-valenced verbal messages might have reached significance in Experiment 1 and not in Prete, Tommasi, et al., because the within-participant design in Experiment 1 provided greater statistical power than the between-participant design in Prete, Tommasi, et al. The right ear preference in the approaches condition in Experiment 1 does not appear consistent with the lack of an ear preference in Prete et al. (2016) when an imaged person approached the participant. However, Prete et al. (2016) did not manipulate valence, and although valence was manipulated in Prete, Tommasi, et al., the latter paper only included conditions in which the imaged person was stationary near the participant's ear. A potential hypothesis for the lack of an ear preference when an imaged person approached and spoke to the participant in Prete et al. (2016) is that such a condition involved listening rather than speaking. However, similar ear preferences in the approached condition and in the approaches condition in Experiment 1 are not consistent with such a hypothesis.
The similarity in ear preferences across different movement conditions in Experiment 1 is relevant for the distinction between the inner voice and the inner ear. An imaged person speaking to the participant would seem to involve the inner ear, whereas the participant imaging speaking to another person would seem to involve the inner voice. Given this, it might be predicted that imaged movement of the participant might disrupt imaged speech production (i.e., inner voice) more than movement of the other imaged person might disrupt imaged speech perception (i.e., inner ear). However, findings of motor activation during auditory imagery of non-vocal stimuli (e.g., Halpern et al., 2004; Lima et al., 2016) might predict that motor activation would occur in all movement conditions, and so might have equal effects on the inner voice and on the inner ear. Consistent with this, motor activation has been suggested to be involved in speech perception as well as in speech production (e.g., Galantucci et al., 2006; Liberman & Mattingly, 1985), and as discussed in Hubbard (2010, 2018), an embodied approach to cognitive processing blurs the distinction between the inner voice and the inner ear, as mechanisms of production (i.e., the inner voice) are activated when an individual experiences auditory imagery of listening to someone else's voice (i.e., the inner ear). Regardless, the similarities of ear preferences across the different movement conditions in Experiment 1 do not clearly support a distinction between the inner voice and the inner ear, and it is not clear whether potential subvocal activity associated with auditory imagery (e.g., Smith et al., 1995) contributed to ear preferences.
The experiments revealed an important methodological issue. In Experiment 1, the difference between the likelihood of a right ear preference and the likelihood of a left ear preference was significant, and tests of the likelihood of a right ear preference for positively-valenced messages and a left ear preference for negatively-valenced messages were significantly above chance. The convergence of comparisons across conditions and the findings that different conditions differed from chance gave confidence in the conclusions. However, Experiment 2 highlighted the insufficiency of just comparing the likelihood of a right ear preference and the likelihood of a left ear preference to each other and the necessity of comparing the likelihood of a right ear preference and the likelihood of a left ear preference to chance. Had Experiment 2 followed the procedures of previously published studies, the nonsignificance of the comparison of responses for positively-valenced stimuli and negatively-valenced stimuli would have resulted in a conclusion that no ear preference existed. It was only when likelihood of a right ear preference was compared to chance that evidence for an ear preference was discovered, as the likelihood of right ear preferences for positively-valenced nonverbal stimuli and for negatively-valenced nonverbal stimuli were both higher than chance. When arguing that a specific preference (or bias) exists, it is not sufficient to just examine whether different conditions in which such a preference (or bias) might exist differ from each other; it is also necessary to examine whether each condition in which such a preference (or bias) might exist differs from chance.
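The methodological point can be illustrated with a small constructed example. The data below are hypothetical counts of right ear responses (out of six items per valence, so that chance is 3) for eight imaginary participants; they are chosen so that the two conditions do not differ from each other at all, yet each differs from chance. The function name and the data are assumptions made for illustration only.

```python
import math
from statistics import mean, stdev

def one_sample_t(values, mu):
    """One-sample t statistic of values against the reference value mu."""
    return (mean(values) - mu) / (stdev(values) / math.sqrt(len(values)))

# Hypothetical right-ear counts out of 6 items (chance = 3)
pos = [4, 3, 5, 4, 3, 4, 5, 4]   # positively-valenced sounds
neg = [4, 4, 4, 3, 4, 5, 4, 4]   # negatively-valenced sounds
diffs = [p - q for p, q in zip(pos, neg)]

t_between = one_sample_t(diffs, 0)  # paired comparison of the two conditions
t_pos = one_sample_t(pos, 3)        # positive condition against chance
t_neg = one_sample_t(neg, 3)        # negative condition against chance

T_CRIT = 2.365  # two-tailed .05 critical value of t for df = 7
print(abs(t_between) > T_CRIT, t_pos > T_CRIT, t_neg > T_CRIT)
```

In this constructed case the comparison between conditions yields t = 0, whereas both comparisons against chance exceed the critical value, mirroring the pattern described for Experiment 2.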
In Experiments 1 and 2, potential relationships of demographic and individual differences measures to ear preferences were examined. The general vividness and clarity of auditory imagery were not systematically related to ear preference. Scores on a general handedness inventory (EHS) were generally unrelated to ear preference. The percentage of left-handed participants in Experiments 1 and 2 (11% and 9%, respectively) matched the percentage of left-handed individuals in the general population (e.g., Papadatou-Pastou et al., 2020). Although this created too large a difference in group sizes to allow use of self-reported handedness as a variable, it does suggest the participant sample was representative of the general population (at least on this dimension); even so, scores from the EHS suggest handedness is not strongly related to ear preference. Relatedly, a hypothesis that ear preferences in auditory verbal imagery might be related to the ear to which participants usually held their telephone while engaged in telephone calls was not supported, nor was the amount of experience with headphones and earbuds related to ear preferences. Years of formal instruction in music and years performing in a band or choir were not consistently related to ear preferences, and this is consistent with the possibility that ear preferences were not due to effects of musical experience. There were also a few correlations between some of the individual differences measures that suggest potential areas for further research (e.g., the negative correlation of years of musical instruction and clarity of auditory imagery in Experiment 1, coupled with the negative correlation of age and clarity in Experiment 2, suggest auditory imagery ability might decrease with increases in age).
Differences in ear preferences related to auditory pitch were not found, as the imaged sound of a flute (high pitch) and the imaged sound of a tuba (low pitch) both exhibited a right ear preference. Similarly, the lack of an effect of the sex of the imaged person in Experiment 1, Prete et al. (2016), and Prete, Tommasi, et al. (2020) suggests pitch might not contribute to ear preferences in auditory verbal imagery, although it is possible that pitch might be related to ear advantages or preferences in other types of auditory perception or imagery. The lack of an effect of the sex of the imaged person, although not consistent with hemispheric asymmetries in pitch processing, is consistent with reports of a bilateral cerebral representation of gender (e.g., Prete, Fabri, et al., 2020). Findings that imagery of the sound of a high-pitched instrument and imagery of the sound of a low-pitched instrument each exhibited a right ear preference are somewhat surprising, as previous studies reported a left ear advantage for musical stimuli in nonmusicians (e.g., Morais et al., 1982) and that the right hemisphere is specialized for low frequencies and the left hemisphere is specialized for high frequencies (e.g., Deutsch, 1985). If musical stimuli are positively-valenced, then the right ear preferences for imagery of the sound of a flute and imagery of the sound of a tuba are consistent with the general right ear preference for positively-valenced stimuli. Also, years of formal musical instruction and years performing in a band or choir were unrelated to ear preferences for auditory imagery of a flute or tuba, and this is consistent with the suggestion that ear preferences are not influenced by musical experience.
It is generally accepted that non-auditory information can contribute to auditory imagery (for review, see Hubbard, 2013), and some potential contributions of motor information and motor imagery to ear preferences were discussed earlier. It is also possible that visual information and visual imagery could contribute to ear preferences. Although it might be possible to image a voice speaking into one's left ear or right ear without having a concurrent visual image of the speaker, it seems likely that adding an action component (i.e., movement) to the image (e.g., approaching or being approached by an imaged person in Experiment 1) adds a requirement for additional spatial information to be present in the image. As vision contributes significantly to processing of spatial information (e.g., Eimer, 2004), inclusion of an action component in the auditory images might have resulted in additional activation of visual imagery mechanisms. Indeed, Intons-Peterson (1980) found that deliberate formation of an auditory image was often accompanied by spontaneous formation of a visual image of that sound source, and Godøy (2001) suggested that auditory images of sounds often evoke visual or motor images regarding the object that emitted those sounds or how those sounds were produced. The possible co-occurrence of visual imagery was not queried in the experimental participants, but given that imagery in the movement conditions in Experiment 1 involved spatial changes, it would be surprising if visual imagery was not spontaneously generated during the experimental task, and the presence of such visual imagery might potentially influence ear preferences in auditory verbal imagery. This remains an area for future research.
As research in imagery is often potentially more susceptible to demand characteristics than are other types of psychological research (Hubbard, 2018), the possibility of demand characteristics in the current study should be addressed. It might be objected that the goals or hypotheses of the study were inadvertently communicated to participants prior to or during data collection or that participants were able to deduce the goals or hypotheses of the study; if either of these possibilities occurred, that could potentially influence participant responding. Although participants were informed during the consent process that the study involved auditory imagery, they were not informed until after data collection of the specific hypotheses. However, even if participants did deduce that the research was examining ear preferences in imagery, it is highly doubtful the participants would have known the specific hypotheses (e.g., a right ear preference for positively-valenced messages and a left ear preference for negatively-valenced messages, a right ear preference for speech). Furthermore, a demand characteristics account might suggest that participants would report differences in cases in which no such differences were actually observed (e.g., different ear preferences for images of positively-valenced nonverbal stimuli and for images of negatively-valenced nonverbal stimuli, different ear preferences for images of a high-pitched musical instrument and for images of a low-pitched musical instrument). Thus, an account based on demand characteristics seems less likely and can be tentatively ruled out.
As with any single study, the current study has limitations. One potential limitation is that Experiment 1, as did previous studies of Prete and colleagues, had participants imagine that they were whispering or being whispered to. Ear preference might thus reflect the sensitivity of a given ear rather than differences attributed to imagery per se. One way to examine this might be to have participants image speaking or being spoken to in a normal or loud voice. Although a depicted loudness level for the sound in the image was not suggested in Experiment 2, it seems likely that participants imaged a loudness level typical of everyday experience, and so the robust right ear preference in Experiment 2 is probably less likely to reflect ear sensitivity. A second potential limitation is the choice of a within-participants design, as it might be objected that such a design could lead to subsequent biases. However, it is not clear whether an experience of prior trials would systematically lead participants to keep the same response or change their responses (and so individual biases might be different and so average out across participants). The stimuli to-be-imaged were presented in a different random order to each participant, which allows any effect of trial order to average out across stimuli and participants. Also, the pattern of ear preferences in Experiment 1, which used a within-participants design, was similar to the pattern of ear preferences in Prete et al. (2016), who used a between-participants design in which each participant judged only a single stimulus. A final potential limitation is the unequal number of males and females, which did not allow for a consideration of effects of the sex of the participant.
Auditory imagery exhibited a right ear preference for positively-valenced auditory verbal stimuli and a left ear preference for negatively-valenced auditory verbal stimuli, and a right ear preference for both positively-valenced nonverbal stimuli and negatively-valenced nonverbal stimuli. This pattern does not support accounts of ear preferences that are based solely on hemispheric differences in language processing or solely on hemispheric differences in valence processing. Rather, ear preference appears to involve both linguistic processing and affective processing. The lack of an effect regarding whether or not the participant imaged moving or imaged that another person moved does not support a distinction between the inner voice and the inner ear; however, the lack of such an effect does not provide compelling evidence against the distinction between the inner voice and the inner ear, either. Additionally, a right ear preference was found for imagery of a flute (a high-pitched musical instrument) and of a tuba (a low-pitched musical instrument). Measures of auditory imagery vividness and clarity, handedness, and preferred telephone ear did not generally correlate with ear preferences, suggesting the ear preferences observed in Experiments 1 and 2 do not depend upon specific types of experience. The findings have wide applicability, as they demonstrate that subjective qualities of imagery can shed light on neural mechanisms in imagery, how some properties of imagery can resemble properties of perception, and ways in which linguistic information and affective information can influence subjective lateralization in imagery.
Footnotes
Acknowledgements
Portions of these data were reported at the Virtual 62nd Annual Meeting of the Psychonomic Society (November, 2021). The author thanks two anonymous reviewers for helpful comments on a previous version of the manuscript.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
