Abstract
Previous studies of auditory imagery have often confounded vividness and clarity, and the differences between these constructs are not clear. Additionally, it has been suggested that clarity is a more useful construct than is vividness in understanding auditory imagery. The Clarity of Auditory Imagery Scale and the Bucknell Auditory Imagery Scale were administered to participants, and ratings of the clarity, vividness, and control over auditory imagery were collected. All three measures were highly positively correlated. The magnitudes of these correlations were not influenced by participants’ sex, age, ethnic group, handedness, years of participation in a band or choir, or years of formal instruction in music, and possible reasons for the lack of individual differences are discussed. Three analogies for understanding differences between vividness in auditory imagery and clarity in auditory imagery, and suggestions for potential operational definitions of auditory vividness and auditory clarity in future studies, are provided.
One of the challenges in research on auditory imagery is the ambiguity and proliferation of terminology (Hubbard, 2018), and a striking example of this is the ambiguous relationship of vividness and clarity. In colloquial usage, vividness and clarity seem to be very similar, and it is not clear what the differences, if any, are between the vividness of auditory images and the clarity of auditory images. Furthermore, some well-known questionnaires used in the study of auditory imagery appear to combine vividness and clarity into a single rating (e.g., Betts’ QMI, Sheehan, 1967; Auditory Imagery Scale, Gissurarson, 1992; for discussion, see Willander & Baraldi, 2010). When vividness of imagery has been explicitly defined, it is usually in comparison to perception, with greater vividness defined as a greater similarity to perceptual experience (i.e., a vivid image is more like a percept, Lacey & Lawson, 2013). Clarity has generally not been explicitly defined, but when it is defined, it is defined as some variant of “clear” (e.g., Willander & Baraldi, 2010), which seems circular at best. Appeals to literatures on other modalities of imagery are not helpful, as a similar lack of distinction between vividness and clarity occurs in studies of olfactory imagery (e.g., Arshamian et al., 2008), and surprisingly, the relationship between vividness and clarity has not been empirically addressed in studies of visual imagery. Thus, although the focus here is on auditory imagery, the results will potentially have implications for understanding other modalities of imagery, as well.
If vividness and clarity are different subjective qualities of auditory imagery, then it is possible that vividness or clarity might differ across different groups of people or different types of experience. One of the most studied individual differences variables within the literature on auditory imagery is musical experience (usually defined as the number of years performing in a band or choir or as the number of years of formal musical training; for discussion, see Hubbard, 2010, 2019b). Many tests developed to assess musical aptitude involve auditory imagery (e.g., Gordon, 1965; Seashore et al., 1956; Wing, 1962), and auditory imagery has been suggested to influence musical practice and performance (for overview, see Hubbard, 2019b), musical expression (Bishop et al., 2013), reading of musical notation (Brodsky et al., 2003), and musical composition (Bennett, 1976). Also, individuals with more musical training have better pitch acuity and better temporal acuity in auditory imagery (Janata & Paroo, 2006) and are better at identifying which of two lyrics should be sung on a higher pitch (Aleman et al., 2000). One study found that vividness of auditory imagery (and time engaged in imagery) correlated with timing in musical performance (Clark & Williamon, 2012), and another study found that greater vividness and greater clarity of auditory imagery were reported by music students than by other students (Campos & Fuentes, 2016). However, most studies of auditory imagery in musical practice and performance do not distinguish between vividness and clarity, and it is not clear if musical training increases auditory imagery ability (e.g., increases vividness and clarity) or if individuals with greater auditory imagery ability are more likely to seek musical training (Hubbard, 2010).
Although most research on individual differences in auditory imagery has focused on effects of musical experience, potential relationships of a few other individual differences variables to auditory imagery have been investigated. No differences between males and females in auditory imagery were found with the Betts’ QMI (Ashton & White, 1980; Campos & Campos-Juanatey, 2014), Clarity of Auditory Imagery Scale (Campos & Fuentes, 2016; Willander & Baraldi, 2010), Bucknell Auditory Imagery Scale (Halpern, 2015), Auditory Imagery Scale (Campos, 2017; Gissurarson, 1992), and Auditory Imagery Questionnaire (Campos, 2017). Auditory imagery exhibits at least some cerebral hemispheric asymmetries similar to those in auditory perception (e.g., Prete et al., 2016; Zatorre & Halpern, 1993; for review, Hubbard, 2019a), and it might be possible that handedness could serve as a proxy measure of (the extent of) such asymmetries. However, cerebral hemispheric asymmetries in auditory processing are influenced by the type of stimulus (e.g., speech vs. music), and as vividness, clarity, and control are relevant to all types of auditory stimuli, it is not clear whether effects of handedness on vividness, clarity, or control would occur. Given that different ethnic or cultural groups might encourage greater or lesser cultivation of mental imagery (e.g., Marsella & Quijano, 1974; Noll et al., 1985), there might be ethnic or cultural differences in the vividness, clarity, or control of auditory imagery. No clear predictions regarding many of these non-musical variables can be confidently made; even so, considering possible relationships of such individual differences to vividness, clarity, and control of auditory imagery might provide useful insights.
The usefulness of the terms “vividness” and “clarity” as descriptions of the subjective experience of auditory imagery has been questioned. Stillman and Kemp (1993) suggested that “vividness” might not be as useful a dimension for describing auditory imagery as for describing visual imagery. Willander and Baraldi (2010) questioned the usefulness of “vividness” as a construct in auditory imagery; they suggested that “clarity” was a more useful construct and proposed the Clarity of Auditory Imagery Scale. Surprisingly, comparisons of the vividness of auditory imagery and the clarity of auditory imagery have not been reported, and a clear distinction between vividness and clarity has not been provided in the literature. However, one study reported ratings of the clarity of auditory imagery on the Clarity of Auditory Imagery Scale were positively correlated with ratings of the vividness of visual imagery on the VVIQ-2 but negatively correlated with ratings on the auditory subscale of Betts’ QMI (Campos & Pérez-Fabello, 2011), but as noted earlier, the Betts’ QMI has been suggested to combine vividness and clarity into a single rating. The study reported here administered questionnaires regarding the vividness of auditory imagery and the clarity of auditory imagery and collected information regarding the ability to control (i.e., transform or manipulate) auditory imagery. Correlations between these measures, as well as potential effects of several individual differences variables on the correlations between vividness, clarity, and control, were examined. Possible relationships between vividness and clarity in auditory imagery, as well as potential operational definitions, are proposed.
Method
Participants
Participants were students at the University of South Carolina Upstate, who completed an online survey. The sample included 177 participants (141 women [79.7%], 162 right-handed [91.5%]), ranging from 16 to 38 years of age (M = 20.32, SD = 3.27). Participants were primarily White/Caucasian (74 [41.8%]) or Black/African-American (78 [44.1%]), with smaller numbers self-identifying as Hispanic/Latino (11 [6.2%]), Asian/Pacific Islander (5 [2.8%]), and Other/Choose Not to Disclose (9 [5.1%]). There was a range of experience in performing in a band or choir (no experience = 29 [16.4%], < 1 year = 29 [16.4%], 1–2 years = 44 [24.9%], 2–5 years = 39 [22.0%], and > 5 years = 36 [20.3%]) and of experience in formal music training (no experience = 57 [32.2%], < 1 year = 34 [19.2%], 1–2 years = 46 [26.0%], 2–5 years = 8 [4.5%], and > 5 years = 32 [18.1%]). Participants received partial course credit, and the study was approved by the Institutional Review Board at the University of South Carolina Columbia.
Measures
The online survey consisted of two imagery questionnaires: the Bucknell Auditory Imagery Scale (BAIS, Halpern, 2015), and the Clarity of Auditory Imagery Scale (CAIS, Willander & Baraldi, 2010). The BAIS consists of subscales for vividness (BAIS-V) and for control (BAIS-C), and each subscale has 14 items that are rated on a 1–7 scale. For the BAIS-V, participants were instructed to form an image (e.g., the song “Happy Birthday”, the voice of a clerk on the phone), and they rated the vividness of that image (1 = no image present at all; 7 = as vivid as the actual sound). For the BAIS-C, participants were presented with the names of two sounds (e.g., a children’s choir and an adult choir, a solo saxophone and a saxophone accompanied by piano); participants were instructed to form an image of the first sound and then transform that image to reflect the second sound, and they rated how easy it was to transform the image (1 = no image present at all; 7 = extremely easy to implement the change). The CAIS consists of 16 items (e.g., a clock ticking, a car ignition), and participants were instructed to image each item and rate how clearly they heard the sounds on a 1–5 scale (1 = not at all; 5 = very clear). Participants also answered several demographic questions regarding their sex, age, ethnic group, handedness, years of participation in a band or choir, and years of formal instruction in music.
Procedure
The survey was implemented online in Qualtrics (Provo, UT). The first page of the survey informed participants that the study was about auditory imagery and contained consent information in which participants were informed of their rights as participants in research. After participants clicked a consent box on the first page, they were presented with the BAIS-V, CAIS, and BAIS-C, and the order of scale presentation was randomized across participants. After completing the three scales, participants answered the demographic questions. The entire session lasted approximately 20 minutes.
Results
Average scores (across scale items) were calculated for the BAIS-V, CAIS, and BAIS-C for each participant. Spearman correlations between the averages of each of the three scales were calculated, and for each correlation, effects of potential individual differences involving sex, age, ethnic group, handedness, years of performing in a band or choir, and years of formal musical instruction were considered.
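The scoring and correlation steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the analysis run for the study: the item ratings are invented, the helper names (`average_ranks`, `pearson`, `spearman`) are hypothetical, and the software used for the reported analyses is not stated in the text.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        tied_rank = (start + end) / 2 + 1  # mean of 1-based positions in the run
        for k in range(start, end + 1):
            ranks[order[k]] = tied_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical item ratings for four participants (NOT data from the study).
# Real BAIS-V responses have 14 items rated 1-7; real CAIS responses have
# 16 items rated 1-5. Three items per scale keep the example short.
bais_v_items = [[5, 6, 4], [2, 3, 3], [7, 6, 7], [4, 4, 5]]
cais_items = [[5, 5, 5], [2, 1, 2], [5, 4, 4], [3, 4, 2]]

# One average score per participant per scale, then the rank correlation.
bais_v_mean = [mean(p) for p in bais_v_items]
cais_mean = [mean(p) for p in cais_items]
rho = spearman(bais_v_mean, cais_mean)
print(f"Spearman correlation of scale averages: {rho:.2f}")
```

Ranking with averaged ties matters here because scale averages built from a small number of integer ratings often coincide across participants.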
Correlations Among the Dimensions
As shown in Figures 1 and 2, ratings of clarity were positively correlated with ratings of vividness, rs(174) = .54, p < .001, and with ratings of control, rs(174) = .53, p < .001, respectively. The highly significant positive correlation of clarity and vividness is consistent with the hypothesis that clarity and vividness might involve the same subjective characteristics of the image, although the possibility that clarity and vividness reflect different subjective characteristics (that rely on different processes or mechanisms) that nonetheless are highly similar cannot be ruled out. The highly significant positive correlation of clarity with control was not predicted, and it suggests that auditory images that exhibit more or greater clarity might be easier to transform or manipulate than are auditory images that exhibit less clarity. As shown in Figure 3, ratings of control and ratings of vividness were positively correlated, rs(174) = .62, p < .001, and this is consistent with the positive correlation of the BAIS-V and the BAIS-C reported by Halpern (2015). The highly significant positive correlations across all three scales are consistent (e.g., it is not the case that for two scales that are positively correlated, one of those scales correlates positively with the third scale and the other scale correlates negatively with the third scale).

Figure 1. Correlation of Vividness and Clarity.
Figure 2. Correlation of Control and Clarity.
Figure 3. Correlation of Vividness and Control.
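As a rough check on the size of the correlations reported above, each coefficient can be converted to a t statistic with the common approximation t = r * sqrt(df / (1 - r^2)), and squared to give the proportion of variance shared between the ranked scores. The sketch below assumes that this approximation is appropriate for these data and takes the degrees of freedom as printed in the text; the function name `t_from_r` is hypothetical, and the authors' actual test procedure is not described.

```python
from math import sqrt


def t_from_r(r, df):
    """t statistic corresponding to a correlation r with df degrees of freedom."""
    return r * sqrt(df / (1 - r ** 2))


# Coefficients and degrees of freedom as reported in the text.
reported = {
    "clarity-vividness": 0.54,
    "clarity-control": 0.53,
    "control-vividness": 0.62,
}
DF = 174

for pair, r in reported.items():
    t = t_from_r(r, DF)
    print(f"{pair}: t({DF}) = {t:.2f}, shared rank variance = {r ** 2:.2f}")
```

All three t values exceed 8, far beyond the two-tailed critical value at p = .001 for this many degrees of freedom (roughly 3.3 to 3.4). The squared coefficients fall between about .28 and .38, so well over half of the variance in each pair of ranked scores is not shared.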
Individual Differences
To examine the possibility of individual differences, partial correlations of the ratings of vividness, clarity, and control were calculated with potential effects of sex, age, ethnic group, handedness, years performing in a band or choir, and years of formal music instruction partialed out. For Vividness and Clarity, partialing out effects of sex (rs(174) = .56, p < .001), age (rs(174) = .55, p < .001), ethnic group (rs(174) = .55, p < .001), handedness (rs(174) = .55, p < .001), years performing in a band or choir (rs(174) = .55, p < .001), or years of formal music instruction (rs(174) = .55, p < .001) did not influence the overall correlation. For Clarity and Control, partialing out effects of sex (rs(174) = .52, p < .001), age (rs(174) = .52, p < .001), ethnic group (rs(174) = .51, p < .001), handedness (rs(174) = .51, p < .001), years performing in a band or choir (rs(174) = .51, p < .001), or years of formal music instruction (rs(174) = .51, p < .001) did not influence the overall correlation. For Vividness and Control, partialing out effects of sex (rs(174) = .64, p < .001), age (rs(174) = .64, p < .001), ethnic group (rs(174) = .63, p < .001), handedness (rs(174) = .64, p < .001), years performing in a band or choir (rs(174) = .63, p < .001), or years of formal music instruction (rs(174) = .64, p < .001) did not influence the overall correlation. Overall, none of the individual differences variables had an appreciable effect on the correlations among ratings of vividness, clarity, and control.
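The logic of partialing out a single covariate can be illustrated with the standard first-order partial correlation formula, applied here to rank correlations. This is a sketch under assumptions: the text does not say how the partial correlations were computed or how categorical covariates such as ethnic group were coded, the function name `partial_corr` is hypothetical, and the covariate correlations below are invented for illustration.

```python
from math import sqrt


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


r_vividness_clarity = 0.54  # zero-order value reported in the Results

# Hypothetical covariate correlations (NOT reported in the article).
# A covariate unrelated to both ratings leaves the correlation unchanged.
unchanged = partial_corr(r_vividness_clarity, 0.0, 0.0)

# A covariate weakly related to the two ratings in opposite directions
# nudges the partial correlation slightly upward.
slightly_higher = partial_corr(r_vividness_clarity, 0.10, -0.10)

print(f"covariate unrelated to both ratings: {unchanged:.2f}")
print(f"covariate weakly related (+.10, -.10): {slightly_higher:.2f}")
```

When a covariate correlates only weakly with both ratings, the numerator and denominator of the formula change very little, which is one way the partial correlations can stay within a few hundredths of the zero-order values.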
Discussion
The ratings on all three scales, the BAIS-V, CAIS, and BAIS-C, were positively correlated in all of the pairwise tests. The correlation of primary interest, that between BAIS-V and CAIS, suggests that vividness and clarity reflect the same or highly similar objective criteria or dimensions. Had these measures not exhibited a significant correlation, that would suggest vividness and clarity reflect separate and distinctly different criteria or dimensions. However, the positive correlation between vividness and clarity is also consistent with the possibility that participants did not always or systematically distinguish between vividness and clarity, even if vividness and clarity might have differed on some objective criteria or dimensions. Even so, the correlation of the BAIS-V and CAIS was not perfect, and this is consistent with the possibility of at least some uniqueness, as well as some overlap, in the processes or mechanisms for each dimension. Thus, combining vividness and clarity into a single measure (as done in many previous studies) does not seem appropriate. The correlation between the CAIS and BAIS-C suggests an unpredicted effect in which auditory images that exhibit more or greater clarity are easier for participants to transform than are auditory images that exhibit less clarity. Although the correlation between the BAIS-V and BAIS-C replicates previous findings, the constructs of vividness and control intuitively seem rather different, and it is not clear why they should be highly correlated. Vividness and control might be different aspects of a more general imagery ability or different processes that share a portion of their mechanisms.
The lack of an influence of the individual differences variables related to musical experience on the correlations (i.e., similarities of the full and partial correlations) is somewhat surprising, as previous studies found effects of these variables on ratings regarding experience of involuntary musical imagery (e.g., age, Bailes, 2015; musical experience, Liikkanen, 2012) and that musically trained individuals report greater vividness and clarity (Campos & Fuentes, 2016) of auditory imagery than do individuals who do not have musical training. One potential reason why partialing out the years of musical performance or years of musical instruction did not influence the correlations is that measures of musical experience might have been too narrowly defined as involving only musical production or musical instruction. Even individuals with no experience of musical production or formal music education are still exposed to the musical system of their culture and develop detailed schemata regarding that musical system (e.g., Bigand & Poulin-Charronnat, 2006; Bigand et al., 2003; Krumhansl, 1990). It is possible that mere musical exposure, rather than musical performance or musical instruction, might be more related to vividness, clarity, or control of auditory imagery. Also, many items on the BAIS-V, CAIS, and BAIS-C involve non-musical auditory stimuli, and it might be that musical experience influences auditory imagery for musical stimuli but not for auditory imagery of non-musical stimuli (i.e., perhaps as a function of learning or other experience, different types of stimuli within a given modality can be imaged with relatively more or less vividness or clarity).
The lack of influence of the non-musical individual differences variables on the correlations is not surprising, as some of those variables (e.g., sex) have not previously been found to influence auditory imagery, and others (e.g., handedness) might depend on the type of auditory stimulus that is presented (and the BAIS-V, CAIS, and BAIS-C each involve imagery of multiple types of auditory stimuli). There are at least two other potential reasons why the non-musical individual differences variables did not influence the correlations. First, the participant pool was relatively homogeneous (e.g., age range of only 16–38 years [which limited possible effects of age, years of musical performance, and years of formal musical training], highly unequal numbers of males and females and of left- and right-handedness). Second, participants might not have clearly differentiated between the concepts of vividness, clarity, and control in auditory imagery. Vividness and control of visual imagery are not clearly differentiated within the literature (e.g., Lane, 1977); indeed, many studies of visual imagery fail to apply coherent definitions of vividness or control and are relatively insensitive to individual differences in imagery (e.g., see discussion in Kihlstrom et al., 1991). Relatedly, the relationship between vividness and clarity has not been studied in visual imagery literature. Also, vividness is often used as a general measure of imagery ability, but ratings of general vividness are insensitive to the possibility of different subprocesses (e.g., generation, maintenance, transformation, etc.), each of which might vary across individuals (Lacey & Lawson, 2013) or might vary in clarity.
Given the high correlations observed in the present study, the need for specific operational definitions of vividness and clarity in future studies appears even more critical. One way to approach the development of such definitions is by analogy, and several potential analogies already exist within the literature or can be suggested. One analogy for describing differences between auditory vividness and auditory clarity is that auditory vividness is analogous to the saturation of a visual image and that auditory clarity is analogous to the resolution of a visual image (Hubbard, 2018). If vividness involves similarity to perceptual experience, then a more saturated image would presumably be closer to perceptual experience. Previous studies of resolution in visual imagery (e.g., Finke & Kosslyn, 1980) and auditory imagery (e.g., Janata & Paroo, 2006) described resolution in terms of acuity and precision, respectively, and such an approach might provide a non-circular definition of clarity. Given that vividness reflects similarity to perceptual experience, and that a portion of perceptual experience presumably involves the resolution (i.e., acuity, precision) of that experience, an overlap of vividness and clarity would be expected. The high correlation between vividness and clarity (in the presence of potential phenomenological differences) should be no more troubling than the high correlation between vividness and control (which also have clear phenomenological differences). Even so, just as visual saturation and visual resolution can be separated in experience, so too might auditory vividness and auditory clarity be separated in experience.
A second, albeit related, analogy for describing differences between auditory vividness and auditory clarity is that auditory vividness reflects the general strength or activation of the image (cf. Baddeley & Andrade, 2000), and auditory clarity reflects the level of focus within the image. Both strength and focus might be similarly diminished by interference from the activation or presence of other information not relevant to the imaged stimulus or diminished by decay of the representation, and this might contribute to the observed correlations of vividness and clarity. Additionally, viewing clarity as focus might help account for the correlation of clarity and control, as it should be easier to focus on and manipulate the content of a clearer image. A third analogy for describing differences between auditory vividness and auditory clarity is that auditory vividness reflects a visceral sense of how the stimulus is experienced (i.e., feeling or sensation), and auditory clarity reflects the level of detail or resolution of that experience (i.e., perception). This would map onto a more general distinction between sensation and perception, with vividness and clarity reflecting subjective properties and objective properties, respectively, of a stimulus. This is consistent with the definition of vividness as reflecting the similarity of the image to perceptual experience (e.g., Lacey & Lawson, 2013) and with the importance of feeling in cognitive processing (e.g., Damasio, 1994, 1999). Visceral experience of a stimulus could be related to but not necessarily identical with the level of detail at which that stimulus can be sensed, and so vividness and clarity might overlap while still being distinct dimensions.
The correlation of clarity and control is suggestive. A greater resolution or acuity in an auditory image suggests that an individual would be able to make more precise discriminations and encode smaller changes or parts within an image (e.g., Janata & Paroo, 2006), and this could increase the amount of information encoded within that image. Auditory images containing more information would be more complex (cf. Attneave, 1955), and the complexity of a given image might influence the cognitive processing of that image. For example, visual (Kosslyn et al., 1983) and auditory (Hubbard & Stoeckig, 1988) images consisting of more parts require more time to generate. It could be predicted that auditory images consisting of more parts (i.e., containing more information or that are more complex) might require more time and effort to transform or manipulate. Such a finding would be inconsistent with the positive correlation observed between clarity and control, although it would be consistent with findings that more complex visual images require more time to mentally rotate (Folk & Luce, 1987). Relatedly, effects of complexity on visual mental rotation are influenced by whether rotation involves all or part of an imaged stimulus; analogous effects regarding whether transformation or manipulation of an auditory image is holistic or piecemeal have not been reported. Also, an increase in the information or complexity of an image could potentially increase the vividness of that image (alternatively, a higher level of information or complexity might require a higher minimum level of vividness), and this would be consistent with the positive correlation observed between vividness and clarity.
In much of the literature on auditory imagery, it is not clear if the terms “vividness” and “clarity” refer to the same subjective quality or to different subjective qualities of auditory imagery. The data reported here suggest that vividness and clarity are clearly not orthogonal, as the correlations suggest a significant, but not complete, overlap of these constructs. It is not clear whether vividness and clarity are equally useful constructs in developing a theory of auditory imagery or whether one construct might be more useful. Also, control was rated as easier for images that were clearer or more vivid, thus suggesting manipulation of images is influenced by properties of the content of those images. Although previous ambiguities in definitions could be interpreted as suggesting a proliferation of terms for the same vague concept, the correlations reported here suggest the terms “vividness” and “clarity” might have different referents, with vividness referring to the saturation, strength (activation), or visceral feeling of the imaged stimulus, and with clarity referring to the resolution, precision (acuity), or focus of the imaged stimulus. Relatedly, and as noted in visual imagery literature, the notion of “vividness” might collapse across too many processes to be a useful measure in theory development; whether “clarity” similarly collapses across processes is not clear (e.g., would clarity during image generation be related to clarity during image transformation?). Although many questions remain, the data, distinctions, and definitions proposed here provide a useful first step to specifying the similarities and differences of the vividness and clarity of auditory imagery.
Footnotes
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
