Abstract
This research aims to advance the understanding of audio branding by investigating the effect of an understudied auditory attribute, timbre, in the context of brand audio logos. Specifically, the authors propose, and provide evidence in ten studies, that timbral sound quality in audio logos (i.e., roughness/smoothness) informs abstract judgments of brand personality (i.e., ruggedness/sophistication). Study 1 shows that the industry practice of altering instrumentation, and thus timbre, in audio logos can change personality perceptions of even well-known brands. This effect persists when the sound source is kept constant with various instruments (Studies 2a–2d), with a combination of instruments (Study 3), and in the absence of an identifiable sound source (Study 4). The authors test specific acoustic underpinnings of timbral sound quality perceptions (Study 4) and show that the effect on brand personality judgments is counteracted by incongruent sensory information from another modality (Study 5). The results of Study 6 suggest that the influence of timbral sound quality on brand personality perceptions is nonconscious, as consumers are unaware of the extent to which the stimulus affects their judgments. Study 7 shows downstream consequences for purchase intentions. Practical implications, theoretical contributions, and directions for future research are discussed.
Although audio branding has historically been neglected compared with visual branding, the broad adoption of voice technology, which creates novel auditory brand–consumer touchpoints, has sparked industry interest (Dooley 2021). Indeed, the number of brands using audio branding elements, such as audio logos, is increasing (amp 2022). Exemplifying this trend, Mastercard recently spent millions on a new audio branding concept (Armstrong 2019). However, despite the growing popularity of audio branding, a flourishing literature on sensory marketing has dedicated less attention to the auditory modality than to other sensory modalities (e.g., Krishna and Schwarz 2014). Reflecting the relative paucity of academic research, practitioners’ understanding of how auditory parameters can be leveraged to inform consumers’ brand perceptions is often limited, forcing them to resort to intuition (Areni 2003). This research aims to advance the understanding of audio branding by examining timbre, an understudied yet ecologically important auditory parameter, and its potential to communicate a brand's personality.
The American National Standards Institute (ANSI) defines timbre as “that attribute of auditory sensation which enables a listener to judge that two non-identical sounds, similarly presented and having the same loudness and pitch, are dissimilar” (ANSI 1994, p. 35). This definition is traditionally used in both acoustics and music perception (e.g., McAdams 2019; Siedenburg et al. 2019) and defines timbre at a perceptual level. The perception of timbre not only provides the basis for distinguishing different sound sources (i.e., what produces the sound: e.g., instrument) but also informs differences in sound quality perceptions (i.e., what it sounds like: e.g., rough/smooth). Given the information richness of timbre, it has been referred to as the most ecologically important feature in auditory sensation (e.g., Handel 1995; Menon et al. 2002; Nunes and Ordanini 2014; Siedenburg et al. 2019). At the same time, timbre is considered the least well understood auditory property due to its complexity (e.g., Siedenburg, Jones-Mollerup, and McAdams 2016) and is understudied in marketing (Bruner 1990).
We investigate timbre as an auditory design parameter in the practically relevant context of brand audio logos, which are the core element of audio branding and the sonic equivalent of visual brand logos (Steiner 2014). Although visual logo design and its effects on consumers have received much attention in the marketing literature (e.g., Cian, Krishna, and Elder 2014; Fajardo, Zhang, and Tsiros 2016; Hagtvedt 2011; Janiszewski and Meyvis 2001; Jiang et al. 2016; Morgen, Fajardo, and Townsend 2021), audio logos have received limited attention (but see Krishnan, Kellaris, and Aurand 2012; Mas et al. 2020). Audio logos are most commonly melodic and, as such, consist of a short sequence of notes that are designed to be memorable, make a brand auditorily identifiable, and communicate a brand's image (Bronner and Hirt 2009; Krishnan and Kellaris 2021). We propose that timbre plays an important role in communicating a brand's personality, a key aspect of brand image (e.g., Park, Jaworski, and MacInnis 1986). Specifically, we examine empirically how rough and smooth timbral sound quality systematically affects consumers’ brand personality perceptions on the dimensions of ruggedness and sophistication (Aaker 1997).
In doing so, we aim to make several substantive and theoretical contributions. To elaborate, this research provides practical insights of relevance to multiple audio branding stakeholders. That is, brand managers gain insights into the implications of using different timbres in audio logos for brand personality judgments, brand (re)positioning, and purchase intentions. In addition, by investigating a specific auditory design parameter, this research informs the growing community of specialized audio branding agencies, their audio engineers, and composers, who are tasked by brand managers to design audio logos that communicate a brand's personality. Further, this research contributes theoretically to diverse streams of literature. That is, we add to the growing literature on sensory marketing by investigating how a sensory cue can influence abstract attributes in the relatively neglected auditory modality (Krishna and Schwarz 2014). By focusing on timbral sound quality, we expand the conceptualization of timbre in marketing beyond sound source (Bruner 1990), which offers new research opportunities. Finally, we add to the literature in acoustics and music perception both by showing that timbral sound quality perception can elicit abstract meaning in terms of personality dimensions and by investigating acoustic correlates of such sound quality perceptions.
Theoretical Background
Before developing the conceptual framework, we provide a brief review of literature on music in marketing to situate the present research, organized by a distinction between molar (i.e., entire musical piece) and molecular (i.e., individual design parameter) units of analysis.
Prior Research on Music in Marketing
Music is more than the sum of its parts (Meyer 1956), and therefore it is not surprising that the majority of marketing investigations have taken a molar approach, investigating effects of characteristics of entire musical pieces (e.g., Areni and Kim 1993; Gorn 1982; Kellaris and Cox 1989; Kellaris, Cox, and Cox 1993; MacInnis and Park 1991; Nunes, Ordanini, and Valsesia 2015; Park and Young 1986; Valsesia, Nunes, and Ordanini 2016; Zhu and Meyers-Levy 2005). Although putting composite parameters aside in favor of holistic characteristics clearly has merits, this approach makes it difficult to evaluate which parameters in a piece of music create the desired effects. However, an understanding of individual parameters is important to guide their effective use and combination, as prior research suggests that much of listeners’ interpretation of music can be accounted for by the effects of individual parameters (e.g., Juslin and Lindström 2011; Scherer and Oshinsky 1977), thus supporting the merits of a molecular analysis.
Musical parameters have been broadly categorized into pitch (e.g., melody, harmony, tonality), time (e.g., tempo, rhythm, phrasing), and texture (e.g., loudness, timbre; Bruner 1990). Pitch parameters have received the most empirical attention, with many investigations studying the effects of fundamental frequency (e.g., Hagtvedt and Brasel 2016; Lowe and Haws 2017; Lowe, Loveland, and Krishna 2019) or tonality (Kellaris and Kent 1992, 1993; Knöferle et al. 2012), followed by time parameters, largely focusing on tempo (e.g., Kellaris and Kent 1993; Knöferle et al. 2012; Milliman 1986). Texture parameters, however, have received relatively little attention. Kellaris and Rice (1993) investigated effects of loudness on consumers’ hedonic responses to music, and, more closely related to the present research, Kellaris and Kent (1993) varied texture in addition to tempo and tonality to investigate effects on aesthetic judgments. Their manipulation of texture entailed a classical composition and a pop-style composition, which, as they acknowledged, varied not only in terms of the instruments used for orchestration and thus timbre, but also in terms of other musical parameters. To the best of our knowledge, this research is the first in marketing to investigate timbre in isolation and, specifically, its sound quality.
Timbre
As defined previously, the perception of timbre is based on the acoustic information remaining when pitch, duration, and loudness are equated. The perception of pitch corresponds to the lowest frequency of a tone (i.e., fundamental frequency = F0) measured in hertz (Hz), the perception of duration corresponds to time measured in seconds, and the perception of loudness corresponds to amplitude measured in decibels (dB). Whereas these three percepts are essentially based on unidimensional properties, the percept of timbre is based on multidimensional acoustic properties (Siedenburg, Jones-Mollerup, and McAdams 2016; Siedenburg et al. 2019).
The multidimensional acoustic properties of timbre can be summarized at a high level as follows: Complex sounds, such as those created by musical instruments, consist of overlaid simple waves (i.e., sinusoids), the lowest of which is F0 (e.g., 440 Hz), whereas multiples of F0 (e.g., 880 Hz) are referred to as partials. The spectral and temporal envelopes of such partials are the acoustic basis for the perception of timbre (Roederer 2000; Siedenburg et al. 2019). The spectral envelope is a function of the different frequencies included in a complex tone and their relative amplitude (e.g., 440 Hz at 90 dB, 880 Hz at 70 dB), and the temporal envelope is a function of changes in this amplitude–frequency plane over time. While acoustically complex, this information is integrated into a perceptual gestalt (Roederer 2000; Siedenburg, Jones-Mollerup, and McAdams 2016), which can be interpreted intuitively to provide two distinct pieces of information (Siedenburg and McAdams 2017; Siedenburg et al. 2019): sound source and sound quality.
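As a rough illustration of how a spectral envelope specifies a complex tone, the following Python sketch sums sinusoids from (frequency, level) pairs using the example values given above. The sample rate, the duration, the 90 dB reference level, and the function names are assumptions introduced for illustration and are not taken from the study materials.

```python
import math

SAMPLE_RATE = 44_100  # samples per second (assumed)


def db_to_amplitude(level_db, reference_db=90.0):
    """Convert a level in dB to a linear amplitude relative to reference_db (assumed)."""
    return 10 ** ((level_db - reference_db) / 20)


def complex_tone(partials, duration_s=0.5, sample_rate=SAMPLE_RATE):
    """Sum sinusoids given as (frequency_hz, level_db) pairs into one waveform."""
    n_samples = int(duration_s * sample_rate)
    amplitudes = [(freq, db_to_amplitude(level)) for freq, level in partials]
    return [
        sum(amp * math.sin(2 * math.pi * freq * n / sample_rate) for freq, amp in amplitudes)
        for n in range(n_samples)
    ]


# Spectral envelope from the example in the text: F0 at 440 Hz / 90 dB, partial at 880 Hz / 70 dB.
spectral_envelope = [(440.0, 90.0), (880.0, 70.0)]
tone = complex_tone(spectral_envelope)
```

Changing only the levels in `spectral_envelope` alters the spectral envelope while F0, and thus pitch, stays at 440 Hz.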
Sound source identification
Timbre is the primary vehicle for sound source recognition and identification (Handel 1995; McAdams 1993, 2013). There are some acoustic properties within timbre, called invariants, which remain constant and are “shared by objects that, at some level of description, can be considered the same” (Michaels and Carello 1981, p. 25). This allows listeners to identify and recognize sound sources depending on previous encounters (McAdams 1993). Prior conceptualizations of timbre in marketing have focused on timbre in its capacity to represent a sound source (see Bruner 1990). However, although timbre varies as a function of source, it is not equivalent to a sound source. For instance, a saxophone played with a given articulation, playing effort, and embouchure technique or added effects produces a distinct timbre despite emanating from and signifying the same sound source (McAdams and Goodchild 2017). Thus, an instrument does not create “a” timbre, but can create a wide range of timbres varying in sound quality (Siedenburg and McAdams 2017; Siedenburg et al. 2019).
Sound quality perception
Unlike sound source identification, sound quality can be interpreted independently of prior encounters. In simple terms, sound quality determines “what it sounds like” (Handel 1995, p. 426). Whereas auditory percepts based on unidimensional properties can be described along a single dimension (e.g., pitch: high–low; duration: short–long; loudness: quiet–loud), timbral sound quality resulting from the fused percept of complex spectro-temporal acoustic information can be described on multiple dimensions. As consumers lack a specific sensory vocabulary for timbral sound quality perceptions, they use semantic descriptors from other modalities (e.g., Saitis and Weinzierl 2019). The sensory dimensions, which are most often used to describe perceptions of timbral sound quality, are luminance (e.g., dark, bright), mass (e.g., heavy, light), and tactility (e.g., rough, smooth; Bismarck 1974; Saitis and Weinzierl 2019; Zacharakis, Pastiadis, and Reiss 2012, 2015). The tactile dimension enjoys high agreement across listeners cross-culturally (Zacharakis, Pastiadis, and Reiss 2012) and is the focus of the present research.
The Present Research
Mas et al. (2020) manipulated melodic contour, tempo, and loudness of audio logos while holding timbre constant and showed that although electrodermal activity, heart rate, and emotion perception were affected, brand personality was not. Our central hypothesis is that sensory information in the form of timbral sound quality elicits abstract meaning, which we test specifically in the context of the effect of rough/smooth sound quality perceptions on rugged/sophisticated brand personality perceptions (Aaker 1997). Focusing on a subset of dimensions is in line with prior empirical research on brand personality (e.g., Aaker, Fournier, and Brasel 2004; Batra and Homer 2004; Sundar and Noseworthy 2016).
Elicitation of Abstract Meaning Through Timbral Sound Quality
Musical stimuli can affect consumers’ interpretation of abstract meaning through an embodied and a referential route (Meyer 1956; Zhu and Meyers-Levy 2005). Referential meaning relates to meaning elicited by activating a network of associations with sound-extrinsic concepts that a stimulus brings to mind (Radocy and Boyle 2012), whereas embodied meaning is intrinsic to the stimulus and refers to meaning evoked by patterns within the sound (Meyer 1956). Timbre may trigger referential meaning by activating a network of associations that are independent of the sound itself but held with the sound source a timbre signifies (i.e., a musical instrument). However, such referential meaning activation is contingent on identifying the instrument through timbre. This research focuses on embodied meaning elicited by sound-intrinsic timbral sound quality perceptions that are not subject to this contingency and can be interpreted independent of sound source identification. Whereas previous research has conceptualized embodied meaning as “the hedonic value or positive feelings that may simply emerge from the sound within the music” (Zhu and Meyers-Levy 2005, p. 333), we propose that musical embodied meaning is not limited to a hedonic apprehension, but may elicit abstract meaning by serving as the sensory source domain in the process of scaffolding.
Scaffolding is a natural process by which consumers develop abstract knowledge structures on the basis of physical concepts acquired early in life through their sensory experience of their environment (Bargh 2006; Williams, Huang, and Bargh 2009). As such, the sensory source domain and the abstract target domain become linked, and a sensory perception can automatically activate more abstract, higher-level concepts (Williams, Huang, and Bargh 2009). The conceptual metaphor framework makes a similar argument linking physical source domains to abstract target domains that consumers cannot grasp with their senses (Lakoff and Johnson 1999). Prior research in marketing has successfully applied this theorizing in the visual modality, for instance, to uncover that up vertical positions cue associations with rationality, whereas down vertical positions cue associations with emotionality (Cian, Krishna, and Schwarz 2015).
We propose that rough timbral sound quality perceptions cue associations with ruggedness characteristics such as rugged, tough, strong, and powerful, whereas smooth timbral sound quality perceptions cue associations with sophistication characteristics, such as sophisticated, glamorous, prestigious, and high-class (Aaker 1997). To elaborate, “rough” can be defined as having an uneven and bumpy surface and “smooth” as having an even and continuous surface (Merriam-Webster 2022a, b). It is thus conceivable that abstract personality characteristics related to the ruggedness or sophistication dimension derive their meaning—at least in part—from these perceptual dimensions. That is, although abstract concepts, such as personality characteristics, cannot be directly grasped with the senses (Lakoff and Johnson 1999), they can be described and understood at a more concrete and intuitive level in terms of sensory perceptions as being rough, uneven, irregular, jagged, or hard (vs. smooth, even, regular, level, or soft). For instance, a rugged personality can be described as being “rough around the edges” and a sophisticated personality as “well rounded.” A dictionary review suggests a semantic association between these sensory perceptions and abstract personality dimensions (see Web Appendix A). These associations were confirmed by an implicit association test (N = 87; Greenwald, McGhee, and Schwartz 1998) showing that consumers categorize words reflecting the dimensions of roughness, smoothness, ruggedness, and sophistication faster when grouping them in congruent (i.e., rough/rugged vs. smooth/sophisticated) versus incongruent (rough/sophisticated vs. smooth/rugged) categories (MDScore = .72, SDDScore = .39; t(86) = 17.32, p < .001, d = 1.86; see Web Appendix B).
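The reported test statistic and effect size can be approximated from the summary statistics, assuming the D-scores were tested against zero with a one-sample t-test (an assumption that is consistent with the 86 degrees of freedom). Because the reported mean and standard deviation are rounded to two decimals, the recomputed values only approximate those reported. The function name is hypothetical.

```python
import math


def one_sample_summary(mean, sd, n):
    """Cohen's d, t statistic, and degrees of freedom for a one-sample test against zero."""
    d = mean / sd                   # standardized mean difference from zero
    t = mean / (sd / math.sqrt(n))  # mean divided by its standard error
    return d, t, n - 1


# Rounded summary statistics as reported for the implicit association test.
d, t, df = one_sample_summary(mean=0.72, sd=0.39, n=87)
print(f"d = {d:.2f}, t({df}) = {t:.2f}")
```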
Acoustic Underpinnings of Timbral Sound Quality and Their Relation to Abstract Meaning
While these results show such a relationship at a purely semantic level, scaffolding suggests that sensory perception could directly trigger abstract meaning. In the following, we propose acoustic underpinnings of rough (smooth) timbral sound qualities and suggest that they may trigger a rugged (sophisticated) personality due to the physical acts of sound elicitation that they often arise from, which can be naturally tied to ruggedness (sophistication) characteristics.
Rough timbral sound quality
We propose that rough sound quality perception is, at least in part, a function of attack time and sensory dissonance. Attack time is a significant aspect of the temporal envelope and denotes the time from tone onset until its maximum amplitude is reached (e.g., McAdams 2013). The shorter the attack time, the steeper is the slope of the temporal envelope. This is often the case when a tone is elicited relatively abruptly. Sensory dissonance (also referred to as auditory roughness) is a function of the closeness of the frequencies of sinusoids (e.g., Farbood and Price 2017; Sethares 1998). Adding inharmonic partials (i.e., fractional multiples of F0; e.g., 440 Hz × 1.2 = 528 Hz) leads to a closer pairing of frequencies as they occupy spaces between harmonic partials (i.e., integer multiples of F0; Farbood and Price 2017), and closely paired frequencies are difficult to resolve by the cochlea, a part of the hearing system (Saitis and Weinzierl 2019; Sethares 1998). Sensory dissonance can be achieved through the addition of inharmonic partials via instrument specific techniques (e.g., forceful embouchure in brass instruments, increased bow pressure in string instruments) or added effects (e.g., overdrive or distortion effects through signal clipping; Tsai et al. 2010). Shorter attack times and sensory dissonance are often a function of a physical mechanism of sound elicitation that is more abrupt, demanding, and forceful, resulting in waveforms that also look irregular, uneven, jagged, and rough (see Figure 1, Panel A). Consequently, due to the natural correlation between more forceful and abrupt physical acts of sound elicitation and a rough sound quality, manifested as shorter attack time and sensory dissonance in the acoustic realm, a rugged personality may be directly evoked.

Figure 1. Illustrative Oscillograms.
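The two proposed correlates of rough sound quality, a short attack time and inharmonic partials, can be sketched with simple additive synthesis. This is a minimal illustration and not the synthesis procedure used in the studies: the linear attack ramp, the 5 ms attack, the equal weighting of partials, and every multiplier other than the 1.2 example from the text are assumptions.

```python
import math

SAMPLE_RATE = 44_100  # samples per second (assumed)


def attack_envelope(n_samples, attack_s, sample_rate=SAMPLE_RATE):
    """Linear ramp from 0 to 1 over attack_s seconds, then held at 1 (assumed shape)."""
    attack_samples = max(1, int(attack_s * sample_rate))
    return [min(1.0, n / attack_samples) for n in range(n_samples)]


def attack_time(envelope, sample_rate=SAMPLE_RATE):
    """Time in seconds from tone onset until the envelope first reaches its maximum."""
    return envelope.index(max(envelope)) / sample_rate


def synthesize(f0, multipliers, attack_s, duration_s=0.5, sample_rate=SAMPLE_RATE):
    """Additive synthesis: equal-weight partials at f0 * multiplier, shaped by the attack."""
    n_samples = int(duration_s * sample_rate)
    envelope = attack_envelope(n_samples, attack_s, sample_rate)
    freqs = [f0 * m for m in multipliers]
    return [
        envelope[n]
        * sum(math.sin(2 * math.pi * f * n / sample_rate) for f in freqs)
        / len(freqs)
        for n in range(n_samples)
    ]


# Harmonic partials are integer multiples of F0. The multiplier 1.2 is the text's
# inharmonic example (440 Hz x 1.2 = 528 Hz); 2.2 is an additional illustrative value.
harmonic = [1, 2, 3]
inharmonic = [1, 1.2, 2, 2.2, 3]

clean_tone = synthesize(440.0, harmonic, attack_s=0.005)
rough_tone = synthesize(440.0, inharmonic, attack_s=0.005)  # abrupt onset, close partials
rough_envelope = attack_envelope(len(rough_tone), 0.005)
```

Comparing `clean_tone` with `rough_tone` isolates the contribution of the inharmonic partials, because both share the same attack.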
Smooth timbral sound quality
We propose that smooth sound quality perception is, at least in part, a function of attack time and frequency modulation. To elaborate, the longer the attack time is, the gentler is the slope of the temporal envelope. This is often the case when a tone is elicited in a more gradual fashion. Frequency modulation is a regular and continuous variation in the signal frequency according to the instantaneous value of the modulating waveform along with a cyclical modulation in amplitude. Vibrato articulation, which can be used on all pitched instruments, introduces such frequency modulation, typically within the range of 4–6 Hz (e.g., Fritz et al. 2010; Maher 2008). It can be achieved through instrument-specific techniques (e.g., embouchure techniques for wind instruments, or gentle backward and forward rocking of the finger that cyclically alters the vibrating length of the string for string instruments) or added effects. Longer attack times and frequency modulation are often a function of a more gradual, delicate, and precise mechanism of sound production, with resulting waveforms that also look cyclical, regular, even, and smooth (see Figure 1, Panel B). Consequently, due to the natural correlation between more gradual and gentle physical acts of sound elicitation and a smoother sound quality, manifested as longer attack time and vibrato frequency modulation in the acoustic realm, a sophisticated personality may directly be evoked.
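A longer attack combined with vibrato can be sketched in the same way. The example below accumulates phase from a slowly varying instantaneous frequency; the 5 Hz vibrato rate lies within the 4–6 Hz range mentioned above, whereas the 6 Hz depth, the 200 ms linear attack, and the function name are assumptions. The accompanying cyclical amplitude modulation is omitted for brevity.

```python
import math

SAMPLE_RATE = 44_100  # samples per second (assumed)


def vibrato_tone(f0=440.0, vibrato_rate_hz=5.0, vibrato_depth_hz=6.0,
                 duration_s=1.0, attack_s=0.2, sample_rate=SAMPLE_RATE):
    """Sine tone whose frequency oscillates around f0, with a gradual linear attack."""
    n_samples = int(duration_s * sample_rate)
    attack_samples = max(1, int(attack_s * sample_rate))
    phase = 0.0
    samples, inst_freqs = [], []
    for n in range(n_samples):
        t = n / sample_rate
        # Instantaneous frequency varies regularly and continuously around f0.
        inst_freq = f0 + vibrato_depth_hz * math.sin(2 * math.pi * vibrato_rate_hz * t)
        # Integrate frequency into phase so the waveform stays continuous.
        phase += 2 * math.pi * inst_freq / sample_rate
        amplitude = min(1.0, n / attack_samples)  # gentle slope of the temporal envelope
        samples.append(amplitude * math.sin(phase))
        inst_freqs.append(inst_freq)
    return samples, inst_freqs


smooth_tone, instantaneous_freqs = vibrato_tone()
```

Accumulating phase, rather than computing `sin(2 * pi * inst_freq * t)` directly, avoids the growing frequency error that the direct form would introduce.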
Cross-Modal Integration of Sensory Qualities in Abstract Meaning Elicitation
We have argued that sensory information in the form of rough (smooth) timbral sound quality may elicit embodied meaning in terms of a rugged (sophisticated) brand personality through the process of scaffolding. To the extent that this is the case, sensory information from other modalities (e.g., rough- or smooth-looking visuals) should have a similar effect. Thus, the elicited abstract meaning should be jointly determined by perceptions of accessible sensory information across modalities. Such an account of cross-modal sensory integration suggests an important boundary condition. That is, visual sensory information that is incongruent with timbral sound quality—such as a rough (smooth) visual logo presented alongside a smooth (rough) audio logo—should counteract the effect of sound quality on brand personality perceptions.
Transfer of Abstract Meaning Elicited by Timbral Sound Quality to Brands
Prior research suggests that consumers frequently infer values of unspecified attributes using available information (e.g., Ahluwalia, Unnava, and Burnkrant 2001), and, specifically, sensory cues have been shown to lead to inferences about abstract attributes (Krishna 2012). For instance, vowel sounds in brand names (Yorkston and Menon 2004) or absolute shifts in pitch in music (Lowe and Haws 2017) have been shown to influence consumers’ impressions of brand and product attributes. However, the question arises whether timbral sound quality in brand audio logos influences brand personality judgments in a controlled or nonconscious manner. To elaborate, timbral sound quality is intuitive to interpret and, given the ecological relevance of timbre in auditory events in general, consumers may be particularly attuned to it, making it, arguably, highly accessible. In addition, audio logos are created or commissioned by a brand and, as such, arguably have some diagnostic value for the formation of brand perceptions. Thus, it is conceivable that consumers may use accessible timbral sound quality in a controlled manner as a function of its informative value for brand judgments (Lynch, Marmorstein, and Weigold 1988). Alternatively, consumers may be unaware of the extent to which sound quality influences judgments, and use it nonconsciously as a function of its mere accessibility, irrespective of how informative it is (Menon and Raghubir 2003; Raghubir 2008).
Empirical Overview
We test our theorizing in ten studies. Study 1 shows that changing instrumentation and thus timbre in an otherwise identical audio logo systematically affects brand perceptions of a well-known brand (Coca-Cola). Studies 2a–2d, using different visual and audio logos drawing from four major instrument families, show that rough (smooth) timbral sound quality systematically affects ruggedness (sophistication) brand personality perceptions even when the sound source (i.e., instrument) is kept constant. Study 3 replicates and generalizes the effect to polyphonic orchestrations (i.e., multiple sound sources) and assesses brand personality between participants. Study 4 provides evidence for the proposed acoustic correlates of rough and smooth sound quality perceptions by manipulating them directly through additive synthesis, examines their effect in comparison with a “clean” reference timbre as a control condition, and generalizes the effect of sound quality to unidentifiable sound sources. Study 5 shows that visual brand logos that are perceptually incongruent with the sound quality of an audio logo counteract the effect of timbral sound quality on brand personality perceptions. Study 6 tests whether the transfer of meaning elicited by timbral sound quality to a brand is a controlled or nonconscious process and shows that, consistent with a nonconscious process, the effect of timbral sound quality persists when its informativeness is discredited. Finally, Study 7 shows downstream consequences for purchase intentions. All data and materials are available at https://osf.io/64c3e/?view_only=a7401069d8824b6998b287d5781ccf4d.
In the following, we elaborate on the experimental procedure common across all studies.
Sound check
At the beginning of each study, participants listened to a sound file containing a short verbal message, which was followed by a surprise sound check. Because the manipulations were auditory, participants who failed the sound check were directed out of the survey and were not assigned to an experimental condition (see Web Appendix C).
Manipulation
Across studies, participants were presented with short videos containing brand visual and audio logos (i.e., short melodies). Audio logos were created in consultation with a composer and music editor using industry standard tools: the digital audio work station Pro Tools and the high-fidelity virtual instrument packages Komplete Ultimate and EastWest (except Study 4, which used an additive synthesis approach in MATLAB). Timbre was manipulated between conditions, and duration, pitch, and loudness were kept constant across conditions according to the ANSI (1994) definition (see Web Appendix D for integrated loudness measurement and equation; Web Appendix E for pictures of visual brand logos, musical notation of auditory stimuli, and extracted audio descriptors for all studies; and Web Appendix F for an explanation of audio descriptors).
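Holding loudness constant across conditions can be illustrated by matching signal levels. The sketch below uses root-mean-square (RMS) level as a simplified stand-in; it is not the integrated loudness measurement described in Web Appendix D, and the stimuli, sample rate, and function names are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square level of a signal."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def match_rms(samples, target_rms):
    """Scale samples so their RMS equals target_rms (silent input is returned unchanged)."""
    current = rms(samples)
    if current == 0.0:
        return list(samples)
    gain = target_rms / current
    return [s * gain for s in samples]


# Two hypothetical stimuli that share pitch and duration but differ in level.
sample_rate = 8_000
quiet = [0.2 * math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]
loud = [0.8 * math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]

loud_matched = match_rms(loud, rms(quiet))
```

RMS matching equates signal energy only; perceptually weighted loudness measures can diverge from it when stimuli differ in spectral content, as timbre manipulations do.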
Supplemental measures
We assessed self-rated musical proficiency in all studies, which did not affect the results and is therefore not discussed further (see Web Appendix G).
Study 1: Changing Instrumentation in Brand Audio Logos
Like tempo or tonality, timbre is an absolute musical parameter and is therefore less relevant for the identity of a piece of music, which is based on relative relationships of pitch and time aspects (Schellenberg, Iverson, and McKinnon 1999). To illustrate, the recognition of “Happy Birthday” should be relatively unaffected whether it is played on a guitar or violin, faster or slower, in C or F major, as long as its melody, the pitch- and time-related intervallic structure, is preserved. Thus, perhaps unsurprisingly, it is not uncommon for brands to change the instrumentation of their audio logo to fit different ad themes while maintaining its identity-signaling function. Although changing instrumentation admittedly changes not only timbral sound quality but also sound source, Study 1 mimicked this industry practice to investigate whether changing timbre affects personality perceptions in ways that are potentially unbeknownst to brand managers. We used a well-known brand, Coca-Cola, which has a propensity to vary instrumentation in its existing audio logo melody and is considered among the top U.S. audio brands (amp 2021).
Method
We aimed to recruit at least 50 students per condition but collected responses from as many participants as were available in the laboratory (N = 148; 76 female, 72 male; Mage = 20.09 years, SD = .94). Participants were presented with a video containing the visual logo and audio logo melody of Coca-Cola and were randomly assigned to one of two instrumentation conditions: rough e-guitar (string plucked and distortion added for shorter attack time and sensory dissonance) or smooth violin (string bowed and vibrato articulation for longer attack time and frequency modulation). To account for existing personality perceptions, we assessed all five brand personality dimensions on seven-point scales. In line with prior research, we assessed each dimension with four items (e.g., Aaker, Fournier, and Brasel 2004; Sundar and Noseworthy 2016): ruggedness (tough, strong, powerful, rugged; α = .81), sophistication (glamorous, sophisticated, prestigious, high-class; α = .91), competence (efficient, reliable, responsible, dependable; α = .81), excitement (daring, spirited, imaginative, up-to-date; α = .69), and sincerity (honest, domestic, genuine, cheerful; α = .78; Aaker 1997; see Web Appendix H for item selection). In this and all following studies, all personality items were presented in randomized order. Participants then rated rough and smooth sound quality (order randomized) on seven-point scales and answered exploratory questions (see Web Appendix I).
Results
Sound quality perception
A 2 (instrumentation: rough e-guitar, smooth violin) × 2 (sound quality perception: rough, smooth) mixed analysis of variance (ANOVA) revealed an effect of timbre, F(1, 146) = 4.95, p = .028,
Brand personality perception
A 2 (instrumentation: rough e-guitar, smooth violin) × 5 (brand personality: competence, excitement, sincerity, ruggedness, sophistication) mixed ANOVA revealed no effect of timbre (F < 1), a significant effect of brand personality (F(4, 584) = 23.61, p < .001,
Table 1. Coca-Cola Brand Personality Perception in Study 1.
Notes: Means with common superscripts are not significantly different from each other within condition at p < .05.
Discussion
These results suggest that the industry practice of changing instrumentation, and thus timbre, in audio logos can change personality perceptions of even a well-known brand. Further, the effect of rough e-guitar (smooth violin) timbre was relatively specific to ruggedness (sophistication). We report a conceptual replication with Intel showing similar results in Web Appendix J. Since different instruments were used, these effects may also have been driven by sound-extrinsic referential meaning arising from identification of the respective sound source. In the following studies, we manipulated sound quality while keeping the sound source constant.
Studies 2a–2d: Manipulating Sound Quality While Keeping Source Constant
Studies 2a–2d aimed to demonstrate the effect of timbral sound quality in audio logos on brand personality perceptions while keeping the sound source constant across four primary instrument families to demonstrate generalizability: guitars (e.g., e-guitar), classical strings (e.g., violin), keyboards (e.g., organ), and wind instruments (e.g., saxophone). Across conditions, we manipulated the sound quality (i.e., rough vs. smooth) through musical articulation. In the rough sound quality condition, tone onset was relatively more sudden to decrease attack time. Sensory dissonance through addition of inharmonic partials was achieved through emulation of instrument-specific articulation techniques (bowing pressure for strings, embouchure for wind instruments) or commonly used effects (overdrive for organ, distortion for guitar). In the smooth sound quality condition, tone onset was relatively more gradual and gentler to increase attack time. Vibrato articulation was used to introduce frequency modulation.
Method
The studies were conducted on Amazon Mechanical Turk (MTurk), and we aimed to collect 50 participants per cell for a two-cell design for each instrument (Study 2a: e-guitar: N = 101; 45 female, 54 male, 2 preferred not to indicate; Mage = 39.24 years, SD = 10.82; Study 2b: violin: N = 100; 45 female, 54 male, 1 preferred not to indicate; Mage = 40.43 years, SD = 13.08; Study 2c: organ: N = 100; 49 female, 50 male, 1 preferred not to indicate; Mage = 40.48 years, SD = 11.70; Study 2d: saxophone: N = 100; 49 female, 47 male, 4 preferred not to indicate; Mage = 40.51 years, SD = 12.62). In all studies, participants were presented with a short video, in which they saw a unique brand name and logo (2a: Molia; 2b: Zalan; 2c: Instru-Tech; 2d: Velix) accompanied by a unique audio logo melody composition, whose sound quality (rough or smooth) was manipulated between subjects.
Participants then indicated brand perceptions on the dimensions of ruggedness (αs = .88, .86, .84, and .87) and sophistication (αs = .92, .96, .92, and .97) followed by sound quality perceptions (as in Study 1) and were asked to identify the instrument in an open-ended format. Identification was coded at the instrument family level in this and the following studies, in line with prior research (e.g., Giordano and McAdams 2010; see Web Appendix K for details).
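The reliability coefficients (α) reported for the ruggedness and sophistication scales are Cronbach's alpha values. As a minimal sketch of the standard computation, the following Python function takes one list of ratings per scale item. The function name and the ratings are made up for illustration and are not data from these studies.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of items, each a list of per-participant ratings."""
    k = len(item_scores)
    if k < 2:
        raise ValueError("need at least two items")
    n = len(item_scores[0])
    if any(len(item) != n for item in item_scores):
        raise ValueError("all items need the same number of participants")
    # Sum score per participant across the k items.
    totals = [sum(item[i] for item in item_scores) for i in range(n)]
    # Sum of the sample variances of the individual items.
    item_var_sum = sum(variance(item) for item in item_scores)
    return (k / (k - 1)) * (1 - item_var_sum / variance(totals))


# Hypothetical seven-point ratings from five participants on three items.
ratings = [
    [2, 4, 5, 6, 7],
    [3, 4, 5, 5, 7],
    [2, 5, 4, 6, 6],
]
alpha = cronbach_alpha(ratings)
print(round(alpha, 2))
```

Alpha approaches 1 when the items covary strongly relative to their individual variances, which is why values in the high .80s and .90s are read as evidence that the items measure a common dimension.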
Results
Source identification
Participants identified the instrument at different rates as a function of sound quality in Studies 2a and 2d (see Table 2), which is addressed subsequently.
Source Identification in Studies 2a–2d.
Sound quality perception
Across Studies 2a–2d, 2 (sound quality manipulation: rough, smooth) × 2 (sound quality perception: roughness, smoothness) mixed ANOVAs revealed the predicted cross-over interactions (Fs ≥ 35.79, ps < .001,
Brand personality perception
Across Studies 2a–2d, 2 (timbral sound quality: rough, smooth) × 2 (brand personality perception: ruggedness, sophistication) mixed ANOVAs all revealed the predicted cross-over interaction (Fs ≥ 29.55, ps < .001,

Brand Personality Perception (± SE) in Studies 2a–2d.
As identification rates differed significantly in Studies 2a and 2d, we reanalyzed these two studies with source identification included as a factor in the model. However, source identification yielded neither main effects (ps ≥ .145) nor interaction effects (ps ≥ .246).
Discussion
Studies 2a–2d showed that, keeping the musical instrument constant, changes in the timbral sound quality in otherwise identical audio logos systematically affected brand personality judgments across a variety of musical instruments, melodies, fictitious brand names, and logos.
Study 3: Polyphonic Timbre Mix
Instead of using a single instrument (i.e., monophonic orchestration; Study 2), multiple instruments can be used in the orchestration of an audio logo, creating a sound quality that emerges from their combination: a polyphonic “timbre mix” (e.g., Nunes and Ordanini 2014). Study 3 extends the investigation to polyphonic timbres by orchestrating a unique audio logo composition for violin, cello, e-guitar, and trumpet and creating rough and smooth versions. In addition, we assessed brand personality perceptions between participants.
Method
We aimed to collect 50 participants per cell for a four-cell design on MTurk (N = 200; 105 female, 94 male, 1 preferred not to indicate; Mage = 41.59 years, SD = 12.38). Participants were presented with a short video of the fictitious brand JoMa. We manipulated timbral sound quality as well as the assessment of brand personality dimension (ruggedness: α = .86; sophistication: α = .96) between participants. Participants also indicated sound quality perceptions on a single scale (1 = “rather rough,” and 7 = “rather smooth”). Given that multiple instruments were used, participants indicated how many different instruments they heard (in an open-ended question), which did not differ between conditions (Moverall = 2.13, SDoverall = .86; Fs ≤ 2.72, ps ≥ .101,
Results
Sound quality perception
A 2 (sound quality manipulation: rough, smooth) × 2 (brand personality perception: ruggedness, sophistication) between-subjects ANOVA on sound quality perception revealed only a significant main effect of timbral sound quality (F(1, 196) = 48.71, p < .001,
Brand personality perception
The same ANOVA on brand personality perception revealed only a significant interaction (F(1, 196) = 20.66, p < .001,
Conversely, when ruggedness was assessed, ratings were higher in the rough condition (F(1, 196) = 8.31, p = .004,
Discussion
Study 3 showed that the effect of timbral sound quality extends to a fused timbre mix emerging from a polyphonic orchestration and replicated the effect when brand personality perceptions were measured between participants rather than within participants. Studies 2 and 3 demonstrated generalizability of the effect across various instrumentations. However, three limitations remain. First, although source identification did not change the pattern of results, these studies could not show the effect of timbral sound quality in the absence of an identifiable source. Second, while reflective of real-world listening contexts, altering sound quality through musical articulation does not allow for an isolated manipulation of the proposed acoustic underpinnings. Finally, Studies 2 and 3 did not include a reference to which rough and smooth sound quality could be compared. Study 4 aimed to address these shortcomings.
Study 4: Manipulating Sound Quality Through Additive Synthesis
In this study, we created timbres using additive synthesis (i.e., layering of sinusoids) in MATLAB, which uniquely allows for a more precise and controlled manipulation of the spectral and temporal envelope. This has three advantages: (1) we can create timbres that do not correspond to any existing sound source, allowing any effects to be uniquely attributed to timbral sound quality; (2) we can manipulate acoustic correlates of interest directly and in isolation (vs. indirectly through musical articulation, which may also affect other parameters); and (3) we can create a “clean” reference timbre as a control condition.
We created an artificial timbre for a three-tone melody (D5: F0 = 587 Hz; E5: F0 = 659 Hz; A4: F0 = 440 Hz) in MATLAB. For each tone, a spectral envelope was created on the basis of the respective F0 that consisted of only ten harmonic partials (i.e., integer multiples of F0), whereby the amplitude of each harmonic partial (A_n) was an inverse function of its rank, n, according to A_n = k × 1/n (see Caclin et al. 2005). The temporal envelope of each tone consisted of a 150 ms attack part (exponential rise to maximum amplitude), a 1,800 ms sustain part (no change in amplitude), and a 50 ms decay part (exponential return to zero amplitude). This served as the reference timbre. Harmonic partials remained the same across all stimuli, as did decay time and total duration (2,000 ms). The attack times of the rough and smooth timbres were set 100 ms below and above that of the reference timbre (50 ms and 250 ms, respectively), with a corresponding adjustment of sustain time (1,900 ms and 1,700 ms for rough and smooth, respectively). Thus, the attack time of the reference timbre was exactly between those of the rough and smooth timbres. For the rough timbre, we added inharmonic partials above and below each harmonic partial f_n according to γ = .18 as follows: f_n(1 – 3γ); f_n(1 – 2γ); f_n(1 – γ); f_n(1 + γ); f_n(1 + 2γ); f_n(1 + 3γ). The amplitude of these inharmonic partials was set to 40% of that of the respective neighboring harmonic partial n (e.g., Farbood and Price 2017). For the smooth timbre, we added frequency modulation to the harmonic series by manipulating the period length corresponding to the nth harmonic according to T_n = T_0(1 + a_mod sin(2π n f_mod T_0)), where T_0 is the period corresponding to the nominal frequency of the harmonic partial f_n, f_mod is the modulation frequency (5.5 Hz), and a_mod is the modulation amplitude (.000125) (e.g., Fritz et al. 2010).
Keeping harmonic partials constant, only the rough timbre contained inharmonic partials and only the smooth timbre contained frequency modulation.
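To make the stimulus construction concrete, the following is a minimal Python sketch of the additive synthesis described above, not the authors' MATLAB code. Several details are assumptions because the text does not specify them: the steepness of the exponential ramps (`CURVE`), the use of a mirrored ramp for the decay, k = 1, the absence of amplitude normalization, and a continuous-time reading of the period formula in which n × T_0 is treated as elapsed time t, so that the instantaneous frequency is f_n / (1 + a_mod sin(2π f_mod t)). All function and constant names are hypothetical.

```python
import math

GAMMA = 0.18        # spacing of inharmonic partials (rough timbre)
INHARM_GAIN = 0.40  # inharmonic amplitude relative to the neighboring harmonic
F_MOD = 5.5         # modulation frequency in Hz (smooth timbre)
A_MOD = 0.000125    # modulation amplitude (smooth timbre)
ATTACK_MS = {"rough": 50, "reference": 150, "smooth": 250}
DECAY_MS, TOTAL_MS = 50, 2000
CURVE = 5.0         # ASSUMED steepness of the exponential ramps (not in the source)


def partials(f0, timbre, k=1.0, n_harmonics=10):
    """Return (frequency, amplitude) pairs; harmonic n has amplitude k / n."""
    out = []
    for n in range(1, n_harmonics + 1):
        fn, an = n * f0, k / n
        out.append((fn, an))
        if timbre == "rough":
            # Three inharmonic partials below and three above each harmonic.
            for j in (1, 2, 3):
                out.append((fn * (1 - j * GAMMA), INHARM_GAIN * an))
                out.append((fn * (1 + j * GAMMA), INHARM_GAIN * an))
    return out


def envelope(t_ms, timbre):
    """Amplitude in [0, 1]: exponential attack, flat sustain, exponential decay."""
    attack = ATTACK_MS[timbre]
    decay_start = TOTAL_MS - DECAY_MS
    if t_ms < attack:
        x = t_ms / attack
    elif t_ms < decay_start:
        return 1.0
    else:
        x = max(0.0, (TOTAL_MS - t_ms) / DECAY_MS)
    return (math.exp(CURVE * x) - 1) / (math.exp(CURVE) - 1)


def synthesize_tone(f0, timbre, sample_rate=44100):
    """Sum sinusoids sample by sample; returns a list of floats (not normalized)."""
    n_samples = sample_rate * TOTAL_MS // 1000
    # Drop partials at or above the Nyquist frequency to avoid aliasing.
    parts = [(f, a) for f, a in partials(f0, timbre) if f < sample_rate / 2]
    phases = [0.0] * len(parts)
    samples = []
    for i in range(n_samples):
        t = i / sample_rate
        # Smooth timbre only: lengthen/shorten every period by a small vibrato.
        stretch = (1 + A_MOD * math.sin(2 * math.pi * F_MOD * t)) if timbre == "smooth" else 1.0
        value = 0.0
        for p, (f, a) in enumerate(parts):
            value += a * math.sin(phases[p])
            phases[p] += 2 * math.pi * f / (stretch * sample_rate)
        samples.append(envelope(1000 * t, timbre) * value)
    return samples


MELODY_HZ = [587, 659, 440]  # D5, E5, A4 as given in the text
```

Calling `synthesize_tone(f0, "rough")` for each entry of `MELODY_HZ` and concatenating the results would give one version of the melody. Because the three timbres share the same harmonic partials and differ only in attack time plus either the inharmonic partials or the modulation, the sketch mirrors the isolation of acoustic correlates that motivates this study.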
Method
As these stylized timbres provide a less rich signal than musical instruments, we doubled the sample size to 100 participants per condition for a three-cell design on MTurk (N = 301; 151 female, 147 male, 3 preferred not to indicate; Mage = 41.34 years, SD = 11.83). Participants were presented with a short video showing the fictional brand Altino, including the audio logo in one of the three timbre versions. Then, they rated ruggedness (α = .87), sophistication (α = .96), and sound quality as in previous studies. Participants were not asked to identify the sound source, as the synthesized timbres did not correspond to any existing source.
Results
Sound quality perception
A 3 (timbre: reference, rough, smooth) × 2 (sound quality perception: roughness, smoothness) mixed ANOVA revealed no effect of timbre (F(2, 298) = 1.92, p = .149,
Brand personality perception
The same ANOVA on brand personality perceptions revealed effects of timbre (F(2, 298) = 4.54, p = .011,

Brand Personality Perception (± SE) in Study 4.
Discussion
These results demonstrated that the effect of timbral sound quality on brand personality perceptions persists in the absence of an identifiable sound source. In addition, these results provide evidence for the proposed acoustic underpinnings of rough and smooth sound quality, as their direct and isolated manipulation produces the effect in comparison with a reference timbre.
Study 5: Moderation by Incongruent Visual Sensory Information
To the extent that abstract judgments of brand personality draw from embodied meaning elicited by sensory cues, sensory information across different modalities should be recruited jointly, and in a similar way, to inform brand personality perceptions in an integrated fashion. Thus, visual sensory information that is incongruent with sound quality should counteract the effect of sound quality on brand perceptions. Prior research suggests that angular shapes are perceived as hard or rough, whereas round shapes are perceived as soft or smooth (Jiang et al. 2016). Thus, the joint presentation of a visual brand logo whose sensory quality is incongruent with the timbral sound quality of the audio logo (i.e., an angular visual logo alongside a smooth-sounding audio logo, or a round visual logo alongside a rough-sounding audio logo) should attenuate the effect of timbral sound quality on brand personality perceptions. To test this, we created an angular logo version for the fictitious brand Mahk using a block font and a triangle shape, and we created a round logo version for the same brand using a cursive font and an oval shape, as depicted in Figure 4.

Visual Brand Logos in Study 5.
In a pretest, undergraduate students (N = 91) saw one of the two versions and rated how rough and smooth the respective logo looked (on seven-point scales). The angular logo was perceived as more rough than smooth, and the round logo was perceived as more smooth than rough, with all four contrasts being significant (ps < .001; Finteraction(1, 89) = 48.74, p < .001,
Method
We aimed to recruit at least 50 students per cell for a three-cell design but collected as many responses as were available in the laboratory (N = 183; 100 female, 79 male, 4 preferred not to indicate; Mage = 20.21 years, SD = 1.30). Participants were assigned at random to one of three conditions: rough congruent (rough audio–rough visual logo), smooth congruent (smooth audio–smooth visual logo), and incongruent (rough audio–smooth visual logo or smooth audio–rough visual logo). Participants assigned to the incongruent condition were further assigned to one of its two versions, as the proposed cross-modal sensory integration account would suggest that the two aspects of incongruent sensory information should counteract each other. Thus, the two incongruent versions should yield comparable brand personality perceptions, justifying the collapse of these versions into one condition. Participants indicated brand personality perceptions (ruggedness: α = .81; sophistication: α = .93).
Results
Combination of incongruent versions
A 2 (incongruent audio–visual logos version: rough audio–smooth visual, smooth audio–rough visual) × 2 (brand personality perception: ruggedness, sophistication) mixed ANOVA revealed no significant effects (all Fs < 1). Brand personality perceptions (rough audio–smooth visual: ruggedness M = 3.53, SD = 1.43, sophistication M = 3.38, SD = 1.26; smooth audio–rough visual: ruggedness M = 3.23, SD = 1.36, sophistication M = 3.46, SD = 1.53) did not differ within or between incongruent versions, justifying the combination of the two versions into a single audio–visual logo incongruent condition.
Brand personality perception
A 3 (audio–visual logo combination: rough congruent, incongruent, smooth congruent) × 2 (brand personality perception: ruggedness, sophistication) mixed ANOVA revealed no effect of brand personality (F(1, 180) = 2.11, p = .148,

Brand Personality Perception (± SE) in Study 5.
Discussion
The results of Study 5 showed that incongruent visual information can counteract the effect of timbral sound quality on personality perception. This speaks to the proposed embodied link between rough/smooth sensory qualities and rugged/sophisticated personality perceptions and to the proposed cross-modal sensory integration account, according to which sensory information from different modalities is jointly recruited to influence abstract meaning elicitation in an integrated fashion. As audio and visual logos are often presented jointly, these results also suggest a practically relevant boundary condition.
Study 6: Controllable or Nonconscious Process?
The primary goal of Study 6 was to investigate the process by which timbral sound quality in audio logos affects brand personality perceptions. Bargh (1989) proposed that an automatic process is effortless, outside of awareness, unintentional, and uncontrollable. Many processes have one or more of these characteristics (Raghubir 2008; Figure 7.1, p. 146). For example, a nonconscious process is one where consumers are aware of the stimulus but unaware of the extent of its influence (Fitzsimons et al. 2002). Prior research has tested for such a process by discrediting the informativeness of a stimulus (Menon and Raghubir 2003; Yorkston and Menon 2004). If discrediting the informativeness of a stimulus eliminates an effect, that is consistent with a conscious and controlled process (e.g., Lynch, Marmorstein, and Weigold 1988). If the effect persists even though consumers are aware of the stimulus, the process is nonconscious in the sense that consumers are only partially aware of the extent of its influence on their judgments (Menon and Raghubir 2003).
Leveraging this paradigm, we examined the effect of discrediting the informativeness of auditory information for a brand. To make for a conservative test, we used the brand Intel, a well-known brand with competence as a core personality dimension. A pretest on MTurk (N = 50) displaying a short video of the visual Intel logo without sound showed that consumers are familiar with Intel (M = 5.84, SD = 1.45) and perceive it as predominantly competent (α = .89: M = 5.31, SD = 1.26; seven-point scales as in Study 1), versus exciting (α = .85: M = 4.17, SD = 1.47), sincere (α = .78: M = 4.51, SD = 1.16), rugged (α = .90: M = 4.04, SD = 1.55), or sophisticated (α = .89: M = 4.15, SD = 1.43; all ps < .001). Importantly, the pretest also revealed no difference between ruggedness and sophistication perceptions (p = .511).
Method
This study followed a 2 (timbral sound quality: rough, smooth) × 2 (audio information: audio logo, unrelated audio) × 3 (brand personality: competence, ruggedness, sophistication) mixed design with the third factor varying within participants. To be able to detect a three-way interaction, we aimed to collect 100 participants per condition on MTurk (N = 400; 187 female, 208 male, 5 preferred not to indicate; Mage = 42.62 years, SD = 13.40). Participants in the audio logo condition were informed that they were going to be presented with a video including the Intel brand visual logo and audio logo melody, while those in the unrelated audio condition were told that they would see the Intel visual logo and a melody composed by the researchers that was not connected to the brand. In fact, we reversed the Intel audio logo melody in the unrelated audio condition so that participants would not recognize it. While reversing a melody makes it unrecognizable, pitches and their respective duration are kept constant, satisfying the definition of timbre (ANSI 1994). We used e-piano presets in a synthesizer for the rough and smooth versions to create a sound design in the spirit of the original.
As an attention check, participants had to correctly select the information they had received about what the video would include: visual logo only, visual logo and audio logo melody, visual logo and unrelated melody. Participants who failed this check were directed out of the survey. Next, participants indicated their familiarity with the brand Intel, which was high (M = 5.19, SD = 1.57) and did not differ across conditions (all Fs < 1). After seeing the brand video, participants indicated brand personality perceptions of competence (α = .93), ruggedness (α = .85), and sophistication (α = .91). We included competence as the pretest results show that consumers predominantly perceive the familiar brand Intel as competent, an established brand personality perception that should not be influenced by timbral sound quality manipulations. To examine the perceived extent of the influence of auditory information, they then indicated the extent to which (1) their brand knowledge and experience, (2) the brand visual logo and the brand name, and (3) the sound of the brand audio logo melody or unrelated melody were informative for their perception of Intel (seven-point scales). Finally, participants rated sound quality perceptions and were asked to identify the sound source. We expected prior brand knowledge to be perceived as most informative, as consumers viewing a video of a familiar brand should predominantly draw on existing brand knowledge (Campbell and Keller 2003). Importantly, informativeness of the sound of the unrelated melody should be lower than that of the sound of the audio logo melody.
For brand personality perceptions, we expected competence (i.e., core dimension) ratings to be higher than ruggedness and sophistication ratings in all conditions, given participants’ prior brand knowledge as a diagnostic information source for a familiar brand. In addition, we expected our effect to replicate in the audio logo condition: higher ruggedness than sophistication perceptions in the rough sound quality condition, and higher sophistication than ruggedness perceptions in the smooth sound quality condition. The pattern in the unrelated audio condition would speak to process: If consumers use auditory information as a function of its informativeness in a controlled manner, we would expect differences between ruggedness and sophistication as a function of sound quality to be attenuated. However, if consumers use auditory information nonconsciously, irrespective of perceived informativeness, we would expect the pattern to be similar to the audio logo condition.
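The mixed design described in this Method section determines the degrees of freedom of the F tests reported in the Results. As a rough check, the following Python sketch derives them from the sample size and the factor levels. It assumes a single within-subjects factor and no sphericity correction, and the function name is hypothetical.

```python
def mixed_anova_df(n_participants, between_levels, within_levels):
    """Degrees of freedom for a mixed design with one within-subjects factor.

    between_levels: number of levels of each between-subjects factor (a list).
    within_levels: number of levels of the within-subjects factor.
    Returns (error df between, numerator df within, error df within).
    """
    cells = 1
    for levels in between_levels:
        cells *= levels
    error_between = n_participants - cells      # error df for between-subjects effects
    df_within = within_levels - 1               # numerator df of the within main effect
    error_within = df_within * error_between    # error df for within-subjects effects
    return error_between, df_within, error_within


# Study 6: N = 400, 2 x 2 between-subjects cells, three personality dimensions.
print(mixed_anova_df(400, [2, 2], 3))
```

For Study 6 this yields an error term of 396 for between-subjects effects and (2, 792) for the three-level within-subjects factor, which matches the degrees of freedom of the form F(2, 792) reported for the informativeness and brand personality analyses.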
Results
Source identification
A logistic regression with sound quality, audio information, and their interaction as predictors revealed only a marginally significant main effect of sound quality manipulation (b = .61, SE = .34, Wald χ2(1) = 3.22, p = .073; rough: 65.0%; smooth: 79.3%; all other effects, p ≥ .109). Differential source identification is addressed subsequently.
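For readers who want to verify the reported test, the Wald statistic for a single logistic regression coefficient is the squared ratio of the coefficient to its standard error, evaluated against a chi-square distribution with one degree of freedom. The sketch below recomputes it from the rounded values of b and SE given above, so small rounding differences are possible. The function names are hypothetical.

```python
import math


def wald_chi_square(b, se):
    """Wald statistic for one coefficient: (b / SE)^2, with 1 degree of freedom."""
    return (b / se) ** 2


def chi_square_p_df1(x):
    """Upper-tail p-value of a chi-square variable with 1 degree of freedom."""
    # A chi-square(1) variable is a squared standard normal, hence the erfc form.
    return math.erfc(math.sqrt(x / 2))


w = wald_chi_square(0.61, 0.34)  # coefficient and SE as reported in the text
p = chi_square_p_df1(w)
print(f"Wald chi2(1) = {w:.2f}, p = {p:.3f}")
```

With the rounded inputs, the sketch gives a statistic of about 3.22 and a p-value of about .073, in line with the marginal effect reported for the sound quality manipulation.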
Informativeness
A 2 (sound quality manipulation: rough, smooth) × 2 (audio information: audio logo, unrelated audio) × 3 (informativeness: audio information, visual logo, brand knowledge) mixed ANOVA revealed a main effect of informativeness (F(2, 792) = 112.12, p < .001,
Sound quality perception
A 2 (sound quality manipulation: rough, smooth) × 2 (audio information: audio logo, unrelated audio) × 2 (sound quality perception: rough, smooth) mixed ANOVA revealed only main effects of sound quality perception (F(1, 396) = 8.98, p = .003,
Brand personality perception
A 2 (sound quality manipulation: rough, smooth) × 2 (audio information: audio logo, unrelated audio) × 3 (brand personality: competence, ruggedness, sophistication) mixed ANOVA revealed a main effect of brand personality (F(2, 792) = 136.60, p < .001,

Brand Personality Perception (± SE) in Study 6.
Discussion
The results of Study 6 showed that timbral sound quality continues to influence perceptions of ruggedness and sophistication, even when its informativeness is discredited and when consumers have access to prior brand knowledge. Participants found that the unrelated audio (i.e., melody composed by researchers) was relatively and absolutely low in informativeness (below the midpoint: p < .001, d = .87). This suggests that they were unaware of the extent of its influence on their judgments, as it continued to influence their personality judgments in a manner similar to the audio logo melody. This pattern is consistent with a nonconscious process, as accessible auditory information influenced judgments irrespective of its perceived informativeness (Fitzsimons et al. 2002). However, timbral sound quality's influence has limits: a familiar brand, such as Intel, personifying competence (see pretest and Study 1B in Web Appendix J), continues to be perceived as more competent than rugged or sophisticated.
Study 7: The Effect of Timbre on Purchase Intentions
The goal of Study 7 was to test downstream consequences of the effect of timbral sound quality in brand audio logos for purchase intentions. Specifically, we predicted that if a brand's product offering fits with a personality dimension activated through timbral sound quality in the audio logo and transferred to the brand, intentions to purchase should be increased.
Method
We aimed to collect 50 responses per cell for a two-cell design on MTurk (N = 100; 49 female, 51 male; Mage = 39.58 years, SD = 11.63). Participants were presented with the following scenario: Imagine that the birthday of two friends of yours is coming up and they are both into board games.
One friend loves action games; that is, games that involve fast reactions to surprising turns of events and physical activity.
The other friend loves educational games; that is, games that involve careful deliberation and analytical problem solving.
Participants who failed a reading comprehension check confirming that their friends were into board games (out of ten options) were directed out of the survey before condition assignment. Participants then saw a short video introducing a fictional company, Cosmos Board Games, with a visual logo and a unique audio logo containing either smooth e-guitar or rough e-guitar. They indicated how likely they would be to buy a board game for the friend who loves action games and for the friend who loves educational games (counterbalanced) on seven-point scales. Then, they indicated brand personality perceptions (ruggedness: α = .92; sophistication: α = .91) and sound quality perceptions and were asked to identify the instrument. Since an action game is more consistent with a rugged brand personality and an educational game is more consistent with a sophisticated brand personality, we predicted an interaction, such that purchase intentions should be higher for the game that is consistent (vs. inconsistent) with the personality dimension triggered by timbral sound quality.
Results
Sound quality perception
A 2 (sound quality manipulation: rough, smooth) × 2 (sound quality perception: rough, smooth) mixed ANOVA revealed only a significant interaction (F(1, 98) = 39.81, p < .001,
Source identification
Similar to Study 2a, participants’ identification rate differed (rough e-guitar: 82.4%; smooth e-guitar: 10.2%; χ2(1) = 52.22, p < .001; addressed subsequently).
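The identification rates can be related to the reported chi-square value with a short calculation. The cell counts are not given in the text, so the counts below are an assumption: they are the ones that reproduce the reported percentages with N = 100 (42 of 51 in the rough condition, 5 of 49 in the smooth condition). The function name is hypothetical, and the sketch uses the uncorrected Pearson statistic.

```python
def pearson_chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# ASSUMED counts (not reported in the text): identified vs. not identified.
rough_identified, rough_missed = 42, 9      # 42 / 51 = 82.4%
smooth_identified, smooth_missed = 5, 44    # 5 / 49 = 10.2%

chi2 = pearson_chi_square_2x2(rough_identified, rough_missed,
                              smooth_identified, smooth_missed)
print(f"chi2(1) = {chi2:.2f}")
```

Under these assumed counts, the statistic comes out at about 52.22, which agrees with the value reported above.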
Brand personality perception
A 2 (sound quality manipulation: rough, smooth) × 2 (brand personality: ruggedness, sophistication) mixed ANOVA revealed only a significant interaction (F(1, 98) = 42.79, p < .001,
Purchase intentions
A 2 (sound quality manipulation: rough, smooth) × 2 (purchase intentions: action, educational game) mixed ANOVA revealed only a significant interaction (F(1, 98) = 50.61, p < .001,
General Discussion
We investigated the effect of timbral sound quality in audio logos on brand personality perceptions. Study 1 showed that changing instrumentation and, thus, timbre systematically changes brand personality perceptions of even a well-known brand, Coca-Cola. Changing timbral sound quality systematically altered brand personality perceptions across a variety of instruments or their combination, even when sound source was held constant (Studies 2 and 3). Study 4 provided evidence for the proposed acoustic underpinnings of sound quality perceptions and replicated the effect in the absence of an identifiable sound source. Visual sensory information that is incongruent with timbral sound quality can counteract the effect of sound quality on personality perception (Study 5). Study 6 showed that meaning elicited by timbral sound quality affects brand personality perceptions even when the informativeness of auditory input is discredited, suggesting a nonconscious process. Finally, Study 7 showed downstream consequences for purchase intentions. In the following, we discuss managerial implications, theoretical contributions, and opportunities for future research based on these findings.
Managerial Implications
Judgments of brand personality are important as they can affect consumer preference, attitudes, and purchase intentions (Batra and Homer 2004; Biel 1993), and our results suggest timbral sound quality as an effective means to inform consumers’ brand personality perceptions. In particular, the strategic use of timbral sound quality in audio logos is unobtrusive and can affect consumers’ brand personality judgments even if the audio information is not deemed informative. As consumers self-generate brand personality perceptions based on timbre with limited awareness, it is plausible that the timbral quality of audio logos may be as effective as or even more effective than the content of verbal claims (see also Krishna 2012), which could be perceived as persuasion attempts. Strategic audio logo design not only is a powerful means of expressing a desired brand personality but is increasingly important in light of companies’ growing investments in audio branding efforts (e.g., amp 2022; Armstrong 2019). We provide marketers with insights into the capabilities of sound quality to communicate brand personality, and for composers tasked with creating audio branding elements, we provide insights into the acoustic underpinnings of timbral sound quality. These insights are relevant not only for new brand introductions but also for rebranding exercises, as brand personality judgments can be updated with new encounters (Wentzel 2009). Although some brands (e.g., Coca-Cola) change timbres in their audio logos to fit specific occasions, this may have unintended consequences for brand personality perceptions. Further, the design of audio and visual logos ought to be considered jointly, as our results suggest that consumers draw from available sensory information across modalities and integrate such information in the formation of personality perceptions.
Theoretical Contributions
This research contributes to multiple streams of literature. By examining the effect of timbral sound quality on brand personality perceptions, we add to the literature on timbre in acoustics by answering calls for research on the capabilities of timbre to elicit meaning (Siedenburg et al. 2019). In addition, this research extends the notion of embodied musical meaning (Zhu and Meyers-Levy 2005). That is, our results suggest that a single parameter (i.e., timbral sound quality) can trigger abstract semantic associations, and thus the embodied route to meaning elicitation is not limited to a mere hedonic apprehension.
Further, this research answers the call for more research into the auditory sensory modality in marketing (Krishna 2012). We examine the auditory modality from an acoustics angle that has so far been little explored by providing insights into acoustic correlates of sound quality perceptions and by introducing additive synthesis as a promising tool for sensory marketing research concerning the auditory modality. In addition, we show that embodied meaning elicited from sensory qualities is formed by integrating sensory information across modalities. Finally, our results suggest that consumers are unaware of the extent to which sound quality influences their judgments, speaking to the literature on nonconscious processes.
Directions for Future Research
With the rapid adoption of voice technologies creating novel auditory brand–consumer touchpoints, audio branding increases in importance and provides fertile ground for research in marketing (see also Krishnan and Kellaris 2021). In the following, we provide some directions for future research specifically addressing the exploration of timbre in marketing research.
Although the present research demonstrates that timbre, specifically timbral sound quality, can be a powerful design parameter in audio branding by virtue of its capability to elicit abstract meaning in terms of brand personality, the empirical investigation focused on a limited set of sound quality dimensions (i.e., rough and smooth). However, the sound quality aspect of timbre provides rich opportunities for research due to its multidimensionality. That is, future research may explore how other dimensions of timbral sound quality (e.g., luminance) may activate meaning and investigate important underlying acoustic correlates. For instance, could a bright sound quality potentially trigger associations with competence or superior performance? Such investigations do not have to be confined to the context of musical branding elements, such as audio logos, but could extend to timbral sound qualities of brand spokespersons’ voices, synthesized AI agents’ voices, or product sound design.
In addition, although we investigated how timbral sound quality interacts with visual sensory qualities across modalities, we did not examine intramodal interactions with other auditory parameters. Future research could explore how timbral sound quality could be combined with other auditory parameters (e.g., loudness, which was held constant in the present research) in a way that reinforces the elicited abstract meaning. For instance, could increased loudness of an audio logo with rough timbral sound quality exacerbate brand personality perceptions of ruggedness, or could decreased loudness with a logo of smooth sound quality do the same for perceptions of sophistication?
Finally, although we focused on embodied meaning triggered by timbral sound quality, timbre may also activate referential meaning by signifying a sound source. Future research could disentangle the relative contribution of these two aspects of timbre to meaning elicitation and investigate whether meaning elicitation through sound quality and meaning elicitation through sound source identification are qualitatively different processes. That is, although consumers appear to generate embodied meaning elicited by timbral sound quality and draw on it for the formation of brand personality perceptions in a nonconscious manner, the referential meaning elicitation process through activation of associations held with a sound source that needs to be identified may constitute a more effortful and controlled process.
Conclusion
In summary, timbre is an ecologically important, multidimensional, information-rich auditory parameter that can be meaningfully leveraged in branding and provides ample opportunities for future investigations in marketing.
Supplemental Material
Supplemental material, sj-pdf-1-mrj-10.1177_00222437221135188 for The Sound of Music: The Effect of Timbral Sound Quality in Audio Logos on Brand Personality Perception by Johann Melzner and Priya Raghubir in Journal of Marketing Research
Footnotes
Acknowledgments
The authors would like to thank Maximilian Melzner for his help with digital signal processing and analysis, the composer and music editor Nicholas Kmet for his advice and help with stimulus creation, and the JMR review team for their thoughtful comments. The authors would also like to thank Rita Aiello, Andrea Bonezzi, the Trope lab, the NYU Stern Marketing Department, the Northwestern Kellogg Marketing Department, the LMU Munich Economic and Organizational Psychology Department, and the audience at the New Beginnings conference (2021) for their helpful comments on a previous version of this article.
Associate Editor
Eileen Fischer
Author Note
This research is part of the first author’s dissertation.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
