Abstract
Previous research found that speakers with more attractive voices receive more favorable evaluations (aka the vocal attractiveness stereotype). But sexual selection theory predicts that, to the extent that men perceive women with higher pitched voices as more attractive, women will be more hostile toward those women because they make more threatening mate rivals. Supporting this hypothesis, Study 1 (N = 102) showed that female participants higher in trait dominance displayed heightened aggressive cognition after being primed with a romantic (but not a control) feeling and listening to a higher- but not lower-than-average female voice. Study 2 (N = 111) showed that this heightened aggressive cognition was activated by a long-term but not a short-term mating motive. These findings supported sexual selection theory, challenged the vocal attractiveness stereotype, and suggested a mechanism that helps maintain the honesty of female voice pitch as a mate attraction signal.
In social psychological research, “beauty is good” refers to the tendency of attributing more positive traits (e.g., being more competent and well-adjusted) to more physically attractive individuals (Langlois et al., 2000). Voice evaluation appears to follow the same pattern. Zuckerman and colleagues (Zuckerman & Driver, 1989; Zuckerman, Hodgins, & Miyake, 1990) found that, compared with men and women whose voices were perceived as less attractive, those whose voices were perceived as more attractive were rated more positively, such as being more honest, likeable, and warmer. But, does this “vocal attractiveness stereotype” (Zuckerman & Driver, 1989, p. 67) always hold?
Recent studies on how men and women differentially evaluate women with different levels of voice pitch suggest that it does not. Pitch describes how high or low a voice sounds and is the perceptual correlate of fundamental frequency (F0), which indexes the vibration rates of vocal folds during vocalization (Titze, 2000). Individuals’ “natural” mean pitch is thus determined by the rates at which their vocal folds normally vibrate. Drawing on sexual selection theory (Darwin, 1871), research has shown that men perceive women with naturally higher (but not exceedingly high; e.g., Borkowska & Pawlowski, 2011) voices or experimentally raised voices (e.g., Jones, Feinberg, DeBruine, Little, & Vukovic, 2010) as more attractive. This is likely because estrogen, a hormone central to women’s fertility, prevents the thickening of women’s vocal folds at puberty (Abitbol, Abitbol, & Abitbol, 1999), making pitch a reliable indicator of women’s reproductive capacity (Atkinson et al., 2012). Female listeners, however, associate women with higher voices with such negative attributes as being flirtatious (Puts, Barndt, Welling, Dawood, & Burriss, 2011) and unfaithful (O’Connor, Re, & Feinberg, 2011). These findings are interpreted as women being hostile to other attractive women as threatening rivals in intrasexual competition (i.e., the contest between same-sex members for mates).
This work thus challenged the vocal attractiveness stereotype—women assign negative rather than positive traits to voices that men find attractive. However, those studies focused exclusively on listeners’ overt judgments (of, e.g., whether certain traits apply to a speaker) and overlooked associated cognition. If young adult women’s hostility toward other attractive women is an adaptive response in intrasexual competition, principles of evolutionary psychology (Andrews, Gangestad, & Matthews, 2002) predict the existence of concomitant psychological mechanisms designed to achieve the same goal, that is, to repel same-sex rivals from access to potential mates. The present research thus examines American young adult women’s aggressive cognitive response to a female speaker with a higher- versus lower-than-average voice beyond their subjective evaluation of the speaker.
This research focuses on women’s voice because previous evolutionarily motivated studies that had demonstrated listener hostility used women’s voice (O’Connor et al., 2011; Puts et al., 2011; see also, O’Connor & Feinberg, 2012; Shoup-Knox & Pipitone, 2015). Using the same type of stimuli will render the current investigation empirically better grounded and more comparable with previous research. Meanwhile, despite the many different ways of operationalizing vocal attractiveness (Babel, McGuire, & King, 2014), mean pitch is chosen because its effect on vocal-attractiveness perceptions is well established (but see Babel et al., 2014) and methods are readily available for clean manipulations. Pitch is thus an ideal acoustic feature for a first test of women’s aggressive cognition in response to a higher- versus lower-than-average female voice.
Sexual Selection, Paternal Care, and Female Intrasexual Competition
Parental investment theory (Trivers, 1972) predicts that the sex that invests more in reproduction will generally be the more valuable sex that opposite-sex members compete over for reproductive success. In most mammalian species, females’ investment is much heavier than males’, and males thus compete more intensely among each other to physically repel rivals or outperform them on features that females find attractive (Andersson, 1994). In those species, females compete less but choose their partners more discriminately than males do due to the substantial cost associated with reproduction for females (Andersson, 1994).
While women share this choosiness, they also compete to attract men. The evolutionary cause of this enhanced female competition among humans is that human altriciality (i.e., being born underdeveloped) increases the importance of paternal care (Puts, 2010). This makes men more valuable and causes them to be more selective in choosing women as partners. In general, men prefer women who are more fertile (Buss, 1989), and women’s fertility is phenotypically expressed as traits that men find “attractive.” Examples include high-pitched voices and feminine faces (Feinberg, 2008), and a small waist-to-hip ratio (Singh, Dixson, Jessop, Morgan, & Dixson, 2010). Thus, women will display and even exaggerate those traits to advertise their physical attractiveness when competing (Hill & Durante, 2011).
To compete, women also derogate rivals’ physical appearance, exhibit unfriendly behaviors toward women dressed provocatively, and socially exclude them (see for a review, Vaillancourt, 2013). These acts of indirect aggression are believed to enhance the aggressors’ self-esteem and tarnish rivals’ reputation. Women also occasionally engage in physical aggression to compete for highly desirable men (Schuster, 1983).
Women’s Hostility Toward Attractive Female Voices
Higher female voices can elicit hostility from female listeners because (as previously mentioned) pitch reliably tracks women’s fertility (Atkinson et al., 2012). Indeed, women with naturally higher voices tend to have more feminine faces (Feinberg, 2008), slimmer and more symmetrical bodies (Wheatley et al., 2014), and a smaller waist-to-hip ratio (Hughes, Dispenza, & Gallup, 2004). There is also evidence that women raise their pitch (relative to their natural mean pitch) when addressing men whom they find attractive (Leongómez et al., 2014; but see Hughes, Farley, & Rhodes, 2010). Finally, women’s pitch is detectably higher (Bryant & Haselton, 2009), and their voices are perceived as more attractive (Pipitone & Gallup, 2008), when they are near ovulation than when they are not. Thus, given men’s preference for all those estrogen-mediated traits, women with higher voices make more threatening mate rivals.
Consequently, women are more likely to attribute negative traits (e.g., being flirtatious and unfaithful) to other women with higher voices (O’Connor et al., 2011; Puts et al., 2011) and be jealous when imagining those women flirting with their (the perceivers’) romantic partner (O’Connor & Feinberg, 2012). Activated by the perceived or actual presence of a mate poacher, jealousy is believed to facilitate mate retention (Buss, 2000), and has been found to cause aggression (Archer & Webb, 2006).
In addition to self-report findings, a recent study by Shoup-Knox and Pipitone (2015) found that female participants who listened to voices recorded at the fertile phase of a speaker’s menstrual cycle showed increases in heart rates and Galvanic skin responses compared with those who listened to voices recorded at the speaker’s nonfertile phase. While it is possible that heightened physiological arousal enables more effective tracking of rivals (Shoup-Knox & Pipitone, 2015), high electrodermal and heart rate reactivities also predict aggression (Lorber, 2004). Collectively, these findings suggest that higher female voices will be more likely than lower female voices to cause aggressive responses from female listeners.
Overview of the Present Research
With two studies, the present research tests whether heterosexual, American college-aged women will be more cognitively aggressive in the context of intrasexual competition after listening to a female voice that is perceived to be higher or lower than the voice of an average American female college student (see Method). Participants’ aggressive cognition will be measured as their accessibility of aggression concepts (i.e., how much they think about aggression). Concept accessibility is believed to promote adaptive responses to the stimuli that activate those concepts in the first place (e.g., Miller & Maner, 2011). If women in intrasexual competition aim to deter threatening rivals from courting the same man through indirect or physical aggression, women who are more motivated to compete are expected to have higher accessibility of aggression concepts after listening to a higher-than-average female voice than after listening to a lower-than-average female voice.
The present research also examines female listeners’ subjective evaluation of the speaker. To the extent that women derogate other women’s physical attractiveness in intrasexual competition (Vaillancourt, 2013), female listeners in a competitive mind-set should also be more likely to negatively evaluate the attractiveness of a woman having a higher-than-average voice than a woman having a lower-than-average voice (after all, the former makes a more threatening rival in competition). Furthermore, because social exclusion is a major tactic of indirect aggression that characterizes female intrasexual competition (Vaillancourt, 2013), female listeners who are cued to compete are also expected to perceive a woman with a higher-than-average voice as less friendly and warm—perceptions that are likely to facilitate actual exclusion.
Both types of responses—aggressive cognition and speaker evaluation—are examined in the two studies of this research. In Study 1, female participants will be primed with a romantic or control feeling. In Study 2, female participants will be primed with a long-term or a short-term mating motive. As elaborated below, priming these motives is to simulate different levels of intrasexual competition when the motives are combined with different levels of female voice pitch. Last, the effect of participants’ trait dominance, an index of their baseline competitiveness, will also be considered in both studies.
Study 1
To simulate intrasexual competition, female participants in this study will first be primed with a romantic or control feeling and then listen to a higher- or lower-than-average female voice. Hearing a higher-than-average female voice when romantically aroused should suggest to participants the presence of an attractive woman threatening their romantic relationship. This condition should thus simulate more intense intrasexual competition than other combinations of speaker pitch and prime type. However, how participants respond to different levels of intrasexual competition will also depend on their trait dominance. In general, more dominant individuals are more competitive and respond to competition more aggressively (e.g., Anderson & Berdahl, 2002). Given these considerations, it is predicted that female participants higher in trait dominance will display higher accessibility of aggression concepts (Prediction 1) and evaluate the speaker as less attractive and sociable (Predictions 2a-2b) after being primed with a romantic (but not a control) feeling and listening to a higher- (but not lower-) than-average female voice. It is also tested whether aggressive cognition tracks speaker evaluation.
Method
Subjects and Design
A hundred and two heterosexual, U.S. female undergraduate students (mean age = 19.0, SD = 1.11) participated in exchange for course credit. 1 They were randomly assigned to the conditions of a 2 (speaker’s pitch level: high/low) × 2 (prime type: romantic feeling/control feeling) between-subjects factorial design with their trait dominance measured as a continuous moderator. The dependent variables were (a) participants’ accessibility of aggression concepts and (b) their evaluation of the speaker’s attractiveness and sociability.
Stimulus Recordings
A 20-year-old Caucasian female with a (local) Californian accent read the following passage into a Logitech microphone headset H390 attached to a Lenovo laptop computer: “Hi, my name is Emily and I am 19 years old. I am a pre-Comm major at UCSB. I love sunshine and outdoor activities. I party on weekends but I also study hard for my midterms and finals.” This passage portrayed “Emily” as a typical undergraduate student at the university where data were collected—fond of partying but also valuing academic achievement. “Emily” was described as a pre-Comm major because all participants were taking at least one communication class at the time of the study.
The recording was made with Audacity software in mono at 44.1 kHz with 16-bit amplitude quantization. The speaker was instructed to speak as if she were introducing herself in an interview for a student organization. To fulfill the cover story that participants would listen to segments from an interview (see below), a 3-second silence was inserted between successive sentences in the passage (e.g., “Hi, my name is Emily and I am 19 years old [3-second silence] I am a pre-Comm major at UCSB [3-second silence] . . . ”). This newly created recording lasted for 20.8 seconds and was used for F0 manipulation.
The F0 of the original recording was estimated by Praat© version 5.1.05, initially with pitch floor set at 50 Hz and ceiling at 500 Hz to search pitch within the broadest possible range of human female voices. But to avoid apparent mistracking of pitch by Praat (see Figure 1 for an example), the natural range of the speaker’s F0 was then determined by hand, including only the portions of the recording that contain recognizable utterances. In this recording, the speaker’s pitch ranged from 105 Hz to 274 Hz, and had a mean F0 of 189 Hz (SD = 29). This mean F0 was then raised and lowered by 20 Hz using the pitch-synchronous overlap add (PSOLA) algorithm of Praat, which allows for an independent manipulation of pitch from other acoustic features (e.g., formant frequencies). The newly synthesized, high-F0 recording had a mean F0 of 209 Hz (SD = 29, ranging from 126 Hz to 293 Hz), and the low-F0 recording had a mean F0 of 169 Hz (SD = 26, ranging from 81 Hz to 252 Hz). The F0 range and mean F0 of the synthesized recordings were estimated using the procedure described above.

Figure 1. A broadband spectrogram overlaid with voice-pitch traces (dark speckles) for the segment “Hi, my name is Emily and I’m 19 years old” in the original recording.
The total amount of manipulation is approximately 0.8 equivalent rectangular bandwidths and within the range of pitch manipulation used in previous research (0.5 to 1 equivalent rectangular bandwidth; O’Connor et al., 2014; Puts et al., 2011). The mean F0s of the manipulated voices (209 Hz and 169 Hz) are within two standard deviations of the average mean female pitch provided by Puts, Apicella, and Cárdenas (2012); participants should not find them unusually high or low. Furthermore, the two recordings have highly comparable formant frequencies (formant position of high-F0 recording = 1.1 and of low-F0 recording = 1.0; Puts et al., 2012). The author deemed the two recordings sound highly comparable other than their mean F0s based on those objective criteria and subjective assessment.
Procedure and Measures
Participants first received a prime of romantic or control feelings (cf. Griskevicius et al., 2009). In the romantic-prime condition, participants read a passage describing a romantic encounter between them and an attractive man. They were instructed to imagine that they met this man during vacation on an island. After a few awkward moments, they became more and more comfortable with each other, and decided to have dinner together after a pleasant stroll on the beach. The story ended with participants feeling excited kissing the man.
In the control condition, participants read a passage describing a nonromantic but equally arousing scenario. They were instructed to imagine that they were about to run some errands after a stressful day but could not find their keys. They looked everywhere and became more and more frustrated as time went by. Just as they were about to give up, they spotted the keys on the counter which they had searched a while ago. The story ended with participants feeling relieved and elated for finding the keys. Griskevicius et al. (2009) showed that the romantic prime successfully induced romantic feelings whereas the control prime induced similar levels of positive and negative feelings (e.g., enthusiasm and frustration).
After the prime, participants listened to the high-F0 or the low-F0 recording. Participants were told that the recording contained segments of an interview in which a student named “Emily” introduced herself. They were told to listen to the recording carefully as they would be asked a few questions about the speaker later. However, before the evaluation task, all participants were instructed to complete a word fragment completion task introduced as a surprise, filler task. In fact, this task provided the measure of the key dependent variable of the study, namely participants’ aggressive cognition.
The task consisted of nine word fragments: atta_ _, ins_ _t, _ate, m_d, _ill, ang_ _, _age, h_t, and _ush (adapted from Kross, Ayduk, & Mischel, 2005), presented on a single page—one fragment per line—in randomized orders. Participants were instructed to provide missing letters to turn the fragments into meaningful English words. Each fragment could be completed to form at least one aggression word (e.g., attack) or one neutral word (e.g., attach), and the number of aggression words generated indexed participants’ accessibility of aggression concepts. Higher values indicate heightened aggressive cognition (i.e., thinking about aggression more).
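The scoring rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text names "attack" as an aggression completion and "attach" as a neutral one, whereas the other target words in the key below, the function names, and the example responses are hypothetical and are not taken from the study materials.

```python
# Hypothetical scoring key. Only "attack" is named in the text as an
# aggression completion; every other target word here is an assumption.
AGGRESSION_KEY = {
    "atta__": {"attack"},
    "ins__t": {"insult"},
    "_ate": {"hate"},
    "m_d": {"mad"},
    "_ill": {"kill"},
    "ang__": {"anger", "angry"},
    "_age": {"rage"},
    "h_t": {"hit"},
    "_ush": {"push"},
}


def fits_fragment(fragment, word):
    """Return True if `word` fills every blank of `fragment` with a letter."""
    if len(fragment) != len(word):
        return False
    return all(w.isalpha() if f == "_" else f == w
               for f, w in zip(fragment, word))


def count_aggression_words(responses):
    """Count completions that fit their fragment and appear in the key."""
    score = 0
    for fragment, word in responses.items():
        word = word.strip().lower()
        if fits_fragment(fragment, word) and word in AGGRESSION_KEY.get(fragment, set()):
            score += 1
    return score


# Made-up responses from one participant (not study data).
example = {
    "atta__": "attack", "ins__t": "insect", "_ate": "late",
    "m_d": "mad", "_ill": "will", "ang__": "angel",
    "_age": "page", "h_t": "hat", "_ush": "push",
}
aggression_score = count_aggression_words(example)
```

With these made-up responses, three completions (attack, mad, push) count toward the score; the remaining six are valid but neutral words.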
After that, participants indicated with separate items how (a) friendly, (b) attractive, (c) warm, and (d) pretty the speaker was (in that order; 1 = not at all, 7 = very much). Perceived attractiveness and prettiness formed a reliable index of perceived attractiveness (Cronbach’s α = .92) and perceived friendliness and warmth formed a reliable index of perceived sociability (α = .79). 2 Participants then completed a measure of trait dominance by indicating how well the adjectives dominant, assertive, forceful, and aggressive described them (1 = not well at all, 7 = very well; α = .84; Anderson & Berdahl, 2002). Last, participants were debriefed and thanked for their participation; their relational status was not assessed. 3
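For readers less familiar with the reliability index reported above, Cronbach's α can be computed from item variances and the variance of the summed scale. The sketch below uses the standard formula; the function name and the ratings are hypothetical and do not reproduce the study's data or its reported coefficients.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of k items.

    items: list of k lists, each holding one rating per participant,
    with participants in the same order across items.
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # per-participant sum
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical 7-point ratings from six listeners (not study data).
attractive = [5, 4, 6, 3, 5, 2]
pretty = [5, 4, 5, 3, 6, 2]
alpha_attractiveness = cronbach_alpha([attractive, pretty])
```

With only two items per index, as here, α depends entirely on how strongly the two ratings covary, which is why the two-item indices are reported alongside their α values.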
Results
Descriptive Statistics and Manipulation Checks
Descriptive statistics are presented in Table 1. The number of aggression words in Study 1 had a slight positive skew (skewness = 0.47, SE = .24), which was corrected with a square root transformation (skewness = −0.09, SE = .24). Results were identical whether the original or the transformed variable was used, so the original variable is reported for ease of interpretation.
Table 1. Descriptive Statistics of and Intercorrelations Among the Continuous Variables in Study 1.
Note. p < .001.
To verify that the higher-than-average and lower-than-average voices were indeed perceived to be higher and lower than the voice of an average college-aged woman, a different sample of 45 U.S. heterosexual female students (mean age = 20.2, SD = 3.05) listened to either of the two recordings. They were asked to judge how much lower or higher the speaker’s voice was compared with the voice of an average American female undergraduate student (1 = lower by a lot, 6 = the same, 11 = higher by a lot). The perception scores were centered on the midpoint of the scale (=6) and predicted in an ordinary least squares regression from the dummy-coded variable “F0-condition” (0 = high-F0 recording). The regression slope tested the between-subjects difference in pitch-level perception, and the intercept tested whether the average perception score in the high-F0 condition was higher than six. By reversing the dummy code, the intercept tested whether the average perception score in the low-F0 condition was lower than six.
As expected, participants in the high-F0 condition (n = 22) perceived the speaker’s voice as significantly higher (M = 6.41, SD = 0.96) than those in the low-F0 condition (M = 5.39, SD = 0.78), β = −.51, t(43) = −3.91, p < .001. Critically, the average perception score in the high-F0 condition was significantly higher than six, b = 0.41, t = 2.20, p = .033 (Figure 2, upper panel), and that in the low-F0 condition was significantly lower than six, b = −0.61, t = −3.34, p = .002 (Figure 2, lower panel). These results validated the F0 manipulation. 4
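The logic of the manipulation check (centering on the scale midpoint, dummy coding the condition, then reversing the code) can be illustrated with a small sketch. The ratings below are hypothetical, and `simple_ols` is a helper written for this illustration, not a function from any statistics package used in the study.

```python
from statistics import mean


def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


MIDPOINT = 6  # "the same" on the 1-11 scale

# Hypothetical ratings, not the study's data.
high_f0 = [7, 6, 7, 6]
low_f0 = [5, 6, 5, 5]

centered = [r - MIDPOINT for r in high_f0 + low_f0]  # center on the midpoint
dummy = [0] * len(high_f0) + [1] * len(low_f0)       # 0 = high-F0 recording

# Intercept = mean centered rating of the group coded 0 (high F0);
# slope = low-F0 mean minus high-F0 mean.
intercept_high, slope = simple_ols(dummy, centered)

# Reversing the dummy code makes the low-F0 group the reference group,
# so the intercept now describes the low-F0 condition.
reversed_dummy = [1 - d for d in dummy]
intercept_low, reversed_slope = simple_ols(reversed_dummy, centered)
```

Because the outcome is centered on 6, a significance test of each intercept against zero is a test of whether that group's mean rating differs from "the same as an average student."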

Figure 2. Frequency distributions of female participants’ (N = 45) perceptions of how much lower or higher the higher-than-average voice (upper panel) and the lower-than-average voice (lower panel) were relative to the voice of an average American female college student.
Accessibility of Aggressive Concepts
Prediction 1 stated that trait dominance would positively predict the number of aggression words generated in the high-pitch/romantic-prime condition but not in the other conditions. 5 To test this prediction, moderated multiple regressions were run to predict the number of aggression words from trait dominance (mean-centered), prime type (dummy-coded), speaker pitch (dummy-coded), and their interaction terms. As predicted, the three-way interaction between speaker pitch, prime type, and trait dominance was significant, t(94) = 2.25, p = .027, 95% CI [0.12, 1.89] (Figure 3).

Figure 3. The three-way interaction between female participants’ trait dominance, prime type, and the speaker’s pitch level on participants’ cognitive accessibility of aggressive concepts, indexed by the number of aggression words generated (Study 1).
Confirming Prediction 1, trait dominance positively predicted the number of aggression words generated by participants who listened to the high-F0 recording under the romantic prime, β = .67, t(94) = 3.55, p = .001, 95% CI [0.34, 1.21], but not in the other conditions, ts < 1.
The effect of trait dominance for those who listened to the high-F0 recording (β = .67) was then compared with the effect of trait dominance for those who listened to the low-F0 recording (β = −.08) in the romantic-prime condition (coded as 0). The trait dominance × speaker pitch interaction under the significant three-way interaction tested the difference between the two slopes. Analyses confirmed that the difference was significant, β = −.48, t = −2.79, p = .006, 95% CI [−1.48, −0.25]. Similarly, the interaction between trait dominance and prime type compared the effects of trait dominance for participants receiving the two primes (βromantic-prime = .67 vs. βcontrol-prime = −.06) in the high-pitch condition (coded as 0). The difference was also significant, β = .52, t = 2.90, p = .005, 95% CI [−1.42, −0.27]. Prediction 1 was supported.
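The way simple slopes follow from a three-way model with dummy-coded factors can be shown with a short worked sketch. The coefficients below are hypothetical unstandardized values chosen for illustration, and the coding (pitch: 0 = high, 1 = low; prime: 0 = romantic, 1 = control) is an assumption made for the example; neither is taken from the fitted model.

```python
def dominance_slope(b, pitch, prime):
    """Slope of trait dominance in one cell of the 2 x 2 design.

    b: regression coefficients for every term that involves dominance.
    pitch, prime: dummy codes (0 or 1) identifying the cell.
    """
    return (b["dom"]
            + b["dom_x_pitch"] * pitch
            + b["dom_x_prime"] * prime
            + b["dom_x_pitch_x_prime"] * pitch * prime)


# Hypothetical coefficients, for illustration only.
# Assumed coding: pitch 0 = high, 1 = low; prime 0 = romantic, 1 = control.
coefs = {
    "dom": 0.8,                  # slope in the cell coded (0, 0)
    "dom_x_pitch": -0.9,         # low minus high slope, romantic prime
    "dom_x_prime": -0.9,         # control minus romantic slope, high pitch
    "dom_x_pitch_x_prime": 1.0,  # how the pitch difference changes by prime
}

slopes = {(pitch, prime): dominance_slope(coefs, pitch, prime)
          for pitch in (0, 1) for prime in (0, 1)}
```

Under this coding, the dominance coefficient is the simple slope in the cell coded (0, 0), and each two-way term involving dominance is the difference between two simple slopes at the reference level of the remaining factor, which is the comparison logic used in the paragraph above.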
Speaker Evaluations
Prediction 2 stated that trait dominance would negatively predict the speaker’s perceived attractiveness and sociability in the high-pitch/romantic-prime condition but not in the other conditions. A multivariate analysis of variance with attractiveness and sociability ratings as dependent measures, speaker pitch and prime type as fixed factors, and trait dominance as a covariate revealed no significant effects, Fs < 1. Prediction 2 was not supported.
Did aggressive cognition predict speaker evaluations? To address this question, a first moderated multiple regression was run to predict the speaker’s perceived attractiveness from the number of aggression words (mean-centered), prime type (dummy-coded), speaker pitch (dummy-coded), and their interaction terms. Because the speaker’s perceived attractiveness significantly correlated with her perceived sociability (see Table 1), perceived sociability was entered as a covariate. Analyses revealed a significant simple slope of aggressive cognition, β = −.26, t(92) = −2.12, p = .037, 95% CI [−0.35, −0.01]; for perceived sociability, β = .66, t = 8.76, p < .001, 95% CI [0.49, 0.77]. The variance inflation factor of perceived sociability was 1.02, suggesting that collinearity was not a concern. Repeating the analysis using the speaker’s perceived attractiveness as a covariate provided no evidence that aggressive cognition predicted the speaker’s perceived sociability, β = .15, t = 1.12, p = .27, 95% CI [−0.08, 0.29]. In other words, participants who thought about aggression more were more likely to downgrade the attractiveness but not the sociability of a woman with a higher-than-average voice under a romantic prime.
Discussion
Compared with hearing a lower-than-average female voice, hearing a higher-than-average female voice elicited more aggressive thoughts from dispositionally more competitive women when they were romantically aroused. No such effects were observed with less competitive women or when listeners were aroused in a nonromantic context. These findings support sexual selection theory and extend previous research on female hostility (O’Connor et al., 2011; O’Connor & Feinberg, 2012; Puts et al., 2011) by showing that women’s cognitive aggressive response to a higher-than-average female voice is person-specific (e.g., depending on listeners’ baseline competitiveness) and contextually sensitive (e.g., depending on current goals).
This finding is novel because previous research typically had participants evaluate voices of different pitch levels without specifying the context in which those evaluations were made (e.g., O’Connor et al., 2011; O’Connor & Feinberg, 2012; Puts et al., 2011). While performing those evaluations may be contextually invariant, women’s aggressive cognition is likely not. After all, an evolutionary analysis of aggression suggests that aggression is costly and should not be used unless it yields compensatory benefits (Archer, 1988). Consistent with this view, this study showed that female participants’ aggressive cognition was enhanced only on the activation of a fitness-relevant goal, that is, to compete with a threatening rival for a desirable man. It was also found that heightened aggressive cognition negatively predicted the speaker’s perceived attractiveness in the high-pitch/romantic-prime condition. This finding suggests that aggressive cognition is linked to speaker derogation, corroborating the view that concept accessibility operates in concert with other psychological mechanisms (e.g., overt perceptions) to promote adaptive responses (e.g., Miller & Maner, 2011).
Study 2
Study 2 aimed to extend Study 1 by examining how different types of mating motives influence women’s aggressive responses to, and evaluation of, a female speaker with a higher-than-average voice. In Study 1, the romantic prime did not specify whether the protagonist in the story was considering a long-term (e.g., marriage) or a short-term relationship (e.g., a one-night stand) with the man. However, theory and research suggest that women will be more aggressive against same-sex rivals when they are competing for a long-term partner than when competing for a short-term partner.
First of all, compared with short-term relationships, long-term relationships provide benefits that are more substantial and crucial to women’s reproductive success. As previously discussed, reproduction is a long, costly process for women to complete on their own in an ancestral environment. Men’s continuous provisioning before and after childbirth directly improves mothers’ and infants’ well-being, but women only receive this sustained investment in long-term relationships (Puts, 2010). Consequently, women rank men’s financial prospects as one of the most essential selection criteria for a marriage partner but not for a sexual partner (Li & Kenrick, 2006).
In contrast, while it is possible for women to extract immediate resources from short-term relationships, sequestering prolonged, reliable investment from short-term partners is difficult and not the primary goal of having short-term relationships (Buss & Schmitt, 1993). Instead, by mating with men of better or more compatible genes in short-term relationships, women are more likely to have children who can survive environmental hardship (Gangestad & Simpson, 2000). Despite this benefit, however, women still face the long-run challenge of raising their offspring to sexual maturity; resource accruement remains a problem that short-term mating does not solve. Women should thus compete harder for desirable long-term partners.
Supporting this hypothesis, Griskevicius, Cialdini, and Kenrick (2006) found that women showed enhanced verbal creativity as a courtship display after receiving a long-term mating-motive prime than after receiving a short-term mating-motive prime. Furthermore, women are more stressed over their partner’s emotional infidelity (signaling men detaching from a long-term relationship) than over their partner’s physical infidelity (which could indicate a mere fling; Harris, 2003). Finally, activating a mate-guarding motive—which is to retain a long-term partner—has been shown to cause chronically jealous women to evaluate attractive women more negatively (Maner, Miller, Rouby, & Gailliot, 2009).
Given the above considerations and findings of Study 1, a higher-than-average female voice is predicted to cause female participants higher in trait dominance to think about aggression more (Prediction 1) and to evaluate the speaker more negatively (Prediction 2) under a long-term mating-motive prime than under a short-term mating-motive prime.
Method
Participants and Design
A hundred and eleven heterosexual, American female undergraduate students (mean age = 19.5, SD = 1.66) participated in exchange for course credit. They received either a long-term (n = 53) or a short-term (n = 58) mating-motive prime, and their trait dominance was measured as a continuous moderator.
Procedure and Measures
To activate different mating motives, participants viewed photos of four men (obtained from an online dating website) aged between 20 and 25 years, and were asked to select one of them for “a long-term, romantic relationship” or for “a short-term, sexual relationship.” 6 After making the selection, participants were instructed: “Please imagine that you are going on a first date with your chosen man as a long-term, romantic partner/as a short-term, sexual partner. What would be a perfect first date with this person like?” Participants were given 3 minutes to write about their ideas before listening to the high-F0 recording that was used in Study 1. They then completed the word fragment completion task, evaluated the speaker’s attractiveness (how “attractive” and “pretty” the speaker is; α = .89) and sociability (how “friendly” and “warm” the speaker is; α = .81), and filled out the trait dominance measure (α = .81) as in Study 1. Finally, participants indicated whether they were currently in a committed, long-term relationship (1 = yes, 2 = no; n of yes = 34).
Results
Descriptive statistics are presented in Table 2. The distribution of the number of aggression words in Study 2 had a skewness statistic of 0.35 (SE = .23). There was no evidence that the distribution significantly differed from normality, z = 1.52, p = .065. No transformation was applied to the dependent variable.
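The normality check above can be approximately reproduced from the reported values. The following is a minimal sketch, assuming that the standard error was obtained with the conventional formula for the standard error of sample skewness and that the sample size is the N = 111 reported for Study 2. The function names are illustrative and are not taken from the original analysis.

```python
import math

def skewness_se(n: int) -> float:
    """Conventional standard error of sample skewness for sample size n."""
    return math.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))

def upper_tail_p(z: float) -> float:
    """One-tailed (upper) p value from the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

n = 111      # Study 2 sample size (from the text)
skew = 0.35  # reported skewness statistic (rounded)

se = skewness_se(n)  # assumption: this formula underlies the reported SE
z = skew / se        # skewness divided by its standard error

print(f"SE = {se:.2f}, z = {z:.2f}")
print(f"one-tailed p = {upper_tail_p(z):.3f}, "
      f"two-tailed p = {2 * upper_tail_p(z):.3f}")
```

Under these assumptions, the computed standard error rounds to the reported .23, and the reported p value lies closer to the one-tailed than to the two-tailed result. Small discrepancies are expected because the inputs are rounded.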
Table 2. Descriptive Statistics of and Intercorrelations Among the Continuous Variables in Study 2.
Note. *p < .05. **p < .001.
Aggressive Cognition
Prediction 1 stated that, after listening to a higher-than-average female voice, participants higher in trait dominance would generate more aggression words under a long-term but not a short-term mating-motive prime. Moderated multiple regressions were run to test this prediction, with trait dominance, mating-motive prime, and their interaction term entered as predictor variables. Confirming Prediction 1, trait dominance positively predicted the number of aggression words generated in the long-term prime condition, β = .40, t(107) = 3.06, p = .003, 95% CI [0.14, 0.64], but not in the short-term prime condition, β = −.07, t < 1. The two slopes significantly differed from each other, β = −.32, t = −2.48, p = .015, 95% CI [−0.82, −0.09]; see Figure 4.

Figure 4. The two-way interaction between female participants’ trait dominance and mating-motive prime on participants’ cognitive accessibility of aggressive concepts, indexed by the number of aggression words generated.
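The logic of the simple-slopes test for Prediction 1 can be sketched as follows. This is a minimal illustration, not the original analysis: the data are hypothetical and noise-free, the variable and function names are invented for the example, and the slopes are unstandardized, whereas the values reported above are standardized. The sketch relies on the fact that, in a regression containing a continuous predictor, a dummy-coded condition, and their product, the product term's coefficient equals the difference between the two within-condition slopes.

```python
from statistics import mean

def ols_slope(x, y):
    """Least-squares slope of y on x for a single predictor."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

def simple_slopes(dominance, words, prime):
    """Return (long-term slope, short-term slope, slope difference).

    Fitting the slope separately within each prime condition gives the
    same simple slopes as the full dominance x prime interaction model,
    and their difference is the interaction coefficient.
    """
    slopes = {}
    for condition in ("long", "short"):
        xs = [d for d, p in zip(dominance, prime) if p == condition]
        ys = [w for w, p in zip(words, prime) if p == condition]
        slopes[condition] = ols_slope(xs, ys)
    return slopes["long"], slopes["short"], slopes["short"] - slopes["long"]

# Hypothetical data for illustration only (NOT the study's data):
# aggression words rise with dominance under the long-term prime
# and are flat under the short-term prime.
dominance = [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]
prime = ["long"] * 5 + ["short"] * 5
words = [2 + 0.5 * d for d in dominance[:5]] + [3.0 for _ in dominance[5:]]

long_slope, short_slope, interaction = simple_slopes(dominance, words, prime)
print(long_slope, short_slope, interaction)
```

A positive slope in the long-term condition together with a near-zero slope in the short-term condition yields a negative slope difference, which mirrors the pattern of signs reported for Prediction 1.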
Participants’ relational status (0 = in a long-term relationship) was then entered as a moderator. While there was a trend that participants who were currently in a long-term relationship generated more aggression words than those who were not, β = −.24, t(103) = −1.71, p = .091, 95% CI [−1.37, 0.10], the three-way interaction between trait dominance, prime type, and relational status was nonsignificant, β = .01, t < 1. Prediction 1 was supported.
Speaker Evaluation
Prediction 2 stated that, after listening to a higher-than-average female voice, participants higher in trait dominance would perceive the speaker to be less attractive and sociable under a long-term but not a short-term mating-motive prime. Failing to support this prediction, regression analyses revealed no significant main effects and no significant interaction between trait dominance and prime type, ps ≥ .17.
Did aggressive cognition predict speaker evaluations? Similar to Study 1, the speaker’s perceived attractiveness was predicted from the number of aggression words, motive prime, their interaction term, and perceived sociability. There was no evidence for a negative correlation between aggressive cognition and perceived attractiveness, β = −.08, t < 1, under the long-term mating-motive prime.
Discussion
Study 2 extended Study 1 by showing that the effect of a higher-than-average female voice on women’s aggressive cognition was specific to the activation of a long-term but not a short-term mating motive. This finding suggests that, rather than reacting indiscriminately to a threatening rival in intrasexual competition, women become more cognitively aggressive only when they think they are competing for a relationship that, on average, benefits their fitness more. This finding corroborates previous findings that women (a) showed higher-quality courtship displays when primed with a long-term mating motive (Griskevicius et al., 2006) and (b) are more upset about their partner defecting from a long-term relationship than from a short-term relationship (Harris, 2003).
Similar to Study 1, there was no evidence that dispositionally more competitive women perceptually downgrade the speaker’s vocal attractiveness when primed with intense intrasexual competition. This could be a methodological artifact; the priming effect might not have lasted long enough with the evaluations being made after the word-fragment completion task. But it is also possible that pitch, after all, does not reliably predict women’s vocal attractiveness as perceived by women (Jones et al., 2010) or by both men and women (Babel et al., 2014). More research is thus needed before a definitive conclusion can be made about women’s evaluation of female speakers with different pitch levels in intrasexual competition.
This study failed to replicate the negative correlation between aggressive cognition and attractiveness evaluation found in Study 1. To what extent aggressive cognition predicts speaker evaluations thus remains unclear. Perhaps rather than downgrading a woman’s vocal attractiveness, which is difficult to achieve (“an attractive voice is an attractive voice”), female listeners are more ready to attribute negative attributes to higher pitched women (O’Connor et al., 2011; Puts et al., 2011). Aggressive cognition may predict negative attributions more reliably than it negatively predicts positive attributions. This research was unable to test this possibility because it did not include measures of negative attributions.
This study used only the high-F0 recording, and it might be argued that female participants’ aggressive cognition only reflected the priming effect and was irrelevant to the speaker’s pitch. However, previous research provided no evidence that a romantic prime (Griskevicius et al., 2009) or a sexual prime (Ainsworth & Maner, 2012) by itself causes female aggression. Thus, the heightened aggressive cognition found in this study most plausibly reflects the joint effect of the high-F0 recording and the long-term mating-motive prime.
General Discussion
Implications for Speaker Evaluation Research
The current findings, along with previous findings of female hostility toward higher female voices (e.g., O’Connor et al., 2011; Puts et al., 2011), challenge the vocal attractiveness stereotype (Zuckerman & Driver, 1989; Zuckerman et al., 1990). However, these seemingly contradictory findings may in fact consistently suggest that speaker evaluation is driven by fundamental goals, that is, goals related to survival and reproduction (Kenrick, Griskevicius, Neuberg, & Schaller, 2010).
Specifically, Zuckerman and colleagues (Zuckerman & Driver, 1989; Zuckerman et al., 1990) found a universal vocal attractiveness stereotype probably because they used attributes (e.g., honesty, likeability, and warmth) that track a person’s social attractiveness. This trait likely indexes one’s desirability as a social partner (e.g., a friend), which listeners of both sexes will value given the basic need of affiliation (Baumeister & Leary, 1995). In comparison, the studies that challenge the vocal attractiveness stereotype target the traits of a female speaker (e.g., physical attractiveness) that men and women differentially value in a mating context. While a higher-than-average female voice indicates to men the presence of a desirable partner, the same voice suggests to women the presence of a threatening rival. Not surprisingly, then, men and women generate different-valenced evaluations of higher female voices.
In addition to affiliation and mate acquisition, other fundamental goals include self-protection, striving for status, and parenting (Kenrick et al., 2010), and they correspond to the major dimensions of speaker evaluation in previous studies: personal integrity (Lambert, Frankel, & Tucker, 1966), competence (Bradac & Mulac, 1984), and benevolence and kindheartedness (Giles, 1971). Presumably, evaluating speakers’ personal integrity facilitates the avoidance of cheaters in social exchange, evaluating speakers’ competence facilitates status competition, and evaluating speakers’ benevolence and kindheartedness facilitates the search for partners that will make good parents. These considerations lead to testable hypotheses. For example, priming a self-protection motive should make listeners more sensitive to a male (but perhaps not a female) speaker’s aggressiveness than to his talkativeness, even though both traits were shown to load on the same “dynamism” factor (Zahn & Hopper, 1985). Thus, an evolutionary framework has the potential to expand and renovate speaker evaluation research.
More generally, the current research answers Cargile and Bradac’s (2001) call for studying the cognitive and affective mechanisms underlying speaker evaluation. To the current author’s knowledge, at least among voice evaluation studies, only Shoup-Knox and Pipitone (2015) examined participants’ responses (e.g., physiological arousal) that are not overt perceptions. This research thus adds to the literature by revealing another mechanism (i.e., aggressive cognition) that is likely associated with women’s evaluation of female speakers.
Implications for Female Voice Research
The accumulating evidence of women’s increased hostility toward higher female voices suggests that speaking with a higher voice in intrasexual competition is more costly for women than speaking with a lower voice. Because men find higher female voices more attractive, women are motivated to speak with a higher voice (e.g., relative to their natural mean pitch) to sound more attractive (Leongómez et al., 2014). 7 All else being equal, women who adopt this signaling strategy are likely more reproductively successful than women who do not.
However, there are at least two constraints to this attractiveness enhancement strategy. First, individuals vary in their natural F0 range depending on how much they can elongate their vocal folds (Titze, 2000). Due to this physical constraint, a woman with a naturally low voice (e.g., 180 Hz) and a small F0 range (e.g., 20 Hz) will never be able to sound as attractive as a woman with a naturally higher voice (e.g., 220 Hz) by just changing her pitch. Second, even given the same F0 range, pitch raising differentially affects the perceived attractiveness of women with different natural mean pitch. Indeed, Borkowska and Pawlowski (2011) found that, while Polish female speakers with “low” (mean F0 = 185 Hz) and “medium” (mean F0 = 224 Hz) voices were perceived to be more attractive after their voices were experimentally raised by 20 Hz, women with “high” (mean F0 = 262 Hz) and “very high” voices (mean F0 = 310 Hz) were in fact perceived as less attractive after their voices were raised by the same amount. This finding suggests that pitch raising is a more effective attractiveness enhancement strategy for women with low to medium natural voices than for women with higher natural voices.
The findings of this research add to the two constraints by suggesting that a social cost is imposed on women when they raise their pitch to sound more attractive. That is, a higher-than-average female voice causes women who are dispositionally more competitive to think about aggression more when they are romantically aroused. This enhanced aggressive cognition, along with negative-attribution making (O’Connor & Feinberg, 2012; Puts et al., 2011) and jealousy (O’Connor et al., 2011), can lead to further hostile treatment of the signaler, such as social exclusion and physical aggression. The costs associated with speaking with a higher voice can thus deter its use by women who are not genuinely motivated to compete, allowing voice pitch to reliably indicate women’s courtship intent (cf. Searcy & Nowicki, 2005).
Limitations and Future Directions
First, the present research used only the word fragment completion task to measure women’s aggressive cognition. Even though this is a valid measure of aggressive cognition, it lacks proper control over lexical frequency (as multiple neutral or aggression words can be generated). While this does not invalidate the conclusion of this research, it prevents more accurate estimation of participants’ cognitive aggressiveness. Future research may consider using other measures of cognitive aggressiveness (e.g., lexical decision tasks that measure response time to aggression words) to replicate the present findings.
Second, this research used only one woman’s voice as the stimulus, which limits the generalizability of the findings. However, since the research is (to the author’s knowledge) the first of its kind, it focused more on establishing internal validity by minimizing the elements that varied across studies. Had a different voice been used and a null finding observed in Study 2, it would be difficult to determine whether the null finding was due to the use of a different prime or a different voice. Furthermore, the observed effects are unlikely to be attributable to any unique features of the speaker’s voice, because the same voice was used and both studies adopted a between-subjects design. Qualities of the voice other than its mean F0 might or might not have induced aggressive cognition, but these biases (if any) should have been held constant across conditions by random assignment.
For future studies, one may consider testing female participants from different age groups. Women’s fertility peaks in their early 20s, and women in this age group thus likely experience more intense intrasexual competition than those who are younger (thus not yet sexually mature) or postmenopausal (thus losing reproductive capacity). It can be expected that the effect of pitch and romantic prime on aggressive cognition will be weaker among those younger and older participants.
Future studies may also consider replicating this research by using acoustic features other than pitch to operationalize women’s vocal attractiveness. After all, both men and women can manipulate their voice in multiple ways, including varying their vocal breathiness, nasality, and resonance (to name just a few). In particular, growing evidence suggests that formant frequencies reliably track women’s vocal attractiveness (Babel et al., 2014; Puts et al., 2011), with higher formant frequencies indicating a shorter vocal tract and a less resonant sound (Fitch & Giedd, 1999). Men prefer women with higher formant frequencies presumably because a shorter vocal tract is associated with a smaller body size (due to anatomic constraints; Fitch & Giedd, 1999) and men (at least in the United Kingdom) prefer women with moderately small body size (Tovée, Maisey, Emery, & Cornelissen, 1999). Thus, higher-than-average formant frequencies are expected to have an effect on women’s aggressive cognition similar to that of higher-than-average pitch as demonstrated in this research.
It will also be important to conduct similar studies in different cultures. For example, van Bezooijen (1995) found that a higher female voice was rated as more attractive by Japanese participants than by Dutch participants. This finding suggests that the effects observed in this research would be stronger in cultures where the femininity stereotype for women is more salient. Cross-cultural evidence of this kind will help illuminate whether women’s aggressive cognitive response to a higher-than-average female voice is universal or culturally specific.
Conclusion
Previous research on speaker evaluation has found that listeners are more likely to associate positive attributes with individuals having more attractive voices. The findings of this research suggest that this may not always be the case. A higher-than-average female voice, which men generally find attractive, induces aggressive thoughts in female listeners who are dispositionally more competitive when they are romantically aroused. This psychology likely reflects a contextually dependent competitive strategy of women in intrasexual competition and may contribute to the negative evaluations of female speakers with higher-than-average voices.
Acknowledgements
Special thanks to the editor and two anonymous reviewers for their helpful suggestions during the review process.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the authorship and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
