How Speaker Gender Shapes Emotion Perception: Prosodic Cues in Low-Pass Filtered Korean Speech

Abstract

Speech conveys rich paralinguistic information, notably the speaker’s emotional state. The acoustic expression of emotion, however, varies considerably with speaker and listener gender and with the broader cultural and linguistic context. This study investigates how 33 native Korean listeners perceive four emotions (Happy, Sad, Angry, and Anxious). The stimuli were low-pass filtered emotional utterances from ‘The Open AI Dataset Project (AI-Hub)’; the filtering removed semantic content so that listeners had to rely on prosodic cues. Results showed that recognition accuracy varied by both emotion and speaker gender: Happy was most consistently identified across all voices, while Angry was recognized more accurately in male speech and Sad in female speech. Perceived emotional intensity also differed by speaker gender: female speakers received higher intensity ratings for Happy than male speakers did. Within female speech, Happy and Sad were perceived as more intense than Angry. These results are discussed in light of the perceptual weighting of acoustic cues in emotion recognition, and they suggest that gendered voice characteristics modulate how listeners extract emotional meaning from prosody alone.

Keywords

emotional speech, gender differences, speech perception, acoustic features, Korean
