Abstract
Speech conveys rich paralinguistic information, notably the speaker’s emotional state. The acoustic expression of emotion, however, varies considerably with factors such as speaker and listener gender, as well as broader cultural and linguistic contexts. This study investigates how four emotions (Happy, Sad, Angry, and Anxious) are perceived by 33 native Korean listeners. The stimuli were low-pass filtered emotional utterances from ‘The Open AI Dataset Project (AI-Hub)’; the filtering removed semantic content so that listeners could focus on prosodic cues. Recognition accuracy varied by both emotion and speaker gender: Happy was most consistently identified across all voices, while Angry was more accurately recognized in male speech and Sad in female speech. Perceived emotional intensity also differed by speaker gender: female speakers received higher intensity ratings for Happy than male speakers did. In particular, female speakers’ Happy and Sad utterances were perceived as more intense than their Angry utterances. These results are discussed in light of the perceptual weighting of acoustic cues in emotion recognition, and they suggest that gendered voice characteristics modulate how listeners extract emotional meaning from prosody alone.
