Abstract
Here we present semantic space theory and the data-driven methods it entails. Across the largest studies to date of emotion-related experience, expression, and physiology, we find that emotion is high dimensional, defined by blends of upward of 20 distinct kinds of emotions, and not reducible to low-dimensional structures and conceptual processes as assumed by constructivist accounts. Specific emotions are not separated by sharp boundaries, contrary to basic emotion theory, and include states that often blend. Emotion concepts such as “anger” are primary in the unfolding of emotional experience and emotion recognition, more so than core affect processes of valence and arousal. We conclude by outlining studies showing how these data-driven discoveries are a basis of machine-learning models that are serving larger-scale, more diverse studies of naturalistic emotional behavior.
Emotions are complex states involving appraisals, expressions, physiological reactions, and patterns of feeling and thought (Keltner, Oatley, & Jenkins, 2018). Central to their understanding is the study of how emotional behaviors are associated with social contexts, neurophysiological responses, and self-reported experiences or perceptions (Scherer, 2005). At stake in such research are many questions: What are the emotions? Are specific emotions separated by clear boundaries, or discrete? What is primary in emotional experience, appraisals of valence and arousal or distinct emotions?
Here we present a new approach to these questions offered within semantic space theory. We first outline this theory and its central tenet that the latent dimensions of emotion can be derived quantitatively from data (Cowen, Sauter, et al., 2019). We then survey how recent studies of emotion-related experience, expression, and physiology offer tests of basic emotion theory and constructivist approaches. We conclude by considering advances in machine-learning methods for measuring emotion-related behavior.
Semantic Space Theory
Central to emotion science is the reliance upon self-reports or measures of behavior to understand how appraisals of stimuli give rise to varied emotional experiences. To approach such complexity, semantic space theory posits that emotion-related behaviors reside in what we call a semantic space (Cowen & Keltner, 2018). Much as other subjective phenomena—colors, tastes, physical sensations—are organized within broader representational spaces, the same is true of emotion. The emotional experiences or expressions that arise in an interaction with friends, for example, are determined by dynamic trajectories within a semantic space that capture how eliciting stimuli, experiences, and expressive behaviors relate to one another. Three properties define such semantic spaces and their variation across individuals and cultures (Cowen & Keltner, 2018, 2021).
A first is their dimensionality, or the number of distinct dimensions or kinds of emotion that organize a semantic space. Dimensions are not to be confused with discrete categories or clusters, which are terms that describe the distribution of emotion (see next paragraph). Dimensions are typically continuous features that can combine in any number of ways and are uncovered in complex data with multidimensional reliability analyses (described below). In principle, a single emotion, such as “fear,” can contribute to multiple distinct dimensions; we find, though, that dimensions of semantic spaces of emotion most typically correspond to individual emotions.
The second property of a semantic space is the distribution of states within the space—that is, the geometric arrangement of emotions and boundaries between them. Are different kinds of emotion separated by distinct boundaries and organized within clusters, as presupposed by the phrase “discrete categories of emotion”?
The third property is conceptualization: whether emotion-related phenomena are better described by emotion categories or by broader appraisals. Is there more explanatory value in the use of specific emotion concepts (e.g., “sympathy,” “anger”) or more general psychological features, most notably valence and arousal, to explain emotion-related experience, expression, or physiology?
The methods typically used in emotion science assume that the dimensionality, distribution, and conceptualization of emotion are known in advance, and preclude uncovering these properties from the data. A typical study might focus on four to six emotions or two dimensions presumed to explain emotion-related responses, typically evoked using a limited number of stimuli and characterized with self-report measures that capture four to 10 emotions (Cowen & Keltner, 2017). The dimensionality of such data will, by definition, be low. The distribution of states will be biased toward isolated, discrete clusters. Forced-choice measures that focus on a narrow array of self-reports do not allow for the modeling of how people conceptualize emotion.
Given these concerns, we developed data-driven methods to map a semantic space of any emotion-related response. Participants are presented with a vast array of stimuli—short films, facial expressions, samples of music—instead of a limited number of prototypical stimuli. Participants report on 50 emotions and appraisals (Cowen & Keltner, 2017). Finally, with parallels to new approaches to brain imaging data (e.g., Davis et al., 2014), new statistical techniques reveal the patterns of relations between emotions (see Cowen & Keltner, 2017, 2018). Traditional methods, such as correlations, t tests, and emotion recognition accuracy measures, introduce assumptions about the structure of emotion and focus on testing one-to-one mappings between specific emotions and stimuli. The methods we have developed, by contrast, assess the patterns of covariation between self-reports of emotion-related experience, expression, and physiology. Whereas traditional dimensionality reduction techniques, such as factor-analytic and principal components analyses, use heuristic criteria, such as scree tests, to reduce the dimensionality of data sets, we have focused on how the multidimensional factors maximally covary across multiple data sets (such as emotion judgments from two cultures), which allows a characterization of the latent structure of multidimensional data (see Cowen et al., 2021; Cowen, Elfenbein, et al., 2019; Cowen, Sauter, et al., 2019; Demszky et al., 2020).
With these data-driven methods, we have carried out studies testing the claims of basic emotion theory and constructivist accounts, which we present in Table 1 (see Cowen & Keltner, 2020, 2021). Traditionally, these two approaches diverge in their accounts of the dimensionality of emotion, with basic emotion theorists claiming that emotion-related responses involve a small number of distinct states and constructivist accounts representing emotion along two or three dimensions of “core affect” (typically conceptualized as valence and arousal). A traditional claim of basic emotion theory is that the boundaries between emotions are discrete, which pertains to the distribution of emotions. Long debated, but little tested, are contrasting claims about conceptualization: In basic emotion theory, specific emotions, such as “awe” and “disgust,” describe distinct dimensions of emotion-related phenomena, whereas in constructivist approaches, emotions are constructed meanings within the two- or three-dimensional space of core affect. Our recent studies, spanning thousands of participants and numerous countries, converge upon three insights.
Table 1. Basic Emotion Theory and Constructivist Hypotheses About Semantic Spaces of Emotion
Emotion Is High Dimensional, With Upward of 20 Distinct Kinds
The science of emotion is oriented to a small number of “the basic emotions”—anger, disgust, fear, sadness, surprise, and happiness (Ekman, 1992)—or constructivist assumptions that emotions have two or three underlying neurophysiological dimensions, typically conceptualized as valence and arousal (Russell, 2003) and sometimes a third dimension (Bakker et al., 2014), with specific emotions arising out of contingent social constructions (Asutay et al., 2021; Barrett, 2017). These are both low-dimensional accounts of emotion.
Several recent studies reveal that semantic spaces of emotion, instead, involve upward of two dozen distinct dimensions. In a first, participants rated their subjective responses (with Likert scales and in free-response format) to 2,185 short, evocative film clips in terms of 34 emotions and 14 appraisals (e.g., valence, arousal, effort, certainty). Rather than relying on factor analysis, we used a multidimensional reliability analysis method, split-half canonical correlation analyses (SH-CCA), which uncovers the distinct dimensions of response required to explain the consistencies in response across participants. For instance, if participants applied a single emotion term consistently to similar videos, independent of any other item, SH-CCA would identify this as a reliable dimension, whereas factor analysis, which measures only correlations or covariances across items, would not. (Factor analysis would also identify as dimensions items that were correlated but applied completely inconsistently to stimuli.) Figure 1a presents a two-dimensional t-distributed Stochastic Neighbor Embedding (t-SNE) plot of the results, with each color corresponding to a distinct category of emotion consensually relied on by participants to label their experiences. This embedding plot visualizes the space along two dimensions by preserving local neighborhoods of points but distorting distances between more distant points. It does not fully capture the high-dimensional space of emotion experience uncovered in this first study: People distinguished 27 distinct kinds of emotional experience.
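The logic of a split-half canonical correlation analysis can be illustrated with a minimal sketch on simulated ratings. This is not the analysis code from the cited studies: it assumes NumPy, and the function names (`fit_cca`, `held_out_correlations`, `simulate_ratings`), the five simulated items, and the 0.3 threshold for calling a dimension reliable are all hypothetical choices made for illustration. Three items are applied consistently to stimuli but are uncorrelated with one another; two further items are correlated within each rater but applied inconsistently across raters.

```python
import numpy as np


def simulate_ratings(rng, n_raters=20, n_stimuli=1000):
    """Simulated ratings with shape (raters, stimuli, items). Illustrative only."""
    signal = np.zeros((n_stimuli, 5))
    # Items 0-2: each applied consistently to stimuli, independent of other items.
    signal[:, :3] = rng.normal(size=(n_stimuli, 3))
    noise = rng.normal(size=(n_raters, n_stimuli, 5))
    # Items 3-4: share rater-specific noise, so they correlate with each other
    # but carry no information that is consistent across raters.
    noise[:, :, 3:] += rng.normal(scale=2.0, size=(n_raters, n_stimuli, 1))
    return signal[None, :, :] + noise


def fit_cca(x, y):
    """Canonical weights for x and y, via QR decomposition followed by an SVD."""
    x_mean, y_mean = x.mean(axis=0), y.mean(axis=0)
    qx, rx = np.linalg.qr(x - x_mean)
    qy, ry = np.linalg.qr(y - y_mean)
    u, _, vt = np.linalg.svd(qx.T @ qy)
    return x_mean, y_mean, np.linalg.solve(rx, u), np.linalg.solve(ry, vt.T)


def held_out_correlations(x_train, y_train, x_test, y_test):
    """Fit CCA on training stimuli, then correlate paired variates on test stimuli."""
    x_mean, y_mean, wx, wy = fit_cca(x_train, y_train)
    sx = (x_test - x_mean) @ wx
    sy = (y_test - y_mean) @ wy
    return np.array(
        [np.corrcoef(sx[:, i], sy[:, i])[0, 1] for i in range(sx.shape[1])]
    )


rng = np.random.default_rng(0)
ratings = simulate_ratings(rng)

# Split raters into two random halves and average ratings within each half.
order = rng.permutation(ratings.shape[0])
half_a = ratings[order[:10]].mean(axis=0)
half_b = ratings[order[10:]].mean(axis=0)

# Hold out 200 stimuli so that chance agreement does not inflate the estimate.
train, test = slice(0, 800), slice(800, 1000)
r = held_out_correlations(half_a[train], half_b[train], half_a[test], half_b[test])

# The 0.3 cutoff is an arbitrary illustrative threshold, not a published criterion.
n_reliable = int((r > 0.3).sum())
print("held-out canonical correlations:", np.round(r, 2))
print("dimensions above threshold:", n_reliable)
```

In this simulation the three consistently applied items should be recovered as separate reliable dimensions even though they are mutually uncorrelated, whereas the two correlated but inconsistently applied items should not be, which is the contrast with factor analysis drawn above.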

Fig. 1. Semantic space of emotion evoked by (a) 2,185 brief videos and (b) facial-bodily and vocal expression. (a) At least 27 distinct emotional experiences can be evoked by video (Cowen & Keltner, 2017). Interactive map: https://s3-us-west-1.amazonaws.com/emogifs/map.html. (b) A total of 3,532 expressions are lettered, positioned, and colored according to the distinct emotions they convey (28 in facial expression [Cowen & Keltner, 2020] and 24 in vocal expression [Cowen, Elfenbein, et al., 2019]). Interactive maps: face, https://s3-us-west-1.amazonaws.com/face28/map.html; voice, https://s3-us-west-1.amazonaws.com/vocs/map.html.
In a recent study from Japan, viewing these videos activated 34 distinct patterns of brain activation corresponding to distinct emotional experiences, coherent across Japanese participants (Horikawa et al., 2020). In studies of subjective experiences associated with music and visual art, with participants from several cultures, upward of 20 to 25 emotions emerged as distinct dimensions (e.g., Cowen et al., 2020).
What about emotional expression? Guided by semantic space theory, we had participants judge 1,500 naturalistic facial-bodily expressions in one study and 2,032 short vocalizations known as vocal bursts in another (Cowen, Elfenbein, et al., 2019; Cowen & Keltner, 2020). As Figure 1b reveals, people recognize upward of 25 distinct kinds of emotion from facial and vocal behavior, which we again visualize within a two-dimensional, nonlinear embedding. These results converge with recent studies from over 12 cultures documenting the recognition of 15 to 20 emotions in vocal bursts (Cordaro et al., 2016), prosody (Cowen, Laukka, et al., 2019), and full-body expression (Monroy et al., 2022), and cross-cultural similarities (between 50% and 70%) in facial-bodily expression of emotion (Cordaro et al., 2018, 2020) and mimicry of others’ vocal expression (Brooks et al., 2022). Across semantic spaces of experience, expression, and physiology, then, 20 to 25 distinct kinds of emotion are found.
Specific Emotions Are Heterogeneous and Not Discrete
In a typical study in emotion science, prototypical stimuli are sorted into a handful of categories, and elicited responses are judged in terms of their correspondence to an expected response (Cowen, Sauter, et al., 2019). Mismatches are treated as challenges to the idea that emotion categories are stable and coherent (e.g., Barrett et al., 2019). The methods we have outlined here allow the structure of emotion to emerge from the data itself. No assumptions are made about the discreteness of emotion.
As evident in Figure 1, emotion blends are common. Within a category of emotion there are more and less prototypical experiences and expressions (Ekman, 1992). The frequency of emotion blends highlights the need to model the heterogeneity of emotion-related responses within a category, rather than treating it as noise, inaccuracy, or category incoherence; data-driven methods allow this.
A central tenet of basic emotion theory is that there are discrete boundaries between emotions (e.g., Ekman, 1992). This we find to be true for select emotions (see desire and craving in Fig. 1a). More typically, the boundaries between emotions are fuzzy, not discrete. In addition, certain cognitive appraisals arrayed on dimensions bridge distinct emotions. Appraisals of effort, for example, account for transitions from experiences of disgust to anger (see Cowen & Keltner, 2017). Categories of emotion, then, are heterogeneous, are not separated by discrete boundaries, and are bridged by gradients of meaning.
Specific Emotions Are Primary in Emotion Conceptualization
What structures how people conceptualize an emotion-related response: specific emotions or appraisals of valence and arousal? Answering this question has proven elusive because of the empirical focus on a limited array of emotions elicited with equally impoverished stimuli and self-report terms.
Three kinds of evidence from our studies reveal specific emotions to be primary in the conceptualization of emotion. First, across studies, valence and arousal explain only a small fraction—typically around 25%—of the variance in experiences of states like “disgust” or “awe”; reported categories of emotion, by contrast, capture nearly the entirety of the variance in valence and arousal. Valence and arousal would appear to be retrospective inferences about emotional phenomena that are fully explained by, but do little to explain, more primary, stimulus-specific representational processes.
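The asymmetry described here follows partly from geometry, and a small simulation makes that visible. The sketch below is an illustration under stated assumptions, not a reanalysis of any study: it assumes NumPy, 10 simulated emotion categories that vary independently across stimuli, and valence and arousal defined as noisy linear combinations of those categories. The helper `r_squared` and all variable names are hypothetical.

```python
import numpy as np


def r_squared(predictors, targets):
    """In-sample share of total target variance explained by least squares."""
    design = np.column_stack([np.ones(len(predictors)), predictors])
    coef, *_ = np.linalg.lstsq(design, targets, rcond=None)
    residual = targets - design @ coef
    centered = targets - targets.mean(axis=0)
    return 1.0 - (residual ** 2).sum() / (centered ** 2).sum()


rng = np.random.default_rng(0)
n_stimuli, n_categories = 2000, 10

# Assumption: category ratings vary independently across stimuli.
categories = rng.normal(size=(n_stimuli, n_categories))

# Assumption: valence and arousal are linear readouts of the categories plus noise.
loadings = rng.normal(size=(n_categories, 2))
affect = categories @ loadings + 0.3 * rng.normal(size=(n_stimuli, 2))

r2_categories_to_affect = r_squared(categories, affect)
r2_affect_to_categories = r_squared(affect, categories)

print("categories -> valence/arousal:", round(r2_categories_to_affect, 3))
print("valence/arousal -> categories:", round(r2_affect_to_categories, 3))
```

Because two dimensions can span at most two of ten independent directions, valence and arousal can account for roughly a fifth of the category variance in this setup, while the categories recover valence and arousal almost completely. The exact figures are a product of the simulation settings and should not be read as estimates of the values reported in the studies.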
The cross-cultural stability of self-reports offers a second test of the conceptualization primacy question. In cross-cultural research, studies typically compare the similarity with which individual members of two cultures label a stimulus with the same terms, which conflates individual and cross-cultural variation. In such studies, the maximum level of similarity that can be observed across cultures is limited by the interrater reliability within each culture. To control for individual-level variation, we use techniques that compare the stability across cultures with which specific emotions and dimensions of valence and arousal are associated with expressions and elicitors of emotion. For example, Figure 2 presents the results from a study of the recognition of emotion from 2,519 speech samples from five cultures (Cowen, Laukka, et al., 2019). Correlations between Indian and U.S. participants’ ratings reveal greater cross-cultural stability in how the voice conveys specific emotions than valence and arousal. Similar results were observed in Chinese and U.S. participants’ ratings of over 1,500 samples of music (Cowen et al., 2020).
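One standard way to express how within-culture reliability caps an observed cross-cultural correlation is Spearman's correction for attenuation from classical test theory. The sketch below shows that textbook formula only; whether the cited studies applied this exact correction is not stated here, and the function names and example values are hypothetical.

```python
import math


def attenuation_ceiling(reliability_a, reliability_b):
    """Largest correlation observable between two measures of given reliability."""
    for value in (reliability_a, reliability_b):
        if not 0.0 < value <= 1.0:
            raise ValueError("reliabilities must lie in the interval (0, 1]")
    return math.sqrt(reliability_a * reliability_b)


def disattenuated_correlation(r_observed, reliability_a, reliability_b):
    """Observed correlation rescaled by the ceiling that unreliability imposes."""
    return r_observed / attenuation_ceiling(reliability_a, reliability_b)


# Hypothetical values: reliability of 0.8 in one culture and 0.5 in the other,
# with an observed cross-cultural correlation of 0.6.
ceiling = attenuation_ceiling(0.8, 0.5)
corrected = disattenuated_correlation(0.6, 0.8, 0.5)
print(round(ceiling, 3), round(corrected, 3))
```

Under these assumed values, an observed correlation of 0.6 sits close to the highest value the two reliabilities would allow, so a raw comparison would understate how similar the two cultures' judgments are.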

Fig. 2. Correlations across Indian and U.S. participants’ ratings of emotional prosody. Error bars represent standard error (estimated by bootstrapping across participants).
Recent neuroscientific studies yield a third kind of evidence pointing to the primacy of specific emotions in conceptualization processes. In an analysis of the temporal unfolding of the brain’s representations of emotion vocalizations in terms of self-reports, specific emotions manifested earlier in the brain’s response and valence and arousal later—perhaps as more “abstract,” even epiphenomenal, forms of conceptualization, posing problems for constructivist accounts that posit that valence and arousal are primary in the conceptualization of emotion (Giordano et al., 2021). In the functional magnetic resonance imaging study by Horikawa and colleagues (2020) referred to earlier, self-reports of dozens of emotions explained patterns of evoked brain activity better than ratings of valence and arousal in every brain region measured, both cortical and subcortical, suggesting that even in subcortical emotion-related processing, specific emotions are primary.
A Synthesis and Future of Emotion Science
In studies spanning multiple cultures, thousands of stimuli, and large and diverse samples, we find that 21 emotions meet Ekman’s criteria for “basic” emotions, being associated with distinct antecedents, experiences, expressions, and neurophysiological correlates (Ekman, 1992; see Cowen & Keltner, 2021). These emotions comprise 10 “negative” states (anger, anxiety, boredom, confusion, disgust, embarrassment, fear, pain, sadness, and shame) and 10 “positive” states (amusement, awe, contentment, desire, excitement, love, interest, pride, sympathy, and triumph; see also Weidman & Tracy, 2020), as well as surprise.
Our findings are descriptive. We have not explained how these emotions arise, unfold, and vary across cultures—a task for componential, discrete, and dimensional appraisal approaches, perhaps aided, we hope, by the data-driven methods summarized here. Nor have we offered functional accounts of why such a rich space of emotions emerged in hominid evolution, although findings dovetail with recent explanations of the self-conscious emotions (Sznycer & Cohen, 2021), hierarchically relevant emotions (Tracy et al., 2020), attachment- and knowledge-related emotions (Shiota et al., 2017), and aesthetic emotions (Juslin, 2019; Schindler et al., 2017).
Emotion science has been constrained by its reliance on small and homogeneous samples and time-consuming approaches to behavioral measurement. Grounded in our data-driven discoveries, we have developed machine-learning “expression models” based on these methodological principles: (a) build large-scale data sets of naturalistic samples of emotional behavior, accompanied by psychologically valid measures of the meaning of the context and the behavior (as ascribed by the people forming the behavior and outside perceivers), and (b) train machine-learning models to predict the context or meaning of behaviors directly from raw auditory or visual recordings. We have used these methods to train machine-learning models to output objective measures of facial and vocal expressions based on hundreds of thousands of emotionally rich images, videos, and audio recordings.
In one study, annotations of hundreds of thousands of videos were used to train a machine-learning model that could accurately predict judgments of 16 kinds of facial expression. Importantly, the model was exposed only to pixels on the face itself, and normalization methods were used to remove effects of viewpoint and lighting. We then applied the model to a separate set of six million videos from around the world, each annotated using both video- and text-based machine-learning models to determine its social context (Cowen et al., 2021). Across 144 countries, 16 facial expressions reliably arose in naturalistic social contexts, and at minimum, there was a 70% overlap between any two broad cultural regions (e.g., western Europe, East Asia) in the covariation between facial expressions and specific social contexts.
With large-scale, experimentally controlled data sets that eliminate correlations between emotional behaviors and the traits, clothing, or environment of the people forming them, we have developed models for measuring facial expressions, vocal bursts (Brooks et al., 2022), text, and prosody (see https://hume.ai/ for more information). In two such experiments, we characterized the semantic spaces of how thousands of participants from Asian, African, South American, and western European countries mimic naturalistic facial and vocal expressions (Brooks et al., 2022). Models trained on these diverse data sets (involving hundreds of thousands of mimicked expressions) uncovered considerable cross-cultural similarity in 28 distinct kinds of facial expression (see Fig. 3), overlapping 63% in meaning across six countries, and 24 distinct kinds of vocal expression, overlapping 79% in meaning across five countries.

Fig. 3. Twenty-eight dimensions of facial expression in six countries discovered by an expression model (Brooks et al., 2022). We used a deep neural network (DNN) to find dimensions of facial expression across six countries, independent of facial appearance and context, by averaging ratings of 4,659 images in each country and tasking the DNN with predicting these country-wise average ratings solely from imitations of each expression. We present morphs of the average facial movements associated with the 28 dimensions and the concept most reliably associated with each dimension.
Semantic space theory and new data-driven methods allow for a new path forward in emotion science for finding answers to perennial questions about emotion in larger-scale, more diverse studies of naturalistic behavior.
Recommended Reading
Cowen, A. S., & Keltner, D. (2017). (See References). The first application of semantic space theory and its data-driven methods, finding that 27 distinct emotions are felt in response to highly evocative short videos.
Cowen, A. S., & Keltner, D. (2021). (See References). Offers a more detailed study of the methods we profile in this article and a broader survey of the neurophysiology of distinct emotions.
Cowen, A. S., Keltner, D., Schroff, F., Jou, B., Adam, H., & Prasad, G. (2021). (See References). The first article to extend semantic space theory to machine-learning approaches to emotional expression, finding that naturalistic facial expressions of 16 emotions covary with specific contexts in highly similar ways across 144 countries.
Keltner, D., Oatley, K., & Jenkins, J. M. (2018). (See References). Provides a broad overview of what has been learned in emotion science.
