Abstract
For more than a century expressions have been approached as bidimensional, static, instantaneous, self-contained, well-defined, and universal signals. These assumptions are starting to be empirically reconsidered: this special section of Emotion Review includes reviews on the physical, social, and cultural dynamics of expressions, and on the complex ways in which, throughout the lifespan, facial behavior and emotion are perceived and categorized by primate and human brains. All these advances are paving the way for exciting new approaches to facial behavior that are more likely to strike an appropriate balance between description and explanation.
Kagan (2007) distinguishes two different strategies for the study of emotion: in the top–down strategy, theory drives observation, and categories fit into the a priori concepts of an explanatory model. What the model gains in clarity of exposition and straightforward tests of its hypotheses is, on the other hand, lost in representativeness. Top–down strategies cannot provide a proper explanation of those phenomena that are not included in their a priori categories. The bottom–up strategy requires a long period of data collection in order to elaborate a theory that is as inclusive as possible. What bottom–up strategies gain in representativeness is lost in clarity and testability. Darwin’s development of his evolutionary theory is a prototypical example of the second strategy. He spent no less than 20 years collecting data and annotations in his notebooks before launching his theory.
These two conceptual strategies are complementary, and they coincided for several years in the study of the expression of emotion. During the 1970s and early 1980s ethology provided field observations whose epitome is Eibl-Eibesfeldt’s (1989) work on human behavior across an impressive number of cultures. Around the same period, psychology followed the top–down strategy, creating some parsimonious and attractive theories based on experimental studies. The most notable example of such work is Ekman’s neurocultural theory (1972), initially based on one experiment (Friesen, 1972).
Eventually, the top–down strategy prevailed and the psychological approach became popular. The history is well known. The pioneering work of Tomkins’ disciples—Carroll Izard, Paul Ekman, and Wallace Friesen—not only constituted the most important and popular chapter in the study of nonverbal communication, but decisively promoted a renaissance in the study of emotion. The new concept of facial expression helped to connect psychological causes and evolutionary functions, and to maintain a stream of behavioral research during the paper-and-pencil cognitivist era.
But this endeavor took its toll on the balance between explanation and description. The psychological approach to facial expression is a quintessential product of the top–down strategy: it assumes simple, a priori conceptual categories that look evident because they fit researchers’ and the public’s commonsense assumptions, but that are actually grounded in problematic empirical evidence. The testing of “expressions of emotion” has also been subject to some technical limitations that arbitrarily shaped researchers’ explanations about where, when, how, and, most important, why facial behavior happens.
Paradoxically, Darwin, the champion of the bottom–up strategy, was responsible for inspiring some of the a priori assumptions of the current top–down approach. Darwin’s The Expression of the Emotions in Man and Animals (1872/1965) was something of a popular book, in which he indulged in anecdotal stories and retouched illustrations for the sake of persuasiveness. Unfortunately, Darwin did not have access to even primitive forms of motion pictures (e.g., Muybridge’s; see Fernández-Dols & Ruiz-Belda, 1997), and he followed an age-old tradition in the arts, basing his arguments on still representations of expressions, chosen on subjective or aesthetic grounds. For example, Darwin’s most famous photographic plates for his 1872 book were actually drawings based on photographs of faces posed under Darwin’s own instructions (Prodger, 2009). Darwin’s choice of de-contextualized still figures became, for more than a century, a theoretical prescription: movement and context were banished from the psychologists’ experimental study of expression.
Of course, Darwin’s propagandistic icons were not the only factor that helped to institute this static view of facial expression. Up to the 1980s, recording films or primitive video tapes required special lighting and restricted movements. Developing films was expensive and time-consuming, and video image resolution was low. There was no widespread use of reliable methods for the description of fine facial muscular movements, and when such methods were available, their use was extremely time-consuming and complex, leading researchers to the analysis of very short sequences of facial behavior.
In this way, 19th-century traditional assumptions about expression and 20th-century technical limitations conspired to support the a priori assumption that “expressions of emotion” were brief appearances of some static muscular configurations.
Moreover, the experimental design of the emotional events aimed at eliciting such “expressions” was afflicted by a similar mix of commonsense assumptions and practical restrictions. Most, if not all, typical antecedents of emotion (such as love affairs for happiness, losses of loved ones for sadness, friends’ betrayals for anger, traffic accidents for fear, and so on; see, e.g., Wallbott & Scherer, 1986) are extremely difficult or impossible to produce in the laboratory. Thus, less typical, or indeed quite unnatural, stimuli were expected to elicit typical emotions and their natural expressions.
A conspicuous example of such unnatural stimuli is movies. Researchers thought movies would provide the solution to the challenge they faced. Movies are make-believe devices that popular wisdom identifies as natural emotion elicitors. However, the way in which movies cause emotions is not a solution, but more of a research problem in itself. Movies probably constitute the least natural set of stimuli ever produced. They are bidimensional, dynamic events that represent other events “seen from outside” through a huge number of temporal and spatial conventions (e.g., the action of the represented event rarely takes place in real time); moreover, movies apparently allow identical reproductions of complex events (you can press the “play” key as many times as you want), a recent feature of the perceptual world that was absent throughout millions of years of evolution. The emotions of movie audiences are filtered and interfered with by extremely sophisticated forms of media literacy. The assumption that movies automatically elicit pure, ancestral, midbrain basic emotions impervious to neocortical processes would seem too naïve from both the evolutionary and psychological points of view.
Last but not least, the arbitrary assumptions about the concept of expression and its elicitors were reinforced by some accidental factors that had far-reaching but undesirable theoretical consequences. Up until the end of the 20th century, people were not used to being photographed or filmed by cameras (a trivial appliance today, but an awesome device not so many years ago). Researchers feared that participants’ attention would be focused on the camera, and this led to the introduction of hidden cameras. But the concealment of cameras encouraged an additional commonsense-based but actually untested assumption: Private expressions were necessarily true, and public expressions were probably false. Thus, “expressions of emotion” were categorized in terms of an implicit two-value logic with a crisp binary decision: true versus false.
All in all, research on the “expression of emotion” included the following prescriptions and their corresponding implicit assumptions:
Facial expressions are bidimensional stimuli (the sender’s and receiver’s positions in a three-dimensional space are irrelevant features).
Facial expressions are instantaneous, brief, static facial configurations (muscular movement per se is not a relevant feature).
Distinctive facial information is based on extreme positions of the muscles, muscular tension being synonymous with emotion intensity (the sequence and timing of the unfolding of facial muscle movements are irrelevant).
The distinctiveness of static close-ups of facial expression is based on self-contained facial information (contextual information, including simultaneous body behavior, is irrelevant).
Facial expressions of basic emotion can be elicited by any kind of artificial stimulus (the production of facial expressions is impervious to the symbolic features of the stimuli).
Facial expressions must be described in terms of a two-value logic that distinguishes between true and false expressions (the referential value of facial expressions is not fuzzy, and does not depend on the context).
To these six assumptions can be added a seventh, following by deduction from the last one:
Facial expressions are universal. True facial “expressions of emotion” would be a “hard,” fixed pattern of behavior cast from the parents’ genes that can be isolated in humans, irrespective of their age, gender, or cultural background, across situations. The belief that “true” smiles leak any individual’s happiness from cradle to grave is one example of this assumption.
A “New Look” at the Study of Facial Expression
The only potentially solid empirical grounds for this minimalist view of facial expressions (as bidimensional, static, instantaneous, self-contained, crisp, and universal signals) would consist in showing that such signals are the smallest and briefest amount of consistent information about the emotional state of the sender. This hypothesis has been systematically tested through recognition studies, that is, with a focus on the receiver rather than on the sender of the expression. As Nelson and Russell discuss in this special section (2013; see also Russell, 1994), practically all of these studies consisted in asking participants to verbally categorize carefully posed expressions. Ironically, these posed expressions constructed by researchers, following their own commonsense assumptions, have been considered not just true but normative, while deliberate expressions produced by participants have been considered unreliable and unworthy of any test. A research program that sets out to test the informational value of the sender’s facial expressions should be based mainly on the senders’ actual expressions, rather than on the capacity of the receiver to decode artificial stimuli. Even if such capacities were confirmed for posed artificial expressions, these findings would not necessarily confirm the value of expressions as natural signals of emotion. Showing that a capacity, process, or behavior can exist in all human beings does not mean that it is actually functional and accessible in natural circumstances. All humans, with the proper training, can understand basic Western arithmetic, but arithmetic is not a naturally given way of dealing with quantity (Norenzayan & Heine, 2005). In fact, the conclusions of studies on the recognition of actual, natural expressions (e.g., Naab & Russell, 2007) are far from confirming the aforementioned minimalist hypothesis.
Reisenzein, Studtmann, and Horstmann (2013) and Fernández-Dols and Crivelli (2013) provide additional evidence that helps to explain such inconclusiveness: Experimental and field studies do not confirm the existence of static, instantaneous, self-contained, crisp, and universal expressions of basic emotion.
Fortunately, the seven assumptions about facial expression described above are starting to be empirically reconsidered:
Researchers are beginning to emphasize that looking at a face is an active process that must take into account the relative position of sender and receiver in a spatial location. Atkinson and Smithson’s (2013) and Rigato and Farroni’s (2013) articles present two examples of this approach, which opens the way to the consideration of gaze and relative position of the target facial expression as key factors in facial behavior and its corresponding neural processes.
Facial behavior urgently requires a dynamic approach, and the development of such an approach is, fortunately, already under way, with some promising initial work. Krumhuber, Kappas, and Manstead (2013) review such studies, which herald a new era in the design and selection of expressions as stimuli; this in turn raises a number of fascinating questions about the concept of expression itself (e.g., facial muscles do not move synchronously into a static outcome, as suggested by the icons of facial expressions).
Subtle or isolated muscular movements may constitute an embodiment of different cognitive and affective processes, and taking this into account will lead to much more sophisticated views of facial expression. Scherer, Mortillaro, and Mehu (2013; see also Scherer, Clark-Polner, & Mortillaro, 2011) discuss a conceptual and empirical approach to more inclusive views of facial behavior and emotion. In addition, the sequence and interaction of facial behaviors can play a substantive role in natural emotional displays. Waller and Micheletta (2013) discuss how observational and anatomical studies can contribute to explaining the causes and functions of natural repertoires of facial motor behavior.
Context is receiving increasing attention in different laboratories. This special section includes a review by Hassin, Aviezer, and Bentin (2013) focused on the contextual factors that accompany facial behavior and its interpretation. Lindquist and Gendron (2013), for their part, summarize ongoing research on the important role of the symbolic context in the perception of facial behavior, and Widen (2013) describes the essential connections between the recognition of facial expressions and the development of the semantic categories of emotion in children.
New research approaches are emerging on the ways in which facial behavior is processed and elicited by the human brain. Whalen et al. (2013) review the role of some brain regions, such as the amygdala and the prefrontal cortex, in the processing of facial stimuli. The role of such structures seems closer to adaptive context-dependent learning than to mere encapsulated domain-specific adaptations (see also Atkinson & Smithson, 2013; Lindquist & Gendron, 2013; Rigato & Farroni, 2013). In the same vein, Fugate (2013) questions traditional approaches to the categorization of facial expressions as an outcome of predetermined, modular brain structures.
Based on the available evidence from field studies, Fernández-Dols and Crivelli (2013) propose an alternative view of facial expressions as adaptive behaviors with flexible, context-dependent referential values.
Finally, the concept of universality and its limitations are discussed by Elfenbein (2013) in the framework of a linguistic metaphor that emphasizes the existence of expressive dialects, and by Nelson and Russell (2013), who discuss the claims of universality of expressions based on classic recognition studies.
Balancing Top–Down and Bottom–Up Strategies
Of course, the authors mentioned in the previous section do not necessarily share the criticisms expressed here, nor are they involved in a unitary program; but their contributions are certainly paving the way for a new approach to facial behavior, one more likely to strike an appropriate balance between top–down and bottom–up views on the study of facial expression.
Rather than being the consequence of an explicit return to observational, accumulative records, the appearance of new balanced approaches to the study of facial expression is, on the bottom–up side, the consequence of some important changes in everyday assumptions and technical resources. New technical resources are helping researchers to break with assumptions based on resources from the past.
The spread of new forms of computer-based icons has consequences for the approach to expressions. Static 19th- and 20th-century icons are being replaced by video clips or virtual avatars that include movement in their representations. While Darwin’s book (1872/1965) on facial expression was pioneering in the use of scientific photography, some of the work reported in this special section is equally pioneering in its incorporation of dynamic icons as standards for the description of facial expression (see Krumhuber et al., 2013; Scherer et al., 2013). Furthermore, the omnipresence of cameras and computers in everyday life increasingly provides researchers with opportunities to design experimental or field studies in which the presence of cameras or computers is part of the senders’ daily life and can unobtrusively record “natural” events (see Fernández-Dols & Crivelli, 2013; Reisenzein et al., 2013). Finally, the progressive development of software capable of simulating facial movement in great detail and of automatically coding facial movements should offer researchers a less restrictive view of the empirical and conceptual boundaries of the technical concept of facial expression. In summary, new technical devices have made researchers more aware of important sources of variation in natural emotion episodes, while enabling them to circumvent the costly traditional observational methods.
The consequences of this new technical landscape are starting to become visible thanks to the parallel development of alternative theoretical approaches to the study of emotion. The basic-emotion approach practically monopolized the study of facial expression during the 20th century. In the early years of the new century, other theories are approaching facial expression in a progressively bolder way. Appraisal theorists have overcome the customary theoretical contradiction which assumed that emotions were unlimited combinations of context-dependent affective and cognitive processes, but with a limited number of expressions. The outcome of this conceptual shift has been the emergence of analytic approaches that break facial behavior down into components related to specific appraisals or the action tendencies associated with such appraisals (Frijda & Tcherkassof, 1997; Scherer & Ellgring, 2007; Smith & Scott, 1997).
Other theories have incorporated the concept of emotion into a much broader and more complex process, whereby core affect triggers a series of processes and behaviors in which “emotion” is just an epiphenomenon (Russell, 2003), and “expression of emotion” is one of the several potential behavioral strategies included in an indeterminate number of more or less typical emotional episodes.
This combination of greater descriptive finesse and theories that emphasize the counterintuitive complexities of the causes and functions of emotion can help restore a balance between the top–down and bottom–up strategies. The principal lesson of this “new look” would be, in my view, that “expression of emotion” is a commonsense term that conceals the scientific challenge posed by a continuous flow of muscular movements from bodies moving in a three-dimensional world which produces events with flexible and context-dependent meanings.
The study of “facial expressions” should give way to the study of the facial muscular movements related, for example, to core affect, unexplained emotion, affect regulation, appraisals, action tendencies, or motives around an emotional episode. It is time to study plural, and potentially parallel, systems of facial behavior linked to different processes. Such systems are embedded in concrete situations, and not in abstract, monolithic, immutable entities called basic emotions.
Two contemporaries of Darwin, Edwin Abbott and Lewis Carroll, published popular books, namely, Flatland: A Romance of Many Dimensions (1884/2010) and Alice’s Adventures in Wonderland (1865/2008), respectively. Flatland is a world in two dimensions in which the idea of a third dimension is forbidden. In Wonderland, heads appear without their bodies and grins exist without their heads. For more than a century after The Expression of the Emotions in Man and Animals (1872/1965), the facial stimuli used for studying the perception of expression came from a Flatland populated by disembodied, immobile, two-dimensional beings, while experiments on the production of expressions occurred in a sort of Wonderland where researchers studied the Cheshire cat’s grin without the Cheshire cat, looking at facial behavior removed from the natural context in which it is performed. Maybe it is time to leave those worlds.
Footnotes
Author note:
This article was funded by the Spanish Government (Grant PSI 2011-28720).
