Abstract
An ongoing challenge facing emotion researchers is finding appropriate measurement tools. Many of our theories focus on emotion in the context of dynamic interaction, yet many of our most relied-upon measures either interrupt or alter interaction. New research suggests that infrared thermography may be useful as a nonintrusive way to measure emotion. Here we discuss the viability of thermography for studying emotion response and advancing emotion theory.
There is a rich tradition of theoretically driven research in the sociology of emotions (see Clay-Warner & Robinson, 2008). The value of any theory, however, is limited by the tools available to test it. Although recent advances in neuroscience and bio-physiology have greatly increased the number of tools at our disposal, many of these tools have limitations that make them inappropriate for testing theories of emotion that emphasize interactional processes. As a result, emotion researchers continue to look for new measurement techniques.
One recent advance in emotion measurement is the use of infrared thermography. Infrared thermography unobtrusively measures emitted radiation, which can be used to calculate temperature. Emotion researchers are interested in the use of thermography because it permits precise measurement of facial skin temperature without the use of wires or electrodes. Research on the utility of infrared thermography for measuring emotion comes from a variety of disciplines. Psychologists have documented associations between changes in facial temperature and arousal in humans and nonhuman primates (e.g., Kuraoka & Nakamura, 2011). Engineers are using infrared thermography to help robots identify the emotions of their human interaction partners (e.g., Khan, Ward, & Ingleby, 2009). Sociologists, meanwhile, have focused on the potential of infrared thermography to measure dimensional aspects of emotion (Robinson et al., 2012).
Though there is considerable interest in the use of infrared thermography, it is not clear yet whether the technology will assist emotion researchers in their quest to test and refine theory. There is uncertainty, in part, because it is difficult to translate across the diverse disciplines that are using infrared thermography to measure emotion response. Researchers in these disciplines approach the study of emotion from different vantage points, utilize different research techniques, and use different language to describe emotion processes. As a result, it is not readily apparent what this combined body of literature says about the use of thermography for emotion measurement.
The goal of this article is to bring together these disparate literatures in order to assess the potential of thermographic imaging for measuring emotion. To orient this discussion, the article first summarizes the theoretical perspectives that guide contemporary emotion research in sociology. It then discusses the measurement issues relevant to these perspectives and the limitations of widely used measurement techniques. Next, the literature on infrared thermography is reviewed and limitations of this method are discussed.
Testing Contemporary Sociological Theories of Emotions
Most sociological definitions of emotion utilize Thoits’s (1989) four-component model. Thoits (1989) described emotions as being composed of situational cues, physiological changes, expressive gestures, and labels. Though sociological definitions of emotion typically include reference to physiological responses, none reduce emotion to bodily experience. Consequently, for purposes of testing sociological theories, measures of physiological responses can never be equated with emotion. Nonetheless, corporeal manifestations are core to how we define emotion, and measures of that component are rare—and potentially useful for advancing our understanding of emotions in social settings.
One of the most influential theories in the sociology of emotion, emotion management theory (Hochschild, 1983), describes the structuring of individuals’ efforts to comply with the rules of emotion culture. Rather than making predictions about specific emotions, this theory focuses on the distinctions between felt emotion and expressed emotion. The most common measurement approach is to use observations and narratives to identify examples of individuals describing their own distinctions between felt and expressed emotion (e.g., Lively, 2000).
Other structural theories make specific predictions about the emotional consequences of social conditions (Kemper, 1978; Lawler, 2001). Kemper’s (1978) power and status theory predicts how discrete emotions result from gains and losses in the dimensional aspects (power and status) of social interaction. In the affect theory of social exchange, social exchange within groups produces positively and negatively valenced global emotions, which become discrete, targeted emotions when applied to self, other, and the social group (Lawler, 2001). For example, positively valenced global emotions become pride when attributed to oneself, gratitude when attributed to others, and affective attachment when attributed to the group. Scheff’s theory of the deference-emotion system (1988) focuses on the discrete emotions of shame and pride.
Structural symbolic interactionist approaches in sociology make predictions about both discrete and dimensional emotion. Affect control theory (ACT; Heise & Thomas, 1989) indexes emotions and other elements of social interaction (identity labels, behaviors, social characteristics, social settings) along three dimensions: evaluation, potency, and activity. These dimensions roughly correspond to the valence, power, and arousal dimensions used in other theories of emotion. Evaluation reflects the goodness versus badness of the emotion. Potency refers to powerfulness, and activity indicates liveliness. Combining ratings on these dimensions produces a point in three-dimensional space that corresponds to discrete emotion labels. Identity theory also relies on both discrete and dimensional conceptualizations of emotion (Burke & Stets, 2009). Its central prediction is that identity confirmation leads to positive feelings, while identity disconfirmation leads to negative feelings. Research in this tradition also elaborates the identity control model to make predictions about specific, discrete emotions, such as anger, shame, guilt, and empathy.
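The mapping from a point in evaluation-potency-activity space to a discrete emotion label can be illustrated with a nearest-neighbor lookup. This is only a sketch of the general idea: the coordinates below are invented placeholders, not values from any affect control theory sentiment dictionary, and the function name `nearest_label` is hypothetical.

```python
import math

# Placeholder (evaluation, potency, activity) coordinates, invented for
# illustration only; they are NOT taken from any ACT sentiment dictionary.
HYPOTHETICAL_EPA = {
    "happy": (3.0, 2.0, 2.0),
    "angry": (-2.0, 2.0, 2.0),
    "sad": (-2.0, -2.0, -2.0),
    "calm": (2.0, 1.0, -2.0),
}


def nearest_label(epa, lexicon=HYPOTHETICAL_EPA):
    """Return the label whose EPA coordinates are closest (Euclidean) to `epa`."""
    return min(lexicon, key=lambda label: math.dist(epa, lexicon[label]))


# A fairly good, fairly powerful, somewhat lively rating profile.
print(nearest_label((2.5, 1.5, 1.0)))
```

With a real sentiment dictionary in place of the placeholder values, the same lookup would tie dimensional ratings to discrete labels in the way the theory describes.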
Wide-range testing of these contemporary theories of emotion requires refined measures of both discrete emotions and dimensions of emotion, though it is not necessary that any one tool measures all of these aspects. There are a number of measurement characteristics that are desirable whatever aspect of emotion is being measured (see Robinson, Rogalin, & Smith-Lovin, 2004). Ideal emotion measurement tools are nonreactive, nonintrusive, and able to capture dynamic aspects of emotion in interaction. Measurement reactivity is of particular concern, because a measurement tool that, itself, affects emotion requires the researcher to account for reactivity in interpretation of the data. Intrusiveness relates to reactivity, as intrusive measures are more likely to elicit an unintended emotional response than are nonintrusive ones. Intrusive measures also interfere with interaction, which is problematic for testing dynamic emotion processes. Finally, some measurement tools require either interrupting interaction for data collection or gathering retrospective accounts of emotion. Issues associated with widely used physiological measures are discussed below.
Physiological Measures of Emotion
Physiological measures of emotion tap into neurological, electrical, and vascular responses associated with emotions. As a result, they capture unfiltered reactions, reducing concern with response bias. These measures vary, however, in their reactivity and in their ability to capture emotions as they unfold in interactions, as discussed next.
Research documenting the relationship between emotion and brain activity has spurred the use of functional magnetic resonance imaging (fMRI) and electroencephalography (EEG). Functional magnetic resonance imaging identifies areas of heightened brain activity by measuring changes in blood flow and oxygen use, allowing researchers to identify neural activity associated with certain discrete emotions and also with emotion valence (Murphy, Nimmo-Smith, & Lawrence, 2003; Phan, Wager, Taylor, & Liberzon, 2002). Electroencephalography, in contrast, does not identify the location of brain activation but instead indicates when overall neural response occurs, making it useful in studies where temporality is important (Robinson et al., 2012). Both of these technologies have limitations for understanding emotion in interaction. Studies using fMRI require that participants enter an MRI tube, thus making face-to-face interaction impossible. Wired electrode caps employed in EEG limit movement and make conversation awkward but allow for more interaction than does fMRI. While EEG shows promise for measuring sentiments associated with linguistic stimuli, as would be useful for linguistically based dimensional theories such as affect control theory, it is not clear whether EEG captures responses to more traditional emotion stimuli.
Other physiological measures focus on peripheral nervous system responses. Several relevant measures are associated with cardiovascular activity. Increased blood pressure is associated with anger and also with general arousal and activity (see Cacioppo, Berntson, Larsen, Poehlmann, & Ito, 2000). Research on heart rate is less definitive, though some research finds that heart rate is associated with valence. A suppressed heart rate has also been associated with fear (see Robinson et al., 2004). Heart rate and heart rate variability can be measured with a finger plethysmograph, while blood pressure is typically measured with a cuff. These devices are simple to operate, but they limit movement, making them less useful for studying emotion during interaction. Cardiovascular activity can also be measured by a belt worn underneath clothing, allowing for unrestricted movement. The researcher must oversee the placement of the belt against bare skin, however, making it inappropriate for some populations.
Galvanic skin response is another potentially useful measure of autonomic nervous system activity. Galvanic skin response measures the ability of the skin to conduct electrical energy, which reflects activation of the sweat glands. Small finger transducers measure conductivity, making this a relatively unobtrusive measurement tool, though some transducers are sensitive to movement. Skin conductance increases in high-arousal situations, making it a useful measure of activation (Bradley & Lang, 2000; Markovsky, 1988). Skin conductance does not distinguish well between discrete emotions, nor does it measure valence or emotion intensity.
Finally, some emotion measurement tools are designed to detect emotion on the face. The most common of these are facial electromyography (fEMG) and the facial action coding system (FACS). Both of these tools operate under the assumption that emotions activate microexpressions that result in subtle changes in facial muscles. Facial electromyography measures the electrical activation that results from microexpressions. This involves attaching electrodes to the facial muscles of interest. The electrodes are then wired to the data collection unit. As a result, movement is limited, and the presence of electrodes makes naturalistic interaction difficult. Because facial electromyography measures the activation of muscles but not the intensity or direction of response, it is not useful for distinguishing between emotions that involve the same muscle groups. It may be useful, however, in detecting suppressed emotional expressions (Cacioppo, Petty, Losch, & Kim, 1986).
The FACS, in contrast, is an observational coding tool designed to document discrete emotion during the course of interaction (Ekman & Friesen, 1978). When used by certified coders, FACS has high rates of intercoder reliability. It is a very complex coding system, however, and certification can only be achieved after extensive training and practice. New research suggests, though, that machines may be taught to classify certain emotions using the FACS system, which would reduce the need for human coders (Du, Tao, & Martinez, 2014). FACS also requires adequate lighting and image resolution. Thus, when considering whether to use FACS, researchers must balance the time investment and image demands against the system’s ability to code emotions in ongoing interaction.
Infrared Thermography: Overview
Infrared thermography is a remote technique for measuring the amount of radiation that an object emits within the infrared electromagnetic spectrum. Just as standard cameras measure the electromagnetic radiation reflected in the visible spectrum (.4–.7 µm), infrared cameras measure the radiation in the infrared band (.7–14.0 µm). The amount of radiation emitted is determined by an object’s temperature and its emissivity (ε), which is its ability to emit radiation relative to a perfect emitter. An infrared camera gathers data on three components of radiation: emission from the object, the reflected emission from ambient sources, and emission from the atmosphere. Emissivity also determines how much ambient radiation an object reflects: the higher the emissivity, the less radiation is reflected. The emissivity of human skin is .98, which means that skin reflects only minimal radiation from surrounding objects. Because moisture affects atmospheric emissions, ambient humidity is also included in the calculation of emitted radiation. Converting measures of radiation emitted by an object into measures of temperature also requires knowledge of the atmospheric temperature and the distance between the infrared camera and the object (see Robinson et al., 2012).
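The three radiation components can be combined in a simple radiometric model. The following is a minimal sketch, not the calibration used by any particular camera. It assumes that total radiation follows the Stefan-Boltzmann law, whereas real cameras integrate over a limited infrared band. It also treats atmospheric transmittance (`tau`, which in practice depends on distance and humidity) as a given input. All function and parameter names are illustrative.

```python
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W m^-2 K^-4


def blackbody_power(temp_k):
    """Total power radiated per unit area by a perfect emitter at temp_k (kelvin)."""
    return SIGMA * temp_k ** 4


def measured_power(t_obj_k, t_refl_k, t_atm_k, emissivity=0.98, tau=1.0):
    """Radiation reaching the camera: object emission + reflected ambient + atmosphere."""
    emitted = emissivity * tau * blackbody_power(t_obj_k)
    reflected = (1.0 - emissivity) * tau * blackbody_power(t_refl_k)
    atmosphere = (1.0 - tau) * blackbody_power(t_atm_k)
    return emitted + reflected + atmosphere


def object_temperature(w_total, t_refl_k, t_atm_k, emissivity=0.98, tau=1.0):
    """Invert measured_power to recover the object's temperature in kelvin."""
    reflected = (1.0 - emissivity) * tau * blackbody_power(t_refl_k)
    atmosphere = (1.0 - tau) * blackbody_power(t_atm_k)
    w_obj = (w_total - reflected - atmosphere) / (emissivity * tau)
    return (w_obj / SIGMA) ** 0.25


# Round trip: skin at 34 °C in a 22 °C room, with an assumed transmittance of 0.97.
w = measured_power(307.15, 295.15, 295.15, emissivity=0.98, tau=0.97)
print(object_temperature(w, 295.15, 295.15, emissivity=0.98, tau=0.97) - 273.15)
```

Because the emissivity of skin is so close to 1, the reflected term contributes little, which is one reason facial skin is a convenient target for this kind of measurement.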
Infrared cameras produce images that reflect temperature variation (see Figure 1). Areas of intense heat are lighter in color than cooler areas. The infrared camera also records the temperature of each pixel within the image, which is important for emotion research. These numerical values can be used to calculate the overall temperature of areas of the face believed to be associated with particular emotions. These areas are termed “regions of interest” (ROIs). Typical ROIs are eyes, cheeks, forehead, nasal tip, and mouth.

Figure 1. Infrared image with marked regions of interest.
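Calculating the overall temperature of a region of interest from per-pixel values can be sketched as follows. The sketch assumes that a thermal frame is available as a grid of per-pixel temperatures in degrees Celsius and that each region is a rectangle. In practice regions are often hand drawn or tracked by software, and the function names and pixel values here are hypothetical.

```python
def roi_mean_temperature(frame, roi):
    """Mean temperature inside a rectangular region of interest.

    frame: list of rows, each a list of per-pixel temperatures (°C).
    roi: (top, left, bottom, right) pixel bounds; bottom and right are exclusive.
    """
    top, left, bottom, right = roi
    values = [frame[r][c] for r in range(top, bottom) for c in range(left, right)]
    if not values:
        raise ValueError("region of interest contains no pixels")
    return sum(values) / len(values)


def roi_change(baseline_frame, response_frame, roi):
    """Change in mean ROI temperature from a baseline frame to a later frame."""
    return roi_mean_temperature(response_frame, roi) - roi_mean_temperature(baseline_frame, roi)


# Made-up 3 x 4 frames; the "nasal tip" rectangle covers the middle two columns.
baseline = [
    [34.0, 35.0, 35.0, 34.0],
    [34.0, 36.0, 36.0, 34.0],
    [34.0, 35.0, 35.0, 34.0],
]
response = [
    [34.0, 34.0, 34.0, 34.0],
    [34.0, 35.0, 35.0, 34.0],
    [34.0, 34.0, 34.0, 34.0],
]
nasal_tip = (0, 1, 3, 3)
print(roi_change(baseline, response, nasal_tip))
```

Working with change scores relative to a baseline frame, rather than raw temperatures, is a design choice in this sketch. It keeps stable individual differences in skin temperature from being read as emotional response.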
Infrared thermography unobtrusively measures localized temperature and has been used to identify facial heat patterns associated with emotional response (e.g., Robinson et al., 2012). Using patterns of heat on the face to measure emotions follows the same underlying assumptions as does the use of FACS and fEMG. All of these approaches assume that emotions evoke activation of facial muscles. The goal of FACS is to record these microexpressions along a number of emotion-related dimensions. Facial electromyography records the electrical energy associated with muscle activation. Infrared thermography, in contrast, measures the changes in temperature associated with changes in blood flow to the activated muscles. Jarlier et al. (2011) confirmed that the thermal patterns detected by thermography are consistent with the activation of specific “action units” in the FACS, which suggests concordance between these measurement techniques.
Though infrared thermography was used in emotion research in the 1980s (Mizukami, Kobayashi, Iwata, & Ishii, 1987; Zajonc, Murphy, & Inglehart, 1989), the technology did not attract a wide following. There was renewed interest in thermography following research documenting an association between deception and increases in temperature around the eyes (Pavlidis, Eberhardt, & Levine, 2002). Pavlidis et al. (2002) argued that the temperature change was caused by an increase in blood flow to the eyes, which is a physiological byproduct of the fight/flight response activated by deceit. Through thermographic imaging of the eyes, the researchers were able to distinguish between lying and truth-telling at rates significantly better than traditional lie detectors. Expanding on this research, Puri, Olson, Pavlidis, Levine, and Starren (2005) documented that frustration resulted in warming of the forehead. Thermography has since been used to measure emotion in a number of research domains.
Application of Thermography to Emotion Classification
There is a growing literature documenting that robotic devices can be trained to use facial thermography to classify the emotions of human interaction partners. In particular, research has shown potential for using machine-based learning to classify the valence (positivity vs. negativity) of expressed emotions (Khan, Ingleby, & Ward, 2006). Nhan and Chau (2010) found that thermography could distinguish not only between positive and negative emotions but also between self-reported high and low arousal. They recorded success rates above 80% for classifying high arousal versus baseline and high valence versus baseline. Success rates for classification of high versus low arousal and high versus low valence were much lower, in the 50–60% range. Khan et al. (2009) used similar methods to classify discrete emotions. These researchers noted the difficulty, however, in distinguishing between certain discrete emotions, such as happiness and sadness, perhaps due to the involvement of similar muscle groups.
Though the research discussed above suggests the potential for using infrared thermography in testing theories of emotion, the nature of machine-based learning limits its application. The goal of this type of research is to use large amounts of data (e.g., many temperature readings) to teach machines to classify emotions. To accomplish this, researchers create algorithms that identify which groups of temperature readings provide the most distinguishing information and produce equations that use the relevant information to predict emotion. The prediction equations are not typically included in the published research, however. Thus, there is limited information regarding what specific facial temperature patterns resulted in the classification of particular discrete emotions or dimensions of emotion. Nonetheless, the fact that successful classification algorithms have been developed using measures of facial temperature strongly suggests that emotion responses manifest as heat signatures on the face. To be useful for testing sociological theory, however, research must determine how temperature changes in specific regions of the face correspond to emotion dimensions and/or discrete emotions.
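The general logic of such classifiers can be illustrated with a deliberately simple stand-in. The sketch below ranks features by the gap between class means and classifies new observations by the nearest class centroid. It is not the algorithm used in any of the studies cited above, and the feature values (temperature changes for a forehead and a nasal region) are made up for illustration.

```python
def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def feature_separation(class_a, class_b):
    """Absolute gap between class means per feature, a crude ranking of usefulness."""
    return [abs(a - b) for a, b in zip(centroid(class_a), centroid(class_b))]


def train(samples, labels):
    """Learn one centroid per label from labelled feature vectors."""
    groups = {}
    for x, y in zip(samples, labels):
        groups.setdefault(y, []).append(x)
    return {y: centroid(rows) for y, rows in groups.items()}


def predict(model, x):
    """Assign x to the label with the closest centroid (squared Euclidean distance)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(model, key=lambda label: sq_dist(model[label]))


# Hypothetical temperature changes in °C: [forehead, nasal region].
aroused = [[0.1, -0.6], [0.0, -0.4]]
baseline_obs = [[0.0, 0.0], [0.1, 0.1]]
model = train(aroused + baseline_obs,
              ["high_arousal"] * 2 + ["baseline"] * 2)
print(feature_separation(aroused, baseline_obs))
print(predict(model, [0.0, -0.45]))
```

Unlike the unpublished prediction equations discussed above, a model this small can be inspected directly: the centroids show which regions warm or cool for each class, which is the kind of information that theory testing requires.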
Thermal Signatures and Emotion Response
Research using facial thermography in nonhuman primates suggests that the arousal dimension of emotion may be reflected in temperature changes in the nasal area. Like research on the facial heat signature of emotions in humans, this research assumes the functionality of universal emotions both to communicate emotions to others and to prepare the body physically for response. The high arousal state associated with fear, in particular, has wide cross-species functionality, and primatologists have emphasized the importance of the nasal area in detecting fear-based arousal in response to threat. Nakayama, Goto, Kuraoka, and Nakamura (2005) documented a decline in nasal temperature among rhesus monkeys exposed to threat. The upper nasal area showed a temperature decline as soon as 10 seconds after presentation of stimuli, with the cooling pattern spreading across the remainder of the nasal region. The monkeys displayed facial expressions consistent with negative affect, further suggesting the association between the temperature pattern and emotion response. Responding to this research, Kuraoka and Nakamura (2011) constructed a three-condition study in which macaque monkeys responded to presentation of videos of unknown monkeys behaving in directly threatening, indirectly threatening, or neutral ways. Results supported the interpretation of the earlier study, with nasal temperatures showing a significantly greater decline in the threat conditions compared to the neutral condition.
A parallel line of inquiry finds emotion scholars using facial thermography to document emotional arousal in humans. Consistent with primate research, Pavlidis et al. (2012) found that the temperature of the nasal area was an indicator of arousal in humans, with perspiration in high arousal situations resulting in cooling of the area around the nose. In a study of mixed-sex dyads, Hahn, Whitehead, Albrecht, Lefevre, and Perrett (2012) found that overall facial temperature increased following social contact with the opposite sex partner. Self-reported sexual arousal showed a marginally significant relationship with temperature change in the eyes, echoing Pavlidis et al.’s (2002) earlier findings.
Infrared thermography has also been used to detect discrete emotions in humans, though this research does not distinguish clearly between discrete emotions and dimensions of emotion. Ioannou et al. (2013) focused on moral emotions in examining the thermal signature of guilt in young children. Guilt was induced through the “mishap paradigm” in which children were led to believe they had broken the experimenter’s favorite toy. Temperature of the nasal area declined following the “mishap” and increased after soothing. Lack of a control condition, however, makes it impossible to conclude that the temperature change was a result of guilt. Instead, consistent with other research (e.g., Kuraoka & Nakamura, 2011; Pavlidis et al., 2012) the cooling may have been an indication of arousal. Similarly, the temperature of the nasal area in infants has been shown to decline following laughter (Nakanishi & Imai-Matsumura, 2008). Though this effect was interpreted as “joy,” the congruity between this heat pattern and the patterns documented by Kuraoka and Nakamura (2011) and Pavlidis et al. (2012) suggests that the decline in nasal temperature was most likely an indication of arousal.
In one of the most comprehensive studies of facial thermography, Robinson et al. (2012) incorporated self-report measures of identity impressions (evaluation, potency, and activity) and of positive emotion along with a measure of deflection in multivariate analyses of temperature changes in the forehead, eyes, and cheeks. Participants delivered a speech and then received either negative or positive (false) feedback. Infrared images were recorded, and participants completed a self-report measure of emotion. The forehead and cheek responded similarly, with the temperature of both regions negatively associated with positive emotion and potency. In contrast, eye warming was significantly related to deflection, which again points to eye warming as an indication of arousal. These findings suggest that infrared thermography may be useful for detecting a wide array of emotion responses.
Though Robinson et al. (2012) concluded that there was considerable promise in thermography as an emotion measurement tool, they cautioned that researchers using thermography must address several practical issues. For example, facial hair, bangs/fringe, eyeglasses, and hats interfere with accuracy, as accurate temperature measures require unobstructed views of facial skin. Temperature measures are also sensitive to motion and to the distance between camera and subject, though these issues can be addressed with tracking software (Zhou, Tsiamyrtzis, & Pavlidis, 2009). Measures of ambient temperature and humidity are used in the calculation of temperature and so must be recorded and kept relatively constant during imaging. The need for constant temperature and humidity requires careful control of the experimental setting and complicates the use of thermography in naturalistic settings. Finally, the technology is not difficult to use, but it does require either hand drawing of regions of interest or tracking software to automate this process.
Conclusion
The appeal of infrared thermography is that it is nonreactive and noninvasive. Unlike fEMG and EEG, infrared thermography does not utilize wires or electrodes attached to the skin, both of which can make interaction awkward and uncomfortable. Unlike fMRI, it does not involve restraints or the physical separation of research participants. As a result, data gathered by infrared thermography are much less likely to be affected by the measurement itself than are data gathered by these methods. Thermography also does not require interruption of interaction, making it ideally suited for measuring emotion in dyads and groups. Finally, infrared thermography does not require complex coding protocols and can be used in any lighting condition, making it more accessible in some ways than FACS. In addition, a recent study compared the utility of automated thermal analysis and automated FACS coding and found that thermal analysis alone outperformed FACS coding alone, although emotion classification was improved when the two methods were combined (Wesley, Buddharaju, Pienta, & Pavlidis, 2012).
Although these advantages are significant, there are limitations. Valid temperature measurement requires a controlled setting, which may make infrared thermography inappropriate for studying emotion outside the laboratory. The technology is also still in development. For example, a remaining challenge is to determine whether heat patterns reflect discrete emotions, dimensional responses, or both. Several studies have found that thermography is useful in measuring emotional arousal (e.g., Pavlidis et al., 2002), and one study found facial temperature patterns to be associated with the potency dimension (Robinson et al., 2012). Other research claims that thermography can identify certain discrete emotions, though the results of these studies are open to interpretation (e.g., Nakanishi & Imai-Matsumura, 2008). Additional experimental research is needed to determine whether thermography is a useful measure only of certain dimensions of emotion or if it also discriminates between discrete emotions.
Though it seems unlikely that infrared thermography will soon replace more traditional emotion measures, there is great potential for this technology to complement existing measurement tools. In particular, combining infrared data with data gathered through self-report and/or other physiological tools may improve our ability to detect emotion response. Use of multiple devices to measure different aspects of emotion response in a single study is common. Adding thermal imagery requires minimal changes to existing research protocols, can be used alongside existing machinery, and does not affect other physiological or self-report data. Incorporating thermography with other emotion measures also allows for comparisons between measures, which facilitates a fuller understanding of the relative value of each approach.
Footnotes
Author note:
This research was supported by National Science Foundation Grant #BCS-0729396.
