Universality Revisited

Abstract

Evidence does not support the claim that observers universally recognize basic emotions from signals on the face. The percentage of observers who matched the face with the predicted emotion (matching score) is not universal, but varies with culture and language. Matching scores are also inflated by the commonly used methods: within-subject design; posed, exaggerated facial expressions (devoid of context); multiple examples of each type of expression; and a response format that funnels a variety of interpretations into one word specified by the experimenter. Without these methodological aids, matching scores are modest and subject to various explanations.

Keywords

culture emotion facial expression universality

Let us define the universality thesis as the claim that certain human facial expressions are signals of specific basic emotions—such as happiness, surprise, fear, anger, disgust, and sadness—signals universally recognized by human beings whatever their cultural background or spoken language. Reviewers of data relevant to this claim have gone back and forth. Early reviewers were critical, Izard (1971) and Ekman, Friesen, and Ellsworth (1972) supportive, Russell (1994) critical. We examine evidence published since Russell’s (1994) review.

Both proponents and critics of the universality thesis agree that human beings derive information relevant to emotion from the faces of others. The question is: What evidence would support the universality thesis over alternative accounts of what information the face conveys? To support the universality thesis specifically, matching scores (the percentage of observers matching the face to the specific predicted emotion) must be (a) high and (b) universal, that is, reasonably invariant across culture, language, method, and so on. As proponents Haidt and Keltner (1999, p. 238) wrote, “If the universality thesis is correct then those expressions that are pancultural should elicit very high recognition rates, generally in the 70–90% range . . . even when methodological constraints are relaxed.”

We searched PsychInfo for all peer-reviewed journal articles published between 1992 and 2010 that included the words emotion* and cultur* in the title, abstract, or keywords. From the resulting 6,566 entries, we selected all studies that were not reviewed by Russell (1994), that included a cross-cultural comparison, that had nonclinical adult samples, and that examined recognition of adult facial expressions that met criteria specified by Ekman and Friesen (1978). (This last criterion excluded studies on facial dialects in which individuals posed cultural-specific expressions; see Elfenbein, 2013.) The result was 21 studies that provided 57 sets of data.

Cross-Cultural Judgment Studies

Of the 57 data sets that met our criteria, 39 had a similar enough method to allow statistical comparisons and are listed in the Appendix. The remaining 18 sets are discussed in later sections. The 39 sets used what we call the standard method: at least four of the hypothesized basic emotions, posed facial expressions, a forced-choice response format, and a within-subjects design.

Table 1 compares the newer data sets seen in the Appendix with the older ones reviewed by Russell (1994). Newer studies yielded similar but slightly lower matching scores. This change is difficult to explain. Newer studies were carried out by new experimenters with new photographs of perhaps less intense expressions (e.g., Beaupré & Hess, 2005). Or perhaps the greater amount of data now simply provides a better estimate of matching scores.

Table 1.

Average matching scores for Western literate, non-Western literate, and isolated illiterate samples based on the standard method

	Facial expression
	Happiness	Sadness	Anger	Fear	Surprise	Disgust	Mean
	Literate, Western cultures
Pre-1994	95.1	86.5	79.1	77.3	87.4	81.1	84.4
1992–2010	88.6	74.2	75.2	68.9	81.3	64.7	75.5
	Literate, non-Western cultures
Pre-1994	87.1	75.1	61.8	63.2	77.7	69.0	72.3
1992–2010	90.7	67.1	60.8	46.3	86.0	51.6	67.1
	Illiterate, isolated cultures
Pre-1994	92.0	53.5	56.0	46.0	36.0*	29.0*	52.1
1992–2010	84.0	51.0	33.0	30.0	58.0	44.0	50.0
Overall Mean	89.6	67.9	61.0	55.3	71.1	56.6

Note: Matching score is the percentage of observers who selected the predicted label. Pre-1994 refers to articles reviewed by Russell (1994); 1992–2010 refers to articles summarized in the Appendix. Average is a mean, except matching scores for the pre-1994 illiterate isolated samples are sometimes medians (marked with an ^*) because some sample values were not provided in the primary source.

In the newer studies (those listed in the Appendix), matching scores varied with culture: For each emotion, Figure 1 shows matching scores for Western literate, non-Western literate, and isolated illiterate observers. Reliability of the large differences seen in Figure 1 was verified in several ways. The nine studies from the Appendix that included both Western and non-Western literate observers are listed in Table 2. In all nine studies, mean matching score was higher in the Western than non-Western literate sample. “Happiness” and “surprise” expressions showed no reliable cultural difference (with a two-tailed sign test, all ps > .18), but for “sadness,” “anger,” “disgust,” and “fear” expressions, non-Western matching scores were reliably lower (all ps < .01).

Figure 1.

Matching scores are the percentage of observers who selected the predicted label of studies in the Appendix (i.e., studies published between 1992 and 2010) for six facial expressions. Error bars represent standard error, except for the illiterate isolated sample, for which there is but one sample.

Table 2.

Literate Western (bold) versus literate non-Western comparisons of matching scores

Study	N	Facial expression
Society/Culture		Happiness	Sadness	Anger	Fear	Surprise	Disgust	Mean
Beaupré and Hess (2005)
Canadian	20	67.0	56.0	50.0	33.0	–	56.0	52.4
Chinese	20	71.0	38.0	45.0	28.0	–	49.0	46.2
Sub-Saharan African	20	76.0	29.0	49.0	27.0	–	46.0	45.4
Biehl, et al. (1997)
Polish	75	98.0	88.0	85.0	69.0	89.0	83.0	85.3
Hungarian	45	98.0	83.0	87.0	74.0	94.0	84.0	86.7
Sumatran	32	99.0	81.0	79.0	57.0	89.0	76.0	80.2
Vietnamese	34	99.0	80.0	81.0	67.0	92.0	58.0	79.5
Elfenbein, Beaupré, Lévesque, and Hess (2007)
Canadian	20	88.0	71.0	72.0	68.0	–	62.0	72.2
Gabonese	20	80.0	41.0	63.0	50.0	–	32.0	53.2
Elfenbein, Mandal, Ambady, Harizuka, and Kumar (2004)
American	50	93.5	54.0	79.0	77.0	82.0	34.5	70.0
Indian	50	99.0	39.5	65.0	75.0	78.0	53.5	68.3
Japanese	50	98.5	68.5	51.5	49.5	92.5	20.0	63.4
Jack, Blais, Scheepers, Schyns, and Caldara (2009)
European Caucasian	13	99.0	89.0	89.0	88.0	84.0	88.0	89.5
East Asian	13	99.0	90.0	80.0	64.0	91.0	69.0	82.2
Matsumoto et al. (2002)
American	138	93.3	65.7	52.3	–	78.7	–	72.5
Japanese	137	89.4	47.8	54.7	–	88.6	–	70.1
Tcherkasoff and de Suremain (2005)
French	96	97.0	83.0	88.0	85.0	–	–	88.3
Burkinabe	96	94.0	62.0	53.0	65.0	–	–	68.5
Wickline, Bailey, and Nowicki (2009)
American	72	93.0	77.0	65.0	74.0	–	–	77.3
American	102	89.0	82.0	80.0	78.0	–	–	82.3
European	30	84.0	85.0	71.0	70.8	–	–	77.7
African	30	79.0	70.0	50.0	76.0	–	–	68.8
Yik and Russell (1999)
Canadian	30	100.0	86.7	73.3	66.7	76.7	56.7	76.7
Chinese	30	100.0	76.7	46.7	23.3	83.3	56.7	64.5
Japanese	30	100.0	80.0	56.7	53.3	93.3	30.0	68.9
Mean Western		92.2	73.8	73.6	71.5	83.2	64.7	76.9
Mean non-Western		90.4	63.7	59.1	50.9	90.0	48.5	65.9

Note: Matching score is the percentage of observers who selected the predicted label.

We next updated Russell’s (1994) statistical comparison of Western literate versus non-Western literate samples. Matching scores from 54 samples (those reviewed by Russell together with those listed in the Appendix) were analyzed with a mixed-design two-factor analysis of variance. Each sample was treated as a single case (n = 31 for Western cultures and n = 23 for non-Western cultures). Culture (Western or non-Western) was a between-subjects factor. Type of expression (“happiness,” “surprise,” “sadness,” “anger,” “fear,” or “disgust”) was a within-subject factor. Greenhouse-Geisser corrections for lack of sphericity were used where needed. The effect of culture was significant, F(1, 52) = 14.90, p < .001, as was the effect of type of expression, F(5, 245) = 46.33, p < .001. The Culture × Expression interaction was also significant, F(5, 245) = 8.01, p < .001. A simple effects analysis pursued the interaction effect. When experiment-wise alpha level was determined by Bonferroni’s method, there were no significant effects of culture for “happiness,” “surprise,” and “sadness” expressions, but Western observers’ matching scores were higher than were non-Western observers’ scores for “anger,” F(1, 52) = 25.05, p < .001, “fear,” F(1, 52) = 28.39, p < .001, and “disgust,” F(1, 52) = 12.02, p = .001. In short, the pattern reported by Russell (1994) remained intact with the addition of new data: Matching scores varied reliably with culture for three of six expressions.

Because our search found but one study of isolated, illiterate observers, that sample was not included in the statistical comparisons reported above. Still, as emphasized by Ekman (1980), such studies are a necessary test of the universality thesis. Table 3 provides matching scores from the three studies in the earlier review plus those from the Tracy and Robins (2008) study, in which high matching scores for the “happiness” expression (84%) showed that these observers understood the task presented to them. Matching scores for all other expressions were lower: M = 43.2%.

Table 3.

Matching scores from isolated illiterate observers, based on the standard method

Society/Culture	N	Language tested in	Facial expression
			Happiness	Sadness	Anger	Fear	Surprise	Disgust	Median
			Appendix
Burkinabe^a	39	Dioula	84.0	51.0	33.0	30.0	58.0	44.0	47.5
			Pre-1994
Fore^b	18	Neo-Melanesian Pidgin	99.0	55.0	56.0	46.0	38.0	29.0	50.0
Fore^b	14	Fore	82.0	N/A	50.0	54.0	<19.0^c	44.0	46.0
Sadong^b	15	Bidayuh	92.0	52.0	64.0	40.0	36.0	<23.0^c	47.5
Median			88.0	52.0	53.0	43.0	<37.0	<36.5

Note: Matching score is the percentage of observers who selected the predicted label.

Tracy and Robins (2008); ^bEkman, Sorenson, and Friesen (1969); ^cModal choice was another emotion, and no matching score was given, although it was less than the figure presented here.

Matching scores also varied with language. Of the studies listed in the Appendix in which we could determine the language used for testing, there were 29 samples tested in an Indo-European language (all 20 Western samples and nine of the 19 non-Western ones) and eight samples in a non-Indo-European language. Treating each expression as a separate case resulted in 204 data points. Matching scores were higher with an Indo-European language (158 data points; M = 73.2%) than without (46 data points; M = 63.4%), t(202) = 3.03, p = .002. Table 4 gives a more specific comparison of studies that tested observers from the same culture in more than one language—with the same result.

Table 4.

Differences due to language: Mean matching score of within-study comparisons of language

Society/CultureLanguage	Facial expression
Society/CultureLanguage	Happiness	Sadness	Anger	Fear	Surprise	Disgust	Mean
Indian^a
English	99.0	94.0	90.0	81.0	90.0	–	90.8
Hindi	99.0	90.0	89.0	74.0	90.0	–	88.4
Mexican^b
English	90.0	68.0	58.0	65.0	87.0	57.0	70.8
Spanish	78.0	62.0	58.0	56.0	77.0	41.0	62.0
Burkinabe^c
French	92.0	75.0	60.0	65.0	–	–	73.0
Moore	96.0	52.0	54.0	67.0	–	–	67.3
Palestinian^d
English	62.5	85.0	73.0	–	62.0	75.0	71.5
Arabic	66.5	65.0	53.0	–	44.0	88.0	63.3
Mean English	83.8	82.3	73.7	73.0	79.7	66.0	76.4
Mean non-English	86.3	68.8	62.8	65.5	70.3	64.5	69.7

Note: Matching score is the percentage of observers who selected the predicted label. Kayyal and Russell (2012) allowed observers to attribute more than one emotion to a single expression. The data presented here represent the percentage of observers who selected the target category as one of the emotions conveyed by the expression.

Matsumoto and Assar (1992); ^bMatsumoto, Angus-Wong, and Martinez (2008); ^cTcherkasoff and de Suremain (2005); ^dKayyal and Russell (2012).

The Standard Method Inflates Matching Scores

The matching scores discussed so far cannot be taken at face value, according to Russell’s (1994) argument that such scores are inflated by method factors. Each of several aspects of the standard method pushes scores higher by some amount. Combined, these method factors can produce highly exaggerated scores. The rebuttal that each individual method factor produces only a small effect is factually incorrect, and it misses the point that, even if each effect were small, their cumulative impact can be large. The magnitude of the matching scores as they change with method is thus a key issue. Here we consider recent evidence on three of those factors.

Posed Facial Expressions

The universality thesis is about emotions and facial expressions that occur spontaneously. However, the standard method uses expressions posed with the purpose of conveying a single emotion as clearly as possible and then selected from a larger set as the best at doing so. Our search turned up two studies of spontaneous expressions. Shown photos of spontaneous expressions predicted to convey a single basic emotion, British, American, and Japanese observers yielded matching scores ranging from 31% to 35% (Matsumoto, Olide, Schug, Willingham, & Callan, 2009). Similarly, shown photos of spontaneous expressions provided by Ekman (1980) to illustrate the universality thesis, American observers yielded a mean matching score of 35% (Naab & Russell, 2007). Both studies produced matching scores noticeably lower than the overall mean (70.9%) from studies in the Appendix, which used posed expressions. Use of posed expressions inflates matching scores relative to spontaneous expressions.

Presentation of Multiple Expressions

In the standard method, observers are shown more than one facial expression (an average of 47 for studies in the Appendix) through previewing, practice trials, or a within-subjects design. Principles of perception predict that observers judge one facial expression relative to others (Russell, 1991). Especially with a forced-choice response format, observers may come to realize which types of expressions are being presented and can then use a process of elimination to help make their judgments. Empirically, the commonly used within-subjects design results in higher matching scores than does a between-subjects design (Russell, 1994). Yik, Widen, and Russell (2012) found that high matching scores (M for three samples = 67.5%) for the “disgust expression” depended on having recently seen an “anger expression”; otherwise, the mean dropped to 35%. With the 20 data sets in the Appendix that included all six emotions, matching scores were higher for observers who had viewed more faces (30 or more; n = 10; M = 78%) than those who had viewed fewer (6–24 faces; n = 10; M = 67%), F(1, 18) = 9.026, p = .007.

Response Format

The standard method uses a forced-choice response format in which observers select one emotion from a short list provided by the experimenter. This format creates many problems. Here, we highlight three. First, the observer’s spontaneous interpretation of a facial expression is not limited to emotions. Observers reliably report social messages (Yik & Russell, 1999), incipient actions (Tcherkassof & de Suremain, 2005), and the antecedent situation (Widen, 2013) or its appraisal (Scherer & Grandjean, 2008). The standard forced-choice format channels all interpretations into an emotional one.

Second, when observers were allowed to attribute more than one emotion to an expression, they did so. In one study, they selected for each face, on average, four of the eight available emotions (Kayyal & Russell, 2012). Quantitative ratings similarly show that observers see multiple basic emotions in the same face, albeit to different degrees (Beaupré & Hess, 2005; Mandal, Bryden, & Bulman-Fleming, 1996; Tcherkassof & de Suremain, 2005). Observers reliably rate faces on broad bipolar dimensions such as pleasure–displeasure and arousal–sleepiness (Gao, Maurer, & Nishimura, 2010). When observers were allowed to provide any emotion label they wanted for a given face, they generated many (Haidt & Keltner, 1999; Rosenberg & Ekman, 1995). Thus, the forced-choice format shoehorns each observer’s full attribution of various emotions into one of a few preselected categories.

Third, with a forced-choice format, a majority of observers attribute qualitatively different basic emotions to the same face, depending on the options available (Russell, 1993). Frank and Stennett (2001) showed that with the word surprise omitted from the list in the traditional format, the “surprise expression” was called happy by 61% of observers. They concluded that the standard forced-choice format can “demand agreement” (2001, p. 83).

Russell (1994) noted that the traditional forced-choice format does not allow observers to select no emotion, an emotion not on the list, or more than one emotion. Frank and Stennett (2001) agreed that a “none” option was needed and found that 7.8% of observers chose “none” in response to posed expressions of hypothesized basic emotions—evidence that indeed some observers do not find any of the available options applicable. In the 39 sets of data in the Appendix, only seven provided a “none” option. Although adding “none” is useful, it is no panacea: The full range of each observer’s interpretation is still funneled into one category, although now including “none” as an additional category. The observer still cannot provide their own spontaneous interpretation of the expression, nor can they indicate that the expression conveys multiple emotions, an emotion not on The list, or nonemotional information. Thus, even with a “none” option, the forced-choice format funnels a range of responses into one option, thereby oversimplifying the observer’s response and inflating the matching score.

Do Humans Universally Recognize Basic Emotions from Facial Expressions?

Like the evidence before it (Russell, 1994), the evidence reported between 1992 and 2010 (listed in the Appendix) shows that for only one emotion—happiness—did a large percentage of observers (89.6%) match it to the predicted facial expression across cultures and languages. For negative emotions (sadness, anger, fear, and disgust), Western observers gave the highest matching scores (M = 71%). Non-Western but still literate observers gave lower scores (57%), and illiterate observers isolated from the West lower still (39%). Both non-Western groups failed to reach Haidt and Keltner’s (1999) 70–90% criterion. Furthermore, these observed matching scores, low as they are, were inflated by the method used to gather them.

Shariff and Tracy (2011, p. 407) defended the universality thesis, writing of “easily recognizable signals” even in “geographically and culturally isolated populations.” Against the objection that matching scores are low, vary with culture and language, and are inflated by method, they pointed to the statistical significance of matching scores: “we believe these findings should be held to the same accepted standard of statistical significance as other empirical results.” This “statistical-significance” defense rests on an elementary error of statistical inference. That the data reach “statistical significance” rules out the null hypothesis that observers matched expressions with emotions randomly, but does not rule out various substantive alternative hypotheses or the possibility of method artifacts. Ruling out the null hypothesis does not rule in any one interpretation of the data. There are many alternative interpretations of nonrandom responding; Russell (1994) listed eight, and there are undoubtedly others. These alternatives all predict that matching scores are statistically significant, and therefore statistical significance cannot decide among them.

Evidence does not support the claim that facial expressions are preinterpreted signals for specific basic emotions universally recognized by human beings. But neither are humans clueless when interpreting a face. When asked what emotion a face expresses, humans do not answer randomly; they figure out an answer. Faces might provide information on core affect, social messages, incipient actions, situations, or appraisals; and from that information, observers might figure out the expresser’s emotion. We see no use for further studies using the standard method to discover that humans are not random when matching expressions with emotions. Rather, we see the need for hypotheses and evidence on the specific processes involved when an observer interprets another’s facial expression.

Footnotes

Appendix

Matching scores from cross-cultural judgment studies, 1992 to 2010

Society/Culture	N	No. of expression sets	Language	Facial expression
Society/Culture	N	No. of expression sets	Language	Happiness	Sadness	Anger	Fear	Surprise	Disgust	Mean
	Literate, Western cultures (20 samples)
Canadian^a	20	12	French	67.0	56.0	50.0	33.0	–	56.0	52.4
Canadian^b	20	8	French	88.0	71.0	72.0	68.0	–	62.0	72.2
Canadian^c	30	1	English	100.0	86.7	73.3	66.7	76.7	56.7	76.7
French^d	96	4	French	97.0	83.0	88.0	85.0	–	–	88.3
Hungarian^e	45	8	Hungarian	98.0	83.0	87.0	74.0	94.0	84.0	86.7
Indian^f	50	4	English	99.0	39.5	65.0	75.0	78.0	53.5	68.3
Indian^g	40	1	Oriya	45.0	42.5	80.0	55.0	42.5	82.5	57.9
Indian^h	100	8	English	99.0	94.0	90.0	81.0	90.0	–	90.8
Indian^h	100	8	Hindi	99.0	90.0	89.0	74.0	90.0	–	88.4
Mexicanⁱ	274	8	English	90.0	68.0	58.0	65.0	87.0	57.0	70.8
Mexicanⁱ	274	8	Spanish	78.0	62.0	58.0	56.0	77.0	41.0	62.0
Polish^e	75	8	Polish	98.0	88.0	85.0	69.0	89.0	83.0	85.3
American^j	37	16	English	–	80.0	89.0	63.8	89.9	–	80.7
American^f	50	4	English	93.5	54.0	79.0	77.0	82.0	34.5	70.0
American^g	40	1	English	73.0	88.0	83.0	55.0	80.0	78.0	76.2
American^k	138	4	English	93.3	65.7	52.3	–	78.7	–	72.5
American^l	72	11	English	93.0	77.0	65.0	74.0	–	–	77.3
American^l	102	11	English	89.0	82.0	80.0	78.0	–	–	82.3
European^m	13	8	English	99.0	89.0	89.0	88.0	84.0	88.0	89.5
European^l	30	11	English	84.0	85.0	71.0	70.8	–	–	77.7
	Literate, non-Western cultures (18 samples)
Burkinabe^d	48	4	French	92.0	75.0	60.0	65.0	–	–	73.0
Burkinabe^d	48	4	Moore	96.0	52.0	54.0	67.0	–	–	67.3
Chineseⁿ	32	4	Mandarin	97.0	72.0	57.0	21.0	74.0	54.0	63.0
Chinese^a	20	12	French	71.0	38.0	45.0	28.0	–	49.0	46.2
Chinese^o	237	7	Chinese	92.0	77.0	66.0	56.0	76.0	50.0	70.0
Chinese^j	38	16	English	–	81.0	75.3	61.2	90.0	–	77.0
Gabonese^b	20	8	French	80.0	41.0	63.0	50.0	–	32.0	53.0
Japanese^f	50	4	Japanese	98.5	68.5	51.5	49.5	92.5	20.0	63.0
Japanese^k	137	4	English	89.4	47.8	54.7	–	88.6	–	70.0
Japanese^p	123	8	English	94.8	66.6	56.3	44.5	89.2	63.7	69.0
Japanese^c	30	1	Japanese	100.0	80.0	56.7	53.3	93.3	30.0	69.0
Hong Kong^c	30	1	Cantonese	100.0	76.7	46.7	23.3	83.3	56.7	64.0
Sumatran^e	32	8	?	99.0	81.0	79.0	57.0	89.0	76.0	80.0
Tibetanⁿ	11	4	Mandarin	86.0	75.0	61.0	16.0	73.0	66.0	63.0
Vietnamese^e	34	8	?	99.0	80.0	81.0	67.0	92.0	58.0	80.0
African^l	30	11	English	79.0	70.0	50.0	76.0	–	–	69.0
Sub-Saharan African^a	20	12	French	76.0	29.0	49.0	27.0	–	46.0	45.4
East Asian^m	13	8	English	99.0	90.0	80.0	64.0	91.0	69.0	82.0
	Illiterate cultures (1 sample)
Burkinabe^q	39	2	Dioula	84.0	51.0	33.0	30.0	58.0	44.0	50.0
Overall mean				89.6	70.2	67.3	58.5	82.5	57.3	70.9

Note: Matching score is the percentage of observers who selected the predicted label.

Beaupré and Hess (2005); ^bElfenbein et al. (2007); ^cYik and Russell (1999); ^dTcherkasoff and de Suremain (2005); ^eBiehl et al. (1997); ^fElfenbein et al. (2004); ^gHaidt and Keltner (1999); ^hMatsumoto and Assar (1992); ⁱMatsumoto et al. (2008); ^jElfenbein (2006); ^kMatsumoto et al. (2002); ^lWickline et al. (2009); ^mJack et al. (2009); ⁿElfenbein and Ambady (2003); ^oHuang, Tang, Helmeste, Shioiri, and Someya (2001); ^pShioiri, Someya, Helmeste, and Tang (1999); ^qTracy and Robins (2008).

Author note:

This research was supported by two grants from the National Science Foundation: No. 1025563 awarded to James A. Russell and No. 1064757 awarded to Nicole L. Nelson. We thank Erin Heitzman, Alan Fridlund, Rachael Jack, Mary Kayyal, Joe Pochedley, Alyssa McCarthy, and Sherri Widen for comments on a draft of this article.

References

Beaupré

M. G.

Hess

(2005). Cross-cultural emotion recognition among Canadian ethnic groups. Journal of Cross-Cultural Psychology, 36, 355–370.

Biehl

Matsumoto

Ekman

Hearn

Heider

Kudoh

Ton

(1997). Matsumoto and Ekman’s Japanese and Caucasian facial expressions of emotion (JACFEE): Reliability data and cross-national differences. Journal of Nonverbal Behavior, 21, 3–21.

Ekman

(1980). The face of man. New York, NY: Garland Publishing.

Ekman

Friesen

(1978). The facial action coding system. Palo Alto, CA: Consulting Psychologists Press.

Ekman

Friesen

W. V.

Ellsworth

(1972). Emotion in the human face: Guidelines for research and an integration of findings. New York, NY: Pergamon Press.

Ekman

Sorenson

E. R.

Friesen

W. V.

(1969). Pan-cultural elements in the facial displays of emotions. Science, 164, 86–88.

Elfenbein

H. A.

(2006). Learning in emotion judgments: Training and the cross-cultural understanding of facial expressions. Journal of Nonverbal Behavior, 30, 21–36.

Elfenbein

H. A.

(2013). Nonverbal dialects and accents in facial expressions of emotion. Emotion Review, 5, 90–96.

Elfenbein

H. A.

Ambady

(2003). When familiarity breeds accuracy: Cultural exposure and facial emotion recognition. Journal of Personality and Social Psychology, 85, 276–290.

10.

Elfenbein

H. A.

Beaupré

Lévesque

Hess

(2007). Toward a dialect theory: Cultural differences in the expression and recognition of posed facial expressions. Emotion, 7, 131–146.

11.

Elfenbein

H. A.

Mandal

M. K.

Ambady

Harizuka

Kumar

(2004). Hemifacial differences in the in-group advantage in emotion recognition. Cognition & Emotion, 18, 613–629.

12.

Frank

M. G.

Stennett

(2001). The forced-choice paradigm and the perception of facial expressions of emotion. Journal of Personality and Social Psychology, 80, 75–85.

13.

Gao

Maurer

Nishimura

(2010). Similarities and differences in the perceptual structure of facial expressions in children and adults. Journal of Experimental Child Psychology, 105, 98–115.

14.

Haidt

Keltner

(1999). Culture and facial expression: Open-ended methods find more expressions and a gradient of recognition. Cognition & Emotion, 13, 225–266.

15.

Huang

Tang

Helmeste

Shioiri

Someya

(2001). Differential judgment of static facial expressions of emotions in three cultures. Psychiatry and Clinical Neurosciences, 55, 479–483.

16.

Izard

(1971). The face of emotion. New York, NY: Appleton-Century-Crofts.

17.

Jack

R. E.

Blais

Scheepers

Schyns

P. G.

Caldara

(2009). Cultural confusions show that facial expressions are not universal. Current Biology, 19, 1543–1548.

18.

Kayyal

Russell

J. A.

(2012). Americans and Palestinians judge spontaneous facial expressions of emotion. Manuscript submitted for publication.

19.

Mandal

M. K.

Bryden

M. P.

Bulman-Fleming

(1996). Similarities and variations in facial expressions of emotions: Cross-cultural evidence. International Journal of Psychology, 31, 41–58.

20.

Matsumoto

Angus-Wong

A. M.

Martinez

(2008). Priming effects of language on emotion judgments in Spanish–English bilinguals. Journal of Cross-Cultural Psychology, 39, 335–342.

21.

Matsumoto

Assar

(1992). The effects of language on judgments of universal facial expressions of emotion. Journal of Nonverbal Behavior, 16, 85–99.

22.

Matsumoto

Consolacion

Yamada

Suzuki

Franklin

Paul

… Uchida

(2002). American–Japanese cultural differences in judgments of emotional expressions of different intensities. Cognition & Emotion, 16, 721–747.

23.

Matsumoto

Olide

Schug

Willingham

Callan

(2009). Cross-cultural judgments of spontaneous facial expressions of emotion. Journal of Nonverbal Behavior, 33, 213–238.

24.

Naab

P. J.

Russell

J. A.

(2007). Judgments of emotion from spontaneous facial expressions of New Guineans. Emotion, 7, 736–744.

25.

Rosenberg

Ekman

(1995). Conceptual and methodological issues in the judgment of facial expressions of emotion. Motivation & Emotion, 19, 111–138.

26.

Russell

J. A.

(1991). The contempt expression and the relativity thesis. Motivation and Emotion, 15, 149–168.

27.

Russell

J. A.

(1993). Forced-choice response format in the study of facial expression. Motivation and Emotion, 17, 41–51.

28.

Russell

J. A.

(1994). Is there universal recognition of emotion from facial expression? A review of the cross-cultural studies. Psychological Bulletin, 115, 102–141.

29.

Scherer

K. R.

Grandjean

(2008). Facial expressions allow inference of both emotions and their components. Cognition & Emotion, 22, 789–801.

30.

Shariff

A. F.

Tracy

J. L.

(2011). Emotion expressions: On signals, symbols, and spandrels – A response to Barrett (2011). Psychological Science, 20, 407–408.

31.

Shioiri

Someya

Helmeste

Tang

S. W.

(1999). Misinterpretation of facial expression: A cross-cultural study. Psychiatry and Clinical Neurosciences, 53, 45–50.

32.

Tcherkassof

de Suremain

(2005). Burkina Faso and France: A cross-cultural study of the judgment of action readiness in facial expressions of emotion. Psychologia, 48, 317–334.

33.

Tracy

J. L.

Robins

R. W.

(2008). The nonverbal expression of pride: Evidence for cross-cultural recognition. Journal of Personality and Social Psychology, 94, 516–530.

34.

Wickline

V. B.

Bailey

Nowicki

(2009). Cultural in-group advantage: Emotion recognition in African American and European American faces and voices. Journal of Genetic Psychology, 170, 5–29.

35.

Widen

S. C.

(2013). Children’s interpretation of facial expressions: The long path from valence-based to specific discrete categories. Emotion Review, 5, 72–77.

36.

Yik

Russell

J. A.

(1999). Interpretation of faces: A cross-cultural study of a prediction from Fridlund’s theory. Cognition & Emotion, 13, 93–104.

37.

Yik

Widen

S. C.

Russell

J. A.

(2012). The within-subject design in the study of facial expressions. Manuscript submitted for publication.