Abstract
This study contributes to the growing literature linking physical characteristics and behavioral tendencies by advancing the current debate on whether a person’s facial width-to-height ratio (fWHR) predicts a variety of antisocial tendencies. Specifically, our large-scale study avoided the social-desirability bias found in self-reports of behavioral tendencies by capturing survey data not only from more than 1,000 business executives but also from evaluators who reported knowing the focal individuals well. With this improved research design, and after conducting a variety of analyses, we found very little evidence of fWHR predicting antisocial tendencies. In light of prior research linking fWHR to social perceptions of evaluators, our results are suggestive of an evolutionary mismatch, whereby a physical characteristic once tied to antisocial tendencies in ancestral environments is—in modern environments—not predictive of such behaviors but instead predictive of biased perceptions.
As part of the burgeoning body of evolutionary-psychology research linking physical characteristics and behavioral tendencies, a consensus appears to have emerged suggesting that facial width-to-height ratio (fWHR) predicts a host of antisocial tendencies, including threat behavior (Geniole, Denson, Dixson, Carré, & McCormick, 2015), deception and exploitation (Geniole, Keyes, Carré, & McCormick, 2014; Haselhuhn & Wong, 2012; Stirrat & Perrett, 2010), trait dominance (Carré & McCormick, 2008), physical aggression (Goetz et al., 2013), and overall psychopathy (Anderl et al., 2016), and that this is especially so for males. An evolutionary explanation emphasizes that in violent ancestral environments, individuals with a high fWHR were more protected from fatal blows to the face and thus more likely to prevail in physical altercations (Stirrat, Stulp, & Pollet, 2012). This suggests further that males with a high fWHR were more effective ancestrally in garnering influence and access to resources through the use of threat and intimidation, which led to the development of psychological mechanisms that calibrate antisocial tendencies to fWHR. Evidence showing the greater relevance of fWHR in predicting male behavior can be explained in terms of males disproportionately representing both the perpetrators and victims of violent homicides in cultures across time and space (Daly & Wilson, 1988). Researchers have also proposed that the relationship between fWHR and antisocial tendencies is mediated by testosterone (Carré & McCormick, 2008), but this prediction has received only mixed support (Bird et al., 2016; Lefevre, Lewis, Perrett, & Penke, 2013).
Although various studies have reported a positive relationship between fWHR and antisocial tendencies, a recent large-scale study by Kosinski (2017) has called into question this growing consensus. Kosinski’s study is noteworthy because it found very little evidence linking fWHR and self-reported behavioral tendencies (e.g., cooperativeness, militarism, trustworthiness, sympathy, and morality), and the little evidence that was found was generally stronger for females than for males. A particularly compelling feature of Kosinski’s work is that its nonfindings were obtained using a large real-world sample, whereas prior work (mostly laboratory-based) has typically relied on a small number of participants. Indeed, Kosinski speculated that the frequency of marginally significant p values observed in prior published studies on fWHR and behavioral tendencies may reflect researcher degrees of freedom and file-drawer problems (Nelson, Simmons, & Simonsohn, 2018).
Kosinski’s (2017) contribution is undoubtedly important in both highlighting an alternative position on the relevance of fWHR for behavioral tendencies and also identifying specific limitations of prior empirical research in this area. If fWHR does not predict behavioral tendencies, but only differences in social judgments of evaluators who are unfamiliar with targets (as found in prior research; Deska, Lloyd, & Hugenberg, 2018a, 2018b; Efferson & Vogt, 2013; Geniole, MacDonell, & McCormick, 2017; Lefevre & Lewis, 2014), this would open the possibility of an evolutionary mismatch (Li, van Vugt, & Colarelli, 2018). In other words, because the modern world differs in important ways from the violent ancestral environments in which human psychological mechanisms developed, social judgments formed on the basis of fWHR, which may have been adaptive in the evolutionary past, may no longer be accurate.
However, we suggest that before advancing such a concept with confidence, it is necessary to first address an additional important but underexamined limitation found in some of the studies focusing on the link between fWHR and behavioral tendencies (including Kosinski, 2017): namely, a reliance on self-reports and their attendant social-desirability bias, the tendency to provide socially acceptable answers (see the discussion of Kosinski’s results by Eisenbruch, Lukaszewski, Simmons, Arai, & Roney, 2018). Indeed, some researchers suggest that this bias can explain the persistently low or even negative correlation between self-reported and other-reported ratings of behavioral tendencies (Atkins & Wood, 2002).
Therefore, we sought in this study to extend the current understanding of the relationship between fWHR and behavioral tendencies (and the possibility of an evolutionary mismatch) by explicitly addressing the above-mentioned limitations in prior research (i.e., the reliance on small sample sizes or self-reported data). Specifically, we conducted a large-scale study on the association between fWHR and behavioral tendencies that employed both self- and other-rated measures in a sample of business executives. An additional positive and original feature of our study is that we obtained our other-rated measures of behavioral tendencies from people who reported knowing the focal individual well, thus reducing the possibility that fWHR was simply biasing the judgment of raters who were unfamiliar with the person (e.g., Efferson & Vogt, 2013; Geniole et al., 2017). We see our study’s use of a very large sample and other-reported ratings from people familiar with the target as providing the foundation for a clearer assessment of whether fWHR predicts antisocial tendencies or whether the influence of fWHR on social perceptions may be an example of an evolutionary mismatch.
Method
Sample
The source of our data was the flagship executive-development program of the Center for Creative Leadership (CCL). CCL is a nonprofit organization headquartered in Greensboro, North Carolina, that specializes in leadership training. The initial sample in this proprietary data set consisted of 1,305 executives who, from 2014 to 2017, participated in a feedback survey called the Campbell Leadership Index (CLI; Campbell, 1991), which was also completed by the executives’ colleagues. We were able to capture not only self-reported ratings but also ratings from other people who reported knowing the focal executives well (subordinates, peers, and supervisors). To gather the self-reported data, CCL surveyed the focal executives at the start of the leadership program. For the other-reported data, colleagues from the focal executives’ firms were surveyed anonymously prior to the start of the program.
After accounting for missing pictures of the executives (n = 126) and averaging ratings within rater category for categories that included multiple raters per person (Atwater, Wang, Smither, & Fleenor, 2009), we had a final sample of 1,179 executives (873 males, 306 females) for whom we had a complete set of self-, peer, subordinate, and superior ratings. Interrater reliability was acceptable (α = .72 for peer ratings, α = .61 for subordinate ratings, and α = .66 for superior ratings).
Estimating fWHR
We ensured that the executives were facing forward in all pictures. Some of the facial images were slightly tilted to the right or left, which would have affected our fWHR measurements. To correct for this, we rotated these pictures so the eyes were on a horizontal plane. We then standardized the pictures to 8-bit gray-scale images with a height of 400 pixels (Carré, McCormick, & Mondloch, 2009). Next, two research assistants independently measured the facial width (distance between the right and left zygion) and facial height (distance between the midbrow and upper lip) of every executive using the National Institutes of Health’s ImageJ software (Abràmoff, Magalhães, & Ram, 2004). We computed fWHR as facial width divided by facial height. Given that the reliability score was high (r = .90 for fWHR; r = .94 for both width and height), we averaged the scores between the two raters.
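The measurement steps above can be illustrated with a minimal sketch. It assumes that landmark coordinates (the two eyes, the left and right zygion, the midbrow, and the upper lip) have already been marked in pixel units. The coordinate values and function names are hypothetical, and the study itself took these measurements manually in ImageJ rather than with code.

```python
import math

def tilt_angle(left_eye, right_eye):
    """Angle (radians) of the line through the eyes relative to horizontal."""
    return math.atan2(right_eye[1] - left_eye[1], right_eye[0] - left_eye[0])

def rotate(point, angle, origin):
    """Rotate a point by -angle around origin (used to level a tilted face)."""
    x, y = point[0] - origin[0], point[1] - origin[1]
    c, s = math.cos(-angle), math.sin(-angle)
    return (origin[0] + c * x - s * y, origin[1] + s * x + c * y)

def fwhr(landmarks):
    """Facial width (zygion to zygion) divided by height (midbrow to upper lip)."""
    angle = tilt_angle(landmarks["left_eye"], landmarks["right_eye"])
    origin = landmarks["left_eye"]
    level = {name: rotate(p, angle, origin) for name, p in landmarks.items()}
    width = abs(level["right_zygion"][0] - level["left_zygion"][0])
    height = abs(level["upper_lip"][1] - level["midbrow"][1])
    return width / height

# Hypothetical landmarks (pixels) for an upright face: width 180, height 100.
upright = {
    "left_eye": (120.0, 150.0), "right_eye": (200.0, 150.0),
    "left_zygion": (70.0, 170.0), "right_zygion": (250.0, 170.0),
    "midbrow": (160.0, 130.0), "upper_lip": (160.0, 230.0),
}
# The same face tilted by 5 degrees around an arbitrary center.
tilted = {name: rotate(p, -math.radians(5), (160.0, 180.0))
          for name, p in upright.items()}

fwhr_upright = fwhr(upright)
fwhr_tilted = fwhr(tilted)
# Averaging two raters' scores, as done when interrater reliability is high.
averaged = (fwhr_upright + fwhr_tilted) / 2
```

Because the landmarks are leveled before the distances are taken, the upright and tilted versions of this hypothetical face yield the same ratio (180 / 100 = 1.8).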
We sought to further validate the results by measuring fWHR using the Facial Attributes function of Face++, an online artificial intelligence (AI) application (Kosinski, 2017), and found strong reliability between human raters and AI raters (α = .81), giving us greater confidence in our fWHR measurements. The mean human-rated fWHR for males was 1.81, 95% confidence interval (CI) = [1.802, 1.818], and for females was 1.70, 95% CI = [1.687, 1.713], whereas the mean AI-rated fWHR for males was 1.90, 95% CI = [1.891, 1.909], and for females was 1.85, 95% CI = [1.837, 1.863].
Behavioral measures
From the 100 items in the CLI survey, we chose the 25 that reflected anti- and prosocial or desirable and undesirable behavioral tendencies. These 25 items and their descriptions are reported in the Supplemental Material available online. To increase interpretability, we ran a common factor analysis to extract latent variables from the 25 items. The analysis indicated that only three factors had eigenvalues greater than 1. A number of items cross-loaded onto multiple factors. We dropped items with loadings that were less than .40 and obtained the optimal three-factor model. On the basis of prior research (Fiske, Cuddy, Glick, & Xu, 2002; Goodwin, Piazza, & Rozin, 2014; Stavrova & Ehlebracht, 2016), we labeled these factors warmth (items: considerate, sensitive, affectionate, friendly, likeable, insensitive), cynicism (items: suspicious, cynical, temperamental, resentful, sarcastic), and morality (items: ethical, credible, candid, deceptive). The factor loadings are reported in Table 1. Table 2, which provides the correlations between self- and other-ratings on these three factors, shows that the correlation between different rater groups was low. This is consistent with our argument that self-reports and other-reports are likely to differ.
Table 1. Results of the Common Factor Analysis of Selected Campbell Leadership Index Items With Uniqueness Score and Percentage of Variance Explained
Note: Only factor loadings above .40 are reported. Items were rotated using varimax rotation.
Table 2. Correlations Between Self-, Peer-, Subordinate-, and Superior-Reported Scores on 3 Factors From the 25 Campbell Leadership Index Items (N = 1,179)
Note: CI = confidence interval.
Results
Table 3 shows the correlations between fWHR and the three factors from the CLI items (i.e., cynicism, morality, and warmth). We found that for males, both human-rated (p = .037) and AI-rated (p = .016) fWHR were positively related to self-rated cynicism but not to other-rated cynicism. Additionally, we found that AI-rated fWHR was negatively correlated with superior-rated cynicism for females (p = .048). All other correlations were nonsignificant. Table 4 shows the partial correlations between fWHR and the three factors after we controlled for age, race, and whether the target individual was smiling in the picture. After including these controls, we found that only one significant relationship remained. Whereas human-rated fWHR continued to be positively related to self-rated cynicism for males (p = .048), the positive relationship between AI-rated fWHR and self-rated cynicism for males became nonsignificant by a small margin (p = .052). When we applied a more conservative test of significance that adjusted for multiple comparisons (p < .05 divided by 24 within-gender comparisons), we failed to find any significant bivariate or partial correlations (see Tables S1 and S2 in the Supplemental Material).
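The multiple-comparison adjustment described above can be sketched as a simple Bonferroni threshold. The p values are those reported in the text for the bivariate correlations; the dictionary keys are hypothetical labels, and the decomposition of the 24 comparisons (2 fWHR measures × 3 factors × 4 rating sources) is our reading of the design rather than something the text states explicitly.

```python
# Bonferroni-adjusted threshold: alpha divided by the number of comparisons.
alpha = 0.05
n_comparisons = 2 * 3 * 4  # assumed breakdown: fWHR measures x factors x rating sources
threshold = alpha / n_comparisons  # roughly .0021

# Unadjusted p values reported in the text (labels are illustrative only).
reported_p = {
    "human_fwhr_self_cynicism_males": 0.037,
    "ai_fwhr_self_cynicism_males": 0.016,
    "ai_fwhr_superior_cynicism_females": 0.048,
}

# A correlation survives the adjustment only if its p value is below the threshold.
survives = {label: p < threshold for label, p in reported_p.items()}
```

Under this threshold, none of the three nominally significant correlations would remain significant, which matches the conclusion reported in the text.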
Table 3. Correlations Between Facial Width-to-Height Ratio (fWHR) and Self-, Peer-, Subordinate-, and Superior-Reported Scores on 3 Factors From the 25 Campbell Leadership Index Items
Note: BF = Bayes factor; CI = confidence interval.
Table 4. Partial Correlations Between Facial Width-to-Height Ratio (fWHR) and Self-, Peer-, Subordinate-, and Superior-Reported Scores on 3 Factors From the 25 Campbell Leadership Index Items, With Controls
Note: Campbell Leadership Index factor scores and fWHR were each regressed on facial expression, age, and race, and the resulting residuals were then correlated. BF = Bayes factor; CI = confidence interval.
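The residual-based approach to partial correlation described in the note to Table 4 can be sketched as follows. This is an illustration on synthetic data, not the authors' analysis code: the variable names, the simulated values, and the use of only two covariates (age and a smiling indicator) are assumptions, and a categorical control such as race would first need to be dummy coded.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def residuals(y, covariates):
    """Residuals of y after least-squares regression on covariates plus intercept."""
    rows = [[1.0] + list(c) for c in covariates]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, y)]

def pearson(x, y):
    """Pearson correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def partial_corr(x, y, covariates):
    """Correlate the parts of x and y not explained by the covariates."""
    return pearson(residuals(x, covariates), residuals(y, covariates))

# Synthetic example: both variables depend on age but not on each other.
random.seed(0)
n = 500
age = [random.uniform(30, 60) for _ in range(n)]
smiling = [float(random.random() < 0.5) for _ in range(n)]
fwhr_sim = [0.01 * a + random.gauss(0, 0.1) for a in age]
score_sim = [0.05 * a + random.gauss(0, 0.5) for a in age]
controls = list(zip(age, smiling))

raw_r = pearson(fwhr_sim, score_sim)
partial_r = partial_corr(fwhr_sim, score_sim, controls)
```

In this simulated case the raw correlation is driven entirely by the shared dependence on age, so it should shrink toward zero once the covariates are partialled out.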
As an additional test of whether there was a relationship between fWHR and behavioral tendencies, we computed Bayes factor (BF01) values for the bivariate and partial correlations (reported in Tables 3 and 4, respectively). BF01 quantifies how much more likely the observed data are under the null hypothesis (i.e., there is no relationship between fWHR and behavioral tendencies) than under the alternative hypothesis (i.e., there is a relationship between fWHR and behavioral tendencies), whereas BF10 (1/BF01) quantifies how much more likely the data are under the alternative hypothesis than under the null hypothesis (Aczel et al., 2018; Wagenmakers, Morey, & Lee, 2016). We used BF01 for our analysis because we were interested in the likelihood of the null hypothesis over the alternative hypothesis. BF01 values above 1 but below 3 provide anecdotal support for the null hypothesis, BF01 values above 3 but below 10 provide moderate support for the null hypothesis, and BF01 values above 10 provide strong support for the null hypothesis; the same is the case for 1/BF01, except that values in the given ranges provide support for the alternative rather than the null hypothesis. As the results in Tables 3 and 4 indicate, the BF01 values for almost all of the bivariate and partial correlations were above 1, supporting the null hypothesis of no relationship between fWHR and behavioral tendencies. The one exception was the bivariate correlation between AI-rated fWHR and self-reported cynicism for males (BF01 = 0.718), indicating that it is 1.39 times more likely (1/0.718) that the data conform to the prediction that there is (vs. is not) a relationship between AI-rated fWHR and self-reported cynicism for males.
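The interpretation scheme above can be written out as a short sketch. The function names are hypothetical, and because the text does not say how values exactly equal to 3 or 10 are classified, assigning them to the higher category is an assumption.

```python
def bf10_from_bf01(bf01):
    """BF10 is the reciprocal of BF01."""
    return 1.0 / bf01

def evidence_label(bf01):
    """Return (strength, favored hypothesis) using the thresholds in the text."""
    if bf01 == 1:
        return ("none", "neither")
    if bf01 > 1:
        favored, bf = "null", bf01
    else:
        favored, bf = "alternative", 1.0 / bf01
    # Assumption: boundary values (exactly 3 or 10) go to the higher category.
    if bf < 3:
        strength = "anecdotal"
    elif bf < 10:
        strength = "moderate"
    else:
        strength = "strong"
    return (strength, favored)

# The one exception reported in the text: BF01 = 0.718.
bf10_exception = round(bf10_from_bf01(0.718), 2)
label_exception = evidence_label(0.718)
```

For the reported exception, the reciprocal is about 1.39, which falls in the anecdotal range in favor of the alternative hypothesis.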
We further assessed the robustness of our results by conducting two supplementary analyses, the results of which are available in Tables S3 and S5 in the Supplemental Material. First, we conducted subsample analyses to assess whether fWHR would be more strongly related to the three factors when we examined only individuals who had extreme fWHR values, which we defined as those 361 executives whose fWHR values fell at least 1 standard deviation above or below the mean. The results from this supplementary subsample analysis suggested that our previously discussed findings are robust: No significant correlations were obtained, with the exception that human-rated fWHR (p = .018) and AI-rated fWHR (p = .004) were positively correlated with self-rated cynicism among males, and the 1/BF01 values for both of these correlations were above 1 (2.19 and 7.81, respectively). When we adjusted for multiple comparisons (see Table S4 in the Supplemental Material), these correlations became nonsignificant.
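The selection of the extreme-fWHR subsample can be sketched as below. The fWHR values are made up for illustration, and two details are assumptions not stated in the text: that "extreme" means at least 1 standard deviation from the mean, and that the sample (rather than population) standard deviation is used.

```python
import statistics

def extreme_subsample(values, n_sd=1.0):
    """Keep values at least n_sd sample standard deviations from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # assumption: sample standard deviation
    return [v for v in values if abs(v - mean) >= n_sd * sd]

# Hypothetical fWHR values, not data from the study.
fwhr_values = [1.60, 1.70, 1.75, 1.80, 1.85, 1.90, 2.05]
extreme = extreme_subsample(fwhr_values)
```

With these illustrative values, only the lowest and highest entries (1.60 and 2.05) lie more than 1 standard deviation from the mean and would enter the subsample.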
As a second robustness check, we also examined the correlations between fWHR and each of the original 25 CLI items and found no consistent patterns of significance. Furthermore, the significant relationships that were found became nonsignificant after we adjusted for multiple comparisons (see Table S6 in the Supplemental Material). Taken together, the lack of statistically detectable relationships between fWHR and behavioral tendencies (excepting fWHR and self-reported cynicism in males) leads us to conclude that the influence of fWHR on the social judgments of evaluators who are unfamiliar with targets (as documented in prior research) is likely a case of evolutionary mismatch.
Discussion
We began by noting that what had appeared to be a growing consensus that fWHR predicts a host of antisocial tendencies (e.g., Anderl et al., 2016; Carré & McCormick, 2008; Goetz et al., 2013; Haselhuhn & Wong, 2012; Stirrat & Perrett, 2010) had been challenged by Kosinski (2017), who notably found little evidence supporting this relationship. Our hope in this study is to have advanced this emerging debate by extending prior research, including Kosinski’s study, in several important ways. Specifically, we sought to improve on prior fWHR research by conducting a large-scale study on the relationship between fWHR and behavioral tendencies that did not rely exclusively on self-reported data (and their attendant social-desirability bias). Instead, our survey data were provided by focal individuals and by evaluators, and these evaluators reported knowing the focal individuals well. With this improved research design, and after conducting a variety of analyses, our results clearly suggest that there is very little evidence in support of a relationship between fWHR and antisocial tendencies (excepting some evidence on self-rated cynicism in males).
We interpret our findings as more consistent with the notion of an evolutionary mismatch, whereby fWHR, once reliably tied to antisocial tendencies in ancestral environments in which violence was far more pervasive, may no longer be predictive of these tendencies in modern environments, leading to biased perceptions (Li et al., 2018). Indeed, we suggest that the fact that we found no reliable link between fWHR and behavior does not necessarily mean that fWHR is unrelated to social judgments. Existing research has found that fWHR is positively related to a number of antisocial perceptions, including aggressiveness (Lefevre & Lewis, 2014), deceptiveness (Efferson & Vogt, 2013), proneness to anger (Deska et al., 2018a), threat potential (Geniole et al., 2017), and ascriptions of inhumanity (Deska et al., 2018b). We welcome future research that further refines our understanding of how and why physical characteristics, such as fWHR, may differentially affect focal-actor behaviors and social perceptions. Such an endeavor will increase our understanding of not only our evolved psyche but also why immutable physical characteristics may shape social outcomes, even when they are not predictive of actual behavior.
While our findings appear robust, we also want to acknowledge two caveats. First, our results could have been affected by ceiling and reference-group effects because the business executives in our sample may differ systematically from the general population in both fWHR (Alrajih & Ward, 2014) and antisocial tendencies (Babiak, Neumann, & Hare, 2010). Although the mean fWHR for males and females in our sample was similar to that in the general population (see Alrajih & Ward, 2014; Kosinski, 2017), we could not rule out the possibility that the executives in our sample are more antisocial (Babiak et al., 2010). Second, since our data were collected in the context of occupational training, it is possible that respondents could have been hesitant to portray their colleagues in a negative light. While we could not entirely rule out this possibility, the leadership training program attempted to reduce this bias by making the ratings completely anonymous and by making it clear to the raters that only scores aggregated across all raters would be disclosed to individual participants. We welcome future research that extends our work to consider these and other issues relevant to the growing body of evolutionary-psychology research linking physical characteristics and behavioral tendencies.
Supplemental Material
Supplemental material, Wang_Supplemental_Material_rev for A Case of Evolutionary Mismatch?: Why Facial Width-to-Height Ratio May Not Predict Behavioral Tendencies by Dawei Wang, Krishnan Nair, Maryam Kouchaki, Edward J. Zajac and Xiuxi Zhao in Psychological Science
Footnotes
Action Editor
D. Stephen Lindsay served as action editor for this article.
Author Contributions
D. Wang, K. Nair, M. Kouchaki and E. J. Zajac developed the study concept. X. Zhao provided the data. D. Wang analyzed and interpreted the data under the supervision of K. Nair and M. Kouchaki. K. Nair and D. Wang drafted the manuscript under the guidance of M. Kouchaki and E. J. Zajac. E. J. Zajac provided critical revisions. All authors approved the final version of the manuscript for submission.
Declaration of Conflicting Interests
The author(s) declared that there were no conflicts of interest with respect to the authorship or the publication of this article.
Open Practices
The design and analysis plans for the present study were not preregistered. Researchers interested in obtaining the data can contact the Center for Creative Leadership (https://www.ccl.org/) or X. Zhao directly.
