Abstract
The article provides a micro-behavioral model and an experimental design to understand the effect of heterogeneity in social identities on cooperation while accounting for endogenous sorting. Social identity is induced exogenously using the minimal group paradigm. The experiment manipulates sorting with three treatments: having subjects interact with both in- and outgroup members, giving them the choice to interact with either ingroup or outgroup members, and isolating the groups from the outset. Cooperation is measured with Prisoner’s Dilemma Games at the dyadic level and with Public Goods Games at the tetradic level. The results show that heterogeneity hampers between-group cooperation at the dyadic level. In addition, endogenous sorting mitigates this negative effect of heterogeneity on cooperation. Heterogeneity hampers cooperation at the tetradic level most substantially if there is a commonly known negative history between groups.
In the near future, the developed regions of the world will become much more heterogeneous ethnically and culturally, not only due to factors such as wars and conflicts that force emigration from developing countries, but also due to the need in the developed world to attract more foreign labor (UK Office for National Statistics 2014). This increase in heterogeneity with respect to social identities may bring many challenges, including but not restricted to urban integration, interethnic conflict, and threats to social cohesion and social capital (Putnam 2001, 2007). Understanding the social consequences of heterogeneity is thus one of the most important questions in sociology.
In this article, I focus on cooperative behavior in social dilemmas and investigate how heterogeneity in social identity affects cooperation. Social dilemmas pose a conflict between collective and individual interests (Dawes 1980; Kollock 1998). Problems of collective action (Olson 1965), trust (Dasgupta 1988; Gambetta 1988), and cooperation (Taylor 1987) are classical examples of social dilemmas. Studying cooperation in social dilemmas is important because the capacity of a society to peacefully solve the conflict between individual and collective interests for the benefit of the latter characterizes the level of social cohesion in the society (Coleman 1990; Roca and Helbing 2011).
Prior experimental work has shown that heterogeneity in social identity generally hampers cooperation in social dilemmas (e.g., Simpson 2006; Yamagishi and Kiyonari 2000). Experimental studies in other disciplines, particularly in economics and political science, also confirm this general finding (Bouckaert and Dhaene 2004; Chen and Li 2009; Eckel and Grossman 2005; Fershtman and Gneezy 2001; Habyarimana et al. 2007). The results of these experiments resonate well with the observational studies that also show, albeit not on all occasions, a negative association between heterogeneity and social cohesion (see Dinesen and Sønderskov forthcoming; for an overview, see Van der Meer and Tolsma 2014).
In most prior experimental and observational studies, however, a crucial factor, namely, endogenous sorting of heterogeneous actors, has been largely left out. It is well documented that when a social entity is composed of heterogeneous actors, these actors tend to self-select into homogeneous subgroups (Marsden 1987; McPherson, Smith-Lovin, and Cook 2001; Schelling 1971). Due to this self-selection, the society may be globally heterogeneous but locally homogeneous, making heterogeneity a “moving target” (Uslaner 2010). Exposure to outgroups is thought to be a key mediator of the heterogeneity-cooperation link (Dinesen and Sønderskov forthcoming). To what extent exposure to outgroups increases with heterogeneity is very difficult to assess due to endogenous sorting of actors. In the aforementioned experimental studies, the experimenter typically forces subjects of different social identities to interact and compares the level of cooperation under this condition with an alternative condition in which actors of the same social identity are forced to interact. This typical design leaves no room for sorting. In the observational studies of the heterogeneity-cohesion link, the researcher can only observe social cohesion in an environment ex post, that is, only after actors have sorted into the environment.
In this article, I allow endogenous sorting in a controlled experiment by giving subjects the opportunity to choose the social group of their interaction partner before the interaction takes place. I compare this treatment with other treatments in which partner selection is not allowed, similar to the treatments in previous experimental studies. I show, as predicted by the presented theory, that allowing endogenous sorting mitigates the negative effects of heterogeneity on cooperation. I thus illustrate, both theoretically and empirically, that ignoring endogenous sorting processes yields misleading conclusions on the heterogeneity-cooperation link.
A second important aspect of the current work is that the heterogeneity-cooperation link is studied at two levels: local (dyadic) and aggregate (tetradic). Actors can sort into homogenous environments locally, but at the aggregate level, heterogeneity remains unchanged. For example, sorting into different neighborhoods in a city does not influence heterogeneity at the city level. Yet, some collective goods are produced at the local level and some at the aggregate level. The level at which the heterogeneity-cooperation link is studied is therefore important. Dinesen and Sønderskov (forthcoming), for example, show that it is heterogeneity at the immediate micro-context rather than heterogeneity at larger geographical units that hampers social cohesion. The level at which Dinesen and Sønderskov (forthcoming) measure social cohesion, however, still remains unclear. In this article, I aim to show that depending on how interactions at the local level unfold, heterogeneity may or may not substantially influence cooperation at the aggregate level. In other words, I show that the heterogeneity-cooperation link at the aggregate level is not independent from local-level interactions.
A further distinctive feature of the current work is that it provides an explicit micro-behavioral model that defines and formalizes the micro mechanisms that link heterogeneity, self-selection into homogenous groups, and cooperation. Any theoretical model that seeks to explain a macro relationship should, in principle, include some sort of individual-level mechanism (Coleman 1990; Hedström 2005; Raub, Buskens, and van Assen 2011). This micro mechanism is missing in most previous theoretical work. For example, Putnam (2007) argues that heterogeneity hampers all forms of cooperation and trust (i.e., between and within groups). But the mechanisms through which heterogeneity hampers these different forms of cooperation remain unclear in Putnam’s original work (Van der Meer and Tolsma 2014).
Theory
The theoretical model is presented in two steps. The first step starts with the case in which actors do not have an option to choose their interaction partner. This resembles the situation in almost all intergroup experiments (e.g., Chen and Li 2009; Eckel and Grossman 2005; Simpson 2006; Yamagishi and Kiyonari 2000). In the second step, actors are allowed to choose the group of their interaction partners before the interaction takes place. This allows actors to potentially sort into homogenous subgroups. Perhaps due to theoretical and practical difficulties of providing actors with an opportunity to choose their partners, past experimental research has mostly ignored endogenous sorting (for an exception, see Slonim and Garbarino 2008). 1
Social identity theory (Tajfel 1970; Tajfel et al. 1971) predicts that actors classify self and others in a few categories to cope with cognitive complexity. This is called the categorization process. The second important element of social identity theory is self-enhancement: actors are predicted to attribute positive characteristics to the ingroup and negative characteristics to the outgroup. In addition to categorization and self-enhancement, a further element of social identity theory is about actors’ perceptions of others’ interests: ingroup members’ interests are perceived as interchangeable, and actors assign a greater weight to collective ingroup outcomes than the outcomes for the self (Kramer and Brewer 1984; Turner 1985). These effects of social identity on perceptions of others’ interests are captured in the following way.
In the model, in their “utility functions,” actors assign non-zero weights to the outcomes of others. A utility function is a function that maps a distribution of tangible objective outcomes, such as money, to a subjective evaluation of this distribution. In principle, outcomes can be negative or positive. In this application, all outcomes are positive. An actor has two weights in their utility function, one for ingroup members’ outcomes and one for outgroup members’ outcomes. These weights are random variables, and actors may have different weights. For example, some actors may assign higher weights to others’ outcomes. Others may assign zero weights and thus are selfish. Still others may assign negative weights to others’ outcomes. This latter type of actors can be classified as competitors. Utilities can be negative while outcomes are positive. For example, a competitor actor who assigns a negative weight to others’ outcomes may derive disutility from a distribution in which his or her own outcomes are lower than others’ outcomes.
Furthermore, it is assumed that actors vary with respect to ingroup bias; some assign high weights to ingroup members and low weights to outgroup members, and others are less biased and assign similar weights to ingroup and outgroup members. Some “multicultural” actors may even assign higher weights to outgroup members’ outcomes than to ingroup members’ outcomes. Yet, in line with social identity theory, I expect that on average, the weight assigned to the outcomes of ingroup members is higher than that of outgroup members’ outcomes. Formally, the utility of actor i from an outcome allocation for the self (x) and for others (y) is

$$U_i(x, y) = x + \theta_i^{I} \sum_{j \in I} y_j + \theta_i^{O} \sum_{k \in O} y_k \qquad (1)$$

where I indicates the set of ingroup members, O indicates the set of outgroup members, and $\theta_i^{I}$ and $\theta_i^{O}$ are the weights that actor i assigns to the outcomes of ingroup and outgroup members, respectively.
Table 1. The Prisoner’s Dilemma (PD) and Public Goods Game (PG) Outcomes
Note: The numbers in cells indicate outcomes. In PG, the first number in a cell indicates how much Player 1 earns, and the remaining numbers indicate how much others earn, where the right side number(s) indicate the outcome(s) for noncooperator(s).
The predictions summarized in the following are derived from the following two basic assumptions: (1) the average weight for ingroup members’ outcomes is higher than the average weight for outgroup members’ outcomes and (2) actors maximize utility given the function in Equation 1 (the rationality assumption).
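As a minimal sketch of these two assumptions, the following assumes that Equation 1 is linear: an actor adds weighted sums of ingroup and outgroup members' outcomes to his or her own outcome. The function name, the argument names, and the example numbers are hypothetical and serve only as illustration.

```python
def utility(own, ingroup_outcomes, outgroup_outcomes, theta_in, theta_out):
    """Subjective utility under an ASSUMED linear weighting of others' outcomes."""
    return own + theta_in * sum(ingroup_outcomes) + theta_out * sum(outgroup_outcomes)


# Hypothetical allocation: the actor earns 20 and one ingroup member earns 40.
selfish = utility(20, [40], [], theta_in=0.0, theta_out=0.0)
prosocial = utility(20, [40], [], theta_in=0.5, theta_out=0.2)
competitor = utility(20, [40], [], theta_in=-1.0, theta_out=-1.0)
```

In this example the competitor's utility is negative although every outcome is positive, which mirrors the earlier remark that utilities can be negative while outcomes are positive.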
Predictions for the Prisoner’s Dilemma with and without Partner Choice
Consider the PD in Table 1A. Without assuming any social preferences, the predicted behavior (which happens to be the Nash equilibrium) is mutual noncooperation. This is because whatever the other player does, not cooperating gives a higher individual payoff. In game theory language, not cooperating is a dominant strategy. Both actors would be better off by mutual cooperation, which is unsustainable due to individual incentives for noncooperation.
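The dominance argument can be checked mechanically. The sketch below uses hypothetical payoffs that satisfy the usual PD ordering (they are not the values in Table 1A) and verifies that not cooperating pays more against either move of the other player, while mutual cooperation still beats mutual noncooperation.

```python
# Hypothetical PD payoffs (NOT the values in Table 1A): T > R > P > S.
T, R, P, S = 40, 30, 20, 10

# payoff[(my_action, other_action)] -> my payoff; "C" = cooperate, "D" = defect
payoff = {("C", "C"): R, ("C", "D"): S, ("D", "C"): T, ("D", "D"): P}


def is_dominant(action, alternative):
    """True if `action` pays strictly more than `alternative` against every move of the other player."""
    return all(payoff[(action, other)] > payoff[(alternative, other)] for other in "CD")


defect_dominates = is_dominant("D", "C")
mutual_cooperation_is_better = payoff[("C", "C")] > payoff[("D", "D")]
```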
The solution of the game is different once one assumes social preferences. If one assumes that subjects transform the outcomes in the PD into utilities using Equation 1, then those with sufficiently large
Let’s now assume that there are two groups, A and B, and before playing the game, members of A and B can choose to be matched with either an ingroup or an outgroup member. In this case, the formal analysis is more complex. The reason is as follows. An actor may choose to interact in an ingroup environment (i.e., play the PD with an ingroup member) or in an intergroup environment (i.e., play the PD with an outgroup member). The composition of these environments with respect to θ is endogenous. For example, members of Group A who choose to interact with members of Group B have a different θ distribution than members of Group A who choose to interact within their group. One may think that those who have only very large
While an analytical solution is complex, a numerical solution can easily be found with computer simulations. Table 2 includes some examples. The table presents a number of θ distributions and the resulting predicted equilibrium cooperation rates with and without partner choice. 3
Table 2. Predictions for Overall, Within-Group, and Between-Group Cooperation Rates: Some Examples
Extensive simulations show that when they have the choice, the majority of actors choose to interact with ingroup members (Hypothesis 2a) (see column 6 in Table 2). Furthermore, even with partner choice, the level of between-group cooperation is lower than the level of within-group cooperation (Hypothesis 2b) (column 4 vs. column 5 in Table 2). But the difference between within- and between-group cooperation will be lower when actors have partner choice than when they don’t have partner choice (Hypothesis 2c) (see the difference in differences between columns 4 and 5 and columns 8 and 9 in Table 2).
These theoretical results point to a paradoxical effect of sorting. On the one hand, when actors are free to choose their interaction partners, they mostly interact with ingroup members, which increases segregation. On the other hand, this sorting keeps the level of cooperation in between-group interactions higher compared to the situation in which actors are forced to interact with outgroup members. This latter phenomenon occurs because actors who would behave noncooperatively toward outgroup others, had they been forced to interact with the outgroup, are more likely to play with ingroup members when they can choose. Thus, most of the actors who would behave noncooperatively toward outgroup members endogenously sort into within-group environments. This sorting keeps the level of cooperation in between-group environments relatively high.
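The sorting logic can be explored with a toy simulation. The sketch below is not the simulation behind Table 2. It assumes a linear utility with one weight for ingroup partners and one for outgroup partners, hypothetical PD payoffs with equal fear and greed, normally distributed weights with made-up means and spreads, and two symmetric groups. Under these payoffs an actor cooperates whenever the relevant weight exceeds 0.5, regardless of beliefs about the partner. With partner choice, each actor picks the environment with the higher expected utility, and the cooperation rates of the two environments are updated repeatedly; convergence is not checked.

```python
import random

# Hypothetical PD payoffs with equal "fear" and "greed" (T - R == P - S).
T, R, P, S = 40, 30, 20, 10
THRESHOLD = (T - R) / (R - S)  # under these payoffs, cooperate iff theta > 0.5


def expected_utility(theta, p):
    """Best expected utility against a partner who cooperates with probability p."""
    coop = p * (R + theta * R) + (1 - p) * (S + theta * T)
    defect = p * (T + theta * S) + (1 - p) * (P + theta * P)
    return max(coop, defect)


def draw_population(n, rng):
    """Draw (theta_in, theta_out) pairs; the means and spreads are assumptions."""
    actors = []
    for _ in range(n):
        base = rng.gauss(0.0, 0.3)  # shared prosociality component
        actors.append((0.6 + base + rng.gauss(0, 0.2), 0.3 + base + rng.gauss(0, 0.2)))
    return actors


def rate(values):
    """Share of True values; NaN for an empty list."""
    return sum(values) / len(values) if values else float("nan")


def simulate(n=5000, seed=1, iterations=50):
    rng = random.Random(seed)
    actors = draw_population(n, rng)
    # No partner choice: everyone meets both kinds of partner.
    forced = {
        "within": rate([t_in > THRESHOLD for t_in, _ in actors]),
        "between": rate([t_out > THRESHOLD for _, t_out in actors]),
    }
    # Partner choice: iterate on the cooperation rates of the two environments.
    p_in, p_out = forced["within"], forced["between"]
    for _ in range(iterations):
        stay = [expected_utility(ti, p_in) >= expected_utility(to, p_out) for ti, to in actors]
        in_env = [ti > THRESHOLD for (ti, _), s in zip(actors, stay) if s]
        out_env = [to > THRESHOLD for (_, to), s in zip(actors, stay) if not s]
        p_in = rate(in_env) if in_env else p_in
        p_out = rate(out_env) if out_env else p_out
    choice = {"within": p_in, "between": p_out, "share_ingroup": rate(stay)}
    return forced, choice


forced, choice = simulate()
```

Comparing `forced["within"] - forced["between"]` with `choice["within"] - choice["between"]` gives the difference in differences discussed above, and `choice["share_ingroup"]` gives the share of actors who sort into the within-group environment. Whether a given parameterization reproduces the predicted pattern has to be checked by running it.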
In addition to linking heterogeneity, sorting, and cooperation, the theoretical scheme provides further interesting implications. For example, research has shown that minority groups display more ingroup bias, are less cooperative, and have lower levels of generalized trust than majority groups (De Vroome, Hooghe, and Marien 2013; Hewstone, Rubin, and Willis 2002; Schlueter and Scheepers 2010). In addition, a larger minority size is associated with stronger anti-immigrant attitudes among the majority group. These findings are often explained by psychological mechanisms such as perceived threat (Hewstone et al. 2002; Schlueter and Scheepers 2010).
The presented theoretical model provides an alternative, a more structural explanation. Minority groups are more constrained in their partner selection choices. That is, they have to interact with outgroup members more frequently than the majorities interact with minority groups. Majority group members, however, mostly lack the opportunity to interact with outgroup members. Translating this to the present setup, the minorities interact more often in conditions that resemble the ninth column of Table 2 whereas the majorities interact more often in conditions that resemble the eighth or the fourth column of Table 2 because the majorities are often isolated or can choose to isolate from the minorities. The presented theory predicts that cooperation will be lower among the minorities than the majorities because of these differences in the structure of interaction opportunities, not necessarily because of a perceived threat or any other psychological process.
Predictions for Public Goods Game
The Public Goods Game (PG) is the N-person version of the Prisoner’s Dilemma (Table 1B). Here, the formal and empirical analysis of PG is restricted to the case in which actors cannot choose their interaction partners. Actors can sort into homogenous groups locally (in this case in the PD), but at the most aggregate level (in this case PG), heterogeneity remains unchanged. For example, sorting into different neighborhoods does not influence heterogeneity at the city level. Yet, some collective goods are produced at the local level and some at the aggregate level. By including PG alongside the PD, I aim to understand the interplay between the local- and aggregate-level dynamics.
Consider the PG in Table 1B. As in the PD case, without social preferences, universal noncooperation is the predicted outcome of PG because not cooperating always gives a higher payoff irrespective of how many others cooperate. When one assumes that actors transform the outcomes of the game into utilities using Equation 1, there is a different prediction: those with sufficiently high
One can test several implications of this formal result. For example, the higher the number of ingroup members relative to the number of outgroup members in the game, the higher the likelihood of cooperation. To be clear, I am not testing this prediction, which is effectively the same as Hypothesis 1 formulated previously. I will be testing Hypothesis 1 using the PD. In the experimental design, the level of heterogeneity in PG is kept constant: in all experimental conditions, subjects play PGs with one ingroup and two outgroup members. This is because I am interested in a further factor that may influence behavior in PG, namely, prior negative between-group contact, which is discussed next.
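The composition effect mentioned here can be illustrated with a small sketch. It assumes a linear binary public goods game with a hypothetical endowment and multiplier (not the values in Table 1B) and a linear utility with separate weights for ingroup and outgroup members. All names and numbers are illustrative.

```python
def pg_payoffs(contributions, endowment=20, multiplier=2.0):
    """Payoffs in a HYPOTHETICAL linear binary PG (not the values in Table 1B).

    contributions: list of 0/1 decisions, one per player.
    """
    n = len(contributions)
    share = multiplier * endowment * sum(contributions) / n
    return [endowment * (1 - c) + share for c in contributions]


def cooperates(theta_in, theta_out, n_in, n_out, others_contribute=1):
    """True if contributing yields higher ASSUMED linear utility for player 0.

    Players 1..n_in are ingroup members; the remaining players are outgroup members.
    """
    others = [others_contribute] * (n_in + n_out)

    def utility(own_choice):
        pay = pg_payoffs([own_choice] + others)
        ingroup, outgroup = pay[1:1 + n_in], pay[1 + n_in:]
        return pay[0] + theta_in * sum(ingroup) + theta_out * sum(outgroup)

    return utility(1) > utility(0)


# Same actor, different group compositions among the three other players.
one_ingroup_two_outgroup = cooperates(0.6, 0.1, n_in=1, n_out=2)
two_ingroup_one_outgroup = cooperates(0.6, 0.1, n_in=2, n_out=1)
```

With these made-up numbers the same actor does not contribute when facing one ingroup and two outgroup members but does contribute when facing two ingroup members and one outgroup member, while a selfish actor (both weights zero) never contributes.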
Negative between-group contact and cooperation
Contact between members of different groups is a double-edged sword. Under certain conditions, contact is predicted to reduce intergroup conflict (Allport 1954). Although this has been shown to be true in various situations (e.g., Brown and Hewstone 2005), in this case, the conditions for contact to be beneficial are not present. Social dilemmas are situations in which there is a risk for cooperative actors to be exploited by noncooperative others. The formal analysis presented previously predicts that interactions between actors of different social identities in social dilemmas do not reduce conflict; on the contrary, in such situations, cooperation is predicted to be low (also see Rutherford et al. 2014).
Recent social psychological studies on “negative contact” show that negative experiences between groups are more strongly related to increased conflict than positive experiences are with reducing conflict (Barlow et al. 2012; Paolini, Harwood, and Rubin 2010). This is consistent with the game theory literature on reciprocity, which shows that people adjust their social preferences based on the previous behaviors of their interaction partners. People respond to the negative intentions of others with less prosocial preferences (negative reciprocity) and the positive intentions of others with more prosocial preferences (positive reciprocity) (Falk and Fischbacher 2006; Gautschi 2000). Moreover, negative reciprocity is shown to be a stronger motivation than positive reciprocity (Aksoy and Weesie 2014).
The formal analysis of the PD predicts that the subjects will experience the lowest levels of cooperation in between-group interactions when actors cannot choose the group of their interaction partners. When actors have partner choice, cooperation will still be lower in between-group interactions than in within-group interactions, but this difference will not be as high as the difference in the no partner choice case. Hence, it is expected that during and after intergroup interactions unfold, subjects will lower their
Note that negative contact can have an effect on cooperation if actors receive information about others’ decisions. When no feedback about others’ decisions is provided, actors will not be exposed to noncooperative behaviors of others (ingroup or outgroup), hence there will be no reason for reciprocity. Hypothesis 3 is thus tested rigorously by experimentally concealing and revealing the information subjects receive about the level of negative contact between groups.
Methods
Participants
One hundred eighty-six subjects were recruited for the experiment. To assess the external validity of the experiment, subjects from heterogeneous backgrounds were invited. Of the subjects, 42 percent were male, and 50 percent had UK citizenship. The remaining non-UK subjects were from 33 different countries. Seventy-five percent of the subjects were students, mainly at the University of Oxford, and the remaining 25 percent were nonstudent citizens of the city of Oxford. The average age of the subjects was 29.8 years (SD = 13.9).
Design and Procedure
The experiment was conducted in ten sessions. In each session, subjects were seated randomly in a cubicle in the Nuffield Centre for Experimental Social Science (CESS) lab at Nuffield College so that they could not see each other. The entire procedure took place on computers. 5 Throughout the experiment, subjects earned tokens that were converted to cash at the end of the experiment at an exchange rate of 20 tokens = £1 (~1.6 USD). Subjects earned £14 (~22 USD) on average (minimum, £11; maximum, £17), including £4 for showing up for the experiment.
The experiment included four between-subjects treatments, mixed, choice, isolated, and control, and two within-subjects treatments, no-feedback and feedback. The overall design of the experiment is shown in Table 3 and explained in detail in the following. 6
Table 3. Experimental Design
Note: Columns are between-subject and rows are within-subject treatments. PD = Prisoner’s Dilemma; PG = Public Goods Game.
Inducing heterogeneity
The theoretical model does not rely on the nature of the groups per se. The theory applies more generally to any case in which actors, on average, feel more altruistic toward an ingroup than toward an outgroup. Nonetheless, a key design issue is how to manipulate heterogeneity. One can use natural identities, such as ethnicity and gender, or it can be induced in the lab. Both approaches have been implemented in the literature. Buchan, Johnson, and Croson (2006); Goette, Huffman, and Meier (2006); Habyarimana et al. (2007); Kuwabara et al. (2007); and Bornhorst et al. (2010) are examples of recent studies that use natural identities. Tajfel et al. (1971), Yamagishi and Kiyonari (2000), Simpson (2006), and Chen and Li (2009) are studies that induced identity in the lab using the minimal group paradigm, in which subjects are divided into groups based on a trivial criterion. Experiments with minimal groups show that such a categorization per se, no matter how trivial, is sufficient to induce a fairly strong group identity.
Natural identities are multidimensional; many factors in addition to social identity vary with natural identities (Chen and Li 2009; Tajfel et al. 1971; Yamagishi and Kiyonari 2000). Moreover, it is generally not known beforehand how natural identities will respond to experimental manipulations. It is also difficult to assess whether a subject identifies with her “natural” group at all or, if she does so, whether others categorize her by her social identity. Consequently, the results of studies that use natural identities are more ambiguous because they can be attributed to many causes. Inducing identities in the lab with the minimal group paradigm, in contrast, gives the experimenter greater control over the subject’s prominent identity and greater confidence that results are due to the experimental manipulation. Based on this discussion, in the current experiment, social identity was induced in the lab using minimal groups.
Subjects received five pairs of paintings by Paul Klee and Wassily Kandinsky on their screens. Depending on their relative preferences, half of the subjects were classified as Klees and the other half as Kandinskys. 7 A subject’s identity as a Klee or a Kandinsky remained fixed for the rest of the experiment. The design included two further procedures to enhance group identity. Because these procedures are quite standard in the minimal group literature, they are not discussed here but given in full detail in the online Appendix B. 8
Once social identities were induced (or not induced, in the control condition), the primary experimental procedure took place. In the following, I summarize the details of the mixed, choice, isolated, control, and the no-feedback and feedback treatments.
Mixed condition: Stage 2
In stage 2, subjects played ten rounds of PD each time with a randomly matched partner. The particular outcome parameters in the PD were chosen based on previous experimental studies to keep cooperation levels close to .5, which would provide the maximum statistical power to detect differences between experimental conditions (Aksoy and Weesie 2013; Isaac and Walker 1988). Also, this version of the PD kept the fear and greed components the same (Kuwabara 2005; Simpson 2003).
In the mixed condition, the group of the interaction partner in each of the ten PDs was always alternated. In the odd-numbered (even-numbered) rounds, the interaction partner was always a randomly selected ingroup (outgroup) member. Participants were paid for two randomly selected rounds, one ingroup and one outgroup round. Information on the rounds that were selected and how much subjects earned in those rounds was made available only after the subjects completed the experiment. 9
Stage 3
Subjects played five rounds of binary choice Public Goods Games (Table 1B). 10 Because PG was conceptualized as a higher level collective good from which one cannot exclude actors of different social identities, in each of these five rounds, two randomly selected Klees and two randomly selected Kandinskys were matched. Participants were paid for a randomly selected round. Information on the round that was selected and how much the subjects earned in this round was made available at the end of the experiment.
During stages 2 and 3, no feedback about other subjects’ behavior was provided. Hence, this first block is called the no-feedback condition.
Stage 4
Subjects played another ten rounds of PD, half of the rounds with ingroup and half with outgroup as in stage 2. Different from stage 2, before each round the subjects received aggregate-level feedback about: (1) the percentage of cooperative decisions made by Klees who were matched with other Klees, (2) the percentage of cooperative decisions made by Klees who were matched with Kandinskys, (3) the percentage of cooperative decisions made by Kandinskys who were matched with other Kandinskys, and (4) the percentage of cooperative decisions made by Kandinskys who were matched with Klees, averaged up to that round. The payment procedure was the same as in stage 2.
Aggregate feedback (rather than individual feedback) was provided for the following reasons. If individual feedback was provided, each subject would experience a unique experimental history. This would undermine the experimental control over the subjects’ experiences and would increase unsystematic variance in the data, complicating statistical analysis. Aggregate-level feedback ensured not only that each subject received the same aggregate information but also that each subject knew that each subject received the same information.
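The four aggregate percentages shown in stage 4 could be computed along the following lines. This is a sketch only: the data layout is hypothetical, and "averaged up to that round" is read here as pooling all decisions made so far, which is an assumption.

```python
from collections import defaultdict


def feedback_percentages(decisions):
    """Percentage of cooperative decisions for each (own group, partner group) pair.

    decisions: iterable of (own_group, partner_group, cooperated) tuples pooled
    over all rounds played so far. The tuple layout is a hypothetical choice.
    """
    counts = defaultdict(lambda: [0, 0])  # pair -> [cooperative decisions, all decisions]
    for own, partner, cooperated in decisions:
        counts[(own, partner)][0] += int(cooperated)
        counts[(own, partner)][1] += 1
    return {pair: 100.0 * c / n for pair, (c, n) in counts.items()}


# Made-up decisions from earlier rounds, for illustration only.
history = [
    ("Klee", "Klee", True), ("Klee", "Klee", False),
    ("Klee", "Kandinsky", False), ("Kandinsky", "Klee", True),
    ("Kandinsky", "Kandinsky", True), ("Kandinsky", "Kandinsky", True),
]
feedback = feedback_percentages(history)
```

Because every subject sees the same dictionary of four percentages, the information is common to all subjects, which is the property the design relies on.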
Stage 5
Participants played a final five rounds of PG in heterogeneous groups of two Klees and two Kandinskys with feedback. Before each round, the subjects were informed about the percentage of cooperative decisions made by Klees and Kandinskys, averaged up to that round.
Once stage 5 was completed, the randomly selected payment rounds and the amount the subjects earned throughout the experiment were revealed.
Choice condition
The procedure was identical to the mixed condition with one exception. Instead of alternating the group of the interaction partners exogenously, in stages 2 and 4, each subject submitted whether she wanted to be matched with a Klee or a Kandinsky before each PD round. Then, the subject was matched with another subject from the target group who wanted to be matched with someone from the subject’s group. 11
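One way to implement such a matching rule is sketched below. The function and data layout are hypothetical, and how subjects without a compatible partner were handled in the experiment is not modeled: the sketch simply returns them as unmatched.

```python
import random


def match_by_choice(choices, rng):
    """Pair subjects whose stated preferences are mutually compatible.

    choices: dict subject_id -> (own_group, wanted_group).
    Returns (pairs, unmatched). Treatment of unmatched subjects is an
    open assumption, so they are only reported, not reassigned.
    """
    pools = {}
    for subject, key in choices.items():
        pools.setdefault(key, []).append(subject)
    for pool in pools.values():
        rng.shuffle(pool)

    pairs, unmatched = [], []
    done = set()
    for (own, wanted), pool in pools.items():
        if (own, wanted) in done:
            continue
        if own == wanted:
            # Ingroup seekers pair up within their own pool.
            while len(pool) >= 2:
                pairs.append((pool.pop(), pool.pop()))
            unmatched.extend(pool)
        else:
            # Outgroup seekers pair with the mirror pool of the other group.
            mirror = pools.get((wanted, own), [])
            while pool and mirror:
                pairs.append((pool.pop(), mirror.pop()))
            unmatched.extend(pool + mirror)
            done.add((wanted, own))
        done.add((own, wanted))
    return pairs, unmatched


# Made-up choices for five subjects.
choices = {
    1: ("Klee", "Klee"), 2: ("Klee", "Klee"),
    3: ("Klee", "Kandinsky"), 4: ("Kandinsky", "Klee"),
    5: ("Kandinsky", "Kandinsky"),
}
pairs, unmatched = match_by_choice(choices, random.Random(0))
```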
In stage 4 in which feedback was provided, the subjects were informed before partner choice and the PD play about the same percentages of cooperative behaviors of each group against each group as in the mixed condition.
Stages 3 and 5 were exactly the same as in the mixed condition: subjects played five PGs in a heterogeneous group (two Klees and two Kandinskys) formed randomly in every round, without and with feedback. In other words, in PGs, subjects could not choose the groups of their interaction partners.
Isolated condition
The procedure was identical to the other two conditions with the following exception. In stages 2 and 4, where subjects played ten PDs without and with feedback, respectively, the subjects were matched with people only from their own group. Stages 3 and 5 in this condition were exactly the same as the other two conditions. To be clear, I do not expect any difference between this condition and the ingroup cooperation in the mixed condition. Different from the previous two conditions, however, in this condition there is no (negative) contact between the two groups, which enables me to test Hypothesis 3.
Control condition
The control condition did not induce group identity. In stages 2 and 4 (3 and 5), subjects played ten PDs (five PGs) each round with randomly selected others without and with feedback, respectively. In stages 4 and 5, before each round, the subjects were informed about the overall percentage of cooperative decisions, averaged up to that round. To be clear, I do not have hypotheses about the control condition. This condition would reveal if the inducement of group identity influences cooperation in an unexpected way.
Note that the mixed, choice, isolated, and control treatments were between-subjects, whereas the six stages, and hence the no-feedback and feedback treatments, were within-subjects. The feedback treatment always followed the no-feedback treatment. I chose not to include a further condition in which the order of the no-feedback and the feedback treatments was reversed (i.e., no-feedback followed feedback). This was because in this alternative order, even if no feedback was provided in the subsequent block, at least some subjects would remember the feedback provided in the first block.
Postexperiment Survey
The postexperiment survey included the following four items taken from Yamagishi and Kiyonari (2000) to measure the degree of identification with the ingroup and outgroup: “How strongly did you feel belongingness to the (Klee or Kandinsky) group?” (belongingness); “How much commonality did you think you shared with the members of the (Klee or Kandinsky) group?” (commonality); “How close did you feel toward the members of the (Klee or Kandinsky) group?” (closeness); and “How favorably did you feel toward the members of the (Klee or Kandinsky) group?” (liking). The answer categories were from 1 = not at all to 7 = very much. In the control condition that had no grouping, these questions were omitted.
Basic demographics (age, gender, study field, occupation, nationality), political party preference measured from 1 (extreme left) to 11 (extreme right), number of siblings, and level of experience in similar experiments were other variables measured in the postexperiment survey. 12
Results
Manipulation Check
Participants in the minimal group condition felt more closeness, difference = 1.48, t(121) = 9.50, p < .05; more belongingness, difference = 1.91, t(121) = 10.59, p < .05; and more commonality, difference = 1.34, t(121) = 7.89, p < .05, toward the ingroup and liked their ingroup significantly more, difference = 1.30, t(121) = 7.10, p < .05, compared to the outgroup. 13 All of these differences remain statistically significant when the three experimental conditions are considered separately. 14 Because all differences are substantial (i.e., larger than 1) and statistically significant, it is concluded that the inducement of heterogeneity was successful.
Cooperation
Figure 1 displays the average cooperation in the PD and PG by treatment. Before interpreting the results, I note that there is no significant main effect of a preference for the paintings of Klee (vs. Kandinsky) on cooperation, nor is there a significant interaction of a preference for the paintings of Klee (vs. Kandinsky) with the experimental conditions for any of the four panels in Figure 1. Hence, I collapse Klees’ and Kandinskys’ responses in all analyses. Furthermore, there is no significant effect of being a student on cooperation, and there is no significant interaction between being a student and the experimental condition. Similarly, being a British citizen has no significant effect on cooperation in the PD or in the PG.

Figure 1. Average Cooperation by Treatment
The significance tests reported in the following are based on multilevel logistic regression models with random effects for subjects (Table 4). The subject-level random effects capture roughly the θ parameter in the theoretical model. When included in the model as a second level, variance across sessions is not statistically significant. The regression models also include the control variables sex, political party preference, and age, the only subject-level covariates that influence cooperation significantly. The effect of feedback on the effects of treatment is captured by interaction terms in the regression models.
Table 4. Multilevel Logistic Regression Models with Random Intercepts for Subjects Predicting Cooperative Behavior (1 = Cooperate, 0 = Defect) in the Prisoner’s Dilemma (PD) and Public Goods Game (PG)
Note: p < .05 for two-tailed tests.
Note that the formal tests of the hypotheses are performed via a number of linear combinations of the coefficients in Table 4. Table 5 presents how these linear combinations are calculated, together with the associated hypotheses. Table 5 also includes the standard errors and the z-values of the linear combinations, which are used to test the hypotheses formally. 15
Table 5. Tests of Hypotheses as Linear Combinations (LC) of Coefficients in Table 4
Note: p < .05 for one-tailed tests.
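The logic of testing a hypothesis through a linear combination of regression coefficients can be sketched as follows. This is a minimal illustration, not the code used in the article: the function name, the coefficient names, and all numerical values are hypothetical and are not the Table 4 estimates. It assumes the usual large-sample result that the variance of a weighted sum of coefficients is the weighted sum of the entries of their covariance matrix.

```python
import math

def linear_combination(coefs, cov, weights):
    """Return (estimate, standard error, z) for the combination sum_k w_k * b_k.

    coefs   : dict mapping coefficient name -> estimate
    cov     : nested dict with the covariance of each pair of estimates
    weights : dict mapping coefficient name -> weight in the combination
    """
    names = list(weights)
    estimate = sum(weights[k] * coefs[k] for k in names)
    # Var(w'b) = sum over all pairs (a, b) of w_a * w_b * Cov(b_a, b_b)
    variance = sum(
        weights[a] * weights[b] * cov[a][b] for a in names for b in names
    )
    se = math.sqrt(variance)
    return estimate, se, estimate / se

# Hypothetical values for illustration only; NOT the estimates in Table 4.
coefs = {"ingroup": 1.0, "ingroup_x_feedback": 0.5}
cov = {
    "ingroup": {"ingroup": 0.04, "ingroup_x_feedback": -0.01},
    "ingroup_x_feedback": {"ingroup": -0.01, "ingroup_x_feedback": 0.09},
}
# Effect of the (hypothetical) ingroup dummy when feedback is provided:
# main effect plus interaction term.
weights = {"ingroup": 1.0, "ingroup_x_feedback": 1.0}

est, se, z = linear_combination(coefs, cov, weights)
print(f"LC = {est:.2f}, SE = {se:.3f}, z = {z:.2f}")
```

The covariance term matters: ignoring it and adding the two variances alone would misstate the standard error whenever the coefficient estimates are correlated, which is typical for a main effect and its interaction.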
Prisoner’s Dilemma Behavior
As apparent from Figure 1, there is a substantial difference between within-group and between-group cooperation in the mixed condition (Hypothesis 1). This is in line with theoretical predictions, but the effect size is striking: the odds of cooperating are approximately 4.6 times higher (see Table 5) with feedback and 3.2 times without feedback when ingroup members interact compared to the situation in which outgroup members interact.
As predicted, when they have the choice, the majority of the subjects (more than 70 percent) play with an ingroup member (Hypothesis 2a). A separate multilevel logistic regression model with random effects for subjects predicting partner selection shows that the percentage of subjects who choose to interact with an outgroup member is significantly different from 50 percent (z = −5.04, p < .05). The predicted difference between within-group and between-group cooperation rates in the choice condition becomes statistically significant only when feedback is provided (Hypothesis 2b) (Δb = .63, z = 1.86, p < .05; see Table 5). This difference is substantively significant, too: when feedback is provided, the odds of cooperating are about 1.9 times higher in the choice condition when ingroup members interact compared to the situation in which outgroup members interact.
I also find strong support for Hypothesis 2c: the difference between within-group and between-group cooperation is lower in the choice condition compared to the mixed condition (without feedback, difference in Δb = −1.46, z = −3.21, p < .05; with feedback, difference in Δb = −.89, z = −2.03, p < .05; see Table 5). Figure 1 is useful to interpret the substantive significance of this result. Without feedback, within-group and between-group average cooperation rates are virtually identical in the choice condition but substantially different (i.e., larger than 0.2) in the mixed condition. With feedback, the average between-group cooperation becomes lower than the average within-group cooperation in the choice condition, too. But the difference remains much higher in the mixed condition.
As an aside, results also show that when the mixed and the choice conditions are combined in a separate regression model, the difference between within-group and between-group cooperation is higher under the feedback condition than under the no-feedback condition, Δ(Δb) = .61, z = 2.11, p < .05. Thus, feedback moderates the negative effect of heterogeneity on cooperation.
Public Goods Game Behavior
In the no-feedback condition, cooperation rates in PG across treatments are very similar. In fact, the differences between treatments are not statistically significant when subjects do not receive feedback about others’ decisions. When feedback is provided, the odds of cooperating in the isolated condition are approximately 12 times higher than in the mixed condition, and the difference is statistically highly significant (Δb = 2.46, z = 2.70, p < .05). The difference between the isolated and the choice condition is also statistically significant with feedback (Δb = 1.72, z = 1.89, p < .05). This corresponds to an odds ratio between the isolated and the choice condition of about 5.6. These results support Hypothesis 3.
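The odds ratios quoted in the text follow from exponentiating the reported differences in log-odds (Δb). The short sketch below reproduces that arithmetic for the Δb values given above; the function name and the labels are illustrative and not taken from the article.

```python
import math

def odds_ratio(delta_b):
    """Convert a difference in log-odds into an odds ratio."""
    return math.exp(delta_b)

# Differences in log-odds (Δb) as reported in the text.
reported = {
    "PD, choice, ingroup vs. outgroup, with feedback": 0.63,
    "PG, isolated vs. mixed, with feedback": 2.46,
    "PG, isolated vs. choice, with feedback": 1.72,
}

for label, delta_b in reported.items():
    print(f"{label}: odds ratio = {odds_ratio(delta_b):.1f}")
# PD, choice, ingroup vs. outgroup, with feedback: odds ratio = 1.9
# PG, isolated vs. mixed, with feedback: odds ratio = 11.7
# PG, isolated vs. choice, with feedback: odds ratio = 5.6
```

The value 11.7 is what the text rounds to "approximately 12."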
Discussion
In this article, I provide a micro-behavioral model and an experimental design to understand the effects of heterogeneity in social identities on cooperation. The theoretical scheme is a typical example of a micro-macro model (Coleman 1990; Hedström 2005; Raub et al. 2011). The micro-behavioral model links the macro constructs of heterogeneity and cooperation via the social preferences of actors while accounting for partner selection decisions (and thus endogenous sorting). The study shows that given the same distribution of micro-level social preferences, depending on the nature of initial macro conditions, quite different macro outcomes may emerge. For instance, the freedom actors have in choosing the social group of their interaction partners has a very substantial effect on the link between heterogeneity and cooperation.
The experimental design induces heterogeneity independent of endogenous sorting and cooperation and thus provides a clear assessment of the causal effect of heterogeneity on cooperation. Moreover, the design allows us to naturally analyze cooperation at two levels, dyadic and tetradic.
Results show that when dyadic interactions are considered, heterogeneity decreases cooperation substantially. Most importantly, however, this negative relationship is moderated by endogenous sorting. When the subjects have the option to choose their interaction partners, a somewhat paradoxical effect—which is precisely predicted by the theoretical model—occurs. On the one hand, most of the subjects (more than 70 percent) sort and choose to interact with other ingroup members. On the other hand, this sorting reduces ingroup bias with respect to cooperative behavior. This is because when the subjects have a partner choice, then those who would act noncooperatively toward outgroup others are more likely to play with ingroup others. Consequently, the difference between within-group and between-group cooperation is lower in the choice condition compared to the case in which subjects are forced to interact with outgroup members.
This particular result has a number of methodological implications. First, it shows that it is very difficult to measure heterogeneity independent of endogenous sorting in observational studies because in most real-life situations actors have a partner choice. As a consequence, most observational studies might have underestimated the “true” causal effect of heterogeneity on social cohesion. This is because in those studies the researcher can measure heterogeneity and cohesion only after participants have sorted into their environments. Conversely, most previous intergroup experiments might have overestimated the level of ingroup bias that occurs naturally. This is because in those experiments subjects typically do not have a partner choice but are forced by the experimenter to interact either with ingroup or outgroup members.
Taken together, the theoretical and experimental results help us understand under which conditions heterogeneity is expected to reduce cooperation and social cohesion. At the local level, heterogeneity clearly decreases cooperation. But the size of this negative effect depends on how much freedom actors have in selecting their social environments. In rigid environments in which sorting in or out is relatively difficult, such as residential areas or schools, heterogeneity will likely reduce cooperation. In more fluid environments in which actors can voluntarily choose their interaction partners, heterogeneity is expected to influence cooperation much less, if at all. For example, heterogeneity in social clubs or in one’s personal network should not have strong effects on social cohesion.
In addition to endogenous sorting, how much heterogeneity reduces cooperation also depends on the history of intergroup relations and how much actors are primed about this history. The present experiment shows, as predicted by the presented theory, that information about intergroup relationships is another key moderator of the heterogeneity-cooperation link. Cooperation diminishes in heterogeneous environments quite substantially when actors are informed about a negative history between groups. If there is prior negative contact but the subjects are unaware of it or if there is no prior contact between groups, cooperation remains relatively high in heterogeneous environments. This also hints that the negative effect of heterogeneity on cohesion is partly a self-fulfilling prophecy. The theory predicts that making intergroup conflict more salient by, for example, a sustained emphasis from the media or politicians will reduce social cohesion in heterogeneous environments. Results also show that the effect of information about negative contact spills over to more macro-level contexts. For instance, in this experiment, negative contact happens initially in the PD context. Yet, this negative intergroup history in the PD reduces cooperation at a more macro level, namely, in the heterogeneous PG. This implies that making intergroup conflict in a micro-context more salient would reduce cooperation not only in the micro-context but also at more macro levels.
In real life, group interactions are more complex than what is theoretically and experimentally analyzed here: many other factors co-vary with social identity. For example, some ethnic groups are often worse off than others, and there is often a status difference between groups. Precisely because of these confounding factors, it is difficult to isolate the causal effect of heterogeneity in observational studies. However, many of these factors will likely reinforce the results presented in this article. If, for example, social identity is coupled with inequality, this would likely strengthen the differences between groups because actors would be different not only with respect to social identities but also with respect to resources. Nevertheless, future work should extend the experimental design presented in this article with additional elements. The most obvious next step is inducing social identity and inequality together.
Footnotes
Acknowledgements
I thank John Jensenius III for his help in programming and Wojtek Przepiorka, Akitaka Matsuo, Vincent Buskens, and Werner Raub for helpful discussions. I also thank participants of the CESS Colloquium Series, the First International Meeting on Experimental Social Sciences Conference in Oxford, the Social Sciences Seminar Series at Koc University, the Thurgau Experimental Economics Meeting in Stein am Rhein, the Seventh Analytical Sociology Conference in Mannheim, the International Conference on Experimental Social Science on Social Dilemmas in Utrecht, four anonymous SPQ reviewers, and the SPQ editors Richard T. Serpe and Jan E. Stets for their helpful comments. The experiment reported in this article was conducted at Nuffield Centre for Experimental Social Sciences.
Funding
The author disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research is funded by the Netherlands Organization for Scientific Research (NWO) under grant 446-13-004.
1
There are a number of experimental studies that investigate cooperation in dynamic networks (e.g., Gallo and Yan 2015; Rand, Arbesman, and Christakis 2011; Wang, Suri, and Watts 2012). These studies also allow partner selection, similar to the present design. In these dynamic network experiments, however, a subject can form a tie (or sever an existing tie) with a specific and anonymous individual rather than a randomly selected member of a social group. Hence, these experiments do not study the effects of heterogeneity in social identity on endogenous sorting and cooperation. A further difference between those dynamic network experiments and the current design is that in those experiments, a subject can be matched with several other subjects simultaneously. Nevertheless, the results of those dynamic network experiments unequivocally show that endogenous sorting increases cooperation, which is in line with the core argument of this article.
2
The average level of ingroup and intergroup cooperation depends on the distribution of
3
In these examples, (
.
4
Written formally, what sufficiently high means that an actor i who has no partner-choice and plays the game with
5
One of the ten sessions had to be aborted prematurely due to a power failure and a subsequent software crash. Data recorded before the power failure were used in the analyses whenever possible. Excluding this session from the analyses does not change the results.
6
All instructions and the screens subjects saw during the experiment are available from the author upon request.
7
Splitting the session into equally sized groups maximizes heterogeneity as measured by the Herfindahl index (one minus the sum of squared group shares), a standard measure of heterogeneity.
9
10
11
In some cases, the number of people in a focal group who wanted to be matched with a target group could differ from the number of people in the target group who wanted to be matched with the focal group. In such cases, the computer performed a one-to-many matching. If nobody in the target group wanted to be matched with the focal group, the computer randomly selected a decision from previous rounds made by a member of the target group who had wanted to be matched with the focal group. If there was no such decision in any of the previous rounds, the computer randomly selected a decision. Subjects were fully informed about this matching procedure.
12
Generalized trust was also measured in the postexperiment survey. I do not discuss the results on trust in detail for brevity. The levels of trust across the four experimental conditions are almost a copy of the levels of cooperation in PG with feedback. That is, trust in the mixed condition is significantly lower than trust in all other three conditions. This is in line with past research that shows a deep connection between cooperation and trust (Irwin, Mulder, and Simpson 2014;
).
13
I report one-tailed p values when I test explicit hypotheses (e.g., Table 5). All other p values (manipulation checks and those in Table 4) are two-tailed.
14
15
The standard errors of the linear combinations are calculated using the Stata command “lincom.”
