Abstract
We present a set of studies designed to test the robustness of the acknowledgment of earned entitlement effect across experimental modes and populations. We report three sets of results. The first is derived from a between-subject analysis of two independent but comparable samples of nonstudent adults. One sample participated in a standard behavioral laboratory experiment and the other in a survey experiment. The two methods returned similar treatment effects. The second set of results relates to a sample of students drawn from a behavioral laboratory’s pool of registered subjects, who participated in both the behavioral laboratory and survey experiments. We again perform a between-subject comparison of the two treatment-elicitation methods, this time within the same sample of subjects. Again, the treatment effects are very similar. Finally, we establish that, within subjects, there is some consistency between decisions made under the two methods.
Introduction
With reference to experiments in the social sciences, Deaton and Cartwright (2018:12) recently wrote that “Establishing causality does nothing in and of itself to guarantee that the causal relation will hold in some new case, let alone in general.” Bardsley et al. (2010:51) share this opinion, writing that, “strictly, all that happens in a particular laboratory experiment is what happens in it.” Using the conventional terminology in social science research, a major concern is that experiments may lack external validity (Campbell 1957; Campbell and Stanley 1963; Cook and Campbell 1979). This is disquieting, given that the main motivation for virtually any experiment is to say something of relevance beyond the local conditions of a particular experimental trial. A common response to this tension is to argue that what one generalizes is not a particular experimental result, but a theoretical construct: “Even simple theory can do much to interpret, to extend, and to use RCT results” (Deaton and Cartwright 2018:15). Under this interpretation, “an experimental setting is externally valid if it is an instance of the theory it tests and neither adds nor subtracts anything from the theory” (Zelditch 2014:96). Therefore, an experiment is just one instantiation of a theoretical construct and, if a new experiment offers further evidence supporting the theoretical construct, we can say that the external validity of the theoretical construct has increased.
Evidence of external validity must, by definition, be empirical (Morton and Williams 2010:266), and most discussions on the generalizability of experimental results recommend scientific replication as the cornerstone of scientific advancement and knowledge accumulation (Camerer 2015; Falk and Heckman 2009). But there are two very different ways of replicating an inference or knowledge claim: (1) by reproducing the exact experimental settings of the original trial and (2) by testing the same knowledge claim using a different experimental mode. Focusing on this second definition of replication, two methods emerge as the ideal candidates for increasing the generalizability of a theoretical construct: theory-guided meta-analysis and cross-validation or empirical cross-checking. The former uses data from previously conducted experiments to study whether methods, populations, and experimental settings affect findings. The latter compares experiments involving different methods to test the same theoretical construct. The most common example of empirical cross-validation or cross-checking in experimental social science involves the comparison of laboratory and field results on the same theoretical construct (Bardsley et al. 2010:chapter 5; Guala 2005:chapter 9). Correspondence between the lab and the field is typically interpreted as evidence of external validity. Here we extend this idea of empirical cross-validation to the correspondence between a behavioral and a survey experiment.
A few recent papers across the social sciences study the correspondence between behavioral and survey results measuring similar theoretical constructs. Becker et al. (2012) find weak correlations between economic preferences and psychological personality measures. Falk et al. (2018) present a data set containing experimentally validated survey measures of five individual preferences and behavioral tendencies much studied by experimental social scientists. Chen and Tam (2019) correlate three measures of inequality preferences: behavioral games, vignette studies, and conventional attitudinal survey items. Eifler (2010) and Petzold and Wolbring (2019) discuss the validity of survey experiments more generally by looking at the correspondence between behavioral intentions reported in survey experiments and actual behavior in the field. They find similarities between the two experimental modes, especially regarding treatment effects, but also quantitative deviations between the two.
In this article, we present the results of a cross-validation exercise to test the generalizability of a widely studied theoretical prediction: earned entitlement is acknowledged. First proposed by Homans (1961), this principle implies that justice exists when the received benefit of a group member is proportional to her investments. Adams (1965) materialized this idea in the well-known “equity formula.” Contrary to the Rawlsian egalitarian ideal (Haidt 2012; Rawls 1971), Homans’ and Adams’ proportionality principle offers a normative justification for some inequalities to be tolerated, justified, or accepted. In the last 60 years, a number of experimental studies have been conducted on the empirical validity of the equity or proportionality principle in sociology, economics, psychology, political science, and philosophy (e.g., Cook and Hegtvedt 1983; Konow 2003; Scott et al. 2001; Wagstaff, Huggins and Perfect 1996). One of the main conclusions of these studies was anticipated by Homans (1961:246) when he claimed that people “differ in their ideas of what legitimately constitutes investment, reward, and cost, and how these things are to be ranked.” Here, we build on more recent definitions of justice as proportionality (Barr et al. 2015; Barr, Miller, and Ubeda 2016; Cappelen et al. 2007; Demel et al. 2019; Haidt 2012; Konow 1996, 2000, 2003; Winter et al. 2012) and propose a testable principle—the acknowledgment of earned entitlement—that will be defined and operationalized in the Method section.
Our cross-validation involves survey and laboratory experiments with heterogeneous samples. At the core of this strategy is an experimental paradigm that has been extensively used in several social sciences and has been previously called a distributive justice experiment (Barr et al. 2016). A distributive justice experiment typically entails two consecutive phases. In the production phase, people undertake an individual or group task. This phase might be hypothetical (a vignette or story created by the researchers) or real (actual participants in the experiment undertake the task). In the distribution phase, participants are asked to distribute the surplus created in the production phase. Again, this can be done hypothetically, that is, participants subjectively rate the fairness or social appropriateness of various proposed distributions, or it can be real, that is, participants actually (re)distribute monetary or nonmonetary allocations. The main experimental manipulation involves informing participants in the distribution phase about the source of the surplus that is to be distributed. In Earned treatments, at the start of the distribution phase, participants are informed about the contribution or performance of individual subjects in the production phase. In Random or Luck treatments, at the start of the distribution phase, participants are informed that the initial distribution has been generated randomly, by chance, or by another arbitrary mechanism. The main theoretical prediction is that, on average, across methods, settings, and populations, a difference in productivity (Earned treatment) instead of luck (Random treatment) as the source of inequality causes an increase in the acceptability of that inequality.
We present three sets of results. The first is derived from a between-subject analysis of two independent, but comparable samples of nonstudent adults (referred to collectively as the adult sample below). One sample participated in a standard, incentivized behavioral experiment and was recruited following a number of strategies including making use of mailing lists of public institutions, employment centers, and local companies (for more details see Barr et al. 2016). The other sample was part of an ongoing program evaluation and participated in a survey experiment during the first day of the program. Despite the many differences between the behavioral and survey experimental designs (see Method section for details), they returned similar treatment effects, consistent with acknowledgment of earned entitlement in both cases. The second set of results relates to a sample of students drawn from a behavioral laboratory’s pool of registered subjects (referred to below as the student sample). They participated in both the behavioral and survey experiments. We perform a between-subject comparison of the two treatment-elicitation methods but, this time, focusing on the same sample of subjects. Again, the treatment effects are very similar. Finally, using the data from the latter experiment, we establish that, within-subjects there is some consistency between the decisions and responses made under the two modes.
Our study relates closely to the recent paper of Bader et al. (forthcoming) on the sensitivity of laboratory findings to changes in experimental samples and settings. In the case of prosocial behavior, they find that quantitative effects vary significantly across conditions, while qualitative treatment effects are mostly consistent. The two papers are complementary in that they assess the generalizability of laboratory results in two different types of study, those that focus on preferences and behavioral tendencies that map directly onto choices, and those that focus on preferences that map onto differences in choices across contexts, that is, that map onto treatment effects rather than choices.
Method
In this section, first, we briefly present the design of our behavioral experiment, which is identical to that used by Barr et al. (2015), Barr et al. (2016), and Demel et al. (2019), and set out the analytical approach used to establish whether acknowledgment of earned entitlement manifests as a treatment effect in the experiment. Second, we describe the design of the survey experiment, explain how it relates to the same theoretical framework, while differing in many ways from the behavioral experiment, and set out the analytical approach used to establish whether acknowledgment of earned entitlement manifests as a treatment effect in the experiment. Finally, we describe all the samples involved in the experiments as well as the experimental procedures followed in each study.
Behavioral Experiment
This experiment is noncomputerized and has two stages. Participants first engage in an individual real effort task and then play a four-person distributive justice (DJ) game. The real effort task involved sorting yellow and blue gravel into various containers for seven minutes. In the earned treatment, each participant’s performance ranking in this real effort task determined his or her initial endowment in the four-person DJ game. In the random treatment, the initial endowments were randomly assigned. In both treatments, within a playing four, one participant received an initial endowment of €16, one €12, one €10, and one €6.
The DJ game was undertaken using specially designed and manufactured trays. Each participant received a tray. Each tray was divided into four quadrants, each quadrant relating to a participant. The tray-receiving participant’s own quadrant was blue and located at the side of the tray closest to the participant when the tray was placed on a desk in front of him or her. Each quadrant contained a number of counters indicating the initial endowment of the corresponding participant. Each counter was worth €1. The participants were invited to rearrange the counters across the quadrants as they saw fit, while being instructed not to remove any of the counters from the tray. Once all the participants had made their allocations, those of one of the four, randomly selected, determined the payoffs.
In addition to their payoffs from the DJ game, each participant received €4. In the random treatment, this €4 was presented as a flat fee for the real effort task. In the earned treatment, the €4 was added to each of the possible earnings levels and then set aside to be collected at the end of the session. Thus, the €4 represented a minimum total final payoff for each experimental participant. There was no additional show-up fee.
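The payoff rule described above can be sketched as follows. This is only an illustration under stated assumptions: the function name, the tray representation (one list of four counter counts per allocator, indexed by participant), and the example trays are hypothetical and are not taken from the experimental materials.

```python
import random

SIDE_PAYMENT = 4      # euros received by every participant on top of the DJ game payoff
TOTAL_COUNTERS = 44   # 16 + 12 + 10 + 6 counters, each worth 1 euro


def final_payoffs(trays, rng):
    """Pick one allocator at random and pay everyone according to that tray.

    trays[i][k] is the number of counters allocator i placed in
    participant k's quadrant (hypothetical representation).
    """
    for tray in trays:
        # Counters may be rearranged but not removed from the tray.
        if len(tray) != 4 or min(tray) < 0 or sum(tray) != TOTAL_COUNTERS:
            raise ValueError("each tray must distribute all 44 counters over 4 quadrants")
    selected = rng.randrange(len(trays))
    payoffs = [counters + SIDE_PAYMENT for counters in trays[selected]]
    return selected, payoffs


# Hypothetical allocations made by the four members of one playing four.
example_trays = [
    [16, 12, 10, 6],   # leaves the initial endowments untouched
    [11, 11, 11, 11],  # equalizes
    [20, 10, 8, 6],
    [12, 12, 12, 8],
]
selected, payoffs = final_payoffs(example_trays, random.Random(0))
print(selected, payoffs)
```

Because only one randomly selected tray is implemented, each allocator's decision has the same chance of determining the group's earnings, and the side payment of €4 is the lowest total any participant can receive.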
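To investigate whether a sample of participants in the experiment described above acknowledge earned entitlement, focusing on the allocations made to others by those who did not take everything for themselves in the DJ game, one estimates linear regression model 1:
$$x_{ij} = a_1 y_j + a_2 E_i y_j + u_i + \varepsilon_{ij} \qquad (1)$$
where $x_{ij}$ is i’s allocation to j; $y_j$ is j’s initial endowment; $E_i$ takes the value 1 if i played under the earned treatment and zero if i played under the random treatment; $a_1$ and $a_2$ are the coefficients to be estimated; $u_i$ is the allocator fixed effect; and $\varepsilon_{ij}$ is the error term. Here $a_1$ is the slope of the relationship between initial endowments and allocations under the random treatment and $a_2$ is the difference in that slope between the earned and random treatments, so a significantly positive $a_2$ is evidence that earned entitlement is acknowledged.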
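A minimal sketch of how a model of this kind could be estimated is given below, assuming it takes the form x_ij = a1·y_j + a2·E_i·y_j + u_i + error, with x_ij and y_j both expressed as shares of the 44 tokens. The allocator fixed effect is removed by demeaning within allocator (the within estimator), and the two slopes are then obtained by ordinary least squares. The data are simulated; the function names, parameter values, and noise levels are hypothetical. This is not the authors' estimation code, and it does not compute standard errors.

```python
import random
from collections import defaultdict

ENDOWMENTS = [16, 12, 10, 6]  # euros; 44 in total, as in the DJ game


def demean_by_group(values, groups):
    """Subtract each group's mean from its members (sweeps out fixed effects)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]


def fit_model_1(x, y, earned, allocator):
    """Within estimator of (a1, a2) in x = a1*y + a2*E*y + u_i + error."""
    z = [e * yj for e, yj in zip(earned, y)]  # interaction term E_i * y_j
    xd = demean_by_group(x, allocator)
    yd = demean_by_group(y, allocator)
    zd = demean_by_group(z, allocator)
    syy = sum(a * a for a in yd)
    szz = sum(a * a for a in zd)
    syz = sum(a * b for a, b in zip(yd, zd))
    syx = sum(a * b for a, b in zip(yd, xd))
    szx = sum(a * b for a, b in zip(zd, xd))
    det = syy * szz - syz * syz  # determinant of the 2x2 normal equations
    a1 = (syx * szz - szx * syz) / det
    a2 = (szx * syy - syx * syz) / det
    return a1, a2


def simulate(n_allocators=400, a1=0.1, a2=0.6, noise_sd=0.01, seed=1):
    """Simulated allocations to the three other group members (hypothetical values)."""
    rng = random.Random(seed)
    x, y, earned, allocator = [], [], [], []
    for i in range(n_allocators):
        e_i = i % 2                # alternate random (0) and earned (1) treatments
        own = rng.randrange(4)     # the allocator's own position in the group
        u_i = rng.gauss(0, 0.05)   # allocator fixed effect
        for j, endowment in enumerate(ENDOWMENTS):
            if j == own:
                continue           # only allocations to others enter the model
            y_j = endowment / 44
            x.append(a1 * y_j + a2 * e_i * y_j + u_i + rng.gauss(0, noise_sd))
            y.append(y_j)
            earned.append(e_i)
            allocator.append(i)
    return x, y, earned, allocator


a1_hat, a2_hat = fit_model_1(*simulate())
print(f"slope under random: {a1_hat:.3f}; earned minus random: {a2_hat:.3f}")
```

Because the treatment indicator is constant within an allocator, its main effect is absorbed by the fixed effect, which is why only its interaction with the recipient's endowment appears as a regressor.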
Survey Instrument
The survey instrument presents participants with a scenario or vignette within which a decision-maker has to choose between two possible final resource distributions across three other people and asks the participants to state how fair they think each of the decision-makers’ possible choices is on a seven-point scale, ranging from “very unfair” to “very fair” (−3 and +3, respectively, in the Results section).
The scenario or vignette describes the second phase of a behavioral experiment involving groups of four subjects. Within this experiment, initial resource allocations were unequal. One of the group members received €16, one €11, and one €6. The fourth group member received no initial resource allocation, but was to decide to either leave the (unequal) distribution across the other three as it was or equalize by allocating €11 to each.
On the first page of the survey instrument, the distribution of initial allocations is represented graphically as counters or tokens on a circular “tray” divided into three sectors, each a different color (Figure 1). Then, on a second page (Figure 2), the distribution of initial allocations is shown again, using the same graphic image, followed by the two final distributions between which the decision-maker in the scenario had to choose. In one of the possible final distributions (distribution 1), the final allocations equal the initial allocations. In the other (distribution 2), the allocations are equalized. Below each of the two possible final distributions, appears the seven-point scale that the participants are asked to use to express their evaluations.

Figure 1. Survey experiment, page 1: Vignette.

Figure 2. Survey experiment, page 2: Evaluation sheet.
Across treatments in the survey experiment the information given to participants about the origin of the initial allocation in the scenario or vignettes varied. Under the earned treatment, participants were told that initial allocations were related to individual productivity. Under the random treatment, they were told that they were owing to luck. Each participant took part in both the earned and random treatments. The order was randomized.
There are several noteworthy differences between the behavioral and the survey experiments. First, in the behavioral experiment, the subject is the allocator, receives an initial endowment, and can reallocate to or from him or herself; whereas in the survey experiment, the subject observes and evaluates an allocator’s decisions and the allocator is a third party who receives no initial endowment and cannot reallocate either to or from himself or herself. These differences reflect the fact that some features of the behavioral experiment, while of value within the context of that experiment, were redundant and a potential distraction to the participants within the context of the survey experiment. Specifically, while in a behavioral experiment, by making subjects’ payoffs dependent on their own decisions, one expects to enhance the validity of those decisions as indicators of the subjects’ preferences, in a survey experiment, this is not an option. And, once this option is removed, one is free to refocus the design on the primary objective which, in this case, is to investigate whether individuals consider inequality to be more appropriate or fair when it is earned rather than owing to luck. So, in the survey experiment, subjects are ideally placed in an evaluative role, and it is useful to exclude the possibility that their evaluations are affected by how selfish they think a decision-maker is by rendering that decision-maker unable to affect his or her own payoff, that is, by placing the decision-maker in a third party rather than a stakeholder role.
Prior to running the survey experiment described above, we conducted a pilot study involving 79 students drawn from the same laboratory pool as the actual student experiments. In this pilot, we cross-cut the earned versus random initial endowment treatments with whether the decision-maker in the vignette was a third party or a stakeholder. An earned endowment effect was observed in both decision-maker-type variants. However, within both the earned and random treatment arms, the evaluations varied less under the third party compared to the stakeholder treatment. The lower variance under the former is consistent with selfishness being excluded as a possible motivation for the decision-maker (full results available on request).
In addition to the differences described above, in the behavioral experiment, the decision-maker faces an extensive choice set made up of all the possible distributions of 44 units across four individuals; whereas, in the survey experiment, the respondents are told that the decision-maker has only two options, leave the initial endowments untouched (distribution 1) or equalize (distribution 2). In the behavioral experiment, even though the choice set is extensive, only four allocations need to be made, and this can be done in just a few minutes. In contrast, evaluating each and every one of the possible distributions of 44 units across four individuals would have taken hours. So, in the survey experiment, to reduce time costs, we limited the number of final distributions that the subjects had to evaluate to the obvious candidates, distributions 1 and 2. In the pilot study mentioned above, to explore whether and how extending the list of distributions to be evaluated might affect evaluations, we added another cross-cutting treatment in which the decision-maker’s choice set included an inversion of the initial endowment distribution (distribution 3). Within both the earned and random treatment arms, this distribution was considered to be unfair and its inclusion in the choice set increased the mean fairness evaluation and reduced the variance in evaluations for both distributions 1 and 2. The overall effect was greater statistical power when testing the hypotheses of specific interest, but the power gain was insufficient to warrant increasing the number of evaluations by 50 percent. We also considered adding a weighted average of distributions 1 and 2 to the choice set, but decided against it as the weights would have been arbitrary and, again, it would have increased the number of evaluations to be undertaken by 50 percent.
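As a rough check on the size of the behavioral choice set, the number of ways of placing 44 indivisible €1 counters in four quadrants can be computed with the standard stars-and-bars formula. The sketch below is illustrative only; the function name and the assumed five seconds per evaluation are hypothetical and are not taken from the study.

```python
from math import comb


def count_distributions(units: int, recipients: int) -> int:
    """Ways to split `units` identical counters among `recipients` quadrants,
    allowing empty quadrants (stars and bars)."""
    return comb(units + recipients - 1, recipients - 1)


n_distributions = count_distributions(44, 4)

# Hypothetical evaluation speed, used only to illustrate the time cost.
SECONDS_PER_EVALUATION = 5
hours_needed = n_distributions * SECONDS_PER_EVALUATION / 3600

print(n_distributions)         # 16215
print(round(hours_needed, 1))  # 22.5
```

Even at this optimistic pace, rating every feasible distribution would take most of a day per respondent, which is consistent with restricting the survey instrument to a small number of candidate distributions.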
The time saved by asking the survey participants to evaluate only two of the many possible distributions was used to engage each subject in evaluations under both treatments, thereby ensuring perfect balance. The order of the treatments was randomized between subjects.
To establish whether a sample of individuals acknowledges earned entitlement in the survey experiment, one can compare the fairness evaluations of distribution 1 under the earned and random treatments. If distribution 1 is considered significantly fairer under the former compared to the latter, it can be taken as evidence that earned entitlement is acknowledged. To corroborate this finding, one can test whether distribution 2 is considered significantly less fair under the earned compared to the random treatment.
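The comparison described above can be illustrated with a simple two-sample statistic. The sketch below swaps in a Welch t statistic for the difference in mean fairness ratings between treatments; it is not the regression with sociodemographic controls reported in the Results section. The ratings are invented for illustration, and the function and variable names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch t statistic for the difference in means of two independent samples."""
    se = sqrt(variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b))
    return (mean(sample_a) - mean(sample_b)) / se


# Hypothetical ratings on the seven-point scale (-3 = very unfair, +3 = very fair),
# one rating per respondent, taken from the treatment each respondent saw first.
dist1_earned = [2, 1, 3, 2, 1, 2]
dist1_random = [-1, 0, -2, 1, -1, 0]
dist2_earned = [-1, 0, -2, -1, 1, -1]
dist2_random = [2, 1, 3, 1, 2, 2]

# Acknowledgment of earned entitlement predicts a positive statistic for
# distribution 1 and a negative one for distribution 2.
t_dist1 = welch_t(dist1_earned, dist1_random)
t_dist2 = welch_t(dist2_earned, dist2_random)
print(f"distribution 1: t = {t_dist1:.2f}; distribution 2: t = {t_dist2:.2f}")
```

Using only the first treatment each respondent faced keeps the two groups independent, which is what a two-sample statistic of this kind requires.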
Samples and Procedures
The participant sample for the behavioral experiment (N = 198) consists of a group of nonstudents recruited in the Spanish province of Biscay for the experiments conducted in 2013 and 2014 and reported in Barr et al. (2016) and Demel et al. (2019). Barr et al. (2016) and Demel et al. (2019) describe the procedures followed in the experiment in full. The participant sample for the survey experiment (N = 91) consists of adults who were taking part in an intervention conducted by the Government of Biscay. The survey experiment was conducted in 2016. In both samples, there are more females than males (63 percent and 53 percent female in the survey and behavioral experiments, respectively), most subjects have tertiary education (59 percent and 84 percent), and more than half are unemployed (54 percent and 56 percent). The sample in the survey experiment is older (42 vs. 27 years on average). In the statistical analysis, we control for these sociodemographic variables.
Participants in the within-subject experiment were undergraduate students (N = 124) recruited through the laboratory LABEAN’s recruitment site in Bilbao (Biscay). We adopted a 2 (treatments) by 2 (orders) factorial design. Each of these conditions was conducted in a different experimental session. Participants received a participation fee and their earnings from the behavioral experiment. The within-subject experiment was conducted in early 2017.
Results
In this section, we present three sets of results. First, we compare the results for the two independent adult samples that participated in the survey experiment and the behavioral experiment (Between-subject analysis of the adult sample subsection). Second, we replicate the same analysis but using the data for the student sample that participated in both the survey and behavioral experiments (Between-subject analysis of the student sample subsection). Finally, focusing on the student sample, we look at the within-individual correlation between behavior in the behavioral experiment and evaluations in the survey experiment (Within-subject analysis of the student sample subsection).
Between-subject Analysis of the Adult Sample
Figure 3 compares the treatment effects obtained under the two modes. In the left panel, we plot the average evaluation of the proportional tray (xi = yj; distribution 1 in the Method section) by the adult sample under each treatment in the survey experiment. To exclude learning effects, we include in the analysis only the evaluations made by participants under the treatment (random or earned) they faced first. In the random treatment, the average evaluation of the proportional tray is negative and significantly different from zero. In the earned treatment, the average evaluation of this tray is positive and significantly different from zero. The treatment effect on the evaluation of the proportional tray is significant (Table 1, columns 1 and 2; p < .001). There is also a significant treatment effect in the evaluation of the equal tray (distribution 2), which is evaluated more positively in the random than in the earned treatment (Table 1, columns 3 and 4; p < .001).

Figure 3. Between-subject analysis of the adult sample. Note: The figure shows the average fairness evaluations under the two treatments in the survey experiment (left panel) and the average slopes of the initial-endowment-allocation relationships under the two treatments in the behavioral experiment (right panel). The whiskers indicate standard errors. The bars in the left panel (survey) are derived from the regression reported in column 1 of Table 1. The bars in the right panel (behavioral) are derived from the regression reported in column 1 of Table 2.
Table 1. Regression Analysis of the Acknowledgment of Earned Entitlement in the Survey Experiment. Dependent Variable = Evaluations of the Tray.
Note: ** indicates sig. at 1 percent, * indicates sig. at 5 percent.
The right panel of Figure 3 presents the treatment effect in the behavioral experiment. The heights of the bars indicate the slope of the relationship between j’s initial endowment and i’s allocation to j in the distribution task. This slope is not significantly different from zero in the random treatment, but positive and significant in the earned treatment, meaning that participants in the earned treatment acknowledge initial endowments when making final allocations, while those in the random treatment do not. The difference between the slopes in the earned and random treatments is positive and statistically significant (Table 2, columns 1 and 2; p < .001). A visual comparison of the two panels of Figure 3 makes clear that both methods return similar treatment effects: both go in the same direction and are statistically significant at the 0.1 percent significance level.
Between-subject Analysis of the Student Sample
Figure 4 replicates the analysis reported in Figure 3 but focuses on the student sample that took part in both the survey and behavioral experiments. Again, the evaluation of the proportional tray is significantly higher in the earned compared to the random treatment (Table 1, column 5; p < .001) and the slope in the incentivized task is significantly higher in the earned compared to the random treatment (Table 2, column 3; p < .001), although the slope difference is lower than in the adult-sample analysis. Using a student sample, we have replicated the main finding reported above: The survey and behavioral tasks produce significant treatment effects that go in the same direction.

Figure 4. Between-subject analysis of the student sample. Note: The figure shows the average fairness evaluations under the two treatments in the survey experiment (left panel) and the average slopes of the initial-endowment-allocation relationships under the two treatments in the behavioral experiment (right panel). The whiskers indicate standard errors. The bars in the left panel (survey) are derived from the regression reported in column 5 of Table 1. The bars in the right panel (behavioral) are derived from the regression reported in column 3 of Table 2.
Table 2. Regression Analysis of the Acknowledgment of Earned Entitlement in the Behavioral Experiment. Dependent Variable = i’s Allocation to j (xij).
Notes: Samples include allocations to others; participant fixed effects, ui, included in all models; j’s initial endowment (yj) = j’s initial endowment expressed as a proportion of the 44 tokens in the game; Earned (E) = 1 if i made allocations under the earned treatment, = 0 if i made allocations under the random treatment; ** indicates sig. at 1 percent.
Within-subject Analysis of the Student Sample
Finally, we study the within-subject correlation between behavior and evaluations in the student sample. For each student participant i, first, we compute the slope of the relationship between j’s initial endowment and i’s allocation to j in the behavioral experiment. Then, we look at whether and how these slopes correlate with the same participants’ evaluations under the same treatment in the survey experiment. In the left panel of Figure 5, evaluations of the proportional tray are plotted against individual slopes in the behavioral experiment, pooled across treatments. As expected, there is a positive and significant relationship (Spearman’s Rho = 0.266; p = .004). However, there is also evidence of considerable noise in the relationship. In the right panel of Figure 5, evaluations of the equal tray are plotted against individual slopes in the behavioral experiment. Here, as expected, the correlation is negative and significant (Spearman’s Rho = −0.310; p < .001) but, once again, noisy. In summary, we find that, on average, those who evaluate the proportional tray more positively in the survey also acknowledge performance in the behavioral task more. In contrast, on average, those who evaluate the equal tray more positively acknowledge performance in the behavioral task less.
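The two steps of this within-subject analysis can be sketched as follows: an individual least-squares slope for each participant, followed by a Spearman rank correlation between those slopes and the survey evaluations. The sketch assumes a simple per-participant least-squares slope and the usual average-rank treatment of ties; all data, function names, and variable names are hypothetical, and no p-value is computed.

```python
from math import sqrt
from statistics import mean


def ols_slope(endowments, allocations):
    """Least-squares slope of one allocator's allocations on recipients' endowments."""
    m_e, m_a = mean(endowments), mean(allocations)
    sxx = sum((e - m_e) ** 2 for e in endowments)
    sxy = sum((e - m_e) * (a - m_a) for e, a in zip(endowments, allocations))
    return sxy / sxx


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for k in order[pos:end + 1]:
            ranks[k] = shared
        pos = end + 1
    return ranks


def spearman_rho(a, b):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    ra, rb = average_ranks(a), average_ranks(b)
    m_a, m_b = mean(ra), mean(rb)
    cov = sum((p - m_a) * (q - m_b) for p, q in zip(ra, rb))
    return cov / sqrt(sum((p - m_a) ** 2 for p in ra) * sum((q - m_b) ** 2 for q in rb))


# Hypothetical data: four allocators who each held the 16-euro endowment,
# so the three recipients hold 12, 10, and 6 of the 44 tokens.
recipient_endowments = [12 / 44, 10 / 44, 6 / 44]
allocations_by_participant = [
    [12 / 44, 10 / 44, 6 / 44],
    [10 / 44, 9 / 44, 9 / 44],
    [11 / 44, 11 / 44, 11 / 44],
    [13 / 44, 10 / 44, 5 / 44],
]
evaluations_proportional = [2, 0, -1, 3]  # survey ratings of the proportional tray

slopes = [ols_slope(recipient_endowments, a) for a in allocations_by_participant]
rho = spearman_rho(slopes, evaluations_proportional)
print(slopes, rho)
```

A rank-based measure is a natural choice here because the evaluations are ordinal ratings on a seven-point scale rather than interval-scaled quantities.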

Figure 5. Correlation between behavior and evaluations. Note: The figure shows correlations between evaluations of the proportional (left) and equal (right) trays in the survey task and the slope of the relationship between initial endowments and final allocations in the behavioral experiment.
Discussion
Social sciences differ in the methods they use to estimate similar, sometimes the same, treatment effects. For instance, in experimental economics, behavioral experiments prevail, while in experimental psychology, politics, and sociology, hypothetical survey tasks are more commonplace. An important question is whether these distinct methodological approaches produce different insights. In this article, we have focused on a particular treatment effect that has received a great deal of attention in sociology, economics, psychology, and political science for almost 60 years (Adams 1965; Homans 1961; Konow 2003): the idea that people tend to respect initial inequalities more when those inequalities are earned rather than owing to random assignment. We have compared results from a behavioral experiment with results from a survey experiment, and we have done so using both student and nonstudent samples. Both methods yield remarkably similar aggregate treatment effects for both samples, indicating that the acknowledgment of earned entitlement theoretical prediction has at least some external validity.
This result is in line with previous papers reporting similar treatment effects in field and survey experiments (Eifler 2010; Petzold and Wolbring 2019). One of the strengths of our validation exercise is that we obtain very similar treatment effects using experimental methods that differ in a number of design and implementation features. This being the case, our cross-validation findings may be viewed as conservative. However, further research is needed to uncover potential heterogeneous effects involving the interaction between the acknowledgment of earned entitlement and other operationalizations of the units, treatments, and experimental settings.
The within-subject correlations between behaviors and evaluations add further validity. However, here, the noise in the observed relationships is worthy of note. In future work, it would be interesting to isolate the roles played by real incentives, individual selfishness, and other contextual and individual characteristics in driving a wedge between individuals’ evaluations of the behavior of others and their own behavior, and to think about what this implies for our methodological choices.
Supplemental Material
Supplemental Material, sj-pdf-1-smr-10.1177_0049124120986194 for Is the Acknowledgment of Earned Entitlement Effect Robust Across Experimental Modes and Populations? by Abigail Barr, Luis Miller and Paloma Ubeda in Sociological Methods & Research
Footnotes
Acknowledgments
A. Barr acknowledges support from the Economic and Social Research Council via the Network for Integrated Behavioural Sciences (Award No. ES/K002201/1). L. Miller acknowledges support from the Spanish Ministry of Economy and Competitiveness (Grant ECO2015-67105-R). P. Ubeda acknowledges support from the Spanish Ministry of Science, Innovation and Universities (Grant PGC2018-097875-A-100).
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
