Abstract
We provide experimental evidence on the theory of club goods. Subjects played what we call ‘the club game’ – that is, a two-stage game: at the first stage, each player announces her favourite club size. Depending on the profile of announced club sizes, each player either joins a club of her favourite size or stays alone. At the second stage, club members play some kind of non-linear public goods game exhibiting the property that certain club sizes favour cooperation, and hence the payoffs for the players. Among other hypotheses, the central idea of club theory can be tested this way: A population of actors should partition itself in clubs of optimal size. As it turns out, this hypothesis can be validated.
Introduction
Buchanan (1965) coined the term ‘club good’ to address the gap between purely private and purely public goods. A good is purely private if it is perfectly excludable and highly rivalrous in consumption. Consider an apple; if alter eats the apple, ego can’t, so the apple is rivalrous. At the same time that’s not a big problem for ego, since the apple is excludable: If ego controls the apple, it should not be difficult for him to prevent alter from eating it.
At the opposite extreme on both dimensions, no actor can be excluded from the consumption of a purely public good (Musgrave, 1959, 1969) and, from a welfarist point of view, nobody should be excluded, since a purely public good is not rivalrous – that is, the benefit from its consumption is not diminished by an additional consumer (Samuelson, 1954). A nice example of a purely public good is national government – no fellow countryman can be excluded from the consumption of its highly beneficial operations, and the fact that ego enjoys excellent government does not preclude alter from doing so as well.
More seriously, the standard example for a purely public good is air; however, from overcrowded academic meetings taking place in undersized rooms scholars all over the world have learned that air definitely rivals. At this point, the theory of club goods enters the stage. Consider some particular room at your affiliated institution: while it is true that air is not rivalrous, if the group of people joining the academic meeting is small enough, there exists a tipping point where the marginal benefit of the witty contributions by one additional participant is outweighed by the marginal cost due to the fact that the additional participant consumes some of the air. Hence, there exists an optimal number of participants of an academic meeting.
In general, a club good is rivalrous, but not in the same ‘degree’ as a private good. Since a club good is excludable too, from its very beginning (Buchanan, 1965; Tiebout, 1956) club theory has dealt mainly with the question of optimal club size. In particular, scholars have identified various determinants of the optimal size of a club, and they have analyzed the consequences of the existence of an optimal club size for a population as a whole. Real-world examples of clubs and their quest for optimal size abound. For decades, the European Union has been at pains to mitigate the tension between a strengthening of the integration between its existing members and the inclusion of new applicants, such as Turkey. On a local level, institutions such as sports clubs and country clubs restrict membership to prevent overcrowding. It has even been demonstrated that secular trends with respect to religious sects and churches (e.g. Iannaccone, 1994) and their interior design (e.g. McBride, 2007) can be explained by interpreting these groups as clubs and applying club theory.
The theory of club goods is best seen as a family of canonical models (cf. Cornes and Sandler, 1996) and their present-day formulations, which deal with common themes but diverge with respect to the modelling techniques used. As already indicated, the major theme of club theory is the determination of optimal club size. Two modern classics in social theory highlight two different determinants of optimal club size (cf. Ahn et al., 2009). While Buchanan (1965) determines optimal club size as a function of the production technology of the club good, Olson (1965) discusses extensively the impact of group size on the incentives of the members of a group to contribute to the provision of a public good. Besides this so-called within-club-viewpoint, a lot of work has been directed at the total-economy-viewpoint. To understand the difference between these two viewpoints, a simple numerical example might be of some help. Suppose a population consists of four agents. Assume that the average payoff for a club member is a function of club size as summarized in Table 1.
Within-club-viewpoint and total-economy-viewpoint
From a within-club-viewpoint, clubs of size three are to be expected, since this size maximizes the average payoff for those actors who join the club. Contrary to this, the total-economy-viewpoint argues that clubs of size three are unlikely, since the resulting partition of the population is inefficient (3 × 5 + 1 × 0 = 15) – an equal split of the population into two clubs of size two provides a greater aggregate payoff (2 × 4 + 2 × 4 = 16). In this sense, the within-club-viewpoint and the total-economy-viewpoint collide in this example.
Now, the central idea of club theory applies to a situation where both viewpoints go hand in hand. The idea can be stated as follows. A population of actors splits itself into clubs of optimal size, whenever this is possible. To see this idea at work in our example, suppose the population consists of six actors and clubs of size five or six provide average payoffs smaller than five to their members. Of course, the optimal club size is still three; consequently the within-club-viewpoint predicts the emergence of many clubs of that size. But now the total-economy-viewpoint agrees with this prediction, since two clubs of size three is indeed the unique partition that maximizes the average payoff over the whole economy.
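The two examples can be checked by brute force over all ways of splitting the population into clubs. The following is a minimal sketch, assuming the average payoffs of 0, 4 and 5 for clubs of size one, two and three that the arithmetic of the four-agent example implies. The values for sizes four to six are hypothetical placeholders, chosen only to respect the stated constraints (size three maximizes the average payoff, and sizes five and six pay less than five on average); they are not taken from Table 1.

```python
# Average payoff per member as a function of club size.
# Sizes 1-3 follow the numerical example in the text; sizes 4-6 are
# HYPOTHETICAL placeholders that merely satisfy the stated constraints.
AVG_PAYOFF = {1: 0.0, 2: 4.0, 3: 5.0, 4: 3.5, 5: 3.0, 6: 2.5}


def partitions(n, max_part=None):
    """Yield every integer partition of n as a non-increasing tuple of club sizes."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def aggregate(partition):
    """Total payoff over the whole population for a given split into clubs."""
    return sum(size * AVG_PAYOFF[size] for size in partition)


def best_partition(n):
    """Partition of n agents that maximizes the aggregate payoff."""
    return max(partitions(n), key=aggregate)


if __name__ == "__main__":
    for n in (4, 6):
        best = best_partition(n)
        print(n, best, aggregate(best))
```

With four agents the aggregate optimum uses clubs of size two although size three maximizes the average payoff within a club; with six agents both viewpoints select two clubs of size three.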
In this paper we provide experimental evidence on this central idea of the theory of club goods. Compared with other topics from microeconomics and game theory, such as decision theory, linear public goods games, market games, dictator and ultimatum games (cf. Camerer, 2003), club theory has mostly been neglected in the experimental literature (a notable exception is Ehrhart and Keser, 1999). It seems to us that the fact that club theory draws from divergent modelling techniques – that is, neoclassical maximization calculus (e.g. Buchanan, 1965), and cooperative (e.g. Pauly, 1970) as well as non-cooperative game theory (e.g. Barham et al., 1997) – has rendered the design of a suitable experiment for an experimental test of predictions by club theory a rather puzzling task. Only recently, the first two experimental studies (Ahn et al., 2009; Crosson et al., 2004) were conducted that shed some light on the central idea of club theory. Unfortunately, one of the studies comes to a rather affirmative evaluation of its predictive power (Ahn et al., 2009), while the other draws a more pessimistic picture (Crosson et al., 2004).
The design of our experimental study rests on the foundational work by Crosson et al. (2004). In particular, we follow their lead and confront our subjects with a two-stage game. At the first stage, a population of subjects partitions itself in clubs. At the second stage, members of a club play a non-cooperative game that determines the material payoffs for the players, and hence the value of membership in clubs of certain sizes. Our design departs from the design of the study by Crosson et al. (2004) in two important respects: First, the non-cooperative game within clubs at the second stage implemented in our design is sound from a game-theoretical point of view – that is, equilibrium behaviour gives rise to a well-defined function from club size to material payoffs; as a consequence, the term ‘optimal club size’ is meaningful in our design, while it can be argued that this notion does not apply to the experiment by Crosson et al. (2004). Second, in our study, a population of subjects partitions itself in clubs by announcing a favourite size of their clubs. In contrast, the first stage of the experiment by Crosson et al. (2004) consisted of some unstructured process of group formation.
With our experimental study, we put three hypotheses to the test. The first hypothesis is rather instrumental; its empirical validity is a precondition to test the central idea of club theory with our design. The hypothesis claims that cooperation rates are higher in clubs of optimal size than in clubs of suboptimal size. Note that this is a critical issue in any experiment that confronts the subjects with a two-stage game: If the behaviour of the subjects at the second stage does not conform with the theoretical predictions, the empirical observations referring to the first stage become quite difficult to interpret. As it turns out, this is not a problem in our study – the first hypothesis is clearly validated. The second hypothesis rests on the central idea of club theory: subjects should announce clubs of optimal size more often than clubs of suboptimal size. Cum grano salis, this hypothesis is also validated by the data. Finally, our third hypothesis relates to the conflict between the within-club-viewpoint and the total-economy-viewpoint as described in the discussion of our example above. The hypothesis states that the announcements of optimal club size should be less frequent and the variance of the announcements should be higher in treatments where this conflict arises than in treatments where the two viewpoints coincide predictively. It turns out that the first part of this hypothesis can be validated, while the second part is rejected by our data.
The remainder of this paper is organized as follows: The next section describes in detail the game the participants of our study were confronted with. This encompasses a game-theoretical analysis as well as a formal derivation of all tested hypotheses. The third section contains all the information regarding the experimental procedure. In the fourth section we present our empirical findings, and in the fifth section we conclude.
Theoretical background and predictions
In this section we describe the club game – that is, the non-cooperative game we confronted the participants of our study with. To motivate the design of the second stage of the club game, we briefly discuss the second stage of the game used in the study by Crosson et al. (2004). The introduction of the club game is followed by its game-theoretical analysis serving two purposes. First, one of our hypotheses is a direct consequence of non-cooperative game theory (third subsection). Second, it turns out that non-cooperative game theory all by itself does not predict a specific partition of the population of subjects in clubs. That is, the game-theoretical analysis reveals the predictive potential for the central idea of club theory in our experimental design. It is demonstrated (fourth subsection) how the central idea of club theory – the optimal partition of a population in clubs – can be derived formally with respect to the club game. In the fifth subsection, we state our three hypotheses and summarize our predictions.
The study by Crosson et al. (2004)
As already explained in the introduction, our study rests on the foundational work by Crosson et al. (2004). In particular, this concerns the major hypothesis to be tested – the central idea of club theory that a population of subjects should partition itself in clubs of optimal size. However, in this subsection we demonstrate that, from a game-theoretical point of view, the experimental design used in the study by Crosson et al. (2004) provides no incentives for the subjects to achieve such a partition.
The subjects participating in the experimental study by Crosson et al. (2004) were confronted with the following situation: Eight persons are in one room; each of them has received two poker cards. If a (not necessarily proper) subset of these persons controls at least three cards of the same type (e.g. kings, queens), they can form a club. The experimenters allocated the poker cards so that each club had at least three members. Each club receives US$10 from the experimenters. Then the members of a club play the following non-cooperative game. Each member privately selects an integer c_i from the set {0, 1, …, 10}. From the verbal description of Crosson et al. (2004), we can infer the formal payoff function of this game:
Crosson et al. (2004) provide the following interpretation of this payoff function. The club members simultaneously announce their private claims (c_i). If aggregate claims do not exceed the resources of the club, each player gets what she has claimed. If there is overclaiming, each player gets what she has claimed minus a fine which equals half of the overclaimed amount.
Since each club gets $10 irrespective of the number of its members, Crosson et al. (2004) argue that many clubs with minimal size should be observed. However, it is easy to see that for each club member it is a weakly dominant strategy to claim $10. Consequently, game theory suggests that the subjects should expect a payoff of 0 irrespective of the size of their clubs.
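The dominance argument can be illustrated numerically. The sketch below encodes one reading of the verbal rule: a member receives her claim, reduced by half of the overclaimed amount. The floor at zero is an assumption made here, not a detail confirmed by the text; it is chosen because it reproduces the stated payoff of 0 for every club size. The function names are likewise illustrative.

```python
from itertools import product

POT = 10             # dollars each club receives from the experimenters
CLAIMS = range(11)   # admissible claims 0, 1, ..., 10


def payoff(own, others, pot=POT):
    """Payoff of a member claiming `own` when the other members claim `others`.

    ASSUMPTION: the payoff is floored at zero; the verbal rule only states
    that the fine equals half of the overclaimed amount.
    """
    overclaim = max(0, own + sum(others) - pot)
    return max(0.0, own - overclaim / 2)


def is_weakly_dominant(claim, club_size):
    """True if `claim` is never worse than any alternative claim and is
    strictly better than each alternative against at least one profile."""
    profiles = list(product(CLAIMS, repeat=club_size - 1))
    for alt in CLAIMS:
        if alt == claim:
            continue
        diffs = [payoff(claim, o) - payoff(alt, o) for o in profiles]
        if min(diffs) < 0 or max(diffs) <= 0:
            return False
    return True


if __name__ == "__main__":
    print(is_weakly_dominant(10, club_size=3))
    print([payoff(10, (10,) * (m - 1)) for m in range(3, 9)])
```

Under this reading, claiming the full $10 is weakly dominant, and if every member does so the payoff is zero in clubs of any admissible size, so club size carries no payoff consequence in equilibrium.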
We feel that this property of the experimental design used by Crosson et al. (2004) is rather detrimental for an empirical test of the central idea of club theory; note that the notion of an ‘optimal club size’ does not apply to this design. The overall inconclusive evidence provided by their study may be due to a somewhat ill-designed experiment rather than the consequence of a predictive inaccuracy of club theory.
Hence, we follow the lead of Crosson et al. (2004) by implementing a two-stage game in which the subjects first partition themselves into clubs and afterwards play a non-cooperative game inside the clubs which determines the worth of membership in clubs. However, the non-cooperative game at the second stage should be designed such that rational subjects actually have an incentive to form clubs of particular sizes.
The club game
As already indicated, the club game is a two-stage game. Drawing on the study conducted by Crosson et al. (2004), the strategic features of the club game are intended to enable a clean empirical test of the central idea of club theory.
At the first stage, subjects are partitioned into clubs. First, all subjects announce simultaneously the favourite size of their club,
Each subject either joins a club with her favourite size or stays by herself according to the following rule. If z subjects choose
The actual allocation of subjects to clubs is achieved by a random device that adheres to this rule. A subject i who is not randomly assigned to a club of size
Then, all members of a club choose simultaneously whether to donate a private resource, worth 1 to them, to the club account. Let c_i equal 1 if actor i donates and 0 otherwise.
Let
in which
The second stage of the club game
In the following we derive the theoretical hypotheses which will be tested in the experimental study. First, we determine the Nash equilibria in the subgames at the second stage of the club game. Given this analysis, we verify that non-cooperative game theory alone does not provide sharp predictions concerning club formation in the club game. Technically speaking, we show that any partition of the population can be sustained in a subgame perfect equilibrium of the club game. Finally, we describe the club-theoretic resolution for this equilibrium selection problem.
Proposition 1. Consider a subgame at the second stage of the club game with a club of size (a) If (b) If (c) If (d) If (e) If (f) If Proof. All claims but the last one are easily verified. With respect to the equilibrium in mixed strategies, the claim follows from the following argument. In some realizations of the random variable
which holds true if and only if
Some comments on this proposition are in order. First, note that there is no incentive to form a club which is too small to supply the benefits from cooperation (a). A club of size k is ideal to realize a Nash equilibrium of full cooperation (b). Statement (c) suggests that for high values of λ, clubs with a size greater than k could be ideal too. For low values of λ, however, clubs which are too big pose serious problems in terms of obtaining the benefits from cooperation for their members. Either (e) there is no Nash equilibrium in which the club good is provided, or (d) there are plenty of Nash equilibria, involving free-riding and asymmetric payoffs. Hence, there is a massive equilibrium selection problem – that is, coordination problems concerning the question which subset of the club members actually contributes to the club good. Practically, these coordination problems imply the risk that not enough contributions are made and the club good is not provided. In applications (e.g. Diekmann, 1985; Voss, 2001) it is standard to argue that the appropriate solution concept for such a situation is the symmetric mixed-strategy equilibrium (f) which, applied to our context, gives each player an expected payoff strictly smaller than λ.
The second stage of the club game was designed so that by the choice of appropriate parameter configurations k, λ, and n, we can achieve a meaningful notion of an optimally sized club. As explained in the introduction, this notion is crucial for club theory. From Proposition 1 it is clear that there is a unique integer which deserves the labelling ‘optimal club size’, if case (c) is ruled out by the choice of our parameters. To see this, first note that players in clubs smaller than k obtain a payoff of 1 which is smaller than λ – the payoff players in clubs of size k can reasonably expect – since defecting is a weakly dominant strategy in such clubs. Second, if the set of integers
These observations motivate the choice of the parameter configurations implemented in the experiment. Subjects were confronted with club games where
Definition. If k = 5 or λ = 3.5, a club is of optimal size if and only if
We included the case k = 3 and λ = 5.5 because the comparison between treatments in which the optimal club size is unique and treatments in which it is not unique may be informative. Note that in most cases
The first stage of the club game
By now, the analysis of the club game has shown that its second stage gives rise to the notion of optimally sized clubs. Additionally, by the choice of the parameter configuration we can implement a unique optimal club size or multiple optimal club sizes. Surely, these are attractive features of the club game.
It remains to check whether the first stage of the club game is designed such that the central idea of club theory – a population of subjects tends to partition itself in clubs of optimal size – can be tested. Recall, this central idea of club theory is typically not derived from non-cooperative models but from models using techniques from cooperative game theory. The club game leaves room for the explanatory power of the latter models by the fact that non-cooperative game theory does not determine the partition of a population in clubs at all.
Proposition 2. Any partition of the population can be achieved in a subgame perfect equilibrium.
Again, the proof is trivial: Let any player defect in any subgame at the second stage. By Proposition 1, all of the induced action profiles in all proper subgames constitute a Nash equilibrium. Also, any player obtains the same payoff, irrespective of the size of her club, and consequently any profile of announcements of favourite club size constitutes a Nash equilibrium.
Clearly, Proposition 2 implies that non-cooperative game theory provides no sharp predictions concerning club formation in the club game. This is in stark contrast to our intuition and the central idea of club theory; both suggest that clubs of optimal size are likely to form. In the following we show that club theory supports our intuition via arguments from cooperative game theory. Put differently, we derive the club-theoretic solution for the equilibrium selection problem concerning the formation of clubs.
To begin with, consider an experimental condition with n = 15 and k = 5. For reasons of symmetry and incentive compatibility, it seems perfectly reasonable that the population splits itself into three clubs of optimal size. Compare this with an experimental condition where the population cannot be split evenly into clubs of optimal size, for example n = 9 and k = 5. Not all subjects can join a club of optimal size; if all subjects announce five at the first stage of the club game, some will stand alone. Over the 10 rounds played in each session, different subjects will face this difficulty and eventually change their strategy. Additionally, some subjects could anticipate this difficulty and adapt their strategy to this more complex environment in advance. In either case, compared with a situation where the population can be evenly split into optimally sized clubs, the relative frequency of announcements demanding optimally sized clubs should diminish and the variance of the announcements should increase.
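The contrast between n = 15 and n = 9 can be made concrete with a small simulation of the first stage. This is a sketch only: it assumes that, among the z subjects announcing the same size s, as many clubs of size s as possible are filled at random and the remaining subjects stay unattached. The function name `form_clubs` and the data layout are illustrative and do not come from the experimental software.

```python
import random
from collections import defaultdict


def form_clubs(announcements, rng=None):
    """Randomly allocate subjects to clubs of their announced size.

    announcements: dict mapping subject id -> announced favourite club size.
    Returns (clubs, unattached), where clubs is a list of member lists.

    ASSUMPTION: for each announced size s, floor(z / s) clubs are formed
    among the z subjects who announced s; the rest remain unattached.
    """
    rng = rng or random.Random()
    by_size = defaultdict(list)
    for subject, size in announcements.items():
        by_size[size].append(subject)

    clubs, unattached = [], []
    for size, subjects in by_size.items():
        rng.shuffle(subjects)                      # the random device
        n_clubs = len(subjects) // size
        for c in range(n_clubs):
            clubs.append(subjects[c * size:(c + 1) * size])
        unattached.extend(subjects[n_clubs * size:])
    return clubs, unattached


if __name__ == "__main__":
    for n in (15, 9):
        clubs, alone = form_clubs({i: 5 for i in range(n)}, random.Random(1))
        print(n, [len(c) for c in clubs], len(alone))
```

If everyone announces five, a population of 15 is fully absorbed by three clubs, whereas a population of 9 yields one club and leaves four subjects without a club.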
Besides such intuitive reasoning, theorists have offered formal arguments for this central idea of club theory (e.g. Pauly, 1970). Let us briefly draw on one of the classical approaches to this problem. Pauly (1970) relies on games with transferable utility (TU games). A TU game is a pair (N, v), where N is a finite and non-empty set of players, and
Let us call an allocation
Following Pauly (1970), we focus on symmetric games that are single-peaked, that is the function f(s)/s has a unique maximum on {1, 2, …, n}. Let k be the argument that maximizes this expression and define
Finally, we are in position to state Pauly’s central theorem.
Proposition 3. (1) If (2) If
But the latter inequality is false, since N cannot be partitioned such that all elements of this partition are of size k, and f (s)/s is single-peaked.
The upshot of this exposition is the following. In all of our treatments except those in which k = 3 and λ = 5.5 the payoff the players can reasonably expect as a function of the size of their club is single-peaked, and consequently Pauly’s result applies. Hence, club theory predicts for these treatments that if
Predictions
For future reference we state the hypotheses to be tested in our experimental study.
Hypothesis 1. Subjects should cooperate more in optimally sized clubs than in suboptimally sized clubs.
Hypothesis 2. Subjects should announce optimal club sizes.
Hypothesis 3. Suppose there is a unique optimal club size. If the population of subjects can be evenly split in optimally sized clubs, subjects should announce optimal club sizes more often and there should be less variance in their announcements than if such an even split in optimally sized clubs is impossible.
Clearly, the first hypothesis is in line with the general ideas typically expressed in connection with the theory of club goods (e.g. Cornes and Sandler, 1996; McBride, 2007; Sandler, 1992). Unlike the second and third hypotheses, the first one is a straightforward consequence of non-cooperative game theory applied to the club game. The second and third hypotheses lie at the heart of club theory. Both intuition and club-theoretic arguments stemming from cooperative game theory support them. This is obvious with respect to Hypothesis 2. Hypothesis 3 is a modest interpretation of the theoretical statement that ‘no Pauly-stable allocation exists’.
Figure 1 outlines what these hypotheses specifically entail when applied to the experimental treatments. The integers inside the cells denote the optimal club sizes. Hypothesis 2 predicts that subjects should announce optimal club sizes. The words ‘Yes’, ‘No’, and the symbol ‘?’ refer to the question whether club theory suggests coordination problems in the formation of clubs. Hypothesis 3 predicts that the conflict between the total-economy-viewpoint and the within-club-viewpoint leads to coordination problems if the population cannot be split evenly into clubs of optimal size. Hence, in treatments to which the figure assigns ‘Yes’ we expect lower relative frequencies of first-round announcements of optimal club sizes and higher variances in first-round announcements than in treatments to which the figure assigns ‘No’. Note that our theoretical derivation of these coordination problems of club formation presupposes that there is a unique optimal club size. The latter condition is not satisfied if k = 3 and λ = 5.5 – the symbol ‘?’ is used to signify the absence of a clear-cut theoretical prediction with respect to coordination problems in club formation.

Theoretical predictions regarding the announcements of club sizes and whether coordination problems in forming clubs are expected.
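The content of Figure 1 can be reproduced from the rules stated in the text. The sketch below assumes that the optimal club size equals k, except for k = 3 and λ = 5.5, where sizes 3, 4 and 5 are all optimal (as the paper states when discussing the observed distribution of clubs), and that coordination problems are expected exactly when a unique optimal size does not divide n. The function names are illustrative.

```python
from itertools import product


def optimal_sizes(k, lam):
    """Optimal club sizes for a parameter configuration, as stated in the text."""
    if k == 3 and lam == 5.5:
        return [3, 4, 5]        # the optimal size is not unique in this case
    return [k]


def coordination_problem(n, k, lam):
    """'Yes', 'No' or '?' as used in Figure 1."""
    sizes = optimal_sizes(k, lam)
    if len(sizes) > 1:
        return "?"              # no clear-cut theoretical prediction
    return "No" if n % sizes[0] == 0 else "Yes"


if __name__ == "__main__":
    for lam, k, n in product((3.5, 5.5), (3, 5), (9, 15)):
        print(f"lambda={lam} k={k} n={n}: "
              f"optimal={optimal_sizes(k, lam)} "
              f"coordination problem: {coordination_problem(n, k, lam)}")
```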
Methods
In this section we describe the experimental design, the course of a session, and discuss some methodological issues.
We varied the parameters of the club game (k, λ, and n), to generate the experimental conditions. In particular, λ ∈ {3.5, 5.5}, k ∈ {3, 5}, and n ∈ {9, 15}. We allowed for any combination of parameters and for each combination conducted two sessions where subjects played 10 paid rounds in the respective environment. In one of the sessions, for any combination of the parameters, one experimental currency unit (ECU) equalled €0.3, in the other session an ECU equalled €0.4. In total we conducted 2^4 = 16 sessions with 192 subjects. One half of the sessions were conducted in February 2009, the other half in July 2010. All subjects were students residing in Leipzig. Subjects were recruited by posting the offer on a website which is often visited by students. Each subject received a €5 show-up fee and additional earnings from the experiment. Average earnings across all sessions equalled approximately €15 (including the show-up fee). Since a session lasted for about one hour, average experimental earnings clearly exceed the usual salary students can achieve in Leipzig in that amount of time.
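The session and subject counts follow directly from the factorial design. A minimal sketch that enumerates the treatments from the parameter values given above (the variable names are illustrative):

```python
from itertools import product

LAMBDAS = (3.5, 5.5)
KS = (3, 5)
NS = (9, 15)
SESSIONS_PER_TREATMENT = 2
PAID_ROUNDS = 10

# Every combination of (lambda, k, n) is one treatment.
treatments = list(product(LAMBDAS, KS, NS))

sessions = len(treatments) * SESSIONS_PER_TREATMENT
subjects = sum(n for _, _, n in treatments) * SESSIONS_PER_TREATMENT

# Each subject announces a favourite club size once per paid round.
first_round_announcements = {
    t: t[2] * SESSIONS_PER_TREATMENT * PAID_ROUNDS for t in treatments
}

if __name__ == "__main__":
    print(len(treatments), sessions, subjects)
    print(sorted(set(first_round_announcements.values())))
```

The eight treatments yield 16 sessions and 192 subjects, and the number of paid first-round announcements per treatment is 180 for n = 9 and 300 for n = 15.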
Each session was conducted in the personal computer (PC) pool at the Institute for Sociology of the University of Leipzig. Every subject was seated at a PC terminal which was separated from the others by blinds. An experimental session consisted of five parts. First, after all participants of the session (either 9 or 15) had arrived, they were allowed to read the instruction paper (see Appendix 1). The paper explained the structure of the session, the rules concerning the experimental earnings, and the rules of the club game. Second, the subjects were allowed to ask questions concerning these rules. These questions were answered publicly. The remainder of the session was conducted via three programs in z-Tree (Fischbacher, 1999). The first program contained five exercise questions concerning the rules of the club game. In the case of either a correct or an incorrect answer, the program informed the subjects why the answer was right or wrong. The second program implemented the club game. The subjects played the club game 13 times in a row, the first three times termed as exercises that do not affect the monetary earnings of the subjects. The third program contained a short questionnaire dealing with demographic information and eventual background knowledge on microeconomics and game theory.
Now we describe how the club game was implemented electronically. Each round of the club game consisted of several screens providing information for the subjects and asking them for their respective decisions. The first screen informed the subjects about the size of the population n – that is, the number of participants of the particular session – and asked them to enter their favourite club size
At this point some comments on methodological issues are in order. Note that all participants of a particular session formed the population N of the respective club game. That is, in half of the sessions n = 15 and in the other sessions n = 9. Note also that the parameters λ, k, and n were held constant during a particular session and that the subjects were informed in advance about the fact that they would play 10 paid rounds of the club game. So it can be argued that the subjects did not play a single-shot club game in each round (as assumed in the second section), but a repeated game, and that our analysis of the game does not accommodate this fact. While we acknowledge that our data show round effects – that is, the outcomes of earlier rounds influence the behaviour of the subjects in later rounds – we reject the notion that these round effects would be better captured in a game-theoretical analysis which treats each round of the club game as part of a repeated game than in an analysis which interprets each round as a single-shot game. First of all, the interpretation of equilibrium concepts in game theory is a delicate point and the question of under what circumstances different equilibrium concepts provide plausible predictions, is quite debatable. Epistemic game theory (e.g. Aumann and Brandenburger, 1995) grounds solution concepts in terms of Bayesian rationality of the players and conditions referring to their mutual and common knowledge of various aspects of the ‘game’. The epistemic conditions for a Nash equilibrium are quite strict (see also Gintis, 2009, 2010) and, loosely speaking, involve much knowledge about what the other players do. Put differently, it is problematic to expose subjects to a certain situation of strategic interdependence only once and expect them to play strategies that constitute a Nash equilibrium. 
Instead, epistemic game theory suggests that players have to be repeatedly exposed to the same game to give the equilibrium concept predictive power.
From this argumentation we conclude that it is reasonable to expose subjects repeatedly to the same strategic interaction when testing game-theoretical predictions. Many researchers implement random matching wherein a group of subjects is in the lab, in each round subgroups are formed by a random device, and games are played inside the subgroups. This way subjects are given the opportunity to gain experience in the strategic environment such that eventually the conditions for equilibrium are satisfied, while additional strategic considerations associated with repeated play within a fixed player set are more or less ruled out.
Note that in the club game the clubs at the second stage are formed by a random device. Hence, after the players announce their favourite club size, there is random matching. So, if there is a problem with spillover effects from repeated play, then it concerns announcements of favourite club size. However, in most treatments (all those assigned ‘No’ in Figure 1) the single-shot club game has a symmetric, Pareto-optimal Nash equilibrium in which every player announces k, defects if she is a member of a club C of size 1<|C| ≠ k, and cooperates if she is a member of a club C of size |C| = k. In these treatments this equilibrium leads to a game play in which the population splits itself into clubs of optimal size and all clubs succeed in providing themselves with the club good. Even if we consider all 10 paid rounds as one game, this game surely has a subgame perfect equilibrium in which every player plays her part of the described Nash equilibrium after any possible history of the game. The fact that the game consisting of 10 paid rounds may have a vast set of possibly strange equilibria seems rather irrelevant for predicting what happens in these treatments, since an intuitive equilibrium leading to a stationary game play exists. Hence, our predictions regarding treatments in which the population can be split evenly into clubs of optimal size are reasonable predictions, even if all 10 rounds are viewed as one game. Concerning the other treatments, Hypothesis 3 says little more than that the subjects’ behaviour shows more variation concerning announcements of favourite club size than in treatments in which the stationary and intuitive game play is available. This is a quite modest prediction which should also be reasonable to those readers expecting spillover effects from repeated play.
Empirical findings
In this section we present our empirical findings. First, we provide some basic information on our database. Second, we inform the reader about the observed distribution of clubs. Third, we analyze whether cooperation rates are higher in clubs of optimal size, as claimed in Hypothesis 1. Finally, we turn to an evaluation of Hypotheses 2 and 3 which deal with the announcements of favourite club sizes.
Basic descriptive statistics
For future reference Figure 2 summarizes some basic information about the experimental treatments. First of all, the letter in each cell is a shorthand for the respective treatment. For example, D denotes the treatment in which k = 3, λ = 5.5, and n = 15. The first integer in each cell gives the number of first-round announcements observed in each treatment. Since there are two sessions for each treatment, every subject participates in 10 paid periods of the club game, and every subject has to announce a favourite club size in every period, these numbers can be calculated from the description of the experiment (see the third section). The second integer in each cell refers to the observed number of second-round announcements. Recall, if a subject was not assigned to a club after the first round of announcements, she had a second chance. For example, in treatment E we observe 135 such second-round announcements. The last integer in each cell refers to the number of such second-round announcements that were made in a situation in which the number of players unattached to clubs after the first round of announcements is not smaller than the optimal club size.

Figure 2. Basic descriptive statistics.
In the following analysis, we use this figure to inform the reader about which subsample of the data the exposition refers to.
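The claim that the first-round counts follow from the design can be made concrete with a small sketch. It assumes that every session has exactly n subjects and that nobody drops out; the function name and its defaults are illustrative and not part of the original study materials.

```python
def first_round_announcements(n, sessions=2, periods=10):
    """Number of first-round announcements in one treatment.

    Assumption: each of the `sessions` sessions has exactly n subjects,
    and every subject announces once in each of the `periods` paid periods.
    """
    if n < 1 or sessions < 1 or periods < 1:
        raise ValueError("n, sessions and periods must be positive")
    return sessions * n * periods


for n in (9, 15):
    print(f"n = {n}: {first_round_announcements(n)} first-round announcements")
```

Under these assumptions a treatment with n = 9 yields 180 first-round announcements and a treatment with n = 15 yields 300. Second-round counts cannot be derived this way, since they depend on how many subjects remained unattached after the first round.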
Observed distribution of clubs
To prepare the analysis of cooperation rates inside clubs, we provide information about the observed distribution of clubs. Recall, our theoretical analysis of the club game showed that the optimal size of a club simply equals k, except in treatments in which k = 3 and λ = 5.5. In such treatments a club of size 3, 4 or 5 is optimal. Since theoretically and, as we will see shortly, empirically, the optimal club sizes are of great importance for the behaviour of the subjects, we present three different figures, one for each kind of prediction. Please note that an in-depth analysis of the announcements of favourite club size is provided after we have checked whether the second-stage behaviour of the subjects actually confirms Hypothesis 1. Only if this is indeed the case, can we legitimately speak of an ‘optimal club size’ and test whether the central idea of club theory has some empirical appeal in our study.
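The rule for optimal club sizes stated above can be written down as a short sketch. It only covers the parameter values used in the experiment (k = 3 or 5, λ = 3.5 or 5.5), the function names are hypothetical, and the treatment of clubs of size one as a separate category follows the remark that such ‘clubs’ involve no cooperation decision.

```python
def optimal_club_sizes(k, lam):
    """Optimal club sizes as stated in the text: {k}, except {3, 4, 5}
    when k = 3 and lambda = 5.5. Other parameter values are rejected,
    because the text makes no statement about them."""
    if k not in (3, 5) or lam not in (3.5, 5.5):
        raise ValueError("only k in {3, 5} and lambda in {3.5, 5.5} are covered")
    if k == 3 and lam == 5.5:
        return {3, 4, 5}
    return {k}


def classify_club(size, k, lam):
    """Label a realized club as 'alone', 'optimal' or 'suboptimal'."""
    if size < 1:
        raise ValueError("club size must be at least 1")
    if size == 1:
        return "alone"  # no cooperation decision is taken in this case
    return "optimal" if size in optimal_club_sizes(k, lam) else "suboptimal"


print(classify_club(4, 3, 5.5))  # size 4 is optimal when k = 3 and lambda = 5.5
print(classify_club(4, 3, 3.5))  # but not when lambda = 3.5
```

A classification of this kind is what separates the columns of the club-size figures into optimally and suboptimally sized clubs.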
As can be seen from Figures 3 to 5, averaged over all observed club games, roughly 50% of all players join a club of optimal size. Note that ‘clubs’ of size one are mostly the result of coordination failures at the first stage of the club game. Taking this into account, it seems legitimate to say that practically all observed clubs whose size conforms to the announcements of the subjects are of optimal size.

Figure 3. Observed distribution of clubs if k = 3 & λ = 3.5 (treatments A & C).

Figure 4. Observed distribution of clubs if k = 3 & λ = 5.5 (treatments B & D).

Figure 5. Observed distribution of clubs if k = 5 (treatments E–H).
Cooperation rates within clubs
Hypothesis 1 states that cooperation rates should be higher in clubs of optimal size. It must be stressed once again that the empirical validity of this hypothesis is a prerequisite for testing the central idea of club theory that a population partitions itself into clubs of optimal size. If the behaviour of the subjects at the second stage of the club game did not conform to the game-theoretical predictions encapsulated in Proposition 1, the very notion of an optimally sized club would be unwarranted, and our experimental design would suffer from the same weakness as the design of the study by Crosson et al. (2004).
Before we interpret Figure 6, a comment on the connection between these numbers and the preceding figures might be of some help for the reader. From Figure 6 we can see that over all club games 105 players were members of a club of suboptimal size. This equals the sum of the second columns of Figures 3 to 5 when we account for two facts (2 + 12 + 25 + 48 + 2 + 16 = 105). First, members of a club of size one did not have to decide whether they cooperate or not (see the third section). Second, in treatments with k = 3 and λ = 5.5 clubs of size 3, 4 or 5 are optimal.

Figure 6. Cooperation rates within optimally and suboptimally sized clubs (all treatments).
Figure 6 provides a clear-cut validation of Hypothesis 1. A great majority of members of clubs of optimal size cooperate, whereas suboptimally sized clubs suffer from free-riding. This descriptive relation between optimally sized clubs and cooperation rates is highly significant (e.g. Pearson χ²(1) = 230.07).
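For readers who want to retrace how a Pearson statistic of this kind is obtained from a 2×2 table (club type by cooperation decision), the following is a minimal sketch. The counts are hypothetical placeholders and are not the data behind Figure 6; the function name is illustrative, and no continuity correction is applied.

```python
def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 degree of freedom, no continuity
    correction) for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Hypothetical counts, NOT the experimental data:
# rows = optimal / suboptimal club, columns = cooperate / defect.
chi2 = pearson_chi2_2x2(80, 20, 30, 70)
print(f"chi-square(1) = {chi2:.2f}")
```

With one degree of freedom, values above roughly 3.84 are significant at the 5% level, so a statistic of the magnitude reported above lies far in the rejection region.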
Given that the behaviour of our subjects at the second stage of the club game actually conforms with Hypothesis 1, the stage is set to come to the main research question addressed by this study: Does a population of subjects actually partition itself into clubs of optimal size?
Announcements of favourite club size
Figure 7 shows the relative frequencies of announcements of favourite club size (

Figure 7. Relative frequencies of first-round announcements of favourite club size (treatments A & C, B & D, and E–H, respectively).
Notably, a club size equal to k is the mode in all three distributions. In fact, over all club games with k = 5 an absolute majority of subjects announced the optimal club size. Announcements of a favourite club size smaller than k hardly ever occur, which shows that the subjects clearly understood that undersized clubs were a pointless enterprise with respect to payoffs. Note also that in treatments in which k = 3 and λ = 5.5 – treatments in which club sizes of 3, 4, and 5 are optimal – nearly 80% of first-round announcements demanded optimally sized clubs. Although there is a considerable fraction of announcements of oversized clubs, especially for k = 3, it seems justified to us to interpret the empirical distributions of
Figure 8 shows the observed distributions for these second-round announcements. For k = 3, there are 560 such decisions, all of which were in situations where the cardinality of the set of subjects not assigned to a club after the first round of announcements is not smaller than k (see Figure 2). For k = 5, there are 625 such decisions, but the number of subjects participating in this second round of announcements exceeds or equals k only in 561 cases. Since Hypothesis 2 only refers to these cases, we excluded 64 second-round announcements in the calculation of the relative frequencies shown in Figure 8. Again, only a very marginal fraction of second-round announcements exceed 10 (0.32% for k = 3 and λ = 3.5, 0% otherwise).

Figure 8. Relative frequencies of second-round announcements of favourite club size (treatments A & C, B & D, and E–H, respectively).
Clearly, Figure 8 is even more supportive of Hypothesis 2 than Figure 7. Now, for both k = 3 and k = 5 a majority of subjects announces a favourite club size which is optimal.
To back our overall positive evaluation of Hypothesis 2 on the basis of Figures 7 and 8 by some tests of statistical significance, we present Figure 9 showing the means and medians of first-round announcements over all observed club games for any combination of the parameters k, λ, and n as well as the results of four t-tests (one-sided) and four median tests.

Figure 9. Means and medians of first-round announcements of favourite club size (all treatments).
To begin with, Hypothesis 2 suggests that the mean and the median of the announcements of favourite club size should be higher for k = 5 than for k = 3 for any combination of λ and n. As it turns out, this assertion holds true in all four cases with respect to the means. Given the fact that the non-parametric median test has little statistical power – that is, it is per se conservative – it seems fair to say that the assertion is validated in all but one combination of λ and n. At this point, we close our inspection of the validity of Hypothesis 2 with a positive upshot. Figures 7 to 9 as well as Figures 3 to 5 provide more than modest support for the central idea of club theory – that a population of subjects splits itself into clubs of optimal size.
Before we turn to an evaluation of Hypothesis 3, a comment on the effect of multiple optimal club sizes seems in order. Recall that in all treatments with k = 5 there is a unique optimal club size. In contrast, in treatments with k = 3 it depends on whether we have λ = 3.5 (only club size 3 is optimal) or λ = 5.5 (club sizes 3, 4, and 5 are optimal). Comparing Figures 3 and 4 we see that the percentage of observed suboptimally sized clubs decreases if more than one size is optimal. This is hardly surprising. Somewhat striking is the fact that the presence of alternative optimal club sizes leads to an increase in announcements of club size 3 (see Figures 7 and 8). In fact, although theoretically the mean or median of announcements should generally rise if club sizes of 4 or 5 are optimal in addition to a club size of 3, the opposite effect takes place (see Figure 9). Comparing all treatments with k = 3 and λ = 3.5 to those with k = 3 and λ = 5.5, tests for differences in mean (t = 3.57, p < 0.01) and median (Pearson χ² = 39.79, p < 0.01) yield significance. Although this effect contradicts our theoretical expectations, these expectations are not central to this study, and the effect does not substantially undermine the empirical validity of any of our hypotheses.
We turn to an evaluation of Hypothesis 3. Recall, the hypothesis applies to situations in which there is a unique optimal club size. It is claimed that if the population can be split into clubs of optimal size, the percentage of announcements of optimally sized clubs should be higher and the variance in announcements lower than in situations in which the population cannot be split into clubs of optimal size.
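The condition on which Hypothesis 3 turns, namely whether a population can be split evenly into clubs of the unique optimal size k, amounts to a divisibility check. The sketch below uses a hypothetical function name and applies the check to the four combinations of k and n that occur in the experiment.

```python
def can_split_evenly(players, k):
    """True if `players` subjects can be partitioned into clubs of size k,
    i.e. there are at least k of them and k divides their number."""
    if k < 1 or players < 0:
        raise ValueError("k must be positive and players non-negative")
    return players >= k and players % k == 0


for k, n in [(3, 9), (3, 15), (5, 9), (5, 15)]:
    print(f"k = {k}, n = {n}: even split possible = {can_split_evenly(n, k)}")
```

Only the combination k = 5 and n = 9 fails the check. The same function applies to second-round announcements when `players` is set to the number of subjects left unattached after the first round.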
A straightforward way to test this hypothesis is to compare first-round announcements in the treatments with k = 5 and n = 9 to first-round announcements in treatments with k = 5 and n = 15 or k = 3 and λ = 3.5 (see Figure 10). Treatments with k = 3 and λ = 5.5 are of no interest here, since they fail to satisfy the condition that a unique optimal club size exists.

Figure 10. First-round announcements of optimal club size (treatments A & C & G & H vs. E & F).
Figure 10 shows that, descriptively, the frequency of first-round announcements of optimal club sizes is slightly higher in treatments with k = 5 and n = 9 than in the comparison group. This contradicts our theoretical expectations. However, this effect is very small and in fact insignificant (Pearson χ²(1) = 0.046, p = 0.83).
We have much more data to which Hypothesis 3 applies. Indeed, for any second-round announcement of favourite club size we can check whether the set of subjects not assigned to any club after the first round of announcements can be split evenly into clubs of optimal size. Hypothesis 3 claims that in situations in which the latter is possible, the proportion of announcements of optimal club sizes should be higher. The next figure refers to this comparison. Again we do not consider treatments in which k = 3 and λ = 5.5, since these treatments violate the condition that there exists a unique optimal club size. Additionally, we do not consider situations in which the number of unattached players after the first round of announcements is smaller than the optimal club size (see Figure 2).
Figure 11 demonstrates that the possibility of splitting the set of unattached subjects after the first round of announcements into clubs of optimal size considerably enhances the proportion of announcements of optimal club sizes, well in line with Hypothesis 3 (Pearson χ²(1) = 65.84, p < 0.01). Taking Figures 10 and 11 together, it seems fair to say that we found weak evidence in favour of the first part of Hypothesis 3 stating that the proportion of announcements of optimal club sizes is higher when there is no clash between the total-economy-viewpoint and the within-club-viewpoint.

Figure 11. Second-round announcements of optimal club size (treatments A & C & G & H vs. E & F).
To get some intuition for what is going on if a split of the population into clubs of optimal size is impossible, it is instructive to compare the distribution of first-round announcements over all treatments with k = 5 and n = 15 to those with k = 5 and n = 9 (see Figure 12).

Figure 12. Relative frequencies of first-round announcements (treatments G & H and E & F, respectively).
Strikingly, although the action space is larger if n = 15, subjects announce the optimal club size more often. Note also that announcements of club sizes which are too small practically only occur if n = 9 and announcements above the optimal club size increase in relative frequency if n = 9. We interpret these observations as clear evidence in favour of Hypothesis 3. Apparently, subjects have a clear understanding of the fact that if
On the downside of this matter, there is no way to validate the second part of Hypothesis 3 which states that the variance of announcements should be higher if the population cannot be split into clubs of optimal size. In fact the standard deviation of first-round announcements in treatments with k = 5 and n = 9 is actually the lowest among all kinds of treatments (1.47 < 1.84 < 2.11 < 2.18), grouped by combinations of n and k and only taking into account treatments in which there is a unique optimal club size. A similar negative picture emerges if we look at second-round announcements. Hence, the second part of Hypothesis 3 is clearly refuted by our data. Figure 12 hints at the underlying reason why this prediction might fail with respect to first-round announcements. The action space is larger if n = 15 and there is a considerable fraction of subjects with announcements
Conclusion
This paper presents experimental evidence on the theory of clubs. The central idea of club theory states that a population of actors partitions itself into clubs of optimal size, whenever this is possible. To test this hypothesis, we implemented an essentially two-stage game, the so-called club game. At the first stage of the club game, each player announces her favourite club size. Given a profile of announcements each player either joins a club of her favourite size or stays alone. At the second stage of the club game, each club plays a non-linear public good game. Assuming equilibrium behaviour in the latter game, the notion of an optimal club size is sensible. This distinguishes our experimental design from its close predecessor (Crosson et al., 2004) which, from a game-theoretical point of view, lacks this property.
It turns out that the behaviour of the subjects at the second stage of the club game actually conforms quite well to the predictions from non-cooperative game theory. Hence, an important precondition to put the central idea of club theory to an experimental test is met. In our setting, club theory predicts that a population of subjects should partition itself into clubs of optimal size if the population can be evenly split into clubs of optimal size. And indeed, taking account of both first-round and second-round announcements of favourite club size, our subjects tended to announce the optimal club size. Further, as predicted, in experimental conditions where the population cannot be split evenly into clubs of optimal size, the relative frequency of announcements of optimal club size declines. However, unlike what is predicted by club theory, the variance of announcements is not higher in these experimental conditions.
Although one of our hypotheses derived from the application of the theory of club goods to our experimental setting is rejected by the data, we think that it is justified to draw a rather benign conclusion with respect to the empirical validity of club theory in our experiment. After all, the central idea of club theory applies to situations where the within-club-viewpoint and the total-economy-viewpoint provide the same prediction with respect to club formation. The fact that the relative frequency of announcements of optimal club size is generally quite high and even higher in experimental treatments corresponding to such situations should be interpreted as support for the central idea of club theory. The prediction referring to the variance of announcements relies on an interpretation of the statement ‘no Pauly-stable allocation exists’ (see the second section) in terms which are empirically observable. Put differently, the rejection of the second part of Hypothesis 3 might not be due to a predictive deficiency of club theory, but the consequence of a poor operationalization of a complicated theoretical concept or a possible shortcoming of our experimental design.
This brings us to the questions of how our experimental findings relate to the literature and what this implies for future experimental research on club theory. As explained in the introduction, comparatively few experimental studies on club theory were conducted. Additionally, the two studies focusing on the central idea of club theory (Ahn et al., 2009; Crosson et al., 2004) implement quite different designs and come to divergent conclusions. Our experimental design was inspired by a deficiency of the design used in the study by Crosson et al. (2004) – the fact that, from a game-theoretical point of view, the second stage of their game does not justify the notion of an optimal group size. We speculated that their overall inconclusive evidence with respect to the central idea of club theory might be caused by this feature of their design. Our empirical findings clearly validate this expectation and shed new light on the findings of Crosson et al. (2004). Further, our results support the empirical findings of Ahn et al. (2009) who also study the endogenous formation of clubs, albeit in a quite different experimental setting, and draw a benign conclusion with respect to the central idea of club theory. Taken together, these three experimental studies suggest that the predictive accuracy of club theory is rather invariant with respect to the details of the rules of group formation, which differ considerably between the studies, but is heavily affected by the strategic features of the second-stage games played within the clubs.
So far, experimental research on club theory has been of a rather explorative nature; quite different designs have been used to test the central idea of club theory and it seems difficult to compare or generalize the results. As already indicated, the fact that club theory draws from diverging modelling techniques has the consequence that no ‘natural’ or ‘straightforward’ design exists; consequently, scholars have set up two-stage games in the spirit of club theory and observed what happens. We feel that future experimental research on the theory of club goods should focus on the details of group formation to test whether the above assertion concerning the invariance under diverging rules of group formation actually holds true.
To achieve this it seems promising to avoid two-stage designs and instead fix the worth of a club (or the value of club membership) exogenously. Club theory offers a variety of predictions stemming from the application of different stability concepts (e.g. Pauly-stability) to such environments. For example, the theory of hedonic games (see Bogomolnaia and Jackson, 2002; Casajus, 2008) implies that given a function that assigns each club size a payoff for the players (which are assumed to be symmetric), stable partitions of a population of actors always exist. As we have seen in the second section, Pauly-stability tells a different story. Loosely speaking, this stability concept says that if the value of clubs depends only on the number of their members, a stable partition of the population exists if and only if the population can be evenly split into clubs of optimal size. It seems to us that an experimental study that focuses on the process of club formation and tests whether details of the experimental setting matter (one condition: worth of club membership fixed; second condition: value of clubs fixed) would be a worthwhile next step in the experimental investigation of club theory.
Appendix 1
This appendix contains the translation of the instruction paper (originally in German) for experimental conditions with k = 5, λ = 3.5, and an exchange rate €/ECU which equals 0.4.
Dear students,
Thank you for participating in this study. This document provides you with all the necessary information. Please read the following instructions carefully. Afterwards you will get the opportunity to ask further questions.
Regardless of the results you will receive €5 as a reward for your attendance. Additionally, depending on your decisions during the study, it is possible to earn additional money. The total amount will be paid out at the end.
All information you provide will remain anonymous. It is not possible to draw conclusions from your data that would lead to your identification.
From now on please avoid any contact with other participants and only ask questions of members of staff.
Funding
Financial support from the German Research Foundation (DFG VO 684/9-1) is gratefully acknowledged.
