Abstract
In a principal–agent relationship, how should principals budget time for oversight when oversight activity is not instantaneous? We develop a formal model of resource allocation by a principal monitoring multiple agents, where the principal faces a dynamic budgeting problem. Our model reveals a tension between the value of holding resources in reserve to maintain the threat of an audit and the direct policy gains of monitoring activity. We show that as the frequency of principal–agent conflict increases, there are some conditions under which the most effective strategy for a principal is to allocate less and less of their total time to monitoring. The model has important implications for the empirical analysis of a monitoring setting where a principal oversees multiple agents.
Introduction
Principal–agent settings, where superiors delegate tasks to one or more subordinates, possess a certain irony. Inherent in a principal–agent relationship is the idea that principals are limited in what they can do alone; if principals had infinite resources, or there were no costs associated with acquiring information or carrying out particular tasks, there would be no need to consider delegating. Yet when principals face the kinds of limitations that make delegation attractive in the first place, these limitations make it difficult for principals to delegate effectively, as they may lack sufficient resources to create good incentives for their agents.
Principals with limited resources must consider the opportunity costs of deploying their scarce resources. Resources dedicated to one activity may not be spent on another; if already-constrained principals make bad choices about how to use their limited resources, they risk worsening the agency problems associated with delegation. How do principals manage these opportunity–cost tradeoffs? In this paper we argue that the answer to this question depends significantly on the nature of the resources being expended. Understanding exactly how a resource constraint “bites” is critical to understanding how that resource constraint alters principal–agent relations. Here we explore how time constraints affect a principal’s ex post review strategy. While other resources are also important, we focus on time because time constraints frequently lie at the heart of the decision to delegate. 1 Principals delegate to those who can perform tasks more efficiently, or simply to multiple agents who can do more total work.
Yet almost by definition, principals do not have time to check up on all agents and ensure they are performing as directed. This problem is particularly acute in political principal–agent relationships, where agents may flatly disagree with a principal’s objectives, and principals often possess meager tools with which to incentivize agents. In contrast to hiring relationships within firms, political principals often lack the ability to create detailed ex ante contracts that align agent incentives, and instead are limited in whole or in part to various forms of ex post, case-by-case auditing of agent performance. For this reason, much research on political institutions examines how principals can use ex post oversight most effectively (McCubbins and Schwartz, 1984; McCubbins et al., 1987; Calvert et al., 1989; Cameron et al., 2000), and how the practicalities of using limited oversight tools interact with other choices, such as the decision of whether to delegate or the parameters of agency design (Bawn, 1997; Lax, 2003; Huber and McCarty, 2004; Vanberg, 2005; Wiseman, 2009; Shotts and Wiseman, 2010).
How do principals manage their time when performing oversight? What oversight strategies are most effective for principals with limited time? Our interest in these questions is motivated by two observations about extant models of political oversight. First, although exceptions exist (e.g. Bendor, 1985; Ting, 2002, 2003), the dominant mode of scholarship on principal–agent relationships focuses on interactions between a single principal and single agent (cf. Gailmard and Patty, 2012), limiting our ability to understand the tradeoffs inherent when principals choose among many opportunities for oversight. Second, many oversight models that consider resource constraints treat this concept very broadly, arguing that oversight entails generic costs. We argue that this choice, though quite defensible on the grounds of generality, may obscure important aspects of a principal’s budgeting problem. Using a generic cost transforms an opportunity cost into a fixed marginal cost; it allows the principal to review as much as it wishes.
This latter observation is important because time constraints are about opportunity costs. Time is finite and irreplaceable. For a real-world principal with a complex review task, oversight is not instantaneous. Instead, real-world monitoring entails consequences, the closing off of other potential activity, rather than direct costs. Moreover, because review is time-consuming, and in some substantive settings exceedingly so, these consequences may extend far into the future until a given oversight action is finally resolved. If a principal takes on too much, it may find itself either unable to complete all of its work or less able to review subsequent agent activity effectively. 2 In short, we argue that the concept of the opportunity cost of review, if taken seriously, implies an effect on a principal’s oversight capacity that extends for some non-trivial length of time.
Below, we examine how principals choose what to audit when oversight has consequences for future capacity, developing a formal model of this dynamic constraint on a principal’s ability to audit. Our model reveals a tension between the value to the principal of reviewing an agency action today and the value of forgoing review in an effort to maintain a more significant threat of review tomorrow. This tension leads to a counterintuitive finding, that under certain conditions a principal’s best response to increases in agency activity is to review less often. Thus, if the tradeoff identified by our model is operative, a principal that has chosen a low level of monitoring may not be abdicating its responsibilities, but might instead be adopting the best strategy available to cope with an enormous monitoring challenge.
This result provides a microfoundation for understanding reductions in review activity even in settings where principals have nothing to gain from tying their own hands. Although some prior work has examined informational rationales for principals to give up oversight discretion and reduce activity (Gilligan and Krehbiel, 1987; Callander, 2008; Gailmard, 2009), our result is very general and, unlike these models, does not require partial compatibility of principal and agent preferences nor liability of agents for production failures. Our results are thus quite broad and additionally apply to pure oversight problems such as law enforcement or regulatory oversight, where principal and agent preferences may be completely opposed. Further, we explore how the nature of a principal–agent hierarchy, and particularly the ease with which agents may acquire or infer information about other agents’ likely behavior, may or may not give rise to this result. We consider how the principal’s dynamic resource allocation problem changes when groups of agents, of whatever size, might be able to coordinate their behavior. This allows us to characterize the conditions under which we should expect principals to reduce their oversight activity in the face of a particularly severe monitoring problem.
The model
Consider an oversight relationship between a set of agents (Ai) and a single principal (P). In every period, N of the agents apply a rule to a discrete case, and the principal, if it wishes, can audit some number of the agents’ decisions. Consistent with many existing principal–agent models (cf. Calvert et al., 1989; Bawn, 1997; Cameron et al., 2000; Lax, 2003) we presuppose a conflict of interests between the principal and its agents, allowing agents a simple dichotomous choice: apply the rule so as to comply (C) with the principal’s preferences, or else shirk (S) and implement their own preferences. 3 We also follow this literature by assuming that the agents most prefer to shirk and not be caught, next prefer to comply, and least prefer to shirk and be caught. Agents thus receive b for successful shirking, 0 for compliance, and −r for being caught shirking.
We deviate from convention in three ways. First, while most oversight models focus upon the relationship between a single principal and a single agent, we are interested in studying how a principal’s resource allocation over monitoring agent decisions depends upon the workload it has to monitor. Therefore, we generalize the standard model by allowing for N agents, each of whom decides a single case each period, and each does so in an infinitely (indefinitely) repeated setting.
Second, we allow for agency error. In particular, we assume that even those agents who are trying to faithfully comply with the principal’s preferences may unintentionally make decisions that the principal would like to review. 4 For example, agents may make mistakes due to uncertainty about the principal’s preferences or due to good-faith misapplication of a complex rule or policy. Formally we allow each agent that does not intentionally shirk to, with some probability q ∈ (0,1), generate a case that the principal prefers to review. Thus, if all agents choose to comply, then in expectation qN agents will generate a case the principal wishes to review, though in any actual period the realized number may be higher or lower. 5 The principal cannot discriminate between unintentional shirking and intentional shirking; it observes only the total number of agency actions it may wish to review.
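To illustrate the gap between the expected and the realized number of unintentional errors, the following minimal sketch computes the distribution of the error count when all N agents attempt to comply. It assumes that errors are independent across agents, so that the count is binomial; that independence assumption and the function names are ours, introduced only for illustration.

```python
from math import comb


def error_count_pmf(n_agents: int, q: float) -> list[float]:
    """P(k unintentional errors) for k = 0..N.

    Assumption (ours): each complying agent errs independently with
    probability q, so the count follows a Binomial(N, q) distribution.
    """
    return [
        comb(n_agents, k) * q**k * (1 - q) ** (n_agents - k)
        for k in range(n_agents + 1)
    ]


def expected_errors(n_agents: int, q: float) -> float:
    """Mean of the error count; equals q * N under the assumption above."""
    return sum(k * p for k, p in enumerate(error_count_pmf(n_agents, q)))
```

With N = 20 and q = 0.1, for instance, the expected count is 2, but the pmf places positive probability on every count from 0 to 20, which is why the principal's required review time varies from period to period.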
Finally, we do not assume any direct utility cost for oversight; rather, we assume review lasts for a duration of time. As a consequence, once a decision to start reviewing some activity is made, time that could have been spent on starting to review some other activity must be allocated to finish the current review activity before the task can be completed. For simplicity, we assume any review started must be completed. Later we demonstrate our results are robust to relaxing this assumption. To model this notion of a time constraint, we adopt the following formal structure. First, our principal has a finite pool of time with which to monitor. Specifically, we assume that the principal has one unit of time each period that can be allocated to reviewing agency behavior. Second, a review takes two periods to complete. The principal can start a small amount of reviewing, and thus only have a small amount of reviewing to complete in the next period. Or the principal can start a large amount of reviewing, and thus have most of its time pre-committed to completing those reviews in the next period. For example, if the principal spends a 0.6 share of its resources at time t, it may spend at most 0.4 at time t + 1. If the principal then spends 0.3 at time t + 1, it will have a maximum of 0.7 available in t + 2, and so on. In the extreme, if the principal commits its full unit of time to reviewing in one period, it will have no time to add more review activity in the next: its time in the next period is fully allocated to resolving the review activity to which it has already committed. This structure captures the central notion that review is not instantaneous; the time it takes to complete tasks “spills over” across periods.
Note that, because we are explicitly modeling time and attention, rather than a material resource such as money, resources that are not spent in a given period cannot be saved. For example, if the principal spends no resources at time t, it has made no commitments and may spend up to 1 at time t + 1. If the principal then chooses to spend no resources at time t + 1, its maximum commitment at time t + 2 remains just 1; that is, it may not carry over any portion of the resources unused at time t. Perhaps the most intuitive way to understand this structure is to consider principal activity as having both an initiation and a resolution phase; activity initiated at time t requires equivalent resources to be expended on follow-up at t + 1, but time unspent on either phase is wasted and cannot be banked.
The principal’s dilemma is that some portion of total agent activity, N cases per round, will come available for review in each discrete period. The principal must thus balance its desire to take on review commitments today against its desire to take on new review opportunities tomorrow. This structure captures an important aspect of the opportunity cost of oversight: namely, the fact that a principal’s alternative opportunities for review include not only those agency actions that are known about when review is initiated, but also those which may come to the principal’s attention after initiation. For example, suppose that a Congressional committee decides to initiate proceedings to investigate some question of bureaucratic compliance. This oversight process may take weeks or even months during which time agents may take new actions and other investigative opportunities may become apparent: opportunities that the committee might consider quite urgent. If the committee wishes to bring to conclusion its current activities, it may have to forgo starting those new proceedings. It is this dynamic tension that our formal structure attempts to capture.
It is worth reiterating here that while we focus on the case where a commitment to completing a review once started is binding, i.e. the principal is forced to resolve reviews begun at time t during time t + 1, we also consider an extension in which the principal is instead free to drop partially completed reviews to clear its calendar. This feature does not alter our central results; it is the dynamic aspect of the principal’s problem, not the binding nature of its commitment to completing a review, that drives our results below. 6
Formally, define the proportion of resources expended in any period as xt, where xt ∈ [0,1]; when the principal spends xt resources in period t, its resource expenditure at time t + 1 must lie in [0,1 − xt]. This resource spillover lasts only one period; if the principal spends xt in period t, this amount is tied up in period t + 1 but becomes free again for use in period t + 2.
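The spillover constraint can be checked mechanically. The sketch below is a minimal illustration, with function names of our own choosing, of the rule that each period's expenditure is bounded by one minus the previous period's expenditure and that unused time cannot be banked. It assumes the principal enters the first period with no prior commitments.

```python
def max_expenditure(prev_expenditure: float) -> float:
    """Upper bound on x_{t+1} given x_t: the feasible set is [0, 1 - x_t]."""
    return 1.0 - prev_expenditure


def is_feasible(path: list[float], tol: float = 1e-12) -> bool:
    """Check a sequence of expenditures against the one-period spillover rule.

    Assumption (ours): the first period starts with a full endowment,
    i.e. the previous expenditure is 0.
    """
    prev = 0.0
    for x in path:
        if x < -tol or x > max_expenditure(prev) + tol:
            return False
        prev = x
    return True
```

The path (0.6, 0.3, 0.7) from the example above is feasible, whereas (0.6, 0.5) is not. The path (0, 0, 1) is feasible but (0, 0, 1.5) is not, reflecting that idle periods do not accumulate extra capacity.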
Expected utilities and strategies
We consider a setting in which individual agents may only intermittently be active or find themselves in disagreement with the principal. For this reason, and for tractability, we model agents as short-run players, such that a different set of N agents is responsible for each period’s activity. 7 For purposes of descriptive symmetry we assume agents operate at the same pace as principals and that each agent, like the principal, requires two periods to resolve an action; thus in any given period N new agents begin their activity, and N agents resolve theirs and exit the game. An agent’s final decision on how an action is resolved (comply or shirk) takes place in its second period of activity. An agent’s expected utility for shirking is (1 − p)b − pr, where p is the probability that the principal reviews its particular case. If an agent attempts to comply, it receives zero utility for successful compliance, which occurs with probability 1 − q, and receives the expected utility for shirking given above if it errs, which occurs with probability q. The review probability p is an implicit function of the total number of agents actually shirking in any period, nt, and the principal’s expenditure in that period, xt, which we denote as p(nt, xt). 8
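As a worked illustration of these payoffs, the derivation below compares the expected utility of intentional shirking with that of attempted compliance. It makes the simplifying assumption, ours rather than the model's, that a single agent treats the review probability p as fixed across its two choices, and it takes b > 0 and r > 0 as implied by the preference ordering stated earlier.

```latex
\begin{align*}
EU_{S} &= (1-p)\,b - p\,r, \\
EU_{C} &= (1-q)\cdot 0 + q\,\bigl[(1-p)\,b - p\,r\bigr], \\
EU_{C} \ge EU_{S}
  &\iff (1-q)\,\bigl[(1-p)\,b - p\,r\bigr] \le 0 \\
  &\iff p \ge \frac{b}{b+r} \qquad \text{(since } q < 1 \text{ and } b + r > 0\text{)}.
\end{align*}
```

Under this simplification, an agent attempts to comply only when it expects review with probability at least b/(b + r); the error probability q scales the payoff from attempted compliance but drops out of the comparison.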
We assume a functional form of p(nt, xt) in which
To capture the principal’s preferences, we simply need a utility function that indicates its desired resolution of each agency action. Thus, assume the principal receives a payoff of 1 for each instance of ex ante compliance or for each instance of non-compliance which it is able to review and reverse. The principal receives a payoff of zero otherwise. Based upon these payoffs, the principal’s utility function for a given amount of shirking nt is UP = (N − nt ) + ntp(nt ,xt ), where (N − nt ) is the total number of cases not shirked on and ntp(nt ,xt ) is the number of cases the principal is able to review. The latter term is composed of the probability of reviewing any given case (as a function of the number of cases shirked and the resources directed towards review) times the number of cases shirked on (either strategically or unintentionally).
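The utility expression can be evaluated directly for given values. The following minimal sketch takes the review probability as a plain number rather than computing it from p(nt, xt), since the functional form is not reproduced here; the function name and argument names are ours.

```python
def principal_utility(n_agents: int, n_shirked: int, review_prob: float) -> float:
    """U_P = (N - n_t) + n_t * p, with p supplied directly by the caller.

    The first term counts cases not shirked on; the second is the expected
    number of shirked cases that the principal reviews and reverses.
    """
    if not 0 <= n_shirked <= n_agents:
        raise ValueError("n_shirked must lie between 0 and n_agents")
    if not 0.0 <= review_prob <= 1.0:
        raise ValueError("review_prob must lie in [0, 1]")
    return (n_agents - n_shirked) + n_shirked * review_prob
```

For example, with N = 10 agents, 3 shirked cases, and a review probability of 0.5, the principal's utility is 7 + 1.5 = 8.5. Full compliance yields 10, and universal shirking with no review yields 0.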
A strategy x * for the principal is a function which maps any number of observed instances of shirking and possible level of available resources to some expenditure xt ∈ [0,1 − x_{t−1}], for a given history of the game:
A strategy
Equilibrium behavior
As is typical of dynamic games, our game has many possible equilibrium strategy combinations. Our strategy is to identify two equilibria which differ in a key feature, and discuss how behavioral implications differ as a function of this feature. The distinction between our equilibria depends upon what individual agents anticipate about other agents’ strategies. Because the probability of any individual agent being reviewed depends on the total number that shirk, the best reply for each of a set of agents may be to shirk if they anticipate that the others will shirk, while under the exact same conditions (i.e. parameters and game structure) the best reply may be to comply if they anticipate that the others will comply. Said differently, it can sometimes be the case that an individual agent prefers to shirk if it believes many others will do so as well, but prefers not to shirk if it believes other agents are likely to comply. We will refer to this difference as a difference based upon the beliefs of the agents in order to connect the distinction in the two solutions to substantive interpretations.
In this model, agents may sometimes face a coordination problem. Some of the multiple equilibria are Pareto-superior to others from the perspective of the N agents. We present two equilibria in which agents’ expectations about how other agents will respond to this coordination problem are very different. In the first, agents’ beliefs can be understood, intuitively, as “uncoordinated”, in that agents do not expect others to take advantage of chances to jointly shirk. For a real-world analogue to this setting, consider a regulator monitoring a large number of firms for compliance with some regulatory regime. Because firms (agents) are independent and large in number, they may not possess the tools that would facilitate coordination on the Pareto-superior equilibrium and thus are most likely to act unilaterally. In the second, agents may instead share coordinated beliefs that others will choose to shirk when opportunities to take Pareto-improving multilateral action arise. As an analogue to this setting, we might consider a Congressional subcommittee that is responsible for oversight of rule application within a single bureaucratic agency. Though agency decisions may be made by many different individual bureaucrats, these decisions are housed within a single institution with its own avenues of internal communication; as such, it is possible for members of the agency to choose to stake out a different policy position from its principal, resulting in simultaneous shirking over many individual policy decisions. For the sake of exposition, we refer to these sets of beliefs throughout as uncoordinated or coordinated, respectively.
Note that because both sets of beliefs are consistent with equilibrium play, there is no ex ante reason to believe one or the other is more likely to obtain. Later, we discuss in more depth various substantive interpretations of these agent beliefs, and discuss the real-world conditions that may make it more or less likely that agents hold a particular set of beliefs about one another’s strategies. Because our agent-coordination equilibrium requires only mutually reinforcing beliefs rather than explicit collusion, this coordination may be understood in a variety of ways, from the highly premeditated, entailing pre-play communication, to the less purposive, where agents merely understand one another’s incentives well but do not actually interact. While it is always possible for agents to hold either set of beliefs, mutually-improving beliefs may be more likely in settings where, for example, agents work together, have access to similar information, or where single agents are responsible for many tasks; no-coordination beliefs may be more likely where agents are highly independent from one another and have loose to non-existent informational ties.
We choose to highlight these two equilibria because they represent opposite extremes of the principal’s possible oversight dilemma. The first equilibrium, based on uncoordinated agent beliefs, represents a setting extremely favorable to the principal. Equilibrium 1 is a subgame perfect Nash equilibrium (SPNE) in which the principal monitors effectively by virtue of the fact that individual agents assume that other agents are generally compliant even when the principal lacks the time to review a large number of cases. Agents do not achieve Pareto-improving coordination, and instead behave conservatively.
In the second equilibrium, we refine our solution concept and identify a SPNE that is, unlike Equilibrium 1, coalition-proof. This equilibrium, which recognizes that agents’ expectations about one another’s behavior may be coordinated on mutual shirking, represents a setting that is highly unfavorable to the principal. Intuitively, the coalition-proofness refinement relaxes an assumption from Equilibrium 1 that is potentially quite strong. Equilibrium 1 assumes that all subsets of agents have uncoordinated beliefs. Equilibrium 2 allows such coordination, and is robust, for example, to the existence of bilateral or multilateral relationships among agents that might enable them to solve their coordination problem.
Because the two equilibrium concepts we adopt map onto different real-world settings, each is substantively sensible as applied to some subset of monitoring relationships. Notably, these two equilibria provide strikingly different expectations about monitoring by resource-poor principals. We now present equilibria describing principals’ resource allocation in highly favorable and highly unfavorable oversight environments, and show that in unfavorable environments, principals sometimes do not use all of their resources and instead maintain a degree of “slack” capacity that provides more ability to monitor in subsequent periods. Following presentation of these two equilibria we also provide a discussion of intermediate cases in which agents may form coalitions only up to a certain maximum size.
Equilibrium 1: no-coordination equilibrium
We begin by discussing the basic logic of the no-coordination equilibrium. Best response functions for this equilibrium are given by Proposition 1. Here and below, we focus our attention on equilibrium behavior within a range of the number of agents N for which the principal possesses enough capacity to execute an effective monitoring strategy, defined as one that can routinely induce all agents to intentionally comply in consecutive periods. For more extreme N, the principal can no longer oversee its agents effectively, and only half-control or no-control equilibria exist. 9 Thus both this equilibrium and Equilibrium 2, presented below, exist only up to a “breaking point” at which the principal can no longer maintain effective control of its hierarchy. Substantively, we are interested in insights into a principal’s behavior as it approaches such a breaking point, rather than its behavior beyond it, and as such we restrict our exposition to the effective-monitoring range.
Behavior in Equilibrium 1 centers around a key threshold, which we term x 1. Here and throughout, an expenditure with superscript, xi , denotes the expenditure that just deters a coalition of size i from shirking when they believe that the remaining N − i agents will attempt to comply. 10 For i = 1, then, the threshold x 1 can be understood as the potential expenditure x at which individual agents are just indifferent between compliance or shirking, given that they believe all other agents that period will try to comply. Because agency error is possible, and review by the principal is probabilistic for sufficiently small expenditures, there may be a temptation for agents to ‘hide’ among the unintentional shirking. The threshold x 1 denotes an expenditure by the principal such that an agent is just indifferent between this gamble and the certain payoff from compliance.
In this equilibrium, the intuition behind the principal’s strategy profile is straightforward. The principal’s strategy is to review as many cases as it can in the immediate period without impairing its ability to deter intentional shirking the next period. As a result, the threshold x 1 is an important one for the principal. If the principal enters a period of play with free time below the threshold x 1, individual agents know with certainty that the principal cannot spend enough to make them indifferent, even if other agents attempt compliance, and thus individual agents will shirk. Because all agents face this same unilateral incentive to shirk, all will choose to do so and the net result is total non-compliance for that period. This outcome is disastrous for the principal. By contrast, if the principal retains a credible threat to review agents with sufficient probability, then all agents make the opposite calculation and comply. In Equilibrium 1, the principal’s strategy involves never spending more than 1 − x 1, thus assuring sufficient available time for deterrence in each period. 11
Further, because the number of realized instances of shirking is a random variable, it may occur that the principal need not spend even this amount of time. For any observed amount of shirking nt, there is a maximum expenditure for the principal, where it reviews all shirked cases with probability 1; the particular maximum expenditure associated with a given realization of the number nt of instances of shirking is termed mn. When mn is less than 1 − x 1, the principal need not spend more than mn. Finally, if the amount the principal retains from the prior period, 1 − x_{t−1}, is less than both mn and 1 − x 1, the principal simply spends all that it has left. Thus, when the time constraint based on the principal’s desire to retain reserves for deterrence does not bind, the principal may be able to review all shirking in a period. However, this constraint may bind. If so, the principal reviews as much shirking as possible, conditional on keeping enough uncommitted time to deter intentional shirking the following period.
Given this set of strategy profiles, in each period agents anticipate that x 1 or more will be spent by the principal. As such, agents always comply fully. The combination of these strategy profiles results in overall equilibrium behavior which is quite intuitive. Every period the principal reviews as many shirked cases as possible, subject to the constraint that it must keep at least x 1, the amount sufficient to deter shirking in the next period, available. Knowing that the principal always has sufficient free time, agents never intentionally shirk. Rather, all observed shirking is a product of unintentional actions that cause the principal to want to look at a case, occurring with probability q.
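The principal's Equilibrium 1 rule, as described verbally above, amounts to spending the smallest of three quantities: the time needed to review all realized shirking (mn), the deterrence cap (1 − x 1), and the time left free by last period's commitments. The sketch below is a minimal illustration of that verbal description, not a restatement of Proposition 1; the function names are ours, and the values of mn and x 1 are taken as given inputs because their dependence on N and q is not derived here.

```python
def eq1_expenditure(m_n: float, x1: float, prev_expenditure: float) -> float:
    """Smallest of the three constraints described in the text.

    m_n: time needed to review all realized shirking with probability 1
    x1: reserve that just deters unilateral shirking in the next period
    prev_expenditure: last period's spending, which is tied up this period
    """
    return max(0.0, min(m_n, 1.0 - x1, 1.0 - prev_expenditure))


def eq1_path(m_values: list[float], x1: float) -> list[float]:
    """Apply the rule period by period, starting from a full endowment."""
    path: list[float] = []
    prev = 0.0
    for m_n in m_values:
        prev = eq1_expenditure(m_n, x1, prev)
        path.append(prev)
    return path
```

With a hypothetical x 1 = 0.3 and heavy realized shirking (mn = 0.9 in both periods), the path is approximately 0.7 followed by 0.3: the deterrence cap binds first, then the carry-over constraint. With light shirking (mn = 0.2), the principal simply spends 0.2 in each period.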
Comparative statics
How do the principal’s resource allocation choices in Equilibrium 1 vary as the size of its monitoring problem grows? To examine this, we explore comparative statics over the principal’s equilibrium expenditure x *, with respect to the number of agents N. We consider two aspects of the principal’s allocation: its short-run behavior, or how allocation choices within a single period change as the number of agents grows larger, and its long-run behavior, or how the principal’s allocation across an arbitrary pair of periods changes as the number of agents rises. The principal’s long-run allocation across a pair of periods is given by the sum of its short-run allocations across successive periods. 12 The model’s predictions about short-run behavior can shed insight into how principals optimally allocate resources on a day-to-day basis, while predictions over long-run behavior, because the principal’s full carrying capacity of 1 is guaranteed to be available to it over any two-period interval, speak more directly to the total workload principals might take on over some extended stretch of time. Because we are particularly interested in the possibility of underutilization of resources, where the principal allows any portion of its time and attention to ‘expire’ and go unused, we are especially interested in predictions over long-run behavior.
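The long-run measure can be written out explicitly. The short derivation below follows from the spillover constraint alone; the symbols X and u for the two-period total and the unused share are our own labels, introduced only for illustration.

```latex
\begin{align*}
X_{t} &\equiv x_{t} + x_{t+1}, \qquad x_{t+1} \in [0,\, 1 - x_{t}] \\
      &\;\Longrightarrow\; 0 \le X_{t} \le x_{t} + (1 - x_{t}) = 1, \\
u_{t} &\equiv 1 - X_{t} \ge 0 .
\end{align*}
```

The two-period total thus never exceeds 1, and any shortfall u measures time that the principal has allowed to expire unused.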
Because our game is stochastic, we focus on changes in the expected time usage by the principal as the number of agents, N, increases. Consider Figure 1. Figure 1 presents the expected allocation across a pair of periods, t and t + 1, where the principal begins period t with 1 available: its full endowment. The principal’s allocation, given by the bold line, is defined by which of its three constraints takes on the smallest value. The solid line represents the expected value of mn . The term mn , recall, represents the proportion of the principal’s resources necessary to review a particular realized number of instances of shirking with certainty. The solid line thus represents the time necessary to review qN cases, and is increasing in the number of agents.

Figure 1. Resource expenditure x * at times t and t + 1, and combined, by N, for Equilibrium 1. When agents choose to comply, E(mn) gives the resource expenditure necessary to review all unintentional shirking, in expectation. The resource constraint 1 − x 1 is the most the principal can spend while retaining x 1 into the next period to deter individual shirking. The available resource pool 1 − xt reflects the consequences of previous equilibrium expenditures.
The narrowly dashed line gives the constraint induced by the agents’ deterrence threshold x 1. The principal wishes to spend no more than 1 − x 1 in each period, so that at least x 1 remains available to maintain control in the next period. The agents’ indifference threshold x 1 is also increasing in the number of agents. This is because adding more agents increases the expected amount of unintentional shirking, 13 meaning that the principal’s expenditure must also increase to maintain a sufficiently large probability of review.
Finally, the broadly dashed line gives the “hard” constraint induced by past commitments, 1 − x_{t−1}. In Figure 1, the upper-left plot represents a period t in which the principal’s full endowment is available, or 1 − x_{t−1} = 1. The upper-right plot represents a period t + 1 in which the principal has played its equilibrium strategy at time t, given the expected amount of unintentional shirking; as such, the broadly dashed line at time t + 1 vertically mirrors the bold line from period t.
As shown by the bold line in Figure 1a, at time t when the principal has its full endowment, it does not use all of its time immediately. The principal spends mn when there is sufficiently little shirking, but once the number of agents N reaches a certain size, it spends 1 − x 1 and retains enough for deterrence. Behavior at time t + 1, when the principal’s initial endowment is dependent on its past behavior, is somewhat different. Here, the principal’s short-run expenditure remains non-monotonic but, because the previous period’s spending acts as an additional constraint, it takes on a more complex shape, first rising, then falling, and rising again as the number of agents N increases. This unusual shape arises because, when the principal’s deterrence constraint does not bind, it will spend more and more resources at time t, and sometimes more than half. Once its constraint begins to bind, it begins to leave itself more resources for time t + 1. Taken together, the interplay of equilibrium expenditures across periods means that the principal will alternate between periods of slightly higher and lower expenditures for many values of N, though increases in the number of agents eventually cause the principal to balance its expenditures more equally across periods. This also implies that a short-run snapshot of a real-world principal’s resource allocation might result in seemingly unusual patterns, because its allocation depends in part on its prior choices. ‘Busier’ principals may sometimes engage in less monitoring at a single moment in time than those with seemingly less on their plate, due to the need to pace themselves and avoid overcommitment.
That said, though the principal never commits all of its time to initial review of cases in a single period, in the long run the principal makes full use of its total endowment of 1 across any two periods once the number of agents grows sufficiently large. This is illustrated by Figure 1c, showing the principal’s long-run allocation choices in Equilibrium 1 as a function of the number of agents N, or the sum of the expected equilibrium expenditures at t and t + 1. 14 In the no-coordination equilibrium, the principal’s expenditures are strictly increasing and eventually reach 1, the maximum possible. The principal may spend less, but this only occurs when shirking is so rare that the principal’s capacity is not strained. When agents cannot overcome their coordination problem, once the principal’s time expenditure reaches its peak, it remains there. Thus for real-world principals facing a setting resembling Equilibrium 1, we would expect long-term growth in agency size to result in increases in oversight activity, up to a maximum.
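The allocation pattern described for Equilibrium 1 can be illustrated numerically. The following is a minimal sketch, not the formal solution: it assumes that the typical expenditure in each period is the smallest of the expected review cost E(mn), the deterrence cap 1 − x1, and (at t + 1) the time not already committed at t. The function name, the argument names, and the value x1 = 0.3 are hypothetical and chosen only for illustration.

```python
def equilibrium1_allocation(expected_cost, x1):
    """Typical expenditures (x_t, x_t1, total) under the stated assumptions.

    expected_cost: E(mn), time needed to review all expected unintentional shirking.
    x1: reserve needed to deter unilateral shirking (hypothetical value).
    """
    # Period t: the full endowment of 1 is available, capped by the deterrence constraint.
    x_t = min(expected_cost, 1 - x1)
    # Period t + 1: additionally capped by the time still committed from period t.
    x_t1 = min(expected_cost, 1 - x1, 1 - x_t)
    return x_t, x_t1, x_t + x_t1


# Three illustrative review loads: light, intermediate, and heavy.
for cost in (0.2, 0.6, 0.9):
    x_t, x_t1, total = equilibrium1_allocation(cost, x1=0.3)
    print(f"E(mn)={cost:.1f}  x_t={x_t:.2f}  x_t+1={x_t1:.2f}  total={total:.2f}")
```

Under these assumptions, a light load is fully reviewed in both periods; once spending at t exceeds one half, spending at t + 1 falls; and once the deterrence cap binds, the two periods together exhaust the endowment of 1. In this sketch x1 is held fixed, so the final rise in period t + 1 spending would appear only if x1 grows with N.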
Note that Equilibrium 1 exists only for the range of the number of agents N during which
Equilibrium 2: coalition-proof equilibrium
Now consider an equilibrium in which groups of agents can coordinate their actions. The possibility of coordinated shirking makes the setting substantially more difficult for the principal, who must design an oversight strategy that avoids presenting groups of agents (of any size) with opportunities to gain by shirking collectively. Formally, we are applying the coalition-proof Nash equilibrium (CPNE) refinement.
Allowing for coordinated actions by the agents changes the amount of time the principal must keep in reserve in order to deter intentional shirking. Whereas x1 is sufficient to deter unilateral shirking, here the principal must reserve xN, enough time to deter the largest possible coalition, the coalition of N agents, from jointly shirking. That is, let xN be the resource expenditure necessary to make N agents prefer complying to shirking when anticipating all other agents to shirk. If the principal enters a period with less than xN in reserve, then it lacks enough time to review agents with high enough probability to make shirking disadvantageous if all agents jointly shirk. When insufficient time is available, shirking is a profitable and self-enforcing deviation for the coalition of N agents, 16 and, unlike in the previous equilibrium, in this setting agents will take advantage of such opportunities. As such, the principal must be cautious about overstepping this threshold. Note that xN > x1.
When the number of agents N is small enough that
This situation changes when
In this range a broad swath of potential oversight strategies for the principal do indeed either unravel as equilibria or result in frequent agent noncompliance. For example, a stationary strategy as in Equilibrium 1, modified so that the principal always retains xN, enough reserves to deter coordinated shirking by all N agents, does not induce agents to comply. This is because the maximal time that can then be spent reviewing in any given period, 1 − xN, is not sufficient to deter N agents from coordinated shirking in that period (by the definition of xN). At the same time, stationary strategies that exceed 1 − xN unravel because the principal cannot commit to such expenditures when there are no reputational consequences. For example, if the principal’s strategy is to always spend the maximum time possible, then agents must always comply when xN or more reserves are available. However, this fact causes the principal to deviate from its maximum strategy whenever it implies an expenditure above 1 − xN: going over causes mass non-compliance in the next period, while deviating downward generates compliance, unraveling the potential equilibrium.
While many possible strategies fail, there nonetheless exists an equilibrium in which the principal can achieve (nearly) full compliance for some of this range. We focus attention on this equilibrium accordingly. Interestingly, in this equilibrium the principal underutilizes its long-run resources (i.e. does not use all of its time available to review), and responds to growth in the size of its monitoring task by eventually narrowing the scope of its monitoring activity. Proposition 2 details pure strategy best-response functions for all players in the equilibrium of interest.
Note first that where the deterrence threshold
The logic of this strategy is quite intuitive. The principal is most concerned about coordinated group defections. Yet to deter larger coalitions, it is not sufficient for the principal merely to possess xN in reserves; agents must also fear that those reserves might actually be used. If agents anticipate that they actually face no serious punishment for coordinated defection even given that the principal possesses the time to carry out such punishment, those reserves are of no effective value. The principal’s strategy must thus entail a credible threat to punish mass deviations, meaning it must maintain sufficient reserves as well as a willingness to use them under appropriate conditions.
When the principal sees a small amount of shirking, it treats it as unintentional and reviews no more than 1 − xN, leaving itself at least xN free time for the next period of play. This amount 1 − xN results in enough routine monitoring to keep small coalitions of agents from being tempted to hide among the random unintentional shirking. 18
Yet when
Intuitively, this strategy works for the principal because its threat to punish mass shirking is credible, meaning that following through with a large expenditure after observing sufficient shirking is actually a best reply. Failure to follow through is not tempting because it would undermine the principal’s short-term credibility, resulting in more shirking.
The principal’s oversight strategy is also highly effective because it need not actually spend much time on review to deter large coalitions from shirking. Instead, the principal maintains a threat to spend at least xN , the amount sufficient to deter mass shirking, which causes large coalitions to fear immediate punishment for mass deviation and induces collective compliance. The fact that agents comply in response to this threat, however, implies that the principal will typically not observe a sufficient amount of shirking to cause it to actually carry out this threat, except when a suspiciously large number of unintentional shirks happen to be drawn at the same time. This means that the principal’s resources are generally not depleted, allowing it to carry its deterrent threat across successive periods.
The amount of shirking n* above which the principal begins to aggressively review is the largest possible threshold that prevents all coalitions of any size from attempting to coordinate on shirking. Note that if the principal sets its threshold too high, it may not deter intermediate-sized coalitions. For example, suppose N = 300 and the principal triggers its high-review strategy only after defection by all 300 agents. Then a coalition of, e.g., 250 agents might prefer to coordinate on shirking, knowing that the odds that the remaining 50 agents will all draw unintentional shirks are very remote, and thus it is likely that the principal will not spend aggressively. At the same time, if the principal sets the threshold too low, it will achieve deterrence when time is available but trigger its costly high-review phase more frequently than necessary. 20 The principal’s optimal trigger threshold balances between these two tensions, and is defined with reference to the largest coalition that can be deterred by an “ordinary” expenditure of 1 − xN: the maximum expenditure that does not impair its ability to deter shirking in the next period. Thus, all coalitions of size less than n* are deterred from group shirking because the lower expenditure 1 − xN still presents them with a significant enough chance of review; coalitions larger than n* are deterred by the principal’s threat of a more serious response. The optimal trigger threshold n* is never lower than
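The danger of a threshold that is set too high can be made concrete with a small calculation. The sketch below is illustrative only: it assumes unintentional shirks are drawn independently with probability q for each agent outside the coalition, treats the trigger as reached when total observed shirking is at least the threshold, and uses q = 0.2 purely as an assumed value. The function and argument names are hypothetical.

```python
from math import comb


def trigger_probability(n_agents, coalition, threshold, q):
    """P(total observed shirks >= threshold) when `coalition` agents shirk on purpose
    and each remaining agent shirks unintentionally with independent probability q."""
    others = n_agents - coalition
    # Extra unintentional shirks required from agents outside the coalition.
    needed = max(threshold - coalition, 0)
    if needed > others:
        return 0.0
    return sum(comb(others, k) * q**k * (1 - q) ** (others - k)
               for k in range(needed, others + 1))


# Threshold set at all 300 agents: a 250-agent coalition almost never triggers review.
print(trigger_probability(300, 250, 300, q=0.2))
# A lower (hypothetical) threshold of 100 is always reached by the same coalition.
print(trigger_probability(300, 250, 100, q=0.2))
```

In the first case the trigger requires all 50 outsiders to shirk unintentionally, which under these assumptions happens with probability q to the power 50, so the coalition can shirk with near impunity.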
The principal’s aggressive response may occur on equilibrium path, though in practice it is exceptionally unlikely that this aggressive response is ever necessary. For example, suppose that N = 300 and xN = 0.75. Then, given the formula above, n* must be at least 100. Even assuming a serious agent error rate q = 0.2, the probability of 100 unintentional shirks being drawn in a single period is so small as to be effectively zero. As such, combined principal and agent strategies in Equilibrium 2 result in total compliance when
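The claim that 100 unintentional shirks are effectively impossible in this example can be checked directly. The following sketch assumes that each of the N = 300 agents draws an unintentional shirk independently with probability q = 0.2, so that the number of shirks is binomially distributed; the independence assumption and the helper name are ours, introduced only for illustration.

```python
from math import comb


def binomial_tail(n, p, k):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


# Probability that at least 100 of 300 compliant agents shirk unintentionally.
tail = binomial_tail(300, 0.2, 100)
print(f"P(at least 100 unintentional shirks out of 300) = {tail:.3e}")
```

Under these assumptions the expected number of unintentional shirks is 60, and the probability of reaching 100 is below one in a million, which is consistent with the aggressive response being triggered only in exceptionally rare periods.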
This strategy is effective for the subset of the range
Comparative statics
So why does the principal choose to underutilize its long-run review resources and how does this decision depend upon the amount of activity the principal is charged to oversee? The answer lies in the principal’s ability to achieve deterrence based on the threat of aggressive review rather than its execution. Consider Figure 2, illustrating the principal’s typical time expenditures in periods t and t + 1 as a function of the number of agents N.

Figure 2. Resource expenditure x* at times t and t + 1, and combined, by N, for Equilibrium 2. When agents choose to comply, E(mn) gives the resource expenditure necessary to review all unintentional shirking, in expectation. The resource constraint 1 − xN represents the amount the principal wishes to retain into the next period to deter mass shirking. The available resource pool 1 − xt reflects the consequences of previous equilibrium expenditures.
As before, the figure shows the expected value of mn, the typical amount of resources necessary to review all unintentional shirking in a period, as a solid line, the principal’s deterrence constraint 1 − xN as the narrowly dashed line, and its resource pool constraint 1 − xt as the broadly dashed line. In both periods, the principal’s short-run expenditure increases for the range of N where the expected amount of shirking can be reviewed without any effect on deterrence in the subsequent period. At a certain point, however, the need to maintain reserves for deterrence begins to constrain the principal, and its allocation instead is typically 1 − xN.
Because the principal can achieve deterrence in Equilibrium 2 via the mere threat to trigger a significant amount of review, the long-run relationship between resource allocation and the number of agents looks quite different. As can be seen in Figure 2c, there now exists a non-monotonicity in the summed expenditure from periods t and t + 1: the principal’s long-run allocation. Marginal increases in the amount of potential review, beyond a certain point, begin to cause decreases in the principal’s realized activity level. This occurs once the principal’s deterrence constraint, 1 − xN, begins to bind. As the number of agents N increases, the principal must keep more and more resources free to maintain its deterrent threat, but because the principal does not typically spend this time reviewing, its observed activity declines. During the downturn portion of the figure, the principal will typically spend 2 − 2xN across a pair of periods. Because this occurs while
When facing a sufficiently large number of agents, the principal begins to favor responsiveness over actual oversight. By maintaining “slack” in the system, the principal can retain a credible threat of an effective response to group deviations. Past a certain point the principal must keep so much in reserve to deter intentional defection that it begins to underutilize its time, and to allow a large percentage of seemingly ripe opportunities for monitoring to pass by without review. 24
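The rise and subsequent decline in long-run activity can be sketched with a few lines of code. This is an illustration under stated assumptions rather than the model itself: typical per-period expenditure is taken to be the smallest of E(mn), the deterrence cap 1 − xN, and the time left uncommitted from the previous period, and the path along which the review load and the reserve xN grow with N is entirely hypothetical.

```python
def equilibrium2_total(expected_cost, x_N):
    """Typical two-period expenditure when the principal keeps x_N in reserve."""
    # Period t: review expected shirking unless the deterrence cap binds.
    x_t = min(expected_cost, 1 - x_N)
    # Period t + 1: same cap, plus the resource pool left over from period t.
    x_t1 = min(expected_cost, 1 - x_N, 1 - x_t)
    return x_t + x_t1


# Hypothetical path: both E(mn) and the required reserve x_N grow with N.
path = [(0.10, 0.55), (0.30, 0.60), (0.50, 0.70), (0.70, 0.80), (0.90, 0.90)]
totals = [equilibrium2_total(cost, x_N) for cost, x_N in path]
print([round(total, 2) for total in totals])
```

Along this assumed path the combined expenditure first grows with the review load and then shrinks as the reserve requirement rises; whenever the cap binds in both periods the total equals 2 − 2xN, matching the downturn described above.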
Note that this result is not due to our stylized assumption that the principal must fully carry out any review it commits to. As we show in the Appendix, even when this assumption is relaxed, and the model extended to allow the principal to drop initiated cases without resolution if it wishes to clear its calendar, equilibrium behavior remains as described above. This occurs because a principal that has committed to a large amount of review is strictly better off completing that review and beginning the next period with a largely open calendar, and a return to compliance, rather than dropping cases and initiating a large number of new reviews, even in response to group shirking. Though one might intuitively expect such a principal to routinely fill its calendar, and respond to group deviations by clearing space and punishing them, principals cannot credibly threaten such a strategy because they never prefer to drop cases, and agents respond to overcommitment with group shirking that will go unpunished. Thus, to maintain agents’ incentives to comply, the principal must still adopt the slack strategy we describe, even when it may abort review at no cost. A similar logic will also apply to the possibility that the principal can delay review for any number of periods; in the long run, delay is not a sustainable strategy because it would result in the buildup of a larger and larger backlog that a principal can ultimately never clear.
Finally, though our presentation focused on two extremes, no interagent belief coordination or potential coordination among any or all agents, predictions similar to those given by Equilibrium 2 also arise when only limited subsets of agents may coordinate their beliefs. For example, if the largest subgroup of agents that may coordinate is arbitrarily capped at some proportion of the total, i.e.
Discussion
The central takeaway from the two equilibria presented above can be stated simply: when a principal can exploit a coordination problem among its agents, its resource expenditure can be anticipated to increase steadily with growth in agent activity, up to a maximum; when the principal is concerned that agents possess coordinated beliefs about one another’s strategies, the principal’s resource expenditure can instead be expected to first increase, then decrease as the number of actions taken by agents grows. The model provides a microfoundation for understanding why a policy-oriented principal might prefer to maintain slack in its schedule: doing so is a more effective monitoring strategy when facing multiple agents who have potential to coordinate their expectations. When agents believe that other agents will adopt a strategy of attempting to take advantage of an overburdened principal, the principal must avoid overcommitting its time, and may ultimately underutilize its review capacity.
Understanding agent coordination
What does it mean to say that agents might be able to coordinate their expectations of one another’s behavior, and what does it imply for our understanding of real-world principal–agent relationships? There are multiple ways to conceptualize agent coordination, some of which suggest that the relationship seen in the coordinated equilibrium is likely to obtain in a wide variety of monitoring settings.
The most direct conceptualization of coordination is as an explicit incentive-compatible contract among agents to choose particular strategies. This is the strictest conceptualization, and requires avenues of communication or information-sharing among groups of agents. Though strict, we expect that many political hierarchies containing large numbers of agents are often characterized by some non-negligible risk of this form of agent coordination. While in our model, each agent resolves a single case or action before being replaced, this is obviously a stylization, and in fact principals often interact with agents housed under a single roof, or who are responsible for multiple tasks. When considering the aggregate output of a single bureaucratic organization, with its own internal norms and procedures, it seems clear that agents are not independent from one another and may communicate among themselves about the principal’s capacity to monitor them.
It is also reasonable to conceptualize coordination more loosely, in a way that does not demand explicit communication or interaction. Equilibria 1 and 2 differ primarily in terms of agents’ beliefs about other agents. Importantly, group shirking does not necessarily require direct coordination of beliefs by pre-play communication. Instead, it merely requires that a sufficient number of agents mutually believe that the principal cannot control them all with a high-resource strategy. For this to be true agents need merely to understand one another’s incentives. Because the problem agents face concerns which equilibrium to coordinate on, it resembles an assurance game: agents always have an interest in shirking if they believe others will as well. Sufficiently savvy agents may be able to solve this problem even without direct communication. If the principal is concerned that agents could gravitate toward a collective belief that its monitoring strategy is ineffective, such as by long-term observation and repeated interaction, then it may make sense for the principal to use the more conservative slack strategy from the outset. Thus maintaining slack monitoring resources may be a strategy designed to prevent the considerable downside risk of group shirking even when the ability of agents to overcome their coordination problem is not so clear.
The adoption of a slack resource strategy is likely even when we consider that many principals, such as Congress, are responsible for oversight of multiple agencies containing multiple agents. Our model predicts that principals may find a slack strategy optimal even when the maximum number of agents who may coordinate is capped at some subset of the total. This maps onto a case where principals desire to monitor many agents who operate within a modest number of independent bureaucratic or judicial institutions, such that they can coordinate within but not across institutions. This describes many monitoring problems in political hierarchies such as legislative oversight of the bureaucracy or Supreme Court oversight of the circuit courts.
We would thus expect that single-principal multiple-agent hierarchies should often exhibit the non-monotonicity in monitoring activity described above, except in circumstances where principals are extremely confident that collective defection is impossible and where agents interact with the principal in an ad hoc way and are largely unable to learn over time. Political hierarchies, such as legislative–bureaucratic relationships, seem to us especially likely to be characterized by the possibility of coordinated agent beliefs and the adoption of a slack resource strategy by principals. Other monitoring settings, such as regulatory oversight of firms, might be better explained by the no-coordination equilibrium and not subject to this non-monotonicity, though even there we might see such behavior by regulators facing an oligopolistic or monopolistic market.
Empirical implications
The model we present provides insight into the challenges faced by principals who must oversee many agents, and into the strategies they may adopt to best cope. It also suggests implications that may be important for the empirical study of delegation and monitoring. The central result of our modeling exercise, of course, concerns the long-term relationship between agency size and monitoring activity by principals. Our second equilibrium provides intuition for why principals in settings with some risk of agent coordination, such as most political hierarchies, would begin to privilege flexibility and responsiveness, even if it means correcting fewer agency errors. Our model suggests that declines in monitoring activity by Congress and the Supreme Court may be directly attributable to growth in government: the proliferation of bureaucratic agency activity and federal caseload growth, respectively. This is a testable implication, and these linkages could be productively explored in future work.
Our model also reveals interesting implications with respect to the short-term relationship between agency size and monitoring. Notably, in both equilibria the relationship between the number of agents and short-run activity is non-monotonic. Even in Equilibrium 1, where principals face a relatively tractable monitoring problem and generally operate at full capacity, they still pace themselves and avoid committing too much of their time and attention at any one discrete moment. In settings where researchers can be confident that it is possible to observe short-run rather than long-run allocations, this implication is testable. For example, consider a cross-section of law enforcement agencies that typically engage in multi-year investigations. An examination of a snapshot of the number of cases opened by each agency in a single year would capture their short-run rather than long-run allocation dilemmas, and we would expect a nonlinear relationship between their activity and the amount of potential monitoring. Because this differs from the long-run expectation of growth in monitoring activity for agencies in such a setting, this also suggests caution for empiricists interested in oversight, as their choices concerning how to operationalize monitoring activity may inadvertently capture short-run instead of long-run dynamics (or vice versa), leading to confusing or misleading results.
The model also suggests interesting directions for thinking about the specific dynamics of other resource-management problems. For example, consider expiring budgets. Many bureaucratic agencies have monetary budgets with rules that any money unspent by a certain date is lost, with the budget then refreshed in the next period. This setting differs from the one we examine, in that with expiring budgets there exists a final round each cycle where the consequences of expenditures do not carry over. Our model nonetheless suggests some implications for such settings; in particular, we would expect a pattern where resources are underspent until the final period of one budgeting cycle, then spent aggressively at the last possible moment. Such a setting would not necessarily be characterized by the aggregate underspending we see here (barring other frictions), but we would likely observe dynamics similar to those of our model in non-final periods, and thus more severe backloading of expenditures as monitoring problems grow more serious. A model of this process would thus look somewhat different from ours above, and optimal behavior might depend on institutional features, such as the ability to delay, that are unimportant in our general setting, but our model suggests novel avenues of thinking about this common form of resource constraint that could productively be explored in future research.
Our model also has implications beyond the question of how principals allocate resources, and suggests new directions for thinking about other fundamental questions in the study of hierarchical politics, such as how political principals choose to delegate policymaking responsibility. Our model suggests that the potential for interagent coordination and collusion significantly affects the amount of oversight principals do, and, when the number of agents is large, also affects the proportion of total agency errors that they are able to correct, that is, the principal’s overall policy influence. One possible implication of this result is that overburdened principals may have a strong interest in working to hamper the capacity of agents to coordinate with one another, so that their oversight setting is more likely to resemble Equilibrium 1 than Equilibrium 2. This may provide a foundation for further understanding the choice by principals to delegate to multiple, fragmented agencies rather than giving a single agency responsibility for many tasks. By delegating to many parallel agencies, a principal such as Congress can attempt to increase the transaction costs of interagent collusion, making it less likely. We would thus expect fragmentation in bureaucratic responsibility to be more likely as the number of actions taken by agencies grows larger.
A final implication of our model is that while observation of a downturn in review activity may not be immediately problematic, it may nonetheless also be an indication of a principal that is approaching an agency activity level that it might be unable to effectively oversee. The slack strategy can be effective for a significant range of the number of agents N, but as this term continues to grow larger, eventually it can grow so large as to prevent the principal from maintaining effective hierarchical control. Declines in review activity in the face of agency growth may be rational and effective, but also provide a warning sign that further growth in the scope of agency activity might be untenable. To the extent that our model provides a partial account of declines in Congressional and Supreme Court oversight, this may suggest the necessity of improvements to the capacity or the monitoring toolkits available to these institutions. This may be particularly true for the Supreme Court, which has less ability to broadly redefine the nature of its relationship with its agents.
Conclusions
In this paper we examine how principals facing severe time constraints can optimally manage their resources. As noted, oversight activity undertaken by two major institutions of American politics, Congress and the Supreme Court, has declined since the early to mid-1980s. Have these institutions abdicated their responsibility to impose control over agents and provide them guidance about their preferences and expectations? While lodged in normative terms, this question also implicates broad political science explanations of oversight behavior, which expect policy-maximizing actors to use resources such as committee hearings or the Supreme Court docket to the fullest in pursuit of their policy goals. Diminished levels of review would seemingly constitute a serious paradox for extant theories of principal–agent politics.
We have presented a formal model that suggests an answer to this paradox. Our model shows that a more restrictive use of resources can be a rational response to severe principal–agent problems. Seen from the perspective of our model, declining oversight levels may actually reflect policy-maximizing behavior by principals facing severe resource challenges. In this light, declines in oversight can be seen not as a withdrawal by these principals from their role as monitors of the bureaucracy or lower courts, but rather as the adoption of an alternative but rational review strategy to maximize compliance in the face of continued growth in the amount of government activity falling within their purview. It might still be the case that Congress or the Court could audit more frequently than they currently do and still maintain a sufficient threat of audit to maximize compliance. However, what our model highlights is the inherent tradeoff principals confront between using agenda space to make policy and maintaining available agenda space to present a credible threat of auditing future behavior. Policymakers acting in the real world may not strike that balance perfectly.
Footnotes
Appendix
Acknowledgements
The authors wish to thank Tom Clark, Ken Shotts, Jeff Staton, Georg Vanberg, and two anonymous reviewers for helpful comments and feedback.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
